-
Notifications
You must be signed in to change notification settings - Fork 705
Description
This issue further addresses: sql-machine-learning/playground#42 (comment)
Where are the global variables
-
Exported global map which stores the parameter docs of models, including
sqlflow/pkg/attribute/attribute.go
Lines 219 to 226 in befcd74
var PremadeModelParamsDocs map[string]map[string]string var extractDocStringsOnce sync.Once // OptimizerParamsDocs stores parameters and documents of optimizers var OptimizerParamsDocs map[string]map[string]string // XGBoostObjectiveDocs stores options for xgboost objective var XGBoostObjectiveDocs map[string]string
sqlflow/pkg/attribute/model_parameters.go
Lines 18 to 20 in befcd74
const ModelParameterJSON = ` { "DNNClassifier": {
sqlflow/pkg/attribute/model_parameters.go
Lines 227 to 229 in befcd74
const OptimizerParameterJSON = ` { "Adadelta": {
sqlflow/pkg/attribute/xgboost_objective_params.go
Lines 18 to 20 in befcd74
const XGBoostObjectiveJSON = ` { "binary:hinge": "hinge loss for binary classification. This makes predictions of 0 or 1, rather than producing probabilities.", -
Exported parameter type definitions, including
sqlflow/pkg/attribute/attribute.go
Lines 32 to 45 in befcd74
var ( // Bool indicates that the corresponding attribute is a boolean Bool = reflect.TypeOf(true) // Int indicates that the corresponding attribute is an integer Int = reflect.TypeOf(0) // Float indicates that the corresponding attribute is a float32 Float = reflect.TypeOf(float32(0.)) // String indicates the corresponding attribute is a string String = reflect.TypeOf("") // IntList indicates the corresponding attribute is a list of integers IntList = reflect.TypeOf([]int{}) // Unknown type indicates that the attribute type is dynamically determined. Unknown = reflect.Type(nil) )
How to remove the exported global map which stores the parameter docs of models
Some of these global variables share the same information. For example:
-
PremadeModelParamsDocsis deserialized fromModelParameterJSON. So we can removeModelParameterJSON. Furthermore,ModelParameterJSONis a variable which can be auto generated bypython extract_docstring.py > model_parameters.gocommand. We can do this auto generation usinggo generateto generatePremadeModelParamsDocsautomatically too. -
OptimizerParameterDocsis deserialized fromOptimizerParameterJSON. So we can removeOptimizerParameterJSON. Furthermore,OptimizerParameterJSONcan also be auto generated bypython extract_docstring.py > model_parameters.gousinggo generate. -
XGBoostObjectiveDocsis deserialized fromXGBoostObjectiveJSON. So we can removeXGBoostObjectiveJSONand keepXGBoostObjectiveDocs.
Note that PremadeModelParamsDocs, OptimizerParameterDocs and XGBoostObjectiveDocs are also used in cli prompt suggestion (see 1, 2, 3). So we cannot remove these 3 variables or hide them.
How to remove exported parameter type definitions
sql-machine-learning/playground#42 (comment) suggests to enhance compile time data checking using the following ways.
var distributedTrainingAttributes = attribute.Dictionary{}.
Int("train.num_ps", 0, "", nil).
Int("train.num_workers", 1, "", nil).
Int("train.worker_cpu", 400, "", nil)In this way, attribute.Description, attribute.Int, attribute.Float, etc can be hidden. The signature of Dictionary.Int would be:
func (d Dictionary) Int(name string, value int, doc string, checker func(int) error)The only concern of this method is that we cannot support nil default value. Some of the models may have attributes with nil default values. For example, the default value of num_class in XGBoost model is nil(see here), because only multi-class (>2) classification models need num_class while the other models do not need num_class. And once num_class is provided in SQL WITH statement, it must be an integer number, so the data type of num_class attribute should be int. The meaning of nil default value in SQLFlow is:
- If the attribute is provided in
WITHstatement, SQLFlow would check whether its type is right and callDescription.Checker()to check whether it is valid. For example, ifnum_classis provided in SQLWITHstatement, SQLFlow would first check whether the value ofnum_classis an integer, and callDescription.Checker()to check whether it is a positive number. - If the attribute is not provided in
WITHstatement, nothing would be checked.
We can enhance this method to support nil default value.
- Add
Defaultmethod to set default value. The code is something like: Redesign attribute.go playground#42 (comment) .
NewDictionary().Int("num_class", "Number of classes").Default(5)Since the Default method can be used both after Int(...) and Float(...), the input parameter type of Default must be interface{}. Therefore, the signature of Default method should be
func (d Dictionary) Default(interface{}) DictionaryIn this way, we cannot check whether the default value is of the right type in compile time.
- Add both
Dictionary.IntandDictionary.IntOrNilmethod. The signature ofDictionary.IntOrNilmethod is like:
func (d Dictionary) IntOrNil(name string, doc string, checker func(int) error).In this way, we would double the APIs to Dictionary.
- Use optional package to make
Dictionary.Intmethod accepts both int default value and nil default value. The signature ofDictionary.Intwould be
func (d Dictionary) Int(name string, defaultValue optional.Int,
doc string, checker func(int) error) {
...
}
var dict = Dictionary{}.
Int("num_class", optional.Int{}, "doc1", checker1). // default value is nil
Int("attr_with_default_value", optional.NewInt(0), "doc2", checker2) // default value is 0