## How to tune parameters?

1. Select the most influential parameters.
    - Work out which ones to tune first by understanding what each one influences (the documentation sometimes states this).
2. Tune them manually, or automatically by defining a search space for a library such as hyperopt, scikit-optimize, spearmint, GPyOpt, RoBO, or SMAC3.
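
The idea of "defining a search space" can be sketched without any of the libraries above. The example below is a minimal sketch using plain random search (a swapped-in technique, not what hyperopt or the other tools do internally). The parameter names, ranges, and `toy_validation_loss` are assumptions made up for illustration; in practice the objective would train a model and return its validation loss.

```python
import math
import random

# Hypothetical search space: each entry maps a parameter name to a sampler.
SEARCH_SPACE = {
    "max_depth": lambda rng: rng.randint(3, 12),
    # Log-uniform sampling suits parameters that span orders of magnitude.
    "learning_rate": lambda rng: math.exp(
        rng.uniform(math.log(1e-3), math.log(0.3))
    ),
}


def toy_validation_loss(params):
    """Stand-in for 'train the model and return the validation loss'."""
    depth_term = (params["max_depth"] - 7) ** 2
    rate_term = (math.log10(params["learning_rate"]) + 1) ** 2
    return depth_term + rate_term


def random_search(space, objective, n_trials, seed=0):
    """Sample n_trials points from the space and keep the best one."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {name: sample(rng) for name, sample in space.items()}
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


best_params, best_loss = random_search(
    SEARCH_SPACE, toy_validation_loss, n_trials=200
)
print(best_params, best_loss)
```

The libraries listed above replace the random sampling step with smarter strategies, but the inputs stay the same: a search space and an objective to minimize.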

## Tuning alone doesn't optimize a model

- Feature engineering helps more than hyperparameter tuning, so don't spend too much time on tuning.
- Tuning may still help, but mainly when you have spare computational power or time to let the model train.

## Final models

- Submit 5 models with small deviations from optimal parameters.
- Use different random seeds.
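
The two points above can be sketched as follows. This is a toy illustration, not a real training pipeline: `fit_predict` is a hypothetical stand-in that returns noisy numbers instead of training a model, `best_params` is an assumed tuning result, and averaging the five predictions is one common way to combine them that is assumed here rather than stated in these notes.

```python
import random
import statistics


def fit_predict(params, seed, n_rows=4):
    """Hypothetical stand-in for 'train with params and seed, predict on test'.

    Returns noisy toy predictions so the example runs without a real model.
    """
    rng = random.Random(seed)
    base = 0.5 + 0.01 * (params["max_depth"] - 7)
    return [base + rng.gauss(0, 0.05) for _ in range(n_rows)]


best_params = {"max_depth": 7}  # assumed result of an earlier tuning run

# Five variants: small deviations around the optimum, each with its own seed.
variants = [
    ({**best_params, "max_depth": best_params["max_depth"] + delta}, seed)
    for seed, delta in enumerate([-2, -1, 0, 1, 2])
]

predictions = [fit_predict(params, seed) for params, seed in variants]

# Average the five models row by row to get one blended prediction per row.
blended = [statistics.mean(row) for row in zip(*predictions)]
print(blended)
```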

## Split parameters into 2 groups

#### Red parameter:
- Increasing it **impedes fitting**, so increase it to reduce overfitting.
- Decrease it to let the model fit more easily.

#### Green parameter:
- Increasing it **leads to overfitting**, so increase it if the model underfits.
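
The red/green rule can be written down as a small helper. This is a minimal sketch under stated assumptions: the `nudge` function is hypothetical, and the classification of `max_depth` as green and `min_child_weight` as red uses XGBoost-style names chosen for illustration.

```python
# Assumed classification for illustration, using XGBoost-style names.
GREEN = {"max_depth"}         # increasing it makes overfitting more likely
RED = {"min_child_weight"}    # increasing it impedes fitting


def nudge(params, diagnosis, step=1):
    """Return a copy of params moved one step against the diagnosis.

    diagnosis is either "overfit" or "underfit".
    """
    if diagnosis not in ("overfit", "underfit"):
        raise ValueError("diagnosis must be 'overfit' or 'underfit'")
    # Green parameters go up when underfitting, down when overfitting.
    direction = 1 if diagnosis == "underfit" else -1
    new_params = dict(params)
    for name, value in params.items():
        if name in GREEN:
            new_params[name] = max(1, value + direction * step)
        elif name in RED:
            # Red parameters move the opposite way.
            new_params[name] = max(0, value - direction * step)
    return new_params


print(nudge({"max_depth": 7, "min_child_weight": 5}, "overfit"))
# {'max_depth': 6, 'min_child_weight': 6}
print(nudge({"max_depth": 7, "min_child_weight": 5}, "underfit"))
# {'max_depth': 8, 'min_child_weight': 4}
```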

## 1. Tree-Based Models

### 1.1. Gradient Boosting: GBDT, XGBoost, LightGBM, CatBoost

Regularized Greedy Forest (RGF) is another tree-based method.

XGBoost and LightGBM are the most common, so one needs to know how to tune them.

Most important parameters (XGBoost names):
- `max_depth`: 7 is a reasonable starting value
- `min_child_weight`: good values vary widely, e.g. 0, 5, 15, or 300
- `eta` (learning rate)
- `num_round` (number of boosting rounds)
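
As a rough sketch, the values in the list above can be turned into a small grid of candidate configurations. The `eta` value, the `num_round` value, and the `max_depth` neighbourhood are assumptions chosen for illustration; only `max_depth = 7` and the `min_child_weight` values come from these notes.

```python
from itertools import product

# eta and num_round are hypothetical values; the notes give no numbers for them.
base_params = {"eta": 0.1}
num_round = 300

min_child_weight_grid = [0, 5, 15, 300]  # wide range, as listed in the notes
max_depth_grid = [5, 7, 9]               # assumed neighbourhood around 7

candidates = [
    {**base_params, "max_depth": depth, "min_child_weight": weight}
    for depth, weight in product(max_depth_grid, min_child_weight_grid)
]

print(len(candidates))  # 12
for params in candidates[:3]:
    print(params)

# With XGBoost installed and a DMatrix named dtrain, each candidate could be
# trained with: xgboost.train(params, dtrain, num_boost_round=num_round)
```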

A nice explanation of these parameters is in this video https://www.coursera.org/learn/competitive-data-science/lecture/wzi5a/hyperparameter-tuning-ii

![](images/GB_params.png)

### 1.2. RF, ExtraTrees

ExtraTrees has the same parameters as RFs. Unlike gradient boosting, where each tree is built to correct the previous ones, RFs build each tree independently of the others. That means that for RFs, adding more trees doesn't lead to overfitting.

![](images/RF_params.png)
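
The claim that many independent trees don't hurt can be illustrated with a toy simulation. This is not a random forest: each "tree" below is just an unbiased noisy guess, which is an assumption made to keep the sketch self-contained. It only shows that averaging more independent predictors reduces error rather than increasing it.

```python
import random
import statistics

TRUE_VALUE = 1.0


def ensemble_error(n_trees, n_trials=300, seed=0):
    """Mean absolute error of an average of n_trees independent noisy guesses."""
    rng = random.Random(seed)
    errors = []
    for _ in range(n_trials):
        # Each toy "tree" is unbiased but noisy, and independent of the others.
        preds = [TRUE_VALUE + rng.gauss(0, 1.0) for _ in range(n_trees)]
        errors.append(abs(statistics.mean(preds) - TRUE_VALUE))
    return statistics.mean(errors)


for n in (1, 10, 100):
    print(n, round(ensemble_error(n), 3))
```

The error shrinks as the number of averaged predictors grows, so for an RF the number of trees is mostly limited by training time.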

## 2. NNs: Keras, PyTorch, Lasagne, TF, MxNet

This is only for dense NNs.

Most important parameters:
- Number of neurons per layer: 64 is a reasonable starting point
- Number of layers: 1 or 2
- Static dropconnect (for regularization). TODO: https://www.coursera.org/learn/competitive-data-science/lecture/Hg3xw/hyperparameter-tuning-iii

![](images/NN_params.png)

## 3. Linear Models

### 3.1. SVC/SVR (SVMs) from Scikit-learn

- These use LIBLINEAR and LIBSVM. One needs to compile them for multicore support.

- Logistic and Linear regression

- SGDClassifier/SGDRegressor

![](images/SVM_params.png)

### 3.2. Vowpal Wabbit

- This is used for datasets that don't fit in memory. (FTRL = Follow the Regularized Leader)
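
Learning from data that doesn't fit in memory usually means processing one example at a time. The sketch below shows that idea with plain online SGD for logistic regression. It is a swapped-in illustration, not Vowpal Wabbit's implementation and not FTRL; `stream_examples` is a hypothetical data source that generates toy examples on the fly.

```python
import math
import random


def stream_examples(n, seed=0):
    """Hypothetical data source yielding one (features, label) pair at a time."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = 1 if 2.0 * x[0] - 1.0 * x[1] > 0 else 0
        yield x, y


def online_logistic_sgd(examples, n_features, lr=0.5):
    """One pass of plain SGD for logistic regression, one example at a time."""
    w = [0.0] * n_features
    for x, y in examples:
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))
        # The gradient of the log loss for a single example is (p - y) * x.
        w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x)]
    return w


# Only the weight vector is kept in memory; examples are consumed as a stream.
weights = online_logistic_sgd(stream_examples(5000), n_features=2)
print(weights)
```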



## 4. Factorization Machines (LibFM, LibFFM)

These are not too common so they are not covered.