# Using several classifiers and tuning parameters - Parameters grid
[From official `scikit-learn` documentation](http://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html)


Example of usage of the ***model selection*** features of `scikit-learn` and comparison of several classification methods.
1. import a sample dataset 
1. split the dataset into two parts: train and test
    - the *train* part will be used for training and validation (i.e. for *development*)
    - the *test* part will be used for test (i.e. for *evaluation*)
    - the fraction of test data will be _ts_ (a value of your choice between 0.2 and 0.5)
1. the function `GridSearchCV` repeats a cross-validation experiment to train and test a model with different combinations of parameter values
    - for each parameter we set a list of values to test, the function will generate all the combinations
    - we choose a *score function* which will be used for the optimization
        - e.g. `accuracy_score`, `precision_score`, `cohen_kappa_score`, `f1_score`, see this [page](http://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics) for reference
    - the output is a dictionary containing 
        - the set of parameters which maximize the score 
        - the test scores
1. prepare the parameters for the grid
    - it is a list of dictionaries
1. set the parameters by cross validation and the *score functions* to choose from
1. Loop on scores and, for each score, loop on the model labels (see details below)
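
The workflow listed above can be sketched in a few lines. This is a minimal illustration, not the notebook's own code: the iris dataset, the decision tree, the grid values and the names `X_dev`, `X_eval`, `demo_grid` and `search` are all assumptions chosen to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative choices: dataset, estimator and grid values are assumptions
X_demo, y_demo = load_iris(return_X_y=True)
X_dev, X_eval, y_dev, y_eval = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42)

# 3 values x 2 values = 6 parameter combinations
demo_grid = [{'max_depth': [1, 2, 3], 'criterion': ['gini', 'entropy']}]

search = GridSearchCV(DecisionTreeClassifier(random_state=42),
                      demo_grid, cv=5, scoring='accuracy')
search.fit(X_dev, y_dev)  # cross-validation uses the development part only

print(search.best_params_)                 # best combination found
print(len(search.cv_results_['params']))   # number of combinations tried
print(search.score(X_eval, y_eval))        # evaluation on the held-out part
```

The test part is touched only once, in the last line, after the search has selected the parameters on the development part.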


In [1]:
"""
http://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html
@author: scikit-learn.org and Claudio Sartori
"""
import warnings
warnings.filterwarnings('ignore') # suppresses warnings; comment this line out to see them

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
from sklearn.svm import SVC
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

print(__doc__) # print information included in the triple quotes at the beginning

# Loading a standard dataset
dataset = datasets.load_digits()
#dataset = datasets.fetch_olivetti_faces()
#dataset = datasets.fetch_covtype()
#dataset = datasets.load_iris()
#dataset = datasets.load_wine()
#dataset = datasets.load_breast_cancer()


http://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html
@author: scikit-learn.org and Claudio Sartori



### Prepare the environment
The `datasets` module contains, among others, a few sample datasets.

See this [page](http://scikit-learn.org/stable/datasets/index.html) for reference

Prepare the data and the target in X and y. Set `ts`. Set the random state

In [2]:
X = dataset.data
y = dataset.target
ts = 0.3
random_state = 42

Split the dataset into the train and test parts

In [3]:
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=ts, random_state=random_state)
print("Training on %d examples" % len(X_train))

Training on 1257 examples


The code below is intended to ease the remainder of the exercise

In [7]:
model_lbls = [
             'dt', 
             'nb', 
             'lp', 
             'svc', 
            ]

# Set the parameters by cross-validation
tuned_param_dt = [{'max_depth': range(1,20)}]
tuned_param_nb = [{'var_smoothing': [10, 1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-07, 1e-8, 1e-9, 1e-10]}]
tuned_param_lp = [{'early_stopping': [True]}]
tuned_param_svc = [{'kernel': ['rbf'], 
                    'gamma': [1e-3, 1e-4],
                    'C': [1, 10, 100, 1000],
                    },
                    {'kernel': ['linear'],
                     'C': [1, 10, 100, 1000],                     
                    },
                   ]

models = {
    'dt': {'name': 'Decision Tree       ',
           'estimator': DecisionTreeClassifier(), 
           'param': tuned_param_dt,
          },
    'nb': {'name': 'Gaussian Naive Bayes',
           'estimator': GaussianNB(),
           'param': tuned_param_nb
          },
    'lp': {'name': 'Linear Perceptron   ',
           'estimator': Perceptron(),
           'param': tuned_param_lp,
          },
    'svc':{'name': 'Support Vector      ',
           'estimator': SVC(), 
           'param': tuned_param_svc
          }
}

scores = ['precision', 'recall']
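
To see what a list of dictionaries expands to, the grid can be enumerated with `ParameterGrid` from `sklearn.model_selection`. The sketch below copies the structure of `tuned_param_svc`; the names `svc_grid` and `candidates` are introduced here only for illustration.

```python
from sklearn.model_selection import ParameterGrid

# Same structure as tuned_param_svc: a list of two dictionaries
svc_grid = [
    {'kernel': ['rbf'], 'gamma': [1e-3, 1e-4], 'C': [1, 10, 100, 1000]},
    {'kernel': ['linear'], 'C': [1, 10, 100, 1000]},
]

# Each dictionary is expanded separately, then the results are concatenated
candidates = list(ParameterGrid(svc_grid))
print(len(candidates))  # 1*2*4 combinations for rbf plus 1*4 for linear

for params in candidates[:3]:
    print(params)
```

Keeping the two kernels in separate dictionaries avoids testing `gamma` with the linear kernel, where it has no effect.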

### The function below groups all the outputs
Write a function that takes the fitted model as its parameter and uses the model's components to inspect the results of the search over the parameter grid.

The components are:<br>
`model.best_params_`<br>
`model.cv_results_['mean_test_score']`<br>
`model.cv_results_['std_test_score']`<br>
`model.cv_results_['params']`

The classification report is generated by the function imported above from sklearn.metrics, which takes as argument the true and the predicted test labels.

The +/- in the results is obtained by doubling the `std_test_score`

The function will be used to print the results for each set of parameters

In [11]:
def print_results(model):
    print("Best parameters set found on train set:")
    print()
    print(model.best_params_)  # use the function argument, not the global clf
    print()
    print("Grid scores on train set:")
    means = model.cv_results_['mean_test_score']
    stds = model.cv_results_['std_test_score']
    params = model.cv_results_['params']
    for mean, std, params_tuple in zip(means, stds, params):
        print("%0.3f (+/-%0.03f) for %r"
             % (mean, std * 2, params_tuple))
    print()
    print("Detailed classification report for the best parameter set:")
    #print()
    print("The model is trained on the full train set.")
    print("The scores are computed on the full test set.")
    #print()
    y_true, y_pred = y_test, model.predict(X_test)
    print(classification_report(y_true, y_pred))
    print()
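
The mean and standard deviation printed for each parameter set summarize the per-fold scores that `cv_results_` also stores under the keys `split0_test_score`, `split1_test_score`, and so on. The sketch below recomputes them by hand; the iris dataset, the small grid and the names `search`, `res` and `fold_scores` are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X_demo, y_demo = load_iris(return_X_y=True)  # illustrative dataset
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      {'max_depth': [1, 2, 3]}, cv=5)
search.fit(X_demo, y_demo)

res = search.cv_results_
# One row per parameter combination, one column per fold
fold_scores = np.column_stack(
    [res['split%d_test_score' % i] for i in range(5)])

means = fold_scores.mean(axis=1)
stds = fold_scores.std(axis=1)  # population standard deviation (ddof=0)

# Same line format as in print_results
for mean, std, params in zip(means, stds, res['params']):
    print("%0.3f (+/-%0.03f) for %r" % (mean, std * 2, params))
```

The interval of two standard deviations is only a rough indication of the spread across folds, since it is estimated from five values.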

### Loop on scores and, for each score, loop on the model labels
- iterate varying the score function
    1. iterate varying the classification model among Decision Tree, Naive Bayes, Linear Perceptron, Support Vector
        - activate the *grid search*
            1. the resulting model will be the best one according to the current score function
        - print the best parameter set and the results for each set of parameters using the above defined function
        - print the classification report
        - store the `.best_score_` in a dictionary for a final report
    1. print the final report for the current *score function*
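
The loop builds the scoring name by appending `_macro` to each entry of `scores`, so the search optimizes the macro-averaged version of the metric. As a small sketch of what macro averaging means, the example below uses toy labels invented for illustration and compares the macro precision with the per-class values.

```python
import numpy as np
from sklearn.metrics import precision_score

score = 'precision'
scoring_name = '%s_macro' % score  # the string passed to GridSearchCV
print(scoring_name)

# Toy labels, chosen only for illustration
y_true_demo = [0, 0, 1, 1, 2, 2]
y_pred_demo = [0, 1, 1, 1, 2, 0]

per_class = precision_score(y_true_demo, y_pred_demo, average=None)
macro = precision_score(y_true_demo, y_pred_demo, average='macro')

print(per_class)  # one precision value per class
print(macro)      # unweighted mean of the per-class values
```

Because every class counts equally, a class with few examples influences the macro average as much as a frequent one.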

In [12]:
results_short = {}
for score in scores:
    print("# Tuning hyper-parameters for %s" % score)
    print()
    
    #'%s_macro' % score ## is a string formatting expression
    # the parameter after % is substituted in the string placeholder %s
    for m in model_lbls: 
        print('-'*40)
        print("Trying model {}".format(models[m]['name']))
        
        clf = GridSearchCV(models[m]['estimator'], 
                           models[m]['param'], 
                           cv=5,
                           scoring='%s_macro' % score,
                           # the 'iid' argument is omitted: it was removed from recent scikit-learn releases
                           return_train_score = False,
                           n_jobs = 2, # this allows using multi-cores
                           )
        clf.fit(X_train, y_train)
        print_results(clf)
        results_short[m] = clf.best_score_
    print("Summary of results for {}".format(score))
    print("Estimator")
    for m in results_short.keys():
        print("{}\t - score: {:.2%}".format(models[m]['name'], results_short[m]))
    print('='*40)

# Tuning hyper-parameters for precision

----------------------------------------
Trying model Decision Tree       
Best parameters set found on train set:

{'max_depth': 12}

Grid scores on train set:
0.076 (+/-0.007) for {'max_depth': 1}
0.214 (+/-0.027) for {'max_depth': 2}
0.422 (+/-0.040) for {'max_depth': 3}
0.567 (+/-0.047) for {'max_depth': 4}
0.720 (+/-0.036) for {'max_depth': 5}
0.769 (+/-0.050) for {'max_depth': 6}
0.816 (+/-0.049) for {'max_depth': 7}
0.819 (+/-0.045) for {'max_depth': 8}
0.827 (+/-0.033) for {'max_depth': 9}
0.828 (+/-0.051) for {'max_depth': 10}
0.833 (+/-0.035) for {'max_depth': 11}
0.843 (+/-0.043) for {'max_depth': 12}
0.822 (+/-0.043) for {'max_depth': 13}
0.822 (+/-0.043) for {'max_depth': 14}
0.840 (+/-0.026) for {'max_depth': 15}
0.832 (+/-0.038) for {'max_depth': 16}
0.839 (+/-0.032) for {'max_depth': 17}
0.826 (+/-0.032) for {'max_depth': 18}
0.829 (+/-0.026) for {'max_depth': 19}

Detailed classification report for the best parameter set:
The mo

Best parameters set found on train set:

{'var_smoothing': 0.1}

Grid scores on train set:
0.888 (+/-0.011) for {'var_smoothing': 10}
0.905 (+/-0.034) for {'var_smoothing': 1}
0.913 (+/-0.038) for {'var_smoothing': 0.1}
0.912 (+/-0.040) for {'var_smoothing': 0.01}
0.902 (+/-0.048) for {'var_smoothing': 0.001}
0.894 (+/-0.059) for {'var_smoothing': 0.0001}
0.889 (+/-0.052) for {'var_smoothing': 1e-05}
0.881 (+/-0.053) for {'var_smoothing': 1e-06}
0.865 (+/-0.059) for {'var_smoothing': 1e-07}
0.849 (+/-0.061) for {'var_smoothing': 1e-08}
0.831 (+/-0.063) for {'var_smoothing': 1e-09}
0.815 (+/-0.061) for {'var_smoothing': 1e-10}

Detailed classification report for the best parameter set:
The model is trained on the full train set.
The scores are computed on the full test set.
              precision    recall  f1-score   support

           0       1.00      0.98      0.99        53
           1       0.92      0.72      0.81        50
           2       0.88      0.94      0.91        47