
# Using several classifiers and tuning parameters - Parameters grid
[From official `scikit-learn` documentation](http://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html)

Adapted by Claudio Sartori

Example of usage of the ***model selection*** features of `scikit-learn` and comparison of several classification methods.
1. import a sample dataset 
1. split the dataset into two parts: train and test
    - the *train* part will be used for training and validation (i.e. for *development*)
    - the *test* part will be used for test (i.e. for *evaluation*)
    - the fraction of test data will be _ts_ (a value of your choice between 0.2 and 0.5)
1. the function `GridSearchCV` runs a cross-validation experiment to train and test a model for each combination of parameter values
    - for each parameter we set a list of values to test, the function will generate all the combinations
    - we choose a *score function* which will be used for the optimization
        - e.g. `accuracy_score`, `precision_score`, `cohen_kappa_score`, `f1_score`, see this [page](http://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics) for reference
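    - the fitted `GridSearchCV` object exposes, among other attributes,
        - `best_params_`: the set of parameters which maximizes the score
        - `cv_results_`: a dictionary containing the test scores for each combination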
1. prepare the parameters for the grid
    - it is a list of dictionaries
1. set the parameters by cross validation and the *score functions* to choose from
1. Loop on scores and, for each score, loop on the model labels (see details below)
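
The steps above can be sketched on a small scale as follows. This is a minimal illustration, not the notebook's own code: the grid values, the `accuracy` score, the 0.3 test fraction and the random seeds are assumptions chosen for the example. `ParameterGrid` is used only to show how many combinations a list of dictionaries expands to.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, ParameterGrid, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Illustrative grid (assumed values): a list of dictionaries,
# each key is a parameter name and each value a list of candidates
param_grid = [{'max_depth': [2, 4, 6], 'criterion': ['gini', 'entropy']}]

# ParameterGrid enumerates the combinations that the grid search will try
n_combinations = len(ParameterGrid(param_grid))
print(n_combinations)  # 3 values x 2 values = 6 combinations

# Split into development (train) and evaluation (test) parts
X, y = datasets.load_wine(return_X_y=True)
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=0.3, random_state=0)

# Cross-validated search over the grid, optimizing the chosen score
gscv = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid,
                    scoring='accuracy', cv=5)
gscv.fit(Xtrain, ytrain)

print(gscv.best_params_)                    # best combination found
print(gscv.cv_results_['mean_test_score'])  # one mean score per combination
```

After fitting, `gscv` can be used directly for prediction, because by default the best combination is refitted on the whole development set.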

In [1]:
"""
http://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html
@author: scikit-learn.org and Claudio Sartori
"""
import warnings
warnings.filterwarnings('ignore') # suppress warnings; comment out this line to show them

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
from sklearn.svm import SVC
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
import numpy as np
print(__doc__) # print information included in the triple quotes at the beginning

# Loading a standard dataset
#dataset = datasets.load_digits()
# dataset = datasets.fetch_olivetti_faces() # 40 classes!
# dataset = datasets.fetch_covtype()        # 581012 examples -- 54 features
# dataset = datasets.load_iris()    # 150 examples -- 4 features -- 3 classes
dataset = datasets.load_wine()      # 178 examples -- 13 features -- 3 classes
# dataset = datasets.load_breast_cancer() # binary


http://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_digits.html
@author: scikit-learn.org and Claudio Sartori



### Prepare the environment
The `datasets` module contains, among others, a few sample datasets.

See this [page](http://scikit-learn.org/stable/datasets/index.html) for reference

Prepare the data and the target in X and y. Set `ts`. Set the random state

In [18]:
X = dataset.data
y = dataset.target

Split the dataset into the train and test parts

In [16]:
ts = 0.25  # fraction of test data (same value as the train_test_split default)
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=ts, random_state=10)
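
As a quick check of the split, the sizes of the two parts can be inspected. The sketch below assumes the wine dataset and a test fraction of 0.25, which is the value `train_test_split` uses when `test_size` is not given; the variable names mirror the ones used in this notebook.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

X, y = datasets.load_wine(return_X_y=True)

ts = 0.25  # assumed test fraction
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, test_size=ts, random_state=10)

# Number of examples in the development and evaluation parts
n_train, n_test = Xtrain.shape[0], Xtest.shape[0]
print("train: %d examples, test: %d examples" % (n_train, n_test))
print("actual test fraction: %0.3f" % (n_test / (n_train + n_test)))
```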

The code below defines the model labels, the parameter grids and the models dictionary used in the remainder of the exercise

In [17]:
model_lbls = [
             'dt', 
             'nb', 
#              'lp', 
#              'svc', 
#              'knn',
            ]

# Set the parameters to be explored by the grid for each classifier
tuned_param_dt = [{'max_depth': list(range(1,20))}]
tuned_param_nb = [{'var_smoothing': [10, 1, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-07, 1e-8, 1e-9, 1e-10]}]
tuned_param_lp = [{'early_stopping': [True]}]
tuned_param_svc = [{'kernel': ['rbf'], 
                    'gamma': [1e-3, 1e-4],
                    'C': [1, 10, 100, 1000],
                    },
                    {'kernel': ['linear'],
                     'C': [1, 10, 100, 1000],                     
                    },
                   ]
tuned_param_knn =[{'n_neighbors': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}]

# set the models to be fitted specifying name, estimator and parameter structure
models = {
    'dt': {'name': 'Decision Tree       ',
           'estimator': DecisionTreeClassifier(), 
           'param': tuned_param_dt,
          },
    'nb': {'name': 'Gaussian Naive Bayes',
           'estimator': GaussianNB(),
           'param': tuned_param_nb
          },
    'lp': {'name': 'Linear Perceptron   ',
           'estimator': Perceptron(),
           'param': tuned_param_lp,
          },
    'svc':{'name': 'Support Vector      ',
           'estimator': SVC(), 
           'param': tuned_param_svc
          },
    'knn':{'name': 'K Nearest Neighbor  ',
           'estimator': KNeighborsClassifier(),
           'param': tuned_param_knn
          },
}

# scores to be explored
scores = [
          'precision', 
#           'recall',
         ]
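
The score names listed here are combined with the suffix `_macro` when the grid search is created, so `precision` becomes the scoring string `precision_macro`. The sketch below, on made-up labels, illustrates what macro averaging means: the unweighted mean of the per-class values. The label vectors are assumptions for illustration only.

```python
from sklearn.metrics import precision_score

# Made-up true and predicted labels for a 3-class problem
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

# Precision of each class separately: 1/2 for class 0, 2/3 for class 1, 1/1 for class 2
per_class = precision_score(y_true, y_pred, average=None)

# Macro average: plain mean of the per-class precisions, ignoring class sizes
macro = precision_score(y_true, y_pred, average='macro')

print(per_class)
print(macro)
```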

### The function below groups all the outputs
Write a function that takes the fitted model as its parameter and uses the components of the fitted model to inspect the results of the search with the parameters grid.

The components are:<br>
`model.best_params_`<br>
`model.cv_results_['mean_test_score']`<br>
`model.cv_results_['std_test_score']`<br>
`model.cv_results_['params']`

The classification report is generated by the function `classification_report`, imported above from `sklearn.metrics`, which takes as arguments the true and the predicted test labels.

The +/- in the results is obtained by doubling the `std_test_score`. The mean and the standard deviation of the test scores are computed over the cross-validation folds.

The function will be used to print the results for each set of parameters
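
To make the +/- figure concrete, the sketch below reproduces the computation for a single parameter combination. The fold scores are hypothetical, and the population standard deviation is assumed, which matches NumPy's default.

```python
import statistics

# Hypothetical scores of one parameter combination on 5 cross-validation folds
fold_scores = [0.90, 0.95, 1.00, 0.85, 0.90]

mean = statistics.mean(fold_scores)
std = statistics.pstdev(fold_scores)  # population standard deviation (assumed)

# Same format string used for the grid scores: mean and twice the standard deviation
line = "%0.3f (+/-%0.03f) for %r" % (mean, std * 2, {'max_depth': 3})
print(line)  # 0.920 (+/-0.102) for {'max_depth': 3}
```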

In [58]:
def showResult(gscv):
    print("Grid scores on development set:")
    print()
    print("Best parameters set found on development set:")
    print()

    print(gscv.best_params_)
    print()

    print("Grid scores on development set:")
    print()


    means = gscv.cv_results_["mean_test_score"]
    stds = gscv.cv_results_["std_test_score"]
    for mean, std, params in zip(means, stds, gscv.cv_results_["params"]):
        print("%0.3f (+/-%0.03f) for %r" % (mean, std * 2, params))
    print()

    print("Detailed classification report:")
    print()
    print("The model is trained on the full development set.")
    print("The scores are computed on the full evaluation set.")
    print()
    y_true, y_pred = ytest, gscv.predict(Xtest)
    print(classification_report(y_true, y_pred))
    print()


### Loop on scores and, for each score, loop on the model labels
- iterate varying the score function
    1. iterate varying the classification model among Decision Tree, Naive Bayes, Linear Perceptron, Support Vector, K Nearest Neighbor
        - activate the *grid search*
            1. the resulting model will be the best one according to the current score function
        - print the best parameter set and the results for each set of parameters using the above defined function
        - print the classification report
        - store the `.best_score_` in a dictionary for a final report
    1. print the final report for the current *score function*
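
One possible way to collect the best scores and print a final report is sketched below. It is not the notebook's own solution: the reduced `sketch_models` dictionary, the grid values and the report wording are assumptions made for the example.

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = datasets.load_wine(return_X_y=True)
Xtrain, Xtest, ytrain, ytest = train_test_split(X, y, random_state=10)

# Reduced, illustrative version of the models dictionary (assumed grids)
sketch_models = {
    'dt': {'name': 'Decision Tree       ',
           'estimator': DecisionTreeClassifier(random_state=0),
           'param': [{'max_depth': [2, 4, 6]}]},
    'nb': {'name': 'Gaussian Naive Bayes',
           'estimator': GaussianNB(),
           'param': [{'var_smoothing': [1e-9, 1e-6, 1e-3]}]},
}

score = 'precision'
best_scores = {}  # model label -> best mean cross-validated score

for label, spec in sketch_models.items():
    gscv = GridSearchCV(spec['estimator'], spec['param'],
                        scoring='%s_macro' % score)
    gscv.fit(Xtrain, ytrain)
    best_scores[label] = gscv.best_score_

# Final report for the current score function
print("Summary of results for %s" % score)
for label, best in best_scores.items():
    print("%s best score %0.3f" % (sketch_models[label]['name'], best))
```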

In [59]:
for score in scores:
    print("# Tuning hyper-parameters for %s" % score)
    print()

    # loop only on the models selected in model_lbls
    for modelid in model_lbls:
        modeldict = models[modelid]
        gscv = GridSearchCV(modeldict['estimator'], modeldict['param'], scoring="%s_macro" % score)
        gscv.fit(Xtrain, ytrain)
        showResult(gscv)



# Tuning hyper-parameters for precision

Grid scores on development set:

Best parameters set found on development set:

{'max_depth': 6}

Grid scores on development set:

0.444 (+/-0.073) for {'max_depth': 1}
0.916 (+/-0.093) for {'max_depth': 2}
0.925 (+/-0.114) for {'max_depth': 3}
0.899 (+/-0.108) for {'max_depth': 4}
0.932 (+/-0.096) for {'max_depth': 5}
0.947 (+/-0.106) for {'max_depth': 6}
0.921 (+/-0.084) for {'max_depth': 7}
0.929 (+/-0.102) for {'max_depth': 8}
0.924 (+/-0.097) for {'max_depth': 9}
0.930 (+/-0.090) for {'max_depth': 10}
0.917 (+/-0.098) for {'max_depth': 11}
0.936 (+/-0.083) for {'max_depth': 12}
0.930 (+/-0.103) for {'max_depth': 13}
0.914 (+/-0.119) for {'max_depth': 14}
0.936 (+/-0.080) for {'max_depth': 15}
0.929 (+/-0.100) for {'max_depth': 16}
0.917 (+/-0.115) for {'max_depth': 17}
0.927 (+/-0.117) for {'max_depth': 18}
0.891 (+/-0.193) for {'max_depth': 19}

Detailed classification report:

The model is trained on the full development set.
The scores a