## Supervised Learning - Part II (Chapter 5)

In Tuesday's class, you had an opportunity to learn a few more classification methods and tools for validating models:

    Cross-Validation
    Learning Curves
    Support Vector Machines
    Random Forest
    
* In the previous lab session, we looked into KNN, performance metrics, the confusion matrix and scikit-plot.

* In this lab session, we will look into the following:

    * Importing and loading the MNIST data and installing the required libraries (in case anyone still has trouble loading them)
    * Scikit-learn methods and useful functions
    * The different SVM kernels, their features and ways to tune the model
    * Similarly, random forest and its model tuning


Please also download the week-05 Jupyter notebook file; it contains explanations that you may find helpful.

### Useful Links: 
#### http://scikit-learn.org/stable/index.html

In [1]:
# """ Decision boundary plotting function from Hands-On Machine Learning with Scikit-Learn
# and TensorFlow """

import numpy as np
import matplotlib.pyplot as plt

def plot_predictions(clf, axes):
    x0s = np.linspace(axes[0], axes[1], 100)
    x1s = np.linspace(axes[2], axes[3], 100)
    x0, x1 = np.meshgrid(x0s, x1s)
    X = np.c_[x0.ravel(), x1.ravel()]
    y_pred = clf.predict(X).reshape(x0.shape)
    y_decision = clf.decision_function(X).reshape(x0.shape)
    plt.contourf(x0, x1, y_pred, cmap=plt.cm.brg, alpha=0.2)
    plt.contourf(x0, x1, y_decision, cmap=plt.cm.brg, alpha=0.1)
    
    
# Function adapted from source: https://jakevdp.github.io/PythonDataScienceHandbook/05.08-random-forests.html

def visualize_classifier(model, X, y, ax=None, cmap='rainbow'):
    ax = ax or plt.gca()
    
    # Plot the training points
    ax.scatter(X[:, 0], X[:, 1], c=y, s=30, cmap=cmap,
               clim=(y.min(), y.max()), zorder=3)
    ax.axis('tight')
    ax.axis('off')
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()
    
    # fit the estimator
    model.fit(X, y)
    xx, yy = np.meshgrid(np.linspace(*xlim, num=200),
                         np.linspace(*ylim, num=200))
    Z = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

    # Create a color plot with the results
    n_classes = len(np.unique(y))
    contours = ax.contourf(xx, yy, Z, alpha=0.3,
                           levels=np.arange(n_classes + 1) - 0.5,
                           cmap=cmap, clim=(y.min(), y.max()),
                           zorder=1)

    ax.set(xlim=xlim, ylim=ylim)

Perform the following steps:
* Import numpy, matplotlib and the sklearn datasets, neighbors, metrics and model_selection modules
* Load the standard cancer dataset using sklearn datasets and assign it to "cancer"
* Look into the various attributes of "cancer"
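
The steps above can be sketched as follows. This is a minimal sketch that assumes "cancer" refers to scikit-learn's built-in breast cancer dataset (`datasets.load_breast_cancer`), which is consistent with the mean radius and mean texture features used later in this lab.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, neighbors, metrics, model_selection

# Assumption: "cancer" is scikit-learn's built-in breast cancer dataset
cancer = datasets.load_breast_cancer()

# The returned object behaves like a dictionary; list what it contains
print(cancer.keys())
print(cancer.data.shape)         # (n_samples, n_features)
print(cancer.feature_names[:5])  # names of the first few features
print(cancer.target_names)       # names of the classes
```

The `DESCR` attribute (`print(cancer.DESCR)`) gives a longer description of the dataset.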

    

The link below documents additional functions and their parameters.

http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html

### Let's have a very quick recap of KNN and model fitting
Perform the actions below:
* From the sklearn model_selection module, import train_test_split and use parameters such as random_state and stratify
* Use the KNN method with 3 neighbors
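
One possible way to carry out this recap is sketched below. It assumes the breast cancer dataset from scikit-learn; the `random_state` value of 55 and the test size of 0.3 are illustrative choices rather than requirements of this step.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

cancer = load_breast_cancer()

# stratify keeps the class proportions the same in both splits;
# random_state makes the split reproducible (55 is an illustrative value)
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target,
    test_size=0.3, random_state=55, stratify=cancer.target)

# KNN with a fixed number of neighbors
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

train_acc = knn.score(X_train, y_train)
test_acc = knn.score(X_test, y_test)
print(f"train accuracy: {train_acc:.3f}, test accuracy: {test_acc:.3f}")
```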


In [None]:



""" No of neighbors can be fixed as shown below, but is not efficient. we do not know the model 
characteristics beforehand, and hence need to evaluate several values """
clf = 

### 1. Compute and visualize the training and testing accuracies when number of neighbors in KNN is changed from 1 to 30

Use the following parameters:
* Test size = 0.3
* To maintain consistency among students, use random states of 55, 60, 65 and 70
* Use a for loop to test KNN classifiers from 1 to 30 neighbors; inside the loop you will have to build the KNN, train the model and compute the score, i.e., knn.score(Xtest, ytest)
* The training and testing accuracy for each number of neighbors (iteration) can be stored and plotted after the loop
* For which choices do you observe large variance, and for which random state does the model converge?
* What is the difference between this plot and the one discussed under learning curves in class?
* Look into the time module in Python; you can record the time taken to train the model in each iteration, which can be used in the sections below
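
The loop described above might look like the sketch below. It assumes the breast cancer dataset and shows a single random state (55); the same code can be repeated for 60, 65 and 70. The names `train_acc`, `test_acc` and `durations` are illustrative.

```python
import time

import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.3, random_state=55)

ks = list(range(1, 31))
train_acc, test_acc, durations = [], [], []

for k in ks:
    start = time.perf_counter()
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    train_acc.append(knn.score(X_train, y_train))
    test_acc.append(knn.score(X_test, y_test))
    # Duration covers fitting and scoring for this value of k
    durations.append(time.perf_counter() - start)

# Accuracy as a function of model complexity (number of neighbors)
plt.plot(ks, train_acc, label="training accuracy")
plt.plot(ks, test_acc, label="testing accuracy")
plt.xlabel("n_neighbors")
plt.ylabel("accuracy")
plt.legend()
plt.show()
```

The `durations` list can be plotted against `ks` in the same way to see how the run time changes with the number of neighbors.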

In [None]:
""" use plot to visualize the duration vs n_neighbors """


## 2. Support Vector Machines
The following reference from Tuesday's class gives a very good overview of SVM classifiers and other advanced methods.
Refer to page 16 of the linked presentation, where SVM classification with the kernel trick is visualized.
https://med.nyu.edu/chibi/sites/default/files/chibi/Final.pdf


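Some of the different kernels used with SVMs are shown below. Scikit-learn's SVC provides the linear, polynomial, Gaussian (rbf) and sigmoid kernels built in; other kernels can be supplied as a custom kernel function.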
* Linear 
* Gaussian 
* Exponential
* Polynomial 
* hybrid
* Sigmoidal

Read the reference below for accessing kernels in scikit-learn:
http://scikit-learn.org/stable/modules/svm.html#svm-kernels

We will use the linear, Gaussian and polynomial kernels for the current dataset, and we could possibly try a different kernel approach during the regression lab session.

Do the following:
- The cancer dataset needs to be loaded for working on the problem below
- Import the necessary svm modules

## 2.1. For simplicity, let's consider only the first two features for classification, i.e., mean radius and mean texture

In [None]:
# Take mean radius and mean texture from the cancer data and assign them to "X"; assign the cancer target to "y"


    - Using the link provided above for reference, build SVMs with different kernel types and use the default C, gamma, degree and coef0 where applicable
    - Note: the relevant parameters differ for each SVM kernel type
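
As a rough sketch of how the kernels differ in the parameters they use, the block below fits one SVC per kernel on the first two features. It assumes the breast cancer dataset. Standardizing the features with `StandardScaler` is an addition made here and is not part of the lab instructions; SVMs are sensitive to feature scale, and scaling keeps training fast.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

cancer = load_breast_cancer()
X = cancer.data[:, :2]   # mean radius, mean texture
y = cancer.target

# Each kernel reads a different subset of the SVC parameters:
#   linear: C
#   rbf:    C, gamma
#   poly:   C, gamma, degree, coef0
# The values below are the scikit-learn defaults, written out explicitly.
models = {
    "linear": SVC(kernel="linear", C=1.0),
    "rbf": SVC(kernel="rbf", C=1.0, gamma="scale"),
    "poly": SVC(kernel="poly", C=1.0, gamma="scale", degree=3, coef0=0.0),
}

scores = {}
for name, svc in models.items():
    # Assumption: scale the features before the SVC (not in the lab text)
    model = make_pipeline(StandardScaler(), svc)
    model.fit(X, y)
    scores[name] = model.score(X, y)  # accuracy on the data it was fit on
    print(f"{name:>6}: training accuracy = {scores[name]:.3f}")
```

Changing `gamma` on the linear model has no effect, because the linear kernel does not use it.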

### A.) Apply "linear Kernel" and observe what happens with different values of gamma and C 

In [None]:
# Using SVC, fit the complete cancer data (X) and target (y)
svc = 

### Once we have fit the SVC with the linear kernel, use the plot_predictions function to visualize the classifier and data

In [None]:
""" Use the function provided in the early cell plot_prediction to visualize """
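# plot_predictions needs the fitted classifier and the plot limits [x0_min, x0_max, x1_min, x1_max]
plot_predictions(svc, [X[:, 0].min(), X[:, 0].max(), X[:, 1].min(), X[:, 1].max()])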


#### Observe what happens when you fine-tune the values of C and gamma.

    - Similar to the linear kernel, train the SVC with the rbf kernel and visualize. Observe the pattern for different values of C and gamma

### B.) Apply "rbf Kernel" and observe what happens with different values of gamma and C 

### Once we have fit the SVC with the rbf kernel, use the plot_predictions function to visualize how the SVC works


In [None]:
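# Pass the classifier fit with the rbf kernel (assumed here to be named svc) and the plot limits
plot_predictions(svc, [X[:, 0].min(), X[:, 0].max(), X[:, 1].min(), X[:, 1].max()])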


    - Similar to the linear kernel, train the SVC with the poly kernel and visualize.
    - Do you still need gamma?
    - Check which parameters are useful for this kernel; please look into the link or the scikit-learn help
    - Try different degrees, from 1 to 5
    - There may be cases where the Jupyter notebook takes an impractically long time to train. In that case, you may need to interrupt the Jupyter kernel and restart

### C.) Apply "polynomial Kernel" and observe what happens with different values of degree, C and coef0

#### Try for degree  = 1, 3, 5 and 10
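
A hedged sketch of such a degree sweep, with fit times recorded, is shown below. It assumes the breast cancer dataset and the first two features. The feature scaling, `coef0=1.0` and the `max_iter` cap are choices made here for illustration; the cap bounds the run time, and if it is reached scikit-learn issues a convergence warning and the score for that degree should not be trusted.

```python
import time

from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

cancer = load_breast_cancer()
X, y = cancer.data[:, :2], cancer.target

results = {}
for degree in (1, 3, 5, 10):
    # max_iter caps the solver so a hard setting cannot run indefinitely
    # (100_000 is an assumed, illustrative value)
    model = make_pipeline(
        StandardScaler(),
        SVC(kernel="poly", degree=degree, coef0=1.0, C=1.0, max_iter=100_000))
    start = time.perf_counter()
    model.fit(X, y)
    elapsed = time.perf_counter() - start
    accuracy = model.score(X, y)  # accuracy on the data it was fit on
    results[degree] = (accuracy, elapsed)
    print(f"degree={degree:>2}: training accuracy={accuracy:.3f}, fit time={elapsed:.3f}s")
```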

In [None]:
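# Pass the classifier fit with the poly kernel (assumed here to be named svc) and the plot limits
plot_predictions(svc, [X[:, 0].min(), X[:, 0].max(), X[:, 1].min(), X[:, 1].max()])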


### Observations:
1. What do you observe as you increase the degree of the polynomial (with respect to the plot and the computation time)?
2. Compare what happens when degree = 1 with the linear kernel.
3. In this case, how do you know that the polynomial degree you have reached is optimal?
4. What do you think are the pros and cons of SVM?


### Solutions



## 2.2. Now that you have been introduced to fine-tuning of parameters, apply SVC to the complete dataset and observe the score.
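
One way to approach this is sketched below: a small cross-validated grid search over `C` and `gamma` for an rbf SVC on all 30 features. It assumes the breast cancer dataset. The grid values, the 70/30 split, `random_state=5` and the use of `StandardScaler` are illustrative assumptions, not values given in this lab.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target,
    test_size=0.3, random_state=5, stratify=cancer.target)

# Inside the pipeline, the scaler is fit on the training folds only
pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# make_pipeline names the SVC step "svc", hence the "svc__" prefix
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.01, 0.1]}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

test_score = search.score(X_test, y_test)
print("best parameters:", search.best_params_)
print(f"cross-validated accuracy: {search.best_score_:.3f}")
print(f"test accuracy: {test_score:.3f}")
```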

# Decision Tree code from class

from sklearn import tree

dt = tree.DecisionTreeClassifier(max_depth=2)
x_train, x_test, y_train, y_test = train_test_split(cancer.data, cancer.target, test_size=0.3, random_state = 5)
dt.fit(x_train, y_train)

import graphviz 
dot_data = tree.export_graphviz(dt, out_file=None, 
                         feature_names=cancer.feature_names,  
                         class_names=cancer.target_names,  
                         filled=True, rounded=True,  
                         special_characters=True)  
graph = graphviz.Source(dot_data)  
graph

# Random Forest

http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html

## 1. Develop an RF classifier and fine-tune the parameters to optimize the result.

- Import the necessary modules for random forest classifiers
- Use random state = 5
- Use test size = 40%
- Check the score
- Check the classification report
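
These steps can be sketched as follows, assuming the breast cancer dataset. Here `random_state=5` is applied to both the split and the classifier, and `n_estimators=100` and `max_depth=None` are placeholder values to tune.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.4, random_state=5)

# n_estimators and max_depth are typical parameters to tune;
# the values here are placeholders
rf = RandomForestClassifier(n_estimators=100, max_depth=None, random_state=5)
rf.fit(X_train, y_train)

rf_score = rf.score(X_test, y_test)
print(f"test accuracy: {rf_score:.3f}")
print(classification_report(y_test, rf.predict(X_test),
                            target_names=cancer.target_names))
```

After fitting, `rf.feature_importances_` holds one importance value per feature, which can help when choosing which features to keep.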

### Discussion about visualizing a tree from a random forest

https://towardsdatascience.com/how-to-visualize-a-decision-tree-from-a-random-forest-in-python-using-scikit-learn-38ad2d75f21c

http://scikit-learn.org/stable/modules/generated/sklearn.tree.export_graphviz.html

In [None]:
# We can modify the Decision Tree visualization code for Random Forest
import graphviz
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import export_graphviz

RFmodel = RandomForestClassifier(n_estimators=50)

X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target, test_size=0.3, random_state=3)
RFmodel.fit(X_train, y_train)

# Pick one tree of the forest; out_file=None returns the DOT source as a string
visual_tree = RFmodel.estimators_[4]
dot_data = export_graphviz(visual_tree, out_file=None, feature_names=cancer.feature_names,
                           precision=2, filled=True, rounded=True, max_depth=None)
graph = graphviz.Source(dot_data)
graph

## 2. Fine-tuning parameters for RF

    - Let's visualize the RF-based classifiers and the dataset as shown in the previous section about SVM
    - Make sure you are training the model with only two features (which two is up to you).
    - Based on feature importances, I found that columns 3 and 8 are crucial for proper classification
    - Using the function visualize_classifier(), visualize the RF and scatter plots of the data
    
    ex:
    visualize_classifier(classifier, iris.data[:, :2], iris.target)
    

In [None]:
# develop the random forest model with 50 estimators


In [None]:
# develop the random forest model with 500 estimators



### Do you observe any overfitting or underfitting as the parameters change? If yes, make a note of it.
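
One way to check for overfitting numerically is to compare training and testing accuracy for each setting, as in the sketch below. It assumes the breast cancer dataset and treats "columns 3 and 8" as zero-based column indices. The `max_depth` values, the 70/30 split and `random_state=3` are illustrative choices.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

cancer = load_breast_cancer()

# Assumption: "column 3 and 8" are zero-based column indices
X = cancer.data[:, [3, 8]]
y = cancer.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=3)

results = {}
for n_estimators in (50, 500):
    for max_depth in (2, None):  # shallow trees vs fully grown trees
        rf = RandomForestClassifier(n_estimators=n_estimators,
                                    max_depth=max_depth, random_state=3)
        rf.fit(X_train, y_train)
        train_acc = rf.score(X_train, y_train)
        test_acc = rf.score(X_test, y_test)
        results[(n_estimators, max_depth)] = (train_acc, test_acc)
        print(f"n_estimators={n_estimators:>3}, max_depth={max_depth}: "
              f"train={train_acc:.3f}, test={test_acc:.3f}")
```

A training accuracy far above the testing accuracy suggests overfitting, while low accuracy on both suggests underfitting.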