<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Classifier-Visualization-Playground" data-toc-modified-id="Classifier-Visualization-Playground-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Classifier Visualization Playground</a></span></li><li><span><a href="#plot_mushroom_boundary()" data-toc-modified-id="plot_mushroom_boundary()-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>plot_mushroom_boundary()</a></span></li><li><span><a href="#LogisticRegression" data-toc-modified-id="LogisticRegression-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>LogisticRegression</a></span></li><li><span><a href="#KNeighborsClassifier" data-toc-modified-id="KNeighborsClassifier-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>KNeighborsClassifier</a></span></li><li><span><a href="#DecisionTreeClassifier" data-toc-modified-id="DecisionTreeClassifier-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>DecisionTreeClassifier</a></span><ul class="toc-item"><li><span><a href="#max_depth=3" data-toc-modified-id="max_depth=3-5.1"><span class="toc-item-num">5.1&nbsp;&nbsp;</span>max_depth=3</a></span></li><li><span><a href="#Maximum-Depth-[Overfitted]" data-toc-modified-id="Maximum-Depth-[Overfitted]-5.2"><span class="toc-item-num">5.2&nbsp;&nbsp;</span>Maximum Depth [Overfitted]</a></span></li></ul></li><li><span><a href="#RandomForestClassifier" data-toc-modified-id="RandomForestClassifier-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>RandomForestClassifier</a></span></li><li><span><a href="#SVC" data-toc-modified-id="SVC-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>SVC</a></span><ul class="toc-item"><li><span><a href="#Linear-SVC" data-toc-modified-id="Linear-SVC-7.1"><span class="toc-item-num">7.1&nbsp;&nbsp;</span>Linear SVC</a></span></li><li><span><a href="#RBF(Radial-Basis-Function)-kernel-SVC" data-toc-modified-id="RBF(Radial-Basis-Function)-kernel-SVC-7.2"><span class="toc-item-num">7.2&nbsp;&nbsp;</span>RBF(Radial Basis Function) kernel SVC</a></span><ul 
class="toc-item"><li><span><a href="#RBF-SVC-[C-=-1]" data-toc-modified-id="RBF-SVC-[C-=-1]-7.2.1"><span class="toc-item-num">7.2.1&nbsp;&nbsp;</span>RBF SVC [C = 1]</a></span></li><li><span><a href="#RBF-SVC-[C-=-10]" data-toc-modified-id="RBF-SVC-[C-=-10]-7.2.2"><span class="toc-item-num">7.2.2&nbsp;&nbsp;</span>RBF SVC [C = 10]</a></span></li></ul></li></ul></li><li><span><a href="#GaussianNB-(Naive-Bayes)" data-toc-modified-id="GaussianNB-(Naive-Bayes)-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>GaussianNB (Naive Bayes)</a></span></li><li><span><a href="#MLPClassifier" data-toc-modified-id="MLPClassifier-9"><span class="toc-item-num">9&nbsp;&nbsp;</span>MLPClassifier</a></span></li></ul></div>

---

_You are currently looking at **version 1.0** of this notebook. To download notebooks and datafiles, as well as get help on Jupyter notebooks in the Coursera platform, visit the [Jupyter Notebook FAQ](https://www.coursera.org/learn/python-machine-learning/resources/bANLa) course resource._

---

# Classifier Visualization Playground

The purpose of this notebook is to let you visualize the decision boundaries of various classifiers.

The data used in this notebook is based on the [UCI Mushroom Data Set](http://archive.ics.uci.edu/ml/datasets/Mushroom?ref=datanews.io) stored in `mushrooms.csv`. 

To better visualize the decision boundaries, we'll perform Principal Component Analysis (PCA) on the data to reduce it to two dimensions. Dimensionality reduction will be covered in a later module of this course.
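
As a rough illustration of what this projection does, here is a minimal NumPy sketch of PCA via the singular value decomposition. The helper `pca_project` and the random binary matrix `X_toy` are made up for this example and only stand in for the one-hot encoded mushroom features; the notebook itself relies on `sklearn.decomposition.PCA`, whose output should agree with this sketch up to the sign of each component.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project the rows of X onto the top principal components (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)  # PCA works on mean-centered data
    # Rows of Vt are the principal directions, ordered by decreasing singular value
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    projected = X_centered @ components.T
    # Share of the total variance captured by each kept component
    explained_variance_ratio = (S ** 2)[:n_components] / np.sum(S ** 2)
    return projected, explained_variance_ratio

rng = np.random.default_rng(0)
# Toy stand-in for a one-hot encoded table: 200 rows, 10 binary columns
X_toy = rng.integers(0, 2, size=(200, 10))
projected, ratio = pca_project(X_toy)
print(projected.shape)
print(ratio)
```

The second printed value shows how much of the variance each of the two axes keeps, which is a useful reminder that a 2D plot only shows part of the structure in the original features.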

Play around with different models and parameters to see how they affect the classifier's decision boundary and accuracy!

In [1]:
%matplotlib notebook

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split

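# Load the data and one-hot encode every categorical column
df = pd.read_csv('mushrooms.csv')
df2 = pd.get_dummies(df)

# Keep a random 8% sample of the rows
df3 = df2.sample(frac=0.08)

# The first two columns hold the encoded class; the plots treat y == 1 as poisonous
X = df3.iloc[:,2:]
y = df3.iloc[:,1]

# Project the features onto the first two principal components
pca = PCA(n_components=2).fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(pca, y, random_state=0)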


plt.figure(dpi=120)
plt.scatter(pca[y.values==0,0], pca[y.values==0,1], alpha=0.5, label='Edible', s=2)
plt.scatter(pca[y.values==1,0], pca[y.values==1,1], alpha=0.5, label='Poisonous', s=2)
plt.legend()
plt.title('Mushroom Data Set\nFirst Two Principal Components')
plt.xlabel('PC1')
plt.ylabel('PC2')
plt.gca().set_aspect('equal')


# plot_mushroom_boundary()

In [2]:
def plot_mushroom_boundary(X, y, fitted_model):

    plt.figure(figsize=(9.8,5), dpi=100)
    
    for i, plot_type in enumerate(['Decision Boundary', 'Decision Probabilities']):
        plt.subplot(1,2,i+1)

        mesh_step_size = 0.01  # step size in the mesh
        x_min, x_max = X[:, 0].min() - .1, X[:, 0].max() + .1
        y_min, y_max = X[:, 1].min() - .1, X[:, 1].max() + .1
        xx, yy = np.meshgrid(np.arange(x_min, x_max, mesh_step_size), np.arange(y_min, y_max, mesh_step_size))
        if i == 0:
            Z = fitted_model.predict(np.c_[xx.ravel(), yy.ravel()])
        else:
            try:
                Z = fitted_model.predict_proba(np.c_[xx.ravel(), yy.ravel()])[:,1]
            except AttributeError:  # model has no predict_proba (e.g. SVC without probability=True)
                plt.text(0.4, 0.5, 'Probabilities Unavailable', horizontalalignment='center',
                     verticalalignment='center', transform = plt.gca().transAxes, fontsize=12)
                plt.axis('off')
                break
        Z = Z.reshape(xx.shape)
        plt.scatter(X[y.values==0,0], X[y.values==0,1], alpha=0.4, label='Edible', s=5)
        plt.scatter(X[y.values==1,0], X[y.values==1,1], alpha=0.4, label='Poisonous', s=5)
        plt.imshow(Z, interpolation='nearest', cmap='RdYlBu_r', alpha=0.15, 
                   extent=(x_min, x_max, y_min, y_max), origin='lower')
        plt.title(plot_type + '\n' + 
                  str(fitted_model).split('(')[0]+ ' Test Accuracy: ' + str(np.round(fitted_model.score(X, y), 5)))
        plt.gca().set_aspect('equal');
        
    plt.tight_layout()
    plt.subplots_adjust(top=0.9, bottom=0.08, wspace=0.02)
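
The grid evaluation inside `plot_mushroom_boundary()` can be easier to follow on a tiny example. The sketch below builds a 3 x 4 grid with `np.meshgrid`, flattens it into one `(x, y)` row per grid point, applies a stand-in rule where a fitted model's `predict` would go, and reshapes the result back to the grid. The rule `x > y` is an assumption chosen only so the output is easy to check by eye.

```python
import numpy as np

# Tiny 3 x 4 grid so every intermediate array can be inspected by eye
xs = np.arange(0.0, 4.0, 1.0)  # 4 x-coordinates
ys = np.arange(0.0, 3.0, 1.0)  # 3 y-coordinates
xx, yy = np.meshgrid(xs, ys)   # both have shape (3, 4): rows follow y, columns follow x

# One (x, y) row per grid point, the layout that predict() expects
grid_points = np.c_[xx.ravel(), yy.ravel()]  # shape (12, 2)

# Stand-in for fitted_model.predict: label 1 when x > y, else 0 (assumed rule)
labels = (grid_points[:, 0] > grid_points[:, 1]).astype(int)

# Back to the grid shape so Z[row, col] corresponds to (xx[row, col], yy[row, col])
Z = labels.reshape(xx.shape)
print(Z)
```

Because row 0 of `Z` holds the smallest y value, the plotting function passes `origin='lower'` to `plt.imshow` so that this row is drawn at the bottom of the image.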

# LogisticRegression

In [3]:
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(X_train,y_train)

plot_mushroom_boundary(X_test, y_test, model)


# KNeighborsClassifier

In [4]:
from sklearn.neighbors import KNeighborsClassifier

model = KNeighborsClassifier(n_neighbors=20)
model.fit(X_train,y_train)

plot_mushroom_boundary(X_test, y_test, model)

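To get a feel for how `n_neighbors` changes the classifier without relying on the plot, the following sketch compares training and test accuracy for a few values of k. It uses a synthetic two-class data set from `make_moons` as a stand-in for the PCA-reduced mushroom data, so the data, the variable names, and the chosen values of k are all assumptions for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic 2D stand-in for the PCA-reduced mushroom data (assumption)
X_demo, y_demo = make_moons(n_samples=400, noise=0.3, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, random_state=0)

knn_scores = {}
for k in (1, 5, 20, 100):
    knn = KNeighborsClassifier(n_neighbors=k).fit(Xtr, ytr)
    # Store (training accuracy, test accuracy) for each k
    knn_scores[k] = (knn.score(Xtr, ytr), knn.score(Xte, yte))
    print(k, knn_scores[k])
```

With k = 1 every training point is its own nearest neighbor, so the training accuracy is perfect; larger values of k average over more neighbors and give smoother boundaries.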

# DecisionTreeClassifier
## max_depth=3

In [5]:
from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier(max_depth=3)
model.fit(X_train,y_train)

plot_mushroom_boundary(X_test, y_test, model)


## Maximum Depth [Overfitted]

In [6]:
from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier()
model.fit(X_train,y_train)

plot_mushroom_boundary(X_test, y_test, model)

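One way to check the overfitting claim numerically is to compare training and test accuracy for the depth-limited tree and the unrestricted tree. The sketch below does this on a synthetic `make_moons` data set, used here as a stand-in for the mushroom data; the data set, variable names, and `random_state` values are assumptions for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic noisy 2D data as a stand-in for the PCA-reduced mushroom data (assumption)
X_demo, y_demo = make_moons(n_samples=400, noise=0.3, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, random_state=0)

shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(Xtr, ytr)
full = DecisionTreeClassifier(random_state=0).fit(Xtr, ytr)  # grows until leaves are pure

for name, tree in (('max_depth=3', shallow), ('unrestricted', full)):
    print(name,
          'depth:', tree.get_depth(),
          'train:', tree.score(Xtr, ytr),
          'test:', tree.score(Xte, yte))
```

A training accuracy that is clearly higher than the test accuracy is the usual sign that the tree has memorized noise in the training set.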

# RandomForestClassifier

In [7]:
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier()
model.fit(X_train,y_train)

plot_mushroom_boundary(X_test, y_test, model)


# SVC
## Linear SVC

In [8]:
from sklearn.svm import SVC

model = SVC(kernel='linear')
model.fit(X_train,y_train)

plot_mushroom_boundary(X_test, y_test, model)

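The right-hand panel shows "Probabilities Unavailable" for this model because `SVC` only exposes `predict_proba` when it is constructed with `probability=True`. The sketch below illustrates the difference on a synthetic `make_moons` data set; the data and variable names are assumptions for illustration, and enabling probabilities makes fitting slower because scikit-learn runs an internal cross-validation to calibrate them.

```python
from sklearn.datasets import make_moons
from sklearn.svm import SVC

# Synthetic 2D stand-in for the PCA-reduced mushroom data (assumption)
X_demo, y_demo = make_moons(n_samples=200, noise=0.3, random_state=0)

plain_svc = SVC(kernel='linear').fit(X_demo, y_demo)
proba_svc = SVC(kernel='linear', probability=True, random_state=0).fit(X_demo, y_demo)

# Without probability=True the method is not available at all
print(hasattr(plain_svc, 'predict_proba'))
print(hasattr(proba_svc, 'predict_proba'))

# One row per sample, one column per class
probs = proba_svc.predict_proba(X_demo[:5])
print(probs.shape)
```

Passing a model built like `proba_svc` to `plot_mushroom_boundary()` should fill in the probability panel.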

## RBF(Radial Basis Function) kernel SVC 
### RBF SVC [C = 1]

In [9]:
from sklearn.svm import SVC

model = SVC(kernel='rbf', C=1)
model.fit(X_train,y_train)

plot_mushroom_boundary(X_test, y_test, model)


### RBF SVC [C = 10]

In [10]:
from sklearn.svm import SVC

model = SVC(kernel='rbf', C=10)
model.fit(X_train,y_train)

plot_mushroom_boundary(X_test, y_test, model)

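`C` is the regularization parameter of `SVC`: larger values penalize misclassified training points more heavily, so the boundary follows the training data more closely. The sketch below sweeps a few values of `C` on a synthetic `make_moons` data set and records the training accuracy, the test accuracy, and the number of support vectors. The data, the variable names, and the extra values 0.1 and 100 are assumptions for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic 2D stand-in for the PCA-reduced mushroom data (assumption)
X_demo, y_demo = make_moons(n_samples=400, noise=0.3, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, random_state=0)

svc_results = {}
for C in (0.1, 1, 10, 100):
    svc = SVC(kernel='rbf', C=C).fit(Xtr, ytr)
    # (training accuracy, test accuracy, total number of support vectors)
    svc_results[C] = (svc.score(Xtr, ytr), svc.score(Xte, yte), int(svc.n_support_.sum()))
    print(C, svc_results[C])
```

Comparing the rows for `C=1` and `C=10` gives a numeric counterpart to the two plots in this section.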

# GaussianNB (Naive Bayes)

In [11]:
from sklearn.naive_bayes import GaussianNB

model = GaussianNB()
model.fit(X_train,y_train)

plot_mushroom_boundary(X_test, y_test, model)


# MLPClassifier

In [12]:
from sklearn.neural_network import MLPClassifier

model = MLPClassifier()
model.fit(X_train,y_train)

plot_mushroom_boundary(X_test, y_test, model)


