# Lab 14 SVMs: Illuminating Advanced Classifiers

Justin Breucop

Today we'll cover Support Vector Machines and compare the linear and RBF kernels.

## SVMs

Support vector machines are powerful tools for classification, built on the idea that data which cannot be separated in its original space can often be separated in a higher-dimensional space (via an appropriate hyperplane for that dimension).
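To make the "higher dimension" idea concrete, here is a minimal sketch on synthetic data (not part of the lab dataset). It assumes two concentric rings from `make_circles` and lifts them by hand with one extra feature, the squared distance from the origin; a real kernel does this kind of lifting implicitly.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: no straight line in the (x1, x2) plane separates them.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Linear SVC on the raw 2-D points.
raw_acc = SVC(kernel='linear', C=1).fit(X, y).score(X, y)

# Lift to 3-D by appending the squared distance from the origin.
# In this space a flat plane (a threshold on the third feature) splits the rings.
X_lifted = np.column_stack([X, (X ** 2).sum(axis=1)])
lifted_acc = SVC(kernel='linear', C=1).fit(X_lifted, y).score(X_lifted, y)

print("linear SVC, raw 2-D features:    %.2f" % raw_acc)
print("linear SVC, lifted 3-D features: %.2f" % lifted_acc)
```

The same linear classifier goes from struggling to separating the rings once the extra dimension is available.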

As always, we'll import our standard packages, as well as two new classes: `svm.SVC` and `tree.DecisionTreeClassifier`. SVC stands for Support Vector Classification. There is an SVR class as well, but it applies SVMs to regression, which is out of scope for this lab.

In [1]:
import numpy as np
import pandas as pd

from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.preprocessing import OneHotEncoder
from sklearn.metrics import classification_report
# sklearn.cross_validation was removed in newer scikit-learn releases; model_selection replaces it
from sklearn.model_selection import ShuffleSplit

from bokeh.plotting import figure,show,output_notebook
output_notebook()

%matplotlib inline

An SVM can also be used for categorical data. Because SVMs are more complex than most classification algorithms we've seen, there are many more parameters to tune and options to set for the SVC. Sklearn SVC documentation:

http://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC
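Since there are several parameters to tune, one common approach is a grid search. The following is only a sketch: it uses Iris for convenience, and the grid values are illustrative assumptions, not recommended settings.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X_demo, y_demo = load_iris(return_X_y=True)

# Hypothetical search grid: these values are illustrative, not recommendations.
param_grid = [
    {'kernel': ['linear'], 'C': [0.1, 1, 10]},
    {'kernel': ['rbf'], 'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]},
]

# Every combination is scored with 5-fold cross validation.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X_demo, y_demo)

print(search.best_params_)
print("%.3f" % search.best_score_)
```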

### Load the data!
To demonstrate these classifiers clearly, we will use the Iris dataset again

In [2]:
from sklearn import datasets

# import some data to play with
iris_port = datasets.load_iris()
iris = pd.DataFrame(iris_port.data,columns=iris_port.feature_names)
y = iris_port.target
X = iris

In [3]:
# Shuffle the row positions, then use the first 3/5 for training and the rest for testing
index = np.random.permutation(len(X))
train = index[:len(X)*3//5]
test = index[len(X)*3//5:]

In [4]:
model = SVC(kernel='linear',C=1).fit(X.iloc[train],y[train])
print(classification_report(y[test],model.predict(X.iloc[test])))

             precision    recall  f1-score   support

          0       1.00      1.00      1.00        16
          1       0.92      0.96      0.94        24
          2       0.95      0.90      0.92        20

avg / total       0.95      0.95      0.95        60



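With a linear kernel the fitted model exposes a `coef_` attribute that we can plot against our features. For a multiclass problem SVC trains one classifier per pair of classes, so each row of `coef_` belongs to a pair (0 vs 1, 0 vs 2, 1 vs 2) and not to a single target. Iris has three classes and therefore three pairs, and the cells below reuse the three species names as row labels for convenience.

The plot lets us see how much weight the model puts on each feature.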
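To check what each row of `coef_` refers to, a short sketch like the one below can help. It refits a linear SVC on the full Iris data (so the numbers will differ from the cells in this lab) and labels the rows by class pair, following the pair order described in the scikit-learn SVC documentation. The names `pair_names` and `coefs_by_pair` are made up for this example.

```python
from itertools import combinations
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris_demo = load_iris()
demo_model = SVC(kernel='linear', C=1).fit(iris_demo.data, iris_demo.target)

# One row per pair of classes, one column per feature.
print(demo_model.coef_.shape)

# Pair labels in the documented order: 0 vs 1, 0 vs 2, 1 vs 2.
pair_names = ["%s vs %s" % (iris_demo.target_names[a], iris_demo.target_names[b])
              for a, b in combinations(range(len(iris_demo.target_names)), 2)]

coefs_by_pair = {name: list(row) for name, row in zip(pair_names, demo_model.coef_)}
for name in pair_names:
    print(name, [round(float(c), 2) for c in coefs_by_pair[name]])
```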

In [5]:
names = iris_port.target_names

In [6]:
data = {}
for i,row in enumerate(model.coef_):
    # Enumerate gives us a counter i for each value in model.coef_
    # we use that to get the full name of the iris.
    data[names[i]] = list(row)
data

{'setosa': [0.0096304177203458963,
  0.53760771538505048,
  -0.82697150618737991,
  -0.38199630459442924],
 'versicolor': [-0.15228426789115657,
  0.13536379368102813,
  -0.49069375209372701,
  -0.23688663894179923],
 'virginica': [-0.058117116664803348,
  0.43927633441967373,
  -1.5281343626024011,
  -1.7967773205978155]}

In [7]:
from bokeh.charts import Bar, show
from bokeh.charts.attributes import cat

#p=Bar(data, cat=list(iris.columns), title="SVC Feature Importance",
#        xlabel='Flowers', ylabel='Linear Coefficient', width=600, height=600, legend="top_right")
#show(p)

p=Bar(data, label='setosa', title="SVC Feature Importance",
        xlabel='Flowers', ylabel='Linear Coefficient', width=600, height=600, legend="top_right")
show(p)

### Aside: Charts as Higher Level Glyph
Bokeh has some high-level charting functions, such as Bar, which take data in a specific layout and generate pretty charts with a small amount of code.

There are more: http://bokeh.pydata.org/en/latest/docs/user_guide/charts.html#userguide-charts

### End Aside

As a reminder, precision is only part of the story. The classification report also shows recall and F1 for each class, which gives us a fuller picture of what's going on.

In [8]:
model = SVC(kernel='rbf',C=1).fit(X.iloc[train],y[train])
print(classification_report(y[test],model.predict(X.iloc[test])))

             precision    recall  f1-score   support

          0       1.00      1.00      1.00        16
          1       0.92      0.96      0.94        24
          2       0.95      0.90      0.92        20

avg / total       0.95      0.95      0.95        60



## Hands-on: Mushrooms!

Today we'll be working with a mushroom dataset. If you're lost in a forest, find a gilled mushroom, and have access to your SVM classifier, you'll hopefully be able to tell whether it's poisonous! Humor aside, we'll see the power of an SVM working with a large number of attributes to separate two classes of data.

The attributes are:
1. cap-shape: bell=b,conical=c,convex=x,flat=f, knobbed=k,sunken=s 
2. cap-surface: fibrous=f,grooves=g,scaly=y,smooth=s 
3. cap-color: brown=n,buff=b,cinnamon=c,gray=g,green=r, pink=p,purple=u,red=e,white=w,yellow=y 
4. bruises?: bruises=t,no=f 
5. odor: almond=a,anise=l,creosote=c,fishy=y,foul=f, musty=m,none=n,pungent=p,spicy=s 
6. gill-attachment: attached=a,descending=d,free=f,notched=n 
7. gill-spacing: close=c,crowded=w,distant=d 
8. gill-size: broad=b,narrow=n 
9. gill-color: black=k,brown=n,buff=b,chocolate=h,gray=g, green=r,orange=o,pink=p,purple=u,red=e, white=w,yellow=y 
10. stalk-shape: enlarging=e,tapering=t 
11. stalk-root: bulbous=b,club=c,cup=u,equal=e, rhizomorphs=z,rooted=r,missing=? 
12. stalk-surface-above-ring: fibrous=f,scaly=y,silky=k,smooth=s 
13. stalk-surface-below-ring: fibrous=f,scaly=y,silky=k,smooth=s 
14. stalk-color-above-ring: brown=n,buff=b,cinnamon=c,gray=g,orange=o, pink=p,red=e,white=w,yellow=y 
15. stalk-color-below-ring: brown=n,buff=b,cinnamon=c,gray=g,orange=o, pink=p,red=e,white=w,yellow=y 
16. veil-type: partial=p,universal=u 
17. veil-color: brown=n,orange=o,white=w,yellow=y 
18. ring-number: none=n,one=o,two=t 
19. ring-type: cobwebby=c,evanescent=e,flaring=f,large=l, none=n,pendant=p,sheathing=s,zone=z 
20. spore-print-color: black=k,brown=n,buff=b,chocolate=h,green=r, orange=o,purple=u,white=w,yellow=y 
21. population: abundant=a,clustered=c,numerous=n, scattered=s,several=v,solitary=y 
22. habitat: grasses=g,leaves=l,meadows=m,paths=p, urban=u,waste=w,woods=d



## Aside: String Processing for Fun And Profit

Because of the structure of the categories, I'm going to create a column:categories dictionary. The first step is to put the data into a triple-quoted string (the same syntax used for docstrings), which is delimited by three quote characters. The string can span multiple lines and ends only at the next three matching quotes.

In [9]:
attributes = '''cap-shape: bell=b,conical=c,convex=x,flat=f, knobbed=k,sunken=s 
cap-surface: fibrous=f,grooves=g,scaly=y,smooth=s 
cap-color: brown=n,buff=b,cinnamon=c,gray=g,green=r, pink=p,purple=u,red=e,white=w,yellow=y 
bruises?: bruises=t,no=f 
odor: almond=a,anise=l,creosote=c,fishy=y,foul=f, musty=m,none=n,pungent=p,spicy=s 
gill-attachment: attached=a,descending=d,free=f,notched=n 
gill-spacing: close=c,crowded=w,distant=d 
gill-size: broad=b,narrow=n 
gill-color: black=k,brown=n,buff=b,chocolate=h,gray=g, green=r,orange=o,pink=p,purple=u,red=e, white=w,yellow=y 
stalk-shape: enlarging=e,tapering=t 
stalk-root: bulbous=b,club=c,cup=u,equal=e, rhizomorphs=z,rooted=r,missing=? 
stalk-surface-above-ring: fibrous=f,scaly=y,silky=k,smooth=s 
stalk-surface-below-ring: fibrous=f,scaly=y,silky=k,smooth=s 
stalk-color-above-ring: brown=n,buff=b,cinnamon=c,gray=g,orange=o, pink=p,red=e,white=w,yellow=y 
stalk-color-below-ring: brown=n,buff=b,cinnamon=c,gray=g,orange=o, pink=p,red=e,white=w,yellow=y 
veil-type: partial=p,universal=u 
veil-color: brown=n,orange=o,white=w,yellow=y 
ring-number: none=n,one=o,two=t 
ring-type: cobwebby=c,evanescent=e,flaring=f,large=l, none=n,pendant=p,sheathing=s,zone=z 
spore-print-color: black=k,brown=n,buff=b,chocolate=h,green=r, orange=o,purple=u,white=w,yellow=y 
population: abundant=a,clustered=c,numerous=n, scattered=s,several=v,solitary=y 
habitat: grasses=g,leaves=l,meadows=m,paths=p, urban=u,waste=w,woods=d'''
attributes_list = attributes.split('\n')
attributes_list

['cap-shape: bell=b,conical=c,convex=x,flat=f, knobbed=k,sunken=s ',
 'cap-surface: fibrous=f,grooves=g,scaly=y,smooth=s ',
 'cap-color: brown=n,buff=b,cinnamon=c,gray=g,green=r, pink=p,purple=u,red=e,white=w,yellow=y ',
 'bruises?: bruises=t,no=f ',
 'odor: almond=a,anise=l,creosote=c,fishy=y,foul=f, musty=m,none=n,pungent=p,spicy=s ',
 'gill-attachment: attached=a,descending=d,free=f,notched=n ',
 'gill-spacing: close=c,crowded=w,distant=d ',
 'gill-size: broad=b,narrow=n ',
 'gill-color: black=k,brown=n,buff=b,chocolate=h,gray=g, green=r,orange=o,pink=p,purple=u,red=e, white=w,yellow=y ',
 'stalk-shape: enlarging=e,tapering=t ',
 'stalk-root: bulbous=b,club=c,cup=u,equal=e, rhizomorphs=z,rooted=r,missing=? ',
 'stalk-surface-above-ring: fibrous=f,scaly=y,silky=k,smooth=s ',
 'stalk-surface-below-ring: fibrous=f,scaly=y,silky=k,smooth=s ',
 'stalk-color-above-ring: brown=n,buff=b,cinnamon=c,gray=g,orange=o, pink=p,red=e,white=w,yellow=y ',
 'stalk-color-below-ring: brown=n,buff=b,cin

In [10]:
ordered_attributes = []
data_attributes = {}
for att in attributes_list:
    #Break our string into the column name and categories
    col_data_split = att.split(': ')
    #next, we split our category labels into a list of name=value
    cat_labels = col_data_split[1].split(',')
    
    # Let's now extract our values (the single-letter codes) and value names with a
    # dict comprehension: the code after the '=' is the key, the name before it is the value.
    
    # The code is split again on ' ' because some lines have trailing spaces.
    # Names that follow ', ' keep their leading space (e.g. ' pink'); .strip() on the name would remove it.
    cats = {x.split('=')[1].split(' ')[0]:x.split('=')[0] for x in cat_labels}
    # Now let's populate our columns dictionary: the key is the column name and the value is
    # the cats dictionary built above
    data_attributes[col_data_split[0]] = cats
    
    # we also want an ordered list to declare as the columns for our dataframe
    ordered_attributes.append(col_data_split[0])


data_attributes

{'bruises?': {'f': 'no', 't': 'bruises'},
 'cap-color': {'b': 'buff',
  'c': 'cinnamon',
  'e': 'red',
  'g': 'gray',
  'n': 'brown',
  'p': ' pink',
  'r': 'green',
  'u': 'purple',
  'w': 'white',
  'y': 'yellow'},
 'cap-shape': {'b': 'bell',
  'c': 'conical',
  'f': 'flat',
  'k': ' knobbed',
  's': 'sunken',
  'x': 'convex'},
 'cap-surface': {'f': 'fibrous', 'g': 'grooves', 's': 'smooth', 'y': 'scaly'},
 'gill-attachment': {'a': 'attached',
  'd': 'descending',
  'f': 'free',
  'n': 'notched'},
 'gill-color': {'b': 'buff',
  'e': 'red',
  'g': 'gray',
  'h': 'chocolate',
  'k': 'black',
  'n': 'brown',
  'o': 'orange',
  'p': 'pink',
  'r': ' green',
  'u': 'purple',
  'w': ' white',
  'y': 'yellow'},
 'gill-size': {'b': 'broad', 'n': 'narrow'},
 'gill-spacing': {'c': 'close', 'd': 'distant', 'w': 'crowded'},
 'habitat': {'d': 'woods',
  'g': 'grasses',
  'l': 'leaves',
  'm': 'meadows',
  'p': 'paths',
  'u': ' urban',
  'w': 'waste'},
 'odor': {'a': 'almond',
  'c': 'creosote',
 

## End Aside

So we have a dictionary of more verbose categories. We'll now want to apply them to the dataset.
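In the dictionary above, a few names carry a leading space (`' pink'`, `' knobbed'`) because the source text has a space after some commas. The rest of this lab keeps those names as they are. As a hedged alternative, here is a sketch of the same parsing on two sample lines with `.strip()` applied to both halves; `sample_lines` and `parsed` are names introduced only for this example.

```python
# Two lines copied from the attribute listing, including the stray spaces.
sample_lines = [
    'cap-shape: bell=b,conical=c,convex=x,flat=f, knobbed=k,sunken=s ',
    'cap-color: brown=n,buff=b,cinnamon=c,gray=g,green=r, pink=p,purple=u,red=e,white=w,yellow=y ',
]

parsed = {}
for line in sample_lines:
    column, labels = line.split(': ')
    codes = {}
    for pair in labels.split(','):
        name, code = pair.split('=')
        # strip() drops leading and trailing spaces on both halves
        codes[code.strip()] = name.strip()
    parsed[column] = codes

print(parsed['cap-shape']['k'])
print(parsed['cap-color']['p'])
```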

In [11]:
mush = pd.read_csv('../data/mushrooms.data',header=None,names=['edible?']+ordered_attributes)
mush.head()

Unnamed: 0,edible?,cap-shape,cap-surface,cap-color,bruises?,odor,gill-attachment,gill-spacing,gill-size,gill-color,...,stalk-surface-below-ring,stalk-color-above-ring,stalk-color-below-ring,veil-type,veil-color,ring-number,ring-type,spore-print-color,population,habitat
0,p,x,s,n,t,p,f,c,n,k,...,s,w,w,p,w,o,p,k,s,u
1,e,x,s,y,t,a,f,c,b,k,...,s,w,w,p,w,o,p,n,n,g
2,e,b,s,w,t,l,f,c,b,n,...,s,w,w,p,w,o,p,n,n,m
3,p,x,y,w,t,p,f,c,n,n,...,s,w,w,p,w,o,p,k,s,u
4,e,x,s,g,f,n,f,w,b,k,...,s,w,w,p,w,o,e,n,a,g


Let's have verbose names for our data by using .map()

In [12]:
for col in mush.columns:
    if col == 'edible?':
        continue
    mush[col] = mush[col].map(data_attributes[col])
mush.head()

Unnamed: 0,edible?,cap-shape,cap-surface,cap-color,bruises?,odor,gill-attachment,gill-spacing,gill-size,gill-color,...,stalk-surface-below-ring,stalk-color-above-ring,stalk-color-below-ring,veil-type,veil-color,ring-number,ring-type,spore-print-color,population,habitat
0,p,convex,smooth,brown,bruises,pungent,free,close,narrow,black,...,smooth,white,white,partial,white,one,pendant,black,scattered,urban
1,e,convex,smooth,yellow,bruises,almond,free,close,broad,black,...,smooth,white,white,partial,white,one,pendant,brown,numerous,grasses
2,e,bell,smooth,white,bruises,anise,free,close,broad,brown,...,smooth,white,white,partial,white,one,pendant,brown,numerous,meadows
3,p,convex,scaly,white,bruises,pungent,free,close,narrow,brown,...,smooth,white,white,partial,white,one,pendant,black,scattered,urban
4,e,convex,smooth,gray,no,none,free,crowded,broad,black,...,smooth,white,white,partial,white,one,evanescent,brown,abundant,grasses
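One thing worth knowing about `.map()` with a dictionary: any value without a matching key becomes NaN without raising an error. The sketch below uses a made-up code (`'q'`) to show this, along with a quick check for unmapped values.

```python
import pandas as pd

codes = pd.Series(['x', 'b', 'x', 'q'])   # 'q' is a made-up code with no dictionary entry
lookup = {'x': 'convex', 'b': 'bell'}

mapped = codes.map(lookup)
print(mapped.tolist())

# Count the codes that the dictionary did not cover.
print(mapped.isnull().sum())
```

Running a check like this after mapping every column is a cheap way to confirm the dictionary covers all the codes in the file.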


Because of the sheer number of attributes in this dataset, we will work with a subset of the columns.

In [13]:
mush = mush[['edible?','cap-shape','cap-color','cap-surface']]
mush.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8124 entries, 0 to 8123
Data columns (total 4 columns):
edible?        8124 non-null object
cap-shape      8124 non-null object
cap-color      8124 non-null object
cap-surface    8124 non-null object
dtypes: object(4)
memory usage: 253.9+ KB


We'll now convert them into binary features using the `pd.get_dummies` function.

In [14]:
pd.get_dummies(mush['cap-shape']).head()

Unnamed: 0,knobbed,bell,conical,convex,flat,sunken
0,0.0,0.0,0.0,1.0,0.0,0.0
1,0.0,0.0,0.0,1.0,0.0,0.0
2,0.0,1.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,1.0,0.0,0.0
4,0.0,0.0,0.0,1.0,0.0,0.0


We'll prefix each new column with the original column name.

In [15]:
mush_code = pd.DataFrame(mush['edible?'].map({'p':0,'e':1}))

for column in mush.columns:
    if column == 'edible?':
        continue
    temp = pd.get_dummies(mush[column],prefix=column)
    mush_code[temp.columns] = temp
mush_code.head()
 

Unnamed: 0,edible?,cap-shape_ knobbed,cap-shape_bell,cap-shape_conical,cap-shape_convex,cap-shape_flat,cap-shape_sunken,cap-color_ pink,cap-color_brown,cap-color_buff,...,cap-color_gray,cap-color_green,cap-color_purple,cap-color_red,cap-color_white,cap-color_yellow,cap-surface_fibrous,cap-surface_grooves,cap-surface_scaly,cap-surface_smooth
0,0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
1,1,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0
2,1,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0
3,0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,1.0,0.0
4,1,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,...,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0
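The loop above can also be written as a single call, since `pd.get_dummies` accepts a `columns=` argument. This is a sketch on a tiny made-up frame (`toy`) that only borrows the lab's column names.

```python
import pandas as pd

# Tiny hypothetical frame with the same column names as the lab data.
toy = pd.DataFrame({
    'edible?': ['p', 'e', 'e'],
    'cap-shape': ['convex', 'bell', 'convex'],
    'cap-surface': ['smooth', 'smooth', 'scaly'],
})

# columns= encodes just those columns and prefixes each new column
# with the original name, leaving 'edible?' untouched.
toy_code = pd.get_dummies(toy, columns=['cap-shape', 'cap-surface'])
toy_code['edible?'] = toy_code['edible?'].map({'p': 0, 'e': 1})

print(list(toy_code.columns))
```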


Awesome, we've prepared our data. Now we need to provide it to an SVM Classifier and see how we do.

In [16]:
X = mush_code.drop('edible?',axis=1)
y = mush_code['edible?']

In the interest of showing a variety of approaches and flexing your coding skills, what is the code below doing?

In [17]:
index = np.random.permutation(len(X))
train = index[:len(X)*4//5]
test = index[len(X)*4//5:]
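The cell above is a hand-rolled train/test split: it shuffles the row positions and cuts them at the 4/5 mark. As a sketch, here is the same idea next to scikit-learn's `train_test_split`, using a ten-row stand-in frame (`X_toy`, `y_toy`) that is not the mushroom data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for X and y (hypothetical values, 10 rows).
X_toy = pd.DataFrame({'feature': np.arange(10)})
y_toy = pd.Series([0, 1] * 5)

# Manual version: shuffle the row positions, then cut at the 4/5 mark.
rng = np.random.RandomState(0)
positions = rng.permutation(len(X_toy))
cut = len(X_toy) * 4 // 5
train_pos, test_pos = positions[:cut], positions[cut:]

# Library version: one call, same 80/20 proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=0)

print(len(train_pos), len(test_pos))
print(len(X_train), len(X_test))
```

Both versions keep the training and test rows disjoint; the library call also accepts `stratify=` if the class balance should be preserved in each split.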

We can compare two common kernels (note: the default kernel is rbf). 

In [18]:
model = SVC(C=1,kernel='linear').fit(X.iloc[train],y.iloc[train])
print(classification_report(y.iloc[test],model.predict(X.iloc[test])))

             precision    recall  f1-score   support

          0       0.62      0.80      0.70       772
          1       0.76      0.55      0.64       853

avg / total       0.69      0.67      0.67      1625



In [19]:
model = SVC(C=1,kernel='rbf').fit(X.iloc[train],y.iloc[train])
print(classification_report(y.iloc[test],model.predict(X.iloc[test])))

             precision    recall  f1-score   support

          0       0.66      0.69      0.67       772
          1       0.70      0.68      0.69       853

avg / total       0.68      0.68      0.68      1625



### Exercise 1
One-hot encode one column with `pd.get_dummies` and add the 'edible?' column to it. Train an SVC with a linear kernel on it and generate a confusion matrix.

In [20]:
from sklearn.metrics import confusion_matrix
mush_exercise = mush[['edible?','cap-surface']]

mush_exercise_code = pd.DataFrame(mush_exercise['edible?'].map({'p':0,'e':1}))

for column in mush_exercise.columns:
    if column == 'edible?':
        continue
    temp = pd.get_dummies(mush_exercise[column],prefix=column)
    mush_exercise_code[temp.columns] = temp

X_exercise = mush_exercise_code.drop('edible?',axis=1)
y_exercise = mush_exercise_code['edible?']

model_exercise = SVC(C=1,kernel='linear').fit(X_exercise.iloc[train],y_exercise.iloc[train])
print(confusion_matrix(y_exercise.iloc[test],model_exercise.predict(X_exercise.iloc[test])))
print(classification_report(y_exercise.iloc[test],model_exercise.predict(X_exercise.iloc[test])))

[[627 145]
 [523 330]]
             precision    recall  f1-score   support

          0       0.55      0.81      0.65       772
          1       0.69      0.39      0.50       853

avg / total       0.62      0.59      0.57      1625
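The per-class numbers in the report can be re-derived by hand from the confusion matrix printed above. A small sketch, with the matrix values copied from that output (rows are true classes, columns are predicted classes):

```python
import numpy as np

# Confusion matrix printed above: rows are true classes, columns are predictions.
cm = np.array([[627, 145],
               [523, 330]])

true_pos = np.diag(cm).astype(float)
precision = true_pos / cm.sum(axis=0)   # correct / everything predicted as that class
recall = true_pos / cm.sum(axis=1)      # correct / everything truly in that class
f1 = 2 * precision * recall / (precision + recall)

for label in range(len(cm)):
    print("%d  precision=%.2f  recall=%.2f  f1=%.2f"
          % (label, precision[label], recall[label], f1[label]))
```

The rounded values match the report: class 1 (edible) has decent precision but low recall, meaning many edible mushrooms are being labelled poisonous.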



### Exercise 2
Plot the coefficients for the columns. Is this surprising? Share the results with your neighbor and identify the category of your column that best identifies an edible mushroom.

In [21]:
data_exercise = {}
targets = pd.unique(mush_exercise['edible?'])

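for i,row in enumerate(model_exercise.coef_):
    # Enumerate gives us a counter i for each row in model_exercise.coef_.
    # A two-class SVC has a single row of coefficients, so only targets[0] is used as a label here.
    # Positive weights push a prediction toward class 1 (edible), negative ones toward class 0 (poisonous).
    data_exercise[targets[i]] = list(row)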
data_exercise

p=Bar(data_exercise, title="SVC Feature Importance",
        ylabel='Linear Coefficient', width=600, height=600, legend="top_right")
show(p)

### Exercise 3

Train your SVM on every single category! What would you expect the score to do? (Hint: take your process for Exercise 1 and apply it to every column)

In [22]:
mush_exercise_code2 = pd.DataFrame(mush_exercise['edible?'].map({'p':0,'e':1}))

# Note: mush was cut down to the three cap columns in In [13], so "every column" here means those three
for column in mush.columns:
    if column == 'edible?':
        continue
    temp = pd.get_dummies(mush[column],prefix=column)
    mush_exercise_code2[temp.columns] = temp

# Rename, split and fit once, after all the dummy columns have been added
mush_exercise_code2r = mush_exercise_code2.rename(columns = {'cap-shape_ knobbed':'cap-shape_knobbed','cap-color_ pink':'cap-color_pink'})
X_exercise2 = mush_exercise_code2r.drop('edible?',axis=1)
y_exercise2 = mush_exercise_code2r['edible?']

model_exercise2 = SVC(C=1,kernel='linear').fit(X_exercise2.iloc[train],y_exercise2.iloc[train])

print(confusion_matrix(y_exercise2.iloc[test],model_exercise2.predict(X_exercise2.iloc[test])))
print(classification_report(y_exercise2.iloc[test],model_exercise2.predict(X_exercise2.iloc[test])))

[[621 151]
 [381 472]]
             precision    recall  f1-score   support

          0       0.62      0.80      0.70       772
          1       0.76      0.55      0.64       853

avg / total       0.69      0.67      0.67      1625



### Exercise 4
Cross validate an SVC model. Remember cross_val_score 

In [23]:
# sklearn.cross_validation was removed in newer scikit-learn releases; model_selection replaces it
from sklearn.model_selection import cross_val_score
print("%.4f" % cross_val_score(model_exercise2,X_exercise2.iloc[test],y_exercise2.iloc[test],cv=100).mean())

0.6876
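For reference, a more typical use of `cross_val_score` passes the whole dataset and a modest number of folds. The sketch below assumes the Iris data and `cv=5` purely as an illustration; the estimator does not need to be fitted beforehand because it is cloned and refit on each fold.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X_iris, y_iris = load_iris(return_X_y=True)

# cv=5 gives five folds of 30 flowers each; each fold is held out once
# while the model is trained on the other four.
scores = cross_val_score(SVC(kernel='linear', C=1), X_iris, y_iris, cv=5)

print(scores)
print("%.4f" % scores.mean())
```

Looking at the individual fold scores, not just the mean, shows how much the estimate varies from fold to fold.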


### Exercise 5: Old Dog, New Data
Utilize the mushroom dataset and train a Decision Tree on it. Are you overfitting? How do you know?

In [24]:
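DecisionTree_clf = DecisionTreeClassifier(max_depth=3, random_state=1)

# Note: the tree is fit and then scored on the same rows (the test split), so the
# confusion matrix and report below describe performance on data the tree has already seen.
DecisionTree_clf.fit(X_exercise2.iloc[test],y_exercise2.iloc[test])

print(confusion_matrix(y_exercise2.iloc[test],DecisionTree_clf.predict(X_exercise2.iloc[test])))
print(classification_report(y_exercise2.iloc[test],DecisionTree_clf.predict(X_exercise2.iloc[test])))
# cross_val_score refits on held-out folds, which is the fairer number to compare against
print("%.4f" % cross_val_score(DecisionTree_clf,X_exercise2.iloc[test],y_exercise2.iloc[test],cv=100).mean())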

[[685  87]
 [453 400]]
             precision    recall  f1-score   support

          0       0.60      0.89      0.72       772
          1       0.82      0.47      0.60       853

avg / total       0.72      0.67      0.65      1625

0.6695
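One way to answer "how do you know?" is to compare accuracy on the training rows with accuracy on rows the tree never saw. The sketch below uses scikit-learn's bundled breast cancer dataset as a stand-in (an assumption for illustration, since the mushroom file is not bundled) and contrasts a shallow tree with an unrestricted one.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X_bc, y_bc = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X_bc, y_bc, test_size=0.3, random_state=0)

results = {}
for depth in (3, None):  # None lets the tree grow until every leaf is pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=1).fit(X_tr, y_tr)
    results[depth] = (tree.score(X_tr, y_tr), tree.score(X_te, y_te))
    print("max_depth=%s  train=%.3f  test=%.3f" % ((depth,) + results[depth]))
```

A large gap between the training and test scores is the usual sign of overfitting; limiting `max_depth` is one way to narrow it.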


### Advanced Topic: Kernel Play

http://scikit-learn.org/stable/auto_examples/svm/plot_svm_kernels.html

What kernel performs best for the mushroom data? Do we know why? (the kernels shown are 'linear', 'poly' and 'rbf' however there are many others you can see in the documentation)

In [25]:
# Fit one SVC per kernel and report how each does on the held-out rows
# (gamma is used by 'poly' and 'rbf'; the linear kernel ignores it)
for kernel in ('linear', 'poly', 'rbf'):
    clf = SVC(kernel=kernel, gamma=2)
    clf.fit(X_exercise2.iloc[train],y_exercise2.iloc[train])

    # confusion matrix, per-class report, and a cross-validated score for this kernel
    print("Results of {}: ".format(kernel))
    print(confusion_matrix(y_exercise2.iloc[test],clf.predict(X_exercise2.iloc[test])))
    print(classification_report(y_exercise2.iloc[test],clf.predict(X_exercise2.iloc[test])))
    print("Cross Validation %.4f" % cross_val_score(clf,X_exercise2.iloc[test],y_exercise2.iloc[test],cv=100).mean())
    print("============================================")

Results of linear: 
[[621 151]
 [381 472]]
             precision    recall  f1-score   support

          0       0.62      0.80      0.70       772
          1       0.76      0.55      0.64       853

avg / total       0.69      0.67      0.67      1625

Cross Validation 0.6876
Results of poly: 
[[555 217]
 [219 634]]
             precision    recall  f1-score   support

          0       0.72      0.72      0.72       772
          1       0.75      0.74      0.74       853

avg / total       0.73      0.73      0.73      1625

Cross Validation 0.7366
Results of rbf: 
[[553 219]
 [219 634]]
             precision    recall  f1-score   support

          0       0.72      0.72      0.72       772
          1       0.74      0.74      0.74       853

avg / total       0.73      0.73      0.73      1625

Cross Validation 0.7360
