#### Walkthrough from Datacamp
[Read More](https://www.datacamp.com/tutorial/svm-classification-scikit-learn-python)


* uses the kernel trick to handle nonlinear input spaces
* widely used for classification problems
* separate data based on a hyperplane with the largest amount of margin
* support vectors are data points that are closest to the hyperplane
* hyperplane - decision plane that separates data of different classes
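* margin - gap between the two lines through the closest points of each class; the larger the margin, the better
* for non-linear data, the kernel trick implicitly maps the data to a higher-dimensional space where a linear separator can exist, without computing that mapping explicitly
* kernels: linear (dot product of the inputs), polynomial (curved/nonlinear boundaries), radial basis function (RBF; corresponds to an infinite-dimensional space)
* gamma - any positive value (the tutorial suggests 0-1, with 0.1 as a starting point); higher values fit the training data more tightly and tend to overfit. scikit-learn's `SVC` defaults to `gamma='scale'`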
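
To make the kernel bullets concrete, here is a minimal sketch of the three kernel functions in plain Python. The function names and default values are illustrative choices, not scikit-learn's API or defaults; the formulas follow the usual definitions (dot product, `(gamma * <x, y> + coef0) ** degree`, and `exp(-gamma * ||x - y||^2)`).

```python
import math

def linear_kernel(x, y):
    """Dot product of the two input vectors."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """(gamma * <x, y> + coef0) ** degree; defaults here are illustrative."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.1):
    """exp(-gamma * squared Euclidean distance); equals 1.0 when x == y."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

x, y = [1.0, 2.0], [3.0, 4.0]
print("linear:", linear_kernel(x, y))
print("polynomial (degree 2):", polynomial_kernel(x, y, degree=2))

# A larger gamma makes similarity fall off faster with distance,
# so each training point influences a smaller neighbourhood.
for g in (0.01, 0.1, 1.0):
    print("rbf, gamma =", g, "->", rbf_kernel(x, y, gamma=g))
```

The loop at the end is one way to see why a high gamma tends to overfit: the similarity between two fixed points shrinks as gamma grows, so the decision boundary bends around individual training points.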

In [1]:
#Import scikit-learn dataset library
from sklearn import datasets

#Load dataset
cancer = datasets.load_breast_cancer()

In [2]:
# print the names of the 30 features
print("Features: ", cancer.feature_names)

# print the label type of cancer('malignant' 'benign')
print("Labels: ", cancer.target_names)


Features:  ['mean radius' 'mean texture' 'mean perimeter' 'mean area'
 'mean smoothness' 'mean compactness' 'mean concavity'
 'mean concave points' 'mean symmetry' 'mean fractal dimension'
 'radius error' 'texture error' 'perimeter error' 'area error'
 'smoothness error' 'compactness error' 'concavity error'
 'concave points error' 'symmetry error' 'fractal dimension error'
 'worst radius' 'worst texture' 'worst perimeter' 'worst area'
 'worst smoothness' 'worst compactness' 'worst concavity'
 'worst concave points' 'worst symmetry' 'worst fractal dimension']
Labels:  ['malignant' 'benign']


In [7]:
# data exploration 
# print(cancer.data[0:5])
# cancer.data.shape
# cancer.target # 0: malignant, 1:benign 
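
The commented-out exploration calls above can be run as a short standalone check. This is a sketch that reloads the dataset so it runs on its own; the use of `np.bincount` to count classes is my addition.

```python
import numpy as np
from sklearn import datasets

cancer = datasets.load_breast_cancer()

# Number of samples and number of features per sample
print(cancer.data.shape)

# Count samples per class; index 0 is 'malignant', index 1 is 'benign'
counts = np.bincount(cancer.target)
for name, count in zip(cancer.target_names, counts):
    print(name, count)
```

The dataset has 569 samples with 30 features each, split into 212 malignant and 357 benign cases, so the classes are moderately imbalanced.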

In [8]:
# split data
# Import train_test_split function
from sklearn.model_selection import train_test_split

# Split dataset into training set and test set
X_train, X_test, y_train, y_test = train_test_split(cancer.data, cancer.target, test_size=0.3,random_state=109) # 70% training and 30% test

In [9]:
#Import svm model
from sklearn import svm

#Create a svm Classifier
clf = svm.SVC(kernel='linear') # Linear Kernel

#Train the model using the training sets
clf.fit(X_train, y_train)

#Predict the response for test dataset
y_pred = clf.predict(X_test)
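
To connect the fitted model back to the support-vector and margin notes, the sketch below refits the same linear classifier and inspects it. It assumes a linear kernel (`coef_` is only available in that case), and the `2 / ||w||` margin formula is the standard geometric result rather than something the walkthrough computes.

```python
import numpy as np
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

cancer = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.3, random_state=109
)

clf = svm.SVC(kernel="linear")
clf.fit(X_train, y_train)

# Only the support vectors are needed by the decision function
print("Training samples:", X_train.shape[0])
print("Support vectors per class:", clf.n_support_)

# For a linear kernel, coef_ holds the weight vector w of the hyperplane;
# the distance between the two margin lines is 2 / ||w||
w = clf.coef_[0]
margin_width = 2.0 / np.linalg.norm(w)
print("Margin width (2 / ||w||):", margin_width)
```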

In [10]:
# evaluate model
#Import scikit-learn metrics module for accuracy calculation
from sklearn import metrics

# Model Accuracy: how often is the classifier correct?
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

Accuracy: 0.9649122807017544


In [11]:
# check precision and recall (positive class is label 1, 'benign')
# Model Precision: what percentage of tuples predicted positive are actually positive?
print("Precision:",metrics.precision_score(y_test, y_pred))

# Model Recall: what percentage of actual positive tuples are labelled as such?
print("Recall:",metrics.recall_score(y_test, y_pred))

Precision: 0.9811320754716981
Recall: 0.9629629629629629
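
The three scores can also be derived by hand from the confusion matrix, which shows where the errors fall. This is a sketch that refits the same model so it runs on its own; the variable names are mine.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

cancer = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.3, random_state=109
)
y_pred = svm.SVC(kernel="linear").fit(X_train, y_train).predict(X_test)

# Rows are true classes, columns are predicted classes (0: malignant, 1: benign).
# With label 1 as the positive class, ravel() yields tn, fp, fn, tp.
tn, fp, fn, tp = metrics.confusion_matrix(y_test, y_pred).ravel()

precision = tp / (tp + fp)  # of the samples predicted benign, how many are benign
recall = tp / (tp + fn)     # of the truly benign samples, how many were found
accuracy = (tp + tn) / (tp + tn + fp + fn)

print("TN, FP, FN, TP:", tn, fp, fn, tp)
print("Precision:", precision)
print("Recall:", recall)
print("Accuracy:", accuracy)
```

The outputs shown above are consistent with 104 true positives, 2 false positives, 4 false negatives, and 61 true negatives out of 171 test samples (104/106, 104/108, and 165/171), though exact counts may differ across scikit-learn versions.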


Hyperparameters to tune:
* kernel: linear, polynomial, rbf
* regularization (C): penalty parameter on the misclassification (error) term
    * sets how much misclassification is tolerated
    * controls the trade-off between a wide margin and the misclassification term
    * a smaller C tolerates more misclassification and gives a wider margin; a larger C penalizes errors more and gives a narrower margin
* gamma: a lower value gives a looser fit to the training data; a higher value fits it more tightly and can overfit
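
One common way to tune these three hyperparameters together is a cross-validated grid search. The sketch below is not part of the walkthrough: the grid values are hypothetical, and the `StandardScaler` step is my addition (it keeps gamma comparable across features measured on very different scales).

```python
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

cancer = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.3, random_state=109
)

# Scaling is an assumption added here; the walkthrough fits on raw features
pipe = make_pipeline(StandardScaler(), SVC())

# Hypothetical grid; the values are illustrative, not recommendations.
# make_pipeline names the SVC step "svc", hence the "svc__" prefix.
param_grid = {
    "svc__kernel": ["linear", "poly", "rbf"],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.1],  # ignored by the linear kernel
}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

print("Best parameters:", search.best_params_)
print("Cross-validated accuracy:", search.best_score_)
print("Test accuracy:", search.score(X_test, y_test))
```

Tuning on cross-validation folds of the training set keeps the held-out test set untouched until the final score.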

Advantage
* good accuracy and faster prediction compared to Naive Bayes (per the tutorial)
* memory efficient - the decision function uses only a subset of the training points (the support vectors)
* works well when classes have a clear margin of separation, and in high-dimensional spaces

Disadvantage
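* not suitable for large datasets because of long training time (what's large? scikit-learn's docs note that `SVC` fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands)
* slower to train than Naive Bayes
* works poorly with overlapping classes
* sensitive to the choice of kernel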
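
Kernel sensitivity is easy to see on data that is not linearly separable. The sketch below uses a synthetic dataset from `make_circles` (my choice, not from the walkthrough) and compares the three kernels with otherwise default settings.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic data: one class forms a ring around the other,
# so no straight line can separate them
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=0)

scores = {}
for kernel in ("linear", "poly", "rbf"):
    # Mean accuracy over 5 cross-validation folds
    scores[kernel] = cross_val_score(SVC(kernel=kernel), X, y, cv=5).mean()
    print(kernel, round(scores[kernel], 3))
```

On data shaped like this, the RBF kernel should score well above the linear kernel, which has no way to draw a closed boundary around the inner class.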