## 32.1.3.1 Introduction

A Support Vector Machine (SVM) is a supervised machine learning algorithm that can be used for

a) Classification

b) Regression

It is mostly used for solving classification problems.

#### Advantages:

1) SVM works really well when there is a clear margin of separation between the classes.

2) It is effective in high-dimensional spaces.

3) It is effective in cases where the number of dimensions is greater than the number of samples.

4) It uses a subset of the training points in the decision function (called support vectors), so it is also memory efficient.
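
Point 4 can be checked on a fitted model. The following is a minimal sketch, assuming the full four-feature iris data and a linear kernel (both chosen only for illustration); it compares the number of support vectors kept by the model with the number of training samples.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Illustrative setup: full iris data, linear kernel (assumptions, not part of the exercise below)
iris = load_iris()
X, y = iris.data, iris.target

model = SVC(kernel="linear", C=1.0)
model.fit(X, y)

n_samples = X.shape[0]
n_support = model.support_vectors_.shape[0]  # rows of X that define the decision function

print("training samples:", n_samples)
print("support vectors: ", n_support)
print("per class:       ", model.n_support_)
```

Only the rows stored in `support_vectors_` are needed at prediction time, which is where the memory saving comes from.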

#### Disadvantages:

1) SVM does not perform well on large data sets (>100K samples) because the required training time is high.

2) It also does not perform very well when the data set is noisy, i.e., when the target classes overlap.

3) SVM does not directly provide probability estimates; these are calculated using an expensive five-fold cross-validation. This applies to the SVC class of the Python scikit-learn library, where probability estimates are enabled with probability=True.
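
Point 3 can be seen in practice with the `probability` parameter of `SVC`. The sketch below is illustrative only; the linear kernel, the full iris data and `random_state=0` are assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# probability=True makes fit() run an extra internal cross-validation,
# so training is noticeably slower than with the default probability=False
prob_model = SVC(kernel="linear", probability=True, random_state=0)
prob_model.fit(X, y)

proba = prob_model.predict_proba(X[:3])
print(proba.shape)        # one row per sample, one column per class
print(proba.sum(axis=1))  # each row sums to 1 (up to rounding)
```

With the default `probability=False`, the model still offers `predict` and `decision_function`, but not `predict_proba`.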

Note: For more information, refer to:

1) http://scikit-learn.org/stable/modules/svm.html

2) https://www.analyticsvidhya.com/blog/2017/09/understaing-support-vector-machine-example-code/

3) 01 Data Science Lab Copy\03 Reference\support_vector_machines_succinctly.pdf

4) The built-in help document:

from sklearn import svm

help(svm)


## 32.1.3.2 Support Vector Classification (SVC)

### Exercise 1 - Iris flowers classification using SVC

#### import required modules

In [1]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import r2_score

#### import data

In [2]:
iris = load_iris()

In [4]:
iris

{'data': array([[5.1, 3.5, 1.4, 0.2],
        [4.9, 3. , 1.4, 0.2],
        [4.7, 3.2, 1.3, 0.2],
        [4.6, 3.1, 1.5, 0.2],
        [5. , 3.6, 1.4, 0.2],
        [5.4, 3.9, 1.7, 0.4],
        [4.6, 3.4, 1.4, 0.3],
        [5. , 3.4, 1.5, 0.2],
        [4.4, 2.9, 1.4, 0.2],
        [4.9, 3.1, 1.5, 0.1],
        [5.4, 3.7, 1.5, 0.2],
        [4.8, 3.4, 1.6, 0.2],
        [4.8, 3. , 1.4, 0.1],
        [4.3, 3. , 1.1, 0.1],
        [5.8, 4. , 1.2, 0.2],
        [5.7, 4.4, 1.5, 0.4],
        [5.4, 3.9, 1.3, 0.4],
        [5.1, 3.5, 1.4, 0.3],
        [5.7, 3.8, 1.7, 0.3],
        [5.1, 3.8, 1.5, 0.3],
        [5.4, 3.4, 1.7, 0.2],
        [5.1, 3.7, 1.5, 0.4],
        [4.6, 3.6, 1. , 0.2],
        [5.1, 3.3, 1.7, 0.5],
        [4.8, 3.4, 1.9, 0.2],
        [5. , 3. , 1.6, 0.2],
        [5. , 3.4, 1.6, 0.4],
        [5.2, 3.5, 1.5, 0.2],
        [5.2, 3.4, 1.4, 0.2],
        [4.7, 3.2, 1.6, 0.2],
        [4.8, 3.1, 1.6, 0.2],
        [5.4, 3.4, 1.5, 0.4],
        [5.2, 4.1, 1.5, 0.1],
        ...

In [5]:
X = iris.data[:, :2] # we only take first two features

In [7]:
X

array([[5.1, 3.5],
       [4.9, 3. ],
       [4.7, 3.2],
       [4.6, 3.1],
       [5. , 3.6],
       [5.4, 3.9],
       [4.6, 3.4],
       [5. , 3.4],
       [4.4, 2.9],
       [4.9, 3.1],
       [5.4, 3.7],
       [4.8, 3.4],
       [4.8, 3. ],
       [4.3, 3. ],
       [5.8, 4. ],
       [5.7, 4.4],
       [5.4, 3.9],
       [5.1, 3.5],
       [5.7, 3.8],
       [5.1, 3.8],
       [5.4, 3.4],
       [5.1, 3.7],
       [4.6, 3.6],
       [5.1, 3.3],
       [4.8, 3.4],
       [5. , 3. ],
       [5. , 3.4],
       [5.2, 3.5],
       [5.2, 3.4],
       [4.7, 3.2],
       [4.8, 3.1],
       [5.4, 3.4],
       [5.2, 4.1],
       [5.5, 4.2],
       [4.9, 3.1],
       [5. , 3.2],
       [5.5, 3.5],
       [4.9, 3.1],
       [4.4, 3. ],
       [5.1, 3.4],
       [5. , 3.5],
       [4.5, 2.3],
       [4.4, 3.2],
       [5. , 3.5],
       [5.1, 3.8],
       [4.8, 3. ],
       [5.1, 3.8],
       [4.6, 3.2],
       [5.3, 3.7],
       [5. , 3.3],
       [7. , 3.2],
       [6.4, 3.2],
       [6.9, ...

In [8]:
type(X)

numpy.ndarray

In [11]:
y = iris.target

In [12]:
y

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [13]:
type(y)

numpy.ndarray

#### split data

In [14]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 42)
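
A quick sanity check on the split can be useful. The sketch below repeats the loading and splitting steps so that it runs on its own, then prints the array shapes and the number of test samples per class. With `test_size = 0.3` on 150 samples, 45 rows go to the test set and 105 to the training set.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data[:, :2]  # first two features, as in the exercise
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(X_train.shape, X_test.shape)
print("test samples per class:", np.bincount(y_test))
```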

#### create model using train data

In [21]:
SVC?

In [20]:
model = SVC(kernel = 'linear', gamma = 1)

In [22]:
# Various options can be changed here, such as kernel, gamma and C. Note that gamma is ignored when kernel = 'linear'.

In [23]:
model.fit(X_train, y_train)

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
  decision_function_shape='ovr', degree=3, gamma=1, kernel='linear',
  max_iter=-1, probability=False, random_state=None, shrinking=True,
  tol=0.001, verbose=False)
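
To get a feel for the effect of the `kernel` option, one can fit the same data with several kernels and compare the test accuracy. This is a rough sketch, assuming default values for every parameter other than `kernel` and `C`; the scores depend on the installed scikit-learn version, so none are quoted here.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()
X = iris.data[:, :2]  # same two features as in the exercise
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

scores = {}
for kernel in ["linear", "rbf", "poly"]:
    candidate = SVC(kernel=kernel, C=1.0)
    candidate.fit(X_train, y_train)
    # score() returns the mean accuracy on the given data
    scores[kernel] = candidate.score(X_test, y_test)

for kernel, acc in scores.items():
    print(f"{kernel:>6}: {acc:.3f}")
```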

#### predict output using test data

In [24]:
y_pred = model.predict(X_test)

In [25]:
y_test

array([1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2,
       0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 1, 0, 0, 2, 1, 0, 0, 0, 2, 1, 1, 0,
       0])

In [26]:
y_pred

array([1, 0, 2, 1, 2, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 2, 2, 1, 1, 2, 0, 1,
       0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 2, 0, 0, 1, 2, 0, 0, 0, 1, 2, 2, 0,
       0])

#### find performance / accuracy

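In [28]:
# r2_score is a regression metric; for class labels, accuracy (the fraction of correct predictions) is the appropriate measure
from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)

0.8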
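
For class labels, a confusion matrix gives a more detailed view than a single score. The sketch below re-enters the `y_test` and `y_pred` arrays printed above (copied by hand, so treat it as illustrative) and counts how often each true class was predicted as each class. The diagonal holds the correct predictions, and accuracy is the diagonal sum divided by the number of test samples.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Arrays copied from the notebook output above
y_test = np.array([1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2,
                   0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 1, 0, 0, 2, 1, 0, 0, 0, 2, 1, 1, 0,
                   0])
y_pred = np.array([1, 0, 2, 1, 2, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 2, 2, 1, 1, 2, 0, 1,
                   0, 2, 2, 2, 2, 2, 0, 0, 0, 0, 2, 0, 0, 1, 2, 0, 0, 0, 1, 2, 2, 0,
                   0])

cm = confusion_matrix(y_test, y_pred)   # rows: true class, columns: predicted class
acc = accuracy_score(y_test, y_pred)    # fraction of labels predicted correctly

print(cm)
print("accuracy:", acc)
```

Class 0 is separated perfectly, while all nine errors are confusions between classes 1 and 2, which is consistent with only two features being used.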

## 32.1.3.3 Tune Parameters of SVM (SVC)

Tuning the parameters can improve model performance. Look at the help document to find the list of parameters available for SVC.

In [30]:
SVC?

SVC(C=1.0, kernel='rbf', degree=3, gamma='auto', coef0=0.0, shrinking=True, probability=False, tol=0.001, cache_size=200, class_weight=None, verbose=False, max_iter=-1, decision_function_shape='ovr', random_state=None)

In [31]:
from sklearn.model_selection import GridSearchCV

In [37]:
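def svc_param_selection(X, y, nfolds):
    # Candidate values for the regularization parameter C and the kernel coefficient gamma
    Cs = [0.001, 0.01, 0.1, 1, 10]
    gammas = [0.001, 0.01, 0.1, 1]
    # gamma is ignored by the 'linear' kernel, so those combinations are redundant but harmless
    kernels = ['linear', 'poly', 'rbf']
    param_grid = {'kernel' : kernels, 'C' : Cs, 'gamma' : gammas}
    # Exhaustive search over the grid using nfolds-fold cross-validation
    grid_search = GridSearchCV(SVC(), param_grid, cv = nfolds)
    grid_search.fit(X, y)
    print('Best Parameters:', grid_search.best_params_)
    print('Best Score:', grid_search.best_score_)
    return (grid_search.best_params_, grid_search.best_score_)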

In [38]:
svc_param_selection(X_train, y_train, 5)

Best Parameters: {'C': 0.1, 'gamma': 1, 'kernel': 'rbf'}
Best Score: 0.819047619047619


({'C': 0.1, 'gamma': 1, 'kernel': 'rbf'}, 0.819047619047619)
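
The best score reported by the grid search is a cross-validation score computed on the training data only. A natural follow-up is to evaluate the tuned model on the held-out test set. The sketch below assumes a smaller grid than the one above (to keep the run short) and relies on `GridSearchCV` refitting the best model on the whole training set, which is its default behaviour.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

iris = load_iris()
X = iris.data[:, :2]
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Reduced grid, chosen only for illustration
param_grid = {"kernel": ["linear", "rbf"], "C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
grid_search = GridSearchCV(SVC(), param_grid, cv=5)
grid_search.fit(X_train, y_train)

best_model = grid_search.best_estimator_           # refit on all of X_train
test_accuracy = best_model.score(X_test, y_test)   # accuracy on unseen data

print("best parameters:", grid_search.best_params_)
print("cv score:       ", grid_search.best_score_)
print("test accuracy:  ", test_accuracy)
```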

## 32.1.3.4 Support Vector Regression (SVR)

The method of Support Vector Classification (SVC) can be extended to solve regression problems as well. This method is called Support Vector Regression (SVR).

#### SVC vs SVR:

SVC: The model produced by SVC depends only on a subset of the training data because the cost function for building the model does not care about training points that lie beyond the margin.

SVR: The model produced by SVR depends only on a subset of the training data because the cost function for building the model ignores any training data close to the model prediction.
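
Both statements can be illustrated with the usual loss functions: the hinge loss for classification and the epsilon-insensitive loss for regression. The helper functions below are hypothetical names written for this sketch, not part of scikit-learn, and the sample values are made up.

```python
import numpy as np

def hinge_loss(y_true, decision_values):
    """Hinge loss per sample; y_true must be -1 or +1."""
    y_true = np.asarray(y_true, dtype=float)
    decision_values = np.asarray(decision_values, dtype=float)
    return np.maximum(0.0, 1.0 - y_true * decision_values)

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Loss per sample that is zero inside a tube of half-width epsilon."""
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(0.0, residual - epsilon)

# Classification: first and third samples lie beyond the margin, so their loss is zero
svc_losses = hinge_loss([1, 1, -1], [2.0, 0.5, -3.0])

# Regression: the first prediction is within epsilon of its target, so its loss is zero
svr_losses = epsilon_insensitive_loss([1.0, 1.0, 1.0], [1.05, 0.8, 1.5], epsilon=0.1)

print(svc_losses)
print(svr_losses)
```

Samples with zero loss do not influence the fitted model, which is why only a subset of the training data ends up mattering in both cases.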

There are three different implementations of Support Vector Regression: SVR, NuSVR and LinearSVR.

LinearSVR provides a faster implementation than SVR but only considers linear kernels, while NuSVR implements a slightly different formulation than SVR and LinearSVR.

As with classification, the fit method takes X and y as arguments, except that in this case y is expected to hold floating-point values instead of integer values.
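
The three implementations share the same `fit` / `predict` interface. The following sketch fits each of them to a small synthetic data set (a noisy line, invented for this example) and compares the predictions at one point. The parameter values are assumptions picked for illustration.

```python
import numpy as np
from sklearn.svm import SVR, NuSVR, LinearSVR

# Synthetic data: y = 2x + 1 plus a little noise (made up for this sketch)
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(40, 1), axis=0)
y = 2.0 * X.ravel() + 1.0 + 0.1 * rng.randn(40)

models = {
    "SVR": SVR(kernel="linear", C=1.0, epsilon=0.1),
    "NuSVR": NuSVR(kernel="linear", C=1.0, nu=0.5),
    "LinearSVR": LinearSVR(C=1.0, epsilon=0.1, max_iter=10000, random_state=0),
}

predictions = {}
for name, reg in models.items():
    reg.fit(X, y)  # y holds floating-point targets
    predictions[name] = float(reg.predict([[2.5]])[0])
    print(f"{name:>9}: {predictions[name]:.3f}")
```

The noise-free value at x = 2.5 is 6.0, so all three predictions should land close to it.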

#### Example from scikit-learn svm documentation

In [43]:
from sklearn.svm import SVR

In [44]:
X = [[0, 0], [2, 2]]

In [45]:
y = [0.5, 2.5]

In [46]:
model = SVR()

In [47]:
model.fit(X, y)

SVR(C=1.0, cache_size=200, coef0=0.0, degree=3, epsilon=0.1, gamma='auto',
  kernel='rbf', max_iter=-1, shrinking=True, tol=0.001, verbose=False)

In [49]:
model.predict([[1, 1]])

array([1.5])