### SUPPORT VECTOR MACHINES

Support vector machines (SVMs) are powerful yet flexible supervised machine learning methods used for classification, regression, and outlier detection. SVMs are effective in high-dimensional spaces and are most commonly used for classification problems. They are also memory efficient because the decision function uses only a subset of the training points, the support vectors.
The main goal of an SVM is to separate the classes in a dataset with a maximum marginal hyperplane (MMH). Conceptually, this is done in the following two steps:
1. The SVM first generates hyperplanes iteratively that separate the classes as well as possible.
2. It then chooses the hyperplane that separates the classes correctly with the largest margin.

Some important concepts in SVM are as follows:
1. Support Vectors: The data points that are closest to the hyperplane. Support vectors determine the position of the separating line.
2. Hyperplane: The decision plane or space that divides a set of objects belonging to different classes.
3. Margin: The gap between the two lines that pass through the closest data points of different classes.
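
The three concepts above can be read off a fitted linear model. The following is a minimal sketch, assuming a linear kernel and a small linearly separable toy dataset; the variable names (`clf`, `margin_width`, `sv_distances`) are illustrative only. For a linear SVM with weight vector w, the margin width is 2 / ||w||, and each support vector lies half a margin away from the hyperplane.

```python
import numpy as np
from sklearn.svm import SVC

# Toy data (assumed for illustration): two linearly separable classes
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
y = np.array([1, 1, 2, 2])

clf = SVC(kernel="linear").fit(X, y)

w = clf.coef_[0]        # normal vector of the separating hyperplane
b = clf.intercept_[0]   # offset of the hyperplane
margin_width = 2.0 / np.linalg.norm(w)

# Signed distance of each support vector to the hyperplane w.x + b = 0
sv_distances = (clf.support_vectors_ @ w + b) / np.linalg.norm(w)

print("support vectors:\n", clf.support_vectors_)
print("margin width:", margin_width)
print("support vector distances:", sv_distances)
```

On this data the two support vectors sit on opposite sides of the hyperplane, so their signed distances have opposite signs and equal magnitude.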

SVM in Scikit-learn supports both sparse and dense sample vectors as input.
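
As a small sketch of what this means in practice, the same toy data can be passed either as a dense NumPy array or as a SciPy sparse matrix. The choice of `csr_matrix` and the variable names here are assumptions made for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.svm import SVC

X_dense = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]], dtype=float)
y = np.array([1, 1, 2, 2])
X_sparse = csr_matrix(X_dense)  # same samples in a sparse container

# Fit one model per input type
dense_clf = SVC(kernel="linear").fit(X_dense, y)
sparse_clf = SVC(kernel="linear").fit(X_sparse, y)

# A model fitted on sparse data is queried with sparse data here
X_new = np.array([[-0.5, -0.8], [1.5, 0.5]])
pred_dense = dense_clf.predict(X_new)
pred_sparse = sparse_clf.predict(csr_matrix(X_new))

print(pred_dense, pred_sparse)
```

Both models should agree on these two query points, since the underlying samples are identical.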


### Classification with SVM
Scikit-learn provides three classes, namely SVC, NuSVC and LinearSVC, which can perform multiclass classification.

### SVC

SVC is C-support vector classification, and its implementation is based on libsvm. The class is available as sklearn.svm.SVC. It handles multiclass classification according to the one-vs-one scheme.
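
To make the one-vs-one scheme concrete, here is a hedged sketch on a synthetic four-class dataset (generated with `make_blobs`, an assumption for illustration). With K classes, one-vs-one trains K * (K - 1) / 2 pairwise classifiers, which shows up in the shape of `decision_function` when `decision_function_shape="ovo"` is requested. The default `"ovr"` shape only changes how the scores are reported, not how the model is trained.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic data with four classes (assumed for illustration)
X, y = make_blobs(n_samples=200, centers=4, random_state=0)

ovo_clf = SVC(kernel="linear", decision_function_shape="ovo").fit(X, y)
ovr_clf = SVC(kernel="linear", decision_function_shape="ovr").fit(X, y)

n_classes = len(ovo_clf.classes_)
n_pairs = n_classes * (n_classes - 1) // 2  # number of pairwise classifiers

ovo_scores = ovo_clf.decision_function(X[:3])
ovr_scores = ovr_clf.decision_function(X[:3])

print("classes:", n_classes, "pairs:", n_pairs)
print("ovo score shape:", ovo_scores.shape)  # one column per class pair
print("ovr score shape:", ovr_scores.shape)  # one column per class
```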

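#### Attributes
The following list gives the main attributes of the sklearn.svm.SVC class:
1. support_: the indices of the support vectors.
2. support_vectors_: the support vectors themselves.
3. n_support_: the number of support vectors for each class.
4. dual_coef_: the coefficients of the support vectors in the decision function.
5. coef_: the weights assigned to the features. It is available only with a linear kernel.
6. intercept_: the constant term in the decision function.
7. fit_status_: 0 if the model was fitted correctly, 1 otherwise.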

#### Implementation Example
Like other classifiers, SVC has to be fitted with the following two arrays:
1. An array X holding the training samples. It has shape [n_samples, n_features].
2. An array y holding the target values, i.e. the class labels for the training samples. It has shape [n_samples].
The following Python script uses the sklearn.svm.SVC class:

In [None]:
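import numpy as np
from sklearn.svm import SVC

# Four training samples with two features each, and their class labels
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
y = np.array([1, 1, 2, 2])

# Linear-kernel classifier (gamma is ignored by the linear kernel)
SVCClf = SVC(kernel='linear', gamma='scale', shrinking=False)
SVCClf.fit(X, y)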

In [None]:
#### Now, once fitted, we can get the weight vector with the following Python script (coef_ is available only with a linear kernel):

SVCClf.coef_

array([[0.5, 0.5]])

In [None]:
#### Similarly, we can predict the class of a new sample and inspect the other attributes as follows:

SVCClf.predict([[-0.5,-0.8]])

array([1])

In [None]:
SVCClf.n_support_

array([1, 1], dtype=int32)

In [None]:
SVCClf.support_vectors_

array([[-1., -1.],
       [ 1.,  1.]])

In [None]:
SVCClf.support_

array([0, 2], dtype=int32)

In [None]:
SVCClf.intercept_

array([-0.])

In [None]:
SVCClf.fit_status_

0

### NuSVC

NuSVC is Nu-Support Vector Classification. It is another class provided by scikit-learn that can perform multiclass classification. It is similar to SVC, but it accepts a slightly different set of parameters. The parameter that differs from SVC is as follows:
1. nu: float, optional, default = 0.5
It represents an upper bound on the fraction of training errors and a lower bound on the fraction of support vectors. Its value should be in the interval (0, 1].
The rest of the parameters and attributes are the same as those of SVC.

#### Implementation Example
We can implement the same example using the sklearn.svm.NuSVC class as well.

In [None]:
import numpy as np
from sklearn.svm import NuSVC

# Same toy data as in the SVC example
X = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
y = np.array([1, 1, 2, 2])

# nu keeps its default value of 0.5
NuSVCClf = NuSVC(kernel='linear', gamma='scale', shrinking=False)
NuSVCClf.fit(X, y)
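
The role of nu as a lower bound on the fraction of support vectors can be checked empirically. The sketch below is an illustration under assumptions: the dataset comes from `make_classification`, the RBF kernel is chosen arbitrarily, and the names `sv_fractions` and `train_errors` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.svm import NuSVC

# Synthetic binary data (assumed for illustration)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

sv_fractions = {}
train_errors = {}
for nu in (0.1, 0.3, 0.6):
    clf = NuSVC(nu=nu, kernel="rbf", gamma="scale").fit(X, y)
    # Fraction of training samples that became support vectors
    sv_fractions[nu] = clf.n_support_.sum() / len(X)
    # Fraction of training samples that are misclassified
    train_errors[nu] = 1.0 - clf.score(X, y)
    print(f"nu={nu}: support vector fraction={sv_fractions[nu]:.2f}, "
          f"training error={train_errors[nu]:.2f}")
```

Increasing nu forces more training points to become support vectors, which is why nu is often described as a knob for model complexity.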

### LinearSVC

LinearSVC is Linear Support Vector Classification. It is similar to SVC with kernel = ‘linear’. The difference is that LinearSVC is implemented in terms of liblinear, while SVC is implemented in terms of libsvm. For that reason, LinearSVC has more flexibility in the choice of penalties and loss functions. It also scales better to a large number of samples.
LinearSVC does not support the ‘kernel’ parameter because the kernel is assumed to be linear. It also lacks some of the attributes of SVC, such as support_, support_vectors_, n_support_, fit_status_ and dual_coef_.
However, it supports the penalty and loss parameters as follows:
1. penalty: string, ‘l1’ or ‘l2’ (default = ‘l2’)
This parameter specifies the norm (L1 or L2) used in penalization (regularization).
2. loss: string, ‘hinge’ or ‘squared_hinge’ (default = ‘squared_hinge’)
It represents the loss function, where ‘hinge’ is the standard SVM loss and ‘squared_hinge’ is the square of the hinge loss.
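
Not every combination of penalty, loss and the `dual` flag is accepted by liblinear. The following sketch simply tries each combination and records whether fitting raises a `ValueError`; the dataset and the `supported` dictionary are assumptions made for illustration.

```python
from itertools import product

from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Synthetic binary data (assumed for illustration)
X, y = make_classification(n_features=4, random_state=0)

supported = {}
for penalty, loss, dual in product(("l1", "l2"), ("hinge", "squared_hinge"), (True, False)):
    try:
        LinearSVC(penalty=penalty, loss=loss, dual=dual, max_iter=10000).fit(X, y)
        supported[(penalty, loss, dual)] = True
    except ValueError:
        # Raised when liblinear has no solver for this combination
        supported[(penalty, loss, dual)] = False

for combo, ok in supported.items():
    print(combo, "supported" if ok else "not supported")
```

In particular, the L1 penalty is only available together with the squared hinge loss and `dual=False`, which is the combination used in the example in this section.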

#### Implementation Example
The following Python script uses the sklearn.svm.LinearSVC class:

In [None]:
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Synthetic binary classification data with four features
X, y = make_classification(n_features=4, random_state=0)

# The L1 penalty requires dual=False
LSVCClf = LinearSVC(dual=False, random_state=0, penalty='l1', tol=1e-5)
LSVCClf.fit(X, y)

In [None]:
#### Now, once fitted, we can get the weight vector as follows:
LSVCClf.coef_

array([[0.        , 0.        , 0.91214599, 0.22630641]])

In [None]:
#### Similarly, we can get the value of the intercept with the following Python script:

LSVCClf.intercept_

array([0.26860191])
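
The weight vector and intercept are all a linear model needs to make predictions. As a sketch (the names `manual_scores` and `manual_labels` are illustrative), the decision score of a sample x is w.x + b, and for a binary problem the predicted class is the second entry of `classes_` when the score is positive and the first otherwise.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_features=4, random_state=0)
clf = LinearSVC(dual=False, penalty="l1", tol=1e-5, random_state=0).fit(X, y)

# Decision scores computed by hand from the learned parameters
manual_scores = X @ clf.coef_.ravel() + clf.intercept_[0]

# Map the sign of the score to a class label
manual_labels = clf.classes_[(manual_scores > 0).astype(int)]

print(np.allclose(manual_scores, clf.decision_function(X)))
print(np.array_equal(manual_labels, clf.predict(X)))
```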

### REGRESSION WITH SVM

As discussed earlier, SVM is used for both classification and regression problems. Scikit-learn’s method of Support Vector Classification (SVC) can be extended to solve regression problems as well. That extended method is called Support Vector Regression (SVR).

**Basic similarity between SVC and SVR**
The model created by SVC depends only on a subset of the training data. Why? Because the cost function for building the model doesn’t care about training data points that lie beyond the margin.
Similarly, the model produced by SVR (Support Vector Regression) depends only on a subset of the training data. Why? Because the cost function for building the model ignores any training data points whose targets are close to the model prediction.
Scikit-learn provides three classes, namely SVR, NuSVR and LinearSVR, as three different implementations of SVR.

### SVR (Support Vector Regression)

SVR is epsilon-support vector regression, and its implementation is based on libsvm. Unlike SVC, the model has two free parameters, namely ‘C’ and ‘epsilon’.
1. epsilon: float, optional, default = 0.1
It represents the epsilon in the epsilon-SVR model. It specifies the epsilon-tube within which no penalty is associated in the training loss function with points predicted within a distance epsilon from the actual value.
The rest of the parameters and attributes are similar to those used in SVC.

In [None]:
from sklearn import svm

X = [[1, 1], [2, 2]]
y = [1, 2]

# Linear-kernel regressor with the default C=1.0 and epsilon=0.1
SVRReg = svm.SVR(kernel='linear', gamma='auto')
SVRReg.fit(X, y)

In [None]:
### Now, once fitted, we can get the weight vector with the following Python script:
SVRReg.coef_

array([[0.4, 0.4]])

In [None]:
### Similarly, we can predict the target value of a sample as follows:
SVRReg.predict([[1,1]])

array([1.1])
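
The effect of the epsilon-tube on the number of support vectors can be seen on a noisy curve. This is a rough sketch under assumptions: the data is a sine wave with Gaussian noise, the RBF kernel is used, and `n_support` is a hypothetical dictionary for collecting results. Points whose residual is smaller than epsilon incur no loss and do not become support vectors, so a wider tube usually leaves fewer support vectors.

```python
import numpy as np
from sklearn.svm import SVR

# Noisy sine curve (assumed for illustration)
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.randn(80)

n_support = {}
for epsilon in (0.01, 0.1, 0.5):
    reg = SVR(kernel="rbf", C=1.0, epsilon=epsilon).fit(X, y)
    # support_ holds the indices of the training points used by the model
    n_support[epsilon] = len(reg.support_)
    print(f"epsilon={epsilon}: {n_support[epsilon]} support vectors out of {len(X)}")
```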

### NuSVR

NuSVR is Nu-Support Vector Regression. It is similar to NuSVC, but it performs regression and uses the parameter nu to control the number of support vectors. Moreover, unlike NuSVC, where nu replaces the C parameter, in NuSVR nu replaces the epsilon parameter.

#### Implementation Example
The following Python script uses the sklearn.svm.NuSVR class:

In [None]:
import numpy as np
from sklearn.svm import NuSVR

# Random regression data: 20 samples with 15 features
n_samples, n_features = 20, 15
np.random.seed(0)
y = np.random.randn(n_samples)
X = np.random.randn(n_samples, n_features)

NuSVRReg = NuSVR(kernel='linear', gamma='auto', C=1.0, nu=0.1)
NuSVRReg.fit(X, y)

In [None]:
## Now, once fitted, we can get the weight vector with the following Python script:
NuSVRReg.coef_

array([[-0.14904483,  0.04596145,  0.22605216, -0.08125403,  0.06564533,
         0.01104285,  0.04068767,  0.2918337 , -0.13473211,  0.36006765,
        -0.2185713 , -0.31836476, -0.03048429,  0.16102126, -0.29317051]])
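
To illustrate how nu controls the number of support vectors in NuSVR, the sketch below fits the same data with several values of nu and reports the fraction of training points that end up as support vectors. The noisy sine data, the RBF kernel and the `sv_fraction` dictionary are assumptions made for illustration.

```python
import numpy as np
from sklearn.svm import NuSVR

# Noisy sine curve (assumed for illustration)
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(100, 1), axis=0)
y = np.sin(X).ravel() + 0.2 * rng.randn(100)

sv_fraction = {}
for nu in (0.1, 0.5, 0.9):
    reg = NuSVR(kernel="rbf", C=1.0, nu=nu).fit(X, y)
    # Fraction of training samples kept as support vectors
    sv_fraction[nu] = len(reg.support_) / len(X)
    print(f"nu={nu}: support vector fraction={sv_fraction[nu]:.2f}")
```

A larger nu keeps more of the training set as support vectors, which corresponds to a narrower implicit epsilon-tube.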

### LinearSVR

LinearSVR is Linear Support Vector Regression. It is similar to SVR with kernel = ‘linear’. The difference is that LinearSVR is implemented in terms of liblinear, while SVR is implemented in terms of libsvm. For that reason, LinearSVR has more flexibility in the choice of penalties and loss functions. It also scales better to a large number of samples.
LinearSVR does not support the ‘kernel’ parameter because the kernel is assumed to be linear. It also lacks some of the attributes of SVR, such as support_, support_vectors_, n_support_, fit_status_ and dual_coef_.
However, it supports the ‘loss’ parameter as follows:
1. loss: string, optional, default = ‘epsilon_insensitive’
It represents the loss function, where the epsilon-insensitive loss (‘epsilon_insensitive’) is the L1 loss and the squared epsilon-insensitive loss (‘squared_epsilon_insensitive’) is the L2 loss.
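
The two loss functions can be written out directly. The helpers below are hypothetical, written only to illustrate the definitions and not part of scikit-learn: the epsilon-insensitive loss is max(0, |y - f(x)| - epsilon), and the squared variant is the square of that value.

```python
import numpy as np

def epsilon_insensitive(y_true, y_pred, epsilon=0.0):
    """L1 loss: residuals inside the epsilon-tube cost nothing."""
    return np.maximum(0.0, np.abs(y_true - y_pred) - epsilon)

def squared_epsilon_insensitive(y_true, y_pred, epsilon=0.0):
    """L2 loss: the square of the epsilon-insensitive loss."""
    return epsilon_insensitive(y_true, y_pred, epsilon) ** 2

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.05, 2.5, 1.0])

l1 = epsilon_insensitive(y_true, y_pred, epsilon=0.1)
l2 = squared_epsilon_insensitive(y_true, y_pred, epsilon=0.1)

print(l1)  # approximately [0, 0.4, 1.9]
print(l2)  # approximately [0, 0.16, 3.61]
```

The squared loss penalizes large residuals more heavily, so it tends to be more sensitive to outliers than the L1 variant.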

In [1]:
from sklearn.datasets import make_regression
from sklearn.svm import LinearSVR

# Synthetic regression data with four features
X, y = make_regression(n_features=4, random_state=0)

# The squared epsilon-insensitive loss can be used with dual=False
LSVRReg = LinearSVR(dual=False, random_state=0, loss='squared_epsilon_insensitive', tol=1e-5)
LSVRReg.fit(X, y)

In [2]:
##### Now, once fitted, the model can predict new values as follows:
LSVRReg.predict([[0,0,0,0]])

array([-0.01041416])

In [4]:
LSVRReg.coef_

array([20.47354746, 34.08619401, 67.23189022, 87.47017787])

In [6]:
#### Similarly, we can get the value of the intercept with the following Python script:
LSVRReg.intercept_

array([-0.01041416])