# Support Vector Machine Algorithm (SVM) 

- Support Vector Machine (SVM) is one of the most popular supervised learning algorithms. It can be used for both classification and regression problems, but in machine learning it is used primarily for classification.

- The goal of the SVM algorithm is to find the best decision boundary for separating n-dimensional space into classes, so that new data points can be placed in the correct category. This best decision boundary is called a hyperplane.
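
As a minimal sketch of the idea above, the following fits a linear SVM to a tiny two-class dataset and reads off the hyperplane. It assumes scikit-learn is available; the data points are made up for illustration and do not come from the original text.

```python
import numpy as np
from sklearn.svm import SVC

# Two small, clearly separated 2-D clusters (made-up illustrative data).
X = np.array([[1.0, 1.0], [1.5, 2.0], [2.0, 1.5],
              [5.0, 5.0], [5.5, 6.0], [6.0, 5.5]])
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel yields a flat decision boundary (a hyperplane).
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

# The hyperplane is the set of points x where w . x + b = 0.
w = clf.coef_[0]
b = clf.intercept_[0]
print("w =", w, "b =", b)

# New data points are assigned to the side of the hyperplane they fall on.
new_points = np.array([[1.2, 1.8], [5.8, 5.2]])
predictions = clf.predict(new_points)
print(predictions)
```

In two dimensions the hyperplane is simply a line; with n input fields it is an (n-1)-dimensional flat surface.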

__________________________________________________________________________________________________________________________________________________________________________________



**SVM** works by mapping data to a high-dimensional feature space so that data points can be categorized, even when the data are not otherwise linearly separable. A separator between the categories is found, then the data are transformed in such a way that the separator could be drawn as a hyperplane. Following this, characteristics of new data can be used to predict the group to which a new record should belong.

For example, consider the following figure, in which the data points fall into two different categories.

Figure 1. Original dataset

![](https://www.ibm.com/docs/en/SS3RA7_sub/modeler_mainhelp_client_ddita/clementine/images/svm_orig_nocurve.jpg)

The two categories can be separated with a curve, as shown in the following figure.

Figure 2. Data with separator added

![](https://www.ibm.com/docs/en/SS3RA7_sub/modeler_mainhelp_client_ddita/clementine/images/svm_orig.jpg)

After the transformation, the boundary between the two categories can be defined by a hyperplane, as shown in the following figure.

Figure 3. Transformed data

![](https://www.ibm.com/docs/en/SS3RA7_sub/modeler_mainhelp_client_ddita/clementine/images/svm_transformed.jpg)
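
The effect of such a transformation can be sketched with an explicit feature map. The example below is an illustration under stated assumptions, not how an SVM implementation actually works internally (real implementations apply the mapping implicitly): it generates two concentric rings of points, which no straight line can separate, and appends the squared distance from the origin as a third coordinate. The function name `feature_map` and the threshold value are hypothetical choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Inner class: radius in [0, 1). Outer class: radius in [2, 3).
angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
radii = np.concatenate([rng.uniform(0.0, 1.0, 100), rng.uniform(2.0, 3.0, 100)])
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
y = np.concatenate([np.zeros(100, dtype=int), np.ones(100, dtype=int)])

def feature_map(points):
    """Append the squared distance from the origin as a third coordinate."""
    squared_norm = np.sum(points ** 2, axis=1)
    return np.column_stack([points, squared_norm])

Z = feature_map(X)

# In the 3-D space, the flat plane z3 = 2.25 (radius 1.5, squared)
# separates the two classes, although no line does so in the original 2-D space.
predicted = (Z[:, 2] > 2.25).astype(int)
accuracy = np.mean(predicted == y)
print("accuracy with a flat separator in the transformed space:", accuracy)
```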

**The mathematical function used for the transformation is known as the kernel function.**

Kernel functions include:
- Linear
- Polynomial
- Radial basis function (RBF)
- Sigmoid

**NB**

A linear kernel function is recommended when linear separation of the data is straightforward. In other cases, one of the other functions should be used. You will need to experiment with the different functions to obtain the best model in each case, as they each use different algorithms and parameters.
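
The kind of experimentation described above might look like the following sketch, which assumes scikit-learn and uses its synthetic `make_circles` dataset (an assumption for illustration, not data from the original text). Each kernel is scored with 5-fold cross-validation using default parameters.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Two concentric rings: a dataset that is not linearly separable.
X, y = make_circles(n_samples=200, noise=0.05, factor=0.3, random_state=0)

scores = {}
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    clf = SVC(kernel=kernel)  # default C, gamma, degree and coef0
    scores[kernel] = cross_val_score(clf, X, y, cv=5).mean()

for kernel, score in scores.items():
    print(f"{kernel:8s} mean CV accuracy: {score:.2f}")
```

On data like this the RBF kernel should clearly outperform the linear one. On data that is already linearly separable, the linear kernel is usually the simpler and faster choice.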

__________________________________________________________________________________________________________________________________________________________________________________




# Tuning an SVM Model

- Besides the separating line between the categories, a classification SVM model also finds marginal lines that define the space between the two categories.

Figure 1. Data with a preliminary model

![](https://www.ibm.com/docs/en/SS3RA7_sub/modeler_mainhelp_client_ddita/clementine/images/svm_overfit.jpg)

The data points that lie on the margins are known as the **support vectors**.

The wider the margin between the two categories, the better the model will be at predicting the category for new records. In the previous example, the margin is not very wide, and the model is said to be overfitted. A small amount of misclassification can be accepted in order to widen the margin; an example of this is shown in the following figure.

Figure 2. Data with an improved model

![](https://www.ibm.com/docs/en/SS3RA7_sub/modeler_mainhelp_client_ddita/clementine/images/svm_improved.jpg)

In some cases, linear separation is more difficult; an example of this is shown in the following figure.

Figure 3. A problem for linear separation

![](https://www.ibm.com/docs/en/SS3RA7_sub/modeler_mainhelp_client_ddita/clementine/images/svm_nonsep.jpg)

In a case like this, the goal is to find the optimum balance between a wide margin and a small number of misclassified data points. The SVM has a regularization parameter (known as C) that controls the trade-off between these two goals. You will probably need to experiment with different values of C and of the kernel parameters in order to find the best model.
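
The trade-off can be made concrete with a small experiment. This is a sketch assuming scikit-learn, with overlapping synthetic clusters from `make_blobs` (illustrative data, not from the original text). For a linear kernel the margin width is 2 / ||w||, so it can be computed directly from the fitted coefficients.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping clusters, so perfect linear separation is impossible.
X, y = make_blobs(n_samples=200, centers=[[0.0, 0.0], [3.0, 3.0]],
                  cluster_std=2.0, random_state=0)

results = {}
for C in [0.01, 1.0, 100.0]:
    clf = SVC(kernel="linear", C=C).fit(X, y)
    results[C] = {
        # Distance between the two marginal lines.
        "margin_width": 2.0 / np.linalg.norm(clf.coef_[0]),
        # Points on or inside the margin.
        "n_support_vectors": int(clf.n_support_.sum()),
        "train_accuracy": clf.score(X, y),
    }

for C, info in results.items():
    print(C, info)
```

A small C tolerates more misclassification and gives a wider margin; a large C penalizes training errors heavily and narrows the margin.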

__________________________________________________________________________________________________________________________________________________________________________________


# SVM node

The SVM node enables you to use a support vector machine to classify data. SVM is particularly suited for use with wide datasets, that is, those with a large number of predictor fields. You can use the default settings on the node to produce a basic model relatively quickly, or you can use the Expert settings to experiment with different types of SVM model.

When the model has been built, you can:

- Browse the model nugget to display the relative importance of the input fields in building the model.
- Append a Table node to the model nugget to view the model output.

**Example**: A medical researcher has obtained a dataset containing characteristics of a number of human cell samples extracted from patients who were believed to be at risk of developing cancer. Analysis of the original data showed that many of the characteristics differed significantly between benign and malignant samples. The researcher wants to develop an SVM model that can use the values of similar cell characteristics in samples from other patients to give an early indication of whether their samples might be benign or malignant.
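
Outside of Modeler, a comparable workflow can be sketched in scikit-learn. The example below is an analogue under stated assumptions: it uses scikit-learn's bundled breast cancer dataset (benign versus malignant samples), which is not necessarily the dataset the researcher in the example used, and a train/test split stands in for a partition field.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

data = load_breast_cancer()  # target_names: ['malignant', 'benign']

# Hold out 30% of the records for testing, similar in spirit to a partition field.
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, stratify=data.target, random_state=0
)

# Scaling the inputs matters for SVMs because kernels depend on distances.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0))
model.fit(X_train, y_train)

test_accuracy = model.score(X_test, y_test)
print(f"Held-out accuracy: {test_accuracy:.3f}")
```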

## SVM Node Model Options

**Model name**. You can generate the model name automatically based on the target or ID field (or model type in cases where no such field is specified) or specify a custom name.

**Use partitioned data**. If a partition field is defined, this option ensures that data from only the training partition is used to build the model. 

**Create split models**. Builds a separate model for each possible value of input fields that are specified as split fields. See Building Split Models for more information.


## SVM Node Expert Options

If you have detailed knowledge of support vector machines, expert options allow you to fine-tune the training process. To access the expert options, set Mode to **Expert** on the Expert tab.

**Append all probabilities (valid only for categorical targets)**. If selected (checked), specifies that probabilities for each possible value of a nominal or flag target field are displayed for each record processed by the node. If this option is not selected, the probability of only the predicted value is displayed for nominal or flag target fields. The setting of this check box determines the default state of the corresponding check box on the model nugget display.

**Stopping criteria**. Determines when to stop the optimization algorithm. Values range from 1.0E-1 to 1.0E-6; default is 1.0E-3. Reducing the value results in a more accurate model, but the model will take longer to train.

**Regularization parameter (C)**. Controls the trade-off between maximizing the margin and minimizing the training error term. Value should normally be between 1 and 10 inclusive; default is 10. Increasing the value improves the classification accuracy (or reduces the regression error) for the training data, but this can also lead to overfitting.

**Regression precision (epsilon)**. Used only if the measurement level of the target field is Continuous. Causes errors to be accepted provided that they are less than the value specified here. Increasing the value may result in faster modeling, but at the expense of accuracy.

**Kernel type**. Determines the type of kernel function used for the transformation. Different kernel types cause the separator to be calculated in different ways, so it is advisable to experiment with the various options. Default is **RBF** (Radial Basis Function).

**RBF gamma**. Enabled only if the kernel type is set to **RBF**. Value should normally be between 3/k and 6/k, where k is the number of input fields. For example, if there are 12 input fields, values between 0.25 and 0.5 would be worth trying. Increasing the value improves the classification accuracy (or reduces the regression error) for the training data, but this can also lead to overfitting.

**Gamma**. Enabled only if the kernel type is set to **Polynomial** or **Sigmoid**. Increasing the value improves the classification accuracy (or reduces the regression error) for the training data, but this can also lead to overfitting.

**Bias**. Enabled only if the kernel type is set to **Polynomial** or **Sigmoid**. Sets the coef0 value in the kernel function. The default value 0 is suitable in most cases.

**Degree**. Enabled only if the kernel type is set to **Polynomial**. Controls the complexity (dimension) of the mapping space. Normally you would not use a value greater than 10.
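
For readers more familiar with scikit-learn, the expert options above have rough counterparts among the parameters of `SVC` and `SVR`. The mapping below is an assumption made for illustration (stopping criteria to `tol`, regression precision to `epsilon`, bias to `coef0`); the SVM node is not scikit-learn, and defaults and behavior may differ. The helper `rbf_gamma_range` is a hypothetical name that simply encodes the 3/k to 6/k rule of thumb.

```python
from sklearn.svm import SVC, SVR

def rbf_gamma_range(n_input_fields):
    """Return the (low, high) RBF gamma range 3/k to 6/k suggested above."""
    return 3.0 / n_input_fields, 6.0 / n_input_fields

# 12 input fields, as in the example in the text.
low, high = rbf_gamma_range(12)
print(low, high)

# Classification with an RBF kernel and gamma at the low end of the range.
# tol plays the role of the stopping criteria; C is the regularization parameter.
rbf_clf = SVC(kernel="rbf", C=10.0, gamma=low, tol=1e-3)

# Polynomial kernel: gamma, coef0 (called Bias above) and degree all apply.
poly_clf = SVC(kernel="poly", C=10.0, gamma=1.0, coef0=0.0, degree=3)

# Regression (continuous target): errors smaller than epsilon are ignored.
reg = SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma=low, tol=1e-3)
```

A reasonable way to tune is to try several gamma values between `low` and `high`, compare them on held-out data, and keep the value that generalizes best rather than the one that fits the training data best.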

_________________________________________________________________________________________________________________________________________________


# SVM Model Nugget


The SVM model creates a number of new fields. The most important of these is the **$S-fieldname** field, which shows the target field value predicted by the model.

The number and names of the new fields created by the model depend on the measurement level of the target field (this field is indicated in the following tables by fieldname).

To see these fields and their values, add a Table node to the SVM model nugget and execute the Table node.





### Table 1. Target field measurement level is 'Nominal' or 'Flag'



| New field name | Description |
|----------------|-------------|
| $S-fieldname   | Predicted value of target field. |
| $SP-fieldname  | Probability of predicted value. |
| $SP-value      | Probability of each possible value of the nominal or flag target field (displayed only if **Append all probabilities** is checked on the Settings tab of the model nugget). |
| $SRP-value, $SAP-value | (Flag targets only) Raw (SRP) and adjusted (SAP) propensity scores, indicating the likelihood of a "true" outcome for the target field. These scores are displayed only if the corresponding check boxes are selected on the Analyze tab of the SVM modeling node before the model is generated. See the topic Modeling Node Analyze Options for more information. |


### Table 2.  Target field measurement level is 'Continuous'



| New field name | Description                          |
|----------------|--------------------------------------|
| $S-fieldname   | Predicted value of target field.     |
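
To give a feel for what the fields in Table 1 contain, the following sketch builds similarly named columns with scikit-learn. This is only an analogy: the `$S-` and `$SP-` names are borrowed from the tables above, the iris dataset and the target name `species` are assumptions for illustration, and scikit-learn's probability estimates are not guaranteed to match those produced by the model nugget.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris = load_iris()
X = iris.data
y = iris.target_names[iris.target]  # string labels for a nominal target

# probability=True enables per-class probability estimates.
clf = SVC(kernel="rbf", probability=True, random_state=0).fit(X, y)

records = X[[0, 50, 100]]            # three records to score
proba = clf.predict_proba(records)   # one column per class in clf.classes_

# Here the predicted value is taken as the most probable class, so that it is
# consistent with the probability columns (clf.predict can occasionally differ).
scored = {
    "$S-species": clf.classes_[proba.argmax(axis=1)],
    "$SP-species": proba.max(axis=1),
}
# Analogue of "Append all probabilities": one $SP-value column per class.
for i, label in enumerate(clf.classes_):
    scored[f"$SP-{label}"] = proba[:, i]

for name, values in scored.items():
    print(name, values)
```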


## Predictor Importance

Optionally, a chart that indicates the relative importance of each predictor in estimating the model may also be displayed on the Model tab. Typically you will want to focus your modeling efforts on the predictors that matter most and consider dropping or ignoring those that matter least. Note that this chart is available only if **Calculate predictor importance** is selected on the Analyze tab before generating the model. See the topic Predictor Importance for more information.

Note: Predictor importance may take longer to calculate for SVM than for other types of models, and is not selected on the Analyze tab by default. Selecting this option may slow performance, particularly with large datasets.
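
Outside of Modeler, a comparable ranking can be approximated with permutation importance. The sketch below assumes scikit-learn and its bundled breast cancer dataset; permutation importance is a swapped-in technique used here for illustration, and it is not claimed to be the method the SVM node uses.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, stratify=data.target, random_state=0
)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X_train, y_train)

# Shuffle each predictor in turn and measure how much held-out accuracy drops.
result = permutation_importance(model, X_test, y_test, n_repeats=5, random_state=0)

# Rank predictors from most to least important.
ranked = sorted(
    zip(data.feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked[:5]:
    print(f"{name:25s} {importance:.4f}")
```

Because every predictor requires repeated re-scoring of the model, this kind of calculation can be slow on large datasets, which is consistent with the note above.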