In [None]:
# Based on the DataCamp tutorial: https://www.datacamp.com/tutorial/svm-classification-scikit-learn-python
# Modified by Mehdi Ammi, Univ. Paris 8

# Scikit-Learn: Support Vector Machines (SVM)

## Introduction 

In this notebook, you'll learn about Support Vector Machines, one of the most popular and widely used supervised machine learning algorithms.

SVM often achieves high accuracy compared to other classifiers such as logistic regression and decision trees. It is known for its kernel trick, which handles nonlinear input spaces. It is used in a variety of applications such as face detection, intrusion detection, classification of emails, news articles and web pages, classification of genes, and handwriting recognition.

Support Vector Machines are generally considered a classification approach, but they can be employed in both classification and regression problems. They can handle multiple continuous and categorical variables, once the categorical variables are encoded numerically. SVM constructs a hyperplane in multidimensional space to separate different classes. It finds the optimal hyperplane by iteratively solving an optimization problem that minimizes an error term. The core idea of SVM is to find a maximum marginal hyperplane (MMH) that best divides the dataset into classes.

![SVM.png](attachment:edbccaac-af32-444c-9fc5-08245135f30e.png)


 - Support Vectors: Support vectors are the data points closest to the hyperplane. These points determine the position of the separating line through the margins, so they are the most relevant points for constructing the classifier.

 - Hyperplane: A hyperplane is a decision boundary that separates sets of objects with different class memberships.

 - Margin: The margin is the gap between the two lines through the closest points of each class. It is calculated as the perpendicular distance from the separating line to the support vectors (the closest points). A larger margin between the classes is considered good; a smaller margin is considered bad.
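
These three terms can be inspected directly on a fitted model. The sketch below is only an illustration: the toy points are made up, and a large `C` is assumed so that the soft-margin `SVC` behaves approximately like a hard-margin classifier. For a linear kernel, the margin width equals 2 / ||w||, where w is the weight vector stored in `coef_`.

```python
import numpy as np
from sklearn.svm import SVC

# Made-up, linearly separable toy data (assumption for illustration only)
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A large C approximates a hard-margin SVM on separable data
clf = SVC(kernel="linear", C=1e3).fit(X, y)

# Hyperplane: w . x + b = 0
w = clf.coef_[0]
b = clf.intercept_[0]

# Margin width for a linear SVM: 2 / ||w||
margin_width = 2.0 / np.linalg.norm(w)

print("Hyperplane weights:", w, "intercept:", b)
print("Support vectors:\n", clf.support_vectors_)
print("Margin width:", margin_width)
```

Only the points listed in `support_vectors_` influence the position of the hyperplane; removing any other training point would leave the fitted model unchanged.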

## How does SVM work?
The main objective is to segregate the given dataset in the best possible way. The distance between the nearest points of the two classes is known as the margin. The objective is to select a hyperplane with the maximum possible margin between support vectors in the given dataset. SVM searches for the maximum marginal hyperplane in the following steps:

1. Generate hyperplanes that segregate the classes as well as possible. The left-hand figure shows three hyperplanes: black, blue, and orange. Here, the blue and orange hyperplanes have a higher classification error, while the black one separates the two classes correctly.

2. Select the hyperplane with the maximum distance from the nearest data points of either class, as shown in the right-hand figure.

![SVM_1.png](attachment:8fded9b0-a036-4e89-b091-8c6cab5e4a3a.png)

## Dealing with non-linear and inseparable planes
Some problems can't be solved using a linear hyperplane, as shown in the figure below (left-hand side).

In such situations, SVM uses a kernel trick to transform the input space into a higher-dimensional space, as shown on the right. The data points are plotted on the x-axis and z-axis (z is the sum of the squares of x and y: z = x^2 + y^2). Now you can easily segregate these points using linear separation.

![SVM_2.png](attachment:ff35e35a-3c37-4476-8f76-353fb421e6d7.png)
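
The transformation described above can be tried out by hand. The following is a minimal sketch, assuming synthetic concentric-circle data from `make_circles` (not the data in the figure): a linear SVM is fitted once on the raw (x, y) coordinates and once after adding the extra feature z = x^2 + y^2.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Synthetic data: one class forms an inner circle, the other an outer ring
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# A linear SVM on the raw 2D coordinates cannot separate the circles
acc_2d = SVC(kernel="linear").fit(X, y).score(X, y)

# Add the third dimension z = x^2 + y^2 explicitly
z = (X ** 2).sum(axis=1, keepdims=True)
X_lifted = np.hstack([X, z])

# In the lifted space a plane can separate the two classes
acc_3d = SVC(kernel="linear").fit(X_lifted, y).score(X_lifted, y)

print("Training accuracy, linear SVM in 2D:", acc_2d)
print("Training accuracy, linear SVM with z added:", acc_3d)
```

A kernel achieves a comparable effect without ever building the extra features explicitly, which is why it is called a trick.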

## SVM Kernels

In practice, the SVM algorithm is implemented using a kernel. A kernel transforms the input data space into the required form. SVM uses a technique called the kernel trick: the kernel takes a low-dimensional input space and transforms it into a higher-dimensional space. In other words, it converts a non-separable problem into a separable one by adding more dimensions. It is most useful in non-linear separation problems, where the kernel trick helps you build a more accurate classifier.

 - Linear Kernel: A linear kernel is the ordinary dot product of any two given observations. The product between two vectors is the sum of the multiplication of each pair of input values.

    K(x, xi) = sum(x * xi)

 - Polynomial Kernel: A polynomial kernel is a more generalized form of the linear kernel. The polynomial kernel can distinguish curved or nonlinear input space.

    K(x, xi) = (1 + sum(x * xi))^d

    Where d is the degree of the polynomial. With d=1, the kernel is equivalent to the linear kernel up to a constant offset. The degree needs to be manually specified in the learning algorithm.

 - Radial Basis Function Kernel: The radial basis function (RBF) kernel is a popular kernel function commonly used in support vector machine classification. RBF can map an input space into an infinite-dimensional space.
 
    K(x, xi) = exp(-gamma * sum((x - xi)^2))
    
    Here gamma is a positive parameter; values between 0 and 1 are commonly tried. A higher value of gamma fits the training dataset more closely, which can cause over-fitting. Gamma=0.1 is often used as a starting value. In scikit-learn, the default is `gamma='scale'`, which derives the value from the data; otherwise, gamma needs to be specified manually.
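
The three kernels can be computed by hand and compared with scikit-learn's implementations. The sketch below uses the standard definitions (dot product, (1 + x·xi)^d, and exp(-gamma * ||x - xi||^2)); the two vectors and the values of `d` and `gamma` are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import (
    linear_kernel,
    polynomial_kernel,
    rbf_kernel,
)

# Two arbitrary observations (one row each), chosen for illustration
x = np.array([[1.0, 2.0, 3.0]])
xi = np.array([[0.5, -1.0, 2.0]])
d, gamma = 3, 0.1

# Kernels written out by hand
k_linear = np.sum(x * xi)
k_poly = (1 + np.sum(x * xi)) ** d
k_rbf = np.exp(-gamma * np.sum((x - xi) ** 2))

# The same kernels from scikit-learn; polynomial_kernel computes
# (gamma * <x, xi> + coef0) ** degree, so gamma=1 and coef0=1 match the formula
sk_linear = linear_kernel(x, xi)[0, 0]
sk_poly = polynomial_kernel(x, xi, degree=d, gamma=1.0, coef0=1)[0, 0]
sk_rbf = rbf_kernel(x, xi, gamma=gamma)[0, 0]

print("linear:    ", k_linear, sk_linear)
print("polynomial:", k_poly, sk_poly)
print("rbf:       ", k_rbf, sk_rbf)
```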
    
## Key Concepts:

- **Kernel Trick**: SVMs can efficiently perform a non-linear classification using what is called the kernel trick, implicitly mapping their inputs into high-dimensional feature spaces. Common kernels include polynomial, radial basis function (RBF), and sigmoid.
- **C Parameter**: This regularization parameter determines the trade-off between maximizing the margin and minimizing classification errors on the training data. A smaller C makes the margin larger but allows more misclassified data points.
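
The effect of `C` on the margin can be observed numerically. This is a minimal sketch, assuming two overlapping synthetic clusters from `make_blobs` (the cluster spread and the values of `C` are illustrative choices); for a linear kernel the margin width is 2 / ||w||.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping clusters, so some training errors are unavoidable
X, y = make_blobs(n_samples=100, centers=2, cluster_std=1.2, random_state=0)

margins = {}
for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # Margin width of a linear SVM: 2 / ||w||
    margins[C] = 2.0 / np.linalg.norm(clf.coef_[0])
    print(
        f"C={C:<6} margin width={margins[C]:.3f} "
        f"support vectors={clf.n_support_.sum()} "
        f"training accuracy={clf.score(X, y):.3f}"
    )
```

A wider margin usually comes with more support vectors, because more points fall inside or on the wrong side of it.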


## Advantages of SVM:
1. **Effective in High-Dimensional Spaces**: SVMs are particularly useful in applications with high-dimensional data, where the number of features is large compared to the number of data points.
2. **Memory Efficient**: Only a subset of training points (the support vectors) are used in the decision function, making SVM memory efficient.
3. **Versatility**: By using different kernel functions, SVMs can be adapted to a variety of data types and distributions.
4. **Robust to Overfitting**: In high-dimensional spaces, SVMs are less prone to overfitting than many other classifiers, particularly when the number of dimensions exceeds the number of samples.

## Limitations of SVM:
1. **Computational Complexity**: Training an SVM can be computationally intensive, especially with large datasets. The complexity is between quadratic and cubic, making SVMs less suitable for very large datasets.
2. **Choice of Kernel**: The performance of an SVM classifier is highly dependent on the choice of the kernel and its parameters. Finding the right kernel and parameters can be challenging and often requires a lot of experimentation.
3. **Not Probabilistic**: SVM does not directly provide probability estimates, though methods like Platt scaling can be used to convert SVM outputs into probability scores.
4. **Scalability**: While SVMs perform well with smaller datasets, their training time becomes impractical with larger datasets due to their high computational cost.
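
To illustrate the point about probability estimates, the sketch below wraps an SVM in scikit-learn's `CalibratedClassifierCV` with `method="sigmoid"`, which fits a Platt-style sigmoid on the SVM's decision values. The choice of an RBF kernel, standard scaling, and 5-fold calibration are assumptions made for this example, not requirements.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=109
)

# A plain SVC only outputs decision values, not probabilities
svm_model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Sigmoid calibration maps decision values to probabilities
calibrated = CalibratedClassifierCV(svm_model, method="sigmoid", cv=5)
calibrated.fit(X_train, y_train)

# One row per sample, one column per class (malignant, benign)
proba = calibrated.predict_proba(X_test[:5])
print(proba)
```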

## Classifier Building in Scikit-learn

Until now, you have learned about the theoretical background of SVM. Now you will learn about its implementation in Python using scikit-learn.

In the model-building part, you can use the breast cancer dataset, which is a very famous binary classification problem. The features in this dataset are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image.

The dataset comprises 30 features (mean radius, mean texture, mean perimeter, mean area, mean smoothness, mean compactness, mean concavity, mean concave points, mean symmetry, mean fractal dimension, radius error, texture error, perimeter error, area error, smoothness error, compactness error, concavity error, concave points error, symmetry error, fractal dimension error, worst radius, worst texture, worst perimeter, worst area, worst smoothness, worst compactness, worst concavity, worst concave points, worst symmetry, and worst fractal dimension) and a target (type of cancer).

This data has two types of cancer classes: malignant (harmful) and benign (not harmful). Here, you can build a model to classify the type of cancer. The dataset is available in the scikit-learn library, or you can download it from the UCI Machine Learning Repository.

## Installing Scikit-Learn

In [None]:
!pip install scikit-learn

## Loading Data

Let's first load the required dataset you will use.

In [None]:
# Import scikit-learn dataset library
from sklearn import datasets

# Load the breast cancer dataset
cancer = datasets.load_breast_cancer()

## Exploring Data

After you have loaded the dataset, you might want to know a little bit more about it. You can check feature and target names.

In [None]:
# Print the names of the 30 features in the dataset
print("Features: ", cancer.feature_names)

# Print the label types of cancer ('malignant' and 'benign')
print("Labels: ", cancer.target_names)

In [None]:
>>
Features:  ['mean radius' 'mean texture' 'mean perimeter' 'mean area'
 'mean smoothness' 'mean compactness' 'mean concavity'
 'mean concave points' 'mean symmetry' 'mean fractal dimension'
 'radius error' 'texture error' 'perimeter error' 'area error'
 'smoothness error' 'compactness error' 'concavity error'
 'concave points error' 'symmetry error' 'fractal dimension error'
 'worst radius' 'worst texture' 'worst perimeter' 'worst area'
 'worst smoothness' 'worst compactness' 'worst concavity'
 'worst concave points' 'worst symmetry' 'worst fractal dimension']
Labels:  ['malignant' 'benign']

Let's explore it a bit more. You can also check the shape of the dataset using the `shape` attribute.

In [None]:
# Show the shape of the data (samples, features)
cancer.data.shape

In [None]:
>>
(569, 30)

Let's check the top 5 records of the feature set.

In [None]:
# print the cancer data features (top 5 records)
print(cancer.data[0:5])

In [None]:
>>
[[1.799e+01 1.038e+01 1.228e+02 1.001e+03 1.184e-01 2.776e-01 3.001e-01
  1.471e-01 2.419e-01 7.871e-02 1.095e+00 9.053e-01 8.589e+00 1.534e+02
  6.399e-03 4.904e-02 5.373e-02 1.587e-02 3.003e-02 6.193e-03 2.538e+01
  1.733e+01 1.846e+02 2.019e+03 1.622e-01 6.656e-01 7.119e-01 2.654e-01
  4.601e-01 1.189e-01]
 [2.057e+01 1.777e+01 1.329e+02 1.326e+03 8.474e-02 7.864e-02 8.690e-02
  7.017e-02 1.812e-01 5.667e-02 5.435e-01 7.339e-01 3.398e+00 7.408e+01
  5.225e-03 1.308e-02 1.860e-02 1.340e-02 1.389e-02 3.532e-03 2.499e+01
  2.341e+01 1.588e+02 1.956e+03 1.238e-01 1.866e-01 2.416e-01 1.860e-01
  2.750e-01 8.902e-02]
 [1.969e+01 2.125e+01 1.300e+02 1.203e+03 1.096e-01 1.599e-01 1.974e-01
  1.279e-01 2.069e-01 5.999e-02 7.456e-01 7.869e-01 4.585e+00 9.403e+01
  6.150e-03 4.006e-02 3.832e-02 2.058e-02 2.250e-02 4.571e-03 2.357e+01
  2.553e+01 1.525e+02 1.709e+03 1.444e-01 4.245e-01 4.504e-01 2.430e-01
  3.613e-01 8.758e-02]
 [1.142e+01 2.038e+01 7.758e+01 3.861e+02 1.425e-01 2.839e-01 2.414e-01
  1.052e-01 2.597e-01 9.744e-02 4.956e-01 1.156e+00 3.445e+00 2.723e+01
  9.110e-03 7.458e-02 5.661e-02 1.867e-02 5.963e-02 9.208e-03 1.491e+01
  2.650e+01 9.887e+01 5.677e+02 2.098e-01 8.663e-01 6.869e-01 2.575e-01
  6.638e-01 1.730e-01]
 [2.029e+01 1.434e+01 1.351e+02 1.297e+03 1.003e-01 1.328e-01 1.980e-01
  1.043e-01 1.809e-01 5.883e-02 7.572e-01 7.813e-01 5.438e+00 9.444e+01
  1.149e-02 2.461e-02 5.688e-02 1.885e-02 1.756e-02 5.115e-03 2.254e+01
  1.667e+01 1.522e+02 1.575e+03 1.374e-01 2.050e-01 4.000e-01 1.625e-01
  2.364e-01 7.678e-02]]

Let's take a look at the target set.

In [None]:
# print the cancer labels (0:malignant, 1:benign)
print(cancer.target)

In [None]:
>>
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 1 0 0 0 0 0 0 0 0 1 0 1 1 1 1 1 0 0 1 0 0 1 1 1 1 0 1 0 0 1 1 1 1 0 1 0 0
 1 0 1 0 0 1 1 1 0 0 1 0 0 0 1 1 1 0 1 1 0 0 1 1 1 0 0 1 1 1 1 0 1 1 0 1 1
 1 1 1 1 1 1 0 0 0 1 0 0 1 1 1 0 0 1 0 1 0 0 1 0 0 1 1 0 1 1 0 1 1 1 1 0 1
 1 1 1 1 1 1 1 1 0 1 1 1 1 0 0 1 0 1 1 0 0 1 1 0 0 1 1 1 1 0 1 1 0 0 0 1 0
 1 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1 0 0 0 1 0 1 0 1 1 0 1 0 0 0 0 1 1 0 0 1 1
 1 0 1 1 1 1 1 0 0 1 1 0 1 1 0 0 1 0 1 1 1 1 0 1 1 1 1 1 0 1 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 0 1 1 0 1 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1
 1 0 1 1 0 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 0 1 0 1 1 1 1 0 0 0 1 1
 1 1 0 1 0 1 0 1 1 1 0 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 1 1 1 0 0 1 0 0
 0 1 0 0 1 1 1 1 1 0 1 1 1 1 1 0 1 1 1 0 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1
 1 0 1 1 1 1 1 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 0 1 1 1 1 1 0 1 1
 0 1 0 1 1 0 1 0 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 0 1
 1 1 1 1 1 1 0 1 0 1 1 0 1 1 1 1 1 0 0 1 0 1 0 1 1 1 1 1 0 1 1 0 1 0 1 0 0
 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 0 0 0 0 0 0 1]

## Splitting Data

To understand model performance, dividing the dataset into a training set and a test set is a good strategy.

Split the dataset by using the function train_test_split(). You need to pass 3 parameters: the features, the target, and the test set size. Additionally, you can set random_state so that the random split is reproducible.

In [None]:
# Import train_test_split function
from sklearn.model_selection import train_test_split

# Split dataset into training set and test set (70% training, 30% test)
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.3, random_state=109
)

## Generating Model

Let's build a support vector machine model. First, import the svm module and create a support vector classifier object by passing the argument kernel='linear' to the SVC() function.

Then, fit your model on the training set using fit() and perform prediction on the test set using predict().

In [None]:
#Import svm model
from sklearn import svm

#Create a svm Classifier
clf = svm.SVC(kernel='linear') # Linear Kernel

#Train the model using the training sets
clf.fit(X_train, y_train)

#Predict the response for test dataset
y_pred = clf.predict(X_test)

## Evaluating the Model

Let's estimate how accurately the classifier or model can predict the breast cancer of patients.

Accuracy can be computed by comparing actual test set values and predicted values.

In [None]:
#Import scikit-learn metrics module for accuracy calculation
from sklearn import metrics

# Model Accuracy: how often is the classifier correct?
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

In [None]:
>>
Accuracy: 0.9649122807017544

Well, you got a classification rate of 96.49%, which is considered very good accuracy.

For further evaluation, you can also check the precision and recall of the model.

In [None]:
# Model Precision: what percentage of tuples predicted as positive are actually positive?
print("Precision:",metrics.precision_score(y_test, y_pred))

# Model Recall: what percentage of actual positive tuples are labelled as such?
print("Recall:",metrics.recall_score(y_test, y_pred))

In [None]:
>>
Precision: 0.9811320754716981
Recall: 0.9629629629629629

Well, you got a precision of 98% and a recall of 96%, which are considered very good values.
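
Precision and recall both derive from the confusion matrix, so it can help to look at it directly. The sketch below repeats the split and the linear model from this notebook so that it runs on its own; note that `precision_score` and `recall_score` treat label 1 (benign in this dataset) as the positive class by default.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

cancer = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer.data, cancer.target, test_size=0.3, random_state=109
)
clf = svm.SVC(kernel="linear").fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true labels, columns are predicted labels (0: malignant, 1: benign)
cm = metrics.confusion_matrix(y_test, y_pred)
print(cm)

# With label 1 as the positive class:
tn, fp, fn, tp = cm.ravel()
precision = tp / (tp + fp)  # correct among predicted positives
recall = tp / (tp + fn)     # found among actual positives
print("Precision:", precision, "Recall:", recall)

# Per-class summary of precision, recall, and F1
print(metrics.classification_report(
    y_test, y_pred, target_names=cancer.target_names
))
```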

## Tuning Hyperparameters

 - Kernel: The main function of the kernel is to transform the input data into the required form. Available kernel functions include linear, polynomial, and radial basis function (RBF). Polynomial and RBF kernels are useful for non-linear decision boundaries: they compute the separating hyperplane in a higher-dimensional space. For classes that are curved or not linearly separable, a more complex kernel is often suggested, and this transformation can lead to more accurate classifiers.
 - Regularization: In scikit-learn, the C parameter controls regularization. C is the penalty parameter for the misclassification (error) term, which tells the SVM optimization how much error is bearable. This is how you control the trade-off between the width of the margin and the misclassification term. A smaller value of C creates a larger-margin hyperplane that tolerates more misclassified points, and a larger value of C creates a smaller-margin hyperplane.
 - Gamma: A lower value of gamma fits the training dataset loosely, whereas a higher value of gamma fits the training dataset closely, which can cause over-fitting. In other words, with a high value of gamma only nearby points influence the separation line, while with a low value of gamma faraway points also influence it.
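
The influence of gamma can be seen by comparing training and test accuracy. The sketch below is an illustration under some assumptions: the features are standardized first, the kernel is RBF, and the three gamma values are arbitrary picks for a low, a default, and a very high setting.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=109
)

# Maps each gamma to (training accuracy, test accuracy)
results = {}
for gamma in (0.0001, "scale", 10.0):
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma=gamma))
    model.fit(X_train, y_train)
    results[gamma] = (
        model.score(X_train, y_train),
        model.score(X_test, y_test),
    )
    print(f"gamma={gamma!s:<7} train={results[gamma][0]:.3f} "
          f"test={results[gamma][1]:.3f}")
```

A large gap between training and test accuracy for the highest gamma is the typical sign of over-fitting.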

## Exercises

### Exercise 1: Experimenting with Different Kernels

1. **Polynomial Kernel**: Modify the code to use a polynomial kernel (`kernel='poly'`) and observe the changes in accuracy, precision, and recall.
2. **RBF Kernel**: Modify the code to use an RBF kernel (`kernel='rbf'`) and observe the changes in accuracy, precision, and recall.
3. **Sigmoid Kernel**: Modify the code to use a sigmoid kernel (`kernel='sigmoid'`) and observe the changes in accuracy, precision, and recall.


### Exercise 2: Tuning the C Parameter

1. **Low C Value**: Modify the code to set a low value for the `C` parameter (e.g., `C=0.1`) and observe how it affects the model's accuracy, precision, and recall.
2. **High C Value**: Modify the code to set a high value for the `C` parameter (e.g., `C=1000`) and observe how it affects the model's accuracy, precision, and recall.
3. **Intermediate C Value**: Modify the code to set an intermediate value for the `C` parameter (e.g., `C=1`) and observe how it affects the model's accuracy, precision, and recall.


### Exercise 3: Tuning the Gamma Parameter

1. **Low Gamma Value**: Modify the code to set a low value for the `gamma` parameter (e.g., `gamma=0.01`) when using the RBF kernel and observe how it affects the model's accuracy, precision, and recall.
2. **High Gamma Value**: Modify the code to set a high value for the `gamma` parameter (e.g., `gamma=1`) when using the RBF kernel and observe how it affects the model's accuracy, precision, and recall.
3. **Default Gamma Value**: Modify the code to use the default value for the `gamma` parameter (`gamma='scale'`) when using the RBF kernel and observe how it affects the model's accuracy, precision, and recall.

### Exercise 4: Combining Parameter Changes

1. **Combination 1**: Modify the code to use a polynomial kernel with `degree=3` and `C=1`. Observe the changes in accuracy, precision, and recall.
2. **Combination 2**: Modify the code to use an RBF kernel with `gamma=0.1` and `C=10`. Observe the changes in accuracy, precision, and recall.
3. **Combination 3**: Modify the code to use a sigmoid kernel with `C=0.5` and `gamma=0.5`. Observe the changes in accuracy, precision, and recall.

### Exercise 5: Cross-Validation

1. **Cross-Validation with Linear Kernel**: Modify the code to use cross-validation with a linear kernel and observe the changes in the model's performance metrics.
2. **Cross-Validation with RBF Kernel**: Modify the code to use cross-validation with an RBF kernel and different values of `gamma` and `C`. Observe the changes in the model's performance metrics.
3. **Cross-Validation with Polynomial Kernel**: Modify the code to use cross-validation with a polynomial kernel and different values of `degree` and `C`. Observe the changes in the model's performance metrics.
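
As a possible starting point for this exercise, the sketch below runs 5-fold cross-validation with `cross_val_score`. Putting the scaler inside a pipeline is a choice made here so that scaling is re-fitted on each training fold; the number of folds and the accuracy metric are likewise assumptions you can change.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# Scaler and classifier are fitted together on each training fold
model = make_pipeline(StandardScaler(), SVC(kernel="linear"))

# One accuracy score per fold
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print("Fold accuracies:", scores)
print("Mean accuracy:", scores.mean(), "Std:", scores.std())
```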

### Exercise 6: Feature Scaling

1. **Without Scaling**: Observe the SVM model performance without scaling the features.
2. **With Standard Scaling**: Use `StandardScaler` from `sklearn.preprocessing` to scale the features and observe the changes in model performance.
3. **With MinMax Scaling**: Use `MinMaxScaler` from `sklearn.preprocessing` to scale the features and observe the changes in model performance.

### Exercise 7: Handling Imbalanced Data

1. **Class Weights Adjustment**: Modify the `class_weight` parameter to handle imbalanced data and observe the changes in model performance.
2. **SMOTE (Synthetic Minority Over-sampling Technique)**: Use the `SMOTE` technique (available in the separate `imbalanced-learn` package) to balance the dataset and observe the changes in model performance.

### Exercise 8: Learning Curves

1. **Plot Learning Curves**: Modify the code to plot learning curves to visualize the training and validation scores.
2. **Analyze Overfitting and Underfitting**: Use the learning curves to analyze if the model is overfitting or underfitting.
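
One way to obtain the numbers behind a learning curve is scikit-learn's `learning_curve` function. The sketch below only prints the mean scores; the same arrays can be passed to a plotting library of your choice. The RBF kernel, the five training sizes, and the shuffling seed are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import learning_curve
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Train on growing subsets; each subset is evaluated with 5-fold CV
train_sizes, train_scores, val_scores = learning_curve(
    model, X, y,
    cv=5,
    train_sizes=np.linspace(0.2, 1.0, 5),
    shuffle=True,
    random_state=0,
)

# Average over the folds (axis 1) for each training size
for n, tr, va in zip(train_sizes,
                     train_scores.mean(axis=1),
                     val_scores.mean(axis=1)):
    print(f"n_train={n:>3}  train={tr:.3f}  validation={va:.3f}")
```

A training score far above the validation score suggests overfitting, while two low scores that stay close together suggest underfitting.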

### Exercise 9: Grid Search for Hyperparameter Tuning

1. **Grid Search with Linear Kernel**: Use `GridSearchCV` to find the best parameters for an SVM with a linear kernel.
2. **Grid Search with RBF Kernel**: Use `GridSearchCV` to find the best parameters for an SVM with an RBF kernel.
3. **Grid Search with Polynomial Kernel**: Use `GridSearchCV` to find the best parameters for an SVM with a polynomial kernel.
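
A possible starting point for the RBF case is sketched below. The parameter grid is a small, arbitrary example, and the `svc__` prefix comes from the step name that `make_pipeline` assigns to `SVC`. The search uses only the training split, so the test split remains available for a final check.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=109
)

pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Illustrative grid; widen or refine it as needed
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": [0.001, 0.01, 0.1],
}

# 5-fold cross-validation for every combination in the grid
grid = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
grid.fit(X_train, y_train)

print("Best parameters:", grid.best_params_)
print("Best cross-validated accuracy:", grid.best_score_)
print("Test accuracy of the best model:", grid.score(X_test, y_test))
```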

### Exercise 10: SVM for Regression

1. **SVR (Support Vector Regression)**: Modify the code to use SVR for a regression problem and observe the model's performance.
2. **Tuning SVR Parameters**: Experiment with different parameters for SVR and observe the changes in model performance.
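
As a possible starting point, the sketch below fits an `SVR` and reports mean squared error and R^2. The Diabetes dataset, the RBF kernel, and the values `C=100.0` and `epsilon=0.1` are assumptions chosen for illustration, not tuned settings.

```python
from sklearn.datasets import load_diabetes
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=109
)

# Untuned example values for C and epsilon
model = make_pipeline(
    StandardScaler(), SVR(kernel="rbf", C=100.0, epsilon=0.1)
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean squared error:", mse)
print("R^2 score:", r2)
```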

### Exercise 11: SVM Classification on the "Iris" Dataset
Perform SVM classification on the famous Iris dataset to classify the species of iris flowers. Experiment with different kernels and parameters to optimize the model's performance.

Instructions:

1. **Load the Iris Dataset**: Use the `load_iris()` function from `sklearn.datasets` to load the Iris dataset.
2. **Experiment with Different Kernels**: Modify the code to use polynomial (`kernel='poly'`), RBF (`kernel='rbf'`), and sigmoid (`kernel='sigmoid'`) kernels. Observe the changes in accuracy, precision, and recall.
3. **Tune the C Parameter**: Experiment with different values of the `C` parameter (e.g., `C=0.1`, `C=1`, `C=1000`). Observe how it affects the model's performance.
4. **Tune the Gamma Parameter**: For the RBF kernel, experiment with different values of the `gamma` parameter (e.g., `gamma=0.01`, `gamma=0.1`, `gamma=1`). Observe how it affects the model's performance.


### Exercise 12: SVR on the "Boston Housing" Dataset
Perform Support Vector Regression (SVR) on the Boston Housing dataset to predict housing prices. Experiment with different kernels and parameters to optimize the model's performance.

Instructions:

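1. **Load the Boston Housing Dataset**: Use the `load_boston()` function from `sklearn.datasets` to load the Boston Housing dataset. Note that `load_boston()` was removed in scikit-learn 1.2; with newer versions, use another regression dataset such as the one returned by `fetch_california_housing()`.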
2. **SVR with Different Kernels**: Modify the code to use SVR with linear (`kernel='linear'`), polynomial (`kernel='poly'`), and RBF (`kernel='rbf'`) kernels. Observe the changes in mean squared error and R^2 score.
3. **Tune the C Parameter**: Experiment with different values of the `C` parameter (e.g., `C=0.1`, `C=1`, `C=1000`). Observe how it affects the model's performance.
4. **Tune the Gamma Parameter**: For the RBF kernel, experiment with different values of the `gamma` parameter (e.g., `gamma=0.01`, `gamma=0.1`, `gamma=1`). Observe how it affects the model's performance.


### Exercise 13: SVM Classification on the "Wine" Dataset
Perform SVM classification on the Wine dataset to classify the type of wine. Experiment with different kernels and parameters to optimize the model's performance.

Instructions:

1. **Load the Wine Dataset**: Use the `load_wine()` function from `sklearn.datasets` to load the Wine dataset.
2. **Experiment with Different Kernels**: Modify the code to use polynomial (`kernel='poly'`), RBF (`kernel='rbf'`), and sigmoid (`kernel='sigmoid'`) kernels. Observe the changes in accuracy, precision, and recall.
3. **Tune the C Parameter**: Experiment with different values of the `C` parameter (e.g., `C=0.1`, `C=1`, `C=1000`). Observe how it affects the model's performance.
4. **Tune the Gamma Parameter**: For the RBF kernel, experiment with different values of the `gamma` parameter (e.g., `gamma=0.01`, `gamma=0.1`, `gamma=1`). Observe how it affects the model's performance.

### Exercise 14: SVR on the "Diabetes" Dataset
Perform Support Vector Regression (SVR) on the Diabetes dataset to predict disease progression. Experiment with different kernels and parameters to optimize the model's performance.

Instructions:

1. **Load the Diabetes Dataset**: Use the `load_diabetes()` function from `sklearn.datasets` to load the Diabetes dataset.
2. **SVR with Different Kernels**: Modify the code to use SVR with linear (`kernel='linear'`), polynomial (`kernel='poly'`), and RBF (`kernel='rbf'`) kernels. Observe the changes in mean squared error and R^2 score.
3. **Tune the C Parameter**: Experiment with different values of the `C` parameter (e.g., `C=0.1`, `C=1`, `C=1000`). Observe how it affects the model's performance.
4. **Tune the Gamma Parameter**: For the RBF kernel, experiment with different values of the `gamma` parameter (e.g., `gamma=0.01`, `gamma=0.1`, `gamma=1`). Observe how it affects the model's performance.