### ***SVM*** ###

SVM (Support Vector Machine) is a supervised learning model used for classification and regression tasks. For classification, it finds the hyperplane that best separates the classes in the feature space.
It can handle both linear and non-linear data using kernel functions.

***core concept*** - Find the hyperplane (a line in 2D) that separates the classes with the largest possible margin.

***margin*** - The distance between the hyperplane and the closest point from each class.

***support vectors*** - The data points closest to the decision boundary. They alone determine where the boundary lies.
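
To make the margin and support vectors concrete, here is a minimal sketch on a made-up, linearly separable 2D dataset (the points are an assumption chosen for illustration). It uses a very large `C` to approximate a hard margin and reads the hyperplane from scikit-learn's `coef_` and `intercept_`. For a linear SVM, the distance from the hyperplane to the closest point is `1 / ||w||`.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (made up for illustration).
X = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
              [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no room for margin violations.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]        # normal vector of the hyperplane
b = clf.intercept_[0]   # offset of the hyperplane

# Distance from the hyperplane to the closest point of each class.
margin = 1.0 / np.linalg.norm(w)

# Distance of every training point to the hyperplane.
distances = np.abs(X @ w + b) / np.linalg.norm(w)

print("support vectors:\n", clf.support_vectors_)
print("margin:", margin)
print("smallest point-to-hyperplane distance:", distances.min())
```

The smallest point-to-hyperplane distance should match the margin, and the points that attain it are the support vectors.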

***Hard margin SVM*** works only if the data is linearly separable: no negative points in the positive zone and vice versa.

***Soft margin SVM*** allows some misclassifications in exchange for better overall accuracy.
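
As a rough illustration of the soft margin, the sketch below fits a linear `SVC` on two overlapping synthetic clusters (generated with `make_blobs`; the centers and spread are assumptions) and compares a small and a large `C`. A smaller `C` tolerates more margin violations, which typically shows up as a wider margin and more support vectors.

```python
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping clusters, so no hard-margin solution exists.
X, y = make_blobs(n_samples=200, centers=[[-1, -1], [1, 1]],
                  cluster_std=1.2, random_state=0)

n_support = {}
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    # n_support_ holds the number of support vectors per class.
    n_support[C] = int(clf.n_support_.sum())
    print(f"C={C}: {n_support[C]} support vectors, "
          f"train accuracy={clf.score(X, y):.3f}")
```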
    
***Non-linear SVM*** is used when the data is not linearly separable. It uses the kernel trick to implicitly map the data into a higher-dimensional space where it can become linearly separable.

***Hinge loss*** is a loss function used in training classifiers, especially for support vector machines (SVMs). It measures how well the model's predictions align with the true class labels, penalizing incorrect classifications and those that are correct but not confident enough.
    
If the prediction is correct but too close to the decision boundary (inside the margin), hinge loss is small but non-zero.

If the prediction is incorrect, hinge loss is large and increases linearly with distance from the margin.

If the prediction is correct and lies on or beyond the margin, hinge loss is zero.

***Note*** - It is not enough to be correct; the prediction must be confidently correct, by a margin of at least 1.
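
The three cases can be checked numerically. This is a minimal sketch of the standard hinge loss `max(0, 1 - y * f(x))`, assuming labels in `{-1, +1}` and raw decision scores `f(x)`; the helper name `hinge_loss` and the example scores are made up for illustration.

```python
import numpy as np

def hinge_loss(y_true, scores):
    """Per-sample hinge loss max(0, 1 - y * f(x)) for labels in {-1, +1}."""
    y_true = np.asarray(y_true, dtype=float)
    scores = np.asarray(scores, dtype=float)
    return np.maximum(0.0, 1.0 - y_true * scores)

# All three samples have true label +1.
y_true = [+1, +1, +1]
scores = [2.5,   # correct and beyond the margin -> loss 0
          0.3,   # correct but inside the margin -> small loss
          -1.0]  # incorrect                     -> larger loss

losses = hinge_loss(y_true, scores)
print(losses)  # losses: 0.0, 0.7, 2.0
```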

![image-2.png](attachment:image-2.png)

***Kernel*** is a function that implicitly maps the data into a higher-dimensional space where it may become linearly separable.

For example, a 2D dataset that is not linearly separable can be transformed into a 3D dataset using a polynomial kernel.
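
A small sketch of this idea, assuming a synthetic concentric-circles dataset from `make_circles`. The third feature `x1^2 + x2^2` is added by hand here as a simplified stand-in for what a degree-2 polynomial kernel does implicitly; a kernel SVM never builds these features explicitly.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Inner circle vs. outer ring: not linearly separable in 2D.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Linear SVM on the original 2D data.
acc_2d = SVC(kernel="linear").fit(X, y).score(X, y)

# Add a third feature: squared distance from the origin.
z = (X ** 2).sum(axis=1, keepdims=True)
X_3d = np.hstack([X, z])

# Linear SVM on the lifted 3D data.
acc_3d = SVC(kernel="linear").fit(X_3d, y).score(X_3d, y)

# Degree-2 polynomial kernel on the original 2D data (implicit mapping).
acc_poly = SVC(kernel="poly", degree=2).fit(X, y).score(X, y)

print(f"linear, 2D features: {acc_2d:.3f}")
print(f"linear, 3D features: {acc_3d:.3f}")
print(f"poly kernel, 2D:     {acc_poly:.3f}")
```

These are training accuracies, used only to show separability, not generalization.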

![image.png](attachment:image.png)

| Kernel                          | Description                           | Example Use           |
| ------------------------------- | ------------------------------------- | --------------------- |
| **Linear**                      | Works when data is linearly separable | Simple classification |
| **Polynomial**                  | Adds polynomial features              | Curved boundaries     |
| **RBF (Radial Basis Function)** | Popular, handles non-linear data      | Most common default   |
| **Sigmoid**                     | Similar to neural networks            | Experimental          |
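
One hedged way to compare the kernels in the table is cross-validation on a small non-linear dataset. The sketch below assumes the synthetic `make_moons` data and default hyperparameters, so the ranking it prints is specific to this toy setup and will not carry over to every dataset.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Two interleaving half-moons: a curved class boundary.
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    # Scale inside the pipeline so each CV fold is scaled on its own training part.
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    scores[kernel] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{kernel:8s} mean CV accuracy: {scores[kernel]:.3f}")
```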


***Important hyperparameters***

| Parameter  | Description                                                                                    |
| ---------- | ---------------------------------------------------------------------------------------------- |
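| **C**      | Regularization parameter: controls the trade-off between correct classification and margin width. Larger C penalizes misclassifications more (narrower margin); smaller C tolerates more (wider margin). |
| **Kernel** | Type of kernel to use ('linear', 'poly', 'rbf', 'sigmoid').                                    |
| **Gamma**  | Kernel coefficient for 'rbf', 'poly' and 'sigmoid'. In the RBF kernel it controls how far the influence of a single training example reaches (larger gamma means a more local influence). |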
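
To see what gamma does, recall that the RBF kernel is `K(x, z) = exp(-gamma * ||x - z||^2)`. The sketch below computes it by hand for two made-up points and checks it against scikit-learn's `rbf_kernel`. As gamma grows, the similarity between the same two points drops, meaning each training example influences a smaller neighbourhood.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

# Two example points (made up); squared distance between them is 2.
x = np.array([[0.0, 0.0]])
z = np.array([[1.0, 1.0]])

similarities = {}
for gamma in (0.1, 1.0, 10.0):
    by_hand = np.exp(-gamma * np.sum((x - z) ** 2))
    from_sklearn = rbf_kernel(x, z, gamma=gamma)[0, 0]
    similarities[gamma] = (by_hand, from_sklearn)
    print(f"gamma={gamma}: by hand={by_hand:.6f}, sklearn={from_sklearn:.6f}")
```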


In [None]:
# implementation of SVM in python using sklearn

from sklearn.svm import SVC

# create a SVM classifier
model = SVC(kernel='linear', C=1.0)

# where kernel can be 'linear', 'poly', 'rbf', 'sigmoid'
# C is the regularization parameter
# gamma is the kernel coefficient for 'rbf', 'poly' and 'sigmoid'

Order of Implementation — Classification and Regression

Common preliminary steps
1. Import necessary libraries (numpy, pandas, sklearn, matplotlib/seaborn)
2. Load the dataset
3. Exploratory Data Analysis (class balance, distributions, correlations)
4. Preprocess the data (handle missing values, encode categoricals, feature scaling — scaling is important for SVM)
5. Split data into training and testing sets (and optionally a validation set or use CV)

Classification (SVM / SVC)
1. Choose model: SVC (select kernel: 'linear', 'poly', 'rbf', 'sigmoid')
2. Initialize model with baseline hyperparameters (C, kernel, gamma where applicable)
3. Train the model on training data
4. Make predictions (class labels and optionally probabilities/decision function)
5. Evaluate performance: accuracy, precision, recall, F1, confusion matrix, ROC-AUC
6. Tune hyperparameters using GridSearchCV/RandomizedSearchCV with cross-validation
7. Validate final model on test set
8. (Optional) Visualize decision boundary (2D or via PCA/t-SNE), learning curves, and feature importances (linear kernel only, via `coef_`)
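
The classification steps above can be sketched end to end as follows. This is one possible arrangement, not a definitive recipe: the dataset (scikit-learn's built-in breast cancer data), the small parameter grid, and the `f1` scoring choice are all assumptions made to keep the example short.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load data and make a stratified train/test split.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Scaling lives inside the pipeline so it is fit on training folds only.
pipe = Pipeline([("scaler", StandardScaler()), ("svc", SVC())])

# Small illustrative grid; gamma is ignored by the linear kernel.
param_grid = {
    "svc__kernel": ["linear", "rbf"],
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01],
}
search = GridSearchCV(pipe, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# Evaluate the tuned model once on the held-out test set.
y_pred = search.predict(X_test)
scores = search.decision_function(X_test)  # signed distance to the boundary
test_accuracy = accuracy_score(y_test, y_pred)
test_auc = roc_auc_score(y_test, scores)

print("best params:", search.best_params_)
print("accuracy:", test_accuracy)
print("ROC-AUC:", test_auc)
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
```

ROC-AUC is computed from `decision_function` rather than probabilities, which avoids the extra cost of `SVC(probability=True)`.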

Regression (SVR)
1. Choose model: SVR (select kernel: 'linear', 'poly', 'rbf', 'sigmoid')
2. Initialize model with baseline hyperparameters (C, epsilon, kernel, gamma)
3. Train the model on training data
4. Make predictions (continuous values)
5. Evaluate performance: MSE/RMSE, MAE, R², residual plots
6. Tune hyperparameters using GridSearchCV/RandomizedSearchCV with cross-validation
7. Validate final model on test set
8. (Optional) Visualize predicted vs actual, residuals, and learning curves
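
The regression steps can be sketched in the same way. The data here is synthetic (a noisy sine curve generated with NumPy), and the parameter grid is an assumption chosen for brevity, so treat the numbers it prints as illustrative only.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic 1D regression problem: y = sin(x) + noise.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0.0, 0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

pipe = Pipeline([("scaler", StandardScaler()), ("svr", SVR(kernel="rbf"))])

# epsilon sets the width of the tube inside which errors are not penalized.
param_grid = {
    "svr__C": [1, 10, 100],
    "svr__epsilon": [0.01, 0.1],
    "svr__gamma": ["scale", 0.1, 1.0],
}
search = GridSearchCV(pipe, param_grid, cv=5,
                      scoring="neg_mean_squared_error")
search.fit(X_train, y_train)

# Evaluate the tuned model on the held-out test set.
y_pred = search.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("best params:", search.best_params_)
print(f"MSE={mse:.4f}  RMSE={rmse:.4f}  MAE={mae:.4f}  R2={r2:.4f}")
```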

Notes
- Scaling is critical for SVM kernels (StandardScaler/MinMaxScaler).
- For large datasets consider LinearSVC or approximate solvers for performance.
- Use stratified splits for classification when classes are imbalanced; consider resampling or the `class_weight` parameter to handle imbalance.
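
A minimal sketch combining these notes, assuming a synthetic imbalanced dataset from `make_classification` (roughly 90% / 10%): a stratified split, scaling inside a pipeline, and `LinearSVC` with `class_weight="balanced"`. The dataset sizes and `max_iter` value are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Imbalanced binary problem: about 10% of samples belong to class 1.
X, y = make_classification(n_samples=2000, n_features=10, n_informative=5,
                           weights=[0.9, 0.1], random_state=0)

# stratify=y keeps the class ratio the same in train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# LinearSVC scales better than kernel SVC on large datasets;
# class_weight="balanced" reweights errors inversely to class frequency.
model = make_pipeline(
    StandardScaler(),
    LinearSVC(class_weight="balanced", max_iter=10000),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("minority share (train, test):", y_train.mean(), y_test.mean())
print(classification_report(y_test, y_pred))
```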
