<a href="https://colab.research.google.com/github/ronakbihani123/supervised_ml_algorithms/blob/main/svm_ml_algorithm.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# SVM (support vector machine) is a supervised machine learning algorithm used for both classification and regression tasks.
# It is particularly effective on complex, high-dimensional datasets.
# The fundamental idea of SVM is to find an optimal hyperplane that maximally separates the classes in the input space.
# How does SVM work?
# SVM requires labeled training data consisting of input features and corresponding class labels.
# Each data point is represented as a feature vector.
# The data is preprocessed and scaled so that the features are on similar scales, for example between 0 and 1.
# Hyperplane and margin
# In a binary classification problem, SVM aims to find the hyperplane that best separates the classes in the feature space.
# The hyperplane is a line in 2-D space and a hyperplane in a higher-dimensional space.
# SVM seeks to maximize the margin, which is the distance between the hyperplane and the nearest data point from each class.
# The points that lie on the margin are known as support vectors, as they play a crucial role in defining the decision boundary.
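
To make the hyperplane, margin, and support vectors concrete, here is a minimal sketch. It assumes two well-separated synthetic clusters from `make_blobs` and a large `C` to approximate a hard margin; the data, the value of `C`, and the variable names are illustrative choices, not part of the notes above. For a linear SVM with weight vector `w`, the margin width is `2 / ||w||`.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated clusters, so a separating hyperplane exists (illustrative data).
X_demo, y_demo = make_blobs(n_samples=60, centers=[[-3, -3], [3, 3]],
                            cluster_std=0.5, random_state=0)

# A large C approximates a hard margin on separable data (assumed value).
clf = SVC(kernel="linear", C=1000.0)
clf.fit(X_demo, y_demo)

# For a linear kernel, the hyperplane is w . x + b = 0 and the margin width is 2 / ||w||.
w = clf.coef_[0]
margin_width = 2.0 / np.linalg.norm(w)

# The support vectors are the training points that sit on the margin.
support_vectors = clf.support_vectors_
sv_scores = clf.decision_function(support_vectors)

print("Support vectors per class:", clf.n_support_)
print("Margin width:", margin_width)
print("Decision values at the support vectors:", sv_scores)
```

On separable data like this, the decision values at the support vectors should come out close to +1 or -1, which is what "lying on the margin" means in the usual scaling.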

In [None]:
# Linear kernel
# A linear SVM finds a linear hyperplane that separates the data into different classes.
# The goal is to find the hyperplane that maximizes the margin while minimizing the misclassification of training examples.
# This can be formulated as an optimization problem whose objective is to minimize the norm of the hyperplane's weight vector,
# subject to the constraint that all training examples lie on the correct side of the hyperplane (the hard-margin case).
# In the soft-margin case, violations of this constraint are allowed but penalized through the regularization parameter C.
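
The optimization problem described above can be written out as follows. The notation is an assumption introduced here for illustration: training pairs $(x_i, y_i)$ with labels $y_i \in \{-1, +1\}$, weight vector $w$, bias $b$, slack variables $\xi_i$, and regularization parameter $C$. This is the standard textbook form, given as a sketch rather than as the exact problem any particular solver uses.

```latex
% Hard-margin SVM: every training example must lie on the correct side of the margin.
\min_{w,\,b} \; \frac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1, \qquad i = 1, \dots, n

% Soft-margin SVM: slack variables \xi_i allow violations, penalized by C.
\min_{w,\,b,\,\xi} \; \frac{1}{2}\lVert w \rVert^{2} + C \sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
y_i \left( w^{\top} x_i + b \right) \ge 1 - \xi_i, \quad \xi_i \ge 0
```

Minimizing $\lVert w \rVert$ is equivalent to maximizing the margin, because the margin width equals $2 / \lVert w \rVert$.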

In [None]:
# non linear svm

# When the data is not linearly separable, SVM uses a technique called the kernel trick. The kernel trick implicitly maps the original input space into a
# higher-dimensional feature space where the data points may become linearly separable.
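
The kernel trick can be checked by hand on a small case. The sketch below assumes a degree-2 polynomial kernel `(u . v + 1) ** 2` on 2-D inputs and shows that it equals an ordinary dot product in a 6-dimensional feature space. The helper names `poly2_kernel` and `poly2_feature_map` are hypothetical and exist only for this illustration.

```python
import numpy as np

def poly2_kernel(u, v):
    """Degree-2 polynomial kernel, computed directly in the 2-D input space."""
    return (np.dot(u, v) + 1.0) ** 2

def poly2_feature_map(u):
    """Explicit 6-D feature map whose dot product reproduces poly2_kernel."""
    u1, u2 = u
    s = np.sqrt(2.0)
    return np.array([1.0, s * u1, s * u2, u1 ** 2, u2 ** 2, s * u1 * u2])

u = np.array([1.0, 2.0])
v = np.array([3.0, -1.0])

kernel_value = poly2_kernel(u, v)
explicit_value = np.dot(poly2_feature_map(u), poly2_feature_map(v))

print("Kernel in input space:      ", kernel_value)
print("Dot product in feature space:", explicit_value)
```

Both values agree (up to floating-point rounding), but the kernel never builds the 6-D vectors. That saving is why kernels such as RBF, whose feature space is infinite-dimensional, remain computable.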

In [None]:
# training --->

# SVM training involves finding the optimal hyperplane, or decision boundary, that separates the classes.
# The optimization problem is typically solved using methods such as quadratic programming or sequential minimal optimization (SMO).

# The process involves solving for the weights of the hyperplane and the bias term that define the decision boundary.
# The objective is to minimize the regularization term while ensuring that the training examples are classified correctly.
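
As a small illustration of the weights and bias that training produces, the sketch below fits a linear `SVC` and rebuilds its decision values by hand from `coef_` and `intercept_`. The dataset parameters and `C=1.0` are assumptions chosen to mirror the examples later in this notebook.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Small synthetic binary problem (illustrative parameters).
X_demo, y_demo = make_classification(n_samples=100, n_features=2, n_informative=2,
                                     n_redundant=0, random_state=42)

clf = SVC(kernel="linear", C=1.0)
clf.fit(X_demo, y_demo)

# For a linear kernel, scikit-learn exposes the learned weights and bias.
w = clf.coef_[0]
b = clf.intercept_[0]

# The decision boundary is the set of points where w . x + b = 0.
manual_scores = X_demo @ w + b
library_scores = clf.decision_function(X_demo)

print("Weights:", w)
print("Bias:", b)
print("Manual scores match decision_function:", np.allclose(manual_scores, library_scores))
```

Note that `coef_` is only available for the linear kernel; with other kernels the boundary is defined through the support vectors and dual coefficients instead.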


In [None]:
# prediction--->

# Once the SVM model is trained, it can be used to predict the class labels of new, unseen data points.
# The algorithm computes a signed score for each test point that is proportional to its distance from the decision boundary,
# and the predicted class label is determined by which side of the decision boundary the test point lies on.

# The decision function value can also serve as a confidence score, and it can be calibrated into a probability for the prediction.
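
The prediction step can be sketched as follows, assuming a binary problem and scikit-learn's `SVC`. The dataset and the use of `probability=True` (which fits an additional calibration step during training) are illustrative assumptions. In the binary case, a positive decision value corresponds to `classes_[1]` and a negative one to `classes_[0]`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X_demo, y_demo = make_classification(n_samples=100, n_features=2, n_informative=2,
                                     n_redundant=0, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

# probability=True enables predict_proba at the cost of extra training time.
clf = SVC(kernel="rbf", probability=True, random_state=42)
clf.fit(X_tr, y_tr)

# Signed scores: the sign tells which side of the boundary a point lies on.
scores = clf.decision_function(X_te)
labels_from_scores = clf.classes_[(scores > 0).astype(int)]

predicted = clf.predict(X_te)
probabilities = clf.predict_proba(X_te)

print("Labels from score sign match predict():", np.array_equal(labels_from_scores, predicted))
print("First three decision values:", scores[:3])
print("First three probability rows:", probabilities[:3])
```

The calibrated probabilities come from a separate model fitted on top of the scores, so on borderline points they may not always agree with `predict()`.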



In [None]:
# Advantages of SVM -->
# It is effective in high-dimensional spaces.
# It is fairly robust against overfitting due to the margin maximization principle.
# It is versatile through the use of different kernel functions.
# It can handle both linear and non-linear classification tasks.


In [None]:
# Limitations --->
# It is computationally expensive for large datasets.
# It requires careful selection of hyperparameters such as the regularization parameter and the kernel parameters.
# The learned model is harder to interpret than simpler algorithms such as logistic regression.
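
One common way to handle the hyperparameter selection mentioned above is a cross-validated grid search. The sketch below assumes an RBF kernel, a small hand-picked grid for `C` and `gamma`, and feature scaling inside a pipeline; the grid values and dataset are illustrative, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_demo, y_demo = make_classification(n_samples=200, n_features=5, n_informative=2,
                                     n_redundant=0, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

# Scaling lives inside the pipeline so each CV fold is scaled using its own training part.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# make_pipeline names the SVC step "svc", hence the "svc__" prefix (illustrative grid).
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": ["scale", 0.01, 0.1],
}

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_tr, y_tr)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
print("Test accuracy:", search.score(X_te, y_te))
```

The grid grows multiplicatively with each added parameter, which compounds the training cost noted in the first limitation.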


In [None]:
# How do I select which kernel to use for which data in the SVM algorithm? --->
# The choice of kernel depends on the data and the problem we are trying to solve --->

# 1. Linear kernel ---> It is suitable for linearly separable data. It works well when there is a clear linear boundary between classes.
# 2. Polynomial kernel ---> It maps the data into a higher-dimensional space using polynomial functions.
# It is useful when the decision boundary is curved or has a higher degree of complexity.
# The degree of the polynomial, which determines the complexity, can be specified.
# 3. RBF kernel ---> The Gaussian / RBF (radial basis function) kernel maps the data into an infinite-dimensional space.
# It is suitable for non-linearly separable data and works well when the decision boundary is complex.
# It is a popular choice due to its flexibility and ability to capture intricate patterns.
# 4. Sigmoid kernel ---> This kernel maps the data into a higher-dimensional space using the sigmoid function.
# It is useful when the decision boundary is S-shaped.
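
In practice, the kernel is often chosen empirically by comparing cross-validated accuracy. The sketch below assumes a concentric-circles dataset from `make_circles`, which is deliberately not linearly separable, and tries the four kernels listed above with scikit-learn's default settings. The dataset and its parameters are illustrative assumptions.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# An inner circle of one class surrounded by a ring of the other class.
X_demo, y_demo = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=42)

kernel_scores = {}
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    # 5-fold cross-validated accuracy with default hyperparameters.
    fold_scores = cross_val_score(SVC(kernel=kernel), X_demo, y_demo, cv=5)
    kernel_scores[kernel] = fold_scores.mean()

for kernel, score in kernel_scores.items():
    print(f"{kernel}: {score:.3f}")
```

On data like this, the RBF kernel should clearly beat the linear kernel, since no straight line can separate an inner circle from the ring around it. On other datasets the ranking can differ, which is why the comparison is worth running rather than assuming.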


In [None]:
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Generate synthetic classification data
x , y = make_classification(n_samples=100,n_features=2,n_informative=2,
                            n_redundant=0,random_state=42)
# Split the data into training and test sets
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.2,random_state=42)

# Linear Kernel
linear_svc = svm.SVC(kernel='linear')
linear_svc.fit(x_train,y_train)
linear_predictions = linear_svc.predict(x_test)
linear_accuracy = accuracy_score(y_test,linear_predictions)
print("Linear Kernel Accuracy:",linear_accuracy)

# Polynomial Kernel
poly_svc = svm.SVC(kernel='poly',degree=3)  # degree=3 ==> polynomial terms up to x**3
poly_svc.fit(x_train,y_train)
poly_predictions = poly_svc.predict(x_test)
poly_accuracy = accuracy_score(y_test,poly_predictions)
print("Polynomial Kernel Accuracy:",poly_accuracy)

Linear Kernel Accuracy: 0.95
Polynomial Kernel Accuracy: 0.9


In [None]:
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

In [None]:
x, y = make_classification(n_samples = 100 , n_features = 2 , n_informative=2 , n_redundant=0 , random_state = 42)

In [None]:
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.2,random_state=42)

In [None]:
# x_train

In [None]:
linear_kernel = svm.SVC(kernel = "linear")
linear_kernel.fit(x_train,y_train)
y_pred = linear_kernel.predict(x_test)
accuracy_score(y_test,y_pred)

0.95

In [None]:
poly_kernel = svm.SVC(kernel = "poly" , degree = 3)
poly_kernel.fit(x_train,y_train)
y_pred = poly_kernel.predict(x_test)
accuracy_score(y_test,y_pred)

0.9

In [None]:
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.datasets import make_classification

In [None]:
x, y = make_classification(n_samples = 500 , n_features = 5 , n_informative = 2 , n_redundant=0 , random_state = 42)

In [None]:
x_train , x_test, y_train , y_test = train_test_split(x,y,test_size=0.2,random_state=42)

In [None]:
x_train.shape

(400, 5)

In [None]:
x.shape

(500, 5)

In [None]:
y.shape

(500,)

In [None]:
linear_svm = svm.SVC(kernel = "linear")
linear_svm.fit(x_train,y_train)

In [None]:
linear_prediction = linear_svm.predict(x_test)
accuracy_score(y_test,linear_prediction)

0.88

In [None]:
poly_svm = svm.SVC(kernel = "poly")  # degree defaults to 3
poly_svm.fit(x_train,y_train)
poly_predict = poly_svm.predict(x_test)
accuracy_score(y_test,poly_predict)

0.84

In [None]:
x_train

array([[-1.38496952,  0.73487779,  1.17347386,  0.91616965,  0.66288127],
       [ 0.18943818, -0.43676424,  1.7495839 ,  1.35201254, -1.6065771 ],
       [ 1.53079268,  0.88729084,  0.03793784,  0.5428799 , -0.7632861 ],
       ...,
       [ 0.37031509, -1.86653995, -0.68462983,  2.06629615,  1.00751369],
       [ 0.44169727, -0.32861848, -0.544114  , -1.72453624,  0.60318743],
       [-1.68297194,  0.61351797, -0.25737654, -1.98145118, -1.02279257]])