<a href="https://www.kaggle.com/code/rebeccapringle/simple-svm-example?scriptVersionId=117761481" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

SVM (support vector machine) is a supervised machine-learning model used for classification. A boundary is made points on either side of the boundary are the classified points.
These boundaries can be defined in different ways:
1. linear
2. poly - 3D space
3. Gaussian rbf - infinite dimensions

Here I show linear and rbf SVM classification on a dataset of mobile prices. The price ranges are split into categories of 1,2 and 3. After preparing the data and training the models, the model is hypertuned to find the optimised form and then applied to the test data. 

## Import packages

In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score,StratifiedKFold,LeaveOneOut, train_test_split
from sklearn.preprocessing import MinMaxScaler

## Get datasets ready for creating the models

In [2]:
train = pd.read_csv('/kaggle/input/mobile-price-classification/train.csv')
test = pd.read_csv('/kaggle/input/mobile-price-classification/test.csv')

In [3]:
y = np.array(train['price_range'])
X = train.drop(['price_range'],axis=1)
X = np.array(X)

scaler = MinMaxScaler()
X = scaler.fit_transform(X)

X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=.20, random_state=42) 

## Create a linear SVM model

In [4]:
model1 = SVC(kernel = 'linear', C = 20).fit(X_train,Y_train)
scores = cross_val_score(model1, X_train, Y_train, cv=5)
strat_scores = cross_val_score(model1, X_train, Y_train, cv = StratifiedKFold(5,random_state=10,shuffle=True))

In [5]:
print(scores)
print(strat_scores)

[0.95     0.9625   0.95     0.925    0.971875]
[0.95625  0.95     0.94375  0.971875 0.94375 ]


## Non-linear SVM

In [6]:
model2 = SVC(kernel='rbf',C = 20).fit(X_train,Y_train)
model2train = model2.score(X_train,Y_train)
model2test  = model2.score(X_test,Y_test)

print(model2train)
print(model2test)

1.0
0.8775


## With gamma hyperparameter

In [7]:
model3 = SVC(kernel = 'rbf',C = 20,gamma = .5).fit(X_train,Y_train)
model3train = model3.score(X_train,Y_train)
model3test  = model3.score(X_test,Y_test)

print(model3train)
print(model3test)

1.0
0.8475


## Find the best parameters and model for the dataset 

In [8]:
for gamma in [.2,.3,.4]:
    for C in [40,50,60]:
        model = SVC(kernel = 'rbf',
                    C = C,
                    gamma = gamma).fit(X_train, Y_train)
        modeltrain = model.score(X_train,
                                 Y_train)
        modeltest  = model.score(X_test,
                                 Y_test)
        print("Gamma: {} C:{} Total Score : {:2f}".format(gamma,C,modeltrain + modeltest))   

Gamma: 0.2 C:40 Total Score : 1.912500
Gamma: 0.2 C:50 Total Score : 1.915000
Gamma: 0.2 C:60 Total Score : 1.915000
Gamma: 0.3 C:40 Total Score : 1.887500
Gamma: 0.3 C:50 Total Score : 1.887500
Gamma: 0.3 C:60 Total Score : 1.887500
Gamma: 0.4 C:40 Total Score : 1.865000
Gamma: 0.4 C:50 Total Score : 1.865000
Gamma: 0.4 C:60 Total Score : 1.865000


Here we can see gamma = .2 and C = 60 gives the best result.

## Apply to test dataset

In [9]:
test = test.drop(['id'],axis=1)
test.head()

Unnamed: 0,battery_power,blue,clock_speed,dual_sim,fc,four_g,int_memory,m_dep,mobile_wt,n_cores,pc,px_height,px_width,ram,sc_h,sc_w,talk_time,three_g,touch_screen,wifi
0,1043,1,1.8,1,14,0,5,0.1,193,3,16,226,1412,3476,12,7,2,0,1,0
1,841,1,0.5,1,4,1,61,0.8,191,5,12,746,857,3895,6,0,7,1,0,0
2,1807,1,2.8,0,1,0,27,0.9,186,3,4,1270,1366,2396,17,10,10,0,1,1
3,1546,0,0.5,1,18,1,25,0.5,96,8,20,295,1752,3893,10,0,7,1,1,0
4,1434,0,1.4,0,11,1,49,0.5,108,6,18,749,810,1773,15,8,7,1,0,1


In [10]:
test_mat = np.array(test)
test = scaler.fit_transform(test_mat)

model = SVC(kernel = 'rbf',C = 25,gamma = .1).fit(X_train,Y_train)
prediction = model.predict(test_mat)
pred = pd.DataFrame(prediction)
pred.head()

Unnamed: 0,0
0,2
1,2
2,2
3,2
4,2
