# Support Vector Machine Classification with Python
This post walks through an example of SVM classification in Python, broken into the following steps:

- Data preparation
- Model development

We will use the linear kernel.

# Modules

In [None]:
import numpy as np
import pandas as pd
from pydataset import data
from sklearn import svm
from sklearn.metrics import classification_report
from sklearn import model_selection

# Data Preparation
We now need to load our dataset and remove any missing values.

In [None]:
df=pd.read_csv('')  # fill in the path to your CSV file
df=df.dropna()
df.head()

# Dummy Variables

In [None]:
# build dummy (indicator) columns from the text column
dummy=pd.get_dummies(df[''])
# concatenate the dummy columns with the original dataset
df=pd.concat([df,dummy],axis=1)
# rename the dummy column we keep as the 0/1 variable; dropping the other dummy column makes "string text 2" the 0 level
df=df.rename(index=str, columns={"string text 1": "new variable name"})
df=df.drop('string text 2', axis=1)

If you look at the dataset now, you will see several variables that are no longer necessary. Drop the original text variables.

In [None]:
df=df.drop([''],axis=1)
df.head()
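
To make the dummy-coding steps above concrete, here is a minimal sketch on a toy data frame. The column names (`age`, `sex`, `is_male`) and the values are hypothetical and are not taken from the dataset used in this post.

```python
import pandas as pd

# Hypothetical toy data; 'sex' stands in for the text column to be dummy coded.
toy = pd.DataFrame({"age": [23, 35, 41], "sex": ["male", "female", "male"]})

# One indicator column per category, stored as 0/1 integers.
dummy = pd.get_dummies(toy["sex"], dtype=int)
toy = pd.concat([toy, dummy], axis=1)

# Keep 'male' as the single 0/1 variable under a clearer name.
toy = toy.rename(columns={"male": "is_male"})

# Drop the redundant 'female' indicator and the original text column.
toy = toy.drop(["female", "sex"], axis=1)
print(toy)
```

Only one indicator column is kept because, with two categories, the second column carries no extra information: a row with `is_male` equal to 0 is necessarily female.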

# Scaling of Variables
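Now we need to scale the data, because SVM is sensitive to the scale of the variables: a variable with a large range can dominate one with a small range. The code below applies min-max scaling, which maps every column to the range 0 to 1.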

In [None]:
df = (df - df.min()) / (df.max() - df.min())
df.head()
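
As a quick check of what the min-max formula above does, the following sketch applies it to a toy data frame. The column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data with two columns on very different scales.
toy = pd.DataFrame({
    "height_cm": [150.0, 175.0, 200.0],
    "income": [20000.0, 50000.0, 120000.0],
})

# Same formula as above: subtract each column's minimum, divide by its range.
scaled = (toy - toy.min()) / (toy.max() - toy.min())
print(scaled)
```

After scaling, the smallest value in each column becomes 0 and the largest becomes 1, so both columns contribute on a comparable scale. Note that a column holding a single repeated value would produce `NaN`, because its range (max minus min) is zero.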

# Model Development
Before developing our model, we need to prepare the train and test sets. We begin by placing our independent and dependent variables in separate objects.

## Independent and Dependent Variables

In [None]:
X=df[['','','']]
y=df['']

# Train and Test Sets
Now we split the data into train and test sets, holding out 30% of the rows for testing. We then create the model, or hypothesis, we want to test: an SVM with a linear kernel. Its hyperparameter `C` needs to be set, as you will see in the code below.

In [None]:
X_train,X_test,y_train,y_test=model_selection.train_test_split(X,y,test_size=.3,random_state=1)
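
To see what the split above returns, here is a small sketch on made-up arrays. The names `X_toy` and `y_toy` and their contents are assumptions for illustration only.

```python
import numpy as np
from sklearn import model_selection

# Hypothetical toy data: 10 rows, 2 features, alternating class labels.
X_toy = np.arange(20).reshape(10, 2)
y_toy = np.array([0, 1] * 5)

# Same call as above: 30% of the rows go to the test set.
X_tr, X_te, y_tr, y_te = model_selection.train_test_split(
    X_toy, y_toy, test_size=.3, random_state=1)
print(X_tr.shape, X_te.shape)

# Fixing random_state makes the shuffle reproducible across runs.
X_tr2, X_te2, _, _ = model_selection.train_test_split(
    X_toy, y_toy, test_size=.3, random_state=1)
print(np.array_equal(X_te, X_te2))
```

With 10 rows and `test_size=.3`, 7 rows land in the train set and 3 in the test set. Changing `random_state` changes which rows are held out.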

In [None]:
h1=svm.LinearSVC(C=1)

In [None]:
h1.fit(X_train,y_train)
h1.score(X_train,y_train)

In [None]:
y_pred=h1.predict(X_test)

In [None]:
pd.crosstab(y_test,y_pred)

In [None]:
print(classification_report(y_test, y_pred))
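
Because the dataset and column names in this post are left as placeholders, the cells above cannot be run as written. The sketch below repeats the same fit, predict, and evaluate steps on synthetic data from `make_classification`, so the whole workflow can be tried end to end. The synthetic data, the column names `x1` to `x4`, and the variable names are assumptions and are not part of the original analysis.

```python
import pandas as pd
from sklearn import svm, model_selection
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report

# Synthetic two-class data standing in for the real dataset (assumption).
X_arr, y_arr = make_classification(n_samples=200, n_features=4,
                                   n_informative=2, n_redundant=0,
                                   random_state=1)
X_demo = pd.DataFrame(X_arr, columns=["x1", "x2", "x3", "x4"])
y_demo = pd.Series(y_arr, name="actual")

# Min-max scale the features, as in the scaling step.
X_demo = (X_demo - X_demo.min()) / (X_demo.max() - X_demo.min())

X_tr, X_te, y_tr, y_te = model_selection.train_test_split(
    X_demo, y_demo, test_size=.3, random_state=1)

# Linear-kernel SVM with regularization hyperparameter C=1.
model = svm.LinearSVC(C=1)
model.fit(X_tr, y_tr)

train_acc = model.score(X_tr, y_tr)  # accuracy on the data used for fitting
test_acc = model.score(X_te, y_te)   # accuracy on held-out data

# Give the predictions the same index as y_te so crosstab lines them up.
pred = pd.Series(model.predict(X_te), index=y_te.index, name="predicted")
table = pd.crosstab(y_te, pred)
print(train_acc, test_acc)
print(table)
print(classification_report(y_te, pred))
```

Comparing `train_acc` with `test_acc` gives a rough sense of overfitting: a training score far above the test score suggests the model does not generalize well. The cross-tabulation shows actual classes in rows and predicted classes in columns.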