In [8]:
# What Is a Support Vector Machine?
# The Support Vector Machine (SVM) was first introduced in the 1960s and refined in the 1990s.
# It is a supervised machine learning algorithm that is widely used for classification
# because it often performs well in practice.

# An SVM works somewhat differently from many other machine learning algorithms.
# It is capable of performing classification, regression and outlier detection.

# A Support Vector Machine is a discriminative classifier that is formally defined by a separating hyperplane.
# It represents examples as points in space and places the hyperplane so that points of different categories are separated by a gap (the margin) that is as wide as possible.
# In addition, an SVM can perform non-linear classification by using kernel functions. Let us take a look at how the Support Vector Machine works.
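
# The idea of a "gap as wide as possible" can be made concrete on a toy problem.
# The sketch below is only an illustration: the six 2D points are made up, and the
# large value of C is an assumption used to approximate a hard-margin SVM.
# For a linear SVM with weight vector w, the margin width is 2 / ||w||.

```python
import numpy as np
from sklearn import svm

# Two linearly separable clusters in 2D (hypothetical toy data)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C leaves almost no room for margin violations
clf = svm.SVC(kernel="linear", C=1e6)
clf.fit(X, y)

# The separating hyperplane is w . x + b = 0
w = clf.coef_[0]
b = clf.intercept_[0]

# Distance between the two margin boundaries
margin_width = 2.0 / np.linalg.norm(w)

print("support vectors:")
print(clf.support_vectors_)
print("margin width:", margin_width)
```

# Only the support vectors (the points lying on the margin boundaries) determine the hyperplane.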

In [9]:
# STEP 1: Loading The Data

# We are using the breast cancer dataset from the sklearn library to build a classifier that predicts whether a tumour is malignant or benign. We can load the dataset in the following manner.

from sklearn import datasets
 
cancer_data = datasets.load_breast_cancer()
print(cancer_data.data[5])

[1.245e+01 1.570e+01 8.257e+01 4.771e+02 1.278e-01 1.700e-01 1.578e-01
 8.089e-02 2.087e-01 7.613e-02 3.345e-01 8.902e-01 2.217e+00 2.719e+01
 7.510e-03 3.345e-02 3.672e-02 1.137e-02 2.165e-02 5.082e-03 1.547e+01
 2.375e+01 1.034e+02 7.416e+02 1.791e-01 5.249e-01 5.355e-01 1.741e-01
 3.985e-01 1.244e-01]


In [10]:
print(cancer_data.data.shape)


(569, 30)
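
# Before splitting, it can help to check what the 30 columns and the two target values mean.
# A minimal sketch, assuming the same load_breast_cancer() dataset as above; the variable
# name class_counts is introduced here for illustration only.

```python
import numpy as np
from sklearn import datasets

cancer_data = datasets.load_breast_cancer()

# Names of the first five of the 30 features
print(cancer_data.feature_names[:5])

# Class names, indexed by target value: 0 = malignant, 1 = benign
print(cancer_data.target_names)

# Number of samples in each class
class_counts = np.bincount(cancer_data.target)
print(dict(zip(cancer_data.target_names, class_counts)))
```

# In this dataset the label 1 means benign, which matters when reading precision and recall later.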


In [11]:
# STEP 2: Splitting Data

# We will divide the dataset into a training set and a test set so that the model is evaluated on data it has not seen during training.
# We split the data using the train_test_split() function, passing the features, the target, and the test set size.
# We also set random_state so that the split is reproducible.

from sklearn.model_selection import train_test_split
 
cancer_data = datasets.load_breast_cancer()
 
X_train, X_test, y_train, y_test = train_test_split(
    cancer_data.data, cancer_data.target, test_size=0.4, random_state=109)
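
# As a quick check on the split, the sketch below prints the resulting shapes.
# It also passes stratify=, which is an optional variation not used in the cell above:
# it keeps the proportion of benign and malignant samples the same in both sets.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split

cancer_data = datasets.load_breast_cancer()

# Same test size and seed as above, plus stratification on the target
X_train, X_test, y_train, y_test = train_test_split(
    cancer_data.data, cancer_data.target,
    test_size=0.4, random_state=109, stratify=cancer_data.target)

# 60% of the 569 samples go to training, 40% to testing
print("train:", X_train.shape, "test:", X_test.shape)

# Fraction of benign samples (label 1) in each set
print("benign share (train):", y_train.mean())
print("benign share (test): ", y_test.mean())
```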


In [12]:
# STEP 3: Generating The Model

# To generate the model, we first import the svm module from sklearn and create a support vector classifier with SVC(), passing kernel="linear" to select the linear kernel.

# Then we train the model on the training set using the fit() function and make predictions on the test set using the predict() function.

from sklearn import svm
# create a classifier with a linear kernel
cls = svm.SVC(kernel="linear")
# train the model on the training set
cls.fit(X_train, y_train)
# predict the labels of the test set
pred = cls.predict(X_test)
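
# Since an SVM can also perform non-linear classification, here is a hedged variation on the model above.
# It swaps the linear kernel for an RBF kernel and standardises the features first, because
# kernel SVMs are sensitive to feature scale. The pipeline and the name rbf_model are
# illustrative choices, not part of the original workflow.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

cancer_data = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer_data.data, cancer_data.target, test_size=0.4, random_state=109)

# Scale each feature to zero mean and unit variance, then fit an RBF-kernel SVM
rbf_model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
rbf_model.fit(X_train, y_train)

# Mean accuracy on the held-out test set
rbf_accuracy = rbf_model.score(X_test, y_test)
print("RBF accuracy:", rbf_accuracy)
```

# Putting the scaler inside the pipeline means it is fitted on the training data only, so no information from the test set leaks into training.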

In [13]:
# STEP 4: Evaluating the Model
# Now we can measure how accurately the classifier predicts whether a tumour is malignant or benign.
# We will calculate the accuracy, precision, and recall scores for our evaluation.

from sklearn import metrics
# accuracy: fraction of test samples classified correctly
print("accuracy:", metrics.accuracy_score(y_test, y_pred=pred))
# precision for class 1 (benign, the default positive label)
print("precision:", metrics.precision_score(y_test, y_pred=pred))
# recall for class 1 (benign)
print("recall:", metrics.recall_score(y_test, y_pred=pred))
# per-class precision, recall and F1-score
print(metrics.classification_report(y_test, y_pred=pred))

accuracy: 0.9649122807017544
precision: 0.9642857142857143
recall: 0.9782608695652174
              precision    recall  f1-score   support

           0       0.97      0.94      0.96        90
           1       0.96      0.98      0.97       138

    accuracy                           0.96       228
   macro avg       0.97      0.96      0.96       228
weighted avg       0.96      0.96      0.96       228
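
# The scores above summarise the errors but do not show which kind of mistake was made.
# A confusion matrix does. The sketch below repeats the earlier steps so that it runs on its
# own; the variable names cm and the unpacked counts are introduced here for illustration.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

cancer_data = datasets.load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    cancer_data.data, cancer_data.target, test_size=0.4, random_state=109)

cls = svm.SVC(kernel="linear")
cls.fit(X_train, y_train)
pred = cls.predict(X_test)

# Rows are true classes, columns are predicted classes,
# both in label order: 0 = malignant, 1 = benign
cm = metrics.confusion_matrix(y_test, pred)
print(cm)

# Unpack the four cells, treating benign (1) as the positive class
true_malignant, malignant_as_benign, benign_as_malignant, true_benign = cm.ravel()
print("malignant predicted as benign:", malignant_as_benign)
print("benign predicted as malignant:", benign_as_malignant)
```

# In a medical setting, a malignant tumour predicted as benign is usually the more costly error, so it is worth looking at that cell on its own.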

