# Support Vector Machine (SVM)

In the first step, we import all of the libraries we need.

In [6]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

In the next step, we load the dataset, drop any rows with missing values, separate the features from the target, and split the data into training and test sets:

In [7]:
data = pd.read_csv("breast-cancer.csv")  # Load the dataset

data.dropna(inplace=True)  # Drop any rows with missing values


# Separate features and target variable
X = data.drop("diagnosis", axis=1)
y = data["diagnosis"]

# Encode target labels to numerical values (Malignant: 1, Benign: 0)
y = np.where(y == "M", 1, 0)

# Split the data into training (80%) and test (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
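One detail of the encoding step is worth checking: `np.where(y == "M", 1, 0)` maps every label that is not exactly `"M"` to 0, including any unexpected or misspelled value. The following is a minimal sketch on a toy series (the labels and the variable names `labels_toy`, `encoded_toy`, and `unexpected` are made up for illustration and are not part of the dataset) showing the mapping and a simple check for stray labels:

```python
import numpy as np
import pandas as pd

# Toy labels for illustration only; "b" is a deliberately malformed value
labels_toy = pd.Series(["M", "B", "B", "M", "b"])

# Same encoding as in the cell above: "M" -> 1, everything else -> 0
encoded_toy = np.where(labels_toy == "M", 1, 0)
print(encoded_toy)

# Any label outside the expected set would silently be encoded as 0
unexpected = set(labels_toy.unique()) - {"M", "B"}
print("Unexpected labels:", unexpected)
```

Running the same check on `data["diagnosis"]` before encoding would confirm that only the two expected labels are present.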


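SVMs are sensitive to the magnitude of the features, so it is important to standardize them to a similar scale, which can improve the model's performance. The scaler is fit on the training set only and then applied to the test set, so no information from the test set leaks into training: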

In [8]:
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
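To make the scaling step concrete, here is a minimal sketch on a toy array (the values and the `_toy` names are illustrative assumptions, not taken from the dataset). It checks that `StandardScaler` subtracts the training mean and divides by the training standard deviation for each column, and that the test set is transformed with those same training statistics:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data: two features on very different scales
X_train_toy = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
X_test_toy = np.array([[4.0, 400.0]])

scaler_toy = StandardScaler()
train_scaled_toy = scaler_toy.fit_transform(X_train_toy)  # learns mean/std from training data
test_scaled_toy = scaler_toy.transform(X_test_toy)        # reuses the training mean/std

# Manual equivalent: z = (x - mean) / std, using the population std (ddof=0)
mean_toy = X_train_toy.mean(axis=0)
std_toy = X_train_toy.std(axis=0)
manual_train = (X_train_toy - mean_toy) / std_toy
manual_test = (X_test_toy - mean_toy) / std_toy

print(np.allclose(train_scaled_toy, manual_train))
print(np.allclose(test_scaled_toy, manual_test))
```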

Next, we train the SVM model using scikit-learn's SVC (Support Vector Classification) class.

In [9]:
# Create an SVM classifier
svm_classifier = SVC(kernel='linear', C=1.0, random_state=42)

# Train the model on the scaled training data
svm_classifier.fit(X_train_scaled, y_train)

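In the code above, the parameter C controls the trade-off between maximizing the margin and minimizing the classification error on the training data. A larger C penalizes misclassified training points more heavily and yields a narrower margin; a smaller C tolerates more training errors in exchange for a wider margin.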
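One way to get a feel for C is to fit the same linear SVM with a few different values and compare the results. The sketch below uses a synthetic dataset from `make_classification` rather than the breast cancer data, and the values of C, the `_demo` names, and the `results` dictionary are arbitrary choices for illustration. A smaller C typically keeps more support vectors, although the exact numbers depend on the data:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (an assumption; not the breast cancer dataset)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# Scale using statistics from the training split only
scaler_demo = StandardScaler()
X_tr_s = scaler_demo.fit_transform(X_tr)
X_te_s = scaler_demo.transform(X_te)

results = {}
for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X_tr_s, y_tr)
    results[C] = {
        "n_support_vectors": int(clf.n_support_.sum()),  # total over both classes
        "test_accuracy": clf.score(X_te_s, y_te),
    }
    print(f"C={C}: {results[C]}")
```

In practice, C is usually chosen by cross-validation on the training set rather than by looking at test accuracy.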

Now we make predictions on the test set and evaluate the model.

In [10]:
# Make predictions on the test set
y_pred = svm_classifier.predict(X_test_scaled)

# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
classification_rep = classification_report(y_test, y_pred)

print(f"Accuracy: {accuracy:.2f}")
print("Confusion Matrix:")
print(conf_matrix)
print("Classification Report:")
print(classification_rep)

Accuracy: 0.96
Confusion Matrix:
[[68  3]
 [ 2 41]]
Classification Report:
              precision    recall  f1-score   support

           0       0.97      0.96      0.96        71
           1       0.93      0.95      0.94        43

    accuracy                           0.96       114
   macro avg       0.95      0.96      0.95       114
weighted avg       0.96      0.96      0.96       114



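The code above prints the model's accuracy, the confusion matrix, and the classification report, which contains precision, recall, and F1-score for both classes (0 = Benign, 1 = Malignant). Here the model reaches 96% accuracy on the 114 test samples: of the 43 malignant cases, 41 are correctly identified and 2 are missed, while 3 of the 71 benign cases are incorrectly flagged as malignant.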
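The metrics in the report can be recomputed by hand from the confusion matrix. The sketch below hard-codes the matrix printed above (rows are true labels, columns are predicted labels, following scikit-learn's convention) and derives accuracy, precision, recall, and F1-score for the Malignant class; the `_m` variable names are illustrative:

```python
import numpy as np

# Confusion matrix copied from the output above: [[TN, FP], [FN, TP]]
conf_m = np.array([[68, 3], [2, 41]])
tn, fp, fn, tp = conf_m.ravel()

accuracy_m = (tp + tn) / conf_m.sum()   # share of all predictions that are correct
precision_m = tp / (tp + fp)            # of predicted malignant, how many are malignant
recall_m = tp / (tp + fn)               # of actual malignant, how many were found
f1_m = 2 * precision_m * recall_m / (precision_m + recall_m)

print(f"Accuracy:  {accuracy_m:.2f}")   # 0.96
print(f"Precision: {precision_m:.2f}")  # 0.93
print(f"Recall:    {recall_m:.2f}")     # 0.95
print(f"F1-score:  {f1_m:.2f}")         # 0.94
```

These values match the row for class 1 in the classification report.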