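Support Vector Machines (SVMs) are versatile supervised learning models used for both classification and regression. They work well on complex datasets, including ones where the classes are not linearly separable in the original feature space, because kernel functions let them learn non-linear decision boundaries.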

An SVM constructs a hyperplane (or set of hyperplanes) in a high-dimensional space to separate the classes. It looks for the hyperplane with the largest margin, that is, the largest distance between the hyperplane and the nearest training points of each class. Those nearest points are called support vectors.
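To make the margin idea concrete, here is a minimal sketch on a tiny, made-up 2-D dataset (the points and the variable names `X_toy`, `y_toy` are assumptions for illustration, not part of the Breast Cancer example). It fits a linear-kernel `SVC`, reads back the support vectors, and computes the margin width as 2 / ||w||, which is the standard formula for a linear SVM. A very large `C` is used to approximate a hard margin on separable data.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable toy data (two clusters in 2-D)
X_toy = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0],
                  [4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A large C penalizes margin violations heavily (approximates a hard margin)
clf = SVC(kernel="linear", C=1e6)
clf.fit(X_toy, y_toy)

# For a linear kernel, the hyperplane is w . x + b = 0
w = clf.coef_[0]
b = clf.intercept_[0]

# Distance between the two margin boundaries (w . x + b = +1 and -1)
margin_width = 2.0 / np.linalg.norm(w)

print("Support vectors:")
print(clf.support_vectors_)
print("Margin width:", margin_width)
```

Only the points returned in `support_vectors_` determine the hyperplane; moving any other point (without crossing the margin) would leave the fitted boundary unchanged.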

Evaluation Metrics

    Classification: Accuracy, Precision, Recall, F1 Score.
    Regression: Mean Squared Error (MSE), R-squared.
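As a quick illustration of how these metrics are computed, the sketch below derives accuracy, precision, recall, and F1 by hand from confusion-matrix counts, then computes MSE and R-squared for a small regression example. All labels and values here (`y_true_demo`, `y_pred_demo`, `y_true_reg`, `y_pred_reg`) are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from sklearn.metrics import (confusion_matrix, precision_score, recall_score,
                             f1_score, mean_squared_error, r2_score)

# Hypothetical binary labels (1 = positive class)
y_true_demo = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred_demo = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

# For labels {0, 1}, ravel() yields tn, fp, fn, tp in that order
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()

accuracy_manual = (tp + tn) / (tp + tn + fp + fn)
precision_manual = tp / (tp + fp)   # of predicted positives, how many are right
recall_manual = tp / (tp + fn)      # of actual positives, how many were found
f1_manual = (2 * precision_manual * recall_manual
             / (precision_manual + recall_manual))

print("Manual: ", accuracy_manual, precision_manual, recall_manual, f1_manual)
print("sklearn:", precision_score(y_true_demo, y_pred_demo),
      recall_score(y_true_demo, y_pred_demo),
      f1_score(y_true_demo, y_pred_demo))

# Hypothetical regression targets and predictions
y_true_reg = [3.0, 5.0, 7.0]
y_pred_reg = [2.5, 5.0, 7.5]

mse = mean_squared_error(y_true_reg, y_pred_reg)  # mean of squared errors
r2 = r2_score(y_true_reg, y_pred_reg)             # 1 - SS_res / SS_tot
print("MSE:", mse, "R-squared:", r2)
```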

Applying with scikit-learn

We’ll apply SVM to the Breast Cancer dataset, focusing on classifying tumors as benign or malignant. We’ll train the SVM model and evaluate its performance using classification metrics.

Here are the steps we’ll follow:

1. Create and Train the SVM Model:

    A Support Vector Machine (SVM) model is created using the default settings (for scikit-learn's SVC, an RBF kernel with C=1.0). Training finds the decision boundary that separates the classes with as wide a margin as possible.

2. Predict:

    The trained SVM model is then used to predict the class labels of the test data. It does this by determining on which side of the hyperplane each data point falls.

3. Evaluate:

    The model’s predictions are evaluated against the actual labels of the test set to assess its performance.
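The prediction step can be sketched on toy data. For a binary `SVC`, `decision_function` returns a signed score for each point, and the predicted label follows the sign: positive scores map to `classes_[1]`, negative scores to `classes_[0]`. The data and the names `X_demo`, `y_demo`, `X_new` below are hypothetical. Note that with the default RBF kernel the separating hyperplane lives in the kernel's implicit feature space, so the boundary can be curved in the original input space.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical two-class toy data
X_demo = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                   [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y_demo = np.array([0, 0, 0, 1, 1, 1])

clf_demo = SVC()  # defaults: RBF kernel, C=1.0, gamma="scale"
clf_demo.fit(X_demo, y_demo)

# New points to classify (also hypothetical)
X_new = np.array([[0.5, 0.5], [3.5, 3.5], [2.0, 1.0]])

# Signed score: which side of the decision boundary each point falls on
scores = clf_demo.decision_function(X_new)

# Map the sign of each score to a class label
labels_from_sign = clf_demo.classes_[(scores > 0).astype(int)]

print("Scores:", scores)
print("Labels from sign:", labels_from_sign)
print("predict():       ", clf_demo.predict(X_new))
```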

In [1]:
# Import necessary libraries
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

In [2]:
# Load the Breast Cancer dataset
breast_cancer = load_breast_cancer()
X, y = breast_cancer.data, breast_cancer.target

In [3]:
# Splitting data into training & testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [4]:
# Creating & training the SVM Model
model = SVC()
model.fit(X_train, y_train)


In [6]:
# Predicting the test set results
y_pred = model.predict(X_test)

In [7]:
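# Evaluating the model
# average='macro' takes the unweighted mean of the per-class scores
# (in this dataset, class 0 = malignant and class 1 = benign)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, average='macro')
recall = recall_score(y_test, y_pred, average='macro')
f1 = f1_score(y_test, y_pred, average='macro')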

In [8]:
# Printing the results
print("Accuracy:", accuracy)
print("Precision:", precision)
print("Recall:", recall)
print("F1 Score:", f1)

Accuracy: 0.935672514619883
Precision: 0.953781512605042
Recall: 0.9126984126984127
F1 Score: 0.9279448381536104


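These results indicate that the SVM model with default settings performs well on the Breast Cancer dataset: about 93.6% accuracy on the held-out 30% test split, with macro-averaged precision of about 0.954, recall of about 0.913, and an F1 score of about 0.928. Because the metrics are macro-averaged, each one is the unweighted mean over the benign and malignant classes. Precision is noticeably higher than recall, which suggests the errors are not spread evenly across the two classes.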
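One thing the run above does not do is scale the features. SVMs with an RBF kernel are generally sensitive to feature scale, and the Breast Cancer features span very different ranges. The sketch below is an assumption about a possible next step, not part of the original walkthrough: it wraps `StandardScaler` and `SVC` in a pipeline so that scaling is fitted on the training data only, using the same split as before. The resulting score is not quoted here, since it should be checked by running the code.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Same data and split as in the walkthrough
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# The pipeline fits the scaler on the training data only,
# then applies the same transformation at prediction time
scaled_model = make_pipeline(StandardScaler(), SVC())
scaled_model.fit(X_train, y_train)

scaled_accuracy = accuracy_score(y_test, scaled_model.predict(X_test))
print("Accuracy with scaling:", scaled_accuracy)
```

Putting the scaler inside the pipeline avoids leaking test-set statistics into training, which would happen if the whole dataset were scaled before the split.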

The balance between precision and recall is particularly important in medical diagnoses, where both false positives and false negatives carry significant consequences.
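To see how false positives and false negatives split between the two classes, a per-class breakdown is more informative than macro averages. The sketch below retrains the same default `SVC` on the same split and prints a confusion matrix and per-class report. It relies on the dataset's own `target_names`, where class 0 is malignant and class 1 is benign; the variable names (`cm`, `malignant_recall`) are illustrative.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import (classification_report, confusion_matrix,
                             recall_score)
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42)

clf = SVC().fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes,
# both in the order [0 = malignant, 1 = benign]
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(cm)

# Recall for the malignant class: the share of malignant tumors
# that the model actually flags as malignant
malignant_recall = recall_score(y_test, y_pred, pos_label=0)
print("Malignant recall:", malignant_recall)

# Precision, recall, and F1 for each class separately
print(classification_report(y_test, y_pred,
                            target_names=data.target_names))
```

Reading `cm[0, 1]` gives the number of malignant tumors predicted as benign, which is usually the error a diagnostic setting is most concerned about.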