In [2]:
# Question 1.1
print("1.1 In SVM, the margin is the distance between the separating hyperplane (decision boundary) "
      "and the closest data points from each class, which are called support vectors. The equations "
      "for the two margin hyperplanes H+ and H- are typically w.x + b = 1 for H+ and w.x + b = -1 for H-, "
      "where w is the weight vector, x is the input vector, and b is the bias.\n")


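# Define the two points (named p1, p2 so they are not confused with the
# coordinate names x1, x2 used in the printed equation)
p1 = (2, 0)
p2 = (0, 2)

# Slope of the line through p1 and p2, used as w1; w2 is its negation
w1 = (p2[1] - p1[1]) / (p2[0] - p1[0])
w2 = -w1

# Choose the offset b so that the line passes through p1 = (2, 0)
b = -w1 * p1[0] - w2 * p1[1]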

# Print the linear SVM equation
print("Linear SVM that optimally separates the classes by maximizing the margin:")
print(f"{w1} * x1 + {w2} * x2 + {b} = 0\n")


# Question 1.3
print("1.3 A kernel function is a computational tool in SVMs that allows the algorithm to operate in a "
      "high-dimensional space without explicitly calculating the coordinates of the data in that space. "
      "It's a way to implicitly map input data into higher-dimensional space, making it possible to "
      "perform linear separation when the data is not linearly separable in the original input space. "
      "Common kernel functions include linear, polynomial, radial basis function (RBF), and sigmoid.")


1.1 In SVM, the margin is the distance between the separating hyperplane (decision boundary) and the closest data points from each class, which are called support vectors. The equations for the two margin hyperplanes H+ and H- are typically w.x + b = 1 for H+ and w.x + b = -1 for H-, where w is the weight vector, x is the input vector, and b is the bias.

Linear SVM that optimally separates the classes by maximizing the margin:
-1.0 * x1 + 1.0 * x2 + 2.0 = 0
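
The line printed above evaluates to zero at (2, 0), so it passes through that point rather than midway between the two points. If (2, 0) and (0, 2) are meant to be the closest points of opposite classes (an assumption; the question statement is not reproduced here, and the class labels below are hypothetical), the maximum-margin hyperplane is the perpendicular bisector of the segment joining them. A minimal sketch of that calculation, with the helper names `max_margin_two_points` and `decision_value` invented for illustration:

```python
import math

def max_margin_two_points(pos, neg):
    """Hard-margin SVM for exactly one point per class.

    Returns (w, b) scaled so that w.pos + b = +1 and w.neg + b = -1.
    """
    diff = [p - n for p, n in zip(pos, neg)]
    sq_dist = sum(d * d for d in diff)
    # w points from the negative point to the positive point
    w = [2 * d / sq_dist for d in diff]
    # The hyperplane passes through the midpoint of the two points
    mid = [(p + n) / 2 for p, n in zip(pos, neg)]
    b = -sum(wi * mi for wi, mi in zip(w, mid))
    return w, b

def decision_value(w, b, x):
    """Evaluate w.x + b for a point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Assumed labels: (2, 0) is the positive class, (0, 2) the negative class
pos, neg = (2, 0), (0, 2)
w, b = max_margin_two_points(pos, neg)
margin_width = 2 / math.sqrt(sum(wi * wi for wi in w))

print("w =", w, "b =", b)
print("H+ value at pos:", decision_value(w, b, pos))
print("H- value at neg:", decision_value(w, b, neg))
print("margin width 2/||w|| =", margin_width)
```

Under these assumptions the two points sit exactly on H+ and H-, and the margin width 2/||w|| equals the distance between them.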

1.3 A kernel function is a computational tool in SVMs that allows the algorithm to operate in a high-dimensional space without explicitly calculating the coordinates of the data in that space. It's a way to implicitly map input data into higher-dimensional space, making it possible to perform linear separation when the data is not linearly separable in the original input space. Common kernel functions include linear, polynomial, radial basis function (RBF), and sigmoid.
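
The "without explicitly calculating the coordinates" part can be checked numerically. The sketch below uses the homogeneous degree-2 polynomial kernel on 2-D inputs as an illustrative choice (it is not one of the kernels evaluated in this notebook) and compares it with the dot product under an explicit 3-D feature map. The function names `poly2_kernel` and `poly2_features` are made up for this example.

```python
import math

def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: (x . z) ** 2."""
    return sum(xi * zi for xi, zi in zip(x, z)) ** 2

def poly2_features(x):
    """Explicit feature map for 2-D input that corresponds to poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x, z = (1.0, 2.0), (3.0, -1.0)

# Kernel evaluated in the original 2-D space
implicit = poly2_kernel(x, z)

# Same quantity computed as a dot product in the mapped 3-D space
explicit = sum(a * b for a, b in zip(poly2_features(x), poly2_features(z)))

print("kernel value:       ", implicit)
print("explicit dot product:", explicit)
print("match:", math.isclose(implicit, explicit))
```

The kernel needs only a 2-D dot product and one squaring, while the explicit route builds the higher-dimensional vectors first. For kernels such as RBF the corresponding feature space is infinite-dimensional, so only the implicit route is practical.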


In [4]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn import svm
from sklearn.metrics import accuracy_score
from sklearn.neural_network import MLPClassifier

# Load the data and treat '?' entries as missing values
df = pd.read_csv('heart-disease-dataset1.csv')
df = df.replace('?', np.nan)

# Coerce every column to numeric; anything unparseable becomes NaN
df = df.apply(pd.to_numeric, errors='coerce')

# Impute missing values with the column mean
# (note: the means are computed on the full dataset, before the split)
df = df.fillna(df.mean())

# Separate the features from the target column
X = df.drop('result', axis=1).values
y = df['result'].values

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize features using statistics from the training set only
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Train and evaluate an SVM for each kernel
print("SVM Results:")
kernels = ['linear', 'rbf', 'sigmoid']
for kernel in kernels:
    svm_clf = svm.SVC(kernel=kernel)
    svm_clf.fit(X_train, y_train)
    y_pred = svm_clf.predict(X_test)
    print(f"Accuracy with SVM {kernel} kernel: {accuracy_score(y_test, y_pred)}")

# Train and evaluate a small MLP (hidden layers of 5 and 2 units) for each solver
print("\nNeural Network Results:")
optimizers = ['sgd', 'adam']
for optimizer in optimizers:
    nn_clf = MLPClassifier(solver=optimizer, alpha=1e-5, hidden_layer_sizes=(5, 2), random_state=1,
                           max_iter=1000, learning_rate_init=0.001, early_stopping=True)
    nn_clf.fit(X_train, y_train)
    y_pred = nn_clf.predict(X_test)
    print(f"Accuracy with Neural Network {optimizer} optimizer: {accuracy_score(y_test, y_pred)}")

print("""
Based on the model evaluation, the SVM with a linear kernel outperforms the other models, achieving an accuracy of approximately 55.74%. Its lead over the RBF and sigmoid kernels is only about three percentage points on the 20% test split, so it gives at most weak evidence that a linear decision boundary suits the heart disease dataset.

In contrast, the SVM models with RBF and Sigmoid kernels, as well as the Neural Network models with SGD and Adam optimizers, yield lower accuracies (ranging from 42.62% to 52.46%). These results indicate that for this particular dataset, the complexity introduced by non-linear kernels and the multi-layer architecture of the Neural Networks does not necessarily lead to better classification performance.

The superior performance of the SVM with a linear kernel could be attributed to its simplicity and robustness, particularly in scenarios where the data exhibits a linear relationship or when the dataset size and feature space complexity do not justify more complex models.
""")


SVM Results:
Accuracy with SVM linear kernel: 0.5573770491803278
Accuracy with SVM rbf kernel: 0.5245901639344263
Accuracy with SVM sigmoid kernel: 0.5245901639344263

Neural Network Results:
Accuracy with Neural Network sgd optimizer: 0.4262295081967213
Accuracy with Neural Network adam optimizer: 0.4426229508196721

Based on the model evaluation, the SVM with a linear kernel outperforms the other models, achieving an accuracy of approximately 55.74%. Its lead over the RBF and sigmoid kernels is only about three percentage points on the 20% test split, so it gives at most weak evidence that a linear decision boundary suits the heart disease dataset.
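
To put the gaps between models in perspective, the sketch below attaches a rough binomial standard error to each reported accuracy. It assumes a test set of 61 rows, which is inferred from the printed values (for example 34/61 = 0.5573770...) rather than stated in the notebook. The correct-prediction counts are back-computed the same way, and `accuracy_with_se` is a hypothetical helper.

```python
import math

# Assumption: 61 test rows, inferred from the printed accuracies (34/61, 32/61, ...)
n_test = 61

# Correct-prediction counts back-computed from the reported accuracies
correct = {
    "svm_linear": 34,
    "svm_rbf": 32,
    "svm_sigmoid": 32,
    "mlp_sgd": 26,
    "mlp_adam": 27,
}

def accuracy_with_se(k, n):
    """Return (accuracy, standard error) under a simple binomial model."""
    p = k / n
    se = math.sqrt(p * (1 - p) / n)
    return p, se

for name, k in correct.items():
    p, se = accuracy_with_se(k, n_test)
    # 1.96 * se is the half-width of a normal-approximation 95% interval
    print(f"{name}: {p:.4f} +/- {1.96 * se:.4f}")
```

With so few test rows the interval half-widths are much larger than the gap between the SVM kernels, so the ranking among them could easily change with a different split. This is a rough approximation only: it treats each model independently even though all five were scored on the same rows.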

In contrast, the SVM models with RBF and Sigmoid kernels, as well as the Neural Network models with SGD and Adam optimizers, yield lower accuracies (ranging from 42.62% to 52.46%). These results indicate that for this particular dataset, the complexity introduced by non-linear kernels and the multi-layer architecture of the Neural Networks does not necessarily 