# **Support Vector Machines**

A Support Vector Machine (SVM) is a supervised learning model that classifies data by finding the hyperplane that best separates the classes. It is particularly effective in high-dimensional spaces and is widely used for classification tasks.

SVM can also be used for regression tasks, but it is primarily known for its classification capabilities.

SVM works by maximizing the margin, the distance between the separating hyperplane and the closest training points of each class. These closest points are called support vectors, and they alone determine where the optimal hyperplane lies.
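
As a minimal sketch of the margin idea, the cell below fits a linear SVM to six made-up points (the toy data and the names `X_toy`, `y_toy`, and `margin_width` are assumptions for illustration, not part of the baseball dataset). For a linear kernel the margin width is 2 / ||w||, where w is the learned weight vector. The closest points of the two classes here are (1, 1) and (3, 3), so the width should come out close to 2√2 ≈ 2.83.

```python
import numpy as np
from sklearn import svm

# Toy, linearly separable data (made up for illustration).
X_toy = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0],
                  [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

# A very large C approximates a hard-margin SVM on separable data.
clf = svm.SVC(kernel="linear", C=1e6)
clf.fit(X_toy, y_toy)

# For a linear kernel, coef_ holds the weight vector w of the hyperplane w.x + b = 0.
w = clf.coef_[0]
margin_width = 2.0 / np.linalg.norm(w)

print("Support vectors:\n", clf.support_vectors_)
print("Margin width:", margin_width)
```

Only the points listed in `support_vectors_` influence the fitted hyperplane; removing any other training point would leave the solution unchanged.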

SVM can handle both linear and non-linear classification problems by using different kernel functions.
The most common kernel functions used in SVM are linear, polynomial, and radial basis function (RBF) kernels. The choice of kernel depends on the nature of the data and the problem being solved.
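
To illustrate how the kernel choice interacts with the shape of the data, here is a small sketch on synthetic concentric circles generated with `make_circles` (an assumption for illustration; this is not the dataset used later in the notebook). No straight line can separate the two rings, so a linear kernel is expected to do poorly while the RBF kernel should separate them almost perfectly.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn import svm

# Two concentric rings: a classic non-linearly-separable toy problem.
X_c, y_c = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_c, y_c, test_size=0.25, random_state=0)

# Fit one classifier per kernel and record its test accuracy.
kernel_scores = {}
for kernel in ("linear", "poly", "rbf"):
    model = svm.SVC(kernel=kernel)
    model.fit(X_tr, y_tr)
    kernel_scores[kernel] = model.score(X_te, y_te)
    print(f"{kernel:>6}: {kernel_scores[kernel]:.3f}")
```

On real data the ranking can go the other way, so it is usually worth trying more than one kernel.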

SVM is sensitive to the choice of hyperparameters, such as the regularization parameter (C) and the kernel parameters. Proper tuning of these hyperparameters is crucial for achieving good performance.
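
One common way to tune C and gamma is a cross-validated grid search. The sketch below uses `GridSearchCV` on synthetic data from `make_classification`; the grid values and the data are assumptions chosen for illustration, not recommended settings for any particular problem.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn import svm

# Synthetic stand-in data (illustrative only).
X_demo, y_demo = make_classification(n_samples=300, n_features=10, random_state=0)

# Scaling lives inside the pipeline so it is re-fitted on each CV training fold.
pipeline = make_pipeline(StandardScaler(), svm.SVC(kernel="rbf"))

# make_pipeline names the SVC step 'svc', hence the 'svc__' prefix.
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": ["scale", 0.01, 0.1, 1],
}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_demo, y_demo)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```

After fitting, `search.best_estimator_` is the pipeline refitted on all of the supplied data with the best parameter combination.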

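Training an SVM can be computationally expensive on large datasets, because training time grows faster than linearly with the number of samples. However, it is highly effective for small and medium-sized datasets and can achieve high accuracy with proper feature selection and preprocessing, such as feature scaling.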
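
As a rough sketch of why preprocessing matters, the cell below compares an RBF SVM with and without `StandardScaler` on synthetic data in which one uninformative feature has been blown up to a much larger scale. The data, the factor of 1000, and the variable names are assumptions for illustration. Because the RBF kernel depends on distances between samples, the large-scale noise feature tends to dominate unless the features are standardized first.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn import svm

# With shuffle=False the first 3 columns are informative and the last 2 are noise.
X_demo, y_demo = make_classification(
    n_samples=400, n_features=5, n_informative=3, n_redundant=0,
    shuffle=False, random_state=0,
)
# Inflate one noise column so that it dominates Euclidean distances.
X_demo[:, 4] *= 1000.0

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=0
)

unscaled = svm.SVC(kernel="rbf").fit(X_tr, y_tr)
scaled = make_pipeline(StandardScaler(), svm.SVC(kernel="rbf")).fit(X_tr, y_tr)

unscaled_score = unscaled.score(X_te, y_te)
scaled_score = scaled.score(X_te, y_te)
print("Without scaling:", unscaled_score)
print("With scaling:   ", scaled_score)
```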

Margin maximization acts as a form of regularization, so SVM often generalizes well to unseen data, which makes it a popular choice for many classification tasks. The RBF kernel is very flexible, however, and can overfit when C or gamma is set too high.

SVM is inherently a binary classifier, but it can be used for multi-class problems as well. Multi-class problems are handled by combining several binary classifiers, using strategies such as one-vs-one or one-vs-rest (also called one-vs-all).
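
The two multi-class strategies can be sketched on scikit-learn's bundled digits dataset, which has 10 classes (the dataset choice and variable names are assumptions for illustration). One-vs-one trains a classifier for every pair of classes, giving 10 * 9 / 2 = 45 pairwise decision values, while one-vs-rest trains one classifier per class.

```python
from sklearn.datasets import load_digits
from sklearn.multiclass import OneVsRestClassifier
from sklearn import svm

X_digits, y_digits = load_digits(return_X_y=True)

# One-vs-one: with decision_function_shape='ovo', SVC exposes one
# decision value per pair of classes.
ovo_clf = svm.SVC(kernel="linear", decision_function_shape="ovo")
ovo_clf.fit(X_digits, y_digits)
n_pairwise = ovo_clf.decision_function(X_digits[:1]).shape[1]

# One-vs-rest: wrap a binary SVC so that one classifier is trained per class.
ovr_clf = OneVsRestClassifier(svm.SVC(kernel="linear"))
ovr_clf.fit(X_digits, y_digits)
n_ovr = len(ovr_clf.estimators_)

print("Pairwise (one-vs-one) decision values:", n_pairwise)  # 45
print("One-vs-rest classifiers:", n_ovr)                      # 10
```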

SVM is widely used in various applications, including image classification, text classification, bioinformatics, and more. Its versatility and effectiveness make it a popular choice in machine learning tasks.


In [226]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn import svm

In [227]:
df = pd.read_csv("https://github.com/RyanNolanData/YouTubeData/blob/main/500hits.csv?raw=true", encoding="latin-1")

In [228]:
df.head(10)

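| | PLAYER | YRS | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | CS | BA | HOF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Ty Cobb | 24 | 3035 | 11434 | 2246 | 4189 | 724 | 295 | 117 | 726 | 1249 | 357 | 892 | 178 | 0.366 | 1 |
| 1 | Stan Musial | 22 | 3026 | 10972 | 1949 | 3630 | 725 | 177 | 475 | 1951 | 1599 | 696 | 78 | 31 | 0.331 | 1 |
| 2 | Tris Speaker | 22 | 2789 | 10195 | 1882 | 3514 | 792 | 222 | 117 | 724 | 1381 | 220 | 432 | 129 | 0.345 | 1 |
| 3 | Derek Jeter | 20 | 2747 | 11195 | 1923 | 3465 | 544 | 66 | 260 | 1311 | 1082 | 1840 | 358 | 97 | 0.31 | 1 |
| 4 | Honus Wagner | 21 | 2792 | 10430 | 1736 | 3430 | 640 | 252 | 101 | 0 | 963 | 327 | 722 | 15 | 0.329 | 1 |
| 5 | Carl Yastrzemski | 23 | 3308 | 11988 | 1816 | 3419 | 646 | 59 | 452 | 1844 | 1845 | 1393 | 168 | 116 | 0.285 | 1 |
| 6 | Paul Molitor | 21 | 2683 | 10835 | 1782 | 3319 | 605 | 114 | 234 | 1307 | 1094 | 1244 | 504 | 131 | 0.306 | 1 |
| 7 | Eddie Collins | 25 | 2826 | 9949 | 1821 | 3315 | 438 | 187 | 47 | 520 | 1499 | 286 | 744 | 173 | 0.333 | 1 |
| 8 | Willie Mays | 22 | 2992 | 10881 | 2062 | 3283 | 523 | 140 | 660 | 1903 | 1464 | 1526 | 338 | 103 | 0.302 | 1 |
| 9 | Eddie Murray | 21 | 3026 | 11336 | 1627 | 3255 | 560 | 35 | 504 | 1917 | 1333 | 1516 | 110 | 43 | 0.287 | 1 |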


In [229]:
df = df.drop(columns=['PLAYER', 'CS'])

In [230]:
df.head()

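| | YRS | G | AB | R | H | 2B | 3B | HR | RBI | BB | SO | SB | BA | HOF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 24 | 3035 | 11434 | 2246 | 4189 | 724 | 295 | 117 | 726 | 1249 | 357 | 892 | 0.366 | 1 |
| 1 | 22 | 3026 | 10972 | 1949 | 3630 | 725 | 177 | 475 | 1951 | 1599 | 696 | 78 | 0.331 | 1 |
| 2 | 22 | 2789 | 10195 | 1882 | 3514 | 792 | 222 | 117 | 724 | 1381 | 220 | 432 | 0.345 | 1 |
| 3 | 20 | 2747 | 11195 | 1923 | 3465 | 544 | 66 | 260 | 1311 | 1082 | 1840 | 358 | 0.31 | 1 |
| 4 | 21 | 2792 | 10430 | 1736 | 3430 | 640 | 252 | 101 | 0 | 963 | 327 | 722 | 0.329 | 1 |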


In [231]:
df.columns

Index(['YRS', 'G', 'AB', 'R', 'H', '2B', '3B', 'HR', 'RBI', 'BB', 'SO', 'SB',
       'BA', 'HOF'],
      dtype='object')

In [232]:
# Features: the first 13 columns (YRS through BA). Target: the last column, HOF (0 or 1).
X = df.iloc[:, 0:13]
Y = df.iloc[:, 13]

In [233]:
len(df)

465

In [234]:
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

In [235]:
X.shape, Y.shape

((465, 13), (465,))

In [236]:
svm_classifier = svm.SVC()
svm_classifier.fit(X_train, Y_train)

In [237]:
svm_classifier.get_params()

{'C': 1.0,
 'break_ties': False,
 'cache_size': 200,
 'class_weight': None,
 'coef0': 0.0,
 'decision_function_shape': 'ovr',
 'degree': 3,
 'gamma': 'scale',
 'kernel': 'rbf',
 'max_iter': -1,
 'probability': False,
 'random_state': None,
 'shrinking': True,
 'tol': 0.001,
 'verbose': False}

In [None]:
Y_pred = svm_classifier.predict(X_test)
display(Y_pred)

array([1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
       1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0,
       0, 0, 0, 1, 0], dtype=int64)

In [239]:
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

print("SVC Score: ", svm_classifier.score(X_test, Y_test))
print("SVC Accuracy: ", accuracy_score(Y_test, Y_pred))
print("SVC Classification Report: \n", classification_report(Y_test, Y_pred))
print("SVC Confusion Matrix: \n", confusion_matrix(Y_test, Y_pred))

SVC Score:  0.8064516129032258
SVC Accuracy:  0.8064516129032258
SVC Classification Report: 
               precision    recall  f1-score   support

           0       0.78      0.98      0.87        62
           1       0.93      0.45      0.61        31

    accuracy                           0.81        93
   macro avg       0.86      0.72      0.74        93
weighted avg       0.83      0.81      0.78        93

SVC Confusion Matrix: 
 [[61  1]
 [17 14]]
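
The figures in the classification report can be recovered by hand from the confusion matrix. The sketch below copies the matrix printed above (rows are true classes, columns are predicted classes) and recomputes accuracy, plus precision, recall, and F1 for class 1; the variable names are assumptions for illustration.

```python
import numpy as np

# Confusion matrix copied from the output above: rows = true class, columns = predicted class.
cm = np.array([[61, 1],
               [17, 14]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()        # (14 + 61) / 93
precision_1 = tp / (tp + fp)           # 14 / 15
recall_1 = tp / (tp + fn)              # 14 / 31
f1_1 = 2 * precision_1 * recall_1 / (precision_1 + recall_1)

print(f"accuracy={accuracy:.2f} precision={precision_1:.2f} "
      f"recall={recall_1:.2f} f1={f1_1:.2f}")
```

The low recall for class 1 (14 of 31) shows that the default RBF model misses more than half of the positive examples, even though its overall accuracy looks reasonable.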


In [240]:
# The kernel parameter can be 'linear', 'poly', 'rbf', 'sigmoid', 'precomputed', or a callable.
# 'linear' is the simplest kernel and produces a linear decision boundary.
# 'poly' is a polynomial kernel, which can capture non-linear relationships.
# 'rbf' is the radial basis function kernel, a popular choice for non-linear classification. It is the default kernel.
# 'sigmoid' is a sigmoid kernel, which is less commonly used.
# C is the regularization parameter. It controls the trade-off between maximizing the margin and minimizing the classification error on the training data.
# C=1.0 is the default value.
# C must be strictly positive. A smaller C gives a wider margin but tolerates more misclassified training points; a larger C gives a narrower margin and may lead to overfitting.
# gamma is the kernel coefficient for 'rbf', 'poly', and 'sigmoid'. The 'linear' kernel ignores it.
# It defines how far the influence of a single training example reaches, with low values meaning 'far' and high values meaning 'close'.
# gamma='scale' is the default value, calculated as 1 / (n_features * X.var()). It is generally a good choice for most datasets.
# gamma='auto' is another option, calculated as 1 / n_features.
# Here the kernel is linear, so gamma='scale' has no effect on the fitted model.
svm_classifier_2 = svm.SVC(kernel='linear', C=1.0, gamma='scale')
svm_classifier_2.fit(X_train, Y_train)
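
As a quick check of the `gamma='scale'` formula, the sketch below fits two RBF classifiers on synthetic data (an assumption for illustration): one with `gamma='scale'` and one with gamma set by hand to 1 / (n_features * X.var()). If the formula holds, both should produce the same decision values.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn import svm

# Synthetic stand-in data (illustrative only).
X_demo, y_demo = make_classification(n_samples=200, n_features=6, random_state=0)

# X.var() is the variance over all entries of X, not per feature.
manual_gamma = 1.0 / (X_demo.shape[1] * X_demo.var())

clf_scale = svm.SVC(kernel="rbf", gamma="scale").fit(X_demo, y_demo)
clf_manual = svm.SVC(kernel="rbf", gamma=manual_gamma).fit(X_demo, y_demo)

same_decisions = np.allclose(
    clf_scale.decision_function(X_demo),
    clf_manual.decision_function(X_demo),
)
print("Manual gamma:", manual_gamma)
print("Same decision values:", same_decisions)
```

Because the value depends on the variance of the training data, rescaling the features changes the effective gamma under `'scale'`, whereas `'auto'` depends only on the number of features.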

In [241]:
svm_classifier_2.get_params()

{'C': 1.0,
 'break_ties': False,
 'cache_size': 200,
 'class_weight': None,
 'coef0': 0.0,
 'decision_function_shape': 'ovr',
 'degree': 3,
 'gamma': 'scale',
 'kernel': 'linear',
 'max_iter': -1,
 'probability': False,
 'random_state': None,
 'shrinking': True,
 'tol': 0.001,
 'verbose': False}

In [242]:
Y_pred_2 = svm_classifier_2.predict(X_test)
display(Y_pred_2)

array([1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 0,
       1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0,
       0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0,
       0, 0, 0, 1, 0], dtype=int64)

In [243]:
print("SVC 2 Score: ", svm_classifier_2.score(X_test, Y_test))
print("SVC 2 Accuracy: ", accuracy_score(Y_test, Y_pred_2))
print("SVC 2 Classification Report: \n", classification_report(Y_test, Y_pred_2))
print("SVC 2 Confusion Matrix: \n", confusion_matrix(Y_test, Y_pred_2))

SVC 2 Score:  0.8602150537634409
SVC 2 Accuracy:  0.8602150537634409
SVC 2 Classification Report: 
               precision    recall  f1-score   support

           0       0.86      0.95      0.90        62
           1       0.88      0.68      0.76        31

    accuracy                           0.86        93
   macro avg       0.87      0.81      0.83        93
weighted avg       0.86      0.86      0.86        93

SVC 2 Confusion Matrix: 
 [[59  3]
 [10 21]]
