# Support Vector Machine (SVM) with Scikit-learn Wrapper (SVM_SK)

## 1. Import Required Libraries
In this section, we import the libraries needed to load the data, train the custom `SVM_SK` class, and visualize the results. `SVM_SK` lives in `src/sklearn_impl/SVM_SK.py`, so the cell also adds that directory to the import path.

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_classification  # For generating synthetic classification data
import sys
import os

# Add path to access src/sklearn_impl/SVM_SK.py
sys.path.append(os.path.abspath('../src/sklearn_impl'))
from SVM_SK import SVM_SK  # Import from src/sklearn_impl/SVM_SK.py

# Plotting settings
%matplotlib inline
sns.set_theme()  # The 'seaborn' style name was deprecated in Matplotlib 3.6; use seaborn's own theme instead

## 2. Generate or Load Data
We will load a two-feature dataset from `placeholder_1.csv` in the repo's `data` directory. The labels must be in {-1, 1}, so check the `Label` column before training. If you have your own data, point the path at your CSV file; if you have no data at all, synthetic data from `make_classification` works for demonstration purposes.

In [None]:
df = pd.read_csv('../data/placeholder_1.csv')  # Adjust the path if your CSV lives elsewhere
X = df[['Feature_1', 'Feature_2']].values
y = df['Label'].values  # Ensure labels are {-1, 1}
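
If the CSV file is not available, the cell above can be swapped for synthetic data. The following is a minimal sketch, assuming the same column names (`Feature_1`, `Feature_2`, `Label`) as the CSV; the sample count and random seed are arbitrary choices, not values from the repo.

```python
import pandas as pd
from sklearn.datasets import make_classification

# Two informative features and no redundant ones, so the data stays 2-D and plottable
X, y01 = make_classification(
    n_samples=200,          # arbitrary size for the demo
    n_features=2,
    n_informative=2,
    n_redundant=0,
    n_clusters_per_class=1,
    random_state=42,
)

# make_classification returns labels in {0, 1}; map them to {-1, 1}
y = 2 * y01 - 1

# Build a DataFrame with the same layout as the CSV so later cells work unchanged
df = pd.DataFrame(X, columns=['Feature_1', 'Feature_2'])
df['Label'] = y
```

The mapping `2 * y01 - 1` sends 0 to -1 and 1 to 1, which matches the label convention the notebook asks for.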

## 3. Apply SVM_SK (Scikit-learn Wrapper)
We will train the `SVM_SK` class on the data with the 'sgd' method, which fits the model iteratively and stops early once training stops improving.

In [None]:
# Create SVM_SK model with SGD method and early stopping
svm_sk = SVM_SK(
    method='sgd',
    C=1.0,
    tol=1e-3,
    max_iter=1000,
    early_stopping=True,
    n_iter_no_change=10,
    verbose=True,
    interval=100,
    random_state=42
)

# Train the model
svm_sk.fit(X, y)

# Predict labels for the training data
df['Predicted_Label'] = svm_sk.predict(X)

# Display the first 5 rows with predicted labels
df.head()
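
The source of `SVM_SK` is not shown in this notebook, so the following is only a rough sketch of what an 'sgd' method with early stopping and loss tracking could look like on top of scikit-learn's `SGDClassifier`. The function name `fit_sgd_hinge`, the mapping `alpha = 1 / (C * n_samples)`, and the stopping rule are assumptions for illustration, not the actual implementation.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def fit_sgd_hinge(X, y, C=1.0, tol=1e-3, max_iter=1000,
                  n_iter_no_change=10, random_state=42):
    """Hypothetical stand-in for SVM_SK(method='sgd'). Labels must be in {-1, 1}."""
    n_samples = X.shape[0]
    # Assumed conversion from the SVM penalty C to SGDClassifier's alpha
    clf = SGDClassifier(loss='hinge', alpha=1.0 / (C * n_samples),
                        random_state=random_state)
    classes = np.unique(y)
    loss_history = []
    best_loss = np.inf
    stale_epochs = 0
    for _ in range(max_iter):
        # One pass over the data per call
        clf.partial_fit(X, y, classes=classes)
        # Mean hinge loss on the training set (regularization term not included)
        margins = y * clf.decision_function(X)
        loss = float(np.mean(np.maximum(0.0, 1.0 - margins)))
        loss_history.append(loss)
        # Stop when the loss has not improved by at least tol for several epochs
        if loss < best_loss - tol:
            best_loss = loss
            stale_epochs = 0
        else:
            stale_epochs += 1
            if stale_epochs >= n_iter_no_change:
                break
    return clf, loss_history
```

Calling `clf, history = fit_sgd_hinge(X, y)` returns the fitted classifier together with one loss value per epoch, which is the kind of list a loss-history plot needs.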

## 4. Visualize the Results
We will plot the data points, using color for the true label and marker style for the predicted label, so a misclassified point shows up as a mismatch between color and marker. The 'sgd' method does not produce support vectors (unlike `SVC`), and this notebook does not rely on `SVM_SK` exposing the learned weights, so we visualize the classification results only.

In [None]:
# Plot the data points
plt.figure(figsize=(8, 6))
sns.scatterplot(x='Feature_1', y='Feature_2', hue='Label', style='Predicted_Label', 
                data=df, palette='Set1', s=100)

plt.title('SVM_SK Classification Results (SGD Method)')
plt.xlabel('Feature 1')
plt.ylabel('Feature 2')
plt.legend()
plt.show()
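
A linear classifier still defines a separating line, `w[0]*x1 + w[1]*x2 + b = 0`, whenever its weights are available. The helper below is a small sketch of how that line could be computed for a 2-D plot; `boundary_line` is a hypothetical name, and whether `SVM_SK` exposes the weights of its underlying estimator is an assumption that depends on the wrapper.

```python
import numpy as np

def boundary_line(w, b, x1_min, x1_max, n_points=50):
    """Return (x1, x2) points on the line w[0]*x1 + w[1]*x2 + b = 0."""
    w = np.asarray(w, dtype=float)
    if np.isclose(w[1], 0.0):
        # The line is vertical and cannot be written as x2 = f(x1)
        raise ValueError("w[1] is ~0: plot the vertical line x1 = -b / w[0] instead")
    x1 = np.linspace(x1_min, x1_max, n_points)
    # Solve w[0]*x1 + w[1]*x2 + b = 0 for x2
    x2 = -(w[0] * x1 + b) / w[1]
    return x1, x2
```

With a fitted scikit-learn linear model, `w` and `b` would come from `coef_[0]` and `intercept_[0]`, and the returned arrays can be passed to `plt.plot(x1, x2)` on top of the scatter plot.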

## 5. Plot Loss History
We will plot the hinge loss history tracked by `SVM_SK` to observe the convergence of the model, especially with early stopping.

In [None]:
# Get loss history
loss_history = svm_sk.get_loss_history()

# Plot loss history
plt.figure(figsize=(8, 5))
plt.plot(loss_history, marker='o')
plt.title('Hinge Loss History Over Epochs (SVM_SK with SGD)')
plt.xlabel('Epoch')
plt.ylabel('Hinge Loss')
plt.grid(True)
plt.show()
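
For reference, the mean hinge loss over a dataset is the average of `max(0, 1 - y_i * f(x_i))`, where `f(x_i)` is the raw decision score and `y_i` is in {-1, 1}. The sketch below computes it with NumPy; whether `SVM_SK` also adds a regularization term to the values it tracks is not shown here, so treat this as the plain data term only.

```python
import numpy as np

def mean_hinge_loss(y, scores):
    """Average of max(0, 1 - y * score) for labels y in {-1, 1}."""
    y = np.asarray(y, dtype=float)
    scores = np.asarray(scores, dtype=float)
    return float(np.mean(np.maximum(0.0, 1.0 - y * scores)))

# Worked example with three points:
#   y=+1, score= 2.0 -> max(0, 1 - 2.0) = 0.0  (correct, outside the margin)
#   y=-1, score=-0.5 -> max(0, 1 - 0.5) = 0.5  (correct, but inside the margin)
#   y=+1, score=-1.0 -> max(0, 1 + 1.0) = 2.0  (misclassified)
example_loss = mean_hinge_loss([1, -1, 1], [2.0, -0.5, -1.0])
```

A point contributes zero loss only when it is on the correct side with a margin of at least 1, which is why the curve can keep decreasing after the training accuracy has stopped changing.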

## 6. Evaluate the Model
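We will compute the accuracy score of the model on the training data. Because the model has already seen these points, this score is an optimistic estimate of how it would perform on new data.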

In [None]:
# Compute accuracy
accuracy = svm_sk.score(X, y)
print(f"Accuracy on training data: {accuracy:.4f}")
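
To estimate performance on unseen data, the model can be scored on a held-out split instead. The following is a minimal sketch that uses `SGDClassifier` with hinge loss as a stand-in for `SVM_SK` and synthetic data as a stand-in for the CSV; the split ratio and seeds are arbitrary assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# Stand-in data with labels mapped to {-1, 1}
X_demo, y_demo = make_classification(
    n_samples=300, n_features=2, n_informative=2, n_redundant=0, random_state=0
)
y_demo = 2 * y_demo - 1

# Hold out 25% of the points; stratify keeps the class balance in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=42
)

# Stand-in for SVM_SK(method='sgd'): a linear model trained with hinge loss
model = SGDClassifier(loss='hinge', random_state=42)
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)
test_acc = model.score(X_test, y_test)
print(f"Train accuracy: {train_acc:.4f}")
print(f"Test accuracy:  {test_acc:.4f}")
```

The same pattern should carry over to `SVM_SK` by fitting on `X_train, y_train` and calling `score` on `X_test, y_test`.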

## 7. Conclusion
In this notebook, we used the `SVM_SK` class, a scikit-learn wrapper built on either `SVC` or `SGDClassifier` with hinge loss. We loaded the data from a CSV file, trained the model with the 'sgd' method and early stopping, visualized the results, plotted the loss history, and evaluated the accuracy on the training data.