
## Case Study: Predicting Wine Quality using SVM

In this case study, we use a Support Vector Machine (SVM) classifier to predict the quality of wine from its chemical properties. The Wine dataset contains chemical measurements for each wine together with a quality rating, which is stored in the `target` column.

### Step 1: Load and Prepare Data


In [None]:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load dataset
df = pd.read_csv('wine_dataset.csv')

# Display the first few rows
df.head()



### Step 2: Split Data into Training and Testing Sets


In [None]:

# Separate the feature columns (X) from the label column (y)
X = df.drop('target', axis=1)
y = df['target']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
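
If the quality classes are imbalanced, a purely random split can give the test set different class proportions from the training set. The sketch below illustrates how the `stratify` argument of `train_test_split` preserves those proportions. It uses small synthetic arrays (`X_demo`, `y_demo`, both hypothetical) rather than the wine data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: 80 samples of class 0, 20 of class 1.
y_demo = np.array([0] * 80 + [1] * 20)
X_demo = np.arange(100).reshape(-1, 1)  # placeholder feature column

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# With stratification, both splits keep the original class ratio.
train_ratio = y_tr.mean()
test_ratio = y_te.mean()
print(f"Class 1 share in train: {train_ratio:.2f}, in test: {test_ratio:.2f}")
```

Whether stratification is worth applying here depends on how the ratings in `target` are distributed, which `y.value_counts()` would show.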



### Step 3: Feature Scaling


In [None]:

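# Fit the scaler on the training data only, then apply the same
# transformation to the test data so no test statistics leak into training
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)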
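
To make the scaling step concrete, the sketch below reproduces by hand what `StandardScaler` does: it subtracts the training mean and divides by the training standard deviation (the population form, `ddof=0`), and reuses those same training statistics for the test rows. The toy arrays `X_tr_demo` and `X_te_demo` are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 3 training rows, 1 test row, 2 features.
X_tr_demo = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_te_demo = np.array([[4.0, 40.0]])

demo_scaler = StandardScaler().fit(X_tr_demo)

# Manual standardization with the training mean and population std (ddof=0).
mean = X_tr_demo.mean(axis=0)
std = X_tr_demo.std(axis=0)
manual_test = (X_te_demo - mean) / std

# The scaler applies the same training statistics to the test row.
matches = np.allclose(demo_scaler.transform(X_te_demo), manual_test)
print(matches)
```

Scaling matters for SVMs because the margin is computed from distances between points, so a feature with a large numeric range would otherwise dominate.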



### Step 4: Build and Train SVM Model


In [None]:

from sklearn.svm import SVC

svm = SVC(kernel='linear')
svm.fit(X_train, y_train)
y_pred = svm.predict(X_test)
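
A fitted linear SVM can be inspected to see what it learned. The following sketch, on a hypothetical two-dimensional toy set (`X_toy`, `y_toy`), prints the number of support vectors per class and the hyperplane parameters. Note that `coef_` is only available for the linear kernel.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable toy data in two dimensions.
X_toy = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 3.0], [4.0, 4.0]])
y_toy = np.array([0, 0, 1, 1])

toy_svm = SVC(kernel='linear').fit(X_toy, y_toy)

# Only the points that determine the margin are kept as support vectors.
print("Support vectors per class:", toy_svm.n_support_)

# coef_ and intercept_ define the separating hyperplane w.x + b = 0.
print("Weights:", toy_svm.coef_, "Intercept:", toy_svm.intercept_)

# Classify two new points, one on each side of the boundary.
toy_pred = toy_svm.predict([[0.5, 0.5], [3.5, 3.5]])
print("Predictions:", toy_pred)
```

The same attributes exist on the `svm` object trained above, where `coef_` has one row per pair of classes if the target has more than two classes.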



### Step 5: Evaluate the Model


In [None]:

from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

print(classification_report(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
print(f"Accuracy: {accuracy_score(y_test, y_pred) * 100:.2f}%")
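
As a reading aid for the output above, the sketch below builds a confusion matrix from hypothetical labels (`y_true_demo`, `y_pred_demo`) and shows that accuracy is the sum of the diagonal divided by the total number of samples. In scikit-learn's layout, rows correspond to true classes and columns to predicted classes.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true and predicted labels for a three-class problem.
y_true_demo = [0, 0, 1, 1, 2, 2]
y_pred_demo = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true_demo, y_pred_demo)
# Rows are true classes, columns are predicted classes.
print(cm)

# Accuracy is the diagonal (correct predictions) divided by the total count.
manual_accuracy = np.trace(cm) / cm.sum()
same = np.isclose(manual_accuracy, accuracy_score(y_true_demo, y_pred_demo))
print(same)
```

Off-diagonal entries show which classes are confused with each other, which accuracy alone does not reveal.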



### Step 6: Hyperparameter Tuning


In [None]:

from sklearn.model_selection import GridSearchCV

param_grid = {
    'C': [0.1, 1, 10, 100],
    'gamma': [1, 0.1, 0.01, 0.001],
    'kernel': ['rbf', 'poly', 'sigmoid']
}

grid_search = GridSearchCV(SVC(), param_grid, cv=5, scoring='accuracy')
grid_search.fit(X_train, y_train)

print(f"Best parameters: {grid_search.best_params_}")
print(f"Best cross-validation accuracy: {grid_search.best_score_:.4f}")
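
One caveat with running a grid search on features that were scaled beforehand: the scaler has already seen every cross-validation fold, so the validation folds are not fully independent. A common way to avoid this is to put the scaler and the classifier in a `Pipeline` and search over that. The sketch below is one possible setup, not the only one. It uses scikit-learn's built-in `load_wine` data as a stand-in (an assumption; `wine_dataset.csv` may differ), a reduced grid to keep it fast, and the hypothetical names `pipe`, `pipe_grid`, and `pipe_search`.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data: scikit-learn's built-in wine dataset (an assumption;
# the case study's wine_dataset.csv may differ).
X_demo, y_demo = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# The scaler is refit inside every cross-validation fold, so validation
# folds never influence the scaling statistics.
pipe = Pipeline([('scaler', StandardScaler()), ('svc', SVC())])

# Parameters of a pipeline step are addressed as '<step name>__<parameter>'.
pipe_grid = {
    'svc__C': [0.1, 1, 10],
    'svc__gamma': [0.1, 0.01],
    'svc__kernel': ['rbf'],
}

pipe_search = GridSearchCV(pipe, pipe_grid, cv=5, scoring='accuracy')
pipe_search.fit(X_tr, y_tr)  # raw, unscaled features go in

test_accuracy = pipe_search.score(X_te, y_te)
print(f"Best parameters: {pipe_search.best_params_}")
print(f"Test accuracy: {test_accuracy:.2f}")
```

With this arrangement the fitted search object can be used directly on unscaled data, since scaling is part of the model.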



### Step 7: Evaluate the Tuned Model


In [None]:

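# GridSearchCV (refit=True by default) has already refit the best estimator
# on the full training set, so no additional call to fit() is needed
best_svm = grid_search.best_estimator_
y_pred_best = best_svm.predict(X_test)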

print(classification_report(y_test, y_pred_best))
print(confusion_matrix(y_test, y_pred_best))
print(f"Accuracy: {accuracy_score(y_test, y_pred_best) * 100:.2f}%")
