<a href="https://colab.research.google.com/github/Simarjit1303/Data-Science/blob/main/exercises/machine-learning/Neural-Network/stochastic_gradient_descent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Stochastic Gradient Descent
You should build an end-to-end machine learning pipeline using a stochastic gradient descent model. In particular, you should do the following:
- Load the `mnist` dataset using [Pandas](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html). You can find this dataset in the datasets folder.
- Split the dataset into training and test sets using [Scikit-Learn](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html).
- Build an end-to-end machine learning pipeline, including a [stochastic gradient descent](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html) model.
- Optimize your pipeline by validating your design decisions.
- Test the best pipeline on the test set and report various [evaluation metrics](https://scikit-learn.org/0.15/modules/model_evaluation.html).  
- Check the documentation to identify the most important hyperparameters, attributes, and methods of the model. Use them in practice.

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import RandomizedSearchCV, cross_val_score
from scipy.stats import uniform, reciprocal

In [2]:
df = pd.read_csv("https://raw.githubusercontent.com/m-mahdavi/teaching/refs/heads/main/datasets/mnist.csv")
df.head()

Unnamed: 0,id,class,pixel1,pixel2,pixel3,pixel4,pixel5,pixel6,pixel7,pixel8,...,pixel775,pixel776,pixel777,pixel778,pixel779,pixel780,pixel781,pixel782,pixel783,pixel784
0,31953,5,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,34452,8,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,60897,5,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,36953,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,1981,3,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


## DATA Preprocessing

In [3]:
df_train, df_test = train_test_split(df, test_size=0.2, random_state=42)
print(df_train.shape, df_test.shape)

(3200, 786) (800, 786)


In [4]:
X_train = df_train.drop(['id'],axis=1)
y_train = df_train['class']
X_test = df_test.drop(['id'],axis=1)
y_test = df_test['class']

print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)

(3200, 785) (3200,)
(800, 785) (800,)


## Feature Scalling

In [5]:
scaler =  StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

## Model Training

## Deafult parameter

In [6]:
sgd_clf_model = SGDClassifier()
sgd_clf_model.fit(X_train_scaled, y_train)
cv_score = cross_val_score(sgd_clf_model, X_train_scaled, y_train, cv=5, scoring='accuracy')
print("Mean_CV Score:", cv_score.mean())

Mean_CV Score: 0.9081250000000001


## Hyperparameter Tuning

In [7]:
param  = {
    'loss': ['hinge', 'log', 'modified_huber', 'squared_hinge', 'perceptron'],
    'penalty': ['l1', 'l2', 'elasticnet', 'None'],
    'alpha': reciprocal(1e-4, 1e0),
    'l1_ratio': uniform(0, 1),
    'fit_intercept': [True, False],
    'max_iter': [1000, 2000, 3000],
    'learning_rate': ['constant', 'optimal', 'invscaling', 'adaptive'],
    'tol': [1e-3, 1e-4, 1e-5],
    'eta0': reciprocal(1e-2, 1e0),
    'power_t': uniform(0, 1)
}

In [None]:
random_search = RandomizedSearchCV(SGDClassifier(), param, n_iter=50, cv=3, scoring='accuracy', random_state=42, n_jobs=-1)
random_search.fit(X_train_scaled, y_train)

best_params= random_search.best_params_
best_model = random_search.best_estimator_
best_score = random_search.best_score_


print("Best Parameters:", best_params)
print("Best Score:", best_score)