## Load MNIST Dataset
Loaded through Keras because the OpenML server was not responding.

In [None]:
from tensorflow.keras.datasets import mnist

(X_train, y_train), (X_test, y_test) = mnist.load_data()

## Check out the Number

In [None]:
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt

plt.imshow(X_train[0], cmap='gray')
plt.title(f'Label: {y_train[0]}')
plt.axis('off')
plt.show()

## Reshape 2D Image (28x28) to 1D Vector (28 x 28 = 784)

In [None]:
X_train = X_train.reshape(-1, 28*28)
X_test = X_test.reshape(-1, 28*28)

print("Reshaped training shape:", X_train.shape)
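To make the flattening concrete, here is a minimal sketch on a made-up array (`images` is a hypothetical stand-in, not the MNIST data). It assumes NumPy's default row-major order, under which the pixel at row `r`, column `c` ends up at index `r * 28 + c` of the flattened vector.

```python
import numpy as np

# Hypothetical stand-in for a batch of images: 3 images of 28x28 pixels.
images = np.arange(3 * 28 * 28).reshape(3, 28, 28)

# -1 lets NumPy infer the number of images; each row becomes one 784-long vector.
flat = images.reshape(-1, 28 * 28)
print(flat.shape)  # (3, 784)

# Row-major flattening: pixel (r, c) lands at index r * 28 + c.
r, c = 5, 7
assert flat[0, r * 28 + c] == images[0, r, c]

# Reshaping back recovers the original image unchanged.
assert np.array_equal(flat[0].reshape(28, 28), images[0])
```

No pixel values are changed by the reshape, which is why the image can be restored with `.reshape(28, 28)` when it needs to be displayed.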

## Normalizing Pixel Values
Scales each pixel intensity from an integer in [0, 255] to a float in [0, 1]. Features on a small, common scale generally help the solver converge.

[0, 255] -> [0, 1]

In [None]:
X_train = X_train / 255.0
X_test = X_test / 255.0
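The effect of the division can be checked on a few made-up values. This is only a sketch: `pixels` is a hypothetical array, and it assumes the raw images are stored as unsigned 8-bit integers.

```python
import numpy as np

# Hypothetical pixel values covering the full 8-bit range.
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)

# Dividing by a float converts the integers to floating point.
scaled = pixels / 255.0

print(scaled.dtype)                # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```

The ordering of pixel intensities is preserved; only the scale changes.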

## Shuffling the Training Samples
Removes any ordering in the training data that could bias how the weights and biases are updated.

In [None]:
import numpy as np

shuffle_index = np.random.permutation(len(X_train))
X_train = X_train[shuffle_index]
y_train = y_train[shuffle_index]
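Indexing both arrays with the same permutation is what keeps every image paired with its label. A small sketch on toy data illustrates this; `X`, `y`, and the seed are made up for the example, and a seeded `np.random.default_rng` is used here only so the result is repeatable.

```python
import numpy as np

rng = np.random.default_rng(42)  # arbitrary seed, for repeatability only

# Hypothetical toy data: the first entry of each row equals its label.
X = np.array([[0, 0], [1, 10], [2, 20], [3, 30]])
y = np.array([0, 1, 2, 3])

# One permutation, applied to both arrays.
idx = rng.permutation(len(X))
X_shuffled, y_shuffled = X[idx], y[idx]

# Row i of X_shuffled still belongs with label i of y_shuffled.
assert np.array_equal(X_shuffled[:, 0], y_shuffled)
```

Shuffling the two arrays with separate permutations would break this pairing and make the labels meaningless.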

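## Training the Model with a Classification Algorithm
Used Logistic Regression. For each digit class, the model computes a score z = w₁x₁ + w₂x₂ + ... + w₇₈₄x₇₈₄ + b and converts the scores into probabilities.

The weights are updated on every iteration, up to a maximum of 1000 (`max_iter=1000`).

`model.fit()` does the actual training.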

In [None]:
from sklearn.linear_model import LogisticRegression

model = LogisticRegression(max_iter=1000, solver='saga')
model.fit(X_train, y_train)
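The score formula can be sketched in plain NumPy. This is an illustration, not scikit-learn's internal code: the weights `W` and biases `b` are random stand-ins rather than trained values, and it assumes the multinomial (softmax) form, where each of the 10 classes has its own weight vector and bias.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features, n_classes = 784, 10
x = rng.random(n_features)  # one flattened image with values in [0, 1)
W = rng.normal(scale=0.01, size=(n_classes, n_features))  # stand-in weights
b = np.zeros(n_classes)                                   # stand-in biases

# One score per class: z_k = w_k1*x_1 + ... + w_k784*x_784 + b_k
z = W @ x + b

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    e = np.exp(scores - scores.max())  # subtract the max for numerical stability
    return e / e.sum()

p = softmax(z)
predicted = int(np.argmax(p))  # class with the highest probability
print(p.shape)  # (10,)
```

Training amounts to adjusting `W` and `b` so that the true digit receives the highest probability across the training images.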

## Evaluating the Model

After training, `model.predict()` is used to classify the test data.

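Performance is measured as Accuracy = Correct Predictions / Total Test Samples.

`accuracy_score()` computes this value on the test set, which the model did not see during training, and `classification_report()` adds per-class precision, recall, and F1-score.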

In [None]:
from sklearn.metrics import accuracy_score, classification_report

y_pred = model.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))
print("\nClassification Report:\n")
print(classification_report(y_test, y_pred))
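The accuracy formula can be reproduced by hand. In the sketch below, `y_true` and `y_hat` are made-up labels used only to show the arithmetic.

```python
import numpy as np

# Hypothetical true and predicted labels for six samples.
y_true = np.array([7, 2, 1, 0, 4, 1])
y_hat = np.array([7, 2, 1, 0, 9, 1])

# Comparing element-wise gives True/False; the mean of that is the accuracy.
correct = y_true == y_hat
accuracy = correct.sum() / len(y_true)  # 5 correct out of 6

print(accuracy)
```

`np.mean(y_true == y_hat)` gives the same number in a single step.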

## Predicting a Sample Image
A test image is selected and passed to `model.predict()` to classify the digit.

The model calculates probabilities for each class (0–9) and selects the digit with the highest probability as the final prediction.

The predicted label is then displayed along with the image.

In [None]:
sample = 100
plt.imshow(X_test[sample].reshape(28,28), cmap='gray')
plt.title(f"Predicted: {model.predict([X_test[sample]])[0]}")
plt.axis('off')
plt.show()
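The step from probabilities to a prediction can be inspected with `predict_proba()`. The sketch below uses a small synthetic dataset instead of MNIST so that it runs quickly; `X_toy`, `y_toy`, and `clf` are hypothetical names introduced for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical toy data: 3 well-separated classes in 2 dimensions.
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
y_toy = np.repeat([0, 1, 2], 30)
X_toy = centers[y_toy] + rng.normal(scale=0.5, size=(90, 2))

clf = LogisticRegression(max_iter=1000).fit(X_toy, y_toy)

# One row per sample, one column per class; each row sums to 1.
proba = clf.predict_proba(X_toy[:5])

# Picking the most probable class reproduces predict().
top = clf.classes_[np.argmax(proba, axis=1)]
assert np.array_equal(top, clf.predict(X_toy[:5]))
```

The same call on the notebook's model, `model.predict_proba([X_test[sample]])`, should return one row of 10 probabilities, one per digit.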