**Understanding the Task**

The MNIST dataset consists of 70,000 grayscale images of handwritten digits (0-9), each 28x28 pixels, conventionally split into 60,000 training and 10,000 test images. The goal is to build a classifier that predicts the digit shown in an image. We'll first use deep learning (DL) with Keras (a high-level API for TensorFlow) to train a neural network, then compare it with a classical machine learning (ML) method from scikit-learn. The comparison contrasts DL's ability to learn patterns directly from raw pixels with classical ML's frequent reliance on feature engineering.


**Deep Learning Approach with Keras**

DL uses neural networks to learn features automatically from data. We'll build a simple feedforward network.

**Step 1**: Import Libraries and Load Data

In [32]:
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.utils import to_categorical

# Load MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()

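**Concept**: Keras provides built-in datasets. `X_train` holds 60,000 images of shape 28x28 and `y_train` holds the matching integer labels (0-9); `X_test` and `y_test` hold the remaining 10,000 images and labels. Because every image comes with its label, this is supervised learning.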

**Step 2**: Preprocess Data

In [33]:
# Normalize pixel values to [0,1] for better training
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# One-hot encode labels (e.g., 5 becomes [0,0,0,0,0,1,0,0,0,0])
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

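**Concept**: Scaling pixel values to [0, 1] generally helps gradient-based training converge faster. One-hot encoding turns each integer label into a 10-element vector, which is the label format the `categorical_crossentropy` loss expects.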
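
To make the two preprocessing steps concrete, here is a minimal NumPy-only sketch on a tiny made-up batch. The array `fake_images` and the helper `one_hot` are illustrative names that do not appear in the notebook; `one_hot` is only meant to mirror what `to_categorical` produces for integer labels.

```python
import numpy as np

# Hypothetical stand-in for a batch of two 28x28 uint8 images (values 0-255)
fake_images = np.zeros((2, 28, 28), dtype=np.uint8)
fake_images[0, 0, 0] = 255   # one fully white pixel
fake_images[1, 0, 0] = 51    # one dark-gray pixel

# Scale to [0, 1], the same operation as in the cell above
scaled = fake_images.astype("float32") / 255.0

def one_hot(labels, num_classes=10):
    """Illustrative encoder: row i has a 1 in column labels[i], 0 elsewhere."""
    return np.eye(num_classes, dtype="float32")[np.asarray(labels)]

encoded = one_hot([5, 0])

print(scaled.min(), scaled.max())  # smallest and largest scaled pixel value
print(encoded[0])                  # one-hot row for the label 5
```

Indexing an identity matrix by the label array is a compact way to build the encoding, because row `k` of the identity matrix is exactly the one-hot vector for class `k`.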

**Step 3**: Build the Model

In [34]:
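# Create a sequential model
# Input is imported here so the cell runs on its own; declaring the input shape
# with Input(...) avoids the Keras warning about passing input_shape to a layer
from tensorflow.keras import Input

model = Sequential([
    Input(shape=(28, 28)),           # Each sample is a 28x28 grayscale image
    Flatten(),                       # Flatten 2D image to 1D vector (784 values)
    Dense(128, activation='relu'),   # Hidden layer with 128 neurons, ReLU activation
    Dense(10, activation='softmax')  # Output layer for 10 classes
])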

**Concept**: *Flatten* converts each 28x28 image into a 784-element vector. Dense layers are fully connected. ReLU helps mitigate vanishing gradients; softmax turns the 10 outputs into probabilities that sum to 1.
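
The forward pass of this architecture can be sketched in plain NumPy to show what each layer computes. This is an illustration under assumptions, not how Keras implements it: the weights below are random rather than trained, and the names `relu`, `softmax`, `W1`, `b1`, `W2`, and `b2` are introduced here for the example only.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    """Element-wise max(z, 0)."""
    return np.maximum(z, 0.0)

def softmax(z):
    """Row-wise softmax; subtracting the row max keeps exp() numerically stable."""
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical batch of 4 random "images" with values in [0, 1)
images = rng.random((4, 28, 28))

# Random (untrained) weights with the same shapes as the two Dense layers
W1, b1 = rng.normal(scale=0.05, size=(784, 128)), np.zeros(128)
W2, b2 = rng.normal(scale=0.05, size=(128, 10)), np.zeros(10)

flat = images.reshape(len(images), -1)   # Flatten: (4, 28, 28) -> (4, 784)
hidden = relu(flat @ W1 + b1)            # Dense(128) followed by ReLU
probs = softmax(hidden @ W2 + b2)        # Dense(10) followed by softmax

n_params = W1.size + b1.size + W2.size + b2.size

print(flat.shape, hidden.shape, probs.shape)
print(probs.sum(axis=1))  # each row of probabilities sums to 1
print(n_params)           # total number of weights and biases
```

The parameter count is 784 x 128 + 128 + 128 x 10 + 10 = 101,770, which is the total that `model.summary()` should report for the model defined above.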

**Step 4**: Compile and Train

In [35]:
# Compile with optimizer, loss, and metrics
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_split=0.1)

Epoch 1/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 11s 6ms/step - accuracy: 0.8682 - loss: 0.4549 - val_accuracy: 0.9677 - val_loss: 0.1207
Epoch 2/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 7s 4ms/step - accuracy: 0.9626 - loss: 0.1261 - val_accuracy: 0.9722 - val_loss: 0.0928
Epoch 3/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 9s 5ms/step - accuracy: 0.9750 - loss: 0.0827 - val_accuracy: 0.9775 - val_loss: 0.0850
Epoch 4/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 7s 4ms/step - accuracy: 0.9821 - loss: 0.0585 - val_accuracy: 0.9792 - val_loss: 0.0839
Epoch 5/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 10s 6ms/step - accuracy: 0.9864 - loss: 0.0454 - val_accuracy: 0.9792 - val_loss: 0.0716
Epoch 6/10
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 8s 5ms/step - accuracy: 0.9902 - loss: 0.0340 - val_accuracy: 0.9737 - val_loss: 0.0957
Epoch 7/10
...

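**Concept**: The Adam optimizer adapts the learning rate for each parameter. Categorical cross-entropy measures the error between the predicted probabilities and the one-hot labels in multi-class classification. Training uses backpropagation to compute gradients and update the weights. `validation_split=0.1` holds out the last 10% of the training data for validation.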
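
As a worked example of the loss, the sketch below computes categorical cross-entropy by hand for two made-up predictions and also reproduces the `1688` batches per epoch shown in the training log. The function `categorical_crossentropy` here is a simplified NumPy stand-in written for this example, not the Keras implementation, and the probability values are invented.

```python
import math
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over the batch of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Two made-up samples from a 3-class problem (true classes 1 and 0)
y_true = np.array([[0, 1, 0], [1, 0, 0]], dtype=float)
confident = np.array([[0.05, 0.90, 0.05], [0.80, 0.10, 0.10]])
unsure = np.array([[0.40, 0.30, 0.30], [0.34, 0.33, 0.33]])

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)  # the confident, correct predictions score lower

# Batches per epoch: 10% of the 60,000 samples are held out for validation,
# and the remaining samples are processed 32 at a time
n_train = 60000 - 60000 // 10
steps_per_epoch = math.ceil(n_train / 32)
print(n_train, steps_per_epoch)
```

Only the probability assigned to the true class contributes to the loss, so a model is penalized for being unsure even when its top prediction is correct.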

**Step 5**: Evaluate and Predict

In [36]:
# Evaluate on test set
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Test Accuracy: {test_acc:.4f}")

# Predict on a sample
predictions = model.predict(X_test[:5])
predicted_labels = np.argmax(predictions, axis=1)
actual_labels = np.argmax(y_test[:5], axis=1)
print(f"Predicted: {predicted_labels}, Actual: {actual_labels}")

313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9742 - loss: 0.0836
Test Accuracy: 0.9768
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 70ms/step
Predicted: [7 2 1 0 4], Actual: [7 2 1 0 4]


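**Output**: This run reached a test accuracy of 0.9768 after 10 epochs. Expect roughly 97-98%, with small differences between runs because weight initialization and batch shuffling are random. The first five test images were all classified correctly: Predicted: [7 2 1 0 4], Actual: [7 2 1 0 4].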
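
The decoding step used above (`np.argmax` over the softmax outputs) and the accuracy metric can be illustrated on a few invented probability rows. The values in `probs` are made up for the example and are not model output; the last lines show where the `313` in the evaluation progress bar comes from, given that `model.evaluate` uses a batch size of 32 by default.

```python
import math
import numpy as np

# Made-up softmax outputs for three samples over 10 classes (each row sums to 1)
probs = np.full((3, 10), 0.02)
probs[0, 7] = 0.82
probs[1, 2] = 0.82
probs[2, 1] = 0.82

# One-hot ground truth for the digits 7, 2 and 3
true_one_hot = np.eye(10)[[7, 2, 3]]

predicted = np.argmax(probs, axis=1)      # index of the highest probability
actual = np.argmax(true_one_hot, axis=1)  # position of the 1 in each one-hot row
accuracy = float(np.mean(predicted == actual))

print(predicted, actual, accuracy)

# 10,000 test images in batches of 32 give the step count in the progress bar
eval_batches = math.ceil(10000 / 32)
print(eval_batches)
```

Accuracy counts only whether the top class matches, so it ignores how confident the model was; the loss reported alongside it captures that part.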

**Machine Learning Approach with Scikit-Learn**

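Classical ML methods often depend on manually engineered features. To keep the comparison simple, we skip feature engineering here and train a Support Vector Machine (SVM) directly on the flattened pixel values.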

**Step 1**: Import and Load Data

In [37]:
from sklearn import svm
from sklearn.metrics import accuracy_score
from sklearn.datasets import fetch_openml

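# Load MNIST from OpenML (an alternative to the Keras loader)
# Named mnist_openml so it does not shadow the Keras `mnist` module imported earlier
mnist_openml = fetch_openml('mnist_784', version=1)
X, y = mnist_openml["data"], mnist_openml["target"]
# The first 60,000 rows are the conventional training set, the last 10,000 the test set
X_train, X_test = X[:60000], X[60000:]
y_train, y_test = y[:60000], y[60000:]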

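**Concept**: `fetch_openml` downloads the dataset from OpenML, with each image already flattened to 784 pixel columns and the labels stored as strings. Pixel values are still in the range 0-255, so we normalize them in the next step.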
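
One practical detail of this loader is visible in the output printed later in this section: the labels come back as text (`'7'`), not integers (`7`). The sketch below uses a small hypothetical array, `labels_as_text`, to show why that matters when comparing against integer labels such as those from the Keras loader, and how a cast resolves it.

```python
import numpy as np

# Hypothetical labels in the text form seen in the SVM output ('7', '2', ...)
labels_as_text = np.array(["7", "2", "1", "0", "4"])

# Cast to integers when they need to be compared with integer labels
labels_as_int = labels_as_text.astype(int)

# A text label never equals the corresponding integer
text_matches_int = bool(labels_as_text[0] == 7)
int_matches_int = bool(labels_as_int[0] == 7)

print(labels_as_text.dtype, labels_as_int.dtype)
print(text_matches_int, int_matches_int)
```

The SVM itself works with string labels, so the cast is only needed when mixing labels from the two loaders, for example when scoring both models against one set of ground-truth values.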

**Step 2**: Preprocess Data

In [39]:
# Normalize
X_train = X_train / 255.0
X_test = X_test / 255.0

**Step 3**: Train Model

In [40]:
# Use SVM with RBF kernel
clf = svm.SVC(kernel='rbf', gamma='scale')
clf.fit(X_train, y_train)

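**Concept**: An SVM finds the maximum-margin hyperplane that separates classes; for more than two classes, `SVC` combines pairwise (one-vs-one) classifiers. The RBF kernel lets it learn non-linear decision boundaries.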
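
The RBF kernel measures similarity as K(x, z) = exp(-gamma * ||x - z||^2). The sketch below evaluates it by hand on random stand-in data to show how similarity decays with distance. `rbf_kernel_value` and `X_demo` are names invented for this example, and the `gamma_scale` line follows the scikit-learn definition of `gamma='scale'` as 1 / (n_features * X.var()).

```python
import numpy as np

def rbf_kernel_value(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2) for two 1D vectors."""
    diff = np.asarray(x, dtype=float) - np.asarray(z, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

rng = np.random.default_rng(0)

# Hypothetical stand-in for six rows of scaled pixel data (784 features each)
X_demo = rng.random((6, 784))

# Value that gamma='scale' would take for this data
gamma_scale = 1.0 / (X_demo.shape[1] * X_demo.var())

same = rbf_kernel_value(X_demo[0], X_demo[0], gamma_scale)         # identical points
near = rbf_kernel_value(X_demo[0], X_demo[0] + 0.01, gamma_scale)  # slightly shifted copy
far = rbf_kernel_value(X_demo[0], X_demo[1], gamma_scale)          # unrelated random row

print(gamma_scale)
print(same, near, far)  # similarity shrinks as the points move apart
```

Because the kernel depends on raw distances between feature vectors, putting all pixels on the same [0, 1] scale before fitting keeps `gamma` in a sensible range.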

**Step 4**: Predict and Evaluate

In [43]:
# Predict
y_pred = clf.predict(X_test[:5])
print(f"Predicted: {y_pred}, Actual: {y_test[:5].values}")

# Accuracy on the first 1,000 test images (a subset, to keep prediction fast)
accuracy = accuracy_score(y_test[:1000], clf.predict(X_test[:1000]))
print(f"Test Accuracy: {accuracy:.4f}")

Predicted: ['7' '2' '1' '0' '4'], Actual: ['7', '2', '1', '0', '4']
Categories (10, object): ['0', '1', '2', '3', ..., '6', '7', '8', '9']
Test Accuracy: 0.9720


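**Output**: This run scored 0.9720 on the first 1,000 test images, so the figure is not directly comparable with the Keras result, which was measured on all 10,000. Expect roughly 95-98%. Training is also slower than for the DL model.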