# MNIST Digit Recognition with Random Forest
**Authors:** Magudeshwaran and Senthilkumaran

**Goal:** Build a Random Forest classifier that predicts handwritten digits from the MNIST dataset.

### Step 1: Import Libraries and Load Data

We'll start by importing the necessary libraries and loading the MNIST dataset from `tensorflow.keras.datasets`.

In [11]:
import numpy as np
import pandas as pd
from tensorflow.keras.datasets import mnist

# Load the MNIST dataset
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Check the shape of the data
print(f'Train images shape: {train_images.shape}')
print(f'Train labels shape: {train_labels.shape}')
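
Beyond the shapes, it can help to check the pixel range and the class balance before training. The sketch below is one way to do that. The helper name `summarize_split` is hypothetical, and the arrays are a synthetic stand-in with the same layout as the MNIST arrays (`uint8`, 28x28), so the block runs without downloading anything.

```python
import numpy as np

def summarize_split(images, labels):
    """Collect basic facts about one dataset split (hypothetical helper)."""
    return {
        "n_samples": images.shape[0],
        "image_shape": images.shape[1:],
        "dtype": str(images.dtype),
        "pixel_min": int(images.min()),
        "pixel_max": int(images.max()),
        # Number of examples for each digit 0-9
        "class_counts": np.bincount(labels, minlength=10).tolist(),
    }

# Synthetic stand-in data, NOT real MNIST: 20 random 28x28 uint8 images
rng = np.random.default_rng(0)
demo_images = rng.integers(0, 256, size=(20, 28, 28), dtype=np.uint8)
demo_labels = np.arange(20) % 10  # two examples per digit

summary = summarize_split(demo_images, demo_labels)
print(summary)
```

With the real data, the same call would be `summarize_split(train_images, train_labels)`.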

### Step 2: Preprocess the Data

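Each image is a 28x28 matrix of integer pixel intensities from 0 to 255. We flatten each image into a 1D array of 784 values and divide by 255 so that the pixel values lie between 0 and 1. Tree-based models do not need this scaling, but any image passed to the model later must be scaled the same way. We'll also create pandas DataFrames for easier manipulation.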

In [12]:
# Flatten each 28x28 image to 784 values and scale pixels to [0, 1].
# Using len(...) and -1 avoids hard-coding the dataset sizes.
train_images = train_images.reshape((len(train_images), -1)).astype('float32') / 255
test_images = test_images.reshape((len(test_images), -1)).astype('float32') / 255

# Convert to DataFrame for easier handling
train_df = pd.DataFrame(train_images)
train_df['label'] = train_labels

test_df = pd.DataFrame(test_images)
test_df['label'] = test_labels
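
To make the flattening step concrete, here is a small sketch on a two-image batch. The function name `flatten_and_scale` is an illustrative assumption, not part of the notebook. It shows that a pixel at row `r`, column `c` ends up at index `r * 28 + c` of the flattened vector, and that 255 maps to 1.0.

```python
import numpy as np

def flatten_and_scale(images):
    """Flatten (n, h, w) uint8 images to (n, h*w) float32 values in [0, 1]."""
    n_samples = images.shape[0]
    return images.reshape(n_samples, -1).astype("float32") / 255

# Two blank 28x28 images with one lit pixel each
batch = np.zeros((2, 28, 28), dtype=np.uint8)
batch[0, 0, 0] = 255    # top-left pixel of image 0
batch[1, 27, 27] = 128  # bottom-right pixel of image 1

flat = flatten_and_scale(batch)
print(flat.shape)  # (2, 784)
print(flat[0, 0 * 28 + 0], flat[1, 27 * 28 + 27])
```

Flattening discards the 2D layout, which is acceptable here because a random forest treats every pixel as an independent feature.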

### Step 3: Train the Random Forest Classifier

Now, we'll train a `RandomForestClassifier` on the training data.

In [13]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

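# Initialize the Random Forest with 100 trees.
# random_state fixes the seed so results are reproducible; n_jobs=-1 uses all CPU cores.
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42, n_jobs=-1)
# Train on the 784 pixel columns; 'label' is the target
rf_classifier.fit(train_df.drop('label', axis=1), train_df['label'])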
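
A random forest can also estimate its own generalization accuracy during training from the out-of-bag (OOB) samples, the rows each tree did not see in its bootstrap sample. The sketch below illustrates this with scikit-learn's `oob_score=True` option. As an assumption made to keep the example small and offline, it uses scikit-learn's bundled 8x8 digits dataset (`load_digits`) as a stand-in for MNIST, and the variable name `forest` is illustrative.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

# Stand-in data: 1,797 images of 8x8 digits with pixel values 0-16 (not MNIST)
X, y = load_digits(return_X_y=True)
X = X / 16.0  # scale to [0, 1], mirroring the /255 step used for MNIST

# oob_score=True scores each sample using only trees that never trained on it
forest = RandomForestClassifier(n_estimators=100, oob_score=True, random_state=0)
forest.fit(X, y)

print(f"OOB accuracy estimate: {forest.oob_score_:.3f}")
```

The OOB estimate gives a validation signal without touching the test set, which is useful when comparing settings such as `n_estimators`.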

### Step 4: Evaluate the Model

Let's evaluate the accuracy of our model on the test data.

In [14]:
# Make predictions on test data
predictions = rf_classifier.predict(test_df.drop('label', axis=1))

# Calculate accuracy
accuracy = accuracy_score(test_df['label'], predictions)
print(f'Accuracy: {accuracy * 100:.2f}%')
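
A single accuracy number hides which digits get confused with each other. A confusion matrix breaks the result down per class. The sketch below uses a tiny hand-made set of labels (not model output) so the numbers are easy to follow.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hand-made example labels, not real predictions
y_true = np.array([3, 3, 5, 5, 5, 8])
y_pred = np.array([3, 5, 5, 5, 3, 8])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred, labels=[3, 5, 8])
print(cm)

# Recall per class: correct predictions divided by the number of true examples
per_class_recall = cm.diagonal() / cm.sum(axis=1)
print(per_class_recall)
```

For the notebook's model, the equivalent call would be `confusion_matrix(test_df['label'], predictions)`.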

### Step 5: Predict on a Custom Image

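The function below loads an image file, converts it to grayscale, resizes it to 28x28 pixels, and applies the same flattening and scaling used for the training data before asking the model for a prediction. MNIST digits are light strokes on a dark background, so an image with a dark digit on a light background should be inverted first; otherwise the prediction is likely to be unreliable.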

In [21]:
from PIL import Image

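def predict_image(image_path):
    """Return the digit the trained model predicts for the image file at image_path."""
    # Load and preprocess the image the same way as the training data
    img = Image.open(image_path).convert('L')  # Convert to 8-bit grayscale
    img = img.resize((28, 28))  # Resize to 28x28 pixels
    img_array = np.array(img).reshape(1, -1).astype('float32') / 255  # Flatten to (1, 784) and scale to [0, 1]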
    
    # Make prediction
    prediction = rf_classifier.predict(img_array)
    return prediction[0]

# Example usage:
# image_path = 'image.png'  # Replace with your image path
# predicted_digit = predict_image(image_path)
# print(f'Predicted digit: {predicted_digit}')
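
Because the training digits are light strokes on a dark background, a photo or scan of ink on paper usually needs its colors inverted before prediction. The sketch below shows one simple way to detect and fix this. The helper `match_mnist_polarity` and its mean-brightness threshold of 127 are assumptions for illustration, not part of the notebook, and the heuristic can fail on images where the digit fills most of the frame.

```python
import numpy as np

def match_mnist_polarity(gray):
    """Return a uint8 grayscale array with a light digit on a dark background.

    Heuristic (an assumption): if the image is bright on average, the
    background is light, so the image is inverted.
    """
    if gray.mean() > 127:
        return 255 - gray
    return gray

# Synthetic "ink on paper" image: white page with a dark vertical stroke
paper = np.full((28, 28), 255, dtype=np.uint8)
paper[4:24, 13:15] = 0

fixed = match_mnist_polarity(paper)
print(fixed[0, 0], fixed[10, 13])  # background is now dark, stroke is light
```

In `predict_image`, a step like this would sit between the grayscale conversion and the flattening, applied to `np.array(img)`.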