# **MNIST Image Classification with Modlee: An End-to-End Tutorial**

We’ll walk through an end-to-end project using the Modlee package for image classification. We’ll use the MNIST dataset to demonstrate how to:


1. Use the Modlee recommender to get a recommended model.
2. Train and evaluate the recommended model on the MNIST dataset.
3. Implement a custom model, then train and evaluate it.
4. Compare the performance of the Modlee-recommended model with our custom model.

## Tips

For best performance, ensure that the runtime is set to use a GPU (`Runtime > Change runtime type > T4 GPU`).

## Help & Questions

If you have any questions, please reach out on our [Discord](https://discord.gg/dncQwFdN9m).

You can also use our [documentation](https://docs.modlee.ai/README.html) as a reference for using our package.

# **Environment Setup**
## Step 1:

First, we need to make sure that we have the necessary packages installed. We will need `modlee` and its related packages.

In [None]:
# Install required packages
!pip install modlee torch torchvision pytorch-lightning

# This should take a few minutes, thanks for your patience!

## Step 2:

We will import the necessary libraries, including `modlee` for model recommendation and `torch` for handling neural networks.

We will also set our Modlee API key and initialize the Modlee package.
Make sure that you have a Modlee account and an API key [from the dashboard](https://www.dashboard.modlee.ai/).
Replace `replace-with-your-api-key` with your API key.

In [1]:
import os
import lightning.pytorch as pl
import torchvision.transforms as transforms
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader
import torch
import modlee

# Set your API key

os.environ['MODLEE_API_KEY'] = "replace-with-your-api-key"

# Initialize the Modlee package
modlee.init(api_key=os.environ.get('MODLEE_API_KEY'))
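
A key typed directly into a notebook is easy to leak when the notebook is shared. As a minimal sketch, the helper below reads the key from the `MODLEE_API_KEY` environment variable and fails early if it is missing or still set to the placeholder. `load_api_key` and `PLACEHOLDER` are hypothetical names used for illustration; they are not part of the Modlee package.

```python
import os

PLACEHOLDER = "replace-with-your-api-key"

def load_api_key(env=None, var_name="MODLEE_API_KEY"):
    """Return the API key stored in `var_name`, or raise if it is unusable.

    Hypothetical helper for illustration; it is not part of Modlee.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key or key == PLACEHOLDER:
        raise RuntimeError(
            f"Set {var_name} to your Modlee API key before calling modlee.init()."
        )
    return key

# Usage (assumes the variable was exported before the notebook started):
# modlee.init(api_key=load_api_key())
```

Passing a plain dictionary as `env` makes the helper easy to check without touching the real environment.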



# **Dataset Preparation**
## Step 1:

We will define the transformations that preprocess the images into a format the model can use.

These transformations resize the images to 224x224 pixels, replicate the single grayscale channel into three channels, convert the images to tensors, and normalize the pixel values to the range [-1, 1], which helps the model train more effectively.




In [2]:
transform = transforms.Compose([
    transforms.Resize((224, 224)),  # Resize images to 224x224 pixels
    transforms.Grayscale(num_output_channels=3),  # Replicate the grayscale channel into 3 channels
    transforms.ToTensor(),          # Convert images to tensors with values in [0, 1]
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # Normalize each channel with mean 0.5 and std 0.5
])
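
To make the normalization step concrete, here is a small sketch in plain Python of the per-channel formula that `transforms.Normalize` applies, `(x - mean) / std`. The `normalize` function is an illustrative stand-in, not the torchvision implementation.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Apply the per-channel Normalize formula: (x - mean) / std."""
    return (pixel - mean) / std

# ToTensor scales 8-bit pixel values into [0, 1] before Normalize runs.
for raw in (0, 128, 255):
    scaled = raw / 255
    print(raw, round(scaled, 3), round(normalize(scaled), 3))
```

With a mean and standard deviation of 0.5, the darkest pixel maps to -1 and the brightest to 1, so the inputs are centered around zero.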

## Step 2:

We will load the MNIST dataset with the specified transformations. The MNIST dataset is a collection of grayscale images of handwritten digits from 0 to 9. Here, the dataset is downloaded and loaded twice: the training split is used for training, and the held-out test split (`train=False`) is used for validation. Both apply the previously defined transformations.

In [3]:
train_dataset = MNIST(  # downloads (if needed) and loads the MNIST images
    root='./data',
    train=True,  # load the training split
    download=True,
    transform=transform)  # apply the transformations defined earlier

val_dataset = MNIST(
    root='./data',
    train=False,  # load the held-out test split, used here for validation
    download=True,
    transform=transform)








## Step 3:

Next, we create dataloaders for the training and validation data. Each dataloader serves the images in batches of 4, and the training loader reshuffles the data every epoch.


In [4]:
train_loader = DataLoader( #this tool loads the data
    train_dataset,
    batch_size=4, #we will load the images in groups of 4
    shuffle=True)

val_dataloader = DataLoader(
    val_dataset,
    batch_size=4)
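
The batch size determines how many steps one epoch takes. The sketch below works this out in plain Python; `num_batches` is a hypothetical helper that mirrors how a dataloader counts batches, with the final partial batch kept unless `drop_last` is set.

```python
import math

def num_batches(num_samples, batch_size, drop_last=False):
    """Number of batches a dataloader yields per epoch."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)

# MNIST ships 60,000 training images and 10,000 test images.
for batch_size in (4, 32, 128):
    print(batch_size, num_batches(60_000, batch_size), num_batches(10_000, batch_size))
```

With a batch size of 4, one epoch over the training split takes 15,000 steps, so a larger batch size may be worth trying if training is slow.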


# **Getting a Model Recommendation**

Now, let's use Modlee to recommend a model based on our data and task. We will create a Modlee recommender object and fit it to the dataset. The server will return a recommended model based on dataset metafeatures.


In [5]:
# create a Modlee recommender object
recommender = modlee.recommender.ImageClassificationRecommender(
    num_classes=10  # MNIST has 10 classes (digits 0 to 9)
)

# recommender analyzes training data to suggest best model
recommender.fit(train_loader)

#retrieves the recommended model
modlee_model = recommender.model
print(f"\nRecommended model: \n{modlee_model}")

INFO:Analyzing dataset based on data metafeatures...
INFO:Finished analyzing dataset.



Recommended model: 
RecommendedModel(
  (model): GraphModule(
    (Conv): Conv2d(3, 3, kernel_size=(1, 1), stride=(1, 1))
    (Conv_1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3))
    (Relu): ReLU()
    (MaxPool): MaxPool2d(kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], ceil_mode=False)
    (Conv_2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (Relu_1): ReLU()
    (Conv_3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (Add): OnnxBinaryMathOperation()
    (Relu_2): ReLU()
    (Conv_4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (Relu_3): ReLU()
    (Conv_5): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (Add_1): OnnxBinaryMathOperation()
    (Relu_4): ReLU()
    (Conv_6): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (Relu_5): ReLU()
    (Conv_7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (

# **Training the Model**

The next step is to train the recommended model using PyTorch Lightning. Inside a `modlee.start_run()` context, the `Trainer` object trains `modlee_model` for one epoch.


In [6]:
with modlee.start_run() as run:
  trainer = pl.Trainer(max_epochs=1)
  trainer.fit( #starts training using recommended model and training data
      model=modlee_model,
      train_dataloaders=train_loader,
      val_dataloaders=val_dataloader
  )


  | Name  | Type        | Params | Mode 
----------------------------------------------
0 | model | GraphModule | 11.7 M | train
----------------------------------------------
11.7 M    Trainable params
0         Non-trainable params
11.7 M    Total params
46.779    Total estimated model params size (MB)


Training: |          | 0/? [00:00<?, ?it/s]                                

INFO:Logging data metafeatures with <class 'modlee.data_metafeatures.DataMetafeatures'>
Detected KeyboardInterrupt, attempting graceful shutdown...


# **Evaluate the Model**

Now, we evaluate the model on the validation set using the `validate` method of the trainer.

In [None]:
trainer.validate(model=modlee_model, dataloaders=val_dataloader)

INFO: LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]


Validation: |          | 0/? [00:00<?, ?it/s]

[{'val_loss': 2.4816603660583496}]
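
A useful reference point when reading a loss value is the loss of a model that guesses all ten digits with equal probability. The sketch below computes it, assuming `val_loss` is a cross-entropy loss (an assumption; this tutorial does not state which loss the recommended model uses).

```python
import math

def cross_entropy(probabilities, target):
    """Cross-entropy for one sample: -log of the probability given to the true class."""
    return -math.log(probabilities[target])

num_classes = 10
uniform = [1.0 / num_classes] * num_classes  # guesses every digit equally
chance_loss = cross_entropy(uniform, target=3)
print(f"chance-level loss: {chance_loss:.4f}")
```

Under that assumption, a validation loss near or above this value suggests the model is not yet doing better than chance and needs more training.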

# **Custom Model Implementation**

Now, we'll define a custom CNN model, train it, and evaluate its performance. This model includes convolutional layers to extract features from images, followed by fully connected layers for classification. The forward method specifies how data flows through the network, using ReLU activations, max pooling, and flattening operations.

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F

# Define a simple Convolutional Neural Network (CNN) for image classification
class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        # First convolutional layer: takes 3 input channels (the transform replicates
        # MNIST's single grayscale channel), outputs 32 feature maps, with a 3x3 kernel and padding of 1
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        # Second convolutional layer: takes 32 input channels,
        # outputs 64 feature maps, with a 3x3 kernel and padding of 1
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        # Fully connected layer: input size is 64*56*56 after flattening
        # (224 -> 112 -> 56 through the two 2x2 poolings), outputs 128 features
        self.fc1 = nn.Linear(64 * 56 * 56, 128)  # Adjust input size if the image dimensions change
        # Final fully connected layer: maps 128 features to 10 output classes (for MNIST)
        self.fc2 = nn.Linear(128, 10)  # 10 classes for MNIST

    def forward(self, x):
        # Apply the first convolutional layer followed by ReLU activation
        x = F.relu(self.conv1(x))
        # Apply max pooling with a 2x2 kernel and stride of 2
        x = F.max_pool2d(x, kernel_size=2, stride=2)
        # Apply the second convolutional layer followed by ReLU activation
        x = F.relu(self.conv2(x))
        # Apply max pooling with a 2x2 kernel and stride of 2
        x = F.max_pool2d(x, kernel_size=2, stride=2)
        # Flatten the tensor from 4D to 2D (batch size, flattened features)
        x = x.view(x.size(0), -1)  # Flatten
        # Apply the first fully connected layer followed by ReLU activation
        x = F.relu(self.fc1(x))
        # Apply the second fully connected layer to produce the final output
        x = self.fc2(x)
        return x
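
The input size of `fc1` depends on the image resolution. The following sketch traces the spatial size through the two convolution and pooling stages using the standard output-size formula; `conv_out` and `flattened_features` are hypothetical helpers written for this illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

def flattened_features(image_size, channels=64):
    """Trace SimpleCNN's two conv + pool stages and return the size fed to fc1."""
    size = image_size
    for _ in range(2):
        size = conv_out(size, kernel=3, padding=1)  # 3x3 conv with padding 1 keeps the size
        size = conv_out(size, kernel=2, stride=2)   # 2x2 max pool halves it
    return channels * size * size

print(flattened_features(224))  # images resized to 224x224
print(flattened_features(28))   # native MNIST resolution
```

If the `Resize` transform is changed or removed, the first argument of `nn.Linear` in `fc1` has to change to the matching value.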

# **Define the PyTorch Lightning Module**
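We wrap the CNN model in a PyTorch Lightning module for training and validation. This module includes methods for the forward pass, computing the loss during training and validation, and configuring the optimizer.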

In [None]:
import lightning.pytorch as pl  # same import style as the setup cell
from torch.optim import Adam
import torch
import torch.nn as nn

# Define a PyTorch Lightning module for the model
class LitModel(pl.LightningModule):
    def __init__(self, model):
        super(LitModel, self).__init__()
        self.model = model  # The model to be trained
        self.loss_fn = nn.CrossEntropyLoss()  # Loss function for classification

    def forward(self, x):
        # Forward pass through the model
        return self.model(x)

    def training_step(self, batch, batch_idx):
        # Perform a single training step
        x, y = batch  # Unpack the input and target labels from the batch
        y_hat = self(x)  # Get model predictions
        loss = self.loss_fn(y_hat, y)  # Compute the loss
        return loss  # Return the loss for optimization

    def validation_step(self, batch, batch_idx):
        # Perform a single validation step
        x, y = batch  # Unpack the input and target labels from the batch
        y_hat = self(x)  # Get model predictions
        loss = self.loss_fn(y_hat, y)  # Compute the loss
        # Calculate accuracy
        acc = torch.sum(torch.argmax(y_hat, dim=1) == y).float() / y.size(0)
        # Log validation loss and accuracy
        self.log('val_loss', loss)
        self.log('val_acc', acc)
        return {'val_loss': loss, 'val_acc': acc}  # Return metrics for logging

    def configure_optimizers(self):
        # Configure the optimizer for training
        return Adam(self.model.parameters(), lr=1e-3)  # Adam optimizer with a learning rate of 0.001
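
The accuracy line in `validation_step` takes the highest-scoring class for each sample and counts how often it matches the label. Here is the same calculation as a plain-Python sketch, using made-up scores for a batch of four samples and three classes.

```python
def batch_accuracy(logits, targets):
    """Fraction of samples whose highest-scoring class matches the target."""
    correct = 0
    for scores, target in zip(logits, targets):
        predicted = max(range(len(scores)), key=scores.__getitem__)  # argmax over classes
        correct += int(predicted == target)
    return correct / len(targets)

# Illustrative values only: 4 samples, 3 classes.
logits = [[0.1, 2.0, -1.0], [1.5, 0.2, 0.3], [0.0, 0.1, 3.0], [0.9, 0.8, 0.7]]
targets = [1, 0, 2, 1]
print(batch_accuracy(logits, targets))
```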

# **Recreate the Dataloaders**

We recreate the training and validation dataloaders with the same settings used for the recommended model, so both models see the data in the same way.

In [None]:
train_loader = DataLoader(train_dataset, batch_size=4, shuffle=True)
val_dataloader = DataLoader(val_dataset, batch_size=4)

# **Train the Custom Model**

We will train the custom model using PyTorch Lightning. We initialize the `LitModel` with our custom CNN, then configure a trainer to handle the training and validation processes, setting it to run for one epoch.


In [None]:
# Create an instance of the LitModel with an instance of the SimpleCNN model
model = SimpleCNN()
lit_model = LitModel(model)

# Initialize the PyTorch Lightning trainer
trainer = pl.Trainer(max_epochs=1)  # Set the number of epochs for training

# Start the training process
trainer.fit(
    model=lit_model,            # Pass the LitModel instance to the trainer
    train_dataloaders=train_loader,  # Provide the training data loader
    val_dataloaders=val_dataloader   # Provide the validation data loader
)

Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

# **Evaluate the Custom Model**

Now, we evaluate the custom model on the validation set using the `validate` method of the trainer.


In [None]:
trainer.validate(model=lit_model, dataloaders=val_dataloader)

Validation: |          | 0/? [00:00<?, ?it/s]

[{'val_loss': 0.07614019513130188, 'val_acc': 0.9781000018119812}]

# **Compare Models**

Finally, compare the performance of the Modlee-recommended model with the custom model by examining the metrics that `trainer.validate` reported for each on the validation set.
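
One way to line the results up is sketched below. `summarize` is a hypothetical helper, and the numbers are copied from the validation outputs shown earlier in this tutorial; the recommended-model run reports only `val_loss`, so its accuracy is shown as `n/a`.

```python
def summarize(results_by_model):
    """Build table rows from the lists that trainer.validate() returns."""
    metrics = sorted({name for results in results_by_model.values() for name in results[0]})
    rows = []
    for model_name, results in results_by_model.items():
        values = [results[0].get(metric) for metric in metrics]
        rows.append((model_name, values))
    return metrics, rows

# Values copied from the validation outputs shown earlier in this tutorial.
results = {
    "modlee_recommended": [{"val_loss": 2.4816603660583496}],
    "custom_cnn": [{"val_loss": 0.07614019513130188, "val_acc": 0.9781000018119812}],
}

metrics, rows = summarize(results)
print("model".ljust(20), *[m.ljust(10) for m in metrics])
for model_name, values in rows:
    cells = ["n/a".ljust(10) if v is None else f"{v:.4f}".ljust(10) for v in values]
    print(model_name.ljust(20), *cells)
```

In a live notebook, the dictionaries would come from the return values of the two `trainer.validate` calls rather than being typed in by hand.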


# **Amazing work!**

We've successfully walked through a complete machine learning project using the Modlee package for image classification. We demonstrated how to:

- Use Modlee to recommend and train a model for MNIST image classification.
- Implement and train a custom CNN model.
- Evaluate and compare the performance of both models.

By following these steps, you should now have a solid understanding of how to leverage Modlee for model recommendation and how to build and train custom models. The comparison between the recommended and custom models will help you understand the strengths and weaknesses of each approach.

This is a great start to building and training machine learning models. Keep experimenting and learning!