# **Modlee Model Recommendation Example Walkthrough**

In this walkthrough, we will use the Modlee package to automatically recommend a model for image classification on the CIFAR-10 dataset.


We'll go through the process step by step: importing the necessary libraries, setting up the dataset, using Modlee for model recommendation, and training the recommended model.

## Tips

For best performance, ensure that the runtime is set to use a GPU (`Runtime > Change runtime type > T4 GPU`).

## Help & Questions

If you have any questions, please reach out on our [Discord](https://discord.gg/dncQwFdN9m).

You can also use our [documentation](https://docs.modlee.ai/README.html) as a reference for using our package.

# **Environment Setup**
## Step 1:

First, we need to make sure that we have the necessary packages installed. We will need `modlee` and its related packages.
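
As a quick check before continuing, the sketch below reports which of the packages imported later in this walkthrough are missing from the current environment. The helper name `find_missing` is made up for illustration, and the suggested install command assumes each package is published on PyPI under its import name.

```python
import importlib.util

# Top-level packages imported later in this walkthrough.
REQUIRED_PACKAGES = ["modlee", "torch", "torchvision", "lightning"]


def find_missing(packages):
    """Return the packages that cannot be found in the current environment."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_PACKAGES)
if missing:
    # Assumption: each package installs from PyPI under the same name.
    print("Missing packages, try: pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```

If anything is reported missing, install it and restart the runtime before moving on to Step 2.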



## Step 2:

We will import the necessary libraries, including `modlee` for model recommendation and `torch` for handling neural networks.

We will also set our Modlee API key and initialize the Modlee package.
Make sure that you have a Modlee account and an API key [from the dashboard](https://www.dashboard.modlee.ai/).
Replace `replace-with-your-api-key` with your API key.
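
One way to avoid committing a real key to a notebook is to read it from the environment and fail early if it is missing. The following is a minimal sketch; `get_api_key` and `PLACEHOLDER` are hypothetical names introduced here, not part of Modlee.

```python
import os

# The placeholder string used in this walkthrough.
PLACEHOLDER = "replace-with-your-api-key"


def get_api_key(env_var="MODLEE_API_KEY"):
    """Read the API key from the environment, rejecting missing or placeholder values."""
    key = os.environ.get(env_var, "").strip()
    if not key or key == PLACEHOLDER:
        raise ValueError(f"Set {env_var} to your Modlee API key before initializing Modlee.")
    return key
```

With the variable set outside the notebook, the initialization call becomes `modlee.init(api_key=get_api_key())`.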

In [1]:
import os
import lightning.pytorch as pl

# Set your API key (never commit a real key to a shared notebook)
os.environ['MODLEE_API_KEY'] = "replace-with-your-api-key"

import torch, torchvision
import torchvision.transforms as transforms
import modlee

# Initialize the Modlee package
modlee.init(api_key=os.environ.get('MODLEE_API_KEY'))



# **Dataset Preparation**
## Step 1:

We will define the transformations for the dataset.
Transformations are instructions for preparing the images: they convert each image into a format that our model can understand.
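
As a rough illustration of what the two transformations in the next cell do to a single pixel value, the sketch below reproduces their arithmetic in plain Python. `ToTensor` scales 8-bit pixel values from [0, 255] to [0, 1], and `Normalize` with a mean and standard deviation of 0.5 then maps that range to [-1, 1]. The function names are illustrative only.

```python
def to_unit_range(pixel):
    """Mimic the scaling ToTensor applies to 8-bit images: [0, 255] -> [0.0, 1.0]."""
    return pixel / 255.0


def normalize(value, mean=0.5, std=0.5):
    """Mimic Normalize for one channel value: (value - mean) / std."""
    return (value - mean) / std


for pixel in (0, 128, 255):
    scaled = to_unit_range(pixel)
    print(pixel, round(scaled, 3), round(normalize(scaled), 3))
```

The darkest pixel (0) ends up at -1.0 and the brightest (255) at 1.0, so the inputs are centered around zero.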




In [2]:
transform = transforms.Compose([ #named `transform` so the `transforms` module is not overwritten
    transforms.ToTensor(), #converts images to PyTorch tensors with values in [0, 1]
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)) #rescales each color channel to the range [-1, 1]
    ])

## Step 2:

We will load the CIFAR-10 dataset, a collection of 60,000 small color images divided into 10 categories such as airplanes, automobiles, and birds.
These images will be used for training and validating a machine learning model.

In [3]:
train_dataset = torchvision.datasets.CIFAR10( #this command gets the CIFAR-10 images
    root='./data',
    train=True, #loads the training split of the dataset
    download=True,
    transform=transform) #applies the transformations defined earlier

val_dataset = torchvision.datasets.CIFAR10(
    root='./data',
    train=False, #loads the test split, which we use for validation
    download=True,
    transform=transform)

Files already downloaded and verified
Files already downloaded and verified


## Step 3:

Next, we will create dataloaders for the training and validation data. The dataloaders feed the data to the model in batches rather than all at once.


In [4]:
train_dataloader = torch.utils.data.DataLoader( #this tool loads the data
    train_dataset,
    batch_size=16, #we will load the images in groups of 16
)

val_dataloader = torch.utils.data.DataLoader(
    val_dataset,
    batch_size=16
)
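
To get a feel for what a batch size of 16 means, the sketch below computes how many batches each dataloader yields per epoch. It assumes the standard CIFAR-10 split sizes (50,000 training images and 10,000 test images); `num_batches` is a hypothetical helper.

```python
import math


def num_batches(num_samples, batch_size, drop_last=False):
    """Number of batches a dataloader yields in one pass over the data."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


train_batches = num_batches(50_000, 16)  # CIFAR-10 training split
val_batches = num_batches(10_000, 16)    # CIFAR-10 test split, used here for validation
print(train_batches, val_batches)
```

A larger batch size means fewer steps per epoch but more memory per step.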

# **Getting a Model Recommendation**

Now, let's use Modlee to recommend a model based on our data and task. We will create a Modlee recommender object and fit it to the dataset. The server will return a recommended model based on dataset metafeatures.


In [5]:
# create a Modlee recommender object
recommender = modlee.recommender.from_modality_task(
    modality='image', #tells the recommender that we are working with images
    task='classification', #tells the recommender that our task is classification
    )

# recommender analyzes training data to suggest best model
recommender.fit(train_dataloader)

#retrieves the recommended model
modlee_model = recommender.model
print(f"\nRecommended model: \n{modlee_model}")

INFO:Analyzing dataset based on data metafeatures...
INFO:Finished analyzing dataset.
INFO:The model is available at the recommender object's `model` attribute.



Recommended model: 
RecommendedModel(
  (model): GraphModule(
    (Conv): Conv2d(3, 3, kernel_size=(1, 1), stride=(1, 1))
    (Conv_1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3))
    (Relu): ReLU()
    (MaxPool): MaxPool2d(kernel_size=[3, 3], stride=[2, 2], padding=[1, 1], dilation=[1, 1], ceil_mode=False)
    (Conv_2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (Relu_1): ReLU()
    (Conv_3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (Add): OnnxBinaryMathOperation()
    (Relu_2): ReLU()
    (Conv_4): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (Relu_3): ReLU()
    (Conv_5): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (Add_1): OnnxBinaryMathOperation()
    (Relu_4): ReLU()
    (Conv_6): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (Relu_5): ReLU()
    (Conv_7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (
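
To read the printed layers, it can help to trace how the image size shrinks through the network. The sketch below applies the usual output-size formula, `(size + 2 * padding - kernel) // stride + 1`, to the kernel, stride, and padding values printed above. It assumes the layers run in the printed order on 32x32 inputs with dilation 1, which the printout alone does not guarantee.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output height/width of a conv or pooling layer (dilation 1, floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) copied from the printed layers above.
# Assumption: the layers run in the printed order on 32x32 CIFAR-10 inputs.
LAYERS = [
    ("Conv", 1, 1, 0),
    ("Conv_1", 7, 2, 3),
    ("MaxPool", 3, 2, 1),
    ("Conv_2", 3, 1, 1),
    ("Conv_6", 3, 2, 1),
]

size = 32
sizes = {}
for name, kernel, stride, padding in LAYERS:
    size = conv_output_size(size, kernel, stride, padding)
    sizes[name] = size
    print(f"{name}: {size}x{size}")
```

The layers with a 3x3 kernel, stride 1, and padding 1 (such as `Conv_3` to `Conv_5`) keep the size unchanged, so they are left out of the list.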

# **Training the Model**

The next step is to train the recommended model using PyTorch Lightning. The `Trainer` object trains `modlee_model` for one epoch. The call is wrapped in `modlee.start_run()` so that the run's assets are saved.


In [6]:
with modlee.start_run() as run:
    trainer = pl.Trainer(max_epochs=1)
    trainer.fit( #starts training using recommended model and training data
        model=modlee_model,
        train_dataloaders=train_dataloader,
        val_dataloaders=val_dataloader
    )

LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]



  | Name  | Type        | Params | Mode 
----------------------------------------------
0 | model | GraphModule | 11.7 M | train
----------------------------------------------
11.7 M    Trainable params
0         Non-trainable params
11.7 M    Total params
46.779    Total estimated model params size (MB)



INFO:Logging data metafeatures...
INFO:Logging model as code (model_graph.py) and text (model_graph.txt)...


Epoch 0:  89%|████████▉ | 2793/3125 [01:04<00:07, 43.38it/s, v_num=1]
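
The parameter summary printed above can be sanity-checked with a line of arithmetic. The sketch below assumes each parameter is stored as a 32-bit float (4 bytes) and that one megabyte is counted as 10^6 bytes, which appears consistent with the reported figures; `estimated_size_mb` is a hypothetical helper.

```python
def estimated_size_mb(num_params, bytes_per_param=4):
    """Approximate in-memory size of the model weights in megabytes (10^6 bytes)."""
    return num_params * bytes_per_param / 1e6


size_mb = estimated_size_mb(11_700_000)  # 11.7 M parameters from the summary
print(f"{size_mb:.1f} MB")
```

The result is close to the reported 46.779 MB; the small gap comes from the parameter count being rounded to 11.7 M in the summary.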

# **View Saved Training Assets**
Finally, we can view the saved assets from training.

In [None]:
last_run_path = modlee.last_run_path()
print(f"Run path: {last_run_path}")
artifacts_path = os.path.join(last_run_path, 'artifacts')
artifacts = sorted(os.listdir(artifacts_path))
print(f"Saved artifacts: {artifacts}")

Run path: /home/ubuntu/projects/modlee/src/modlee/mlruns/0/b7e26fd833434066b64ec8d40170d7b9
Saved artifacts: ['cached_vars', 'checkpoints', 'data_metafeatures', 'model', 'model.py', 'model_graph.py', 'model_graph.txt', 'model_size', 'model_summary.txt', 'stats_rep', 'transforms.txt']
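
To look inside one of the text artifacts, such as `model_summary.txt`, a small reader like the one below may be enough. This is a sketch: `read_text_artifact` is a hypothetical helper, and the demonstration writes a stand-in file to a temporary directory so that it runs without a Modlee run on disk.

```python
import os
import tempfile


def read_text_artifact(artifacts_path, name, max_chars=500):
    """Return the first max_chars characters of a text artifact."""
    path = os.path.join(artifacts_path, name)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No text artifact named {name!r} in {artifacts_path}")
    with open(path, encoding="utf-8") as f:
        return f.read(max_chars)


# Demonstrate with a stand-in directory instead of a real run.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "model_summary.txt"), "w", encoding="utf-8") as f:
        f.write("stand-in summary")
    preview = read_text_artifact(demo_dir, "model_summary.txt")
print(preview)
```

In the notebook, the call would be `read_text_artifact(artifacts_path, 'model_summary.txt')`, using the `artifacts_path` defined in the cell above.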


# **Awesome job!**

We've set up the environment, prepared the data, obtained a model recommendation, and trained the recommended model on the CIFAR-10 dataset. This is a great start to building and training machine learning models. Keep experimenting and learning!