# **Energy Benchmarking of Deep Learning Workflows**

### **1.) Introduction to Energy Benchmarking with CodeCarbon**

Researchers in the domains of Green AI and AI efficiency have recently started to include "Energy Fact Sheets" for their proposed models in their papers, similar to how the nutrition facts label on food items lists what is in the food you eat.

This makes it possible for others to compare and grasp what kind of resources the researchers had at their disposal (GPUs, TPUs, ...), how large their models are (parameter count and FLOPs), how much energy was needed for a single training iteration or a single inference call, and sometimes also how many retraining runs they performed in total during development.

If measuring the energy usage directly in hardware is not possible or too much effort, software tools such as ***CodeCarbon*** are a good alternative, although they are generally less accurate.

***CodeCarbon*** allows you to track the energy usage of your code with an intuitive *wrapping mechanism*. You can either create an explicit `EmissionsTracker` object and wrap the code you want to benchmark between `tracker.start()` and `tracker.stop()`, or, if your code is already bundled into individual functions, use the built-in function decorator `@track_emissions`.

Both versions do the same things: 
1. Keep track of the starting time when you call `tracker.start()` or the decorated function
2. Get an initial measurement of the energy usage at the start
3. Start a scheduler in the background that takes a measurement every X seconds while your code runs (configurable via `measure_power_secs`)
4. Wait until you either stop the tracking with `tracker.stop()` or the decorated function returns
5. Take another measurement at the end
6. Collect and aggregate the measurement data for you *(and store or return it, depending on how you configured it)*
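
The six steps above can be illustrated with a toy tracker. This is a minimal sketch of the general sample-and-integrate idea, not CodeCarbon's actual implementation: `ToyEnergyTracker`, the `read_power_watts` callback, and the constant 50 W reading are hypothetical stand-ins for real hardware counters.

```python
import threading
import time


class ToyEnergyTracker:
    """Toy illustration of periodic power sampling (NOT CodeCarbon's code)."""

    def __init__(self, read_power_watts, interval_secs=0.05):
        self._read_power = read_power_watts  # hypothetical power sensor callback
        self._interval = interval_secs
        self._stop_event = threading.Event()
        self._thread = None
        self._start_time = None
        self._last_time = None
        self.energy_joules = 0.0
        self.num_measurements = 0
        self.duration_secs = 0.0

    def _measure(self):
        # Energy since the previous sample = current power * elapsed time.
        now = time.perf_counter()
        self.energy_joules += self._read_power() * (now - self._last_time)
        self._last_time = now
        self.num_measurements += 1

    def _run(self):
        # Step 3: sample every `interval_secs` until stop() is called.
        while not self._stop_event.wait(self._interval):
            self._measure()

    def start(self):
        # Steps 1 and 2: remember the start time and take a first reading.
        self._start_time = self._last_time = time.perf_counter()
        self._read_power()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        # Steps 5 and 6: stop the scheduler, take a final reading, aggregate.
        self._stop_event.set()
        self._thread.join()
        self._measure()
        self.duration_secs = self._last_time - self._start_time
        return self.energy_joules


def constant_power():
    return 50.0  # pretend the machine draws a constant 50 W


tracker = ToyEnergyTracker(constant_power)
tracker.start()
time.sleep(0.2)  # step 4: the "workload" runs while the tracker samples
energy = tracker.stop()
print(f"{energy:.2f} J over {tracker.duration_secs:.2f} s")
```

With a constant power draw, the accumulated energy is simply power times duration. Real tools read changing values from hardware interfaces, which is why the sampling interval affects accuracy.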


#### **The following two ways are examples of how you can use this in your code:**
##### **Version 1 - Tracker object**
```python
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(
    # ... configurations
)

tracker.start()
try:
    ...  # do something
finally:
    # stop() ends the measurement even if the tracked code raises
    tracker.stop()

results = tracker.final_emissions_data
```


##### **Version 2 - Function decorator**

```python
from codecarbon import track_emissions

@track_emissions(
    # ... configurations
)
def func():
    ...  # do something
```

To access the energy benchmark data, you can look into the `.csv` file that is created by default, where every row corresponds to one run from `start()` to `stop()`. Alternatively, and more useful if you want to work with the values directly in your code, you can read the `final_emissions_data` attribute of the tracker object. This only works if you use the tracker object explicitly, not the function decorator.
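
As a sketch of the CSV route, the snippet below sums the energy per project with Python's `csv` module. It assumes the file has the `project_name` and `energy_consumed` columns; the helper name `energy_per_project` and the two sample rows are made up for illustration, so check the header of your own file first.

```python
import csv
import os
import tempfile
from collections import defaultdict


def energy_per_project(csv_path):
    """Sum the `energy_consumed` column (kWh) for each `project_name`."""
    totals = defaultdict(float)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["project_name"]] += float(row["energy_consumed"])
    return dict(totals)


# Demo with a tiny hand-written file (made-up numbers, reduced column set).
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "emissions.csv")
    with open(demo_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["project_name", "duration", "emissions", "energy_consumed"])
        writer.writerow(["Model_Training", 120.0, 0.0004, 0.001])
        writer.writerow(["Model_Training", 110.0, 0.0003, 0.0008])
    totals = energy_per_project(demo_path)

print(totals)
```

Summing per project is useful here because several runs append rows to the same file.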

### **2.) Energy Benchmarking of Model Training and Testing**

Understanding the energy usage of AI models is important for reducing the large carbon footprint of modern AI and for making AI development more sustainable in general.

For this next part of the notebook, we will train deep learning models on the Oxford-IIIT Pet dataset (see [here](https://pytorch.org/vision/stable/generated/torchvision.datasets.OxfordIIITPet.html) and [here](https://www.robots.ox.ac.uk/~vgg/data/pets/)). Your task is to train a deep learning model that achieves an accuracy of at least 80%, and to find ways to make the training phase of the model as energy efficient as possible.

We will start by implementing the two helper functions `train_model()` and `test_model()`. These should make it easier and faster for you to iterate on your models and try different approaches. Once the basic functionality is implemented, you can extend them however you please.

#### **Getting Started**

1. Set up your Python virtual environment and run the import cell down below
2. Fill in the blanks in the `train_model()` function
3. Implement a basic first model to test and familiarize yourself with how the two functions work (also helps to get a baseline for the accuracy and energy usage)
4. Extend your models and the two functions to increase the energy efficiency 
    - have a look at ***Things you can consider*** down below to get some ideas where to start.
    - have a look at the basic implementation of the `test_model()` function to get an understanding of what we expect from you in the `train_model()` function that you have to implement yourself.

**Things you can consider:**

- Run model on CPU vs. GPU (see [here](https://pytorch.org/docs/stable/generated/torch.Tensor.to.html) and [here](https://pytorch.org/docs/stable/tensor_attributes.html#torch.device))
- Model architecture and size ([custom models](https://pytorch.org/tutorials/beginner/introyt/modelsyt_tutorial.html) vs. [built-in Pytorch models](https://pytorch.org/vision/stable/models.html))
- Pretrained model weights vs. starting from scratch
- Different optimizers (see [here](https://pytorch.org/docs/stable/optim.html))
- Learning rate and adaptation strategies (Warm-start, Decay, ...)
- Batch size
- Data augmentation (see [here](https://pytorch.org/vision/stable/transforms.html))
- Different loss functions
- Train/Test vs. Train/Validation/Test split of the data set. What could an additional validation set be useful for? (*Hint: Early Stopping*)

Due to the limited time you have, do not worry about implementing or trying out all of these examples. Start by discussing with your colleagues which of these examples help the model to be more energy efficient, and if so, how they achieve it.

Afterwards, pick and focus on the few that you think would help the most (or that you simply find the most interesting to implement). You are, of course, also allowed to come up with your own strategies.
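
To make the early-stopping hint concrete, here is a minimal sketch of a patience-based stopping rule. The `EarlyStopping` class, its `patience` and `min_delta` parameters, and the validation losses are hypothetical and not part of PyTorch or CodeCarbon. In a training loop you would call `step()` once per epoch with the validation loss.

```python
class EarlyStopping:
    """Signal a stop when the validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True if training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Made-up validation losses: improvement stalls after the third epoch.
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.60]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, val_loss in enumerate(val_losses, start=1):
    if stopper.step(val_loss):
        stopped_at = epoch
        break

print(f"Stopped after epoch {stopped_at}, best validation loss {stopper.best_loss}")
```

Every epoch that is skipped this way is energy that is not spent. The trade-off is that a later improvement (the 0.60 in the made-up list) is never reached.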

In [1]:
# Dependencies, you can add whatever additional things you want to use
import os
import time
import copy

import torch
import torchvision.models as models
import torch.nn as nn
import torch.optim as optim
import torchvision.datasets as datasets
from torchvision.transforms import v2
from torch.utils.data import DataLoader, random_split

from codecarbon import EmissionsTracker

#### The `train_model()` function
This function should take in a PyTorch model, a name for your model (as a string), a PyTorch DataLoader instance for the Oxford-IIIT Pet dataset, the number of epochs, and the learning rate. It should take care of the complete model training process so that you can focus your energy on how and what to change in your models. (*Hint: as you might have seen in the **Things you can consider** section, there are also a few things you can improve over the basic `train_model()` function to make the training phase a lot more efficient*)

In short, we want you to implement the standard deep learning training loop and afterwards expand it however you please:
- loop over the epochs and batches of input images and labels
- calculate the loss of your model predictions
- backpropagate the loss
- do the optimizer weight update step

Additionally, this is where you should use the ***CodeCarbon*** benchmarking library to understand how much time and energy the training phase of your models needs.
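
The basic training loop listed above can be sketched on synthetic data. This is one possible shape of the loop, not the reference solution for `train_model()`: the tiny linear model, the random tensors, and the helper name `toy_training_loop` are made up for illustration, and the CodeCarbon tracking, device handling, and model saving are left out.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset


def toy_training_loop(model, dataloader, num_epochs, lr):
    """Standard supervised training loop; returns the mean loss per epoch."""
    loss_fn = nn.CrossEntropyLoss()
    optimizer = optim.SGD(model.parameters(), lr=lr)
    model.train()  # enable training behavior (dropout, batch norm updates)

    epoch_losses = []
    for epoch in range(num_epochs):
        running_loss = 0.0
        for inputs, labels in dataloader:
            optimizer.zero_grad()            # clear gradients from the previous step
            outputs = model(inputs)          # forward pass
            loss = loss_fn(outputs, labels)  # calculate the loss
            loss.backward()                  # backpropagate the loss
            optimizer.step()                 # optimizer weight update
            running_loss += loss.item() * inputs.size(0)
        epoch_losses.append(running_loss / len(dataloader.dataset))
    return epoch_losses


torch.manual_seed(0)
# Synthetic stand-in for an image dataset: 64 samples, 10 features, 3 classes.
features = torch.randn(64, 10)
targets = torch.randint(0, 3, (64,))
loader = DataLoader(TensorDataset(features, targets), batch_size=16, shuffle=True)

toy_model = nn.Linear(10, 3)
losses = toy_training_loop(toy_model, loader, num_epochs=3, lr=0.1)
print(losses)
```

Tracking the mean loss per epoch gives you a cheap signal for deciding when further epochs stop paying off.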


In [None]:
def train_model(model, model_name: str, train_dataloader: DataLoader, num_epochs, lr):
    # You can have a look at the basic implementation of the test_model function to get an idea
    
    ### Create CodeCarbon tracker object and start the tracker or use the decorator
        # Use these configs:
            # project_name="Model_Training",
            # output_file="training_emissions.csv",
            # measure_power_secs=20,
            # log_level="error",
            # allow_multiple_runs=True
    
    ### Define a loss function
    loss = ...

    ### Define an optimizer
    optimizer = ...

    ### Implement the basic training loop for PyTorch
        # Loop over epochs and batches of images and labels
            # Get model output
            # Calculate loss
            # Backpropagate loss
            # Optimizer update step

    ### Stop CodeCarbon tracker if you did not use the decorator version

    ### Store model on disk

    return model


#### The `test_model()` function
This function takes in a PyTorch model and a PyTorch DataLoader instance for the test split of the dataset. It calculates the test accuracy, the average inference latency per sample, and the energy usage of your model at test time.
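
The energy usage is reported in kWh, which is an awkward unit for a single inference call. As a small worked conversion (using 1 kWh = 3.6 MJ), the sketch below turns a total energy reading into joules per sample. The helper name `energy_per_sample_joules` and the example numbers are made up for illustration.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 1000 W * 3600 s


def energy_per_sample_joules(energy_kwh, num_samples):
    """Convert a total energy reading in kWh into joules per sample."""
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    return energy_kwh * JOULES_PER_KWH / num_samples


# Made-up example: 0.002 kWh spent on 3600 test images.
per_sample = energy_per_sample_joules(0.002, 3600)
print(f"{per_sample:.2f} J per sample")
```

Per-sample numbers make it easier to compare models that were evaluated on differently sized test sets.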

In [None]:
def test_model(model, test_dataloader):
    # Send model to the GPU if one is available, otherwise keep it on the CPU
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)
    
    # Set model to inference mode
    model.eval()

    tracker = EmissionsTracker(
        project_name="Model_Inference",
        output_file="inference_emissions.csv",
        measure_power_secs=20,
        log_level="error",
        allow_multiple_runs=True
    )
    tracker.start()

    start_time = time.perf_counter()
    try:
        correct = total = 0
        with torch.no_grad():
            for images, labels in test_dataloader:
                images, labels = images.to(device), labels.to(device)
                outputs = model(images)
                _, predicted = torch.max(outputs, 1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()

        total_inference_time = time.perf_counter() - start_time
    finally:
        # Stop the tracker exactly once, even if inference raises an error
        tracker.stop()
        model.to("cpu")
        torch.cuda.empty_cache()

    avg_inference_latency = total_inference_time / len(test_dataloader.dataset)
    test_acc = 100 * correct / total
    emissions_data = tracker.final_emissions_data

    print(f"[Inference] #Test Samples: {len(test_dataloader.dataset)} | Average Inference Latency: {avg_inference_latency:.4f} seconds | Test Accuracy: {test_acc:.1f}%")
    print(f"  - CO2 Emissions: {emissions_data.emissions:.6f} kg | Total Energy Usage: {emissions_data.energy_consumed:.7f} kWh")
    return {
        "inference_latency": avg_inference_latency,
        "test_accuracy": test_acc,
        "emissions_data": emissions_data
    }

#### Finally create, train and test your models here

In [None]:
batch_size = 128
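output_size = 37 # Oxford-IIIT Pet has 37 categories

# Data set transforms, you can change these however you please for your models
transforms = v2.Compose([
    v2.ToImage(),
    # The images have different sizes; resize them so they can be stacked into batches
    # (224x224 is only an example size, adjust it to your model)
    v2.Resize((224, 224)),
    v2.ToDtype(torch.float32, scale=True)
])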

# Download the train and test sets of the dataset and use your transforms
os.makedirs("./data", exist_ok=True)
train_dataset = datasets.OxfordIIITPet(root="./data", split="trainval", transform=transforms, download=True)
test_dataset = datasets.OxfordIIITPet(root="./data", split="test", transform=transforms, download=True)

train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_dataloader = DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

In [None]:
### Create your model
    # You can either build one from scratch, or try built-in model architectures from PyTorch (ResNet, MobileNet, ...)
    # It is best to start with a simple model to test the training and testing workflow, then work up to more complex models
model = ...

In [None]:
### Train your model
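# num_epochs and lr are required by train_model(); replace the placeholders with your own values
trained_model = train_model(model, "YOUR_MODEL_NAME", train_dataloader, num_epochs=..., lr=...)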

### Test your model
test_stats_dict = test_model(trained_model, test_dataloader)


# Free up memory 
trained_model.to("cpu")
torch.cuda.empty_cache()