# Integrating `codecarbon` into the exercises

This notebook demonstrates two ways to integrate `codecarbon` into the exercises:

1. Using the `@track_emissions` decorator

2. Using the `CustomEmissionsTracker` class

Because of some limitations in the `codecarbon` library, the base emissions tracker has been extended to give better control over how the data for each experiment is written.

This was done purely for the purposes of the course, and it adds no functionality that is not already present in the `codecarbon` library. I decided to do it this way to streamline the process for the students and to make sure we get consistent and correct data representations.

The `CustomEmissionsTracker` class is a subclass of the `EmissionsTracker` class from the `codecarbon` library. It provides the same functionality as the base class, but it also writes the data to a file after each experiment, so we can track the emissions of each experiment separately. Doing this the stock `codecarbon` way turned out to be somewhat unreliable and inconsistent, so this is the best approach for now.

In [1]:
from codecarbon import track_emissions 
import numpy as np

from utils import calculate_emission_equivalents, CustomEmissionsTracker


# Example use of the `CustomEmissionsTracker` class

Offline tracking requires `country_iso_code` to be set. `log_level` controls how much is printed in the output.

The country code is used to determine the carbon intensity of electricity production:

https://mlco2.github.io/codecarbon/methodology.html

In this mode the tracker supports setting a custom `experiment_id`. This is a feature of the custom class, which appends the ID to the experiment CSV. It makes the data collection much more consistent, because the stock tracker sometimes fails to write to the file when expected.

After the first tracked run, a CSV file should be created automatically in the same directory. It contains the emissions data for the experiment.

`codecarbon` has a list of supported CPUs for which it knows the power consumption. I noticed during testing on EoXhub that the CPU is not known:

```
[codecarbon WARNING @ 14:56:23] We saw that you have a AMD EPYC 7R13 Processor but we don't know it. Please contact us.
[codecarbon INFO @ 14:56:23] CPU Model on constant consumption mode: AMD EPYC 7R13 Processor
```

In that case, `default_cpu_power` needs to be set to a value in watts. The default value is 42.5 W, which is too low for server CPUs.

According to the docs [here](https://mlco2.github.io/codecarbon/parameters.html), `default_cpu_power` is `power_constant` x `consumption_percentage_constant`.

The 7R13 has a TDP of 225 W and 48 cores, but each user only gets a small share of those cores. Assuming 4.5 cores per user, the power consumption should be around 225 W / 48 * 4.5 = 21.09 W. We use 4.5 instead of 4 because power per core is not fully linear. This is a rough estimate, but it should be close enough.

This value should be discussed and ideally agreed on for the whole course, because it can dramatically affect the results.
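The arithmetic behind this estimate can be reproduced in a few lines. This is only a sketch of the calculation described above; the variable names are illustrative and the inputs are the figures quoted in the text, not values read from `codecarbon`.

```python
# Rough per-user CPU power estimate, using the figures quoted in the text.
TDP_WATTS = 225        # TDP of the AMD EPYC 7R13
TOTAL_CORES = 48       # core count of the CPU
EFFECTIVE_CORES = 4.5  # core count used for the estimate (4.5 rather than 4)

# Assume power scales roughly linearly with the number of cores in use.
watts_per_core = TDP_WATTS / TOTAL_CORES
estimated_power = watts_per_core * EFFECTIVE_CORES

print(f"{watts_per_core:.2f} W per core")   # 4.69 W per core
print(f"{estimated_power:.2f} W per user")  # 21.09 W per user
```

Rounding the result down gives the `default_cpu_power=21` used in the tracker below.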


In [4]:
tracker = CustomEmissionsTracker(
    country_iso_code="ITA",
    log_level="error",
    save_to_file=True,
    output_dir="./",
    default_cpu_power=21
)

In [5]:
tracker.start_experiment(experiment_id=420)

def whatever(size: int, iterations: int) -> np.ndarray:
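    """
    Multiply two random size x size matrices, `iterations` times.

    Args:
        size (int): size of the matrices
        iterations (int): number of multiplications to run

    Returns:
        np.ndarray: the product from the last iteration
    """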
    for _ in range(iterations):
        matrix1 = np.random.rand(size, size)
        matrix2 = np.random.rand(size, size)
        result = np.dot(matrix1, matrix2)
    return result


whatever(size=10000, iterations=1)

tracker.stop_experiment()

Starting experiment: 420
Stopped experiment: 420


# Visualizing the result

The visualization is done via a custom HTML template. The `calculate_emission_equivalents` function converts the emissions into equivalent quantities in other, more familiar metrics, which makes the data more relatable for the students.

It works by reading the CSV generated by `codecarbon` and performing some calculations. The results are then substituted into the HTML template and displayed to the user.

When using the `CustomEmissionsTracker` class, we can visualize a specific experiment by its `experiment_id`. This makes things a bit more readable and understandable, even if it is not fully stock `codecarbon` functionality. Normally you would use the `codecarbon` dashboard and not bother with custom visualization.
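The read, compute, and substitute flow described above could look roughly like the sketch below. It is not the actual implementation of `calculate_emission_equivalents`: the template, the sample data, and the `render_emissions` helper are all hypothetical, and it assumes the CSV has `experiment_id` and `emissions` columns with emissions reported in kg CO2eq.

```python
import csv
import io
from string import Template

# Stand-in for the CSV written by the tracker; the values are made up.
SAMPLE_CSV = """experiment_id,emissions
419,0.000120
420,0.000250
"""

# Hypothetical HTML template; the real course template is not shown here.
HTML_TEMPLATE = Template("<p>Experiment $experiment_id emitted $grams g CO2eq</p>")


def render_emissions(csv_text: str, experiment_id: int) -> str:
    """Look up one experiment in the CSV and fill in the HTML template."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    matches = [r for r in rows if int(r["experiment_id"]) == experiment_id]
    if not matches:
        raise ValueError(f"No rows found for experiment_id={experiment_id}")
    # Assumption: emissions are stored in kg CO2eq; convert to grams for display.
    kg = float(matches[-1]["emissions"])
    return HTML_TEMPLATE.substitute(experiment_id=experiment_id, grams=f"{kg * 1000:.3f}")


html = render_emissions(SAMPLE_CSV, 420)
print(html)
```

In a notebook, the resulting string could be shown with `IPython.display.HTML`. Conversions to other equivalents would follow the same pattern, with a conversion factor applied before substitution.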

In [7]:
calculate_emission_equivalents(experiment_id=420)

# Using the `track_emissions` decorator

This mode wraps a single function definition and tracks the emissions of each call to that function. It does not support setting a custom `experiment_id` and is less flexible than the `CustomEmissionsTracker` class, so it is best suited to simple tracking. For anything more complex, the `CustomEmissionsTracker` class is recommended.

This mode requires `offline` to be set to `True` and `country_iso_code` to be set to the correct country code.

In [8]:
# I honestly don't know whether all of these arguments are needed, but the
# decorator is buggy enough that I kept the full set that works.
@track_emissions(
    offline=True,
    save_to_file=True,
    output_dir="./",
    country_iso_code="ITA",
    log_level="error",
    default_cpu_power=100,
)
def whatever(size: int, iterations: int) -> np.ndarray:
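    """
    Multiply two random size x size matrices, `iterations` times.

    Args:
        size (int): size of the matrices
        iterations (int): number of multiplications to run

    Returns:
        np.ndarray: the product from the last iteration
    """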
    for _ in range(iterations):
        matrix1 = np.random.rand(size, size)
        matrix2 = np.random.rand(size, size)
        result = np.dot(matrix1, matrix2)
    return result


whatever(size=10000, iterations=2)

array([[2483.27180478, 2490.23493118, 2456.49499933, ..., 2478.97760472,
        2451.97859823, 2468.08242129],
       [2505.44611951, 2515.09554418, 2481.46690565, ..., 2518.40299198,
        2500.56134978, 2503.37980895],
       [2504.39675515, 2506.49964045, 2485.56483129, ..., 2493.8078998 ,
        2502.27529329, 2482.94691763],
       ...,
       [2509.82292392, 2499.24597244, 2489.09289879, ..., 2493.22274916,
        2504.65738512, 2493.53384166],
       [2509.24100095, 2531.02040351, 2490.09111877, ..., 2537.87721421,
        2492.36396769, 2502.86121924],
       [2516.90950612, 2532.07759305, 2513.16827735, ..., 2525.44904115,
        2515.76632497, 2508.70997884]])

# Visualizing the result of track_emissions

Since `codecarbon` does not support custom IDs in this mode (it claims to, but this did not work in testing), we rely on picking the latest entry in the CSV file. This is not ideal, but it is the best we can do with the current state of the library.
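Picking the latest entry can be sketched as follows. This is an illustration rather than the code in `utils`: the `latest_entry` helper and the sample data are hypothetical, and it assumes the CSV has a `timestamp` column in ISO 8601 format.

```python
import csv
import io
from datetime import datetime

# Stand-in for the emissions CSV; the values are made up and deliberately
# out of order to show that the selection does not depend on row position.
SAMPLE_CSV = """timestamp,emissions
2024-05-01T14:56:23,0.000120
2024-05-01T15:10:02,0.000310
2024-05-01T15:02:47,0.000250
"""


def latest_entry(csv_text: str) -> dict:
    """Return the row with the most recent timestamp."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("The CSV contains no entries")
    return max(rows, key=lambda r: datetime.fromisoformat(r["timestamp"]))


latest = latest_entry(SAMPLE_CSV)
print(latest["timestamp"], latest["emissions"])
```

If the file is only ever appended to, taking the last row would give the same result. Comparing timestamps is just a little more robust if rows are ever reordered.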

In [9]:
calculate_emission_equivalents()