# Calculating the Carbon Footprint of ML Models with CodeCarbon

Return to the [castle](https://github.com/Nkluge-correa/teeny-tiny_castle).

![image](https://co2living.com/wp-content/uploads/2019/02/Reduce-Your-Carbon-Footprint.jpg)

A **carbon footprint** is the total [greenhouse gas (GHG) emissions](https://en.wikipedia.org/wiki/Greenhouse_gas_emissions "Greenhouse gas emissions") caused by an individual, event, organization, service, place or product, expressed as [carbon dioxide equivalent](https://en.wikipedia.org/wiki/Carbon_Dioxide_Equivalent "Carbon Dioxide Equivalent") (CO2e). Greenhouse gases, including the carbon-containing gases [carbon dioxide](https://en.wikipedia.org/wiki/Carbon_dioxide "Carbon dioxide") and [methane](https://en.wikipedia.org/wiki/Methane "Methane"), can be emitted through the burning of [fossil fuels](https://en.wikipedia.org/wiki/Fossil_fuels "Fossil fuels"), land clearance and the production and consumption of food, manufactured goods, materials, wood, roads, buildings, transportation, and other services.


# [CodeCarbon](https://codecarbon.io/)

**CodeCarbon is a lightweight software package that seamlessly integrates into your Python codebase. It estimates the amount of carbon dioxide (CO2) produced by the cloud or personal computing resources used to execute the code.**

Example:

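```python
from codecarbon import EmissionsTracker

tracker = EmissionsTracker()

tracker.start()
try:
    expensive_computing_function_here()  # placeholder for your own workload
finally:
    emissions = tracker.stop()  # estimated emissions, in kg of CO2-equivalent
```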
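To give a rough sense of what such an estimate involves, the sketch below multiplies the energy a workload consumed (in kWh) by the carbon intensity of the electricity that powered it (in kg of CO2-equivalent per kWh). The function name and the input values are illustrative assumptions, not part of CodeCarbon's API; the library derives these quantities from the hardware and location it detects.

```python
def estimate_emissions_kg(energy_kwh: float, intensity_kg_per_kwh: float) -> float:
    """Estimated emissions (kg CO2e) = energy (kWh) x carbon intensity (kg CO2e/kWh)."""
    if energy_kwh < 0 or intensity_kg_per_kwh < 0:
        raise ValueError("energy and carbon intensity must be non-negative")
    return energy_kwh * intensity_kg_per_kwh


# Hypothetical inputs, chosen only for illustration.
energy_kwh = 0.5
assumed_intensity = 0.4  # kg CO2e per kWh (assumed, not a measured figure)

print(f"{estimate_emissions_kg(energy_kwh, assumed_intensity):.3f} kg CO2e")
```

The same amount of energy can therefore correspond to very different emissions depending on how the electricity was generated.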

**Let us now train a model and generate an emission report.** 🌱

```python
import tensorflow as tf
from codecarbon import EmissionsTracker

tracker = EmissionsTracker()

# Load MNIST and scale pixel values to [0, 1]
mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train, x_test = x_train / 255.0, x_test / 255.0

# Add a channel dimension so the images match the Conv2D input shape
train_images = x_train.reshape(x_train.shape[0], 28, 28, 1)
test_images = x_test.reshape(x_test.shape[0], 28, 28, 1)

# One-hot encode the labels for the categorical cross-entropy loss
train_labels = tf.keras.utils.to_categorical(y_train)
test_labels = tf.keras.utils.to_categorical(y_test)

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(20, (5, 5), padding='same', activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
    tf.keras.layers.Conv2D(50, (5, 5), padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(500, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt,
              loss='categorical_crossentropy',
              metrics=['accuracy'])

print("TensorFlow version:", tf.__version__)
print("Eager mode: ", tf.executing_eagerly())
print("GPU is", "available" if tf.config.list_physical_devices('GPU') else "NOT AVAILABLE")
model.summary()

print('Training...\n')

# Track only the training run
tracker.start()

history = model.fit(train_images, train_labels, epochs=10,
                    batch_size=256, verbose=1)

tracker.stop()

print('\nEvaluating...\n')
test_loss_score, test_acc_score = model.evaluate(test_images, test_labels)
print(f'Final Loss: {round(test_loss_score, 2)}.')
print(f'Final Performance: {round(test_acc_score * 100, 2)} %.')
```

```
[codecarbon INFO @ 14:59:30] [setup] RAM Tracking...
[codecarbon INFO @ 14:59:30] [setup] GPU Tracking...
[codecarbon INFO @ 14:59:31] Tracking Nvidia GPU via pynvml
[codecarbon INFO @ 14:59:31] [setup] CPU Tracking...
[codecarbon INFO @ 14:59:33] CPU Model on constant consumption mode: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
[codecarbon INFO @ 14:59:33] >>> Tracker's metadata:
[codecarbon INFO @ 14:59:33]   Platform system: Windows-10-10.0.19042-SP0
[codecarbon INFO @ 14:59:33]   Python version: 3.9.13
[codecarbon INFO @ 14:59:33]   Available RAM : 31.749 GB
[codecarbon INFO @ 14:59:33]   CPU count: 8
[codecarbon INFO @ 14:59:33]   CPU model: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
[codecarbon INFO @ 14:59:33]   GPU count: 1
[codecarbon INFO @ 14:59:33]   GPU model: 1 x NVIDIA GeForce MX450

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
TensorFlow version: 2.10.1
Eager mode:  True
GPU is available
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
 conv2d (Conv2D)             (None, 28, 28, 20)        520

 max_pooling2d (MaxPooling2D  (None, 14, 14, 20)       0
 )

 conv2d_1 (Conv2D)           (None, 14, 14, 50)        25050

 max_pooling2d_1 (MaxPooling  (None, 7, 7, 50)         0
 2D)

 flatten (Flatten)           (None

[codecarbon INFO @ 14:59:54] Energy consumed for RAM : 0.000050 kWh. RAM Power : 11.905803680419922 W
[codecarbon INFO @ 14:59:54] Energy consumed for all GPUs : 0.000000 kWh. All GPUs Power : 0.0 W
[codecarbon INFO @ 14:59:54] Energy consumed for all CPUs : 0.000058 kWh. All CPUs Power : 14.0 W
[codecarbon INFO @ 14:59:54] 0.000108 kWh of electricity used since the begining.

Epoch 4/10
Epoch 5/10
Epoch 6/10

[codecarbon INFO @ 15:00:09] Energy consumed for RAM : 0.000099 kWh. RAM Power : 11.905803680419922 W
[codecarbon INFO @ 15:00:09] Energy consumed for all GPUs : 0.000000 kWh. All GPUs Power : 0.0 W
[codecarbon INFO @ 15:00:09] Energy consumed for all CPUs : 0.000117 kWh. All CPUs Power : 14.0 W
[codecarbon INFO @ 15:00:09] 0.000216 kWh of electricity used since the begining.

Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
 13/235 [>.............................] - ETA: 4s - loss: 0.0086 - accuracy: 0.9964

[codecarbon INFO @ 15:00:24] Energy consumed for RAM : 0.000149 kWh. RAM Power : 11.905803680419922 W
[codecarbon INFO @ 15:00:24] Energy consumed for all GPUs : 0.000000 kWh. All GPUs Power : 0.0 W
[codecarbon INFO @ 15:00:24] Energy consumed for all CPUs : 0.000175 kWh. All CPUs Power : 14.0 W
[codecarbon INFO @ 15:00:24] 0.000324 kWh of electricity used since the begining.

[codecarbon INFO @ 15:00:28] Energy consumed for RAM : 0.000163 kWh. RAM Power : 11.905803680419922 W
[codecarbon INFO @ 15:00:28] Energy consumed for all GPUs : 0.000000 kWh. All GPUs Power : 0.0 W
[codecarbon INFO @ 15:00:28] Energy consumed for all CPUs : 0.000192 kWh. All CPUs Power : 14.0 W
[codecarbon INFO @ 15:00:28] 0.000354 kWh of electricity used since the begining.

Evaluating...

Final Loss: 0.03.
Final Performance: 99.08 %.
```
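The energy figures in the log above can be sanity-checked by hand, since energy is power multiplied by time. The sketch below assumes a reporting interval of about 15 seconds, which is the spacing of the log timestamps, and uses the constant CPU and RAM power values the tracker printed; the helper name is illustrative.

```python
def energy_kwh(power_watts: float, seconds: float) -> float:
    """Convert a constant power draw over a time span into kilowatt-hours."""
    return power_watts * seconds / 3_600_000  # 1 kWh = 3.6e6 watt-seconds


interval_s = 15  # assumed from the spacing of the log timestamps
cpu = energy_kwh(14.0, interval_s)                # "All CPUs Power : 14.0 W"
ram = energy_kwh(11.905803680419922, interval_s)  # "RAM Power : 11.9058... W"

print(f"CPU:   {cpu:.6f} kWh")
print(f"RAM:   {ram:.6f} kWh")
print(f"Total: {cpu + ram:.6f} kWh")
```

These values agree with the first report in the log (0.000058 kWh for the CPUs, 0.000050 kWh for RAM, 0.000108 kWh in total); the GPU adds nothing because its reported power was 0.0 W.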


## `Emission Report Generator()`

**First, let's import all the data we find relevant from the CSV report written by the CodeCarbon `EmissionsTracker`. We also import some details about the model in question from a second CSV file.**


```python
import pandas as pd

df = pd.read_csv(r'emissions_CIFAR_CNN_GPU.csv')
df = df.drop(['timestamp', 'project_name', 'run_id'], axis=1)  # Drop some columns the report does not use

precision = 6  # number of digits after the decimal point

# Values from the first row of the emissions report
duration = df['duration'][0]
emissions = df['emissions'][0]
emissions_rate = df['emissions_rate'][0]
cpu_power = df['cpu_power'][0]
gpu_power = df['gpu_power'][0]
ram_power = df['ram_power'][0]
cpu_energy = df['cpu_energy'][0]
gpu_energy = df['gpu_energy'][0]
ram_energy = df['ram_energy'][0]
energy_consumed = df['energy_consumed'][0]
country_name = df['country_name'][0]
country_iso_code = df['country_iso_code'][0]
region = df['region'][0]
cloud_provider = df['cloud_provider'][0]
cloud_region = df['cloud_region'][0]
os = df['os'][0]
python_version = df['python_version'][0]
cpu_count = df['cpu_count'][0]
cpu_model = df['cpu_model'][0]
gpu_count = df['gpu_count'][0]
gpu_model = df['gpu_model'][0]
ram_total_size = df['ram_total_size'][0]
tracking_mode = df['tracking_mode'][0]
on_cloud = df['on_cloud'][0]

# Simple model report
df = pd.read_csv('model_details.csv')

who_is_responsible = df['who_is_responsible'][0]
model_specification = df['model_specification'][0]
intended_use = df['intended_use'][0]
dataset = df['dataset'][0]
licensee = df['license'][0]
```
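As a quick consistency check on the values read above, the total in `energy_consumed` should be close to the sum of the CPU, GPU, and RAM energy columns. The sketch below uses the standard-library `csv` module on an inline one-row sample rather than the real file; the sample values are the rounded figures from the final log report of the training run, so a small tolerance is allowed.

```python
import csv
import io

# Inline stand-in for the emissions CSV (rounded values from the last log report).
sample = io.StringIO(
    "cpu_energy,gpu_energy,ram_energy,energy_consumed\n"
    "0.000192,0.0,0.000163,0.000354\n"
)

row = next(csv.DictReader(sample))
values = {name: float(text) for name, text in row.items()}

component_sum = values["cpu_energy"] + values["gpu_energy"] + values["ram_energy"]
difference = abs(component_sum - values["energy_consumed"])

# The logged figures are rounded to six decimals, so allow a small tolerance.
print("consistent" if difference < 2e-6 else "inconsistent")
```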


**Now we simply use the information from the emission report files to fill a `template.md` card.** 📝

````python
from datetime import date

from IPython.display import display, HTML

today = date.today()
today_date = today.strftime("%d/%m/%Y")

# Fill the report template with the values loaded from the CSV files
with open('CO2 report (CIFAR_CNN).md', 'w+') as fp:
    fp.write(f'''# $CO_{2}$ Emission Report

Generated at: _{today_date}_

## CARBON FOOTPRINT

A carbon footprint is the total greenhouse gas (GHG) emissions caused by an individual, event, organization, service, place or product, expressed as carbon dioxide equivalent ($CO_{2}e$). Greenhouse gases, including the carbon-containing gases carbon dioxide and methane, can be emitted through the burning of fossil fuels, land clearance, and the production and consumption of food, manufactured goods, materials, wood, roads, buildings, transportation, and other services.

Modern AI models can consume a massive amount of energy during their training and fine-tuning phases, and these energy requirements are growing at a breathtaking rate. Researchers from the University of Massachusetts Amherst [[1](references)] conducted a life cycle analysis of training several common large AI models. They found that the process can emit more than $626,000$ pounds of $CO_{2}$ equivalent.

## $CO_{2}$ Emission Report with CodeCarbon

A $CO_{2}$ Emission Report is a simple transparency tool that helps developers disclose, and thus become accountable for, the $CO_{2}$ emissions of an ML model.

This report is made possible by CodeCarbon [[2](references)] [[3](references)] [[4](references)], a lightweight software package that seamlessly integrates into your Python codebase. It estimates the amount of carbon dioxide ($CO_{2}$) produced by the cloud or personal computing resources used to execute the code.

## HOW TO USE CODECARBON

One can use the CodeCarbon library by installing it with `pip install codecarbon` and using its tracker to record the energy consumption of a costly computation.

```python
from codecarbon import EmissionsTracker

tracker = EmissionsTracker()
tracker.start()
expensive_computing_function_here()
tracker.stop()
```

## MODEL DETAILS

- {who_is_responsible}
- {model_specification}
- {intended_use}
- {dataset}
- {licensee}

## $CO_{2}$ Emission Results

|**Duration (Seconds)**|**Emissions (kg CO2eq)**|**Emission Rate (kg CO2eq/s)**|**CPU Power (Watts)**|
|--------------------------------|-------------------------------------|------------------------------------|--------------------------------|
|{round(duration, precision)}|{round(emissions, precision)}|{round(emissions_rate, precision)}|{round(cpu_power, precision)}|
|**GPU Power (Watts)**|**RAM Power (Watts)**|**Energy Consumption (CPU - kWh)**|**Energy Consumption (GPU - kWh)**|
|{round(gpu_power, precision)}|{round(ram_power, precision)}|{round(cpu_energy, precision)}|{round(gpu_energy, precision)}|
|**Energy Consumption (RAM - kWh)**|**Total Consumption (kWh)**|**Country**|**ISO**|
|{round(ram_energy, precision)}|{round(energy_consumed, precision)}|{country_name}|{country_iso_code}|
|**Region**|**Cloud Provider**|**Provider's Region**|**OS**|
|{region}|{cloud_provider}|{cloud_region}|{os}|
|**Python Version**|**No. of Processors**|**CPU Model**|**No. of GPUs**|
|{python_version}|{cpu_count}|{cpu_model}|{gpu_count}|
|**GPU Model**|**RAM Size (GB)**|**Tracking Mode**|**Cloud-Processed**|
|{gpu_model}|{ram_total_size}|{tracking_mode}|{on_cloud}|

## REFERENCES

[1] Karen Hao. Training a single AI model can emit as much carbon as five cars in their lifetimes. _MIT Technology Review_, 2019.

[2] Alexandre Lacoste, Alexandra Luccioni, Victor Schmidt, and Thomas Dandres. Quantifying the carbon emissions of machine learning. _Workshop on Tackling Climate Change with Machine Learning at NeurIPS 2019_, 2019.

[3] Kadan Lottick, Silvia Susai, Sorelle A. Friedler, and Jonathan P. Wilson. Energy usage reports: Environmental awareness as part of algorithmic accountability. _Workshop on Tackling Climate Change with Machine Learning at NeurIPS 2019_, 2019.

[4] Victor Schmidt, Kamal Goyal, Aditya Joshi, Boris Feld, Liam Conell, Nikolas Laskaris, Doug Blank, Jonathan Wilson, Sorelle Friedler, and Sasha Luccioni. CodeCarbon: _Estimate and Track Carbon Emissions from Machine Learning Computing_, 2021.
''')

# Show a link to the generated report in the notebook
display(
    HTML("<a href='CO2 report (CIFAR_CNN).md' target='_blank'>CO2 report (CIFAR_CNN).md</a>"))
````
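Two formatting details are worth keeping in mind when filling a Markdown template from an f-string. Single braces are evaluated as expressions, so `$CO_{2}$` is written out as `$CO_2$` unless the braces are doubled, and the result of `round()` on a very small number is printed in scientific notation. The sketch below shows both effects with a hypothetical emissions value; `format_value` is an illustrative helper, not part of the notebook.

```python
def format_value(value: float, precision: int = 6) -> str:
    """Format a number in fixed-point notation with `precision` decimals."""
    return f"{value:.{precision}f}"


emissions = 0.0000351234  # hypothetical value in kg, for illustration only

rounded_text = str(round(emissions, 6))  # scientific notation for small values
fixed_text = format_value(emissions)     # fixed-point notation

single_braces = f"$CO_{2}$"     # {2} is evaluated as the integer 2
doubled_braces = f"$CO_{{2}}$"  # doubled braces produce literal braces

print(rounded_text, fixed_text, single_braces, doubled_braces)
```

Both brace variants render the same subscript in LaTeX, so the choice mostly affects how the raw Markdown reads; the fixed-point format matters more when the table is meant to be read by people.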

**Another option for CO2 emission tracking is [Eco2AI](https://github.com/sb-ai-lab/Eco2AI), which has an interface and user experience very similar to CodeCarbon's.** 🙃

