# Tracking carbon emissions and power consumption with CodeCarbon

<a href="https://colab.research.google.com/drive/1oZLM3uAHdqdbyVCq67CxHCzWK_ldqMEX" target="_blank">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab">
</a>

Return to the [castle](https://github.com/Nkluge-correa/TeenyTinyCastle).

Recent advances in Artificial Intelligence, particularly Machine Learning, have enabled human-level performance on tasks such as image and facial recognition, autonomous driving, and complex games like chess and Go. Reaching this level of proficiency requires learning patterns and features from very large datasets. Consequently, state-of-the-art Machine Learning models are trained on advanced processors for weeks or months, which consumes significant energy and, depending on the energy grid used, can produce substantial greenhouse gas emissions.

Against this backdrop, measuring the energy consumption and resulting carbon emissions of ML workloads, such as training and inference, is becoming common practice.

<img src="https://earthnworld.com/wp-content/webp-express/webp-images/uploads/2021/10/carbon-footprint-scaled.jpg.webp" alt="drawing" width="400"/>

Source: [Earthnworld](https://earthnworld.com/reduce-carbon-footprint/).

To help us in this process, we have libraries like [CodeCarbon](https://codecarbon.io/). CodeCarbon is a lightweight software package that can easily be integrated into a Python codebase. This package enables developers to track emissions, measured as kilograms of $CO_2$-equivalents ($CO_2$eq), to estimate their work's carbon footprint.

According to the CodeCarbon [documentation](https://mlco2.github.io/codecarbon/methodology.html), the package estimates carbon dioxide emissions as the product of two main factors:

- $C$: Carbon Intensity of the electricity consumed for computation (grams of $CO_2$ emitted per kilowatt-hour).
- $E$: Energy Consumed by the computational infrastructure (kilowatt-hours, kWh).

Hence:

$$\text{Carbon dioxide emissions} = C \times E$$

CodeCarbon, when possible, uses the global carbon intensity of electricity per cloud provider or per country to infer the value of $C$. For example, this is the mix of energy sources in the local [energy grid for Brazil](https://github.com/mlco2/codecarbon/blob/master/codecarbon/data/private_infra/global_energy_mix.json):

```json
"BRA": {
        "biofuel_TWh": 57.6,
        "carbon_intensity": 158.592,
        "coal_TWh": 25.22,
        "country_name": "Brazil",
        "fossil_TWh": 139.21,
        "gas_TWh": 91.03,
        "hydroelectricity_TWh": 362.82,
        "iso_code": "BRA",
        "low_carbon_TWh": 523.37,
        "nuclear_TWh": 14.7,
        "oil_TWh": 22.96,
        "other_renewable_TWh": 57.6,
        "other_renewable_exc_biofuel_TWh": 0.0,
        "per_capita_Wh": 3091.456,
        "renewables_TWh": 508.67,
        "solar_TWh": 16.75,
        "total_TWh": 662.58,
        "wind_TWh": 71.5,
        "year": 2021
    },
```

According to the 2021 estimate, Brazil has a carbon intensity of 158.592 g$CO_2$/kWh (and that is our $C$).
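
As a quick sanity check of the formula, the arithmetic can be reproduced by hand. The sketch below assumes the `carbon_intensity` field is expressed in grams of $CO_2$ per kWh (matching the definition of $C$ above) and borrows the energy value reported by the offline Brazil run later in this tutorial; the variable names are illustrative and not part of CodeCarbon.

```python
# Carbon intensity for Brazil from the JSON excerpt above (assumed to be in gCO2/kWh).
carbon_intensity_g_per_kwh = 158.592

# Energy consumed (kWh), taken from the offline run shown later in this tutorial.
energy_kwh = 0.0017950822891376413

# Emissions = C x E, converted from grams to kilograms.
emissions_kg = carbon_intensity_g_per_kwh * energy_kwh / 1000

print(f"Estimated emissions: {emissions_kg:.10f} kgCO2")
```

The result is about 0.000285 kg, which agrees with the `Emissions` value that run reports.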

> **Note: If CodeCarbon does not have the carbon intensity of electricity for a country but does have its electricity mix, it computes the carbon intensity from that mix using [this](https://mlco2.github.io/codecarbon/methodology.html#id5) table.**
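
The idea behind that fallback is a weighted average: each source's share of the mix is multiplied by an emission factor for that source. The sketch below uses Brazil's mix from the JSON above, but the emission factors are made-up placeholders chosen only for illustration. The real values are in the table linked in the note, and grouping the mix into four sources is a simplification, not CodeCarbon's implementation.

```python
# Electricity generated per source in TWh (from the Brazil entry above).
mix_twh = {"coal": 25.22, "gas": 91.03, "oil": 22.96, "low_carbon": 523.37}

# HYPOTHETICAL emission factors in gCO2/kWh: placeholders for illustration only.
emission_factors = {"coal": 1000.0, "gas": 500.0, "oil": 800.0, "low_carbon": 30.0}

total_twh = sum(mix_twh.values())

# Weighted average: share of each source times its emission factor.
carbon_intensity = sum(
    (twh / total_twh) * emission_factors[source] for source, twh in mix_twh.items()
)

print(f"Total generation: {total_twh:.2f} TWh")
print(f"Illustrative carbon intensity: {carbon_intensity:.1f} gCO2/kWh")
```

Because the factors are placeholders, the printed intensity is not expected to match the `carbon_intensity` value in the JSON.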

Meanwhile, CodeCarbon monitors power usage by tracking the hardware infrastructure: GPUs (using the [`pynvml`](https://pypi.org/project/pynvml/) library), RAM (assuming 3 Watts per 8 GB), and CPUs. The net energy used is the product of power and time, expressed in kWh ($E = \text{Power} \times \text{Time}$), and that is how we get $E$.
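
To make the $E$ term concrete, here is a small sketch of the RAM rule and the power-times-time conversion. It assumes the 3 W per 8 GB ratio quoted above and uses the RAM size and duration that appear in the report near the end of this tutorial; the helper names are hypothetical.

```python
def ram_power_watts(ram_gb: float) -> float:
    """Estimate RAM power draw with the 3 W per 8 GB rule of thumb."""
    return ram_gb * 3 / 8


def energy_kwh(power_watts: float, seconds: float) -> float:
    """Convert a constant power draw over a duration into kWh."""
    hours = seconds / 3600
    return power_watts * hours / 1000


ram_gb = 12.674789428710938  # RAM size from the report at the end of the tutorial
duration_s = 42.597785       # tracked duration from the same report

power = ram_power_watts(ram_gb)
energy = energy_kwh(power, duration_s)

print(f"RAM power: {power:.6f} W")
print(f"RAM energy: {energy:.6f} kWh")
```

Both numbers line up with the RAM power (about 4.75 W) and RAM energy (about 5.6e-05 kWh) listed in that report.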

This is the method behind CodeCarbon. Now, let us see it in action.

CodeCarbon supports both online (with internet access) and offline (without internet access) modes. Let us first see the online mode.

> **Note: In this tutorial, we only cover the explicit tracking objects (`EmissionsTracker` and `OfflineEmissionsTracker`). However, CodeCarbon also supports tracking with decorators and context managers. Check the [documentation](https://mlco2.github.io/codecarbon/usage.html) for a full explanation of available methods.**

To track a process in online mode, we use the `EmissionsTracker` object.


```python
# First, we install `codecarbon` (notebook shell command)
!pip install codecarbon -q

from codecarbon import EmissionsTracker

tracker = EmissionsTracker(
    project_name="your_cool_project_name",  # the name of your project
    log_level="critical",  # critical will make the EmissionsTracker less verbose
    measure_power_secs=15,  # query energy consumption stats every x seconds
    output_dir="./",  # where to output the report
    output_file="emissions.csv",  # the name of your report file
    tracking_mode="machine",  # track the whole machine (`machine`) or an isolated process (`process`)
)

# Let us see the emissions related to counting to a billion in Python
tracker.start()

for _ in range(1, 1000000001):
    pass

tracker.stop()

# You can get stats directly from the `EmissionsTracker`
# For a full list of what you can query, use `dir(tracker)`
# (underscore-prefixed attributes are internal and may change between versions)
print(f'Geo Location: ISO: {tracker._geo.country_iso_code} | Country: {tracker._geo.country_name} | Region : {tracker._geo.region}')
print(f"Emissions: {tracker.final_emissions} | Total Energy Consumption: {tracker._total_energy.kWh}")
```

```text
Geo Location: ISO: USA | Country: United States | Region : iowa
Emissions: 0.00031273682802557214 | Total Energy Consumption: 0.0006909466935680294
```


The results above will vary depending on where you are and what you run on (i.e., the energy mix inferred from your location and the hardware you use). To see the full report produced by the `EmissionsTracker`, you can also check the output file (`"emissions.csv"`), which should be in the specified `output_dir`.

You can also configure CodeCarbon by creating a config file (`.codecarbon.config`) in your working directory. For example, let us create a config file and repeat our experiment; this time, however, let us set the `log_level` to debug so our tracker outputs information to the terminal mid-tracking.

> **Note: Configuration files must be named `.codecarbon.config` and start with a section header `[codecarbon]` as the first line in the file.**

```python
# Create a config file (keys are left unindented to keep the INI syntax unambiguous)
with open('.codecarbon.config', 'w', encoding='utf8') as fp:
    fp.write("""[codecarbon]
project_name = your_cool_project_name
measure_power_secs = 15
save_to_file = true
output_dir = ./
log_level = DEBUG
tracking_mode = machine
output_file = my_emissions.csv
""")

# With no arguments, the tracker picks up its settings from the config file
tracker = EmissionsTracker()

tracker.start()

for _ in range(1, 1000000001):
    pass

tracker.stop()

print(f'Geo Location: ISO: {tracker._geo.country_iso_code} | Country: {tracker._geo.country_name} | Region : {tracker._geo.region}')
print(f"Emissions: {tracker.final_emissions} | Total Energy Consumption: {tracker._total_energy.kWh}")
```

```text
[codecarbon INFO @ 19:55:41] [setup] RAM Tracking...
[codecarbon INFO @ 19:55:41] [setup] GPU Tracking...
[codecarbon INFO @ 19:55:41] No GPU found.
[codecarbon INFO @ 19:55:41] [setup] CPU Tracking...
[codecarbon DEBUG @ 19:55:41] Not using PowerGadget, an exception occurred while instantiating IntelPowerGadget : Platform not supported by Intel Power Gadget
[codecarbon DEBUG @ 19:55:41] Not using the RAPL interface, an exception occurred while instantiating IntelRAPL : Intel RAPL files not found at /sys/class/powercap/intel-rapl on linux
[codecarbon INFO @ 19:55:42] CPU Model on constant consumption mode: Intel(R) Xeon(R) CPU @ 2.20GHz
[codecarbon INFO @ 19:55:42] >>> Tracker's metadata:
[codecarbon INFO @ 19:55:42]   Platform system: Linux-6.1.58+-x86_64-with-glibc2.35
[codecarbon INFO @ 19:55:42]   Python version: 3.10.12
[codecarbon INFO @ 19:55:42]   CodeCarbon version: 2.3.2
[codecarbon INFO @ 19:55:42]   Available RAM : 12.675 GB
[codecarbon INFO @ 19:55:42]   CPU count: 2
...

Geo Location: ISO: USA | Country: United States | Region : iowa
Emissions: 0.00030232393296936275 | Total Energy Consumption: 0.0006679409111823027
```

As you can see, the tracker outputs many stats to your terminal every 15 seconds, and you have a new report named `my_emissions.csv`.

You can specify other parameters to the tracker object as arguments to the `EmissionsTracker` or variables defined in the `.codecarbon.config` file. For a complete list of what is possible, check the [documentation](https://mlco2.github.io/codecarbon/parameters.html#input-parameters).
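
Because the config file uses INI syntax, you can inspect it with Python's standard `configparser` before handing it to a tracker. This is only a sketch for checking the file's contents: it parses an in-memory string with the same keys as the example above and makes no claim about how CodeCarbon itself loads the file.

```python
import configparser

# Same keys as the example config above, kept in a string for a self-contained demo.
config_text = """[codecarbon]
project_name = your_cool_project_name
measure_power_secs = 15
save_to_file = true
output_dir = ./
log_level = DEBUG
tracking_mode = machine
output_file = my_emissions.csv
"""

parser = configparser.ConfigParser()
parser.read_string(config_text)

# The [codecarbon] section header is what groups the settings.
settings = parser["codecarbon"]

# configparser stores raw strings; typed getters convert them on request.
interval = settings.getint("measure_power_secs")
save_to_file = settings.getboolean("save_to_file")

print(dict(settings))
print(interval, save_to_file)
```

To check a file on disk instead, `parser.read(".codecarbon.config")` can replace the `read_string` call.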

Now, let us see how to use the offline mode. The offline tracker, `OfflineEmissionsTracker`, supports restricted environments without internet access. While the internal computations remain unchanged, a `country_iso_code` parameter is required to look up the carbon intensity of the regional electricity used. It must be the [three-letter ISO code](https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes) of the country where the compute infrastructure is hosted. Besides this detail, almost everything else is the same.

```python
from codecarbon import OfflineEmissionsTracker

tracker = OfflineEmissionsTracker(
    country_iso_code="BRA",
    project_name="your_cool_offline_project_name",
    log_level="critical",
    measure_power_secs=15,
    output_dir="./",
    output_file="offline_emissions.csv",
    tracking_mode="machine",
)

tracker.start()

for i in range(1, 1000000001):
    # You can query the tracker inside the loop
    if i % 100_000_000 == 0:
        print(f"Total Energy Consumption at step {i}: {tracker._total_energy.kWh}")

tracker.stop()

print(f'Geo Location: ISO: {tracker._geo.country_iso_code} | Country: {tracker._geo.country_name} | Region : {tracker._geo.region}')
print(f"Emissions: {tracker.final_emissions} | Total Energy Consumption: {tracker._total_energy.kWh}")
```

```text
Total Energy Consumption at step 100000000: 0
Total Energy Consumption at step 200000000: 0.00019722925940767862
Total Energy Consumption at step 300000000: 0.00039411938381308706
Total Energy Consumption at step 400000000: 0.0005910087842068689
Total Energy Consumption at step 500000000: 0.0007880947833134638
Total Energy Consumption at step 600000000: 0.0009849097121915245
Total Energy Consumption at step 700000000: 0.0011820762289086986
Total Energy Consumption at step 800000000: 0.0013789584531849017
Total Energy Consumption at step 900000000: 0.0015759158381346337
Total Energy Consumption at step 1000000000: 0.0017727407270066746
Geo Location: ISO: BRA | Country: Brazil | Region : None
Emissions: 0.0002846856903989168 | Total Energy Consumption: 0.0017950822891376413
```
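
The readings printed inside the loop are cumulative, so the energy spent in each block of 100 million iterations is the difference between consecutive readings. A minimal sketch using the values from the output above (the first reading is 0, presumably because no measurement had been taken yet):

```python
# Cumulative energy (kWh) printed every 100M steps in the run above.
cumulative_kwh = [
    0,
    0.00019722925940767862,
    0.00039411938381308706,
    0.0005910087842068689,
    0.0007880947833134638,
    0.0009849097121915245,
    0.0011820762289086986,
    0.0013789584531849017,
    0.0015759158381346337,
    0.0017727407270066746,
]

# Energy per interval is the difference between consecutive cumulative readings.
deltas_kwh = [
    later - earlier
    for earlier, later in zip(cumulative_kwh, cumulative_kwh[1:])
]

for step, delta in enumerate(deltas_kwh, start=2):
    print(f"Interval ending at step {step * 100_000_000:,}: {delta:.6f} kWh")
```

Each interval comes out at roughly 0.000197 kWh, which is consistent with a steady workload.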


You can also save stats mid-tracking by calling `tracker.flush()`, which writes a line with all tracked stats to your output file. This gives you more detailed reports. The cell below repeats the previous experiment with the print statement replaced by `tracker.flush()`; check your output file as it is updated.

```python
from codecarbon import OfflineEmissionsTracker

tracker = OfflineEmissionsTracker(
    country_iso_code="BRA",
    project_name="your_cool_offline_project_name",
    log_level="critical",
    measure_power_secs=15,
    output_dir="./",
    output_file="offline_emissions_with_flush.csv",
    tracking_mode="machine",
)

tracker.start()

for i in range(1, 1000000001):
    # Flush the tracker every 100M steps
    if i % 100_000_000 == 0:
        tracker.flush()

tracker.stop()

print(f'Geo Location: ISO: {tracker._geo.country_iso_code} | Country: {tracker._geo.country_name} | Region : {tracker._geo.region}')
print(f"Emissions: {tracker.final_emissions} | Total Energy Consumption: {tracker._total_energy.kWh}")
```

```text
Geo Location: ISO: BRA | Country: Brazil | Region : None
Emissions: 0.0002989726077979654 | Total Energy Consumption: 0.0018851682795977438
```


Now, let us track the training of an ML model on the GPU. Below, we use what we have learned to track the training of a CNN on the MNIST digit dataset. As before, we pass the desired arguments to our tracker and wrap the computationally expensive part of our process (the `model.fit()` call in this case) between `tracker.start()` and `tracker.stop()`.

> **Note: If you can't access a GPU, try running this tutorial on [Colab](https://colab.research.google.com/drive/1oZLM3uAHdqdbyVCq67CxHCzWK_ldqMEX).**

```python
import tensorflow as tf
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(
    project_name="conv2d-emissions",
    log_level="critical",
    output_dir="./",
    output_file="emissions_CNN_MNIST.csv",
)

# Load MNIST and split it
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize the pixel values to [0, 1]
x_train, x_test = x_train / 255.0, x_test / 255.0

# Add a channel dimension: (samples, 28, 28, 1)
train_images = x_train.reshape(x_train.shape[0], 28, 28, 1)
test_images = x_test.reshape(x_test.shape[0], 28, 28, 1)

# One-hot encode the labels
train_labels = tf.keras.utils.to_categorical(y_train)
test_labels = tf.keras.utils.to_categorical(y_test)

# Create a CNN via the Sequential API
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(20, (5, 5), padding='same', activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
    tf.keras.layers.Conv2D(50, (5, 5), padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2), strides=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(500, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile the model with Adam
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=opt,
              loss='categorical_crossentropy',
              metrics=['accuracy'])

print("TensorFlow version:", tf.__version__)
print("Eager mode: ", tf.executing_eagerly())
print("GPU is", "available" if tf.config.list_physical_devices('GPU') else "NOT AVAILABLE")
model.summary()

# Train and track!
tracker.start()

history = model.fit(train_images, train_labels, epochs=10,
                    batch_size=256, verbose=1)

tracker.stop()

# Evaluate the model
test_loss_score, test_acc_score = model.evaluate(test_images, test_labels)
print(f'Final Loss: {test_loss_score:.2f}.')
print(f'Final Performance: {test_acc_score * 100:.2f} %.')
```

```text
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
TensorFlow version: 2.15.0
Eager mode:  True
GPU is available
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 28, 28, 20)        520       
                                                                 
 max_pooling2d (MaxPooling2  (None, 14, 14, 20)        0         
 D)                                                              
...
```

One thing we can do now is generate a report from our `emissions_CNN_MNIST.csv` file.


```python
import pandas as pd
from IPython.display import display, Markdown

df = pd.read_csv("./emissions_CNN_MNIST.csv")

# Number of decimal places shown for numeric columns
precision = 6

# Build a Markdown table from the first row of the emissions file
report = f'''# Carbon Emission Report

|**Duration (Seconds)**|**Emission (kgCO2)**|**Emission Rate (kgCO2/s)**|**CPU Power (Watts)**|
|--------------------------------|-------------------------------------|------------------------------------|--------------------------------|
| {round(df.duration[0], precision)}|{round(df.emissions[0], precision)}|{round(df.emissions_rate[0], precision)}|{round(df.cpu_power[0], precision)}|
|**GPU Power (Watts)**|**RAM Power (Watts)**|**Energy Consumption (CPU - kWh)**|**Energy Consumption (GPU - kWh)**|
|{round(df.gpu_power[0], precision)}| {round(df.ram_power[0], precision)}|{round(df.cpu_energy[0], precision)}|{round(df.gpu_energy[0], precision)}|
|**Energy Consumption (RAM - kWh)**|**Total Consumption (kWh)**|**Country**| **ISO**|
|{round(df.ram_energy[0], precision)}|{round(df.energy_consumed[0], precision)}|{df.country_name[0]}|{df.country_iso_code[0]}|
|**Region**| **Cloud Provider**| **Provider's Region**|**OS**|
|{df.region[0]}| {df.cloud_provider[0]}| {df.cloud_region[0]}|{df.os[0]}|
|**Python Version**| **No. of Processors**|**Provider's CPU Model**| **No. of GPUs**|
|{df.python_version[0]}|{df.cpu_count[0]}|{df.cpu_model[0]}|{df.gpu_count[0]}|
|**GPU Model**|**RAM Memory Size (GB)**| **Tracking Mode**|**Cloud-Processed**|
|{df.gpu_model[0]}| {df.ram_total_size[0]}|{df.tracking_mode[0]}| {df.on_cloud[0]}|

'''
display(Markdown(report))
```

# Carbon Emission Report

|**Duration (Seconds)**|**Emission (kgCO2)**|**Emission Rate (kgCO2/s)**|**CPU Power (Watts)**|
|--------------------------------|-------------------------------------|------------------------------------|--------------------------------|
| 42.597785|0.000391|9e-06|42.5|
|**GPU Power (Watts)**|**RAM Power (Watts)**|**Energy Consumption (CPU - kWh)**|**Energy Consumption (GPU - kWh)**|
|30.174313| 4.753046|0.000503|0.000562|
|**Energy Consumption (RAM - kWh)**|**Total Consumption (kWh)**|**Country**| **ISO**|
|5.6e-05|0.001121|United States|USA|
|**Region**| **Cloud Provider**| **Provider's Region**|**OS**|
|nevada| nan| nan|Linux-6.1.58+-x86_64-with-glibc2.35|
|**Python Version**| **No. of Processors**|**Provider's CPU Model**| **No. of GPUs**|
|3.10.12|2|Intel(R) Xeon(R) CPU @ 2.00GHz|1|
|**GPU Model**|**RAM Memory Size (GB)**| **Tracking Mode**|**Cloud-Processed**|
|1 x Tesla T4| 12.674789428710938|machine| N|
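
A quick way to sanity-check a report like this is to recompute the derived columns from the others. The sketch below hard-codes the rounded values shown above (the variable names are illustrative): the three per-component energy values should add up to the total, and emissions divided by duration should reproduce the emission rate, which suggests that the rate is expressed in kg per second.

```python
# Rounded values copied from the report above.
duration_s = 42.597785
emissions_kg = 0.000391
cpu_energy_kwh = 0.000503
gpu_energy_kwh = 0.000562
ram_energy_kwh = 0.000056
total_energy_kwh = 0.001121

# The per-component energies should sum to the reported total.
component_sum = cpu_energy_kwh + gpu_energy_kwh + ram_energy_kwh

# Emissions per second of tracked time.
rate_kg_per_s = emissions_kg / duration_s

print(f"Sum of components: {component_sum:.6f} kWh (reported total: {total_energy_kwh} kWh)")
print(f"Emissions / duration: {rate_kg_per_s:.6f} kg/s")
```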



If you have CodeCarbon installed on your machine (_this won't work on Colab_), you can generate a `Dash` app to analyze the results of your experiments. The dashboard provides visualizations that help you understand the emissions logged from your experiments across projects. To launch the app, run the following CLI command:

```bash
carbonboard --filepath="experiments/emissions.csv" --port=3333
```

- **filepath**: the path to the CSV file containing logged information.

- **port**: an optional port number, e.g., 8050.

With CodeCarbon, you have a simple and straightforward way to keep track of your energy consumption and carbon footprint. Other options for $CO_2$ emission tracking also exist, such as [Eco2AI](https://github.com/sb-ai-lab/Eco2AI), which has a similar interface and user experience to CodeCarbon.

---

Return to the [castle](https://github.com/Nkluge-correa/TeenyTinyCastle).
