# How to compute CO2 emissions

Knowing the estimated CO2 emissions associated with running our applications is a crucial step toward understanding our environmental impact. This notebook shows how to obtain that estimate with the [CodeCarbon](https://mlco2.github.io/codecarbon/index.html) library.

This notebook was developed by the [SINAI](https://sinai.ujaen.es) research group for use in the MentalRiskES evaluation campaign at IberLEF 2023.



## Install CodeCarbon package

In [1]:
!pip install codecarbon

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting codecarbon
  Downloading codecarbon-2.1.4-py3-none-any.whl (174 kB)
Collecting arrow
  Downloading arrow-1.2.3-py3-none-any.whl (66 kB)
Collecting pynvml
  Downloading pynvml-11.5.0-py3-none-any.whl (53 kB)
Collecting py-cpuinfo
  Downloading py_cpuinfo-9.0.0-py3-none-any.whl (22 kB)
Collecting fuzzywuzzy
  Downloading fuzzywuzzy-0.18.0-py2.py3-none-any.whl (18 kB)
Installing collected packages: py-cpuinfo, fuzzywuzzy, pynvml, arrow, codecarbon
Successfully installed arrow-1.2.3 codecarbon-2.1.4 fuzzywuzzy-0.18.0 py-c

## Estimate different impact metrics

There are several ways to track a program's emissions. Here we use the explicit `start()` and `stop()` approach.

In [2]:
from codecarbon import EmissionsTracker

config = {
    "save_to_file": True,
    "log_level": "DEBUG",
    "tracking_mode": "process",
    "output_dir": ".",
}

tracker = EmissionsTracker(**config)

[codecarbon INFO @ 11:26:30] [setup] RAM Tracking...
[codecarbon INFO @ 11:26:30] [setup] GPU Tracking...
[codecarbon INFO @ 11:26:30] No GPU found.
[codecarbon INFO @ 11:26:30] [setup] CPU Tracking...
[codecarbon DEBUG @ 11:26:30] Not using PowerGadget, an exception occurred while instantiating IntelPowerGadget : Platform not supported by Intel Power Gadget
[codecarbon DEBUG @ 11:26:30] Not using the RAPL interface, an exception occurred while instantiating IntelRAPL : Intel RAPL files not found at /sys/class/powercap/intel-rapl on linux
[codecarbon DEBUG @ 11:26:32] CPU : We detect a AMD EPYC 7B12 with a TDP of 240 W
[codecarbon INFO @ 11:26:32] CPU Model on constant consumption mode: AMD EPYC 7B12
[codecarbon INFO @ 11:26:32] >>> Tracker's metadata:
[codecarbon INFO @ 11:26:32]   Platform system: Linux-5.10.147+-x86_64-with-glibc2.29
[codecarbon INFO @ 11:26:32]   Python version: 3.8.10
[codecarbon INFO @ 11:26:32]   Available RAM : 12.681 GB
[codecarbon INFO @ 11:26:32]   CPU count

Next, we compute the CO2 cost of translating some text into Spanish with a deep learning model.

In [3]:
!pip install transformers sentencepiece

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting transformers
  Downloading transformers-4.26.1-py3-none-any.whl (6.3 MB)
Collecting sentencepiece
  Downloading sentencepiece-0.1.97-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
Collecting huggingface-hub<1.0,>=0.11.0
  Downloading huggingface_hub-0.12.1-py3-none-any.whl (190 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.2-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB)

In [4]:
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline, set_seed
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-es")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-es")
translator = pipeline('translation', tokenizer=tokenizer, model=model)
set_seed(42)

Downloading (…)okenizer_config.json:   0%|          | 0.00/44.0 [00:00<?, ?B/s]

Downloading (…)lve/main/config.json:   0%|          | 0.00/1.47k [00:00<?, ?B/s]

Downloading (…)olve/main/source.spm:   0%|          | 0.00/802k [00:00<?, ?B/s]

Downloading (…)olve/main/target.spm:   0%|          | 0.00/826k [00:00<?, ?B/s]

Downloading (…)olve/main/vocab.json:   0%|          | 0.00/1.59M [00:00<?, ?B/s]



Downloading (…)"pytorch_model.bin";:   0%|          | 0.00/312M [00:00<?, ?B/s]

Downloading (…)neration_config.json:   0%|          | 0.00/293 [00:00<?, ?B/s]

In [5]:
tracker.start()
translation = translator("After the breakfast, we will go for lunch...", num_return_sequences=2)
emissions = tracker.stop()

[codecarbon INFO @ 11:27:08] Energy consumed for RAM : 0.000000 kWh. RAM Power : 0.4725193977355957 W
[codecarbon DEBUG @ 11:27:08] RAM : 0.47 W during 1.12 s [measurement time: 0.0020]
[codecarbon INFO @ 11:27:08] Energy consumed for all CPUs : 0.000037 kWh. All CPUs Power : 120.0 W
[codecarbon DEBUG @ 11:27:08] CPU : 120.00 W during 1.12 s [measurement time: 0.0020]
[codecarbon INFO @ 11:27:08] 0.000038 kWh of electricity used since the begining.
[codecarbon DEBUG @ 11:27:08] last_duration=1.124413251876831
------------------------
[codecarbon DEBUG @ 11:27:08] EmissionsData(timestamp='2023-02-17T11:27:08', project_name='codecarbon', run_id='3e1366d3-65dd-443f-bf07-821425a7c8a5', duration=1.130903720855713, emissions=1.7030902796819308e-05, emissions_rate=0.015059551474401956, cpu_power=120.0, gpu_power=0.0, ram_power=0.4725193977355957, cpu_energy=3.74804417292277e-05, gpu_energy=0, ram_energy=1.4686954790334993e-07, energy_consumed=3.762731127713105e-05, country_name='United States
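
One caveat of the explicit `start()`/`stop()` approach is that `stop()` is skipped if the tracked code raises an exception. The following is a minimal sketch of a guard, assuming only that the tracker exposes the `start()` and `stop()` methods used above. `run_tracked` is a hypothetical helper name, not part of CodeCarbon.

```python
def run_tracked(tracker, fn, *args, **kwargs):
    """Run fn(*args, **kwargs) between tracker.start() and tracker.stop().

    Hypothetical helper: `tracker` is any object with start() and stop(),
    such as the EmissionsTracker created earlier. Returns (result, emissions).
    If fn raises, stop() is still called and the exception propagates.
    """
    tracker.start()
    try:
        result = fn(*args, **kwargs)
    finally:
        # Runs whether fn returned or raised, so tracking always ends.
        emissions = tracker.stop()
    return result, emissions
```

With the objects defined in this notebook, the call would look like `translation, emissions = run_tracked(tracker, translator, "After the breakfast, we will go for lunch...", num_return_sequences=2)`.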

In [6]:
print(translation)
print(emissions)

[{'translation_text': 'Después del desayuno, iremos a almorzar...'}, {'translation_text': 'Después del desayuno, iremos a comer...'}]
1.7030902796819308e-05
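
The figures logged above can be cross-checked by hand. The sketch below copies the values from the log output of this run and checks three relations: CPU energy is power times duration, total energy is the sum of the per-component energies, and the implied carbon intensity is emissions divided by energy. The variable names are ours, and the reading of emissions as kg of CO2-equivalent follows the CodeCarbon output documentation. Note that 120 W is half of the 240 W TDP reported at setup, which appears to be how the constant consumption mode estimates CPU power; treat that interpretation as an assumption.

```python
# Values transcribed from the log output of the run above.
cpu_power_w = 120.0                      # "All CPUs Power : 120.0 W"
last_duration_s = 1.124413251876831      # "last_duration" in the DEBUG log
cpu_energy_kwh = 3.74804417292277e-05
gpu_energy_kwh = 0.0
ram_energy_kwh = 1.4686954790334993e-07
energy_consumed_kwh = 3.762731127713105e-05
emissions_kg = 1.7030902796819308e-05    # value returned by tracker.stop()

JOULES_PER_KWH = 3.6e6

# Energy = power x time, converted from joules (W*s) to kWh.
cpu_energy_check = cpu_power_w * last_duration_s / JOULES_PER_KWH

# Total energy should be the sum of the per-component energies.
total_check = cpu_energy_kwh + gpu_energy_kwh + ram_energy_kwh

# Implied carbon intensity of the electricity (kg CO2eq per kWh).
carbon_intensity = emissions_kg / energy_consumed_kwh

print(f"CPU energy from power x time: {cpu_energy_check:.6e} kWh")
print(f"Sum of component energies:    {total_check:.6e} kWh")
print(f"Implied carbon intensity:     {carbon_intensity:.4f} kg CO2eq/kWh")
```

The CPU energy is reproduced from `last_duration` rather than from the slightly longer `duration` field, which suggests the two are measured at different points in the run.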


## Data to be submitted

The collected measurements are described in the [CodeCarbon output documentation](https://mlco2.github.io/codecarbon/output.html).

MentalRiskES requires the following information to be added to every submission under an "impact" entry in the JSON object.

In [7]:
relevant_cols = [
    "duration", "emissions", "cpu_energy", "gpu_energy", "ram_energy", 
    "energy_consumed", "cpu_count", "gpu_count", "cpu_model", "gpu_model", 
    "ram_total_size"
]

To get this information from the last entry in the logged CSV file, we can load the file into a Pandas dataframe and convert the last row to a dictionary.


In [9]:
import pandas as pd

df = pd.read_csv("emissions.csv")
measurements = df.iloc[-1][relevant_cols].to_dict()

measurements # This is what your team must send in the POST request

{'duration': 1.130903720855713,
 'emissions': 1.7030902796819308e-05,
 'cpu_energy': 3.74804417292277e-05,
 'gpu_energy': 0,
 'ram_energy': 1.4686954790334993e-07,
 'energy_consumed': 3.762731127713105e-05,
 'cpu_count': 2,
 'gpu_count': nan,
 'cpu_model': 'AMD EPYC 7B12',
 'gpu_model': nan,
 'ram_total_size': 12.681198120117188}
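
Note that `gpu_count` and `gpu_model` come back as `nan` when no GPU is present. By default, Python's `json` module serializes that value as the bare token `NaN`, which is not valid JSON. The sketch below assumes the submission is serialized with the standard `json` module. It replaces NaN values with `None` (JSON `null`) and nests the result under the "impact" key. The `to_json_safe` helper is hypothetical, and only the "impact" key comes from the text above; check the MentalRiskES submission guidelines for the rest of the schema. The dictionary literal repeats the values shown above so that the block runs without `emissions.csv`.

```python
import json
import math

# Values copied from the output above; in the notebook, use `measurements` directly.
measurements = {
    "duration": 1.130903720855713,
    "emissions": 1.7030902796819308e-05,
    "cpu_energy": 3.74804417292277e-05,
    "gpu_energy": 0,
    "ram_energy": 1.4686954790334993e-07,
    "energy_consumed": 3.762731127713105e-05,
    "cpu_count": 2,
    "gpu_count": float("nan"),
    "cpu_model": "AMD EPYC 7B12",
    "gpu_model": float("nan"),
    "ram_total_size": 12.681198120117188,
}


def to_json_safe(record):
    """Hypothetical helper: map NaN floats to None so they serialize as JSON null."""
    return {
        key: None if isinstance(value, float) and math.isnan(value) else value
        for key, value in record.items()
    }


# Only the "impact" key comes from the text; other fields of the
# submission object are omitted here.
payload = {"impact": to_json_safe(measurements)}

# allow_nan=False makes json.dumps raise instead of silently emitting NaN.
body = json.dumps(payload, allow_nan=False)
print(body)
```

Depending on the pandas version, the values in `measurements` may be NumPy scalars rather than native Python types. NumPy integers in particular are not accepted by `json.dumps` and may need converting first, for example with `.item()`.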