In [49]:
import sys

sys.path.insert(0, "../")

In [50]:
from src.tagger import HiddenMarkovModel as LocalHMM

from src.scrapper import parse_conllu_file
from src.visualization import compare_size_and_time

import pandas as pd

In this notebook we analyse the performance of our algorithm in terms of speed and carbon footprint.

-> We recommend not re-running the cells below. Timings vary slightly with the machine the code runs on, so the exact figures quoted in the text may no longer match, although the magnitudes and proportions between experiments should be preserved. You are welcome to experiment in new code cells instead!


# Text processing

In [52]:
dataset = parse_conllu_file(filepath="../datasets/ca_ancora-ud-train.conllu")
test_dataset = parse_conllu_file(
    filepath="../datasets/en_gum-ud-train.conllu"
)  # we are aware that the two datasets are in different languages (Catalan and English)

s_dataset = dataset[: int(len(dataset) / 4)]
m_dataset = dataset[: int(len(dataset) / 2)]
l_dataset = dataset
xl_dataset = dataset + test_dataset

datasets = {"S": s_dataset, "M": m_dataset, "L": l_dataset, "XL": xl_dataset}

for name, data in datasets.items():
    print(f"Length size {name} = {len(data)}")

Length size S = 3280
Length size M = 6561
Length size L = 13123
Length size XL = 21671


# Training Analysis

From now on, we will refer to:
* The models trained using our implementation as the `local implementation`
* The models trained with the different sizes of data as `S`, `M`, `L`, `XL`.

In [61]:
# local
%timeit LocalHMM(s_dataset).train()
%timeit LocalHMM(m_dataset).train()
%timeit LocalHMM(l_dataset).train()
%timeit LocalHMM(xl_dataset).train()

56.4 ms ± 247 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
106 ms ± 147 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
211 ms ± 4.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
305 ms ± 8.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [33]:
# Mean %timeit results in seconds for S, M, L, XL (see the output above)
local_training_timings = [56.4e-3, 106e-3, 211e-3, 305e-3]
compare_size_and_time(datasets, local_training_timings, title="Training")

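* Training time appears to grow roughly linearly with the number of training sentences: each doubling of the data from S to M to L roughly doubles the time (56.4 ms, 106 ms, 211 ms).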
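
One quick way to check the linearity claim is to look at the time spent per sentence, which should stay roughly constant if training is linear in the dataset size. The sketch below reuses the sentence counts and the mean `%timeit` results printed above; the variable names are illustrative and not part of `src`.

```python
# Sentence counts and mean training times (seconds) copied from the outputs above.
sizes = {"S": 3280, "M": 6561, "L": 13123, "XL": 21671}
train_seconds = {"S": 56.4e-3, "M": 106e-3, "L": 211e-3, "XL": 305e-3}

# Microseconds spent per training sentence for each dataset size.
per_sentence_us = {
    name: train_seconds[name] / sizes[name] * 1e6 for name in sizes
}

for name, micros in per_sentence_us.items():
    print(f"{name}: {micros:.1f} µs per sentence")
```

The four values land between roughly 14 and 17 µs per sentence, which is consistent with approximately linear growth over this range.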

# Predict Analysis

For this section, we have opted for the L-sized model, as it is the model used for our metrics and predictions.

In [31]:
local_l_model = LocalHMM(l_dataset).train()

In [15]:
%timeit local_l_model.predict(s_dataset)
%timeit local_l_model.predict(m_dataset)
%timeit local_l_model.predict(l_dataset)
%timeit local_l_model.predict(xl_dataset)

674 ms ± 3.44 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.25 s ± 15.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.46 s ± 24.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
3.22 s ± 42.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [35]:
local_test_timings = [674 * 10**-3, 1.25, 2.46, 3.22]
compare_size_and_time(datasets, local_test_timings, "Prediction")

* Prediction time also appears to grow roughly linearly with the number of sentences to tag.
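
The linearity of prediction time can be quantified by estimating a scaling exponent: if time behaves like `t ≈ c · n^k`, the slope of `log t` against `log n` gives `k`, and a value of `k` close to 1 means linear growth. The following sketch fits that slope by ordinary least squares using only the standard library, with the sizes and prediction timings printed above; `loglog_slope` is a hypothetical helper, not part of `src`.

```python
import math


def loglog_slope(sizes, seconds):
    """Least-squares slope of log(seconds) against log(sizes)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in seconds]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


sizes = [3280, 6561, 13123, 21671]           # sentences in S, M, L, XL
predict_seconds = [0.674, 1.25, 2.46, 3.22]  # mean %timeit results above

exponent = loglog_slope(sizes, predict_seconds)
print(f"Estimated scaling exponent: {exponent:.2f}")
```

An exponent near 1 supports the linear reading. Keep in mind that the XL set adds English sentences, whose lengths may differ from the Catalan ones, so the last point is not strictly comparable with the other three.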

# Some other thoughts

**Why is the implementation done with dictionaries?**
* The parameters we train (transition matrix, emission matrix and initial state distribution) are very sparse and would occupy a considerable amount of space if stored as dense matrices. With dictionaries, specifically default dictionaries, we store only the entries observed in training and fall back to a predetermined default value for everything else, so no sparse-matrix machinery is needed.
* The cost is somewhat reduced code readability, as matrix multiplication is more intuitive than iterating through dictionary keys and values.
* As future work, it would be worth comparing the dictionary version of the algorithm against an implementation based on NumPy arrays.
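
As an illustration of the idea, the sketch below stores transition counts in nested default dictionaries and turns them into log probabilities with a fallback for unseen tag pairs. It is not the code in `src.tagger`; the toy tag sequences, the `UNSEEN_LOGPROB` constant and the variable names are assumptions made for this example.

```python
import math
from collections import defaultdict

# Hypothetical default log probability for tag pairs never seen in training.
UNSEEN_LOGPROB = math.log(1e-10)

# Toy tag sequences standing in for a parsed corpus.
tagged_sentences = [
    ["DET", "NOUN", "VERB"],
    ["DET", "ADJ", "NOUN"],
]

# Count transitions; only observed (previous, current) pairs take up memory.
transition_counts = defaultdict(lambda: defaultdict(int))
for tags in tagged_sentences:
    for previous, current in zip(tags, tags[1:]):
        transition_counts[previous][current] += 1

# Normalise counts into log probabilities, one sparse row per previous tag.
transition_logprobs = {}
for previous, row in transition_counts.items():
    total = sum(row.values())
    transition_logprobs[previous] = defaultdict(
        lambda: UNSEEN_LOGPROB,
        {current: math.log(count / total) for current, count in row.items()},
    )

print(transition_logprobs["DET"]["NOUN"])  # observed pair: log(1/2)
print(transition_logprobs["DET"]["VERB"])  # unseen pair: falls back to the default
```

One design detail to be aware of: indexing a `defaultdict` with a missing key inserts that key, so a long prediction run slowly fills the rows with default entries. Using `row.get(tag, UNSEEN_LOGPROB)` on a plain dictionary avoids that side effect.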

**Why are log probabilities used instead of probabilities?**

* With log probabilities, the products needed for the Viterbi matrix and for the probability of each step become sums.
* Sums of logs are numerically safer: multiplying many probabilities below 1 quickly underflows to zero in floating point, whereas their logs stay in a representable range. Addition is also at least as cheap as multiplication.
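
A small experiment shows one practical reason to work in log space: numerical underflow. Here 200 made-up step probabilities of 0.01 are combined once by multiplication and once by summing their logs; the numbers are invented for illustration and do not come from our model.

```python
import math

# Hypothetical per-step probabilities along a long tag path.
step_probabilities = [0.01] * 200

# Multiplying directly: the true value (1e-400) is below the smallest
# positive double, so the running product collapses to 0.0.
product = 1.0
for p in step_probabilities:
    product *= p

# Summing logs keeps the same information in a comfortable range.
log_score = sum(math.log(p) for p in step_probabilities)

print(product)    # 0.0
print(log_score)  # about -921.03
```

Once two candidate paths both score 0.0 they can no longer be ranked, whereas their log scores remain comparable.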


# Carbon Footprint Analysis

Our approach is straightforward and resource-efficient, requiring minimal resources for training and prediction. Although the model is of limited complexity, it is still worth investigating the carbon footprint it could generate, if only out of curiosity.

In this section, we use a library (`codecarbon`) that tracks the emissions generated by a function. Since it is exposed as a decorator and we do not want to touch our source code, we have created a wrapper function that simply calls whatever callable we pass to it with the corresponding positional and keyword arguments.

This enables us to evaluate our carbon footprint without impacting the original program.

⚠️ Note that the conclusions drawn in this section can be misleading if the code is re-run on another computer or with other data, so they should be taken as indicative and subject to change ⚠️

In [36]:
from codecarbon import track_emissions

In [37]:
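from typing import Callable


@track_emissions()
def compute_emissions(function_to_track: Callable, *args, **kwargs):
    # Thin wrapper: lets the decorator track any callable without editing src/
    return function_to_track(*args, **kwargs)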

In [38]:
initialized_class = LocalHMM(l_dataset)
model = compute_emissions(initialized_class.train)

[codecarbon INFO @ 14:51:22] [setup] RAM Tracking...
[codecarbon INFO @ 14:51:22] [setup] GPU Tracking...
[codecarbon INFO @ 14:51:22] No GPU found.
[codecarbon INFO @ 14:51:22] [setup] CPU Tracking...
[codecarbon INFO @ 14:51:22] CPU Model on constant consumption mode: Apple M1
[codecarbon INFO @ 14:51:22] >>> Tracker's metadata:
[codecarbon INFO @ 14:51:22]   Platform system: macOS-13.5.2-arm64-arm-64bit
[codecarbon INFO @ 14:51:22]   Python version: 3.11.6
[codecarbon INFO @ 14:51:22]   CodeCarbon version: 2.3.1
[codecarbon INFO @ 14:51:22]   Available RAM : 16.000 GB
[codecarbon INFO @ 14:51:22]   CPU count: 8
[codecarbon INFO @ 14:51:22]   CPU model: Apple M1
[codecarbon INFO @ 14:51:22]   GPU count: None
[codecarbon INFO @ 14:51:22]   GPU model: None
[codecarbon INFO @ 14:51:25] 
Graceful stopping: collecting and writing information.
Please wait a few seconds...
[codecarbon INFO @ 14:51:25] Energy consumed for RAM : 0.000000 kWh. RAM Power : 6.0 W
[codecarbon INFO @ 14:51:25] Ene

In [39]:
_ = compute_emissions(model.predict, l_dataset)

[codecarbon INFO @ 14:51:52] [setup] RAM Tracking...
[codecarbon INFO @ 14:51:52] [setup] GPU Tracking...
[codecarbon INFO @ 14:51:52] No GPU found.
[codecarbon INFO @ 14:51:52] [setup] CPU Tracking...
[codecarbon INFO @ 14:51:52] CPU Model on constant consumption mode: Apple M1
[codecarbon INFO @ 14:51:52] >>> Tracker's metadata:
[codecarbon INFO @ 14:51:52]   Platform system: macOS-13.5.2-arm64-arm-64bit
[codecarbon INFO @ 14:51:52]   Python version: 3.11.6
[codecarbon INFO @ 14:51:52]   CodeCarbon version: 2.3.1
[codecarbon INFO @ 14:51:52]   Available RAM : 16.000 GB
[codecarbon INFO @ 14:51:52]   CPU count: 8
[codecarbon INFO @ 14:51:52]   CPU model: Apple M1
[codecarbon INFO @ 14:51:52]   GPU count: None
[codecarbon INFO @ 14:51:52]   GPU model: None
[codecarbon INFO @ 14:51:57] 
Graceful stopping: collecting and writing information.
Please wait a few seconds...
[codecarbon INFO @ 14:51:57] Energy consumed for RAM : 0.000004 kWh. RAM Power : 6.0 W
[codecarbon INFO @ 14:51:57] Ene

When running this model on an Apple M1 CPU without a GPU (none is required anyway):

* As can be seen, our model's energy consumption on the datasets we have used is mostly negligible.
* Training:
    * Energy consumed for RAM : 0.000000 kWh. RAM Power : 6.0 W
    * Energy consumed for all CPUs : 0.000000 kWh. Total CPU Power : 5.0 W
* Predict:
    * Energy consumed for RAM : 0.000004 kWh. RAM Power : 6.0 W
    * Energy consumed for all CPUs : 0.000003 kWh. Total CPU Power : 5.0 W
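* Even so, prediction is clearly more expensive than training: the training run registers as 0.000000 kWh at the logged precision, while prediction consumes about 0.000007 kWh in total (RAM plus CPU).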

The library also saves the results to a CSV file:

In [55]:
emissions = pd.read_csv("../evaluation/emissions.csv").T
emissions

Unnamed: 0,0,1
timestamp,2023-11-05T14:51:25,2023-11-05T14:51:57
project_name,codecarbon,codecarbon
run_id,694abec2-d4cf-4a36-96f2-31b023ccfbb0,9ebcc48a-03e9-4921-9065-a5f2743ed5a9
duration,0.229971,2.458836
emissions,0.0,0.000001
emissions_rate,0.000001,0.000001
cpu_power,5.0,5.0
gpu_power,0.0,0.0
ram_power,6.0,6.0
cpu_energy,0.0,0.000003


# Conclusions

* Training and prediction time both grow roughly linearly with the amount of data. This should still be verified on larger datasets before extrapolating.
* Prediction (in batch) takes longer than training: about 2.46 s versus 211 ms on the L dataset.
* Our model uses few computational resources, so its carbon footprint is small.
* For future work, it would be interesting to compare our current dictionary-based implementation with external libraries and with the same implementation using matrices.