# Journal 2022-09-06
Reading papers and books on AI and ML ethics, taking notes.  Import the usual Python data science packages plus NetworkX and SymPy in case I want to do any diagrams or system modelling.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import numpy as np
import pandas as pd
import networkx as nx
import sympy as sp
import matplotlib.pyplot as plt
import seaborn as sns

# Timnit Gebru papers

## [On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜](https://dl.acm.org/doi/10.1145/3442188.3445922)


### Abstract
For language models
* how big is too big?
* possible risks? 
* possible risk mitigations?
  * consider financial costs and environmental impact of training
  * invest in documenting and curating datasets (MMcD: see [Datasheets for Datasets](https://arxiv.org/abs/1803.09010)? _later_ mentioned in Conclusion section)
  * pre-development of new/bigger LMs evaluate how planned approach
    * fits into R&D goals
    * supports stakeholder values
    
### 1 Introduction
* Environmental impact
* "documentation debt"
* using training data with biases or abusive language leads to automated generation of this material and risks of harms $\rightarrow$ racist, sexist, ableist, extremist, other (see sec 6)


### 2 Background
* Overview of different language models.
* history: n-gram $\rightarrow$ word embedding $\rightarrow$ transformer 
  * word embedding approach reduced need for downstream training data (ELMo - LSTM model)
  * transformers: performance keeps improving with more data and bigger models
* change in type of model maps to change in type of task they're used for:
  * n-gram: selecting outputs of acoustical or translation models
  * LSTM-derived word vectors: NLP labelling and classification tasks
  * transformer: few-shot retraining for summarization, question answering etc
* multilingual models such as mBERT better than mono-lingual models for Named Entity Recognition (NER), Part-of-Speech (POS) tagging, dependency parsing
* mono-lingual BERT does outperform on 29 other tasks (ref 95) [What the [MASK]? Making Sense of Language-Specific BERT Models](https://arxiv.org/abs/2003.02912)
* "\[even mBert\] does not address the inclusion problems raised by (ref 65), who note that over 90% of the world's languages used by more than a billion people currently have little to no support in terms of language technology" [The State and Fate of Linguistic Diversity and Inclusion in the NLP World](https://arxiv.org/abs/2004.09095)
  * (MMcD: compare language list with Meta [No Language Left Behind](https://ai.facebook.com/research/no-language-left-behind/) project?)
* Model size reduction techniques: knowledge distillation, quantization, factorized embedding parameterization, progressive module replacement

### 3 Environmental and Financial Cost
* (ref 129) [Energy and Policy Considerations for Deep Learning in NLP](https://arxiv.org/abs/1906.02243)
  * advice: report training time and sensitivity to hyperparameters when the released model is meant to be retrained for downstream use
* Average human: 5t CO2e per year (CO2e = carbon dioxide equivalent)
* Training a large Transformer ([Attention is All You Need](https://arxiv.org/abs/1706.03762)) with neural architecture search: 284t CO2e
  * (MMcD: how is this affected by technology choice i.e. GPU vs TPU vs IPU (vs ASIC?) ?  _later_ ref 129 sec 4.1 discusses, TPU more efficient than GPU)
* refs 57, 75 tools for benchmarking energy usage 
  * (ref 57) [Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning](https://arxiv.org/abs/2002.05651) $\rightarrow$ [Experiment Impact Tracker](https://github.com/Breakend/experiment-impact-tracker)
  * (ref 75) [Energy Usage Reports: Environmental awareness as part of algorithmic accountability](https://arxiv.org/abs/1911.08354) $\rightarrow$ [Energy Usage (deprecated)](https://github.com/responsibleproblemsolving/energy-usage) $\rightarrow$ [Code Carbon](https://codecarbon.io/)
* also need to consider inference costs, not just training, since these could end up being bigger for a production (PRD) model
* risk/benefit calculation: an issue is the distribution of risks and benefits
  * risks: global climate change impacting marginalized communities, hegemonic bias in LMs
  * benefits: some to marginalized communities (refs 17,101) but most to already privileged.  Example given is home automation assistants (Google Home, Amazon Alexa etc) 
    * (MMcD: possibly community focused LM-powered mobile applications could be added to the benefit calculation?  LMs as tools for increasing communication and networking seems more a Meta thing than Google so might have been missed _later_ touched on in section 7 )
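
To put the two headline figures from this section on the same scale, here is a quick back-of-envelope calculation. This is my own arithmetic rather than anything from the paper; it uses only the 5t and 284t values noted above, and the function name is just for illustration.

```python
# Figures as noted above (paper section 3).
HUMAN_T_CO2E_PER_YEAR = 5.0   # average human, tonnes CO2e per year
TRANSFORMER_T_CO2E = 284.0    # large Transformer training figure, tonnes CO2e


def person_years(emissions_t, per_person_t_per_year=HUMAN_T_CO2E_PER_YEAR):
    """Express an emissions total as years of an average person's footprint."""
    return emissions_t / per_person_t_per_year


print(f"{person_years(TRANSFORMER_T_CO2E):.1f} person-years")
```

So the training figure is on the order of several decades of one average person's annual emissions.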

### 4 Unfathomable Training Data 
* Large, uncurated, Internet-based datasets encode the dominant/hegemonic view 
* 4.1 Size (of training set) Doesn't Guarantee Diversity
  * (MMcD attempted summary) over-emphasizes recent contributions from mainly younger male internet users due to history of internet "institutions" such as Reddit, Wikipedia
  * de-emphasizes non-mainstream sites e.g. blogs (MMcD: overweights viral content?)
  * (MMcD: will trends in internet usage over time change this i.e. as demographic changes?  Currently an effect of the bias is negative feedback to preserve status quo demographic )
  * Hegemonic viewpoint bias at each step of the funnel: participation $\rightarrow$ continued presence $\rightarrow$ collection of data $\rightarrow$ filtering of data
  * (MMcD: feels like this applies more to public dataset training, what is the impact of federated learning e.g. for Google they can train models that make use of a larger body of non-public data)
* 4.2 Static Data / Changing Social Views
  * (MMcD: is this distribution shifts between training and test data?)
  * reporting bias de-emphasizes data from marginalized community social movements, or misrepresents to align with existing power structures
  * need for curation
* 4.3 Encoding Bias
  * Models exhibit intersectionality of bias i.e. multiple bias factors lead to more bias
  * (ref 119) [The Woman Worked as a Babysitter: On Biases in Language Generation](https://arxiv.org/abs/1909.01326) covers some methods for auditing bias but these may be unreliable (refs 61, 103, Perspective API)
  * Auditing LMs for biases requires _a priori_ understanding of relevant social categories, but this requires cultural understanding 
  * (ref 19) [Language (Technology) is Power: A Critical Survey of "Bias" in NLP](https://arxiv.org/abs/2005.14050) "an attempt to measure the appropriateness of text generated by LMs, or the biases encoded by a system, always needs to be done in relation to particular social contexts and marginalized perspectives"
* 4.4 Curation, Documentation & Accountability
  * "In summary, LMs trained on large, uncurated, static datasets from the Web encode hegemonic views that are harmful to marginalized populations. We thus emphasize the need to invest significant resources into curating and documenting LM training data."
  * (ref 18) [Large image datasets: A pyrrhic win for computer vision?](https://arxiv.org/abs/2006.16923) covers justice-orientated data collection methodology.
  * _documentation debt_
  
### 5 Down the Garden Path
* (this section covers) the risk of misdirected research effort i.e. applying LM research to tasks intended for Natural Language Understanding (NLU)
* No actual NLU is being done by LMs even if they perform well on NLU benchmarks
* "Languages are systems of signs, i.e. pairings of form and meaning.  But the training data for LMs is only form; they do not have access to meaning.  Therefore, claims about model abilities must be carefully characterized"
  * (MMcD: is this true for models that condition on speaker intent e.g. consider training a language model on data from MMORPG chats?)
* "BERTology" i.e. interesting linguistic questions on what transformers are learning about linguistic structures from the unsupervised natural language modelling task
  * (ref 110) [A Primer in BERTology: What we know about how BERT works](https://arxiv.org/abs/2002.12327)
  * (ref 133) [BERT Rediscovers the Classical NLP Pipeline](https://arxiv.org/abs/1905.05950)
>  If a large LM, endowed
with hundreds of billions of parameters and trained on a very large
dataset, can manipulate linguistic form well enough to cheat its
way through tests meant to require language understanding, have
we learned anything of value about how to build machine language
understanding or have we been led down the garden path?

### 6 Stochastic Parrots
* Exploration of how the factors in sections 4 and 5 lead to real world harms
* 6.1 Coherence in the Eye of the Beholder
  * LM output seems coherent because human readers attempt to model the meaning behind a speaker's words; for LMs, however, there is no underlying meaning
  * (MMcD: c.f. Tarot readings?  i.e. attempting to read meaning into a sequence of randomly generated symbols)
  * (MMcD: (follow-on thought) extracting the symbols/concepts encoded in a LM trained on biased dataset to construct a biased Tarot card set.  c.f. "The Ceremonies" T.E.D. Klein)
* 6.2 Risks and Harms
  * Encoding of hegemonic worldview in LMs $\rightarrow$ encoded racist, sexist, ableist etc biases
  * (MMcD: c.f. [Unspeak](https://www.amazon.co.uk/Unspeak-Words-Weapons-Steven-Poole/dp/0349119244)?)
  * generating abusive language that then gets included in training data $\rightarrow$ positive feedback loop
  * LM output as part of a classification system $\rightarrow$ biased decisions $\rightarrow$ allocational/reputational harm
  * LMs deployed for malicious purposes e.g. synthetic generation of extremist ideology
    * (MMcD: c.f. [Distraction](https://www.amazon.co.uk/Distraction-Bruce-Sterling/dp/1857988310) stochastic terrorism)
  * mistranslation risk 
  * risk of extracting PII (MMcD: c.f. [xkcd Predictive Models](https://xkcd.com/2169/)?)

> We note that
the risks associated with synthetic but seemingly coherent text are
deeply connected to the fact that such synthetic text can enter into
conversations without any person or entity being accountable for it.
This accountability both involves responsibility for truthfulness and
is important in situating meaning. As Maggie Nelson (ref 92) writes:
“Words change depending on who speaks them; there is no cure.”

### 7 Paths Forward 
> In order to mitigate the risks that come with the creation of increasingly large LMs, we urge researchers to shift to a mindset of
careful planning, along many dimensions, before starting to build
either datasets or systems trained on datasets. We should consider our research time and effort a valuable resource, to be spent to the
extent possible on research projects that build towards a technological ecosystem whose benefits are at least evenly distributed or
better accrue to those historically most marginalized. This means
considering how research contributions shape the overall direction
of the field and keeping alert to directions that limit access. Likewise, it means considering the financial and environmental costs
of model development up front, before deciding on a course of investigation. The resources needed to train and tune state-of-the-art
models stand to increase economic inequities unless researchers
incorporate energy and compute efficiency in their model evaluations. Furthermore, the goals of energy and compute efficient model
building and of creating datasets and models where the incorporated biases can be understood both point to careful curation of
data. Significant time should be spent on assembling datasets suited
for the tasks at hand rather than ingesting massive amounts of data
from convenient or easily-scraped Internet sources. As discussed in
§4.1, simply turning to massive dataset size as a strategy for being
inclusive of diverse viewpoints is doomed to failure. We recall again
Birhane and Prabhu’s [18] words (inspired by Ruha Benjamin [15]):
“Feeding AI systems on the world’s beauty, ugliness, and cruelty,
but expecting it to reflect only the beauty is a fantasy.”

* During data collection adopt frameworks such as:
  * (ref 13) [Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science](https://aclanthology.org/Q18-1041/)
  * (ref 52) [Datasheets for Datasets](https://arxiv.org/abs/1803.09010)
  * (ref 86) [Model Cards for Model Reporting](https://arxiv.org/abs/1810.03993)

> We also advocate for a re-alignment of research goals: Where
much effort has been allocated to making models (and their training
data) bigger and to achieving ever higher scores on leaderboards
often featuring artificial tasks, we believe there is more to be gained
by focusing on understanding how machines are achieving the
tasks in question and how they will form part of socio-technical
systems.

* (ref 68) [Performing a Project Premortem](https://hbr.org/2007/09/performing-a-project-premortem)
* consider risks and benefits but also alternatives to current project plans

* [Value-sensitive design](https://www.wikiwand.com/en/Value_sensitive_design)
>  For researchers working
with LMs, value sensitive design is poised to help throughout the
development process in identifying whose values are expressed and
supported through a technology and, subsequently, how a lack of
support might result in harm.

* Discussion of Automatic Speech Recognition, potential benefits to marginalized communities but need for awareness of dual-use aspect of LMs if these are needed as a solution.
  * Watermarking, regulation approaches

### 8 Conclusion
> We have identified a wide variety of costs and risks associated
with the rush for ever larger LMs, including: environmental costs
(borne typically by those not benefiting from the resulting technology); financial costs, which in turn erect barriers to entry, limiting
who can contribute to this research area and which languages can
benefit from the most advanced techniques; opportunity cost, as researchers pour effort away from directions requiring less resources;
and the risk of substantial harms, including stereotyping, denigration, increases in extremist ideology, and wrongful arrest, should
humans encounter seemingly coherent LM output and take it for
the words of some person or organization who has accountability
for what is said.

> Thus, we call on NLP researchers to carefully weigh these risks
while pursuing this research direction, consider whether the benefits outweigh the risks, and investigate dual use scenarios utilizing
the many techniques (e.g. those from value sensitive design) that
have been put forth. We hope these considerations encourage NLP
researchers to direct resources and effort into techniques for approaching NLP tasks that are effective without being endlessly data
hungry. But beyond that, we call on the field to recognize that applications that aim to believably mimic humans bring risk of extreme
harms. Work on synthetic human behavior is a bright line in ethical
AI development, where downstream effects need to be understood
and modeled in order to block foreseeable harm to society and
different social groups. Thus what is also needed is scholarship on
the benefits, harms, and risks of mimicking humans and thoughtful
design of target tasks grounded in use cases sufficiently concrete
to allow collaborative design with affected communities.

# Thoughts on 'Stochastic Parrots'
This is a really excellent paper with a lot of things to ponder.  From a researcher point of view it highlights the questions 'is this model the right one to develop and if so is it right to do it in this way?'.  From an industry practitioner point of view it also offers some tools to try out.  As this is currently my main perspective I'll discuss these further below.

On the environmental and financial cost aspects, an industry practitioner will be aware of these to some extent in how they impact the company financials or their cost centre.  That is, you'd expect a fairly strong correlation between climate impact and things that are directly apparent to the engineer such as time required to train, cost and size of cloud instances for training, ongoing costs of inference operations etc.  These affect KPIs such as sprint velocity (how many models can we train per sprint), budget, operational cost and general MLOps considerations.  

The social costs can be less apparent depending on where the industry practitioner sits in the organization e.g. Data Scientist vs ML Engineer.  Whether they are measured also depends on the customer focus of the organization e.g. if there's an engagement metric that would surface any issues.  In a lot of cases probably not, since sensitive PII shouldn't generally be part of dashboards, so maybe a separate audit process is needed?  

Possibly related reading I need to look at: [Fairness for Unobserved Characteristics: Insights from Technological Impacts on Queer Communities](https://arxiv.org/abs/2102.04257) (via @deepmind [job advertisement campaign](https://twitter.com/DeepMind/status/1560184474561921024) )

Now to try out one of the tools for environmental impact: [CodeCarbon](https://codecarbon.io/) (current version of reference 75)  - [CodeCarbon GitHub repo](https://github.com/mlco2/codecarbon)

In [3]:
from journal_20220902 import run_training
from codecarbon import EmissionsTracker

In [4]:
tracker = EmissionsTracker()

[codecarbon INFO @ 15:29:21] [setup] RAM Tracking...
[codecarbon INFO @ 15:29:21] [setup] GPU Tracking...
[codecarbon INFO @ 15:29:21] No GPU found.
[codecarbon INFO @ 15:29:21] [setup] CPU Tracking...
[codecarbon ERROR @ 15:29:21] Unable to read Intel RAPL files for CPU power, we will use a constant for your CPU power. Please view https://github.com/mlco2/codecarbon/issues/244 for workarounds : [Errno 13] Permission denied: '/sys/class/powercap/intel-rapl/intel-rapl:1/energy_uj'
[codecarbon ERROR @ 15:29:21] Unable to read Intel RAPL files for CPU power, we will use a constant for your CPU power. Please view https://github.com/mlco2/codecarbon/issues/244 for workarounds : [Errno 13] Permission denied: '/sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj'
[codecarbon INFO @ 15:29:21] Tracking Intel CPU via RAPL interface
[codecarbon ERROR @ 15:29:22] Unable to read Intel RAPL files for CPU power, we will use a constant for your CPU power. Please view https://github.com/mlco2/codecarb

Ah.  OK, not a great start: the CodeCarbon library can't read the CPU power counters (the Intel RAPL files) because my user is denied permission, presumably for security reasons.  So instead it will use an estimate.  I could try the workaround and reboot, but for the moment I'll carry on; I think this means it uses an estimate of 85W for power.  Might work better on Colab, with a GPU as well.
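
Before creating the tracker it may be worth checking whether the RAPL counter files are readable at all. Below is a minimal sketch using only the standard library. The directory is taken from the error messages above; the helper name `rapl_readable` is my own and not part of CodeCarbon, and I'm assuming a Linux machine with the same `intel-rapl:N/energy_uj` layout as in the log.

```python
import glob
import os

# Directory taken from the error messages above (Linux only).
RAPL_DIR = "/sys/class/powercap/intel-rapl"


def rapl_readable(rapl_dir=RAPL_DIR):
    """Map each RAPL energy counter file to whether the current user can read it."""
    pattern = os.path.join(rapl_dir, "intel-rapl:*", "energy_uj")
    return {path: os.access(path, os.R_OK) for path in sorted(glob.glob(pattern))}


status = rapl_readable()
if not status:
    print("No RAPL counters found (different CPU, OS, or a VM?)")
for path, ok in status.items():
    print(f"{'readable' if ok else 'NOT readable'}: {path}")
```

If any file shows as not readable, the tracker will presumably hit the same permission error as above and fall back to an estimate.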

In [6]:
tracker.start()
state = run_training()
tracker.stop()

[codecarbon INFO @ 15:32:53] Energy consumed for RAM : 0.000012 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:32:53] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:32:53] 0.000012 kWh of electricity used since the begining.
[codecarbon INFO @ 15:33:08] Energy consumed for RAM : 0.000024 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:33:08] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:33:08] 0.000024 kWh of electricity used since the begining.
[codecarbon INFO @ 15:33:23] Energy consumed for RAM : 0.000036 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:33:23] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:33:23] 0.000036 kWh of electricity used since the begining.
[codecarbon INFO @ 15:33:38] Energy consumed for RAM : 0.000048 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:33:38] Energy consumed for all CP

train epoch: 1, loss: 0.2140, accuracy:  93.48


[codecarbon INFO @ 15:34:08] Energy consumed for RAM : 0.000072 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:34:08] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:34:08] 0.000072 kWh of electricity used since the begining.


test epoch: 1,loss: 0.0868, accuracy:  97.30


[codecarbon INFO @ 15:34:23] Energy consumed for RAM : 0.000084 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:34:23] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:34:23] 0.000084 kWh of electricity used since the begining.
[codecarbon INFO @ 15:34:38] Energy consumed for RAM : 0.000096 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:34:38] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:34:38] 0.000096 kWh of electricity used since the begining.
[codecarbon INFO @ 15:34:53] Energy consumed for RAM : 0.000108 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:34:53] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:34:53] 0.000108 kWh of electricity used since the begining.
[codecarbon INFO @ 15:35:08] Energy consumed for RAM : 0.000120 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:35:08] Energy consumed for all CP

train epoch: 2, loss: 0.0642, accuracy:  97.98


[codecarbon INFO @ 15:35:38] Energy consumed for RAM : 0.000143 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:35:38] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:35:38] 0.000143 kWh of electricity used since the begining.


test epoch: 2,loss: 0.0485, accuracy:  98.45


[codecarbon INFO @ 15:35:53] Energy consumed for RAM : 0.000155 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:35:53] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:35:53] 0.000155 kWh of electricity used since the begining.
[codecarbon INFO @ 15:36:08] Energy consumed for RAM : 0.000167 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:36:08] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:36:08] 0.000167 kWh of electricity used since the begining.
[codecarbon INFO @ 15:36:23] Energy consumed for RAM : 0.000179 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:36:23] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:36:23] 0.000179 kWh of electricity used since the begining.
[codecarbon INFO @ 15:36:38] Energy consumed for RAM : 0.000191 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:36:38] Energy consumed for all CP

train epoch: 3, loss: 0.0432, accuracy:  98.65
test epoch: 3,loss: 0.0372, accuracy:  98.76


[codecarbon INFO @ 15:37:08] Energy consumed for RAM : 0.000215 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:37:08] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:37:08] 0.000215 kWh of electricity used since the begining.
[codecarbon INFO @ 15:37:23] Energy consumed for RAM : 0.000227 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:37:23] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:37:23] 0.000227 kWh of electricity used since the begining.
[codecarbon INFO @ 15:37:38] Energy consumed for RAM : 0.000239 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:37:38] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:37:38] 0.000239 kWh of electricity used since the begining.
[codecarbon INFO @ 15:37:53] Energy consumed for RAM : 0.000251 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:37:53] Energy consumed for all CP

train epoch: 4, loss: 0.0321, accuracy:  99.04
test epoch: 4,loss: 0.0328, accuracy:  98.88


[codecarbon INFO @ 15:38:23] Energy consumed for RAM : 0.000275 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:38:23] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:38:23] 0.000275 kWh of electricity used since the begining.
[codecarbon INFO @ 15:38:38] Energy consumed for RAM : 0.000287 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:38:38] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:38:38] 0.000287 kWh of electricity used since the begining.
[codecarbon INFO @ 15:38:53] Energy consumed for RAM : 0.000299 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:38:53] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:38:53] 0.000299 kWh of electricity used since the begining.
[codecarbon INFO @ 15:39:08] Energy consumed for RAM : 0.000311 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:39:08] Energy consumed for all CP

train epoch: 5, loss: 0.0249, accuracy:  99.26
test epoch: 5,loss: 0.0346, accuracy:  98.79


[codecarbon INFO @ 15:39:38] Energy consumed for RAM : 0.000335 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:39:38] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:39:38] 0.000335 kWh of electricity used since the begining.
[codecarbon INFO @ 15:39:53] Energy consumed for RAM : 0.000347 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:39:53] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:39:53] 0.000347 kWh of electricity used since the begining.
[codecarbon INFO @ 15:40:08] Energy consumed for RAM : 0.000359 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:40:08] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:40:08] 0.000359 kWh of electricity used since the begining.
[codecarbon INFO @ 15:40:23] Energy consumed for RAM : 0.000371 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:40:23] Energy consumed for all CP

train epoch: 6, loss: 0.0199, accuracy:  99.44
test epoch: 6,loss: 0.0340, accuracy:  98.88


[codecarbon INFO @ 15:40:53] Energy consumed for RAM : 0.000395 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:40:53] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:40:53] 0.000395 kWh of electricity used since the begining.
[codecarbon INFO @ 15:41:08] Energy consumed for RAM : 0.000407 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:41:08] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:41:08] 0.000407 kWh of electricity used since the begining.
[codecarbon INFO @ 15:41:23] Energy consumed for RAM : 0.000419 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:41:23] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:41:23] 0.000419 kWh of electricity used since the begining.
[codecarbon INFO @ 15:41:38] Energy consumed for RAM : 0.000431 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:41:38] Energy consumed for all CP

train epoch: 7, loss: 0.0158, accuracy:  99.55
test epoch: 7,loss: 0.0339, accuracy:  98.88


[codecarbon INFO @ 15:42:08] Energy consumed for RAM : 0.000455 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:42:08] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:42:08] 0.000455 kWh of electricity used since the begining.
[codecarbon INFO @ 15:42:23] Energy consumed for RAM : 0.000467 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:42:23] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:42:23] 0.000467 kWh of electricity used since the begining.
[codecarbon INFO @ 15:42:38] Energy consumed for RAM : 0.000479 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:42:38] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:42:38] 0.000479 kWh of electricity used since the begining.
[codecarbon INFO @ 15:42:53] Energy consumed for RAM : 0.000491 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:42:53] Energy consumed for all CP

train epoch: 8, loss: 0.0125, accuracy:  99.66
test epoch: 8,loss: 0.0349, accuracy:  98.82


[codecarbon INFO @ 15:43:23] Energy consumed for RAM : 0.000514 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:43:23] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:43:23] 0.000514 kWh of electricity used since the begining.
[codecarbon INFO @ 15:43:38] Energy consumed for RAM : 0.000526 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:43:38] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:43:38] 0.000526 kWh of electricity used since the begining.
[codecarbon INFO @ 15:43:53] Energy consumed for RAM : 0.000538 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:43:53] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:43:53] 0.000538 kWh of electricity used since the begining.
[codecarbon INFO @ 15:44:08] Energy consumed for RAM : 0.000550 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:44:08] Energy consumed for all CP

train epoch: 9, loss: 0.0099, accuracy:  99.74


[codecarbon INFO @ 15:44:38] Energy consumed for RAM : 0.000574 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:44:38] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:44:38] 0.000574 kWh of electricity used since the begining.


test epoch: 9,loss: 0.0418, accuracy:  98.62


[codecarbon INFO @ 15:44:53] Energy consumed for RAM : 0.000586 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:44:53] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:44:53] 0.000586 kWh of electricity used since the begining.
[codecarbon INFO @ 15:45:08] Energy consumed for RAM : 0.000598 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:45:08] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:45:08] 0.000598 kWh of electricity used since the begining.
[codecarbon INFO @ 15:45:23] Energy consumed for RAM : 0.000610 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:45:23] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:45:23] 0.000610 kWh of electricity used since the begining.
[codecarbon INFO @ 15:45:38] Energy consumed for RAM : 0.000622 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:45:38] Energy consumed for all CP

train epoch: 10, loss: 0.0088, accuracy:  99.76


[codecarbon INFO @ 15:46:03] Energy consumed for RAM : 0.000641 kWh. RAM Power : 2.8732738494873047 W
[codecarbon INFO @ 15:46:03] Energy consumed for all CPUs : 0.000000 kWh. All CPUs Power : 0.0 W
[codecarbon INFO @ 15:46:03] 0.000641 kWh of electricity used since the begining.


test epoch: 10,loss: 0.0399, accuracy:  98.78


0.00013405779503157248

In [8]:
tracker.final_emissions

0.00013405779503157248

In [10]:
print(tracker.final_emissions_data)

EmissionsData(timestamp='2022-09-06T15:46:03', project_name='codecarbon', run_id='d485f507-7a28-4ccc-b7ec-61662940d2be', duration=804.4101219177246, emissions=0.00013405779503157248, emissions_rate=0.00016665354074856353, cpu_power=0.0, gpu_power=0.0, ram_power=2.8732738494873047, cpu_energy=0, gpu_energy=0, ram_energy=0.0006414248566103947, energy_consumed=0.0006414248566103947, country_name='United Kingdom', country_iso_code='GBR', region='england', cloud_provider='', cloud_region='', os='Linux-5.15.0-47-generic-x86_64-with-glibc2.35', python_version='3.8.13', cpu_count=4, cpu_model='Intel(R) Core(TM) m3-6Y30 CPU @ 0.90GHz', gpu_count=None, gpu_model=None, longitude=0.1426, latitude=52.1932, ram_total_size=7.6620635986328125, tracking_mode='machine', on_cloud='N')
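
A quick sanity check on how the reported fields relate to each other. This is a sketch that uses only the numbers printed above; the units are my assumption (emissions in kg CO2e, energy in kWh, power in W, `ram_total_size` in GB) rather than something confirmed in this notebook.

```python
# Values copied from the EmissionsData printout above.
duration_s = 804.4101219177246
emissions = 0.00013405779503157248
emissions_rate = 0.00016665354074856353
ram_power_w = 2.8732738494873047
ram_energy_kwh = 0.0006414248566103947
energy_consumed_kwh = 0.0006414248566103947
ram_total_gb = 7.6620635986328125

JOULES_PER_KWH = 3.6e6

# Constant RAM power integrated over the run should roughly reproduce ram_energy.
ram_energy_check = ram_power_w * duration_s / JOULES_PER_KWH

# Emissions per kWh implied by this run (kg CO2e/kWh if emissions are in kg).
implied_intensity = emissions / energy_consumed_kwh

# How the reported rate relates to a plain emissions / duration.
rate_ratio = emissions_rate / (emissions / duration_s)

# RAM power per GB of installed memory.
ram_w_per_gb = ram_power_w / ram_total_gb

print(f"RAM energy recomputed: {ram_energy_check:.6f} kWh (reported {ram_energy_kwh:.6f})")
print(f"Implied intensity    : {implied_intensity:.3f} per kWh")
print(f"Rate ratio           : {rate_ratio:.1f}")
print(f"RAM W per GB         : {ram_w_per_gb:.3f}")
```

On these numbers the implied carbon intensity is about 0.209 per kWh, the reported `emissions_rate` is 1000 times `emissions / duration` (so it seems to be in a unit 1000 times smaller), and RAM power is 0.375 W per GB, which looks like a fixed per-GB estimate rather than a measurement. Since `cpu_energy` is 0, the whole reported total is just the RAM term.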


OK, not all that useful on the laptop: CPU power was recorded as 0.0 W even though top was showing 300% CPU usage, so only the RAM estimate is counted.  Worth a try on Colab though.
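
As a stopgap, a very rough CPU energy figure can be added by hand. This is a sketch under loud assumptions: a nominal TDP of 4.5 W for this CPU model (to be checked against the manufacturer's spec sheet), power scaling linearly with utilization, and the 300% figure from top holding for the whole run. None of this is a measurement, and the helper name is hypothetical.

```python
# Values copied from the EmissionsData printout above.
duration_s = 804.4101219177246
reported_energy_kwh = 0.0006414248566103947   # RAM only, since cpu_energy=0
reported_emissions = 0.00013405779503157248
cpu_count = 4

# Assumptions, not measurements.
ASSUMED_TDP_W = 4.5       # nominal TDP for this CPU model; check the spec sheet
TOP_CPU_PERCENT = 300.0   # what top showed during training


def estimate_cpu_energy_kwh(tdp_w, top_percent, n_cpus, seconds):
    """Crude estimate: TDP scaled linearly by average utilization."""
    utilization = min(top_percent / (100.0 * n_cpus), 1.0)
    return tdp_w * utilization * seconds / 3.6e6


cpu_kwh = estimate_cpu_energy_kwh(ASSUMED_TDP_W, TOP_CPU_PERCENT, cpu_count, duration_s)
total_kwh = reported_energy_kwh + cpu_kwh

# Reuse the emissions-per-kWh ratio implied by the tracker's own output.
intensity = reported_emissions / reported_energy_kwh

print(f"Estimated CPU energy : {cpu_kwh:.6f} kWh")
print(f"RAM + CPU energy     : {total_kwh:.6f} kWh")
print(f"Adjusted emissions   : {total_kwh * intensity:.6f}")
```

Under these assumptions the CPU term is of the same order as the RAM figure, so the reported total is probably an underestimate by roughly a factor of two, and that still ignores the rest of the machine (screen, disk and so on).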