Reporting and Analysing the Environmental Impact of Language Models on the Example of Commonsense Question Answering with External Knowledge
by Aida Usmanova, Junbo Huang, Debayan Banerjee and Ricardo Usbeck
This paper was presented at Sustainable AI 2023 in Bonn and is available here.
This project explores the T5 large language model. The aim of this project is to report the training time and efficiency of the model. This is achieved by infusing external knowledge from the ConceptNet knowledge graph and fine-tuning the model on the Commonsense Question Answering task. Training time, power consumption, and approximate carbon emissions are tracked throughout all training runs via CarbonTracker.
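For reference, a minimal sketch of how CarbonTracker can wrap a training loop; `model`, `train_loader`, and `train_one_epoch` are placeholders, not names from this repository:

```python
# Minimal sketch: CarbonTracker measures energy use and estimates
# CO2eq per epoch, then predicts totals for the full training run.
from carbontracker.tracker import CarbonTracker

EPOCHS = 10
tracker = CarbonTracker(epochs=EPOCHS)

for epoch in range(EPOCHS):
    tracker.epoch_start()
    train_one_epoch(model, train_loader)  # hypothetical training step
    tracker.epoch_end()

tracker.stop()  # finalizes and reports measured/predicted consumption
```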
You can download the ConceptNet assertions and save them in the `data/` folder.
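A minimal Python sketch of the download step; the URL points at the public ConceptNet 5.7 assertions release and is an assumption, not something pinned by this repository:

```python
# Hypothetical download helper; verify the release URL before use.
import urllib.request
from pathlib import Path

URL = ("https://s3.amazonaws.com/conceptnet/downloads/2019/edges/"
       "conceptnet-assertions-5.7.0.csv.gz")

Path("data").mkdir(exist_ok=True)
urllib.request.urlretrieve(URL, "data/conceptnet-assertions-5.7.0.csv.gz")
```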
To verbalize the ConceptNet graph, run the `src/core/util/preprocess_conceptnet.py` script.
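For intuition, a toy sketch of what verbalizing ConceptNet triples looks like; the templates and helpers below are illustrative assumptions and do not mirror the actual script:

```python
# Illustrative only: map a (head, relation, tail) triple to a sentence.
RELATION_TEMPLATES = {
    "/r/IsA": "{head} is a {tail}",
    "/r/UsedFor": "{head} is used for {tail}",
    "/r/CapableOf": "{head} is capable of {tail}",
}

def uri_to_text(uri: str) -> str:
    # e.g. "/c/en/solar_panel" -> "solar panel"
    return uri.split("/")[3].replace("_", " ")

def verbalize(head: str, rel: str, tail: str) -> str:
    template = RELATION_TEMPLATES.get(rel, "{head} is related to {tail}")
    return template.format(head=uri_to_text(head), tail=uri_to_text(tail))

print(verbalize("/c/en/dog", "/r/IsA", "/c/en/animal"))  # dog is a animal
```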
The knowledge infusion step follows the example of T5 masked language modeling (MLM), using the previously preprocessed ConceptNet triples. To execute this task, run the `src/core/new_mlm.py` script.
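For reference, a minimal sketch of T5-style MLM on a single verbalized triple, using the Hugging Face `transformers` library; the `t5-base` checkpoint and single-step loop are placeholders for whatever `new_mlm.py` actually configures:

```python
# Sketch: T5 masks spans with sentinel tokens (<extra_id_0>, ...) and
# learns to generate the masked content.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")  # placeholder checkpoint
model = T5ForConditionalGeneration.from_pretrained("t5-base")

inputs = tokenizer("dog is a <extra_id_0>", return_tensors="pt")
labels = tokenizer("<extra_id_0> animal <extra_id_1>",
                   return_tensors="pt").input_ids

loss = model(**inputs, labels=labels).loss
loss.backward()  # an optimizer step would follow in a real training loop
```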
Finally, the T5 model is fine-tuned on the TellMeWhy dataset for the Commonsense QA task. To execute fine-tuning, run the `src/core/finetune_hf.py` script.
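A minimal sketch of one seq2seq fine-tuning step on a QA pair; the example fields, prompt prefix, checkpoint, and learning rate are assumptions for illustration, not the exact configuration of `finetune_hf.py`:

```python
# Sketch: fine-tune T5 to generate an answer conditioned on a question.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")  # placeholder checkpoint
model = T5ForConditionalGeneration.from_pretrained("t5-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

example = {"question": "Why did she take an umbrella?",
           "answer": "Because it was raining."}  # hypothetical QA pair

inputs = tokenizer("question: " + example["question"], return_tensors="pt")
labels = tokenizer(example["answer"], return_tensors="pt").input_ids

loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()
```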