# C-WLT

This notebook reimplements 'Translate to Disambiguate: Zero-shot Multilingual Word Sense Disambiguation with Pretrained Language Models' (https://arxiv.org/pdf/2304.13803.pdf) with some improvements. More details can be found in the Report delivered with this notebook.

All cells before the Dataset Run section must be run before using the Dataset run, which may take around 30-40 minutes with bloom-3b.

For the Example run, the Data Loading section can be skipped. As explained in the Example Run section, however, it uses BabelNet, which is incompatible with Colab: you will need to download the notebook and its Google Drive folder (or just create an examples/ folder next to the notebook). You also need a BabelNet API key; if you do not have one, one comes with the folder.

This project is delivered with the original data (in the data/ folder) and with already sampled data (in the samples/ folder). The code automatically checks whether sampled data already exists; if not, it loads the original data and samples it.

If you want to try the sampling procedure, just empty the samples/ folder in the Google Drive repository.

The Dataset run is based on some XL-WSD files and on the 'best-ensemble settings' described in the paper: the prompt is written in English and the target languages (the ones the words are translated into) are English, Russian and Chinese. If one of the target languages is selected as the source language, the words are simply not translated into that language.
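
The rule about the source language can be sketched as follows. This is only an illustration, not the notebook's own code: `ensemble_targets` and `DEFAULT_TARGETS` are hypothetical names, and the helper builds a new list instead of modifying the default one.

```python
# Hypothetical helper illustrating the target-language rule described above.
DEFAULT_TARGETS = ['en', 'zh', 'ru']

def ensemble_targets(source_lang, target_langs=DEFAULT_TARGETS):
    """Return the target languages, skipping the source language itself."""
    # A list comprehension creates a new list, so the default is never mutated.
    return [lang for lang in target_langs if lang != source_lang]

print(ensemble_targets('it'))  # Italian source: all three targets are kept
print(ensemble_targets('en'))  # English source: English is dropped from the targets
```

Building a new list matters when the default list is shared between calls: removing an element in place would silently change the targets of every later run.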

To try the code with a new language {new_lang}, download the following files from https://github.com/mk322/contextual-word-level-translation:
1. test-{new_lang}.data.xml and test-{new_lang}.gold.key.txt from xl-wsd-data/evaluation_datasets/test-{new_lang}
2. correct_trans_{new_lang}_{tlang}.json, wrong_trans_{new_lang}_{tlang}.json and all_sense_labels_{new_lang}_{tlang}.txt from xl-wsd-files/{new_lang}/, where {tlang} is 'en', 'ru' or 'zh'.

Then add these files to the data/ folder and pass the ISO code of the new language (e.g. 'es' for Spanish) as the source_lang parameter of the wsd_on_dataset() function.

These files were created by the paper's authors. They allow the run to work without BabelNet, which is incompatible with the Colab environment.
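
Before starting a run on a new language, it can help to check that all the files listed above are in place. The following is a minimal sketch under the assumption that the files sit directly in a data/ folder; `required_files` and `missing_files` are hypothetical helpers, not part of the notebook.

```python
import os

def required_files(new_lang, target_langs=('en', 'ru', 'zh')):
    """List the file names needed for a new source language."""
    names = [f'test-{new_lang}.data.xml', f'test-{new_lang}.gold.key.txt']
    for tlang in target_langs:
        names.append(f'correct_trans_{new_lang}_{tlang}.json')
        names.append(f'wrong_trans_{new_lang}_{tlang}.json')
        names.append(f'all_sense_labels_{new_lang}_{tlang}.txt')
    return names

def missing_files(new_lang, data_dir='data'):
    """Return the required files that are not present in data_dir."""
    return [name for name in required_files(new_lang)
            if not os.path.exists(os.path.join(data_dir, name))]

# An empty list means the data/ folder is ready for the new language.
print(missing_files('es'))
```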


# Folder Setup, Imports etc.
READ BEFORE RUNNING!
A few important things:
Run the first cell if you are using a GPU: a recent Colab update makes the models use too much GPU memory, which can exceed the GPU RAM even with otherwise affordable models.

There are 2 ways of loading the data:
1. Through gdown. This is convenient because it requires no interaction with popups or permission requests. The downside is that it doesn't allow writing and saving files in the real Drive folder: all generated files are stored locally on the notebook runtime, and you will need to open them from there or download them yourself.

2. Through the official Drive mounting provided by Google. This lets you save files in the real folder, but each time you start the notebook you will have to interact with the popup.

The second option is commented out in the 3rd cell.

In [None]:
# RUN THIS CELL IF USING GPU
!wget http://launchpadlibrarian.net/367274644/libgoogle-perftools-dev_2.5-2.2ubuntu3_amd64.deb
!wget https://launchpad.net/ubuntu/+source/google-perftools/2.5-2.2ubuntu3/+build/14795286/+files/google-perftools_2.5-2.2ubuntu3_all.deb
!wget https://launchpad.net/ubuntu/+source/google-perftools/2.5-2.2ubuntu3/+build/14795286/+files/libtcmalloc-minimal4_2.5-2.2ubuntu3_amd64.deb
!wget https://launchpad.net/ubuntu/+source/google-perftools/2.5-2.2ubuntu3/+build/14795286/+files/libgoogle-perftools4_2.5-2.2ubuntu3_amd64.deb
!apt install -qq libunwind8-dev
!dpkg -i *.deb
%env LD_PRELOAD=libtcmalloc.so

In [None]:
# GDOWN LOADING

!pip install -q accelerate gdown
#Loading the Drive folder with the required data
import gdown
url = "https://drive.google.com/drive/folders/1DaBLm4BHApfnfBARHusaEZEPo0BCnnWO?usp=sharing"
gdown.download_folder(url, quiet=True, use_cookies=False)

%cd HotNLP_BonaiutiAndrea/

In [None]:
# OFFICIAL COLAB LOADING

#!pip install -q accelerate gdown
#from google.colab import drive
#drive.mount('/content/drive')
#%cd drive/MyDrive/HotNLP_BonaiutiAndrea/

In [None]:
# All needed libraries
from transformers import AutoTokenizer, AutoModelForCausalLM
import json
import xml.etree.ElementTree as ET
import torch.nn.functional as F
import torch
from typing import List
import os
import pickle
from tqdm.notebook import tqdm
import math

# Utility functions and some 'global variables'

In [None]:
# default target languages, used when none are specified
TARGET_LANGS = ['en', 'zh', 'ru']

# dict to translate language words
LANG_DICT = {'ru' : 'Russian', 'zh' : 'Chinese', 'en' : 'English', 'es': 'Spanish', 'fr' : 'French',
             'bg' : 'Bulgarian', 'ca' : 'Catalan', 'da' : 'Danish', 'de' : 'German', 'et': 'Estonian',
             'eu' : 'Basque', 'gl' : 'Galician', 'hr' : 'Croatian', 'hu' : 'Hungarian', 'ja' : 'Japanese',
             'nl' : 'Dutch', 'sl' : 'Slovenian'}

# check/modify use of gpu/cpu
device = torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
print(f'Computations will be done on: {device}')

Computations will be done on: cuda


In [None]:
# word object
# it contains for each word, its label (correct sense), and all candidate synsets and translations for all target languages
class Word:
    def __init__(self, text, pos, translations, labels = None):
        self.text = text
        self.pos = pos
        self.translations = translations
        self.labels = labels

# dataset sample object
class WsdSample:
    def __init__(self, id = None, sentence = None, words : List[Word] = None):
        self.id = id
        self.sentence = sentence
        # list of words to be translated
        self.words = words

# define data path
def data_paths(lang):
    sentence_path = f'data/test-{lang}.data.xml'
    label_path = f'data/test-{lang}.gold.key.txt'
    return sentence_path, label_path

# parsing utility of lines in source file
def dict_from_lines(lines):
    d = {}
    for line in lines:
        line = line.replace('\n', '')
        sep = line.find('bn:')
        word = line[:(sep - 1)]
        labels = line[sep:].split(' ')
        d[word] = labels
    return d

# building dicts to retrieve at inference time all senses from translations (used to avoid BabelNet)
def build_id_dicts(source_lang, target_langs = TARGET_LANGS):
    print('Building useful dictionaries..')
    dicts = {}

    for lang in target_langs:
        input_path = f'data/all_sense_labels_{source_lang}_{lang}.txt'
        out_path = f'dicts/senses_{lang}_for_{source_lang}_data.json'

        if os.path.exists(out_path):
            with open(out_path, 'r', encoding='utf-8') as f:
                d = json.load(f)
        else:
            # first run: parse the sense labels file and cache the result as JSON
            with open(input_path, 'r', encoding='utf-8') as f:
                lines = f.readlines()
            d = dict_from_lines(lines)
            with open(out_path, 'w', encoding='utf-8') as f:
                json.dump(d, f, indent=4)
        dicts[lang] = d

    print('Done')
    return dicts

# building dicts with all possible translations for source words, used to avoid BabelNet in preprocessing phase
def build_vocabs(source_lang, target_langs = TARGET_LANGS):
    print('Building vocabs..')
    vocabs = {}

    for lang in target_langs:
        correct_path = f'data/correct_trans_{source_lang}_{lang}.json'
        wrong_path = f'data/wrong_trans_{source_lang}_{lang}.json'
        out_path = f'dicts/vocab_{lang}.json'

        if os.path.exists(out_path):
            with open(out_path, 'r', encoding='utf-8') as file:
                vocab = json.load(file)
        else:
            with open(correct_path, 'r', encoding='utf-8') as file:
                vocab = json.load(file)
            with open(wrong_path, 'r', encoding='utf-8') as file:
                wrong = json.load(file)
            for key in wrong.keys():
                # keys missing from the correct translations get a new entry
                vocab.setdefault(key, []).extend(wrong[key])
            with open(out_path, 'w', encoding='utf-8') as file:
                json.dump(vocab, file, indent=4)
        vocabs[lang] = vocab

    print('Done')
    return vocabs

# function to create the PLM prompt
def get_prompt(sentence, word, target_lang):
    return f"In the sentence \" {sentence} \", the word {word} is translated into {LANG_DICT[target_lang]} as \""

# used in the alternative translation method, to find how many leading tokens are needed to tell the candidate words apart
# it takes the token lists of all candidate words and builds, for each word, a mask with one entry per leading position it shares with other words
def get_tokens_mask(tokens):
    mask = []
    for i in range(len(tokens)):
        mask.append([])
    max_length = 0
    for i in range(len(tokens)):
        max_length = len(tokens[i]) if len(tokens[i]) > max_length else max_length
    tensor = torch.zeros((len(tokens), max_length), dtype=torch.int32)
    for i, word in enumerate(tokens):
        for j, token in enumerate(word):
            tensor[i,j] = token
    indices = []
    for i in range(max_length):
        _, inverse, counts = torch.unique(tensor[:,i], return_inverse=True, return_counts=True)
        # stop as soon as every token in this position is unique
        if bool((counts == 1).all()):
            return mask
        else:
            duplicates = [torch.where(inverse == k)[0] for k, c in enumerate(counts) if c > 1]
            if i == 0:
                for dup in duplicates:
                    for elem in dup:
                        indices.append(elem.tolist())
            else:
                temp = indices
                indices = []
                for dup in duplicates:
                    for elem in dup:
                        if elem in temp:
                            indices.append(elem.tolist())
            if len(indices) > 1:
                for ind in indices:
                    mask[ind].append(1)
    return mask

# truncates f to n decimals, so that results with nearly identical (if not equal) scores can be detected as ties
def truncate(f, n):
    if f > 0:
        return math.floor(f * 10 ** n) / 10 ** n
    else:
        return math.ceil(f * 10 ** n) / 10 ** n
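
The effect of truncation on ties can be seen on a few made-up scores. The sketch below re-declares `truncate` so that it runs on its own; `best_candidates` and the example words and scores are hypothetical and only illustrate the idea.

```python
import math

def truncate(f, n):
    """Drop (rather than round) every decimal after the n-th one."""
    factor = 10 ** n
    if f > 0:
        return math.floor(f * factor) / factor
    return math.ceil(f * factor) / factor

def best_candidates(scores, n=3):
    """Return every candidate whose truncated score equals the best one."""
    truncated = {word: truncate(score, n) for word, score in scores.items()}
    best = max(truncated.values())
    return [word for word, score in truncated.items() if score == best], best

# Made-up log-probabilities: the first two differ only after the third decimal.
scores = {'project': -1.2341, 'plan': -1.2349, 'design': -2.5}
print(best_candidates(scores))  # → (['project', 'plan'], -1.234)
```

Keeping all tied candidates, instead of picking one arbitrarily, lets each of them contribute its senses to the later voting step.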

# Data Loading and preprocessing (if needed)

In [None]:
# Method for building a data sample from dataset file
def build_sample(s, label_dict, source_lang, vocabs):

    # we need to build the sentence that is formed by individual word elements
    sentence = ''
    # the sentence id is stored so that processing interrupted by resource limits can be resumed
    sentence_id = s.attrib['id']
    words = []
    for word in s:
        text = str(word.text).replace('_', ' ')
        sentence += f'{text} '
        # if the word is a target word to translate and disambiguate
        if word.tag == 'instance':
            pos = word.attrib['pos']
            id = word.attrib['id']
            labels = label_dict[id]
            # get all possible translations
            translations = {}
            for lang in vocabs.keys():
                if id in vocabs[lang].keys():
                    translations[lang] = vocabs[lang][id]
            words.append(Word(text, pos, translations, labels))
    sample = WsdSample(sentence_id, sentence, words)

    return sample

def process_data(sentence_path, label_path, source_lang, target_langs = TARGET_LANGS):
    print('Processing data...')
    # to avoid repeating the computation (originally requests to BabelNet), the sample objects are stored with pickle on the first run and loaded again afterwards
    samples_path = f'samples/samples_{source_lang}.pkl'
    vocabs = build_vocabs(source_lang, target_langs)

    # If data is already preprocessed, load the list of samples and check if it is complete or incomplete due to resource limits
    if os.path.exists(samples_path):
        print('Data found, loading and checking if it is complete..')
        samples = []
        with open(samples_path, 'rb') as f:
            try:
                while True:
                    samples.append(pickle.load(f))
            except EOFError:
                pass

        # retrieve last sample and check if the processing is complete, if not finish the job
        last_sample_id = samples[-1].id
        checkpoint = False
        label_dict = {}
        with open(label_path, 'r') as f:
            lines = f.readlines()
            for line in lines:
                splitted = line.split(' ')
                label_dict[splitted[0]] = []
                for label in splitted[1:]:
                    if label.endswith('\n'):
                        label = label[:-1]
                    label_dict[splitted[0]].append(label)

        xml = ET.parse(sentence_path)
        root = xml.getroot()
        for t in root:
            for s in tqdm(t, desc='Checking'):
                if last_sample_id == s.attrib['id']:
                    checkpoint = True
                elif checkpoint:
                    sample = build_sample(s, label_dict, source_lang, vocabs)
                    samples.append(sample)
                    with open(samples_path, 'ab') as out:
                        pickle.dump(sample, out)
    # if it is the first time to process data
    else:
        label_dict = {}
        with open(label_path, 'r') as f:
            lines = f.readlines()
            for line in lines:
                splitted = line.split(' ')
                label_dict[splitted[0]] = []
                for label in splitted[1:]:
                    if label.endswith('\n'):
                        label = label[:-1]
                    label_dict[splitted[0]].append(label)
        samples = []
        # parsing xml file and building list of WsdSample objects
        xml = ET.parse(sentence_path)
        root = xml.getroot()
        for t in root:
            for s in tqdm(t, desc='Building samples'):
                sample = build_sample(s, label_dict, source_lang, vocabs)
                samples.append(sample)
                with open(samples_path, 'ab') as out:
                    pickle.dump(sample, out)

    return samples, vocabs
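
The checkpointing idea used here, appending one pickled sample at a time and resuming after the last stored id, can be sketched in isolation. The helper names, the dictionary samples and the sentence ids below are hypothetical; the sketch assumes the stored samples are a prefix of the full list, in the same order.

```python
import os
import pickle
import tempfile

def append_sample(path, sample):
    """Append one pickled object to the file, closing the handle right away."""
    with open(path, 'ab') as f:
        pickle.dump(sample, f)

def load_samples(path):
    """Read back every pickled object until the end of the file."""
    samples = []
    with open(path, 'rb') as f:
        while True:
            try:
                samples.append(pickle.load(f))
            except EOFError:
                break
    return samples

def remaining_ids(all_ids, done_ids):
    """Ids still to process, assuming done_ids is a prefix of all_ids."""
    if not done_ids:
        return list(all_ids)
    last = all_ids.index(done_ids[-1])
    return list(all_ids[last + 1:])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'samples_demo.pkl')
    # Simulate a run interrupted after two of three sentences.
    for sentence_id in ['d000.s000', 'd000.s001']:
        append_sample(path, {'id': sentence_id})
    done = [sample['id'] for sample in load_samples(path)]
    todo = remaining_ids(['d000.s000', 'd000.s001', 'd000.s002'], done)

print(done, todo)
```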

# Translation function
Needed for both dataset and example runs

In [None]:
import copy

# Main method for translating a single word: it uses the context sentence and the PLM, and stores the scores (log-probabilities) of all candidate translations
def translate(sentence, word : Word, tokenizer, model, target_lang, paper = False):
    # create the right prompt to feed to the model
    prompt = get_prompt(sentence, word.text, target_lang)
    prompt_tokens = tokenizer(prompt, add_special_tokens=False, return_tensors='pt')['input_ids'].to(device)
    # call the model to predict the next token
    with torch.no_grad():
        output = model(prompt_tokens, use_cache = True)
    # store the prompt cache for the next calls of the model, to avoid feeding the whole sentence again
    cache = output['past_key_values']
    logits = output['logits'] #[batch_size, len_seq, vocab_size]
    # removing 1st dimension i.e. batch_size, here = 1, and then selecting just the last output
    logits = logits.squeeze()[-1,:]
    # log-softmax to obtain log-probabilities
    vocab_prob = F.log_softmax(logits, dim=-1)
    candidates = word.translations[target_lang]
    result_dict = {}

    # average log-probability of the first n_scored tokens of a candidate, always restarting from the prompt state
    def average_score(tokens, n_scored):
        token_probs = [vocab_prob[tokens[0]]]
        # copy the prompt cache so that one candidate never affects the next one
        step_cache = copy.deepcopy(cache) if n_scored > 1 else None
        for j in range(1, n_scored):
            # feed the previous token together with the cache, then score the current token
            step_input = torch.LongTensor([[tokens[j - 1]]]).to(device)
            with torch.no_grad():
                step_output = model(step_input, past_key_values=step_cache, use_cache=True)
            step_cache = step_output['past_key_values']
            step_prob = F.log_softmax(step_output['logits'].squeeze(), dim=-1)
            token_probs.append(step_prob[tokens[j]])
        return sum(token_probs) / len(token_probs)

    if paper:
        # paper method: feed the model sequentially with all the tokens of a candidate and average their log-probabilities
        for w in candidates:
            tokens = tokenizer(w, add_special_tokens=False)['input_ids']
            result_dict[w] = average_score(tokens, len(tokens))
    else:
        # in the paper method every token after the first is conditioned on the candidate itself, so the final
        # scores could be too optimistic. Here only the first token is scored; candidates that share their leading
        # tokens with another candidate are also scored on the following tokens, averaging as in the paper method.
        tokens = tokenizer(candidates, add_special_tokens=False)['input_ids']
        mask = get_tokens_mask(tokens)
        for i, w in enumerate(tokens):
            # one token if the first one is unique, plus one for each leading position shared with another candidate
            n_scored = min(len(mask[i]) + 1, len(w))
            result_dict[candidates[i]] = average_score(w, n_scored)
    return result_dict
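
The scoring idea, averaging the log-probabilities of a candidate's tokens, can be illustrated without a language model. In the sketch below `TOY_MODEL` is an invented table of next-token probabilities and the token splits are made up; only the averaging logic mirrors the description above.

```python
import math

# Invented next-token distributions: prefix (tuple of tokens) -> {token: probability}.
TOY_MODEL = {
    (): {'pro': 0.6, 'plan': 0.4},
    ('pro',): {'ject': 0.7, 'gram': 0.3},
}

def average_log_prob(tokens, model=TOY_MODEL):
    """Average the log-probability of each token given the tokens before it."""
    prefix = ()
    log_probs = []
    for token in tokens:
        log_probs.append(math.log(model[prefix][token]))
        prefix += (token,)
    return sum(log_probs) / len(log_probs)

# Hypothetical candidate translations with their (made-up) tokenization.
candidates = {'project': ['pro', 'ject'], 'program': ['pro', 'gram'], 'plan': ['plan']}
scores = {word: average_log_prob(tokens) for word, tokens in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Averaging instead of summing keeps multi-token candidates comparable with single-token ones, since a sum of log-probabilities always decreases as tokens are added.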

# Dataset Run and evaluation
The last cell actually runs the code, with default options of BLOOM 3B as model and Italian as source language.

The last cell also lists, as comments, other models and source languages that can be used. If you want to try other languages, please read the instructions at the top of the notebook.

Output with results and metrics can be found in the output/ folder.
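
The metrics written to that folder can be sketched on a single word as follows. This is a simplified illustration with hypothetical helper names and made-up synset ids; in the notebook the counts are accumulated over all words before precision, recall and F1 are computed, while the Jaccard index is averaged per word.

```python
def word_counts(predicted_ids, gold_ids):
    """True positives, false positives and false negatives for one word."""
    predicted, gold = set(predicted_ids), set(gold_ids)
    tp = len(predicted & gold)
    fp = len(predicted - gold)
    fn = len(gold - predicted)
    return tp, fp, fn

def jaccard(tp, fp, fn):
    """Overlap between predicted and gold senses; 0.0 when both sets are empty."""
    total = tp + fp + fn
    return tp / total if total else 0.0

def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1, each falling back to 0.0 on an empty denominator."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Two predicted senses, one of which is the gold label (ids are made up).
tp, fp, fn = word_counts(['bn:0001n', 'bn:0002n'], ['bn:0001n'])
print(tp, fp, fn, jaccard(tp, fp, fn), precision_recall_f1(tp, fp, fn))
```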

In [None]:
# Processing a dataset sentence with all of its target words and returning the best results for each target language
def process_sample(sample : WsdSample, model, tokenizer, print_results=False):
    sample_results = []
    for word in tqdm(sample.words, desc='Words', leave=False):
        single_word_results = {}
        for lang in word.translations.keys():
            word_results = translate(sample.sentence, word, tokenizer, model, target_lang=lang) # add paper=True to use the paper method
            best_result = -9999
            best_words = []
            for key in word_results:
                word_results[key] = truncate(word_results[key], 3)
                if word_results[key] > best_result:
                    best_result = word_results[key]
                    best_words = [key]
                elif word_results[key] == best_result:
                    best_words.append(key)
            single_word_results[lang] = (best_words, word.pos, best_result, word_results)
        sample_results.append(single_word_results)
        if print_results:
            print('Results for word ', word.text, ': ')
            print(sample_results)
    # return the best result for each language
    return sample.words, sample_results # is a List[dict] where for each word 'language' : (best_translations, pos, score, other results dict)

In [None]:
# Method for computing results over a given language dataset
def wsd_on_dataset(model_name = 'bigscience/bloom-3b', source_lang = 'it', target_langs = TARGET_LANGS):

    if source_lang in target_langs:
        target_langs.remove(source_lang)

    # uploading model and its tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, return_dict_in_generate=True, torch_dtype='auto', device_map='auto')

    # computing paths
    print('Computing paths...')
    sentence_path, label_path = data_paths(source_lang)
    output_path = f'output/{source_lang}_translations.txt'
    metrics_path = f'output/{source_lang}_metrics.txt'
    # preprocess data
    samples, _ = process_data(sentence_path, label_path, source_lang, target_langs)
    ids = build_id_dicts(source_lang, target_langs)
    print('Data Completed..')

    # vars for metrics
    correct = 0
    wrong = 0
    true_positive, false_positive, true_negative, false_negative = 0, 0, 0, 0
    total_words = 0
    jaccard_index = 0

    print('WSD started')
    for sample in tqdm(samples, desc='Dataset progress'):
        words, results = process_sample(sample, model, tokenizer)
        with open(output_path, 'a', encoding="utf-8") as f:
            f.write(f'Sentence: {sample.sentence}\nWords:\n')
            for i, d in enumerate(results):
                total_words += 1
                synsets = {}
                f.write(f'{words[i].text} :\n')
                # use the same iteration to retrieve the most promising synsets
                for lang in d.keys():
                    best_words, pos, best_score, scores = d[lang]
                    f.write(f'\t{lang}: best translation: {best_words}')
                    f.write(f', score = {best_score}\n')
                    f.write(f'\tAll scores: ')
                    for j, (k,v) in enumerate(scores.items()):
                        f.write(f'[{k}: {v}] ')
                    f.write(f'\n')
                    # assign multiplicity
                    for best in best_words:
                        if best in ids[lang].keys():
                            for id in ids[lang][best]:
                                if id not in synsets.keys():
                                    synsets[id] = 1
                                else:
                                    synsets[id] += 1
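                # no synset at all when none of the best translations is in the sense dictionaries
                max_value = max(synsets.values()) if synsets else 0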
                # select best synset ids
                best_ids = []
                for id in synsets.keys():
                    if synsets[id] == max_value:
                        best_ids.append(id)
                # temporal values for each word results
                tp, fp, tn, fn = 0, 0, 0, 0
                found = False
                f.write(f'\tPredicted ids: {best_ids}\n')
                f.write(f'\tLabels: {words[i].labels}\n')
                for id in best_ids:
                    if id in words[i].labels:
                        tp += 1
                        if not found:
                            correct += 1
                            found = True
                    else:
                        fp += 1
                fn = len(words[i].labels) - tp
                tn = len(synsets.keys()) - (tp + fp)
                jaccard_index += tp / (tp + fp + fn) if (tp + fp + fn) > 0 else 0
                true_positive += tp
                true_negative += tn
                false_positive += fp
                false_negative += fn
                if not found:
                    wrong += 1
        # partial metrics, guarded against sentences without target words
        temp_accuracy = (correct / max(correct + wrong, 1)) * 100
        temp_recall = true_positive / (true_positive + false_negative + 1e-10)
        temp_precision = true_positive / (true_positive + false_positive + 1e-10)
        temp_f1 = 2 * ((temp_precision * temp_recall) / (temp_precision + temp_recall + 1e-10))
        temp_jaccard_index = jaccard_index / max(total_words, 1)
        with open(metrics_path, 'w') as f:
            f.write(f'Accuracy percentage = {temp_accuracy}%\n')
            f.write(f'Recall = {temp_recall}\n')
            f.write(f'Precision = {temp_precision}\n')
            f.write(f'F1 Score = {temp_f1}\n')
            f.write(f'Jaccard Index = {temp_jaccard_index}\n')
            f.write(f'Correct: {correct}\n')
            f.write(f'Wrong: {wrong}')
    accuracy = (correct / max(correct + wrong, 1)) * 100
    recall = true_positive / (true_positive + false_negative + 1e-10)
    precision = true_positive / (true_positive + false_positive + 1e-10)
    f1 = 2 * ((precision * recall) / (precision + recall + 1e-10))
    jaccard_index = jaccard_index / max(total_words, 1)
    with open(metrics_path, 'w') as f:
        f.write(f'Accuracy percentage = {accuracy}%\n')
        f.write(f'Recall = {recall}\n')
        f.write(f'Precision = {precision}\n')
        f.write(f'F1 Score = {f1}\n')
        f.write(f'Jaccard Index = {jaccard_index}\n')
        f.write(f'Correct: {correct}\n')
        f.write(f'Wrong: {wrong}\n')
        f.write(f'True Positives: {true_positive}\n')
        f.write(f'True Negatives: {true_negative}\n')
        f.write(f'False Positives: {false_positive}\n')
        f.write(f'False Negatives: {false_negative}')
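
The step that turns the best translations into predicted senses is a vote: each synset id gets one vote for every best translation that can express it, and the ids with the most votes are kept. A minimal sketch, assuming a hypothetical `vote_synsets` helper and made-up words and synset ids:

```python
from collections import Counter

def vote_synsets(best_words_by_lang, senses_by_lang):
    """Return the synset ids supported by the largest number of best translations."""
    votes = Counter()
    for lang, best_words in best_words_by_lang.items():
        for word in best_words:
            # Words missing from the sense dictionary simply cast no vote.
            votes.update(senses_by_lang.get(lang, {}).get(word, []))
    if not votes:
        return []
    top = max(votes.values())
    return [synset_id for synset_id, count in votes.items() if count == top]

# Made-up data: the English word is ambiguous, the Russian one is not.
best_words = {'en': ['bank'], 'ru': ['банк']}
senses = {
    'en': {'bank': ['bn:0001n', 'bn:0002n']},
    'ru': {'банк': ['bn:0001n']},
}
print(vote_synsets(best_words, senses))  # → ['bn:0001n']
```

Here the second language breaks the tie between the two senses of the ambiguous English word, which is the intuition behind using several target languages together.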

In [None]:
# other models: "bigscience/bloom-560m" "bigscience/bloom-1b7" "bigscience/bloom-1b1" "bigscience/bloom-3b" "bigscience/bloom-7b1"
# other source languages: 'it' 'fr' 'es' 'en'
wsd_on_dataset('bigscience/bloom-3b')

# Example Run
This is just an example: you specify a sentence, a word contained in the sentence and its language, and the output is the best translations in the three target languages, their scores and the possible meanings inferred for that word. The same result is also saved to a file in the examples/ folder.

It uses BabelNet, so that you can specify any target word you want without relying on preprocessed data.

To run it, however, you have to download this notebook and run it locally; it takes around 1 minute on CPU.

In [None]:
import babelnet as bn
from babelnet import Language, POS
from babelnet.data.lemma import BabelLemmaType

# Useful for single example run
BABEL_LANG = {
    'en' : Language.EN,
    'it' : Language.IT,
    'es' : Language.ES,
    'fr' : Language.FR,
    'ru' : Language.RU,
    'zh' : Language.ZH
}

# SINGLE DISAMBIGUATION EXAMPLE
# given a sentence, a word, a source language, a target language and a model
# translates it and computes its senses
def disambiguate(sentence, word, source_lang, target_langs = TARGET_LANGS, model_name = 'bigscience/bloom-3b'):

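    # build a new list instead of removing in place, which would also modify the default TARGET_LANGS
    target_langs = [lang for lang in target_langs if lang != source_lang]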

    # uploading model and its tokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, return_dict_in_generate=True)#, torch_dtype='auto', device_map='auto')

    # creating Word object
    print('Retrieving all possible translations..')
    translations = {}
    for lang in target_langs:
        translations[lang] = []
        babel_synsets = bn.get_synsets(word, from_langs=[BABEL_LANG[source_lang]], to_langs=[BABEL_LANG[lang]])
        for s in babel_synsets:
            lemma = s.lemmas(BABEL_LANG[lang], BabelLemmaType.HIGH_QUALITY)
            if len(lemma) > 0:
                translations[lang].append(str(lemma[0]).replace('_', ' '))
    # the part of speech is not known for a free example, so it is left empty
    target_word = Word(word, pos=None, translations=translations)

    disambiguations = {}
    ids = {}

    for lang in target_langs:
        results = translate(sentence, target_word, tokenizer, model, lang)
        best_result = -9999
        best_words = []
        for key in results:
            results[key] = truncate(results[key], 3)
            if results[key] > best_result:
                best_result = results[key]
                best_words = [key]
            elif results[key] == best_result:
                best_words.append(key)
        for best_word in best_words:
            synsets = bn.get_synsets(best_word, from_langs=[BABEL_LANG[lang]])
            for s in synsets:
                if s.id in ids.keys():
                    ids[s.id] += 1
                else:
                    ids[s.id] = 1
        disambiguations[lang] = (best_words, best_result)

    # no synset at all when no translation was found
    max_value = max(ids.values()) if ids else 0
    best_ids = []
    for id in ids.keys():
        if ids[id] == max_value:
            best_ids.append(id)

    glosses = []
    for id in best_ids:
        synsets = bn.get_synsets(id, from_langs=[BABEL_LANG[source_lang]])
        for s in synsets:
            glosses.append(str(s.main_gloss(BABEL_LANG[source_lang])))

    return disambiguations, glosses

# EXAMPLE FOR A SINGLE WORD TO TRANSLATE/DISAMBIGUATE
# Some of the glosses returned can be wrong, but at least one is true. This is because once the word is translated (even correctly)
# that word can also have more than one meaning in the target languages and be ambiguous too.
def example(sentence = 'Spero che questo progetto mi farà prendere un voto molto alto.',
            target_word = 'progetto', source_lang = 'it', model_name = 'bigscience/bloom-3b'):
    disambiguations, glosses = disambiguate(sentence, target_word, source_lang, model_name=model_name)
    #text_target_language = LANG_DICT[target_lang]
    with open('examples/example.txt', 'w', encoding='utf-8') as f:
        f.write(f'The word \'{target_word}\' in the sentence \'{sentence}\'\n')
        print(f'The word \'{target_word}\' in the sentence \'{sentence}\'\n')
        for lang in disambiguations:
            words, scores = disambiguations[lang]
            f.write(f'Has been translated in {LANG_DICT[lang]} as {words} with {scores}\n')
            print(f'Has been translated in {LANG_DICT[lang]} as {words} with {scores}\n')
        f.write(f'And through these translations the inferred meanings are: {glosses}\n')
        print(f'And through these translations the inferred meanings are: {glosses}\n')

In [None]:
example()