# Part 2 - Translation

Extend your Part 1 code so that each answer it produces is translated into your team's assigned language (see below) and then translated back into English.

So, your prompt should look like:

> Your question.  
> Answer in English.  
> Answer in the assigned language.  
> Answer in English, translated from the above language.

The language assigned to each team is:

Teams 1, 4, 7, 10, 13 - Spanish

Teams 2, 5, 8, 11 - German

Teams 3, 6, 9, 12 - French

Observe the effects of the cyclical translation (e.g., English->French->English) and critique the results in your slides and the report.
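
One simple way to quantify how much an answer drifts during the round trip is to compare the original English answer with the back-translated one. The sketch below is a simplified stand-in for a ROUGE-1 style F-measure, not the metric implementation used in this notebook. It assumes whitespace tokenization and lowercasing, and the names `unigram_f1` and `drift_report` are hypothetical.

```python
from collections import Counter


def unigram_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between two strings (simplified ROUGE-1 F-measure)."""
    # Assumption: lowercase and split on whitespace; real ROUGE tokenizes differently.
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def drift_report(original: str, round_trip: str) -> dict:
    """Summarize how far the back-translated text is from the original."""
    return {
        "exact_match": original.strip() == round_trip.strip(),
        "unigram_f1": unigram_f1(original, round_trip),
    }
```

A proper name that survives both translations unchanged scores 1.0 here, while a paraphrased answer scores lower even when its meaning is preserved, which is worth keeping in mind when critiquing the results.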

Part 2.2: Use two different Hugging Face (HF) translation models. Start with the default translation pipeline, then try other models of your choice and discuss how the results differ.

https://huggingface.co/docs/transformers/main_classes/pipelines

https://huggingface.co/docs/transformers/v4.35.0/en/main_classes/pipelines#transformers.TranslationPipeline
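
As a rough illustration of how the pipelines linked above could be chained into a round trip, the sketch below wraps each translation pipeline as a plain string-to-string function. It is not the `run_tr_models` helper used later in this notebook. The names `make_hf_translator` and `round_trip` are hypothetical, and the `src_lang`/`tgt_lang` keyword arguments in the commented usage are an assumption about how a multilingual checkpoint such as M2M100 would be configured.

```python
from typing import Callable, Tuple

Translator = Callable[[str], str]


def make_hf_translator(model_name: str, **pipeline_kwargs) -> Translator:
    """Wrap a Hugging Face translation pipeline as a str -> str function."""
    # Imported lazily so the rest of the sketch runs without transformers installed.
    from transformers import pipeline

    translator = pipeline("translation", model=model_name, **pipeline_kwargs)

    def translate(text: str) -> str:
        # A translation pipeline returns a list of dicts with a "translation_text" key.
        return translator(text)[0]["translation_text"]

    return translate


def round_trip(text: str, forward: Translator, backward: Translator) -> Tuple[str, str]:
    """Translate text with `forward`, then translate the result with `backward`."""
    translated = forward(text)
    return translated, backward(translated)


# Example usage (downloads models, so it is left commented out):
# en_fr = make_hf_translator("Helsinki-NLP/opus-mt-en-fr")
# fr_en = make_hf_translator("Helsinki-NLP/opus-mt-fr-en")
# french, english_again = round_trip("Bartholomew Sholto", en_fr, fr_en)
#
# Assumption: a multilingual checkpoint needs explicit language codes.
# m2m_en_fr = make_hf_translator("facebook/m2m100_418M", src_lang="en", tgt_lang="fr")
# m2m_fr_en = make_hf_translator("facebook/m2m100_418M", src_lang="fr", tgt_lang="en")
```

Keeping the translators as plain callables means the same `round_trip` function can compare any pair of models, which suits the model comparison asked for in Part 2.2.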


In [1]:
!pip3 install -r ../requirements.txt

%load_ext autoreload
%autoreload 2


Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com


In [2]:
import csv
import os

from src import utils
from src.question_answering import run_qa_models
from src.translation import run_tr_models


  from .autonotebook import tqdm as notebook_tqdm


---

## Experiments & Results



In [3]:
# Only the best question answering model is enabled; the others are kept commented out for reference
models_qa = [
    # DistilBERT
    # "distilbert-base-cased-distilled-squad",
    # "distilbert-base-uncased-distilled-squad",
    # RoBERTa
    # "deepset/roberta-base-squad2",
    "deepset/roberta-large-squad2",
    # DeBERTa
    # "deepset/deberta-v3-base-squad2",
    # "deepset/deberta-v3-large-squad2",
    # ELECTRA
    # "deepset/electra-base-squad2",
]

# Translation models, given as (English -> French, French -> English) pairs,
# applied to the answers produced by the question answering models
models_tr = [
    # The OPUS-MT models are each trained for a single direction (en -> fr, fr -> en)
    ("Helsinki-NLP/opus-mt-en-fr", "Helsinki-NLP/opus-mt-fr-en"),
    # M2M100 is a many-to-many multilingual model, so the same checkpoint serves both directions
    ("facebook/m2m100_418M", "facebook/m2m100_418M"),
]


In [4]:
csv_file = "res_tr.csv"

csv_header = [
    "ctx_name",
    "ctx_fname",
    "q_idx",
    "q_text",
    "q_answer_true",
    "qa_model",  # the model used to answer the question
    "qa_answer_pred",  # the answer predicted by the QA model
    "tr_model_name_en_to_fr",  # the model used to translate the answer from English to French
    "tr_model_name_fr_to_en",  # the model used to translate the French text back to English
    "tr_answer_pred_en_to_fr",  # the French translation of the original English answer
    "tr_answer_pred_fr_to_en",  # the English back-translation of the French translation
]
csv_rows = []

for ctx_name in ["protagonist", "antagonist", "crime", "evidence", "resolution"]:
    for ctx_fname, ctx_text in utils.read_context(ctx_name):
        ctx_fname = os.path.basename(ctx_fname)
        print("#" * 80)
        print("#" * 80)
        print(ctx_text)

        for q_idx, (q_text, q_answer_true) in enumerate(utils.read_qa(ctx_name)):
            print("=" * 80)
            print("=" * 80)
            print(f"Current Question: {q_text}")
            print(f"Expected Answer: {q_answer_true}")

            q_answers_pred, _ = run_qa_models(
                q_text,
                ctx_text,
                models_qa,
                q_answer_true,
            )

            for qa_model, qa_answer_pred in zip(models_qa, q_answers_pred):
                tr_preds, tr_scores = run_tr_models(
                    qa_answer_pred,
                    models_tr,
                )

                for tr_m, tr_a, tr_s in zip(models_tr, tr_preds, tr_scores):
                    row = [
                        ctx_name,
                        ctx_fname,
                        q_idx,
                        q_text,
                        q_answer_true,
                        qa_model,
                        qa_answer_pred,
                        tr_m[0],
                        tr_m[1],
                        tr_a[0],
                        tr_a[1],
                    ]

                    for metric, score in tr_s.items():
                        if metric not in csv_header:
                            csv_header.append(metric)
                        row.append(score)

                    csv_rows.append(row)


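# newline="" stops the csv module from writing extra blank lines on some platforms
with open(csv_file, "w", newline="", encoding="utf-8") as f: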
    writer = csv.writer(f)
    writer.writerow(csv_header)
    writer.writerows(csv_rows)
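
Once the results file has been written, the per-model scores can be averaged to compare the translation models. The sketch below is one possible way to do that with the standard library. The function name `mean_metric_by_model` is hypothetical, and it assumes the chosen metric column (for example `rouge1` or `bertscore_f1`, as appended to the header by the loop above) holds numeric values.

```python
import csv
from collections import defaultdict


def mean_metric_by_model(csv_path, metric, model_column="tr_model_name_en_to_fr"):
    """Average one numeric metric column per translation model."""
    values = defaultdict(list)
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            raw = row.get(metric)
            if raw in (None, ""):
                # Skip rows that have no value for this metric.
                continue
            values[row[model_column]].append(float(raw))
    return {model: sum(scores) / len(scores) for model, scores in values.items()}


# Example usage (assumes res_tr.csv exists in the working directory):
# print(mean_metric_by_model("res_tr.csv", "rouge1"))
```

Non-numeric columns such as `bertscore_hashcode` would raise a `ValueError` here, so the metric name should be chosen accordingly.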


Found: ['protagonist.0.md', 'protagonist.qa.md']
################################################################################
################################################################################
Sherlock Holmes took his bottle from the corner of the mantel-piece and his hypodermic syringe from its neat morocco case. With his long, white, nervous fingers he adjusted the delicate needle, and rolled back his left shirt-cuff. For some little time his eyes rested thoughtfully upon the sinewy forearm and wrist all dotted and scarred with innumerable puncture-marks. Finally he thrust the sharp point home, pressed down the tiny piston, and sank back into the velvet-lined arm-chair with a long sigh of satisfaction. Three times a day for many months I had witnessed this performance, but custom had not reconciled my mind to it. On the contrary, from day to day I had become more irritable at the sight, and my conscience swelled nightly within me at the thought that I had lacked the

  score = doc1.similarity(doc2)


{'bertscore_f1': 0.505920946598053,
 'bertscore_hashcode': 'microsoft/deberta-xlarge-mnli_L40_no-idf_version=0.3.12(hug_trans=4.35.0)',
 'bertscore_precision': 0.4350338578224182,
 'bertscore_recall': 0.6044065952301025,
 'rouge1': 0.0,
 'rouge2': 0.0,
 'rougeL': 0.0,
 'rougeLsum': 0.0,
 'spacy_sim': 0.0}
----------------------------------------------------------------------------------------------------
model: Helsinki-NLP/opus-mt-en-fr
> Bartholomew Sholto
----------------------------------------------------------------------------------------------------
model: Helsinki-NLP/opus-mt-fr-en
> Bartholomew Sholto
{'bertscore_f1': 1.0,
 'bertscore_hashcode': 'microsoft/deberta-xlarge-mnli_L40_no-idf_version=0.3.12(hug_trans=4.35.0)',
 'bertscore_precision': 1.0,
 'bertscore_recall': 1.0,
 'rouge1': 1.0,
 'rouge2': 1.0,
 'rougeL': 1.0,
 'rougeLsum': 1.0,
 'spacy_sim': 1.0}
----------------------------------------------------------------------------------------------------
model: facebook/m