# Demonstration
* In this Jupyter Notebook, we use MockClients to show how the translation process works.
* The scripts can also be run from the terminal, but we use Jupyter Notebooks to document exactly what the author ran.
* **NOTE**: The MockClient used in `proc0.py` does not take a data manager as input, so it only returns a Caesar cipher of the source text to mimic translations; language detection was therefore DISABLED. This keeps the procedure very similar to the real case, where we cannot pass the data manager to the client either.
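
* To illustrate what a Caesar-cipher 'translation' looks like, here is a minimal sketch. The function name `caesar_shift` and the shift of 3 are assumptions made for this example; the actual MockClient in `scripts/translators.py` may be implemented differently.

```python
import string


def caesar_shift(text: str, shift: int = 3) -> str:
    """Shift ASCII letters by `shift` positions; leave all other characters untouched."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    k = shift % 26
    table = str.maketrans(
        lower + upper,
        lower[k:] + lower[:k] + upper[k:] + upper[:k],
    )
    return text.translate(table)


source = "Guten Morgen, Welt!"
mock_output = caesar_shift(source)
print(mock_output)
# The output keeps the same number of characters and lines as the input
print(len(mock_output) == len(source))
```

* Because such an output is not real text in the target language, language detection would reject it, which is presumably why detection is disabled in this first run.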

In [1]:
%run proc0.py run -m gpt-4.1-2025-04-14 -d opus-100

INFO: 2025-05-07 12:44:07 - [🚀]: Selected task for opus-100 - gpt-4.1-2025-04-14
INFO: 2025-05-07 12:44:07 - [🏁]: Starting task 68f289ef-0e55-4aea-9b69-39a273316bf4 on commit 4865abf
ERROR: 2025-05-07 12:44:07 - [🔥]: Error MockError
INFO: 2025-05-07 12:44:07 - [🕒]: Retrying de-en...
INFO: 2025-05-07 12:44:07 - [❌]: Translated 200 sents for de-en but rejected: outside acceptable range
INFO: 2025-05-07 12:44:07 - [🕒]: Retrying de-en...
INFO: 2025-05-07 12:44:07 - [✔️]: Translated 400 sents for de-en
INFO: 2025-05-07 12:44:08 - [✔️]: Translated 400 sents for en-de
INFO: 2025-05-07 12:44:08 - [✔️]: Translated 400 sents for da-en
INFO: 2025-05-07 12:44:08 - [✔️]: Translated 400 sents for en-da
INFO: 2025-05-07 12:44:08 - [✔️]: Translated 400 sents for el-en
INFO: 2025-05-07 12:44:08 - [✔️]: Translated 400 sents for en-el
INFO: 2025-05-07 12:44:09 - [✔️]: Translated 400 sents for pt-en
INFO: 2025-05-07 12:44:09 - [✔️]: Translated 400 sents for en-pt
INFO: 2025-05-07 12:44:09 - [✔️]: Translat

* The logs shown while running the task with the `proc0.py` script are brief.
* The same logs are also stored in a `.log` file, which is more detailed in case of errors.
    * This process neither runs on a server nor requires deployment, so it makes sense to keep the real-time logs brief and readable, while the stored logs are more detailed for debugging.
    * Since we use a Jupyter Notebook, it is possible to 'store' both types of logs (provided the notebook is not re-run by accident...)
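
* One common way to get brief real-time logs and detailed stored logs is to attach two handlers with different levels to one logger. The sketch below uses Python's standard `logging` module and in-memory streams so it runs anywhere; the logger name and the helper `build_logger` are hypothetical, and the project's `TranslationLogger` may work differently.

```python
import io
import logging


def build_logger(console_stream, file_stream) -> logging.Logger:
    """Return a logger that writes INFO+ to the console and DEBUG+ to the 'file'."""
    logger = logging.getLogger("proc0_sketch")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    logger.propagate = False

    fmt = logging.Formatter(
        "%(levelname)s: %(asctime)s - %(message)s", datefmt="%Y-%m-%d %H:%M:%S"
    )

    console = logging.StreamHandler(console_stream)
    console.setLevel(logging.INFO)  # brief: no DEBUG records
    console.setFormatter(fmt)

    # In practice this could be logging.FileHandler("some.log")
    detailed = logging.StreamHandler(file_stream)
    detailed.setLevel(logging.DEBUG)  # detailed: includes tracebacks
    detailed.setFormatter(fmt)

    logger.addHandler(console)
    logger.addHandler(detailed)
    return logger


console_out, file_out = io.StringIO(), io.StringIO()
log = build_logger(console_out, file_out)
log.error("Error MockError")
log.debug("Traceback details would go here")

print(console_out.getvalue())
print(file_out.getvalue())
```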
 

In [2]:
!cat proc0.log | head -n 20

INFO: 2025-05-07 12:44:07 - [🚀]: Selected task for opus-100 - gpt-4.1-2025-04-14
INFO: 2025-05-07 12:44:07 - [🏁]: Starting task 68f289ef-0e55-4aea-9b69-39a273316bf4 on commit 4865abf
ERROR: 2025-05-07 12:44:07 - [🔥]: Error MockError
DEBUG: 2025-05-07 12:44:07 - Traceback:
Traceback (most recent call last):
  File "C:\Files\UZH\Semester_6\BA_Thesis\BA_Repo\scripts\task.py", line 185, in run
    mt_sents = self.client.translate_and_store_document(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Files\UZH\Semester_6\BA_Thesis\BA_Repo\scripts\translators.py", line 47, in translate_and_store_document
    out_text = self.translate_document(
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Files\UZH\Semester_6\BA_Thesis\BA_Repo\scripts\translators.py", line 312, in translate_document
    out_text = self.translate(src_lang, tgt_lang, in_text)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Files\UZH\Semester_6\BA_Thesis\BA_Repo\scripts\translators.py

* We can reproduce the simpler logs using `grep`.
    * On Windows, the regex must be wrapped in double quotes; on Linux/Mac, single quotes work too.

In [3]:
!grep -P "(INFO:|ERROR:)" proc0.log | head -n 6

INFO: 2025-05-07 12:44:07 - [🚀]: Selected task for opus-100 - gpt-4.1-2025-04-14
INFO: 2025-05-07 12:44:07 - [🏁]: Starting task 68f289ef-0e55-4aea-9b69-39a273316bf4 on commit 4865abf
ERROR: 2025-05-07 12:44:07 - [🔥]: Error MockError
INFO: 2025-05-07 12:44:07 - [🕒]: Retrying de-en...
INFO: 2025-05-07 12:44:07 - [❌]: Translated 200 sents for de-en but rejected: outside acceptable range
INFO: 2025-05-07 12:44:07 - [🕒]: Retrying de-en...
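
* If `grep` is not available, the same filtering can be done in Python, which also avoids the quoting differences between shells. This is a minimal sketch; the helper name `brief_lines` is made up for this example, and the sample lines are copied from the log output shown in this notebook.

```python
import re

# Same pattern as in the grep call: keep lines containing "INFO:" or "ERROR:"
LEVEL_PATTERN = re.compile(r"(INFO:|ERROR:)")


def brief_lines(lines):
    """Return only the lines that the grep pattern would match."""
    return [ln for ln in lines if LEVEL_PATTERN.search(ln)]


sample = [
    "ERROR: 2025-05-07 12:44:07 - [🔥]: Error MockError",
    "DEBUG: 2025-05-07 12:44:07 - Traceback:",
    "Traceback (most recent call last):",
    "INFO: 2025-05-07 12:44:07 - [🕒]: Retrying de-en...",
]

brief = brief_lines(sample)
for line in brief:
    print(line)
```

* To apply it to a stored log, the lines could be read with `open(path, encoding="utf-8")`, since the logs contain emoji.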


* Logs prefixed with `TRANSLATION` are translation-specific. They are also stored separately in a JSONL file, which is used for analysis.

In [4]:
!cat tmp/proc0.jsonl | head -n 5

{"src_lang": "de", "tgt_lang": "en", "start": 1746614647.509234, "in_lines": 400, "in_sents": 444, "in_chars": 32731, "in_tokens": 8295, "end": 1746614647.5710516, "out_chars": 15366, "out_lines": 200, "out_sents": 237, "out_tokens": 7368, "id": "68f289ef-0e55-4aea-9b69-39a273316bf4-0002", "translator": "gpt-4.1-2025-04-14", "dataset": "Helsinki-NLP/opus-100", "status_code": 1, "verdict": "rejected"}
{"src_lang": "de", "tgt_lang": "en", "start": 1746614647.6548312, "in_lines": 400, "in_sents": 444, "in_chars": 32731, "in_tokens": 8295, "end": 1746614647.7366693, "out_chars": 32731, "out_lines": 400, "out_sents": 478, "out_tokens": 15578, "id": "68f289ef-0e55-4aea-9b69-39a273316bf4-0003", "translator": "gpt-4.1-2025-04-14", "dataset": "Helsinki-NLP/opus-100", "status_code": 0, "verdict": "accepted"}
{"src_lang": "en", "tgt_lang": "de", "start": 1746614647.8718607, "in_lines": 400, "in_sents": 446, "in_chars": 29414, "in_tokens": 6994, "end": 1746614647.9358702, "out_chars": 29414, "out_

* The translation logs are linked to the task: observe that the translation log ids use the task id as a prefix.

In [5]:
!cat tmp/proc0/opus-100/gpt-4.1-2025-04-14/task.json

{
    "task_id": "68f289ef-0e55-4aea-9b69-39a273316bf4",
    "git_hash": "4865abf",
    "dataset": "Helsinki-NLP/opus-100",
    "num_of_sents": 400,
    "split": "test[:500]",
    "translator": "gpt-4.1-2025-04-14",
    "acceptable_range": [
        360,
        480
    ],
    "timestamp": "2025-05-07T12:44:11.564681+02:00",
    "manual_retry": false,
    "max_retries": 2,
    "retry_delay": 0,
    "duration": 4.17696475982666
}
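
* The link between a translation log and its task can be checked programmatically. The sketch below assumes, based on the ids shown above, that a log id is the task id followed by `-` and a zero-padded counter; the helper `split_log_id` is hypothetical.

```python
from typing import Tuple


def split_log_id(log_id: str) -> Tuple[str, int]:
    """Split '<task_id>-<counter>' into the task id and the numeric counter."""
    task_part, counter = log_id.rsplit("-", 1)
    return task_part, int(counter)


task_id = "68f289ef-0e55-4aea-9b69-39a273316bf4"
log_ids = [
    "68f289ef-0e55-4aea-9b69-39a273316bf4-0002",
    "68f289ef-0e55-4aea-9b69-39a273316bf4-0003",
]

for log_id in log_ids:
    prefix, counter = split_log_id(log_id)
    print(prefix == task_id, counter)
```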


## Manual Retry
* In cases where a task fails to deliver the desired translations even after automatic retries, we need to run the task again.
* For documentation purposes, we want to link these 'faulty' translations to the ones the task delivered before.
* If a translation is 'faulty' because we never received one due to errors, no translation was logged, so we specify a reason but no id.
* If it is 'faulty' because we automatically rejected the ones that arrived, we link it to the newest one.
    * Automatic rejection of translations is limited to avoid paying too much.
    * The main rejection reason is receiving too many or too few output translations. All rejected translations are still stored and have an id.
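
* The range-based rejection can be sketched as a simple bounds check. This is an illustration only: the function `verdict_for` is hypothetical, and it assumes that the compared count is the number of output lines and that both bounds are inclusive. The range `[360, 480]` is the `acceptable_range` from the `task.json` shown earlier.

```python
from typing import Sequence


def verdict_for(out_count: int, acceptable_range: Sequence[int]) -> str:
    """Return 'accepted' if the output count lies within the range, else 'rejected'."""
    low, high = acceptable_range
    return "accepted" if low <= out_count <= high else "rejected"


acceptable_range = [360, 480]  # taken from task.json for num_of_sents = 400

print(verdict_for(200, acceptable_range))  # too few output lines
print(verdict_for(400, acceptable_range))  # within the range
```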

* In the following, we define a scenario in which En-Fr and Fr-En will not be translated successfully because:
    * En-Fr was rejected three times automatically
    * Fr-En failed three times  

In [6]:
from scripts.constants import N, E, R1, R2, R3
pairs = [('en', 'de'), ('en', 'fr'), ('de', 'en'), ('fr', 'en')]
scenario = [N, R1, R1, R2, N, E, E, E] 
# Pass, reject twice (outside acceptable range), reject once (wrong language), pass, error thrice
verdicts = ['accepted', 'rejected', 'rejected', 'rejected', 'accepted']
# Errors are not logged in JSONL, hence we check only accepted/rejected
# We also check the status codes defined in scripts/constants.py
status_codes = [c for c in scenario if c != E]

* **NOTE**: This time we include language detection and let our MockClient use the data manager as well, so language detection is meaningful and the MockClient outputs 'perfect translations'.

In [7]:
from scripts.data_management import Opus100Manager
from scripts.task import TranslationTask
from scripts.translators import MockClient
from scripts.logger import TranslationLogger
from os.path import join
from io import StringIO
main_folder = 'tmp'
sub_folder = join(main_folder, 'proc0')

dm = Opus100Manager()
logfile = StringIO()
logger = TranslationLogger(logfile=logfile)
# This time we pass in the DM into the client to account for language detection
cli = MockClient(logger=logger, model='mock', scenario=scenario, dm=dm)
task = TranslationTask(
    target_pairs=pairs,
    dm=dm,
    client=cli,
    logger=logger,
    mt_folder=sub_folder,
    num_of_sents=400,
    max_retries=2,
    retry_delay=0
)

In [8]:
task.run()

INFO: 2025-05-07 12:44:11 - [🏁]: Starting task fc35ee01-77c5-4482-9a8e-f1ae60332a72 on commit 4865abf


INFO: 2025-05-07 12:44:12 - [✔️]: Translated 400 sents for en-de
INFO: 2025-05-07 12:44:12 - [❌]: Translated 200 sents for en-fr but rejected: outside acceptable range
INFO: 2025-05-07 12:44:12 - [🕒]: Retrying en-fr...
INFO: 2025-05-07 12:44:12 - [❌]: Translated 200 sents for en-fr but rejected: outside acceptable range
INFO: 2025-05-07 12:44:12 - [🕒]: Retrying en-fr...
INFO: 2025-05-07 12:44:13 - [❌]: Translated 400 sents for en-fr but rejected: expected lang fr but got en
INFO: 2025-05-07 12:44:13 - [⏭️]: Failed 2 times, skipping en-fr...
INFO: 2025-05-07 12:44:13 - [✔️]: Translated 400 sents for de-en
ERROR: 2025-05-07 12:44:13 - [🔥]: Error MockError
INFO: 2025-05-07 12:44:13 - [🕒]: Retrying fr-en...
ERROR: 2025-05-07 12:44:13 - [🔥]: Error MockError
INFO: 2025-05-07 12:44:13 - [🕒]: Retrying fr-en...
ERROR: 2025-05-07 12:44:13 - [🔥]: Error MockError
INFO: 2025-05-07 12:44:13 - [⏭️]: Failed 2 times, skipping fr-en...
INFO: 2025-05-07 12:44:13 - [🏁]: Task took 1.88s
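
* The log above is consistent with a loop that makes one initial attempt plus up to `max_retries` retries per pair and then skips the pair. The sketch below reproduces that control flow under this assumption; `run_pair` and the `MockError` class defined here are stand-ins written for this example, not the code in `scripts/task.py`.

```python
from typing import Callable, List, Optional, Tuple


class MockError(Exception):
    """Stand-in for the error raised by the mock client."""


def run_pair(
    attempt: Callable[[], str],
    is_acceptable: Callable[[str], bool],
    max_retries: int = 2,
) -> Tuple[Optional[str], List[str]]:
    """Try once, then retry up to `max_retries` times on error or rejection."""
    events: List[str] = []
    for attempt_no in range(max_retries + 1):
        try:
            output = attempt()
        except MockError:
            events.append("error")
        else:
            if is_acceptable(output):
                events.append("accepted")
                return output, events
            events.append("rejected")
        if attempt_no < max_retries:
            events.append("retry")
    events.append("skipped")
    return None, events


def always_fail() -> str:
    raise MockError()


# En-Fr in the scenario: three outputs arrive, all of them rejected
outputs = iter(["too short", "too short", "wrong language"])
_, rejected_events = run_pair(lambda: next(outputs), lambda out: out == "ok")

# Fr-En in the scenario: every attempt raises an error
_, error_events = run_pair(always_fail, lambda out: True)

print(rejected_events)
print(error_events)
```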


In [9]:
import json
# How the JSONL can be used for testing / analysis
log_data = [json.loads(ln) for ln in logfile.getvalue().splitlines()]
check1 = [log['verdict'] for log in log_data] == verdicts
check2 = [log['status_code'] for log in log_data] == status_codes
check1, check2


(True, True)
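
* The same JSONL records can be used to find the pairs that need a manual retry. The sketch below rebuilds a reduced version of the records from the scenario above (only `src_lang`, `tgt_lang` and `verdict`) and separates pairs that were only rejected from pairs that never produced a log entry because every attempt raised an error. The variable names are chosen for this example.

```python
import json
from collections import Counter

pairs = [("en", "de"), ("en", "fr"), ("de", "en"), ("fr", "en")]
logged = [
    ("en", "de", "accepted"),
    ("en", "fr", "rejected"),
    ("en", "fr", "rejected"),
    ("en", "fr", "rejected"),
    ("de", "en", "accepted"),
]
# Reduced JSONL text: one JSON object per line
jsonl_text = "\n".join(
    json.dumps({"src_lang": s, "tgt_lang": t, "verdict": v}) for s, t, v in logged
)

records = [json.loads(ln) for ln in jsonl_text.splitlines()]
verdicts_per_pair = Counter(
    (r["src_lang"], r["tgt_lang"], r["verdict"]) for r in records
)

seen = {(s, t) for s, t, _ in verdicts_per_pair}
accepted = {(s, t) for s, t, v in verdicts_per_pair if v == "accepted"}

only_rejected = [p for p in pairs if p in seen and p not in accepted]
never_logged = [p for p in pairs if p not in seen]  # errors are not in the JSONL

print(only_rejected)
print(never_logged)
```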

* Based on these results, we decide to retry En-Fr and Fr-En.
* During analysis, we also noticed that De-En had not been translated at all: the BLEU score was in the single digits because the source text was returned. We therefore retry that pair as well.

In [10]:
# Collect the ids of all logged translations for the pairs we want to retry
retry_pairs = {('en', 'fr'), ('de', 'en')}
log_ids = [log['id'] for log in log_data if (log['src_lang'], log['tgt_lang']) in retry_pairs]
log_ids

['fc35ee01-77c5-4482-9a8e-f1ae60332a72-0002',
 'fc35ee01-77c5-4482-9a8e-f1ae60332a72-0003',
 'fc35ee01-77c5-4482-9a8e-f1ae60332a72-0004',
 'fc35ee01-77c5-4482-9a8e-f1ae60332a72-0005']

In [11]:
# Show the pair and id of the two newest logs among the selected ones
for log in log_data:
    if log['id'] in log_ids[-2:]:
        print(log['src_lang'], log['tgt_lang'], log['id'])



en fr fc35ee01-77c5-4482-9a8e-f1ae60332a72-0004
de en fc35ee01-77c5-4482-9a8e-f1ae60332a72-0005


In [12]:
from scripts.logger import RetryLog
selected_ids = [log_ids[-2], log_ids[-1], None]
target_pairs = [('en', 'fr'), ('de', 'en'), ('fr', 'en')]
reasons = ['no accepted translation yet', 'returned src text', 'no translation received yet']
retry_log = RetryLog(pairs=target_pairs, log_ids=selected_ids, reasons=reasons)
new_logger = TranslationLogger(logfile=logfile, retry_log=retry_log)
cli = MockClient(logger=new_logger, dm=dm)
sub_folder = join(main_folder, 'tmp2')

task = TranslationTask(
    target_pairs=target_pairs,
    dm=dm,
    client=cli,
    logger=new_logger,
    mt_folder=sub_folder,
    num_of_sents=400,
    manual_retry=True,
    max_retries=1,
    retry_delay=0
)

task.run()

INFO: 2025-05-07 12:44:13 - [🏁]: Starting task f7c49f13-fcba-46ec-8d5c-e5c0ae74ce61 on commit 4865abf


INFO: 2025-05-07 12:44:14 - [✔️]: Translated 400 sents for en-fr
INFO: 2025-05-07 12:44:14 - [✔️]: Translated 400 sents for de-en
INFO: 2025-05-07 12:44:14 - [✔️]: Translated 400 sents for fr-en
INFO: 2025-05-07 12:44:14 - [🏁]: Task took 0.95s


In [13]:
log_data = [json.loads(ln) for ln in logfile.getvalue().splitlines()]
for log in log_data[-3:]:
    print(log['src_lang'], log['tgt_lang'], log['manual_retry'])

en fr {'prev_id': 'fc35ee01-77c5-4482-9a8e-f1ae60332a72-0004', 'reason': 'no accepted translation yet'}
de en {'prev_id': 'fc35ee01-77c5-4482-9a8e-f1ae60332a72-0005', 'reason': 'returned src text'}
fr en {'prev_id': None, 'reason': 'no translation received yet'}
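
* During analysis, the `prev_id` links can be followed back to the translations they replace. The sketch below shows one way to do this with an index from log id to record; the reduced records and the helper `describe_retry` are assumptions made for this example, while the ids, verdicts and reasons are the ones shown in this notebook.

```python
from typing import Dict

TASK = "fc35ee01-77c5-4482-9a8e-f1ae60332a72"

# Reduced records of the first task, indexed by log id
previous_logs: Dict[str, dict] = {
    f"{TASK}-0004": {"verdict": "rejected"},
    f"{TASK}-0005": {"verdict": "accepted"},
}

# Reduced records of the manual retry
retry_logs = [
    {"src_lang": "en", "tgt_lang": "fr",
     "manual_retry": {"prev_id": f"{TASK}-0004", "reason": "no accepted translation yet"}},
    {"src_lang": "de", "tgt_lang": "en",
     "manual_retry": {"prev_id": f"{TASK}-0005", "reason": "returned src text"}},
    {"src_lang": "fr", "tgt_lang": "en",
     "manual_retry": {"prev_id": None, "reason": "no translation received yet"}},
]


def describe_retry(log: dict, index: Dict[str, dict]) -> str:
    """Summarise which earlier translation (if any) a manual retry replaces."""
    link = log["manual_retry"]
    pair = f'{log["src_lang"]}-{log["tgt_lang"]}'
    prev_id = link["prev_id"]
    if prev_id is None:
        return f"{pair}: no previous translation ({link['reason']})"
    prev = index[prev_id]
    return f"{pair}: replaces {prev['verdict']} translation {prev_id} ({link['reason']})"


summaries = [describe_retry(log, previous_logs) for log in retry_logs]
for line in summaries:
    print(line)
```

* Note that De-En links to a translation that was automatically accepted; the manual retry records why it was nevertheless replaced.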


In [14]:
!rm -rf tmp