# Demonstration
* In this Jupyter Notebook, we use MockClients to show how the translation process works.
* The scripts can also be run from a terminal, but we use Jupyter Notebooks to document exactly what the author ran.

In [1]:
%run proc0.py run -m gpt -d opus-100

2025-05-04 14:20:23 - [🚀]: Selected task for opus-100 - gpt
2025-05-04 14:20:23 - [🏁]: Starting task a9d64ae5-f2f7-4137-8bde-9e313f41600a on commit 7498a71
2025-05-04 14:20:23 - [⚠️]: Error MockError
2025-05-04 14:20:23 - [⏲️]: Retrying de-en...
2025-05-04 14:20:23 - [❌]: Translated 200 sents for de-en but rejected
2025-05-04 14:20:23 - [⏲️]: Retrying de-en...
2025-05-04 14:20:23 - [✔️]: Translated 400 sents for de-en
2025-05-04 14:20:23 - [✔️]: Translated 400 sents for en-de
2025-05-04 14:20:24 - [✔️]: Translated 400 sents for da-en
2025-05-04 14:20:24 - [✔️]: Translated 400 sents for en-da
2025-05-04 14:20:24 - [✔️]: Translated 400 sents for el-en
2025-05-04 14:20:24 - [✔️]: Translated 400 sents for en-el
2025-05-04 14:20:24 - [✔️]: Translated 400 sents for pt-en
2025-05-04 14:20:24 - [✔️]: Translated 400 sents for en-pt
2025-05-04 14:20:25 - [✔️]: Translated 400 sents for sv-en
2025-05-04 14:20:25 - [✔️]: Translated 400 sents for en-sv
2025-05-04 14:20:25 - [✔️]: Translated 400 sent

* The logs shown in real time when running a task with the `proc0.py` script are brief.
* The same logs are also stored in a `.log` file, which contains more detail (such as tracebacks) when errors occur.
    * The process does not run on a server and requires no deployment, so it makes sense to keep the real-time logs brief and readable and to keep the stored logs detailed for debugging.
    * Since we use a Jupyter Notebook, both types of logs can be 'stored' (provided the notebook is not re-run by accident).
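
A minimal sketch of this split between brief real-time logs and detailed stored logs, using Python's standard `logging` module. It is not the notebook's `TranslationLogger`: `BriefFormatter`, `build_logger`, and the two in-memory streams (standing in for the console and the `.log` file) are hypothetical names chosen for illustration.

```python
import io
import logging


class BriefFormatter(logging.Formatter):
    """Formats only the message line and leaves out any attached traceback."""

    def format(self, record):
        # Hide the exception details while formatting, then restore them so
        # that other handlers can still write the full traceback.
        exc_info, exc_text = record.exc_info, record.exc_text
        record.exc_info, record.exc_text = None, None
        try:
            return super().format(record)
        finally:
            record.exc_info, record.exc_text = exc_info, exc_text


def build_logger(brief_stream, detailed_stream, name="demo_task"):
    """Attach one brief and one detailed handler to the same logger."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    fmt = "%(asctime)s - %(message)s"

    brief = logging.StreamHandler(brief_stream)
    brief.setFormatter(BriefFormatter(fmt))
    detailed = logging.StreamHandler(detailed_stream)
    detailed.setFormatter(logging.Formatter(fmt))

    logger.addHandler(brief)
    logger.addHandler(detailed)
    return logger


# In-memory streams stand in for the console and the .log file (assumption).
brief_out, detailed_out = io.StringIO(), io.StringIO()
log = build_logger(brief_out, detailed_out)

try:
    raise RuntimeError("mock failure")
except RuntimeError as err:
    # One call feeds both handlers; only the detailed one keeps the traceback.
    log.error("Error %s", type(err).__name__, exc_info=True)

print(brief_out.getvalue())
print(detailed_out.getvalue())
```

Routing a single log call through two formatters keeps the brief and the detailed views in sync without logging each event twice.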
 

In [2]:
!cat proc0.log | head -n 18

2025-05-04 13:49:43 - [🚀]: Selected task for opus-100 - gpt
2025-05-04 13:49:43 - [🏁]: Starting task e11e4c3e-3305-44da-b1a4-24d2aae3c6d1 on commit 7498a71
2025-05-04 13:49:43 - [⚠️]: Error MockError
2025-05-04 13:49:43 - Traceback:
Traceback (most recent call last):
  File "C:\Files\UZH\Semester_6\BA_Thesis\BA_Repo\scripts\task.py", line 173, in run
    mt_sents = self.client.translate_and_store_document(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Files\UZH\Semester_6\BA_Thesis\BA_Repo\scripts\translators.py", line 54, in translate_and_store_document
    out_text = self.translate_document(
               ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Files\UZH\Semester_6\BA_Thesis\BA_Repo\scripts\translators.py", line 271, in translate_document
    out_text = self.encrypt(in_text, error_pair=(src_lang, tgt_lang))
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Files\UZH\Semester_6\BA_Thesis\BA_Repo\scripts\translators.py", line 253, in en

* In addition to the task-specific logs, we log further details for each translation in a JSONL file for later analysis.

In [3]:
!cat tmp/proc0.jsonl | head -n 5

{"src_lang": "de", "tgt_lang": "en", "start": 1746361223.4617388, "in_lines": 400, "in_sents": 444, "in_chars": 32731, "in_tokens": 8295, "end": 1746361223.511046, "out_chars": 15366, "out_lines": 200, "out_sents": 237, "out_tokens": 7368, "id": "a9d64ae5-f2f7-4137-8bde-9e313f41600a-0002", "translator": "gpt", "dataset": "Helsinki-NLP/opus-100", "verdict": "rejected"}
{"src_lang": "de", "tgt_lang": "en", "start": 1746361223.577673, "in_lines": 400, "in_sents": 444, "in_chars": 32731, "in_tokens": 8295, "end": 1746361223.6276066, "out_chars": 32731, "out_lines": 400, "out_sents": 478, "out_tokens": 15578, "id": "a9d64ae5-f2f7-4137-8bde-9e313f41600a-0003", "translator": "gpt", "dataset": "Helsinki-NLP/opus-100", "verdict": "accepted"}
{"src_lang": "en", "tgt_lang": "de", "start": 1746361223.7608526, "in_lines": 400, "in_sents": 446, "in_chars": 29414, "in_tokens": 6994, "end": 1746361223.810825, "out_chars": 29414, "out_lines": 400, "out_sents": 449, "out_tokens": 13305, "id": "a9d64ae5-
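
As an illustration of the kind of analysis these records allow, the sketch below derives a duration and an output-to-input line ratio from one record. The field values are copied from the first record shown above (only a subset of its fields is used); `summarise` is a hypothetical helper, not part of the project's scripts.

```python
import json

# Subset of the fields of the first record above, values copied verbatim.
line = (
    '{"src_lang": "de", "tgt_lang": "en", "start": 1746361223.4617388, '
    '"in_lines": 400, "end": 1746361223.511046, "out_lines": 200, '
    '"verdict": "rejected"}'
)


def summarise(record):
    """Derive a few per-translation figures from one JSONL record."""
    return {
        "pair": f"{record['src_lang']}-{record['tgt_lang']}",
        "duration_s": record["end"] - record["start"],
        "line_ratio": record["out_lines"] / record["in_lines"],
        "verdict": record["verdict"],
    }


summary = summarise(json.loads(line))
print(summary)
```

For this record the line ratio is 0.5, which matches the 200 of 400 sentences reported in the rejected attempt.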

* The translation logs are linked to the task: note that each translation log id uses the task id as a prefix.

In [4]:
!cat tmp/proc0/opus-100/gpt/task.json

{"task_id": "a9d64ae5-f2f7-4137-8bde-9e313f41600a", "git_hash": "7498a71", "dataset": "Helsinki-NLP/opus-100", "num_of_sents": 400, "split": "test[:500]", "translator": "gpt", "acceptable_range": [360, 480], "timestamp": "2025-05-04T14:20:27.192907+02:00", "manual_retry": false, "duration": 3.831834077835083}
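
The link between a translation log and its task can be checked by splitting a log id at its last hyphen. This is a small sketch under the assumption, based only on the ids visible above, that the format is `<task_id>-<counter>`; `split_log_id` is a hypothetical helper.

```python
def split_log_id(log_id):
    """Split '<task_id>-<counter>' at the last hyphen (assumed id format)."""
    task_id, _, counter = log_id.rpartition("-")
    return task_id, int(counter)


# Both values are taken from the outputs shown above.
task_id = "a9d64ae5-f2f7-4137-8bde-9e313f41600a"
log_id = "a9d64ae5-f2f7-4137-8bde-9e313f41600a-0002"

prefix, counter = split_log_id(log_id)
print(prefix == task_id, counter)  # prints: True 2
```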


## Manual Retry
* When a task fails to deliver the desired translations even after automatic retries, we need to run the task again.
* For documentation purposes, we want to link each manual retry to the 'faulty' translation the task delivered before.
* If a pair is 'faulty' because errors prevented any translation from arriving, no translation was logged, so we specify a reason but no id.
* If a pair is 'faulty' because every translation that arrived was rejected automatically, we link the retry to the newest rejected translation.
    * Automatic rejection and retry are limited to a few attempts to avoid paying too much.
    * The main rejection reason is receiving too many or too few output translations. All rejected translations are still stored and have an id.
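
A minimal sketch of such a rejection check, using the `acceptable_range` of `[360, 480]` recorded in `task.json` above for 400 requested sentences. That the bounds are inclusive and that output lines are the quantity being counted are assumptions; `judge` is a hypothetical name.

```python
def judge(out_count, acceptable_range):
    """Accept a translation only if its output count lies within the range."""
    low, high = acceptable_range  # bounds treated as inclusive (assumption)
    return "accepted" if low <= out_count <= high else "rejected"


acceptable_range = [360, 480]  # value from task.json

print(judge(200, acceptable_range))  # prints: rejected
print(judge(400, acceptable_range))  # prints: accepted
```

The two calls mirror the first run above, where an attempt with 200 output lines was rejected and one with 400 was accepted.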

* In the following, we define a scenario in which En-Fr and Fr-En are not translated successfully because:
    * En-Fr was automatically rejected three times
    * Fr-En failed with an error three times

In [5]:
pairs = [('en', 'de'), ('en', 'fr'), ('de', 'en'), ('fr', 'en')]
scenario = [0, 1, 1, 1, 0, 2, 2, 2]
verdicts = ['accepted', 'rejected', 'rejected', 'rejected', 'accepted']

In [6]:
from scripts.data_management import Opus100Manager
from scripts.task import TranslationTask
from scripts.translators import MockClient
from scripts.logger import TranslationLogger
from os.path import join
from io import StringIO
main_folder = 'tmp'
sub_folder = join(main_folder, 'proc0')

dm = Opus100Manager()
logfile = StringIO()
logger = TranslationLogger(logfile=logfile)
cli = MockClient(logger=logger, model='mock', scenario=scenario)
task = TranslationTask(
    target_pairs=pairs,
    dm=dm,
    client=cli,
    logger=logger,
    mt_folder=sub_folder,
    num_of_sents=400,
    max_retries=2,
    retry_delay=0
)

In [7]:
task.run()

2025-05-04 14:20:27 - [🏁]: Starting task ac94274a-2187-40cc-84dc-aaeb16d0d2b2 on commit 7498a71
2025-05-04 14:20:27 - [✔️]: Translated 400 sents for en-de
2025-05-04 14:20:27 - [❌]: Translated 200 sents for en-fr but rejected
2025-05-04 14:20:27 - [⏲️]: Retrying en-fr...
2025-05-04 14:20:27 - [❌]: Translated 200 sents for en-fr but rejected
2025-05-04 14:20:27 - [⏲️]: Retrying en-fr...
2025-05-04 14:20:28 - [❌]: Translated 200 sents for en-fr but rejected
2025-05-04 14:20:28 - [⏩]: Failed 2 times, skipping en-fr...
2025-05-04 14:20:28 - [✔️]: Translated 400 sents for de-en
2025-05-04 14:20:28 - [⚠️]: Error MockError
2025-05-04 14:20:28 - [⏲️]: Retrying fr-en...
2025-05-04 14:20:28 - [⚠️]: Error MockError
2025-05-04 14:20:28 - [⏲️]: Retrying fr-en...
2025-05-04 14:20:28 - [⚠️]: Error MockError
2025-05-04 14:20:28 - [⏩]: Failed 2 times, skipping fr-en...
2025-05-04 14:20:28 - [🏁]: Task took 1.18s
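
The log above is consistent with reading the scenario codes as 0 = accepted, 1 = rejected, 2 = error, consumed one per attempt. The sketch below replays the scenario under that assumed reading; it is not the logic of `TranslationTask` or `MockClient`, and `simulate` is a hypothetical function.

```python
def simulate(pairs, scenario, max_retries=2):
    """Replay scenario codes: 0 = accepted, 1 = rejected, 2 = error (assumed)."""
    codes = iter(scenario)
    verdicts, skipped = [], []
    for pair in pairs:
        for _attempt in range(max_retries + 1):  # first try plus retries
            code = next(codes, 0)  # assume success once the scenario runs out
            if code == 2:
                continue  # an error yields no translation, so nothing is logged
            verdicts.append("accepted" if code == 0 else "rejected")
            if code == 0:
                break  # accepted: move on to the next pair
        else:
            skipped.append(pair)  # every attempt was rejected or failed
    return verdicts, skipped


pairs = [("en", "de"), ("en", "fr"), ("de", "en"), ("fr", "en")]
scenario = [0, 1, 1, 1, 0, 2, 2, 2]

verdicts, skipped = simulate(pairs, scenario)
print(verdicts)
print(skipped)
```

Under this reading, failed attempts leave no verdict, which is why eight scenario codes produce only five logged verdicts.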


In [8]:
import json
log_data = [json.loads(ln) for ln in logfile.getvalue().splitlines()]
[log['verdict'] for log in log_data] == verdicts


True

* Based on these results, we decide to retry En-Fr and Fr-En.
* During analysis, we also noticed that De-En was not actually translated: the BLEU score was in the single digits because the source text was returned, so we retry that pair as well.

In [9]:
# Collect the ids of all logged translations for the pairs we want to retry
retry_pairs = {('en', 'fr'), ('de', 'en')}
log_ids = [log['id'] for log in log_data if (log['src_lang'], log['tgt_lang']) in retry_pairs]
log_ids

['ac94274a-2187-40cc-84dc-aaeb16d0d2b2-0002',
 'ac94274a-2187-40cc-84dc-aaeb16d0d2b2-0003',
 'ac94274a-2187-40cc-84dc-aaeb16d0d2b2-0004',
 'ac94274a-2187-40cc-84dc-aaeb16d0d2b2-0005']

In [10]:
# Check which language pairs the two newest ids belong to
for log in log_data:
    if log['id'] in log_ids[-2:]:
        print(log['src_lang'], log['tgt_lang'])



en fr
de en
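
Instead of indexing `log_ids` by position, the newest id per pair can also be looked up directly. The sketch below assumes the logs are in chronological order; the ids and pairs are those shown above, and `newest_id_per_pair` is a hypothetical helper.

```python
def newest_id_per_pair(logs):
    """Map each (src, tgt) pair to the id of its last logged translation."""
    newest = {}
    for log in logs:  # later entries overwrite earlier ones
        newest[(log["src_lang"], log["tgt_lang"])] = log["id"]
    return newest


task_id = "ac94274a-2187-40cc-84dc-aaeb16d0d2b2"
logs = [
    {"src_lang": "en", "tgt_lang": "fr", "id": f"{task_id}-0002"},
    {"src_lang": "en", "tgt_lang": "fr", "id": f"{task_id}-0003"},
    {"src_lang": "en", "tgt_lang": "fr", "id": f"{task_id}-0004"},
    {"src_lang": "de", "tgt_lang": "en", "id": f"{task_id}-0005"},
]

newest = newest_id_per_pair(logs)
print(newest[("en", "fr")])
print(newest[("de", "en")])
```

A pair that never produced a translation, such as Fr-En here, is simply absent from the mapping, so `newest.get(pair)` returns `None` for it.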


In [11]:
from scripts.logger import RetryLog
selected_ids = [log_ids[-2], log_ids[-1], None]
target_pairs = [('en', 'fr'), ('de', 'en'), ('fr', 'en')]
reasons = ['no accepted translation yet', 'returned src text', 'no translation received yet']
retry_log = RetryLog(pairs=target_pairs, log_ids=selected_ids, reasons=reasons)
new_logger = TranslationLogger(logfile=logfile, retry_log=retry_log)
cli = MockClient(logger=new_logger)
sub_folder = join(main_folder, 'tmp2')

task = TranslationTask(
    target_pairs=target_pairs,
    dm=dm,
    client=cli,
    logger=new_logger,
    mt_folder=sub_folder,
    num_of_sents=400,
    manual_retry=True,
    max_retries=1,
    retry_delay=0
)

task.run()

2025-05-04 14:20:28 - [🏁]: Starting task 098737dc-e303-41b4-96db-79c7ea75045a on commit 7498a71


2025-05-04 14:20:28 - [✔️]: Translated 400 sents for en-fr
2025-05-04 14:20:29 - [✔️]: Translated 400 sents for de-en
2025-05-04 14:20:29 - [✔️]: Translated 400 sents for fr-en
2025-05-04 14:20:29 - [🏁]: Task took 0.83s


In [12]:
log_data = [json.loads(ln) for ln in logfile.getvalue().splitlines()]
log_data[-3:]

[{'src_lang': 'en',
  'tgt_lang': 'fr',
  'start': 1746361228.6587605,
  'in_lines': 400,
  'in_sents': 429,
  'in_chars': 40777,
  'in_tokens': 9019,
  'end': 1746361228.7461498,
  'out_chars': 40777,
  'out_lines': 400,
  'out_sents': 437,
  'out_tokens': 18536,
  'id': '098737dc-e303-41b4-96db-79c7ea75045a-0001',
  'translator': 'mock',
  'dataset': 'Helsinki-NLP/opus-100',
  'manual_retry': {'prev_id': 'ac94274a-2187-40cc-84dc-aaeb16d0d2b2-0004',
   'reason': 'no accepted translation yet'},
  'verdict': 'accepted'},
 {'src_lang': 'de',
  'tgt_lang': 'en',
  'start': 1746361228.9506786,
  'in_lines': 400,
  'in_sents': 444,
  'in_chars': 32731,
  'in_tokens': 8295,
  'end': 1746361229.0243175,
  'out_chars': 32731,
  'out_lines': 400,
  'out_sents': 478,
  'out_tokens': 15578,
  'id': '098737dc-e303-41b4-96db-79c7ea75045a-0002',
  'translator': 'mock',
  'dataset': 'Helsinki-NLP/opus-100',
  'manual_retry': {'prev_id': 'ac94274a-2187-40cc-84dc-aaeb16d0d2b2-0005',
   'reason': 'retur
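
The `manual_retry` field in the output above pairs a previous log id with a reason. The sketch below shows one way such a lookup could be built from the three parallel lists passed to `RetryLog`. It is not the implementation of `RetryLog`: `RetryEntry`, `build_retry_lookup`, and `manual_retry_field` are hypothetical, and keeping `None` as `prev_id` for a pair without an earlier translation is an assumption, since that case is not visible in the output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetryEntry:
    prev_id: Optional[str]  # None when no earlier translation exists
    reason: str


def build_retry_lookup(pairs, log_ids, reasons):
    """Combine three parallel lists into a lookup keyed by language pair."""
    if not (len(pairs) == len(log_ids) == len(reasons)):
        raise ValueError("pairs, log_ids and reasons must have the same length")
    return {
        pair: RetryEntry(prev_id, reason)
        for pair, prev_id, reason in zip(pairs, log_ids, reasons)
    }


def manual_retry_field(lookup, pair):
    """Build the dictionary stored under 'manual_retry' for one pair."""
    entry = lookup[pair]
    return {"prev_id": entry.prev_id, "reason": entry.reason}


# Values taken from the retry cell and its output above.
target_pairs = [("en", "fr"), ("de", "en"), ("fr", "en")]
selected_ids = [
    "ac94274a-2187-40cc-84dc-aaeb16d0d2b2-0004",
    "ac94274a-2187-40cc-84dc-aaeb16d0d2b2-0005",
    None,
]
reasons = [
    "no accepted translation yet",
    "returned src text",
    "no translation received yet",
]

lookup = build_retry_lookup(target_pairs, selected_ids, reasons)
print(manual_retry_field(lookup, ("en", "fr")))
```

Validating the list lengths up front guards against a reason silently being attached to the wrong pair.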

In [13]:
!rm -rf tmp