# Pipeline (so far, only for one text)

You need the following project structure:

    |_main.ipynb
    |_<b>cyrillic_textgrids</b>
                |_<i>YourTextgrid</i>.TextGrid
    |_<b>latin_textgrids</b>
    |_gridtext.py
    |_heap.py
    |_setup_logger.py
    |_tests.py
    |_transl_dict.csv
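
Before running the notebook, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the project: the helper name `missing_project_items` is hypothetical, and the file and folder names are copied from the structure listed above.

```python
from pathlib import Path

# Names copied from the project structure listed above.
REQUIRED_DIRS = ["cyrillic_textgrids", "latin_textgrids"]
REQUIRED_FILES = [
    "main.ipynb",
    "gridtext.py",
    "heap.py",
    "setup_logger.py",
    "tests.py",
    "transl_dict.csv",
]


def missing_project_items(root="."):
    """Return the required folders and files that are absent under `root`.

    Hypothetical helper for illustration; it is not defined in the project.
    """
    root = Path(root)
    missing = [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
    missing += [name for name in REQUIRED_FILES if not (root / name).is_file()]
    return missing
```

An empty list means every expected item was found; otherwise the list names what still needs to be created or copied in.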

In [None]:
# Run
from os.path import join
from gridtext import GridText, GridTextTranscribed
from heap import make_heap
from tests import TestGridText

def translit_dict():  # old version, to be rewritten
    # Read the transliteration table: one "source,target" pair per line.
    with open('transl_dict.csv', 'r', encoding='UTF-8') as f:
        txt = f.read()
    rows = [line.split(',') for line in txt.split('\n')]
    # Keep only well-formed rows with a non-empty source.
    mapping = {row[0]: row[1] for row in rows if len(row) == 2 and row[0] != ''}

    # Add capitalized variants of every pair.
    mapping_cap = {}
    for key in mapping:
        mapping_cap[key.capitalize()] = mapping[key].capitalize()
    mapping.update(mapping_cap)

    return mapping
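
Since `translit_dict` is marked as an old version to rewrite, one possible direction is sketched below using the standard `csv` module. This is only a suggestion under stated assumptions: the name `load_translit_dict` and its `path` parameter are hypothetical, and `transl_dict.csv` is assumed to hold two columns per row (source, target). Unlike a plain `split(',')`, `csv.reader` also honors quoted fields, so results may differ if the file contains quotes.

```python
import csv


def load_translit_dict(path="transl_dict.csv"):
    """Read (source, target) rows and add capitalized variants.

    Hypothetical rewrite for illustration; not the project's own code.
    """
    mapping = {}
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            # Skip malformed rows and rows with an empty source.
            if len(row) == 2 and row[0] != "":
                mapping[row[0]] = row[1]
    # Add capitalized variants of every pair.
    capitalized = {k.capitalize(): v.capitalize() for k, v in mapping.items()}
    mapping.update(capitalized)
    return mapping
```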


## Step 1

First, specify the TextGrid file name and tier names:

In [None]:
# Edit
test_tg_name = 'TEST.TextGrid'  # your filename
tiernames = ['2', '1']  # [translation tier, transcription tier]

## Step 2

Replace the blank translations of Russian passages in the transcription, for example:

![blank](img/blank_translation.png)

This is needed so that the boundaries can be aligned without losing empty intervals.

In [None]:
# Run
path_to_test_tg = join('cyrillic_textgrids', test_tg_name)
test_tg = GridText.from_tg_file(path_to_test_tg, *tiernames)
test_tg.replace_blank_translation()

## Step 3

Transliterate the transcription tier.

In [None]:
# Run
transliterated_test_tg = test_tg.transliterate_tg('3', translit_dict())
print(GridTextTranscribed.get_labels(transliterated_test_tg.latin_transcription))
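
How `transliterate_tg` applies the dictionary is not shown here. As an illustration only, dictionary-based transliteration of a single label could be done with greedy longest-match replacement, sketched below. The function `transliterate` is hypothetical and is not part of `gridtext.py`.

```python
def transliterate(text, mapping):
    """Replace the longest matching key at each position with its value.

    Hypothetical sketch; characters without a matching key are kept as-is.
    """
    if not mapping:
        return text
    max_len = max(len(key) for key in mapping)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so multi-character keys win.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in mapping:
                out.append(mapping[chunk])
                i += size
                break
        else:
            # No key matches at this position: keep the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

Trying longer keys first matters when the table contains multi-character sources, so that a two-letter key is not split into two single-letter replacements.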

## Step 4

Save the transliterated TextGrid file.

In [None]:
# Run
path_to_test_tg_save = join('latin_textgrids', 'test_' + test_tg_name)
transliterated_test_tg.save_tg(path_to_test_tg_save)

## Step 5

Align the boundaries across tiers to avoid mistakes when searching for a translation:

![misclick](img/misclicks.png)

Save the file again.

In [None]:
# Run
tiernames.append('3')  # name of the Latin transcription tier
transliterated_test_tg = GridTextTranscribed.from_tg_file(path_to_test_tg_save, *tiernames, align=True)
transliterated_test_tg.save_tg(path_to_test_tg_save)
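
What `align=True` does internally is not documented here. One common way to repair misclicked boundaries is to snap each boundary to the nearest boundary on a reference tier when the two are very close. The sketch below illustrates that idea on plain lists of times; the function `snap_boundaries` and the `tolerance` value are assumptions for illustration, not the behavior of `GridTextTranscribed`.

```python
def snap_boundaries(boundaries, reference, tolerance=0.01):
    """Move each time in `boundaries` to the nearest time in `reference`
    when the two differ by at most `tolerance` seconds.

    Hypothetical sketch; times farther than `tolerance` are left unchanged.
    """
    snapped = []
    for t in boundaries:
        nearest = min(reference, key=lambda r: abs(r - t), default=None)
        if nearest is not None and abs(nearest - t) <= tolerance:
            snapped.append(nearest)
        else:
            snapped.append(t)
    return snapped
```

A small tolerance keeps genuinely different boundaries apart while still merging near-duplicates caused by imprecise clicks.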

## Step 6

Run the tests before making the HTML heap. All intervals without a corresponding translation will be stored in *error_heap.log*.

**Add the missing translations and align the boundaries in the TextGrid file before Step 7.**

In [None]:
# Run
test_test = TestGridText(transliterated_test_tg)
test_test.test_interval_boundaries()  # test boundaries
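
The internals of `test_interval_boundaries` are not shown here. As a rough illustration of the kind of check described above, the sketch below finds transcription intervals that have no translation interval with the same boundaries. It assumes intervals are plain `(start, end, label)` tuples; the function `intervals_without_translation` is hypothetical and is not part of `tests.py`.

```python
def intervals_without_translation(transcription, translation):
    """Return labeled transcription intervals that lack a non-empty
    translation interval with identical start and end times.

    Hypothetical sketch; intervals are (start, end, label) tuples.
    """
    translated = {
        (start, end) for start, end, label in translation if label.strip()
    }
    return [
        interval
        for interval in transcription
        if interval[2].strip() and (interval[0], interval[1]) not in translated
    ]
```

Exact comparison of boundaries is used on purpose: it only passes once the tiers have been aligned, which is why the boundaries are aligned in Step 5 before this check.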

## Step 7

Make the HTML heap.

Row result:

![heap](img/heap.png)

In [None]:
# Run
make_heap(transliterated_test_tg)