## Translate body text (evaluation)

In the following, we translate the extracted body text of the articles in the evaluation dataset. As previously mentioned, the limits of the Google Translate API require extra care: the text must be split so as not to exceed the character limit, and the set of documents must be split so as not to exceed the document limit.
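Splitting the set of documents can be sketched as iterating over half-open row ranges, each of which could be assigned to the `start` and `end` variables of the main loop. This is only an illustration: `batch_ranges` is a hypothetical helper, and the batch size of 500 is an assumption, not a documented limit.

```python
def batch_ranges(n_rows, batch_size):
    """Yield half-open (start, end) row ranges that together cover 0..n_rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_rows, batch_size):
        # the last range is shortened so that it never runs past n_rows
        yield start, min(start + batch_size, n_rows)


# Hypothetical usage: 1050 rows processed in batches of 500 (assumed size)
for start, end in batch_ranges(1050, 500):
    print("translate rows", start, "to", end - 1)
```

Saving the dataframe after each range would make it possible to resume from the last completed batch if a later request fails.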

In [2]:
import pandas as pd
from googletrans import Translator
import time

In [3]:
eval_df = pd.read_csv('eval/_EVAL_details_in_df.csv')

The character limit (at most 15k characters according to the official documentation, but 5k in our tests) can be overcome by subdividing the text in a way that preserves basic text units.

Our idea was to subdivide the text into paragraphs (using `\n\n` as the delimiter) so that each piece is shorter than 5000 characters. However, some articles had critical peculiarities (no paragraphs, no formatting, no punctuation), so there is no obvious way to cut the text that does not raise some problems.
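As a variation on the paragraph idea, consecutive paragraphs could be packed greedily into chunks that stay under the limit, which would reduce the number of requests. The following is a minimal sketch under that assumption, not the procedure used in this notebook; `pack_paragraphs` is a hypothetical helper, and paragraphs that are themselves over the limit are returned separately so that they can be handled in another way.

```python
def pack_paragraphs(text, limit=5000, delimiter="\n\n"):
    """Greedily group paragraphs into chunks of at most `limit` characters.

    Returns (chunks, oversized), where `oversized` holds the paragraphs
    that are longer than `limit` on their own.
    """
    chunks, oversized = [], []
    current = ""
    for para in text.split(delimiter):
        if not para.strip():
            continue  # ignore empty paragraphs
        if len(para) > limit:
            # close the chunk being built, then set the long paragraph aside
            if current:
                chunks.append(current)
                current = ""
            oversized.append(para)
            continue
        candidate = para if not current else current + delimiter + para
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks, oversized


# Small demonstration with an artificially low limit
demo_chunks, demo_oversized = pack_paragraphs("aaaa\n\nbbbb\n\ncccccccccccc\n\ndd", limit=10)
print(demo_chunks)
print(demo_oversized)
```

Note that the oversized paragraphs are removed from the sequence, so their position in the article has to be tracked separately if the original order matters.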

Other approaches we tried are cutting the text at the last punctuation symbol before the 5000-character limit, and brutally cutting the text at 5000 characters; in both cases the parts are glued together after translation.
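The two alternatives can be combined in one sketch: cut at the last punctuation symbol inside the first 5000 characters and, if there is none, fall back to a hard cut at the limit. The function name `split_at_punctuation` and the set of punctuation symbols are assumptions made for illustration; the exact rules we tried are not reproduced here.

```python
import re

# Assumed set of sentence-ending symbols (Latin and CJK); adjust as needed
_SENTENCE_END = re.compile(r"[.!?;。！？]")


def split_at_punctuation(text, limit=5000):
    """Split `text` into pieces of at most `limit` characters.

    Each cut is placed right after the last punctuation symbol found in the
    first `limit` characters; without any punctuation the cut is at `limit`.
    """
    pieces = []
    rest = text
    while len(rest) > limit:
        cut = -1
        for match in _SENTENCE_END.finditer(rest[:limit]):
            cut = match.end()  # keep the position after the last match
        if cut <= 0:
            cut = limit  # no punctuation in the window: hard cut
        pieces.append(rest[:cut])
        rest = rest[cut:]
    if rest:
        pieces.append(rest)
    return pieces


# Joining the pieces gives back the original text
sample = "Hello world. Foo bar baz. Qux"
parts = split_at_punctuation(sample, limit=15)
print(parts)
print("".join(parts) == sample)
```

Because no characters are dropped, the translated pieces can be glued back together in order, although a hard cut may still split a word or a sentence in two.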

In [10]:
translator = Translator()

def splitTranslator(file, text, lang, delimiter='\n\n'):
    # `lang` is kept for the call signature; the source language is auto-detected

    # break the long text into smaller chunks by splitting at every delimiter
    # ('\n\n' by default) and keep them in body_list
    body_list = text.split(delimiter)

    # translated chunks, in their original order
    translated_list = []

    # set to True when at least one chunk is still over the limit
    flag_long_paragraph = False

    for s in body_list:

        # a chunk that is still over 5000 characters is skipped and flagged
        if len(s) > 5000:
            flag_long_paragraph = True
            print(file, 'is still over limit')
            continue

        # empty chunks are ignored
        elif len(s) == 0:
            continue

        # a chunk under 5000 characters is translated and stored
        else:
            try:
                translated = translator.translate(s)
                translated_list.append(translated.text)
            except Exception:
                print(file, 'cannot translate')
            time.sleep(0.5)

    # put all the translated chunks together, separated by a space
    translated_body = ' '.join(translated_list)
    return translated_body, flag_long_paragraph

## main translation task
count = 0

# range of rows to process: start <= index < end
start = 0
end = len(eval_df)
long_paragraph_list = []

for index, row in eval_df.iterrows():

    if index < start or index >= end:
        continue

    print('--------------', index, '-----------------')
    id1 = str(row['pair_id']).split('_')[0]
    id2 = str(row['pair_id']).split('_')[1]

    # skip rows with a missing body (checked before str() turns NaN into 'nan')
    if pd.isnull(row['text1']) or pd.isnull(row['text2']):
        continue

    body1 = str(row['text1'])
    lang1 = row['url1_lang']
    body2 = str(row['text2'])
    lang2 = row['url2_lang']

    if lang1 != 'en':
        print('translating text1 of language ', lang1, ' of length ', len(body1), '...')
        if len(body1) > 5000:
            # too long for a single request: translate paragraph by paragraph
            translated1, too_long1 = splitTranslator(id1, body1, lang1)
            if too_long1:
                long_paragraph_list.append(id1)
        else:
            translated1 = translator.translate(body1).text
    else:
        # English text only needs its line breaks removed
        translated1 = body1.replace("\n", " ")

    if lang2 != 'en':
        print('translating text2 of language ', lang2, ' of length ', len(body2), '...')
        if len(body2) > 5000:
            translated2, too_long2 = splitTranslator(id2, body2, lang2)
            if too_long2:
                long_paragraph_list.append(id2)
        else:
            translated2 = translator.translate(body2).text
    else:
        translated2 = body2.replace("\n", " ")

    eval_df.loc[index, "translated_body1"] = translated1
    eval_df.loc[index, "translated_body2"] = translated2

    time.sleep(0.5)
    count += 1

    if (count+1) % 50 == 0:
        print(count+1, 'rows done.')


print("The number of articles that have been cut due to absence of paragraphs is ", len(long_paragraph_list))

-------------- 0 -----------------
-------------- 1 -----------------
-------------- 2 -----------------
-------------- 3 -----------------
-------------- 4 -----------------
-------------- 5 -----------------
-------------- 6 -----------------
-------------- 7 -----------------
-------------- 8 -----------------
-------------- 9 -----------------
-------------- 10 -----------------
-------------- 11 -----------------
-------------- 12 -----------------
-------------- 13 -----------------
-------------- 14 -----------------
-------------- 15 -----------------
-------------- 16 -----------------
-------------- 17 -----------------
-------------- 18 -----------------
-------------- 19 -----------------
-------------- 20 -----------------
-------------- 21 -----------------
-------------- 22 -----------------
-------------- 23 -----------------
-------------- 24 -----------------
-------------- 25 -----------------
-------------- 26 -----------------
-------------- 27 -----------------
[...]

-------------- 206 -----------------
translating text1 of language  ru  of length  1066 ...
translating text2 of language  ru  of length  847 ...
-------------- 207 -----------------
translating text1 of language  ru  of length  1040 ...
translating text2 of language  ru  of length  898 ...
-------------- 208 -----------------
translating text1 of language  ru  of length  586 ...
translating text2 of language  ru  of length  725 ...
-------------- 209 -----------------
translating text1 of language  ru  of length  1007 ...
translating text2 of language  ru  of length  643 ...
-------------- 210 -----------------
translating text1 of language  ru  of length  1227 ...
translating text2 of language  ru  of length  859 ...
-------------- 211 -----------------
translating text1 of language  ru  of length  1997 ...
translating text2 of language  ru  of length  1185 ...
-------------- 212 -----------------
translating text1 of language  ru  of length  730 ...
[...]

translating text1 of language  zh  of length  1111 ...
translating text2 of language  zh  of length  453 ...
-------------- 312 -----------------
translating text1 of language  zh  of length  307 ...
translating text2 of language  zh  of length  321 ...
-------------- 313 -----------------
translating text1 of language  zh  of length  340 ...
translating text2 of language  zh  of length  967 ...
-------------- 314 -----------------
translating text1 of language  zh  of length  1993 ...
translating text2 of language  zh  of length  1956 ...
-------------- 315 -----------------
translating text1 of language  zh  of length  246 ...
translating text2 of language  zh  of length  547 ...
-------------- 316 -----------------
translating text1 of language  zh  of length  1220 ...
translating text2 of language  zh  of length  751 ...
-------------- 317 -----------------
translating text1 of language  zh  of length  150 ...
translating text2 of language  zh  of length  150 ...
-------------- 318 -----------------
[...]

translating text2 of language  tr  of length  4797 ...
-------------- 427 -----------------
translating text1 of language  tr  of length  1192 ...
translating text2 of language  tr  of length  907 ...
-------------- 428 -----------------
translating text1 of language  tr  of length  1568 ...
translating text2 of language  tr  of length  430 ...
-------------- 429 -----------------
translating text1 of language  tr  of length  802 ...
translating text2 of language  tr  of length  887 ...
-------------- 430 -----------------
translating text1 of language  tr  of length  1075 ...
translating text2 of language  tr  of length  2427 ...
-------------- 431 -----------------
translating text1 of language  tr  of length  1144 ...
translating text2 of language  tr  of length  1597 ...
-------------- 432 -----------------
translating text1 of language  tr  of length  2808 ...
translating text2 of language  tr  of length  1096 ...
-------------- 433 -----------------
[...]

translating text1 of language  ru  of length  2207 ...
translating text2 of language  ru  of length  1647 ...
-------------- 544 -----------------
translating text1 of language  ru  of length  1358 ...
translating text2 of language  ru  of length  714 ...
-------------- 545 -----------------
translating text1 of language  ru  of length  1526 ...
translating text2 of language  ru  of length  2029 ...
-------------- 546 -----------------
translating text1 of language  ru  of length  1464 ...
translating text2 of language  ru  of length  2086 ...
-------------- 547 -----------------
translating text1 of language  ru  of length  878 ...
translating text2 of language  ru  of length  860 ...
-------------- 548 -----------------
translating text1 of language  ru  of length  951 ...
translating text2 of language  ru  of length  882 ...
550 rows done.
-------------- 549 -----------------
translating text1 of language  ru  of length  2900 ...
translating text2 of language  ru  of length  2909 ...
[...]

translating text1 of language  de  of length  1946 ...
translating text2 of language  de  of length  1451 ...
-------------- 653 -----------------
translating text1 of language  de  of length  928 ...
translating text2 of language  de  of length  1323 ...
-------------- 654 -----------------
translating text1 of language  de  of length  3738 ...
translating text2 of language  de  of length  4124 ...
-------------- 655 -----------------
translating text1 of language  de  of length  8094 ...
translating text2 of language  de  of length  1032 ...
-------------- 656 -----------------
translating text1 of language  de  of length  4190 ...
translating text2 of language  de  of length  2259 ...
-------------- 657 -----------------
translating text1 of language  de  of length  551 ...
translating text2 of language  de  of length  617 ...
-------------- 658 -----------------
translating text1 of language  de  of length  731 ...
translating text2 of language  de  of length  809 ...
[...]

-------------- 765 -----------------
translating text1 of language  de  of length  1804 ...
translating text2 of language  de  of length  1025 ...
-------------- 766 -----------------
translating text1 of language  de  of length  5856 ...
translating text2 of language  de  of length  3540 ...
-------------- 767 -----------------
translating text1 of language  de  of length  406 ...
translating text2 of language  de  of length  1661 ...
-------------- 768 -----------------
translating text1 of language  de  of length  141 ...
translating text2 of language  de  of length  4329 ...
-------------- 769 -----------------
translating text1 of language  de  of length  882 ...
translating text2 of language  de  of length  1012 ...
-------------- 770 -----------------
translating text1 of language  de  of length  2457 ...
translating text2 of language  de  of length  2050 ...
-------------- 771 -----------------
translating text1 of language  de  of length  1412 ...
[...]

translating text1 of language  de  of length  958 ...
translating text2 of language  de  of length  1976 ...
-------------- 885 -----------------
translating text1 of language  de  of length  2593 ...
translating text2 of language  de  of length  1408 ...
-------------- 886 -----------------
translating text1 of language  de  of length  3715 ...
translating text2 of language  de  of length  3426 ...
-------------- 887 -----------------
translating text1 of language  de  of length  1009 ...
translating text2 of language  de  of length  1360 ...
-------------- 888 -----------------
translating text1 of language  de  of length  2698 ...
translating text2 of language  de  of length  2212 ...
-------------- 889 -----------------
translating text1 of language  de  of length  255 ...
translating text2 of language  de  of length  1072 ...
-------------- 890 -----------------
translating text1 of language  de  of length  1493 ...
translating text2 of language  de  of length  5726 ...
[...]

-------------- 1003 -----------------
translating text1 of language  de  of length  1616 ...
translating text2 of language  de  of length  882 ...
-------------- 1004 -----------------
translating text1 of language  de  of length  2941 ...
translating text2 of language  de  of length  2168 ...
-------------- 1005 -----------------
translating text1 of language  de  of length  5158 ...
translating text2 of language  de  of length  1676 ...
-------------- 1006 -----------------
translating text1 of language  de  of length  1812 ...
translating text2 of language  de  of length  4611 ...
-------------- 1007 -----------------
translating text1 of language  de  of length  1125 ...
translating text2 of language  de  of length  1303 ...
-------------- 1008 -----------------
translating text1 of language  de  of length  6097 ...
translating text2 of language  de  of length  2213 ...
-------------- 1009 -----------------
translating text1 of language  de  of length  1418 ...
[...]

-------------- 1094 -----------------
translating text1 of language  es  of length  490 ...
translating text2 of language  es  of length  559 ...
-------------- 1095 -----------------
translating text1 of language  es  of length  5278 ...
translating text2 of language  es  of length  2509 ...
-------------- 1096 -----------------
translating text1 of language  es  of length  2467 ...
translating text2 of language  es  of length  2419 ...
-------------- 1097 -----------------
translating text1 of language  es  of length  2665 ...
translating text2 of language  es  of length  1851 ...
-------------- 1098 -----------------
translating text1 of language  es  of length  6135 ...
translating text2 of language  es  of length  2027 ...
1100 rows done.
-------------- 1099 -----------------
translating text1 of language  es  of length  3524 ...
translating text2 of language  es  of length  1549 ...
-------------- 1100 -----------------
translating text1 of language  es  of length  2904 ...
[...]

translating text2 of language  it  of length  1123 ...
-------------- 1185 -----------------
translating text1 of language  it  of length  946 ...
translating text2 of language  it  of length  1622 ...
-------------- 1186 -----------------
translating text1 of language  it  of length  1322 ...
translating text2 of language  it  of length  2637 ...
-------------- 1187 -----------------
translating text1 of language  it  of length  1165 ...
translating text2 of language  it  of length  761 ...
-------------- 1188 -----------------
translating text1 of language  it  of length  737 ...
translating text2 of language  it  of length  566 ...
-------------- 1189 -----------------
translating text1 of language  it  of length  1153 ...
translating text2 of language  it  of length  913 ...
-------------- 1190 -----------------
translating text1 of language  it  of length  1316 ...
translating text2 of language  it  of length  1315 ...
-------------- 1191 -----------------
[...]

translating text2 of language  it  of length  1541 ...
-------------- 1334 -----------------
translating text1 of language  it  of length  1417 ...
translating text2 of language  it  of length  1322 ...
-------------- 1335 -----------------
translating text1 of language  it  of length  5390 ...
translating text2 of language  it  of length  2210 ...
-------------- 1336 -----------------
translating text1 of language  it  of length  1642 ...
translating text2 of language  it  of length  2399 ...
-------------- 1337 -----------------
translating text1 of language  it  of length  1032 ...
translating text2 of language  it  of length  818 ...
-------------- 1338 -----------------
translating text1 of language  it  of length  765 ...
translating text2 of language  it  of length  567 ...
-------------- 1339 -----------------
translating text1 of language  it  of length  5523 ...
translating text2 of language  it  of length  1276 ...
-------------- 1340 -----------------
[...]

-------------- 1463 -----------------
translating text1 of language  it  of length  349 ...
translating text2 of language  it  of length  402 ...
-------------- 1464 -----------------
translating text1 of language  it  of length  993 ...
translating text2 of language  it  of length  609 ...
-------------- 1465 -----------------
translating text1 of language  it  of length  2058 ...
translating text2 of language  it  of length  3115 ...
-------------- 1466 -----------------
translating text1 of language  it  of length  5594 ...
translating text2 of language  it  of length  4359 ...
-------------- 1467 -----------------
translating text1 of language  it  of length  1354 ...
translating text2 of language  it  of length  2314 ...
-------------- 1468 -----------------
translating text1 of language  it  of length  1661 ...
translating text2 of language  it  of length  3037 ...
-------------- 1469 -----------------
translating text1 of language  it  of length  1692 ...
[...]

translating text2 of language  zh  of length  651 ...
-------------- 1564 -----------------
translating text1 of language  zh  of length  2617 ...
translating text2 of language  zh  of length  4986 ...
-------------- 1565 -----------------
translating text1 of language  zh  of length  674 ...
translating text2 of language  zh  of length  1587 ...
-------------- 1566 -----------------
translating text1 of language  zh  of length  1304 ...
translating text2 of language  zh  of length  5607 ...
-------------- 1567 -----------------
translating text1 of language  zh  of length  76 ...
translating text2 of language  zh  of length  3202 ...
-------------- 1568 -----------------
translating text1 of language  zh  of length  968 ...
translating text2 of language  zh  of length  1866 ...
-------------- 1569 -----------------
translating text1 of language  zh  of length  466 ...
translating text2 of language  zh  of length  4917 ...
-------------- 1570 -----------------
[...]

-------------- 1683 -----------------
translating text1 of language  de  of length  2145 ...
translating text2 of language  de  of length  5767 ...
-------------- 1684 -----------------
translating text1 of language  de  of length  1620 ...
translating text2 of language  de  of length  1589 ...
-------------- 1685 -----------------
translating text1 of language  de  of length  4311 ...
translating text2 of language  de  of length  1003 ...
-------------- 1686 -----------------
translating text1 of language  de  of length  1552 ...
translating text2 of language  de  of length  1483 ...
-------------- 1687 -----------------
translating text1 of language  de  of length  947 ...
translating text2 of language  de  of length  3653 ...
-------------- 1688 -----------------
translating text1 of language  de  of length  4677 ...
translating text2 of language  de  of length  3804 ...
-------------- 1689 -----------------
translating text1 of language  de  of length  1378 ...
[...]

translating text1 of language  es  of length  3097 ...
translating text2 of language  es  of length  1186 ...
-------------- 1808 -----------------
translating text1 of language  es  of length  2539 ...
translating text2 of language  es  of length  4156 ...
-------------- 1809 -----------------
translating text1 of language  es  of length  5401 ...
translating text2 of language  es  of length  1718 ...
-------------- 1810 -----------------
translating text1 of language  es  of length  1257 ...
translating text2 of language  es  of length  1167 ...
-------------- 1811 -----------------
translating text1 of language  es  of length  3585 ...
translating text2 of language  es  of length  832 ...
-------------- 1812 -----------------
translating text1 of language  es  of length  3638 ...
translating text2 of language  es  of length  2281 ...
-------------- 1813 -----------------
translating text1 of language  es  of length  6214 ...
translating text2 of language  es  of length  2455 ...
[...]

translating text1 of language  es  of length  1578 ...
translating text2 of language  es  of length  2025 ...
-------------- 1922 -----------------
translating text1 of language  es  of length  6426 ...
translating text2 of language  es  of length  4179 ...
-------------- 1923 -----------------
translating text1 of language  es  of length  984 ...
translating text2 of language  es  of length  941 ...
-------------- 1924 -----------------
translating text1 of language  es  of length  3715 ...
translating text2 of language  es  of length  531 ...
-------------- 1925 -----------------
translating text1 of language  es  of length  1799 ...
translating text2 of language  es  of length  6812 ...
-------------- 1926 -----------------
translating text1 of language  es  of length  3846 ...
translating text2 of language  es  of length  1492 ...
-------------- 1927 -----------------
translating text1 of language  es  of length  5319 ...
translating text2 of language  es  of length  4062 ...
[...]

translating text2 of language  es  of length  893 ...
-------------- 2036 -----------------
translating text1 of language  es  of length  1097 ...
translating text2 of language  es  of length  1201 ...
-------------- 2037 -----------------
translating text1 of language  es  of length  1882 ...
translating text2 of language  es  of length  178 ...
-------------- 2038 -----------------
translating text1 of language  es  of length  1988 ...
translating text2 of language  es  of length  4223 ...
-------------- 2039 -----------------
translating text1 of language  es  of length  1928 ...
translating text2 of language  es  of length  1512 ...
-------------- 2040 -----------------
translating text1 of language  es  of length  2193 ...
translating text2 of language  es  of length  606 ...
-------------- 2041 -----------------
translating text1 of language  es  of length  2966 ...
translating text2 of language  es  of length  2316 ...
-------------- 2042 -----------------
[...]

translating text2 of language  ar  of length  877 ...
-------------- 2122 -----------------
translating text1 of language  ar  of length  811 ...
translating text2 of language  ar  of length  702 ...
-------------- 2123 -----------------
translating text1 of language  ar  of length  2302 ...
translating text2 of language  ar  of length  1445 ...
-------------- 2124 -----------------
translating text1 of language  ar  of length  2249 ...
translating text2 of language  ar  of length  1857 ...
-------------- 2125 -----------------
translating text1 of language  ar  of length  1057 ...
translating text2 of language  ar  of length  1093 ...
-------------- 2126 -----------------
translating text1 of language  ar  of length  2958 ...
translating text2 of language  ar  of length  1199 ...
-------------- 2127 -----------------
translating text1 of language  ar  of length  261 ...
translating text2 of language  ar  of length  998 ...
-------------- 2128 -----------------
[...]

-------------- 2268 -----------------
translating text1 of language  de  of length  3348 ...
translating text2 of language  de  of length  1672 ...
-------------- 2269 -----------------
translating text1 of language  de  of length  5322 ...
translating text2 of language  de  of length  3043 ...
-------------- 2270 -----------------
translating text1 of language  de  of length  3723 ...
translating text2 of language  de  of length  2821 ...
-------------- 2271 -----------------
translating text1 of language  de  of length  726 ...
translating text2 of language  de  of length  628 ...
-------------- 2272 -----------------
translating text1 of language  de  of length  1839 ...
translating text2 of language  de  of length  1404 ...
-------------- 2273 -----------------
translating text1 of language  de  of length  3288 ...
translating text2 of language  de  of length  9550 ...
-------------- 2274 -----------------
translating text1 of language  de  of length  2995 ...
[...]

translating text2 of language  de  of length  3774 ...
-------------- 2393 -----------------
translating text1 of language  de  of length  1909 ...
translating text2 of language  de  of length  2131 ...
-------------- 2394 -----------------
translating text1 of language  de  of length  1524 ...
translating text2 of language  de  of length  2933 ...
-------------- 2395 -----------------
translating text1 of language  de  of length  3394 ...
translating text2 of language  de  of length  2020 ...
-------------- 2396 -----------------
translating text1 of language  de  of length  1009 ...
translating text2 of language  de  of length  844 ...
-------------- 2397 -----------------
translating text1 of language  de  of length  2047 ...
translating text2 of language  de  of length  1844 ...
-------------- 2398 -----------------
translating text1 of language  de  of length  1464 ...
translating text2 of language  de  of length  2177 ...
2400 rows done.
-------------- 2399 -----------------
[...]

translating text1 of language  fr  of length  616 ...
translating text2 of language  fr  of length  621 ...
-------------- 2508 -----------------
translating text1 of language  fr  of length  6086 ...
translating text2 of language  fr  of length  986 ...
-------------- 2509 -----------------
translating text1 of language  fr  of length  1896 ...
translating text2 of language  fr  of length  1073 ...
-------------- 2510 -----------------
translating text1 of language  fr  of length  3212 ...
translating text2 of language  fr  of length  1232 ...
-------------- 2511 -----------------
translating text1 of language  fr  of length  1224 ...
translating text2 of language  fr  of length  1210 ...
-------------- 2512 -----------------
translating text1 of language  fr  of length  5542 ...
translating text2 of language  fr  of length  3455 ...
-------------- 2513 -----------------
translating text1 of language  fr  of length  1754 ...
translating text2 of language  fr  of length  2200 ...
[...]

translating text1 of language  pl  of length  4975 ...
translating text2 of language  pl  of length  2257 ...
-------------- 2625 -----------------
translating text1 of language  pl  of length  2372 ...
translating text2 of language  pl  of length  2839 ...
-------------- 2626 -----------------
translating text1 of language  pl  of length  695 ...
translating text2 of language  pl  of length  1451 ...
-------------- 2627 -----------------
translating text1 of language  pl  of length  1177 ...
translating text2 of language  pl  of length  798 ...
-------------- 2628 -----------------
translating text1 of language  pl  of length  2569 ...
translating text2 of language  pl  of length  2683 ...
-------------- 2629 -----------------
translating text1 of language  pl  of length  3985 ...
translating text2 of language  pl  of length  1970 ...
-------------- 2630 -----------------
translating text1 of language  pl  of length  1983 ...
translating text2 of language  pl  of length  1191 ...
[...]

-------------- 2751 -----------------
translating text1 of language  pl  of length  2034 ...
translating text2 of language  pl  of length  1809 ...
-------------- 2752 -----------------
translating text1 of language  pl  of length  2407 ...
translating text2 of language  pl  of length  4307 ...
-------------- 2753 -----------------
translating text1 of language  pl  of length  925 ...
translating text2 of language  pl  of length  966 ...
-------------- 2754 -----------------
translating text1 of language  pl  of length  7562 ...
translating text2 of language  pl  of length  5758 ...
-------------- 2755 -----------------
translating text1 of language  pl  of length  5140 ...
translating text2 of language  pl  of length  10273 ...
-------------- 2756 -----------------
translating text1 of language  pl  of length  2419 ...
translating text2 of language  pl  of length  895 ...
-------------- 2757 -----------------
translating text1 of language  pl  of length  4430 ...
[...]

-------------- 2867 -----------------
translating text1 of language  ru  of length  5886 ...
translating text2 of language  ru  of length  4775 ...
-------------- 2868 -----------------
translating text1 of language  ru  of length  1736 ...
translating text2 of language  ru  of length  1874 ...
-------------- 2869 -----------------
translating text1 of language  ru  of length  3785 ...
translating text2 of language  ru  of length  2651 ...
-------------- 2870 -----------------
translating text1 of language  ru  of length  2118 ...
translating text2 of language  ru  of length  2132 ...
-------------- 2871 -----------------
translating text1 of language  ru  of length  4105 ...
translating text2 of language  ru  of length  1827 ...
-------------- 2872 -----------------
translating text1 of language  ru  of length  8950 ...
translating text2 of language  ru  of length  2534 ...
-------------- 2873 -----------------
translating text1 of language  ru  of length  1264 ...
[...]

translating text2 of language  tr  of length  4620 ...
-------------- 3008 -----------------
translating text1 of language  tr  of length  5936 ...
translating text2 of language  tr  of length  2321 ...
-------------- 3009 -----------------
translating text1 of language  tr  of length  2570 ...
translating text2 of language  tr  of length  5625 ...
-------------- 3010 -----------------
translating text1 of language  tr  of length  1112 ...
translating text2 of language  tr  of length  955 ...
-------------- 3011 -----------------
translating text1 of language  tr  of length  1558 ...
translating text2 of language  tr  of length  1974 ...
-------------- 3012 -----------------
translating text1 of language  tr  of length  6796 ...
translating text2 of language  tr  of length  3168 ...
-------------- 3013 -----------------
translating text1 of language  tr  of length  1945 ...
translating text2 of language  tr  of length  1979 ...
-------------- 3014 -----------------
[...]

-------------- 3117 -----------------
translating text1 of language  zh  of length  417 ...
translating text2 of language  zh  of length  500 ...
-------------- 3118 -----------------
translating text1 of language  zh  of length  122 ...
translating text2 of language  zh  of length  161 ...
-------------- 3119 -----------------
translating text1 of language  zh  of length  276 ...
translating text2 of language  zh  of length  447 ...
-------------- 3120 -----------------
translating text1 of language  zh  of length  634 ...
translating text2 of language  zh  of length  470 ...
-------------- 3121 -----------------
translating text1 of language  zh  of length  526 ...
translating text2 of language  zh  of length  327 ...
-------------- 3122 -----------------
translating text1 of language  zh  of length  338 ...
translating text2 of language  zh  of length  397 ...
-------------- 3123 -----------------
translating text1 of language  zh  of length  1316 ...
[...]

translating text2 of language  zh  of length  150 ...
-------------- 3221 -----------------
translating text1 of language  zh  of length  585 ...
translating text2 of language  zh  of length  657 ...
-------------- 3222 -----------------
translating text1 of language  zh  of length  1468 ...
translating text2 of language  zh  of length  1529 ...
-------------- 3223 -----------------
translating text1 of language  zh  of length  546 ...
translating text2 of language  zh  of length  378 ...
-------------- 3224 -----------------
translating text1 of language  zh  of length  347 ...
translating text2 of language  zh  of length  2446 ...
-------------- 3225 -----------------
translating text1 of language  zh  of length  653 ...
translating text2 of language  zh  of length  481 ...
-------------- 3226 -----------------
translating text1 of language  zh  of length  567 ...
translating text2 of language  zh  of length  535 ...
-------------- 3227 -----------------
translating text1 of language

translating text2 of language  zh  of length  565 ...
-------------- 3353 -----------------
translating text1 of language  zh  of length  1632 ...
translating text2 of language  zh  of length  1203 ...
-------------- 3354 -----------------
translating text1 of language  zh  of length  370 ...
translating text2 of language  zh  of length  629 ...
-------------- 3355 -----------------
translating text1 of language  zh  of length  1057 ...
translating text2 of language  zh  of length  665 ...
-------------- 3356 -----------------
translating text1 of language  zh  of length  141 ...
translating text2 of language  zh  of length  128 ...
-------------- 3357 -----------------
translating text1 of language  zh  of length  2174 ...
translating text2 of language  zh  of length  1679 ...
-------------- 3358 -----------------
translating text1 of language  zh  of length  819 ...
translating text2 of language  zh  of length  1535 ...
-------------- 3359 -----------------
translating text1 of langu

translating text2 of language  zh  of length  2122 ...
-------------- 3468 -----------------
translating text1 of language  zh  of length  973 ...
translating text2 of language  zh  of length  2354 ...
-------------- 3469 -----------------
translating text1 of language  zh  of length  1224 ...
translating text2 of language  zh  of length  1658 ...
-------------- 3470 -----------------
translating text1 of language  zh  of length  266 ...
translating text2 of language  zh  of length  2588 ...
-------------- 3471 -----------------
translating text1 of language  zh  of length  100 ...
translating text2 of language  zh  of length  17228 ...
-------------- 3472 -----------------
translating text1 of language  zh  of length  621 ...
translating text2 of language  zh  of length  3976 ...
-------------- 3473 -----------------
translating text1 of language  zh  of length  465 ...
translating text2 of language  zh  of length  67 ...
-------------- 3474 -----------------
translating text1 of lang

translating text1 of language  es  of length  3960 ...
translating text2 of language  es  of length  2198 ...
-------------- 3591 -----------------
translating text1 of language  es  of length  1810 ...
translating text2 of language  es  of length  978 ...
-------------- 3592 -----------------
translating text1 of language  es  of length  1484 ...
translating text2 of language  es  of length  1177 ...
-------------- 3593 -----------------
translating text1 of language  es  of length  4884 ...
translating text2 of language  es  of length  7316 ...
-------------- 3594 -----------------
translating text1 of language  es  of length  2508 ...
translating text2 of language  es  of length  1029 ...
-------------- 3595 -----------------
translating text1 of language  es  of length  2827 ...
translating text2 of language  es  of length  848 ...
-------------- 3596 -----------------
translating text1 of language  es  of length  1758 ...
translating text2 of language  es  of length  1257 ...
----

-------------- 3712 -----------------
translating text1 of language  de  of length  3506 ...
translating text2 of language  de  of length  1827 ...
-------------- 3713 -----------------
translating text1 of language  de  of length  2797 ...
translating text2 of language  de  of length  1922 ...
-------------- 3714 -----------------
translating text1 of language  de  of length  4046 ...
translating text2 of language  de  of length  4325 ...
-------------- 3715 -----------------
translating text1 of language  de  of length  2590 ...
translating text2 of language  de  of length  2687 ...
-------------- 3716 -----------------
translating text1 of language  de  of length  3820 ...
translating text2 of language  de  of length  1701 ...
-------------- 3717 -----------------
translating text1 of language  de  of length  3597 ...
translating text2 of language  de  of length  2381 ...
-------------- 3718 -----------------
translating text1 of language  de  of length  2697 ...
translating text2 o

translating text2 of language  es  of length  778 ...
-------------- 3813 -----------------
translating text1 of language  es  of length  2218 ...
translating text2 of language  es  of length  1030 ...
-------------- 3814 -----------------
translating text1 of language  es  of length  7275 ...
translating text2 of language  es  of length  744 ...
-------------- 3815 -----------------
translating text1 of language  es  of length  5330 ...
translating text2 of language  es  of length  2891 ...
-------------- 3816 -----------------
translating text1 of language  es  of length  1062 ...
translating text2 of language  es  of length  1285 ...
-------------- 3817 -----------------
translating text1 of language  es  of length  3318 ...
translating text2 of language  es  of length  1151 ...
-------------- 3818 -----------------
translating text1 of language  es  of length  1694 ...
translating text2 of language  es  of length  483 ...
-------------- 3819 -----------------
translating text1 of l

-------------- 3947 -----------------
translating text1 of language  es  of length  2116 ...
translating text2 of language  es  of length  1704 ...
-------------- 3948 -----------------
translating text1 of language  es  of length  1350 ...
translating text2 of language  es  of length  1148 ...
3950 rows done.
-------------- 3949 -----------------
translating text1 of language  es  of length  2812 ...
translating text2 of language  es  of length  957 ...
-------------- 3950 -----------------
translating text1 of language  es  of length  2182 ...
translating text2 of language  es  of length  1256 ...
-------------- 3951 -----------------
translating text1 of language  es  of length  2059 ...
translating text2 of language  es  of length  494 ...
-------------- 3952 -----------------
translating text1 of language  es  of length  2775 ...
translating text2 of language  es  of length  1637 ...
-------------- 3953 -----------------
translating text1 of language  es  of length  1150 ...
trans

translating text2 of language  pl  of length  1513 ...
-------------- 4077 -----------------
translating text1 of language  pl  of length  1342 ...
translating text2 of language  pl  of length  2331 ...
-------------- 4078 -----------------
translating text1 of language  pl  of length  3091 ...
translating text2 of language  pl  of length  2204 ...
-------------- 4079 -----------------
translating text1 of language  pl  of length  2339 ...
translating text2 of language  pl  of length  2615 ...
-------------- 4080 -----------------
translating text1 of language  pl  of length  1458 ...
translating text2 of language  pl  of length  1360 ...
-------------- 4081 -----------------
translating text1 of language  pl  of length  1733 ...
translating text2 of language  pl  of length  463 ...
-------------- 4082 -----------------
translating text1 of language  pl  of length  1297 ...
translating text2 of language  pl  of length  1585 ...
-------------- 4083 -----------------
translating text1 of

-------------- 4182 -----------------
translating text1 of language  zh  of length  351 ...
translating text2 of language  zh  of length  858 ...
-------------- 4183 -----------------
translating text1 of language  zh  of length  1002 ...
translating text2 of language  zh  of length  1165 ...
-------------- 4184 -----------------
translating text1 of language  zh  of length  150 ...
translating text2 of language  zh  of length  1108 ...
-------------- 4185 -----------------
translating text1 of language  zh  of length  548 ...
translating text2 of language  zh  of length  613 ...
-------------- 4186 -----------------
translating text1 of language  zh  of length  386 ...
translating text2 of language  zh  of length  307 ...
-------------- 4187 -----------------
translating text1 of language  zh  of length  356 ...
translating text2 of language  zh  of length  231 ...
-------------- 4188 -----------------
translating text1 of language  zh  of length  838 ...
translating text2 of language

translating text2 of language  ar  of length  581 ...
-------------- 4295 -----------------
translating text1 of language  ar  of length  16293 ...
translating text2 of language  ar  of length  1618 ...
-------------- 4296 -----------------
translating text1 of language  ar  of length  908 ...
translating text2 of language  ar  of length  1439 ...
-------------- 4297 -----------------
translating text1 of language  ar  of length  815 ...
translating text2 of language  ar  of length  2121 ...
-------------- 4298 -----------------
translating text1 of language  ar  of length  1285 ...
translating text2 of language  ar  of length  1287 ...
4300 rows done.
-------------- 4299 -----------------
translating text1 of language  ar  of length  1182 ...
translating text2 of language  ar  of length  3448 ...
-------------- 4300 -----------------
translating text1 of language  ar  of length  1066 ...
translating text2 of language  ar  of length  638 ...
-------------- 4301 -----------------
transl

-------------- 4417 -----------------
translating text1 of language  de  of length  2879 ...
translating text2 of language  de  of length  1127 ...
-------------- 4418 -----------------
translating text1 of language  de  of length  1364 ...
translating text2 of language  de  of length  1713 ...
-------------- 4419 -----------------
translating text1 of language  de  of length  1703 ...
translating text2 of language  de  of length  1078 ...
-------------- 4420 -----------------
translating text1 of language  de  of length  1663 ...
translating text2 of language  de  of length  1569 ...
-------------- 4421 -----------------
translating text1 of language  de  of length  908 ...
translating text2 of language  de  of length  1688 ...
-------------- 4422 -----------------
translating text1 of language  de  of length  983 ...
translating text2 of language  de  of length  756 ...
-------------- 4423 -----------------
translating text1 of language  de  of length  1208 ...
translating text2 of l

translating text2 of language  es  of length  3174 ...
-------------- 4605 -----------------
translating text1 of language  es  of length  2459 ...
translating text2 of language  es  of length  1845 ...
-------------- 4606 -----------------
translating text1 of language  es  of length  2924 ...
translating text2 of language  es  of length  1924 ...
-------------- 4607 -----------------
translating text1 of language  es  of length  754 ...
translating text2 of language  es  of length  2108 ...
-------------- 4608 -----------------
translating text1 of language  es  of length  1037 ...
translating text2 of language  es  of length  838 ...
-------------- 4609 -----------------
translating text1 of language  es  of length  2487 ...
translating text2 of language  es  of length  1752 ...
-------------- 4610 -----------------
translating text1 of language  es  of length  5216 ...
translating text2 of language  es  of length  2993 ...
-------------- 4611 -----------------
translating text1 of 

translating text1 of language  es  of length  4323 ...
translating text2 of language  es  of length  3258 ...
-------------- 4725 -----------------
translating text1 of language  es  of length  933 ...
translating text2 of language  es  of length  1005 ...
-------------- 4726 -----------------
translating text1 of language  es  of length  1513 ...
translating text2 of language  es  of length  2517 ...
-------------- 4727 -----------------
translating text1 of language  es  of length  1634 ...
translating text2 of language  es  of length  1597 ...
-------------- 4728 -----------------
translating text1 of language  es  of length  1743 ...
translating text2 of language  es  of length  1795 ...
-------------- 4729 -----------------
translating text1 of language  es  of length  1235 ...
translating text2 of language  es  of length  947 ...
-------------- 4730 -----------------
translating text1 of language  es  of length  4049 ...
translating text2 of language  es  of length  5067 ...
----

translating text2 of language  zh  of length  573 ...
-------------- 4831 -----------------
translating text1 of language  zh  of length  612 ...
translating text2 of language  zh  of length  263 ...
-------------- 4832 -----------------
translating text1 of language  zh  of length  241 ...
translating text2 of language  zh  of length  188 ...
-------------- 4833 -----------------
translating text1 of language  zh  of length  428 ...
translating text2 of language  zh  of length  513 ...
-------------- 4834 -----------------
translating text1 of language  zh  of length  777 ...
translating text2 of language  zh  of length  1247 ...
-------------- 4835 -----------------
translating text1 of language  zh  of length  316 ...
translating text2 of language  zh  of length  294 ...
-------------- 4836 -----------------
translating text1 of language  zh  of length  1891 ...
translating text2 of language  zh  of length  2911 ...
-------------- 4837 -----------------
translating text1 of language

Only 37 documents were truncated, because they lack the formatting or paragraph breaks needed to split them into chunks below the character limit. <br><br>
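As a possible way to handle these remaining documents, the sketch below illustrates the "cut at the last punctuation symbol before the limit" attempt mentioned earlier. It is a minimal illustration, not the code that produced the log above: the helper names `has_oversized_chunk` and `split_at_punctuation`, the `LIMIT` constant and the set of punctuation marks are assumptions introduced here.

```python
import re

LIMIT = 5000  # assumed per-request character limit (the value observed in our tests)

def has_oversized_chunk(text, delimiter='\n\n', limit=LIMIT):
    """Return True if any delimiter-separated chunk is longer than the limit."""
    return any(len(chunk) > limit for chunk in text.split(delimiter))

def split_at_punctuation(chunk, limit=LIMIT):
    """Cut a chunk into pieces of at most `limit` characters.

    Each cut is placed right after the last sentence-ending mark found
    within the first `limit` characters; if there is none, the chunk is
    cut at exactly `limit` characters (hypothetical fallback).
    """
    pieces = []
    while len(chunk) > limit:
        window = chunk[:limit]
        # sentence-ending marks, including the full-width ones used in Chinese
        matches = list(re.finditer(r'[.!?。！？]', window))
        cut = matches[-1].end() if matches else limit
        pieces.append(chunk[:cut])
        chunk = chunk[cut:]
    if chunk:
        pieces.append(chunk)
    return pieces
```

Joining the returned pieces with an empty string gives back the original chunk, so the translated pieces can be glued together in the same order. Whether a cut at a sentence boundary is good enough for the translation quality would still need to be checked on the affected documents.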

In [9]:
eval_df.loc[2249, "translated_body1"]

'Vienna (DPA-AFX) - The Vienna Stock Exchange has closed on Monday with price losses. The ATX fell 27.81 points or 0.88 percent to 3119.10 units. The European environment was not a clear direction at the beginning of the week. In addition, investors are mainly concerned with the economic impact of the virus crisis and follow themselves accordingly on the market, commented on a Börsian.\n\n\n\nEconomic data from Europe did not provide any significant impulses. The reporting situation on the domestic companies remained quite thin - the reporting season takes place again in the further weekly.\n\n\n\nThe shares in the spotlight counted UNIQA, which, with plus 7.92 percent, also made the winning list in the Prime Market clearly. The insurance company has purchased by a billion euros in Eastern Europe. From the French industry-colleague AXA, the daughters were acquired in Poland, Czech Republic and Slovakia. The acquisition brings five million new customers and 800 million euros multi-premi

In [10]:
# save the evaluation dataframe, including the translated columns, without the index
path = 'eval/_EVAL_text_translated.csv'
eval_df.to_csv(path, index=False)