# Google Notebook
## Importing Libraries and Data

In [1]:
import pandas as pd
import s3fs
import boto3
from nltk.translate.bleu_score import sentence_bleu

In [3]:
translate=boto3.client("translate")

In [75]:
fs = s3fs.S3FileSystem()

In [76]:
file = fs.open('s3://example-mallik/tableConvert_5.csv')

In [77]:
data = pd.read_csv(file)

In [78]:
data

| Unnamed: 0 | Language | Informal_Text_1 | Informal_Text_2 | Formal_Text_1 | Formal_Text_2 |
|---|---|---|---|---|---|
| 0 | English | In my younger and more vulnerable years my fat... | I had a dog — at least I had him for a few day... | In consequence, I’m inclined to reserve all ju... | Almost any exhibition of complete self suffici... |
| 1 | Chinese | 我年纪还轻，阅历不深的时候，我父亲教导过我一句话，我至今还念念不忘。“每逢你想要批评任何人的... | 我有一条狗——至少在它跑掉以前我养了它几天——一辆旧道吉汽车和一个芬兰女佣人，她替我收拾床铺... | 久而久之，我就惯于对所有的人都保留判断，这个习惯既使得许多有怪僻的人肯跟我讲心里话，也使我成... | 这种几乎是完全我行我素的神情总是使我感到目瞪口呆，满心赞佩。 |
| 2 | Spanish | En mis años mozos y más vulnerables mi padre m... | Tenía un perro -o al menos lo tuve durante var... | En consecuencia, soy una persona dada a reserv... | Casi cualquier exhibición de total autosuficie... |
| 3 | French | Quand j’étais plus jeune, ce qui veut dire plu... | J’avais un chien – du moins je l’eus pendant q... | En conséquence, je suis porté à réserver mes j... | N’importe quelle exhibition d’assurance m’exto... |


We take the Chinese, Spanish, and French versions of the same English sentences and translate each one back into English. We do this with both Google Translate and AWS Translate, then compare each result with the original English text.
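For the AWS side, a minimal sketch of a translation helper might look like the following. It assumes the `translate` client created with `boto3.client("translate")` in the cells above and relies on its `translate_text` call. The helper name `translate_to_english` and the language codes passed to it are illustrative and not part of the original notebook.

```python
def translate_to_english(client, text, source_lang):
    """Translate `text` from `source_lang` into English.

    `client` is assumed to be a boto3 Amazon Translate client
    (hypothetical helper, not part of the original notebook).
    """
    response = client.translate_text(
        Text=text,
        SourceLanguageCode=source_lang,  # e.g. "zh", "es" or "fr"
        TargetLanguageCode="en",
    )
    return response["TranslatedText"]


# Example call (requires AWS credentials, so it is left commented out):
# spanish_text = data["Informal_Text_1"].iloc[2]
# print(translate_to_english(translate, spanish_text, "es"))
```

Passing the client in as an argument keeps the helper easy to exercise with a stand-in object when no AWS credentials are available.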

In [22]:
translate=boto3.client("translate")

# Informal

# Informal Text 1

**The original text in English is:**
In my younger and more vulnerable years my father gave me some advice that I’ve been turning over in my mind ever since. “Whenever you feel like criticizing any one,” he told me, “just remember that all the people in this world haven’t had the advantages that you’ve had.”

## Chinese to English


Retrieving and translating the text data 

**The text in Chinese is:**
我年纪还轻，阅历不深的时候，我父亲教导过我一句话，我至今还念念不忘。“每逢你想要批评任何人的时候，”他对我说，“你就记住，这个世界上所有的人，并不是个个都有过你拥有的那些优越条件。”

**The Google translation is:** When I was young and inexperienced, my father taught me a word, which I still cannot forget. "Whenever you feel like criticizing anyone," he said to me, "just remember that not everyone in this world has had the advantages you have."

#### Qualitative Analysis:

Based on our judgement as English speakers, we feel this translation was pretty accurate and conveyed the meaning well enough. We would score it a 
**8.5/10.**

#### Quantitative Analysis

**1) Percentage of words matching with original text**

First, we converted the original text to lowercase and removed special characters. This is to ensure a more fair and accurate comparison.

In [11]:
eng_text = data['Informal_Text_1'].iloc[0]

In [12]:
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    eng_text=eng_text.replace(i,"")
    
eng_text = eng_text.lower()
print(eng_text)

in my younger and more vulnerable years my father gave me some advice that i’ve been turning over in my mind ever since whenever you feel like criticizing any one he told me just remember that all the people in this world haven’t had the advantages that you’ve had


Then, converted to a list

In [13]:
eng_text_words = eng_text.split(" ")
print(eng_text_words)

['in', 'my', 'younger', 'and', 'more', 'vulnerable', 'years', 'my', 'father', 'gave', 'me', 'some', 'advice', 'that', 'i’ve', 'been', 'turning', 'over', 'in', 'my', 'mind', 'ever', 'since', 'whenever', 'you', 'feel', 'like', 'criticizing', 'any', 'one', 'he', 'told', 'me', 'just', 'remember', 'that', 'all', 'the', 'people', 'in', 'this', 'world', 'haven’t', 'had', 'the', 'advantages', 'that', 'you’ve', 'had']


Next, converted the translated text to lowercase and removed special characters

In [5]:
trans_text = "When I was young and inexperienced, my father taught me a word, which I still cannot forget. 'Whenever you feel like criticizing anyone,' he said to me, 'just remember that not everyone in this world has had the advantages you have.'"

In [6]:
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    trans_text=trans_text.replace(i,"")
    
trans_text = trans_text.lower()
print(trans_text)

when i was young and inexperienced my father taught me a word which i still cannot forget 'whenever you feel like criticizing anyone' he said to me 'just remember that not everyone in this world has had the advantages you have'


Then, converted this to a list too.

In [7]:
trans_text_words = trans_text.split(" ")
print(trans_text_words)

['when', 'i', 'was', 'young', 'and', 'inexperienced', 'my', 'father', 'taught', 'me', 'a', 'word', 'which', 'i', 'still', 'cannot', 'forget', "'whenever", 'you', 'feel', 'like', 'criticizing', "anyone'", 'he', 'said', 'to', 'me', "'just", 'remember', 'that', 'not', 'everyone', 'in', 'this', 'world', 'has', 'had', 'the', 'advantages', 'you', "have'"]


Finally, we checked the percentage of words from translated text that matched the original text.

In [8]:
acc = 0
tot = 0
for word in trans_text_words:
    if word in eng_text_words:
        acc += 1
    tot += 1
    
score = acc / tot
print(score)
    

0.4634146341463415
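The token list printed above still contains entries such as `'whenever` and `have'`, because straight single quotes are not in `special_characters`, so those words cannot match the original. As a possible alternative (not the method used in this notebook), the steps can be wrapped in two small helpers that extract word tokens with a regular expression. The names `normalize` and `match_ratio` are hypothetical, the pattern assumes plain ASCII English words, and scores computed this way will differ from the ones reported here.

```python
import re


def normalize(text):
    """Lowercase the text and return its word tokens without surrounding punctuation."""
    # A token is a run of letters/digits, optionally joined by in-word
    # apostrophes (straight or curly), e.g. "haven’t" stays one token.
    return re.findall(r"[a-z0-9]+(?:['’][a-z0-9]+)*", text.lower())


def match_ratio(translated, original):
    """Fraction of translated tokens that also occur in the original text."""
    original_words = set(normalize(original))
    translated_words = normalize(translated)
    if not translated_words:
        return 0.0
    hits = sum(word in original_words for word in translated_words)
    return hits / len(translated_words)


original = "Whenever you feel like criticizing any one"
translated = "'Whenever you feel like criticizing anyone,'"
print(normalize(translated))
print(match_ratio(translated, original))
```

In this small example only `anyone` fails to match, since the original spells it as two words.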


We also checked the BLEU Score.

**2) BLEU Score**

In [13]:
reference = [
    trans_text.split()
]

candidate = eng_text.split()
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))

BLEU score -> 0.10624793541906809


With a BLEU score of about 0.11, the Chinese-to-English translation from the Google API shares few exact word sequences with the original English text.
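BLEU is built from clipped n-gram precisions of a candidate measured against a reference, so it is not symmetric. In the cell above, the Google output is passed to `sentence_bleu` as the reference and the original English text as the candidate; the more conventional orientation treats the human text as the reference. The sketch below is a simplified, pure-Python illustration of clipped precision only (no brevity penalty, no geometric mean), with hypothetical function names, to show how swapping the two arguments changes the result.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n):
    """Share of candidate n-grams found in the reference, with clipped counts."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return overlap / total


reference = "my father gave me some advice".split()
candidate = "my father gave me a piece of advice".split()

print(clipped_precision(candidate, reference, 1))  # candidate scored against reference
print(clipped_precision(reference, candidate, 1))  # arguments swapped
print(clipped_precision(candidate, reference, 2))  # bigram precision
```

The first two printed values differ, which is why it matters which text is treated as the reference when reading the scores in this notebook.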

## Spanish to English


Retrieving and translating the text data

**The text in Spanish is:**
"En mis años mozos y más vulnerables mi padre me dio un consejo que desde aquella época no ha dejado de darme vueltas en la cabeza. 'Cuando sientas deseos de criticar a alguien' -fueron sus palabras- 'recuerda que no todo el mundo ha tenido las mismas oportunidades que tú tuviste.'"

**The Google translation is:** In my younger and more vulnerable years my father gave me a piece of advice that has been on my mind ever since. "When you feel like criticizing someone" -were his words- "remember that not everyone has had the same opportunities that you had."

#### Qualitative Analysis:

Based on our judgement as English speakers, we feel this translation was pretty accurate and conveyed the meaning well enough. We would score it a 
**9/10.**

#### Quantitative Analysis

**1) Percentage of words matching with original text**

First, we converted the original text to lowercase and removed special characters. This is to ensure a more fair and accurate comparison.

In [1]:
eng_text = data['Informal_Text_1'].iloc[0]

In [2]:
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    eng_text=eng_text.replace(i,"")
    
eng_text = eng_text.lower()
print(eng_text)

in my younger and more vulnerable years my father gave me some advice that i’ve been turning over in my mind ever since whenever you feel like criticizing any one he told me just remember that all the people in this world haven’t had the advantages that you’ve had


Then, converted to a list

In [3]:
eng_text_words = eng_text.split(" ")
print(eng_text_words)

['in', 'my', 'younger', 'and', 'more', 'vulnerable', 'years', 'my', 'father', 'gave', 'me', 'some', 'advice', 'that', 'i’ve', 'been', 'turning', 'over', 'in', 'my', 'mind', 'ever', 'since', 'whenever', 'you', 'feel', 'like', 'criticizing', 'any', 'one', 'he', 'told', 'me', 'just', 'remember', 'that', 'all', 'the', 'people', 'in', 'this', 'world', 'haven’t', 'had', 'the', 'advantages', 'that', 'you’ve', 'had']


Next, converted the translated text to lowercase and removed special characters

In [14]:
trans_text = "In my younger and more vulnerable years my father gave me a piece of advice that has been on my mind ever since. 'When you feel like criticizing someone' -were his words- 'remember that not everyone has had the same opportunities that you had.'"



In [15]:
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    trans_text=trans_text.replace(i,"")
    
trans_text = trans_text.lower()
print(trans_text)

in my younger and more vulnerable years my father gave me a piece of advice that has been on my mind ever since 'when you feel like criticizing someone' -were his words- 'remember that not everyone has had the same opportunities that you had'


Then, converted this to a list too.

In [16]:
trans_text_words = trans_text.split(" ")
print(trans_text_words)

['in', 'my', 'younger', 'and', 'more', 'vulnerable', 'years', 'my', 'father', 'gave', 'me', 'a', 'piece', 'of', 'advice', 'that', 'has', 'been', 'on', 'my', 'mind', 'ever', 'since', "'when", 'you', 'feel', 'like', 'criticizing', "someone'", '-were', 'his', 'words-', "'remember", 'that', 'not', 'everyone', 'has', 'had', 'the', 'same', 'opportunities', 'that', 'you', "had'"]


Finally, we checked the percentage of words from translated text that matched the original text.

In [17]:
acc = 0
tot = 0
for word in trans_text_words:
    if word in eng_text_words:
        acc += 1
    tot += 1
    
score = acc / tot
print(score)
    

0.6136363636363636


We also checked the BLEU Score.

**2) BLEU Score**

In [18]:
reference = [
    trans_text.split()
]

candidate = eng_text.split()
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))

BLEU score -> 0.3307303493291697


With a BLEU score of about 0.33, the Spanish-to-English translation from the Google API still scores low, but it is the highest of the three translations of this text.

## French to English


Retrieving and translating the text data

**The text in French is:**
Quand j’étais plus jeune, ce qui veut dire plus vulnérable, mon père me donna un conseil que je ne cesse de retourner dans mon esprit : – Quand tu auras envie de critiquer quelqu’un, songe que tout le monde n’a pas joui des mêmes avantages que toi.


**The Google translation is:** When I was younger, which means more vulnerable, my father gave me advice that I keep returning to my mind: - When you want to criticize someone, remember that not everyone has enjoy the same benefits as you.

#### Qualitative Analysis:

Based on our judgement as English speakers, we feel this translation was pretty accurate and conveyed the meaning well enough. We would score it a 
**8/10.**

#### Quantitative Analysis

**1) Percentage of words matching with original text**

First, we converted the original text to lowercase and removed special characters. This is to ensure a more fair and accurate comparison.

In [1]:
eng_text = data['Informal_Text_1'].iloc[0]

In [2]:
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    eng_text=eng_text.replace(i,"")
    
eng_text = eng_text.lower()
print(eng_text)

in my younger and more vulnerable years my father gave me some advice that i’ve been turning over in my mind ever since whenever you feel like criticizing any one he told me just remember that all the people in this world haven’t had the advantages that you’ve had


Then, converted to a list

In [3]:
eng_text_words = eng_text.split(" ")
print(eng_text_words)

['in', 'my', 'younger', 'and', 'more', 'vulnerable', 'years', 'my', 'father', 'gave', 'me', 'some', 'advice', 'that', 'i’ve', 'been', 'turning', 'over', 'in', 'my', 'mind', 'ever', 'since', 'whenever', 'you', 'feel', 'like', 'criticizing', 'any', 'one', 'he', 'told', 'me', 'just', 'remember', 'that', 'all', 'the', 'people', 'in', 'this', 'world', 'haven’t', 'had', 'the', 'advantages', 'that', 'you’ve', 'had']


Next, converted the translated text to lowercase and removed special characters

In [19]:
trans_text = "When I was younger, which means more vulnerable, my father gave me advice that I keep returning to my mind: - When you want to criticize someone, remember that not everyone has enjoy the same benefits as you."



In [20]:
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    trans_text=trans_text.replace(i,"")
    
trans_text = trans_text.lower()
print(trans_text)

when i was younger which means more vulnerable my father gave me advice that i keep returning to my mind - when you want to criticize someone remember that not everyone has enjoy the same benefits as you


Then, converted this to a list too.

In [21]:
trans_text_words = trans_text.split(" ")
print(trans_text_words)

['when', 'i', 'was', 'younger', 'which', 'means', 'more', 'vulnerable', 'my', 'father', 'gave', 'me', 'advice', 'that', 'i', 'keep', 'returning', 'to', 'my', 'mind', '-', 'when', 'you', 'want', 'to', 'criticize', 'someone', 'remember', 'that', 'not', 'everyone', 'has', 'enjoy', 'the', 'same', 'benefits', 'as', 'you']


Finally, we checked the percentage of words from translated text that matched the original text.

In [22]:
acc = 0
tot = 0
for word in trans_text_words:
    if word in eng_text_words:
        acc += 1
    tot += 1
    
score = acc / tot
print(score)
    

0.42105263157894735


We also checked the BLEU Score.

**2) BLEU Score**

In [23]:
reference = [
    trans_text.split()
]

candidate = eng_text.split()
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))

BLEU score -> 0.08016440471425877


With a BLEU score of about 0.08, the French-to-English translation from the Google API scores the lowest of the three translations of this text.

# Informal Text 2

**The original text in English is:**
I had a dog — at least I had him for a few days until he ran away — and an old Dodge and a Finnish woman, who made my bed and cooked breakfast and muttered Finnish wisdom to herself over the electric stove.

## Chinese to English


Retrieving and translating the text data

**The text in Chinese is:**
我有一条狗——至少在它跑掉以前我养了它几天——一辆旧道吉汽车和一个芬兰女佣人，她替我收拾床铺，烧早饭，在电炉上一面做饭，一面嘴里咕哝着芬兰的格言。

**The Google translation is:** I have a dog - at least I had him for a few days before he ran away - an old Dodge and a Finnish maid who makes my bed and cooks breakfast for me on the hot plate while she mutters Muttering Finnish proverbs.

#### Qualitative Analysis:

Based on our judgement as English speakers, we feel this translation was pretty accurate and conveyed the meaning well enough. We would score it a 
**8.5/10.**

#### Quantitative Analysis

**1) Percentage of words matching with original text**

First, we converted the original text to lowercase and removed special characters. This is to ensure a more fair and accurate comparison.

In [14]:
eng_text = data['Informal_Text_2'].iloc[0]

special_characters=[',','.',':','?','!', "”", "“", "— "]
for i in special_characters:
    eng_text=eng_text.replace(i,"")
    
eng_text = eng_text.lower()
print(eng_text)

i had a dog at least i had him for a few days until he ran away and an old dodge and a finnish woman who made my bed and cooked breakfast and muttered finnish wisdom to herself over the electric stove


Then, converted to a list

In [15]:
eng_text_words = eng_text.split(" ")
print(eng_text_words)

['i', 'had', 'a', 'dog', 'at', 'least', 'i', 'had', 'him', 'for', 'a', 'few', 'days', 'until', 'he', 'ran', 'away', 'and', 'an', 'old', 'dodge', 'and', 'a', 'finnish', 'woman', 'who', 'made', 'my', 'bed', 'and', 'cooked', 'breakfast', 'and', 'muttered', 'finnish', 'wisdom', 'to', 'herself', 'over', 'the', 'electric', 'stove']


Next, converted the translated text to lowercase and removed special characters.

In [3]:
trans_text = "I have a dog - at least I had him for a few days before he ran away - an old Dodge and a Finnish maid who makes my bed and cooks breakfast for me on the hot plate while she mutters Muttering Finnish proverbs."
special_characters=[',','.',':','?','!', "”", "“", "— "]
for i in special_characters:
    trans_text=trans_text.replace(i,"")
    
trans_text = trans_text.lower()
print(trans_text)

i have a dog - at least i had him for a few days before he ran away - an old dodge and a finnish maid who makes my bed and cooks breakfast for me on the hot plate while she mutters muttering finnish proverbs


Then, converted this to a list too.

In [4]:
trans_text_words = trans_text.split(" ")
print(trans_text_words)

['i', 'have', 'a', 'dog', '-', 'at', 'least', 'i', 'had', 'him', 'for', 'a', 'few', 'days', 'before', 'he', 'ran', 'away', '-', 'an', 'old', 'dodge', 'and', 'a', 'finnish', 'maid', 'who', 'makes', 'my', 'bed', 'and', 'cooks', 'breakfast', 'for', 'me', 'on', 'the', 'hot', 'plate', 'while', 'she', 'mutters', 'muttering', 'finnish', 'proverbs']


Finally, we checked the percentage of words from translated text that matched the original text.

In [5]:
acc = 0
tot = 0
for word in trans_text_words:
    if word in eng_text_words:
        acc += 1
    tot += 1
    
score = acc / tot
print(score)
    

0.6444444444444445


We also checked the BLEU Score.

**2) BLEU Score**

In [10]:
reference = [
    trans_text.split()
]

candidate = eng_text.split()
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))

BLEU score -> 0.3583798874653872


With a BLEU score of about 0.36, the Chinese-to-English translation from the Google API performed reasonably well.

## Spanish to English


Retrieving and translating the text data

**The text in Spanish is:**
Tenía un perro -o al menos lo tuve durante varios días, antes de que escapara-, un viejo Dodge y una criada oriunda de Finlandia que me tendía la cama, hacía el desayuno y mascullaba máximas finlandesas junto a la estufa eléctrica.

**The Google translation is:** 
I had a dog-or at least I had for several days before he ran away-an old Dodge and a maid from Finland who made my bed, cooked breakfast and muttered Finnish maxims by the electric stove.

#### Qualitative Analysis:

Based on our judgement as English speakers, we feel this translation was pretty accurate and conveyed the meaning well enough. We would score it a 
**8.5/10.**

#### Quantitative Analysis

**1) Percentage of words matching with original text**

First, we converted the original text to lowercase and removed special characters. This is to ensure a more fair and accurate comparison.

In [11]:
eng_text = data['Informal_Text_2'].iloc[0]
special_characters=[',','.',':','?','!', "”", "“", "— "]
for i in special_characters:
    eng_text=eng_text.replace(i,"")
    
eng_text = eng_text.lower()
print(eng_text)

i had a dog at least i had him for a few days until he ran away and an old dodge and a finnish woman who made my bed and cooked breakfast and muttered finnish wisdom to herself over the electric stove


Then, converted to a list

In [12]:
eng_text_words = eng_text.split(" ")
print(eng_text_words)

['i', 'had', 'a', 'dog', 'at', 'least', 'i', 'had', 'him', 'for', 'a', 'few', 'days', 'until', 'he', 'ran', 'away', 'and', 'an', 'old', 'dodge', 'and', 'a', 'finnish', 'woman', 'who', 'made', 'my', 'bed', 'and', 'cooked', 'breakfast', 'and', 'muttered', 'finnish', 'wisdom', 'to', 'herself', 'over', 'the', 'electric', 'stove']


Next, converted the translated text to lowercase and removed special characters.

In [20]:
trans_text = "I had a dog-or at least I had for several days before he ran away-an old Dodge and a maid from Finland who made my bed, cooked breakfast and muttered Finnish maxims by the electric stove."

special_characters=[',','.',':','?','!', "”", "“", "— "]
for i in special_characters:
    trans_text=trans_text.replace(i,"")
    
trans_text = trans_text.lower()
print(trans_text)

i had a dog-or at least i had for several days before he ran away-an old dodge and a maid from finland who made my bed cooked breakfast and muttered finnish maxims by the electric stove


Then, converted this to a list too.

In [21]:
trans_text_words = trans_text.split(" ")
print(trans_text_words)

['i', 'had', 'a', 'dog-or', 'at', 'least', 'i', 'had', 'for', 'several', 'days', 'before', 'he', 'ran', 'away-an', 'old', 'dodge', 'and', 'a', 'maid', 'from', 'finland', 'who', 'made', 'my', 'bed', 'cooked', 'breakfast', 'and', 'muttered', 'finnish', 'maxims', 'by', 'the', 'electric', 'stove']
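The list above still contains the tokens `dog-or` and `away-an`. The `special_characters` list only removes an em dash followed by a space, while this translation uses plain hyphens with no surrounding spaces. One possible way to handle this, shown here as a sketch with the hypothetical helper `split_on_dashes`, is to turn every dash variant into a space before splitting. The set of dash characters is an assumption, and this would also split genuinely hyphenated words.

```python
# Dash characters to treat as word separators (assumed set, extend as needed).
DASHES = ["\u2014", "\u2013", "-"]  # em dash, en dash, ASCII hyphen


def split_on_dashes(text):
    """Replace dash characters with spaces, then split on whitespace."""
    for dash in DASHES:
        text = text.replace(dash, " ")
    return text.split()


tokens = split_on_dashes("i had a dog-or at least i had for several days before he ran away-an old dodge")
print(tokens)
```

Applying the same helper to both the original and the translated text would keep the comparison consistent, though the resulting scores would no longer match the ones printed in this notebook.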


Finally, we checked the percentage of words from translated text that matched the original text.

In [22]:
acc = 0
tot = 0
for word in trans_text_words:
    if word in eng_text_words:
        acc += 1
    tot += 1
    
score = acc / tot
print(score)
    

0.75


We also checked the BLEU Score.

**2) BLEU Score**

In [23]:
reference = [
    trans_text.split()
]

candidate = eng_text.split()
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))

BLEU score -> 0.3158350346275703


With a BLEU score of about 0.32, the Spanish-to-English translation from the Google API performed reasonably well.

## French to English

Retrieving and translating the text data

**The text in French is:**
J’avais un chien – du moins je l’eus pendant quelques jours jusqu’à ce qu’il prît la clef des champs – une vieille auto Dodge et une Finlandaise qui faisait mon lit, préparait mon petit déjeuner et marmottait des proverbes finnois, en s’affairant devant le fourneau électrique.

**The Google translation is:** I had a dog – at least I had him for a few days until he left the countryside – an old Dodge car and a Finnish woman who made my bed, cooked my breakfast and muttered Finnish proverbs, fussing in front of the electric stove.

#### Qualitative Analysis:

Based on our judgement as English speakers, we feel this translation was pretty accurate and conveyed the meaning well enough. We would score it a 
**8/10.**

#### Quantitative Analysis

**1) Percentage of words matching with original text**

First, we converted the original text to lowercase and removed special characters. This is to ensure a more fair and accurate comparison.

In [24]:
eng_text = data['Informal_Text_2'].iloc[0]

special_characters=[',','.',':','?','!', "”", "“", "— "]
for i in special_characters:
    eng_text=eng_text.replace(i,"")
    
eng_text = eng_text.lower()
print(eng_text)

i had a dog at least i had him for a few days until he ran away and an old dodge and a finnish woman who made my bed and cooked breakfast and muttered finnish wisdom to herself over the electric stove


Then, converted to a list

In [25]:
eng_text_words = eng_text.split(" ")
print(eng_text_words)

['i', 'had', 'a', 'dog', 'at', 'least', 'i', 'had', 'him', 'for', 'a', 'few', 'days', 'until', 'he', 'ran', 'away', 'and', 'an', 'old', 'dodge', 'and', 'a', 'finnish', 'woman', 'who', 'made', 'my', 'bed', 'and', 'cooked', 'breakfast', 'and', 'muttered', 'finnish', 'wisdom', 'to', 'herself', 'over', 'the', 'electric', 'stove']


Next, converted the translated text to lowercase and removed special characters.

In [26]:
trans_text = " I had a dog – at least I had him for a few days until he left the countryside – an old Dodge car and a Finnish woman who made my bed, cooked my breakfast and muttered Finnish proverbs, fussing in front of the electric stove."
special_characters=[',','.',':','?','!', "”", "“", "— "]
for i in special_characters:
    trans_text=trans_text.replace(i,"")
    
trans_text = trans_text.lower()
print(trans_text)

 i had a dog – at least i had him for a few days until he left the countryside – an old dodge car and a finnish woman who made my bed cooked my breakfast and muttered finnish proverbs fussing in front of the electric stove


Then, converted this to a list too.

In [27]:
trans_text_words = trans_text.split(" ")
print(trans_text_words)

['', 'i', 'had', 'a', 'dog', '–', 'at', 'least', 'i', 'had', 'him', 'for', 'a', 'few', 'days', 'until', 'he', 'left', 'the', 'countryside', '–', 'an', 'old', 'dodge', 'car', 'and', 'a', 'finnish', 'woman', 'who', 'made', 'my', 'bed', 'cooked', 'my', 'breakfast', 'and', 'muttered', 'finnish', 'proverbs', 'fussing', 'in', 'front', 'of', 'the', 'electric', 'stove']
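The first entry in the list above is an empty string. It comes from the leading space in `trans_text`: splitting on a single space keeps an empty token there, and that token is counted in `tot` even though it can never match a word. A minimal illustration of the difference between `split(" ")` and `split()` with no argument, using a shortened example string:

```python
text = " i had a dog"  # note the leading space

by_single_space = text.split(" ")  # keeps an empty string for the leading space
by_whitespace = text.split()       # drops leading, trailing and repeated whitespace

print(by_single_space)
print(by_whitespace)
```

Using `split()` (or stripping the string first) would avoid the empty token, at the cost of a slightly different score from the one printed below.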


Finally, we checked the percentage of words from translated text that matched the original text.

In [28]:
acc = 0
tot = 0
for word in trans_text_words:
    if word in eng_text_words:
        acc += 1
    tot += 1
    
score = acc / tot
print(score)
    

0.7659574468085106


We also checked the BLEU Score.

**2) BLEU Score**

In [29]:
reference = [
    trans_text.split()
]

candidate = eng_text.split()
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))

BLEU score -> 0.5207598499006179


With a BLEU score of about 0.52, the French-to-English translation from the Google API performed well, the best of the three translations of this text.
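To compare the Google results for the two informal texts side by side, the scores printed in the cells above can be collected in one place. The sketch below only restates those outputs, rounded to four decimals; the dictionary layout and variable names are illustrative.

```python
# (word-match ratio, BLEU) copied from the cell outputs above, rounded to 4 decimals.
scores = {
    "Informal Text 1": {
        "Chinese": (0.4634, 0.1062),
        "Spanish": (0.6136, 0.3307),
        "French": (0.4211, 0.0802),
    },
    "Informal Text 2": {
        "Chinese": (0.6444, 0.3584),
        "Spanish": (0.7500, 0.3158),
        "French": (0.7660, 0.5208),
    },
}

best_by_bleu = {}
for text_name, by_language in scores.items():
    # Pick the source language whose translation has the highest BLEU score.
    best_by_bleu[text_name] = max(by_language, key=lambda lang: by_language[lang][1])
    for language, (match, bleu) in by_language.items():
        print(f"{text_name:16} {language:8} match={match:.4f} BLEU={bleu:.4f}")

print(best_by_bleu)
```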

# Formal

# Formal Text 1

**The original text in English is:**  In consequence, I’m inclined to reserve all judgments, a habit that has opened up many curious natures to me and also made me the victim of not a few veteran bores. 


## Chinese to English

Retrieving and translating the text data

**The text in Chinese is:** 久而久之，我就惯于对所有的人都保留判断，这个习惯既使得许多有怪僻的人肯跟我讲心里话，也使我成...

**The Google translation is:** Over time I have been in the habit of withholding judgment from all men, and this habit has made many eccentrics open to me, and has made me the victim of not a few nagging bores.

#### Qualitative Analysis:

Based on our judgement as English speakers, we feel this translation was decently accurate and conveyed the meaning well enough. We would score it a 
**6.5/10.**

#### Quantitative Analysis

**1) Percentage of words matching with original text**

First, we converted the original text to lowercase and removed special characters. This is to ensure a more fair and accurate comparison.

In [17]:
eng_text = data['Formal_Text_1'].iloc[0]

In [18]:
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    eng_text=eng_text.replace(i,"")
    
eng_text = eng_text.lower()
print(eng_text)

in consequence i’m inclined to reserve all judgments a habit that has opened up many curious natures to me and also made me the victim of not a few veteran bores


Then, converted to a list

In [19]:
eng_text_words = eng_text.split(" ")
print(eng_text_words)

['in', 'consequence', 'i’m', 'inclined', 'to', 'reserve', 'all', 'judgments', 'a', 'habit', 'that', 'has', 'opened', 'up', 'many', 'curious', 'natures', 'to', 'me', 'and', 'also', 'made', 'me', 'the', 'victim', 'of', 'not', 'a', 'few', 'veteran', 'bores']


Next, converted the translated text to lowercase and removed special characters.

In [28]:
trans_text = "Over time I have been in the habit of withholding judgment from all men, and this habit has made many eccentrics open to me, and has made me the victim of not a few nagging bores."

In [29]:
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    trans_text=trans_text.replace(i,"")
    
trans_text = trans_text.lower()
print(trans_text)

over time i have been in the habit of withholding judgment from all men and this habit has made many eccentrics open to me and has made me the victim of not a few nagging bores


Then, converted this to a list too.

In [30]:
trans_text_words = trans_text.split(" ")
print(trans_text_words)

['over', 'time', 'i', 'have', 'been', 'in', 'the', 'habit', 'of', 'withholding', 'judgment', 'from', 'all', 'men', 'and', 'this', 'habit', 'has', 'made', 'many', 'eccentrics', 'open', 'to', 'me', 'and', 'has', 'made', 'me', 'the', 'victim', 'of', 'not', 'a', 'few', 'nagging', 'bores']


Finally, we checked the percentage of words from translated text that matched the original text.

In [31]:
acc = 0
tot = 0
for word in trans_text_words:
    if word in eng_text_words:
        acc += 1
    tot += 1
    
score = acc / tot
print(score)
    

0.6388888888888888


We also checked the BLEU Score.

**2) BLEU Score**

In [32]:
reference = [
    trans_text.split()
]

candidate = eng_text.split()
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))

BLEU score -> 0.24696341803668823


With a BLEU score of about 0.25, the Chinese-to-English translation from the Google API did not perform well, though it scored higher than several of the other translations.

## Spanish to English

Retrieving and translating the text data

**The text in Spanish is:** En consecuencia, soy una persona dada a reservarme todo juicio, hábito que me ha facilitado el conocimiento de gran número de personas singulares, pero que también me ha hecho víctima de más de un latoso inveterado.

**The Google translation is:** Consequently, I am a person given to reserving all judgment, a habit that has made it easier for me to meet a large number of unique people, but which has also made me the victim of more than one inveterate nuisance.

#### Qualitative Analysis:

Based on our judgement as English speakers, we feel this translation was relatively accurate and did convey the meaning well enough. We would score it a 
**7/10.**

#### Quantitative Analysis

**1) Percentage of words matching with original text**

First, we converted the original text to lowercase and removed special characters. This is to ensure a more fair and accurate comparison.

In [20]:
eng_text = data['Formal_Text_1'].iloc[0]

special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    eng_text=eng_text.replace(i,"")
    
eng_text = eng_text.lower()
print(eng_text)

in consequence i’m inclined to reserve all judgments a habit that has opened up many curious natures to me and also made me the victim of not a few veteran bores


Then, converted to a list

In [70]:
eng_text_words = eng_text.split(" ")
print(eng_text_words)

['in', 'consequence', 'i’m', 'inclined', 'to', 'reserve', 'all', 'judgments', 'a', 'habit', 'that', 'has', 'opened', 'up', 'many', 'curious', 'natures', 'to', 'me', 'and', 'also', 'made', 'me', 'the', 'victim', 'of', 'not', 'a', 'few', 'veteran', 'bores']


Next, converted the translated text to lowercase and removed special characters.

In [33]:
trans_text = "Consequently, I am a person given to reserving all judgment, a habit that has made it easier for me to meet a large number of unique people, but which has also made me the victim of more than one inveterate nuisance."
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    trans_text=trans_text.replace(i,"")
    
trans_text = trans_text.lower()
print(trans_text)

consequently i am a person given to reserving all judgment a habit that has made it easier for me to meet a large number of unique people but which has also made me the victim of more than one inveterate nuisance


Then, converted this to a list too

In [34]:
trans_text_words = trans_text.split(" ")
print(trans_text_words)

['consequently', 'i', 'am', 'a', 'person', 'given', 'to', 'reserving', 'all', 'judgment', 'a', 'habit', 'that', 'has', 'made', 'it', 'easier', 'for', 'me', 'to', 'meet', 'a', 'large', 'number', 'of', 'unique', 'people', 'but', 'which', 'has', 'also', 'made', 'me', 'the', 'victim', 'of', 'more', 'than', 'one', 'inveterate', 'nuisance']


Finally, we checked the percentage of words from translated text that matched the original text.

In [35]:
acc = 0
tot = 0
for word in trans_text_words:
    if word in eng_text_words:
        acc += 1
    tot += 1
    
score = acc / tot
print(score)

0.4634146341463415


We also checked the BLEU Score.

**2) BLEU Score**

In [36]:
reference = [
    trans_text.split()
]

candidate = eng_text.split()
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))

BLEU score -> 0.179987930439129


With a BLEU score of about 0.18, the Spanish-to-English translation from the Google API did not perform well.

## French to English 

Retrieving and translating the text data

**The text in French is:** En conséquence, je suis porté à réserver mes jugements, habitude qui m’a ouvert bien des natures curieuses, non sans me rendre victime de pas mal de raseurs invétérés.

**The Google translation is:** Consequently, I am inclined to reserve my judgments, a habit that has opened up many curious natures to me, not without making me the victim of a lot of inveterate borers.

#### Qualitative Analysis:

Based on our judgement as English speakers, we feel this translation was relatively accurate and did convey the meaning well enough. We would score it a 
**7/10.**

#### Quantitative Analysis

**1) Percentage of words matching with original text**

First, we converted the original text to lowercase and removed special characters to ensure a fairer and more accurate comparison.

In [37]:
eng_text = data['Formal_Text_1'].iloc[0]

special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    eng_text=eng_text.replace(i,"")
    
eng_text = eng_text.lower()
print(eng_text)

in consequence i’m inclined to reserve all judgments a habit that has opened up many curious natures to me and also made me the victim of not a few veteran bores


Then we converted it to a list.

In [38]:
eng_text_words = eng_text.split(" ")
print(eng_text_words)

['in', 'consequence', 'i’m', 'inclined', 'to', 'reserve', 'all', 'judgments', 'a', 'habit', 'that', 'has', 'opened', 'up', 'many', 'curious', 'natures', 'to', 'me', 'and', 'also', 'made', 'me', 'the', 'victim', 'of', 'not', 'a', 'few', 'veteran', 'bores']


Next, we converted the translated text to lowercase and removed special characters.

In [39]:
trans_text = "Consequently, I am inclined to reserve my judgments, a habit that has opened up many curious natures to me, not without making me the victim of a lot of inveterate borers."
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    trans_text=trans_text.replace(i,"")
    
trans_text = trans_text.lower()
print(trans_text)

consequently i am inclined to reserve my judgments a habit that has opened up many curious natures to me not without making me the victim of a lot of inveterate borers


Then we converted this to a list as well.

In [40]:
trans_text_words = trans_text.split(" ")
print(trans_text_words)

['consequently', 'i', 'am', 'inclined', 'to', 'reserve', 'my', 'judgments', 'a', 'habit', 'that', 'has', 'opened', 'up', 'many', 'curious', 'natures', 'to', 'me', 'not', 'without', 'making', 'me', 'the', 'victim', 'of', 'a', 'lot', 'of', 'inveterate', 'borers']


Finally, we computed the fraction of words in the translated text that also appear in the original text.

In [41]:
acc = 0
tot = 0
for word in trans_text_words:
    if word in eng_text_words:
        acc += 1
    tot += 1
    
score = acc / tot
print(score)

0.7096774193548387


We also checked the BLEU Score.

**2) BLEU Score**

In [42]:
reference = [
    trans_text.split()
]

candidate = eng_text.split()
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))

BLEU score -> 0.49041180180807975


This indicates that the French to English translation using the Google API performed fairly well.

# Formal Text 2

**The original text in English is:** Almost any exhibition of complete self sufficiency draws a stunned tribute from me.


## Chinese to English

Retrieving and translating the text data

**The text in Chinese is:** 这种几乎是完全我行我素的神情总是使我感到目瞪口呆，满心赞佩。

**The Google translation is:** This almost complete self-indulgence always left me dumbfounded and filled with admiration.

### Qualitative Analysis:
Based on our judgement as English speakers, we feel this translation was somewhat accurate and partially conveyed the meaning. We would score it a 6/10.

### Quantitative Analysis:

**1) Percentage of words matching with original text**

First, we converted the original text to lowercase and removed special characters to ensure a fairer and more accurate comparison.

In [21]:
eng_text = data['Formal_Text_2'].iloc[0]

special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    eng_text=eng_text.replace(i,"")
    
eng_text = eng_text.lower()
print(eng_text)

almost any exhibition of complete self sufficiency draws a stunned tribute from me


Then we converted it to a list.

In [63]:
eng_text_words = eng_text.split(" ")
print(eng_text_words)

['almost', 'any', 'exhibition', 'of', 'complete', 'self', 'sufficiency', 'draws', 'a', 'stunned', 'tribute', 'from', 'me']


Next, we converted the translated text to lowercase and removed special characters.

In [64]:
trans_text = "This almost complete self-indulgence always left me dumbfounded and filled with admiration."
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    trans_text=trans_text.replace(i,"")
    
trans_text = trans_text.lower()
print(trans_text)

this almost complete self-indulgence always left me dumbfounded and filled with admiration


Then we converted this to a list as well.

In [65]:
trans_text_words = trans_text.split(" ")
print(trans_text_words)

['this', 'almost', 'complete', 'self-indulgence', 'always', 'left', 'me', 'dumbfounded', 'and', 'filled', 'with', 'admiration']


Finally, we computed the fraction of words in the translated text that also appear in the original text.

In [66]:
acc = 0
tot = 0
for word in trans_text_words:
    if word in eng_text_words:
        acc += 1
    tot += 1
    
score = acc / tot
print(score)

0.25


We also checked the BLEU Score.

**2) BLEU Score**

In [67]:
reference = [
    trans_text.split()
]

candidate = eng_text.split()
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))

BLEU score -> 1.2627076138080564e-231
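A score of about 1e-231 is effectively zero, and it says as much about how unsmoothed BLEU is computed as about the translation. BLEU combines the 1-gram to 4-gram precisions in a geometric mean, so the result collapses as soon as one order has no overlap. The pure-Python sketch below is not NLTK's implementation, and `ngrams` and `ngram_overlap` are hypothetical helpers. It counts the clipped n-gram matches between the two normalised sentences printed above.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-token tuples in `tokens`."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_overlap(candidate, reference, n):
    """Return (clipped matches, total candidate n-grams) for order n."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    # Clip each candidate n-gram count by its count in the reference.
    matches = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matches, sum(cand_counts.values())

original = ("almost any exhibition of complete self sufficiency draws "
            "a stunned tribute from me").split()
translated = ("this almost complete self-indulgence always left me "
              "dumbfounded and filled with admiration").split()

for n in range(1, 5):
    print(n, ngram_overlap(translated, original, n))
# 1 (3, 12)
# 2 (0, 11)
# 3 (0, 10)
# 4 (0, 9)
```

Only three single words match (`almost`, `complete`, `me`) and no pair of adjacent words does, so the 2-gram, 3-gram, and 4-gram precisions are all zero.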


This indicates that the Chinese to English translation using the Google API did not perform well.

## Spanish to English 

Retrieving and translating the text data

**The text in Spanish is:** Casi cualquier exhibición de total autosuficiencia arranca de mí un atónito tributo.

**The Google translation is:** Almost any display of total self-sufficiency draws a stunned tribute from me.

### Qualitative Analysis:
Based on our judgement as English speakers, we feel this translation was accurate and conveyed the meaning well. We would score it a 9/10.

### Quantitative Analysis:

**1) Percentage of words matching with original text**

First, we converted the original text to lowercase and removed special characters to ensure a fairer and more accurate comparison.

In [68]:
eng_text = data['Formal_Text_2'].iloc[0]

special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    eng_text=eng_text.replace(i,"")
    
eng_text = eng_text.lower()
print(eng_text)

almost any exhibition of complete self sufficiency draws a stunned tribute from me


Then we converted it to a list.

In [69]:
eng_text_words = eng_text.split(" ")
print(eng_text_words)

['almost', 'any', 'exhibition', 'of', 'complete', 'self', 'sufficiency', 'draws', 'a', 'stunned', 'tribute', 'from', 'me']


Next, we converted the translated text to lowercase and removed special characters.

In [70]:
trans_text = "Almost any display of total self-sufficiency draws a stunned tribute from me."
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    trans_text=trans_text.replace(i,"")
    
trans_text = trans_text.lower()
print(trans_text)

almost any display of total self-sufficiency draws a stunned tribute from me


Then we converted this to a list as well.

In [71]:
trans_text_words = trans_text.split(" ")
print(trans_text_words)

['almost', 'any', 'display', 'of', 'total', 'self-sufficiency', 'draws', 'a', 'stunned', 'tribute', 'from', 'me']


Finally, we computed the fraction of words in the translated text that also appear in the original text.

In [72]:
acc = 0
tot = 0
for word in trans_text_words:
    if word in eng_text_words:
        acc += 1
    tot += 1
    
score = acc / tot
print(score)

0.75
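Part of the gap between 0.75 and a perfect score comes from tokenisation rather than translation quality: the punctuation list does not include the hyphen, so `self-sufficiency` stays a single token and cannot match `self` and `sufficiency` in the original. The sketch below assumes hyphens are replaced by spaces before splitting, which is a choice the notebook does not make, to show how much the score moves. The name `match_fraction` is hypothetical.

```python
def match_fraction(translated_words, original_words):
    """Fraction of translated words that also occur in the original."""
    original_set = set(original_words)
    hits = sum(1 for word in translated_words if word in original_set)
    return hits / len(translated_words)

original = ("almost any exhibition of complete self sufficiency draws "
            "a stunned tribute from me")
translated = ("almost any display of total self-sufficiency draws "
              "a stunned tribute from me")

# Tokenised as in the cells above: the hyphenated word stays whole.
as_in_notebook = match_fraction(translated.split(), original.split())

# Assumption for illustration: treat a hyphen as a word separator.
hyphens_split = match_fraction(translated.replace("-", " ").split(),
                               original.split())

print(as_in_notebook)           # 0.75
print(round(hyphens_split, 3))  # 0.846
```

With hyphens split, only `display` and `total` fail to match, so 11 of 13 words are found instead of 9 of 12.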


We also checked the BLEU Score.

**2) BLEU Score**

In [73]:
reference = [
    trans_text.split()
]

candidate = eng_text.split()
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))

BLEU score -> 0.44082318755867267


This indicates that the Spanish to English translation using the Google API performed reasonably well.

## French to English 

Retrieving and translating the text data

**The text in French is:** N’importe quelle exhibition d’assurance m’extorque un tribut étonné.

**The Google translation is:** Any exhibition of assurance extorts an astonished tribute from me.

### Qualitative Analysis:
Based on our judgement as English speakers, we feel this translation was not entirely accurate. We would score it a 5/10.

### Quantitative Analysis:

**1) Percentage of words matching with original text**

First, we converted the original text to lowercase and removed special characters to ensure a fairer and more accurate comparison.

In [74]:
eng_text = data['Formal_Text_2'].iloc[0]

special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    eng_text=eng_text.replace(i,"")
    
eng_text = eng_text.lower()
print(eng_text)

almost any exhibition of complete self sufficiency draws a stunned tribute from me


Then we converted it to a list.

In [75]:
eng_text_words = eng_text.split(" ")
print(eng_text_words)

['almost', 'any', 'exhibition', 'of', 'complete', 'self', 'sufficiency', 'draws', 'a', 'stunned', 'tribute', 'from', 'me']


Next, we converted the translated text to lowercase and removed special characters.

In [76]:
trans_text = "Any exhibition of assurance extorts an astonished tribute from me."
special_characters=[',','.',':','?','!', "”", "“"]
for i in special_characters:
    trans_text=trans_text.replace(i,"")
    
trans_text = trans_text.lower()
print(trans_text)

any exhibition of assurance extorts an astonished tribute from me


Then we converted this to a list as well.

In [77]:
trans_text_words = trans_text.split(" ")
print(trans_text_words)

['any', 'exhibition', 'of', 'assurance', 'extorts', 'an', 'astonished', 'tribute', 'from', 'me']


Finally, we computed the fraction of words in the translated text that also appear in the original text.

In [78]:
acc = 0
tot = 0
for word in trans_text_words:
    if word in eng_text_words:
        acc += 1
    tot += 1
    
score = acc / tot
print(score)

0.6


We also checked the BLEU Score.

**2) BLEU Score**

In [79]:
reference = [
    trans_text.split()
]

candidate = eng_text.split()
print('BLEU score -> {}'.format(sentence_bleu(reference, candidate)))

BLEU score -> 4.994788421695789e-78
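This near-zero value sits oddly next to a 0.6 word match. The two sentences share several 2-grams and 3-grams but no 4-gram, and without smoothing a single empty order is enough to drive BLEU towards zero. One common remedy is smoothing. The sketch below is a simplified single-reference BLEU with add-one smoothing on orders 2 to 4 and a brevity penalty. It is an illustration, not NLTK's `sentence_bleu`, and `smoothed_bleu` is a hypothetical name. It treats the human English text as the reference and the machine translation as the candidate, so its values are not comparable with the outputs above.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-token tuples in `tokens`."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def smoothed_bleu(candidate, reference, max_n=4):
    """Simplified single-reference BLEU with add-one smoothing for n > 1."""
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        matches = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if n > 1:
            # Add-one smoothing keeps higher orders from zeroing the score.
            matches, total = matches + 1, total + 1
        if matches == 0:
            return 0.0  # no shared words at all
        log_precision_sum += math.log(matches / total) / max_n
    # Brevity penalty: penalise candidates shorter than the reference.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(log_precision_sum)

reference = ("almost any exhibition of complete self sufficiency draws "
             "a stunned tribute from me").split()
candidate = ("any exhibition of assurance extorts an astonished tribute "
             "from me").split()

print(smoothed_bleu(candidate, reference))
```

Under these assumptions the French translation scores about 0.25 rather than a value near zero, which is closer to what the word-match fraction and the qualitative rating suggest.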


This indicates that the French to English translation using the Google API did not perform well.