# Translation (textblob)

Make a Jupyter notebook in which you will:
- import the module (https://textblob.readthedocs.io/en/dev/quickstart.html)
- read the .txt file
- use textblob to translate the file into German
- use textblob to calculate the sentiment of the English and the German version. Do they differ?

## Installations

I installed:
* textblob
* textblob-de
* nltk, because textblob does not work without it
* googletrans

I installed textblob and googletrans from the terminal with `pip install textblob` and `pip install googletrans==3.1.0a0`. I downloaded the NLTK data from Python with `>>> import nltk` followed by `>>> nltk.download()`. The last command should open a new window, in which I installed everything listed under 'Collections'. For the German sentiment analysis I downloaded textblob-de from https://github.com/markuskiller/textblob-de.

After the installation I moved all of these new libraries into the folder anaconda3>Lib>site-packages. The notebook itself I created in anaconda3. It is important that every file, library, etc. is in the same path, otherwise it does not work!
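Before moving files by hand, it can help to check from inside the notebook which Python interpreter is running and whether it can find the packages. The following is a minimal sketch using only the standard library; the helper name `check_modules` is hypothetical and not part of any of the libraries above.

```python
import importlib.util
import sys


def check_modules(names):
    """Return {module_name: True/False} depending on whether it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# The interpreter that runs this notebook; packages must be installed for it.
print(sys.executable)

# Import names of the packages listed above (textblob-de is imported as textblob_de).
print(check_modules(["textblob", "textblob_de", "nltk", "googletrans"]))
```

If a module shows `False`, installing it from a notebook cell with `%pip install <package>` targets the environment of the running kernel, which may make the manual move unnecessary.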

## TextBlob

First I checked if I can import the TextBlob module with the following command, which worked.

In [1]:
from textblob import TextBlob

For the second homework task I needed to read the text file ice_man, so I opened it and printed the text that I need to translate.

In [2]:
with open('texts/nlp/ice_man.txt', encoding='utf-8') as fh:
    data = fh.read()
print(data) 

I married the Ice Man.
I first met the Ice Man at this ski resort hotel. I guess that’s the kind of place one ought to meet an Ice Man. In the boisterous hotel lobby, crowded with young people, the Ice Man was sitting in a chair at the furthest possible remove from the fireplace, silently reading a book. Though it was approaching high noon, it seemed to me that the cool, fresh light of the winter morning still lingered around him. Hey, that’s the Ice Man, my friend informed me in a low voice. But at that time, I had no idea what in the world an Ice Man was. My friend didn’t really know, either. She just knew that he existed and was called the Ice Man. She’s sure he’s made out of ice. That’s why he’s called the Ice Man, she said to me with a serious expression. It was like she was talking about a ghost or somebody with a contagious disease or something.

The Ice Man was tall, and from looking at him, his hair seemed bristly. When I saw his face, he looked fairly young still, but that th

I also analyzed the sentiment of the English text with textblob.

In [3]:
from textblob import TextBlob

blob = TextBlob(data)
blob.sentiment

Sentiment(polarity=0.05720858766015831, subjectivity=0.45825535139671275)

### German Translation with textblob (not working)

In general, textblob should be able to translate a text into many languages. I tried many different commands, but they always gave me the same errors. I also extended TextBlob by downloading the German extension from https://github.com/markuskiller/textblob-de and moving it into the site-packages folder of the anaconda path, the same path that I mentioned before. For the translation, however, it didn't change anything.

In [4]:
from textblob import TextBlob
blob = TextBlob(data)

print(blob.translate(to='de'))

AttributeError: 'list' object has no attribute 'strip'

In [5]:
from textblob import TextBlob
newblob = TextBlob.translate(data,to='de')
newblob

AttributeError: 'str' object has no attribute 'translator'

As you can see, the translation is not working. In addition to the errors, you will always get the DeprecationWarning: *'TextBlob.translate is deprecated and will be removed in a future release.' 'Use the official Google Translate API instead.'*

After these errors I wanted to check whether textblob works at all, so I ran a few lines to make sure that I had installed textblob correctly.

In [6]:
import textblob
blob = textblob.TextBlob('It\'s a lovely winter day here in Graz. But it is cold.')
blob

TextBlob("It's a lovely winter day here in Graz. But it is cold.")

You can also simplify the code:

In [7]:
from textblob import TextBlob
blob = TextBlob('It\'s a lovely winter day here in Graz. But it is cold.')
blob

TextBlob("It's a lovely winter day here in Graz. But it is cold.")

In [8]:
blob.words

WordList(['It', "'s", 'a', 'lovely', 'winter', 'day', 'here', 'in', 'Graz', 'But', 'it', 'is', 'cold'])

In [9]:
blob.sentences

[Sentence("It's a lovely winter day here in Graz."),
 Sentence("But it is cold.")]

Since the program can still identify words and sentences but cannot translate even a short phrase, as the next cell shows, I had to use another translator.

In [10]:
text = TextBlob('Hello how are you?')
text.translate(to='de')

AttributeError: 'list' object has no attribute 'strip'
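Switching to another translator when the first one fails can also be written as a small fallback helper. This is only a sketch: `translate_with_fallback`, `broken_translator` and `toy_translator` are hypothetical stand-ins so that the example runs without network access, and they are not functions of textblob or googletrans.

```python
def translate_with_fallback(text, translators):
    """Try each translator callable in order and return the first result.

    `translators` is a list of (name, callable) pairs; each callable takes a
    string and returns the translated string.
    """
    errors = []
    for name, translate in translators:
        try:
            return name, translate(text)
        except Exception as exc:  # remember the failure and try the next one
            errors.append(f"{name}: {type(exc).__name__}: {exc}")
    raise RuntimeError("all translators failed:\n" + "\n".join(errors))


# Hypothetical stand-in that fails like the TextBlob call above.
def broken_translator(text):
    raise AttributeError("'list' object has no attribute 'strip'")


# Hypothetical stand-in that only knows one phrase.
def toy_translator(text):
    return {"Hello how are you?": "Hallo, wie geht es dir?"}.get(text, text)


used, result = translate_with_fallback(
    "Hello how are you?",
    [("textblob", broken_translator), ("toy", toy_translator)],
)
print(used, result)
```

In the notebook, the two callables could wrap the real calls, for example `lambda t: str(TextBlob(t).translate(to='de'))` first and a googletrans call as the second entry.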

## Alternative Translation with GoogleTrans

We got the warning: 'TextBlob.translate is deprecated and will be removed in a future release.'

Therefore I chose googletrans as the translator.

I installed googletrans from the terminal with `pip install googletrans==3.1.0a0`. I had to use an older version of the library because the newest one wasn't working for me.

Again I had to move all of the new files (in my case they were located in *AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages*) into *anaconda3\Lib\site-packages*, where this notebook is also located.

First I just tested whether translation works at all.

In [11]:
import googletrans
from googletrans import Translator
from googletrans import LANGUAGES

In [12]:
translator=Translator()

In [13]:
translated = translator.translate('Ciao mondo!', dest='en')

In [14]:
translated.text

'Hello World!'

In [15]:
print(translated)

Translated(src=it, dest=en, text=Hello World!, pronunciation=Hello World!, extra_data="{'translat...")


(From the output above you can also see the language in which the text is written, even if you don't understand it. In this case it is 'it', which is Italian, as you can see in the dictionary below. The dictionary includes every language that googletrans can translate.)

In [16]:
print(LANGUAGES)

{'af': 'afrikaans', 'sq': 'albanian', 'am': 'amharic', 'ar': 'arabic', 'hy': 'armenian', 'az': 'azerbaijani', 'eu': 'basque', 'be': 'belarusian', 'bn': 'bengali', 'bs': 'bosnian', 'bg': 'bulgarian', 'ca': 'catalan', 'ceb': 'cebuano', 'ny': 'chichewa', 'zh-cn': 'chinese (simplified)', 'zh-tw': 'chinese (traditional)', 'co': 'corsican', 'hr': 'croatian', 'cs': 'czech', 'da': 'danish', 'nl': 'dutch', 'en': 'english', 'eo': 'esperanto', 'et': 'estonian', 'tl': 'filipino', 'fi': 'finnish', 'fr': 'french', 'fy': 'frisian', 'gl': 'galician', 'ka': 'georgian', 'de': 'german', 'el': 'greek', 'gu': 'gujarati', 'ht': 'haitian creole', 'ha': 'hausa', 'haw': 'hawaiian', 'iw': 'hebrew', 'he': 'hebrew', 'hi': 'hindi', 'hmn': 'hmong', 'hu': 'hungarian', 'is': 'icelandic', 'ig': 'igbo', 'id': 'indonesian', 'ga': 'irish', 'it': 'italian', 'ja': 'japanese', 'jw': 'javanese', 'kn': 'kannada', 'kk': 'kazakh', 'km': 'khmer', 'ko': 'korean', 'ku': 'kurdish (kurmanji)', 'ky': 'kyrgyz', 'lo': 'lao', 'la': 'lat

### German translation with googletrans

If you try to translate the whole file 'ice_man', not a single word is translated, because the maximum character limit for a single text is 15k.

In [17]:
german = translator.translate(data, dest='de')
german.text

'I married the Ice Man.\nI first met the Ice Man at this ski resort hotel. I guess that’s the kind of place one ought to meet an Ice Man. In the boisterous hotel lobby, crowded with young people, the Ice Man was sitting in a chair at the furthest possible remove from the fireplace, silently reading a book. Though it was approaching high noon, it seemed to me that the cool, fresh light of the winter morning still lingered around him. Hey, that’s the Ice Man, my friend informed me in a low voice. But at that time, I had no idea what in the world an Ice Man was. My friend didn’t really know, either. She just knew that he existed and was called the Ice Man. She’s sure he’s made out of ice. That’s why he’s called the Ice Man, she said to me with a serious expression. It was like she was talking about a ghost or somebody with a contagious disease or something.\n\nThe Ice Man was tall, and from looking at him, his hair seemed bristly. When I saw his face, he looked fairly young still, but tha

Therefore I had to split the "story" into two parts and create two new files, which I translated separately.
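Instead of creating the two files by hand, the text could also be split in code at paragraph breaks. The following is a minimal sketch; the helper `split_into_chunks` and its default `max_chars` value are assumptions for illustration (the default simply mirrors the 15k limit mentioned above), and a smaller value may be needed in practice.

```python
def split_into_chunks(text, max_chars=15000):
    """Split text at paragraph breaks into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for paragraph in text.split("\n\n"):
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks


# Small demonstration with a tiny limit so the split is visible.
sample = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
parts = split_into_chunks(sample, max_chars=40)
print(parts)
```

Each chunk could then be passed to `translator.translate(chunk, dest='de')` and the resulting `.text` values joined with `"\n\n"`.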

In [18]:
with open('texts/nlp/ice_man1.txt', encoding='utf-8') as fh:
    part1 = fh.read()

In [19]:
story1 = translator.translate(part1, dest='de')
story1.text

'Ich habe den Mann aus dem Eis geheiratet.\nIch habe den Mann aus dem Eis zum ersten Mal in diesem Skiresorthotel getroffen. Ich denke, das ist die Art von Ort, an dem man einen Mann aus dem Eis treffen sollte. In der ausgelassenen, von jungen Leuten überfüllten Hotellobby saß der Mann aus dem Eis auf einem möglichst weit vom Kamin entfernten Stuhl und las schweigend ein Buch. Obwohl es auf den hohen Mittag zuging, schien es mir, als ob das kühle, frische Licht des Wintermorgens ihn noch immer umgab. Hey, das ist der Mann aus dem Eis, teilte mir mein Freund mit leiser Stimme mit. Aber damals hatte ich keine Ahnung, was um alles in der Welt ein Mann aus dem Eis war. Mein Freund wusste es auch nicht wirklich. Sie wusste nur, dass er existierte und der Ice Man genannt wurde. Sie ist sich sicher, dass er aus Eis besteht. Deshalb heißt er der Mann aus dem Eis, sagte sie mit ernster Miene zu mir. Es war, als würde sie über einen Geist oder jemanden mit einer ansteckenden Krankheit oder so et

(I wanted to remove the `'\n'` from the text, but my attempt to replace it in the translated result did not work.)
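The `'\n'` sequences are newline characters that the notebook shows because it displays the string's representation. Assuming `story1.text` is a plain Python `str` (an assumption about the object returned by googletrans), the replacement would have to be applied to `.text` rather than to the result object itself. A minimal sketch, using a short excerpt of the output above as a stand-in:

```python
# Stand-in for story1.text: the first two sentences of the translated output.
translated_text = (
    "Ich habe den Mann aus dem Eis geheiratet.\n"
    "Ich habe den Mann aus dem Eis zum ersten Mal in diesem Skiresorthotel getroffen."
)

# Collapse every run of whitespace (including '\n') into a single space.
flat_text = " ".join(translated_text.split())
print(flat_text)
```

Using `print(story1.text)` instead of evaluating `story1.text` as the last line of the cell would also render the line breaks instead of showing `'\n'`.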

In [20]:
with open('texts/nlp/ice_man2.txt', encoding='utf-8') as fh:
    part2 = fh.read()

In [21]:
story2 = translator.translate(part2, dest='de')
story2.text

'Eines Tages machte ich meinem Mann einen Antrag. Warum machen wir zur Abwechslung nicht mal zusammen irgendwo einen Ausflug. Reise? er sagte. Er kniff die Augen zusammen, als er mich ansah. Warum um alles in der Welt sollten wir eine Reise unternehmen? Bist du nicht glücklich, hier bei mir zu leben?\n\nDas ist es nicht, sagte ich. Ich bin vollkommen glücklich. Es gibt keine Probleme zwischen uns. Es ist nur so, dass ich mich langweile. Ich möchte an einen weit entfernten Ort gehen und Dinge sehen, die ich noch nie zuvor gesehen habe. Ich möchte Luft atmen, die ich noch nie geatmet habe. Verstehst du? Außerdem waren wir nie in den Flitterwochen. Wir haben viel Geld auf der Bank, und ein paar Tage frei zu nehmen, sollte kein Problem sein. Ich denke nur, dass ein entspannender Ausflug irgendwo schön wäre.\n\nDer Mann aus dem Eis stieß einen tiefen, erstarrten Seufzer aus. Der Seufzer machte ein scharfes Geräusch, als die Luft kristallisierte. Er brachte seine langen, frostbedeckten Finge

Last but not least, we need to calculate the sentiment of the two parts. I run the command on each part and then take the mean of the two scores to get the sentiment of the whole translated story.

In [22]:
from textblob_de import TextBlobDE

first = TextBlobDE(story1.text)
first.sentiment

Sentiment(polarity=0.08365231259968099, subjectivity=0.04194577352472089)

In [23]:
from textblob_de import TextBlobDE

second = TextBlobDE(story2.text)
second.sentiment

Sentiment(polarity=0.006865284974093266, subjectivity=0.029706390328151987)

### Comparison of the English and German sentiment

So finally we get the result for the German story:

In [24]:
polarity=(0.08365231259968099+0.006865284974093266)/2
subjectivity=(0.04194577352472089+0.029706390328151987)/2
print(f'Sentiment(polarity={polarity}, subjectivity={subjectivity})')

Sentiment(polarity=0.04525879878688713, subjectivity=0.03582608192643644)
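The plain mean treats both parts as equally long. If the two files differ in length, a length-weighted mean may be a closer approximation to the score of the whole text. This is a sketch only: `weighted_mean` is a hypothetical helper, and the character counts are made-up placeholders because the real lengths of the two parts are not shown here.

```python
def weighted_mean(values, weights):
    """Mean of values, each weighted by the matching entry in weights."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


polarities = [0.08365231259968099, 0.006865284974093266]      # from cells 22 and 23
subjectivities = [0.04194577352472089, 0.029706390328151987]  # from cells 22 and 23

# Hypothetical lengths; in the notebook use len(story1.text) and len(story2.text).
lengths = [9000, 7000]

print(weighted_mean(polarities, lengths))
print(weighted_mean(subjectivities, lengths))
```

With equal weights the helper gives the same result as the plain mean used above.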


The English one got:

In [25]:
from textblob import TextBlob

blob = TextBlob(data)
blob.sentiment

Sentiment(polarity=0.05720858766015831, subjectivity=0.45825535139671275)

(I didn't remove any stopwords or anything like that from the English version because I wouldn't be able to do the same with the German text.)

The polarity is a float within the range [-1.0,1.0] and the subjectivity is a float within the range [0.0, 1.0] where 0.0 is very objective and 1.0 is very subjective.

We see that the sentiment results for the two languages differ. I think that if I had analyzed the German story without dividing it into two parts, the polarity would be even closer to the English one. Nevertheless, the subjectivity is very different.
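To make the comparison concrete, the absolute differences can be computed from the values printed above. A small sketch that uses only those reported numbers; the `Sentiment` namedtuple is defined locally for illustration.

```python
from collections import namedtuple

Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])

english = Sentiment(0.05720858766015831, 0.45825535139671275)  # cell 25
german = Sentiment(0.04525879878688713, 0.03582608192643644)   # cell 24, mean of both parts

polarity_diff = abs(english.polarity - german.polarity)
subjectivity_diff = abs(english.subjectivity - german.subjectivity)

print(f"polarity difference:     {polarity_diff:.3f}")      # 0.012
print(f"subjectivity difference: {subjectivity_diff:.3f}")  # 0.422
```

The polarity values differ by about 0.012 on a scale of width 2, while the subjectivity values differ by about 0.42 on a scale of width 1.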

In my opinion the polarity should at least be below 0; for me it is more like -0.25. For the subjectivity I agree more with the English result of 0.46. The German result doesn't make much sense to me.