### Describing the most common emotions in Messenger conversations
The program uses two models to report the emotions that appear most often in a conversation: a translation model (Polish to English) and an emotion classifier.

To start, put the name of the JSON file with your conversation in the `input_file_path` variable (the first line of the cell that fixes Polish characters). You can download your Messenger conversations as JSON files from Facebook. Then run all the cells in order. Have fun!

Before running anything else, install the required libraries:

In [1]:
%pip install transformers sentencepiece torch

Note: you may need to restart the kernel to use updated packages.


The code below repairs mis-encoded Polish characters such as 'ę' and 'ą' in the JSON file and saves the result to a new file.

In [13]:
input_file_path = 'file.json'  # name of the exported conversation file

# The export stores each UTF-8 byte as a separate \u00XX escape,
# so every Polish letter appears as a pair of escapes in the raw file.
POLISH_ESCAPES = {
    r'\u00c4\u0099': 'ę', r'\u00c5\u009b': 'ś', r'\u00c5\u0082': 'ł',
    r'\u00c4\u0087': 'ć', r'\u00c5\u00bc': 'ż', r'\u00c4\u0085': 'ą',
    r'\u00c3\u00b3': 'ó', r'\u00c5\u0084': 'ń', r'\u00c5\u00ba': 'ź',
}

def fix_polish_characters(input_file, output_file):
    # Read the raw JSON text; the escapes are still literal backslash sequences here
    with open(input_file, 'r', encoding='utf-8') as src:
        json_data = src.read()

    # Replace each escaped byte pair with the letter it encodes (lowercase letters only)
    for escaped, letter in POLISH_ESCAPES.items():
        json_data = json_data.replace(escaped, letter)

    with open(output_file, 'w', encoding='utf-8') as dst:
        dst.write(json_data)

output_file_path = "fixed_" + input_file_path  # output filename

fix_polish_characters(input_file_path, output_file_path)
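
The replacement table above covers nine lowercase letters only. As a more general alternative, the sketch below repairs every string after the JSON has been parsed, by turning the characters back into bytes and decoding those bytes as UTF-8. The helper name `fix_mojibake` and the sample data are made up for illustration; this is not part of the original notebook.

```python
import json

def fix_mojibake(value):
    """Recursively repair strings whose UTF-8 bytes were read as Latin-1."""
    if isinstance(value, str):
        try:
            return value.encode('latin-1').decode('utf-8')
        except (UnicodeEncodeError, UnicodeDecodeError):
            # Not mojibake (or not repairable this way): keep the original text
            return value
    if isinstance(value, list):
        return [fix_mojibake(item) for item in value]
    if isinstance(value, dict):
        return {fix_mojibake(key): fix_mojibake(item) for key, item in value.items()}
    return value

# Tiny in-memory example in the same escaped form as the export (hypothetical data)
raw = '{"messages": [{"content": "\\u00c5\\u00bc\\u00c3\\u00b3\\u00c5\\u0082w"}]}'
fixed = fix_mojibake(json.loads(raw))
```

With this approach uppercase Polish letters and other non-ASCII characters are handled by the same code path, and strings that are already correct are returned unchanged.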


The code below reads the messages from the JSON file and saves them in a text file.

In [14]:
import json

# Load the repaired conversation
with open(output_file_path, 'r', encoding='utf-8') as file:
    data = json.load(file)

# e.g. fixed_file.json -> data_fixed_file.txt
new_file_name = "data_" + output_file_path.replace(".json", ".txt")

with open(new_file_name, 'w', encoding='utf-8') as output_file:
    for message in data['messages']:
        # Skip entries that carry no text
        if 'content' in message:
            output_file.write(message['content'] + '. ')
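
The loop above appends `'. '` to every message, so a message that already ends with punctuation becomes, for example, `Ready?. `. A minimal sketch of a variant that adds a full stop only when it is missing is shown below. The helper name `join_messages` and the sample dictionary are hypothetical, and the guess that text-less entries are photos, stickers or calls is an assumption.

```python
def join_messages(data):
    """Join message texts into one string, one sentence per message."""
    parts = []
    for message in data.get('messages', []):
        content = message.get('content', '').strip()
        if not content:
            continue  # entries without text (assumed: photos, stickers, calls)
        # Add a full stop only when the message has no closing punctuation
        if content[-1] not in '.!?':
            content += '.'
        parts.append(content)
    return ' '.join(parts)

# Hypothetical sample in the same shape as the export
sample = {'messages': [{'content': 'Hi'}, {'photos': []}, {'content': 'Ready?'}]}
text = join_messages(sample)
```

The returned string could then be written to `new_file_name` in place of the per-message writes.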


The code below uses the first model to translate the messages from Polish to English, because the emotion model in the next step expects English text.

In [15]:
from transformers import MarianMTModel, MarianTokenizer

# Load the translation model and its tokenizer once
model_name = "Helsinki-NLP/opus-mt-pl-en"
model = MarianMTModel.from_pretrained(model_name)
tokenizer = MarianTokenizer.from_pretrained(model_name)

def translate_polish_to_english(text):
    # Tokenize; anything beyond 512 tokens is cut off, so only the
    # beginning of a long conversation gets translated
    inputs = tokenizer(text, return_tensors="pt", max_length=512, truncation=True)

    # Generate the translation and decode it back to plain text
    translation = model.generate(**inputs)
    translated_text = tokenizer.decode(translation[0], skip_special_tokens=True)

    return translated_text


with open(new_file_name, 'r', encoding='utf-8') as file:
    file_content = file.read()

english_translation = translate_polish_to_english(file_content)
print("Text in English:", english_translation)


Text in English: I'd like to start with that. He's going there. Oh, yeah. I was thinking about the saint maximum I already have. They wouldn't fit on the left wing. Tomorrow I think about Neymar, if Comana, special Mitoma. Video chat ended... The video chat ended... So when you're ready I'm ready. As I ate something. tasty too. I guess it's probably in about 10 minutes. I'll eat and I can be. because today Tuesday. and yes. Because today spaghetti that I do in 15 minutes. But it's okay to eat it.
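
Because the tokenizer call uses `max_length=512` with `truncation=True`, a long conversation is translated only up to that limit. One possible workaround, sketched below, is to split the text into groups of whole sentences and translate each group separately. The helper names and the 400-character default are assumptions made for illustration; character count is only a rough stand-in for token count.

```python
import re

def split_into_chunks(text, max_chars=400):
    """Group whole sentences into chunks of roughly max_chars characters.

    A single sentence longer than max_chars is kept whole in its own chunk.
    """
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ''
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f'{current} {sentence}'.strip()
        if current and len(candidate) > max_chars:
            # The current chunk is full: store it and start a new one
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks

def translate_in_chunks(text, translate, max_chars=400):
    """Translate each chunk with the given callable and join the results."""
    return ' '.join(translate(chunk) for chunk in split_into_chunks(text, max_chars))
```

In this notebook it could be called as `translate_in_chunks(file_content, translate_polish_to_english)`, which passes every chunk through the translation function defined above.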


The code below uses the second model to detect emotions in the translated conversation and prints a score for each emotion label.

In [16]:
from transformers import pipeline

classifier = pipeline(task="text-classification", model="SamLowe/roberta-base-go_emotions", top_k=None)
    
sentences = [english_translation]

model_outputs = classifier(sentences)
print(model_outputs[0])
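
The printed list contains every label the model knows, most of them with scores close to zero. A minimal sketch for reducing it to the few strongest emotions follows. The helper name `top_emotions` and the choice to skip `neutral` by default are assumptions for illustration; the sample values are the first entries of this cell's output, rounded to three decimals.

```python
def top_emotions(scores, k=3, ignore=('neutral',)):
    """Return the k highest-scoring (label, score) pairs, skipping ignored labels."""
    kept = [s for s in scores if s['label'] not in ignore]
    kept.sort(key=lambda s: s['score'], reverse=True)
    return [(s['label'], round(s['score'], 3)) for s in kept[:k]]

# First entries of the classifier output, rounded to three decimals
sample = [
    {'label': 'neutral', 'score': 0.638},
    {'label': 'approval', 'score': 0.407},
    {'label': 'realization', 'score': 0.041},
    {'label': 'optimism', 'score': 0.034},
]
strongest = top_emotions(sample, k=2)
```

In the notebook the same helper could be applied directly as `top_emotions(model_outputs[0])`.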


[{'label': 'neutral', 'score': 0.6382584571838379}, {'label': 'approval', 'score': 0.40677452087402344}, {'label': 'realization', 'score': 0.04068932682275772}, {'label': 'optimism', 'score': 0.03435012325644493}, {'label': 'desire', 'score': 0.016977740451693535}, {'label': 'excitement', 'score': 0.014631872065365314}, {'label': 'joy', 'score': 0.012363879941403866}, {'label': 'admiration', 'score': 0.0056058987975120544}, {'label': 'confusion', 'score': 0.005055004730820656}, {'label': 'disapproval', 'score': 0.004396679811179638}, {'label': 'annoyance', 'score': 0.004146473947912455}, {'label': 'curiosity', 'score': 0.003435862949118018}, {'label': 'relief', 'score': 0.003081168979406357}, {'label': 'love', 'score': 0.003008962841704488}, {'label': 'disappointment', 'score': 0.0027620592154562473}, {'label': 'caring', 'score': 0.0018294511828571558}, {'label': 'amusement', 'score': 0.0015960148302838206}, {'label': 'surprise', 'score': 0.0014874140033498406}, {'label': 'pride', 'sco