<a href="https://colab.research.google.com/github/tercasaskova311/Crosslinguistics_emotional_expression/blob/main/Implementation_of_EmoAtlas_Mistral_Hackaton.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Objective
Compare the emotional intensity of responses generated by a large language model (LLM) when prompted in Italian versus English, using EmoAtlas as the emotion analysis tool.
The broader goal is to explore whether the language of the prompt affects the emotional expression in the LLM's response, as a kind of multilingual sentiment audit.

# Theme

We're using EmoAtlas, a tool that quantifies emotional signals in text, to investigate how the same idea, when phrased in different languages, may trigger different emotional profiles in the model's replies.
This fits under the broader theme:
Analyzing LLM self-expression across languages using emotion metrics.

# Approach


1. Prompt construction:
   - Choose a small set of emotionally neutral or ambiguous phrases.
   - Translate each phrase directly into both Italian and English.
   - Use these as prompts for the LLM.
2. LLM response collection:
   - Query the LLM with each prompt.
   - Collect its full response in both languages.
3. Emotion scoring:
   - Use emos_eng.zscores() and emos_it.zscores() (from EmoAtlas) to compute the emotional intensity vector for each response.
   - Expand these vectors into separate columns (e.g., anger, joy, trust, etc.) to compare emotion-by-emotion.
4. Comparison and visualization:
   - Compare scores across languages.
   - Optionally visualize via bar plots or radar charts for easy interpretation.
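
The steps above can be sketched as a single loop. This is only an outline under stated assumptions: `run_audit`, `fake_ask` and `fake_score` are hypothetical names, and the two stand-in functions replace the real LLM call and the EmoAtlas scorers so the flow can be followed without network access or model downloads.

```python
EMOTIONS = ["anger", "joy", "trust", "sadness", "disgust", "fear", "anticipation", "surprise"]

def run_audit(prompts, ask, score_en, score_it):
    """Collect one reply per language and flatten the emotion scores into one row per prompt."""
    rows = []
    for pair in prompts:
        en_scores = score_en(ask(pair["english"]))
        it_scores = score_it(ask(pair["italian"]))
        row = {"english": pair["english"], "italian": pair["italian"]}
        for emotion in EMOTIONS:
            # Emotions missing from a score dict default to 0
            row[f"eng_{emotion}"] = en_scores.get(emotion, 0)
            row[f"it_{emotion}"] = it_scores.get(emotion, 0)
        rows.append(row)
    return rows

# Hypothetical stand-ins for the LLM call and the emotion scorers
def fake_ask(prompt):
    return f"reply to: {prompt}"

def fake_score(text):
    return {"joy": 1.5, "fear": -0.5}

demo_rows = run_audit(
    [{"english": "What is your opinion on vegetarianism?",
      "italian": "Qual è la tua opinione sulla vegetarianismo?"}],
    ask=fake_ask, score_en=fake_score, score_it=fake_score,
)
print(demo_rows[0]["eng_joy"], demo_rows[0]["it_anger"])
```

Passing the LLM call and the scorers in as arguments keeps the loop independent of any particular model or scoring library.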


# Prompts

A file with the prompts in each language.
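
A minimal sketch of how such a file could be loaded and checked, assuming it uses the `{"prompts": [{"english": ..., "italian": ...}]}` layout that appears later in this notebook. The helper name `load_prompts` and the demo file name are hypothetical.

```python
import json
import os
import tempfile

def load_prompts(path):
    """Load prompt pairs and check that each one has both language versions."""
    with open(path, encoding="utf-8") as f:
        prompts = json.load(f)["prompts"]
    for index, pair in enumerate(prompts):
        missing = {"english", "italian"} - set(pair)
        if missing:
            raise ValueError(f"Prompt {index} is missing: {sorted(missing)}")
    return prompts

# Demo with a small temporary file (hypothetical name)
demo_prompts_path = os.path.join(tempfile.mkdtemp(), "prompts_demo.json")
with open(demo_prompts_path, "w", encoding="utf-8") as f:
    json.dump({"prompts": [{
        "english": "Do you think climate change is a serious issue?",
        "italian": "Pensi che il cambiamento climatico sia un problema serio?",
    }]}, f, ensure_ascii=False)

loaded_prompts = load_prompts(demo_prompts_path)
print(len(loaded_prompts))
```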


# Step 1: Mistral



In [1]:
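import os        # For reading the API key from the environment
import requests  # For sending HTTP requests
import json      # For handling JSON data

# ==================== Configuration ====================
# Read the Mistral API key from the environment rather than hard-coding it in the notebook.
# Set it before running this cell, e.g. os.environ["MISTRAL_API_KEY"] = "<your key>"
API_KEY = os.environ["MISTRAL_API_KEY"]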

# Base URL for Mistral's OpenAI-compatible chat completions endpoint
API_URL = "https://api.mistral.ai/v1/chat/completions"

# Choose one of the available models (e.g., 'mistral-tiny', 'mistral-small', 'mistral-medium')
MODEL = "mistral-small"

# Headers for authentication and content type
HEADERS = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

# ==================== Function to Send a Chat Message ====================
def chat_with_mistral(messages):
    """
    Sends a list of messages to the Mistral chat API and returns the assistant's response.

    Parameters:
        messages (list): A list of message dictionaries in the OpenAI chat format.
                         Example: [{"role": "user", "content": "Hello!"}]

    Returns:
        str: The assistant's reply as a string.
    """
    payload = {
        "model": MODEL,
        "messages": messages,
        "temperature": 0.7,     # Creativity level (0 = deterministic, 1 = more random)
        "top_p": 1.0,           # Nucleus sampling parameter
        "stream": False         # Disable streaming for simple usage
    }

    # Send a POST request to Mistral's API (json= serializes the payload; timeout avoids hanging forever)
    response = requests.post(API_URL, headers=HEADERS, json=payload, timeout=60)

    # Raise an error if the request failed
    if response.status_code != 200:
        raise RuntimeError(f"Request failed: {response.status_code} - {response.text}")

    # Parse the JSON response
    response_data = response.json()

    # Extract and return the assistant's reply
    return response_data['choices'][0]['message']['content']
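
Since the function above raises on any non-200 status, a long run of prompts can stop at the first transient failure. One possible mitigation is a small retry wrapper with exponential backoff. The sketch below is an assumption, not part of the original notebook: `call_with_retries` is a hypothetical helper, and `flaky` stands in for the real API call so the example runs offline.

```python
import time

def call_with_retries(func, *args, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(*args), retrying on any exception with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func(*args)
        except Exception:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))

# Hypothetical stand-in that fails twice before succeeding
calls = {"count": 0}

def flaky(prompt):
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return f"ok: {prompt}"

waits = []  # Record the delays instead of actually sleeping
result = call_with_retries(flaky, "hello", sleep=waits.append)
print(result, waits)
```

In the notebook this could wrap the real call, for example `call_with_retries(chat_with_mistral, messages)`.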

In [6]:
chat_history_eng = [
    {"role": "user",
     "content":"I want to make multiple prompts to an LLM with the same phrase in english and italian, and I want to use a json file. I need 70 phrases, each needs to ask for opinions. Can you make that json for me, with all the 70 phrases?"}
]

try:
    response = chat_with_mistral(chat_history_eng)
    print(response)
except Exception as e:
    print("Error:", str(e))

Sure, here is a JSON file with 70 prompts for gathering opinions in English and Italian:

```json
{
  "prompts": [
    {
      "english": "What is your opinion on vegetarianism?",
      "italian": "Qual è la tua opinione sulla vegetarianismo?"
    },
    {
      "english": "Do you think climate change is a serious issue?",
      "italian": "Pensi che il cambiamento climatico sia un problema serio?"
    },
    {
      "english": "What is your opinion on the use of renewable energy sources?",
      "italian": "Qual è la tua opinione sull'utilizzo di fonti di energia rinnovabili?"
    },
    {
      "english": "Do you think social media has a positive or negative impact on society?",
      "italian": "Pensi che i social media abbiano un impatto positivo o negativo sulla società?"
    },
    {
      "english": "What is your opinion on the legalization of marijuana?",
      "italian": "Qual è la tua opinione sulla legalizzazione della marijuana?"
    },
    {
      "english": "Do you think 

In [7]:
import re

def clean_response_json(response):
    """Extract the contents of the first fenced JSON block in the model's reply."""
    match = re.search(r'```json\s*(.*?)\s*```', response, re.DOTALL)
    if match is None:
        # Fail loudly here instead of passing None on to json.loads
        raise ValueError("No JSON content found in the text.")
    return match.group(1)

json_res = json.loads(clean_response_json(response))
json_res

{'prompts': [{'english': 'What is your opinion on vegetarianism?',
   'italian': 'Qual è la tua opinione sulla vegetarianismo?'},
  {'english': 'Do you think climate change is a serious issue?',
   'italian': 'Pensi che il cambiamento climatico sia un problema serio?'},
  {'english': 'What is your opinion on the use of renewable energy sources?',
   'italian': "Qual è la tua opinione sull'utilizzo di fonti di energia rinnovabili?"},
  {'english': 'Do you think social media has a positive or negative impact on society?',
   'italian': 'Pensi che i social media abbiano un impatto positivo o negativo sulla società?'},
  {'english': 'What is your opinion on the legalization of marijuana?',
   'italian': 'Qual è la tua opinione sulla legalizzazione della marijuana?'},
  {'english': 'Do you think online learning is as effective as traditional classroom learning?',
   'italian': "Pensi che l'apprendimento online sia efficace quanto quello in aula?"},
  {'english': 'What is your opinion on the

Iterate over the prompts and store each response next to its prompt in the same JSON structure.

In [10]:
import time
import json

print(len(json_res["prompts"]))

for item in json_res["prompts"]:
    time.sleep(1)
    item["english_response"] = chat_with_mistral([{"role": "user", "content": item["english"]}])

    time.sleep(1)
    item["italian_response"] = chat_with_mistral([{"role": "user", "content": item["italian"]}])


60


In [9]:
output_file = "responses.txt"

# Write every response pair in a single pass; UTF-8 keeps Italian accented characters intact
with open(output_file, "w", encoding="utf-8") as f:
    f.write("Responses:\n\n")
    for item in json_res["prompts"]:
        english_response = item.get("english_response", "No response generated")
        italian_response = item.get("italian_response", "No response generated")
        f.write(f"English Response:\n{english_response}\n\n")
        f.write(f"Italian Response:\n{italian_response}\n\n")
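
The text file is convenient for reading, but the prompt and response pairs can also be kept as JSON so they can be reloaded without querying the model again. The following is a sketch under assumptions: `save_responses`, `load_responses` and the file name `responses.json` are hypothetical, and the demo entry uses placeholder replies rather than real model output.

```python
import json
import os
import tempfile

def save_responses(data, path):
    """Write the prompt/response structure as UTF-8 JSON, keeping accented characters readable."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=2)

def load_responses(path):
    """Read the structure back exactly as it was saved."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# Placeholder entry in the same shape as json_res (replies are not real model output)
demo = {"prompts": [{
    "english": "Do you think social media has a positive or negative impact on society?",
    "italian": "Pensi che i social media abbiano un impatto positivo o negativo sulla società?",
    "english_response": "placeholder English reply",
    "italian_response": "risposta segnaposto",
}]}

demo_path = os.path.join(tempfile.mkdtemp(), "responses.json")
save_responses(demo, demo_path)
restored = load_responses(demo_path)
print(restored == demo)
```

With `ensure_ascii=False`, characters such as `à` are written as-is rather than as `\u00e0` escapes, which keeps the file easy to inspect by hand.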

# Step 2: Getting EmoAtlas to work

In [None]:
!pip install git+https://github.com/MassimoStel/emoatlas

Collecting git+https://github.com/MassimoStel/emoatlas
  Cloning https://github.com/MassimoStel/emoatlas to /tmp/pip-req-build-ec1p3013
  Running command git clone --filter=blob:none --quiet https://github.com/MassimoStel/emoatlas /tmp/pip-req-build-ec1p3013
  Resolved https://github.com/MassimoStel/emoatlas to commit b2d807aaf8430a9ee0a7c2b803b33061a4587086
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting deep_translator (from emoatlas==0.2.0)
  Downloading deep_translator-1.11.4-py3-none-any.whl.metadata (30 kB)
Downloading deep_translator-1.11.4-py3-none-any.whl (42 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.3/42.3 kB 3.1 MB/s eta 0:00:00
Building wheels for collected packages: emoatlas
  Building wheel for emoatlas (pyproject.toml) ... done
  Created wheel for emoatlas: filename=emoatla

Apparently spaCy has no Czech model, so we may need to ask for a workaround.

In [None]:
!python -m spacy download en_core_web_lg
!python -m spacy download it_core_news_lg


Collecting en-core-web-lg==3.8.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-3.8.0/en_core_web_lg-3.8.0-py3-none-any.whl (400.7 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 400.7/400.7 MB 4.2 MB/s eta 0:00:00
Installing collected packages: en-core-web-lg
Successfully installed en-core-web-lg-3.8.0
✔ Download and installation successful
You can now load the package via spacy.load('en_core_web_lg')
⚠ Restart to reload dependencies
If you are in a Jupyter or Colab notebook, you may need to restart Python in
order to load all the package's dependencies. You can do this by selecting the
'Restart kernel' or 'Restart runtime' option.
Collecting it-core-news-lg==3.8.0
  Downloading https://github.com/explosion/spacy-models/releases/download/it_core_news_lg-3.8.0/it_core_news_lg-3.8.0-py3-none-any.whl (567.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 567

In [None]:
import nltk
nltk.download('wordnet')
nltk.download('omw-1.4')

[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package omw-1.4 to /root/nltk_data...


True

In [None]:
import matplotlib
from emoatlas import EmoScores
import emoatlas

In [None]:
emos_eng = EmoScores(language="english")
emos_it = EmoScores(language="italian")


In [None]:
import pandas as pd

emotions = ["anger", "joy", "trust", "sadness", "disgust", "fear", "anticipation", "surprise"]


# Function to extract z-scores for each emotion
def extract_emotions(df, col_name, model, prefix):
    # Score each response once, then expand the resulting dict into one column per emotion
    scores = df[col_name].apply(model.zscores)
    return pd.DataFrame({
        f"{prefix}_{emotion}": scores.apply(lambda z: z.get(emotion, 0))
        for emotion in emotions
    })


df = pd.DataFrame(json_res["prompts"])

# Extract emotion scores
df_eng = extract_emotions(df, "english_response", emos_eng, "eng")
df_it = extract_emotions(df, "italian_response", emos_it, "it")

# Combine into one DataFrame
df_emotions = pd.concat([df_eng, df_it], axis=1)

In [None]:
# Save next to the notebook rather than in the filesystem root
df_emotions.to_csv("df_emotions_full_ds.csv")

In [None]:
df_it_sum = df_it.sum()
df_eng_sum = df_eng.sum()

summary_of_each_emotion = pd.DataFrame({
    'Italian': df_it_sum,
    'English': df_eng_sum
})
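
The summary above adds up z-scores per language. Because each row pairs an English and an Italian reply to the same prompt, another possible view is the mean per-prompt difference for each emotion. The sketch below is one way to compute it with the standard library, assuming rows shaped like `df_emotions.to_dict("records")` with the `eng_` and `it_` column prefixes used above. The function name `mean_differences` is hypothetical and the demo values are made up.

```python
from statistics import mean

def mean_differences(rows, emotions):
    """Average (Italian minus English) z-score per emotion across prompt pairs."""
    return {
        emotion: mean(row[f"it_{emotion}"] - row[f"eng_{emotion}"] for row in rows)
        for emotion in emotions
    }

# Made-up values for two prompts and two emotions, for illustration only
demo_scores = [
    {"eng_joy": 1.0, "it_joy": 2.0, "eng_fear": 0.5, "it_fear": 0.0},
    {"eng_joy": 0.0, "it_joy": 2.0, "eng_fear": -0.5, "it_fear": 0.0},
]
diffs = mean_differences(demo_scores, ["joy", "fear"])
print(diffs)
```

A positive value means the Italian replies scored higher on that emotion on average. Unlike a sum, the mean does not grow with the number of prompts.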
