# Access the MentalRiskES data and interact with the server

This notebook was developed by the [SINAI](https://sinai.ujaen.es/) research group for use in the [MentalRiskES](https://sites.google.com/view/mentalriskes2025/) evaluation campaign at IberLEF 2025.

**NOTE 1**: Please visit the [MentalRiskES competition website](https://sites.google.com/view/mentalriskes2025/evaluation) to read the instructions about how to download the data and interact with the server to send the predictions of your system.

**NOTE 2**: Throughout the code, replace "URL" with the server URL and "TOKEN" with your personal token.

Remember that this notebook is only a starting point to help you build your own system for communicating with our server. We recommend downloading it as a Python script, rather than working directly in Colab, and adapting the code to your needs.

# Install CodeCarbon package
Read the library's [documentation](https://mlco2.github.io/codecarbon/) if necessary. We also provide a [CodeCarbon notebook](https://colab.research.google.com/drive/1boavnGOir0urui8qktbZaOmOV2pS5cn6?usp=sharing) with an example of how it is used in our competition.


In [1]:
!pip install codecarbon
!pip install python-dotenv


# Import libraries

In [2]:
import requests, zipfile, io
from requests.adapters import HTTPAdapter, Retry
from typing import List, Dict
import random
import json
import os
from dotenv import load_dotenv
import pandas as pd
from codecarbon import EmissionsTracker

In [4]:
os.environ["SERVER_URL"] = "http://s3-ceatic.ujaen.es:8036"
os.environ["ACCESS_TOKEN"] = "c461869975ffb0a7ba8544ffdddf3b58"
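To keep the token out of the notebook itself, the two values can instead be read from the environment and checked before use. The sketch below assumes the variable names `SERVER_URL` and `ACCESS_TOKEN` used in this notebook; `read_settings` is a hypothetical helper, not part of the campaign code.

```python
import os
from typing import Mapping, Tuple


def read_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (server_url, token), raising if either variable is unset or empty."""
    missing = [name for name in ("SERVER_URL", "ACCESS_TOKEN") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Strip a trailing slash so that URL + "/{TASK}/..." does not produce "//".
    return env["SERVER_URL"].rstrip("/"), env["ACCESS_TOKEN"]
```

Calling `read_settings()` with no argument reads the real process environment; passing a plain dictionary makes the helper easy to try out without touching it.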

# Endpoints
These URLs are needed to connect to the server.

**IMPORTANT:** Replace "URL" with the server URL and "TOKEN" with your user token.

In [5]:
load_dotenv()
URL = os.getenv("SERVER_URL")
TOKEN = os.getenv("ACCESS_TOKEN")
print(URL, TOKEN)
# Download endpoints
ENDPOINT_DOWNLOAD_TRIAL = URL+"/{TASK}/download_trial/{TOKEN}"
ENDPOINT_DOWNLOAD_TRAIN = URL+"/{TASK}/download_train/{TOKEN}"

# Trial endpoints
ENDPOINT_GET_MESSAGES_TRIAL = URL+"/{TASK}/getmessages_trial/{TOKEN}"
ENDPOINT_SUBMIT_DECISIONS_TRIAL = URL+"/{TASK}/submit_trial/{TOKEN}/{RUN}"

# Test endpoints
ENDPOINT_GET_MESSAGES = URL+"/{TASK}/getmessages/{TOKEN}"
ENDPOINT_SUBMIT_DECISIONS = URL+"/{TASK}/submit/{TOKEN}/{RUN}"

http://s3-ceatic.ujaen.es:8036 c461869975ffb0a7ba8544ffdddf3b58
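Each endpoint above is a template whose `{TASK}`, `{TOKEN}` and `{RUN}` placeholders are filled in with `str.format` at request time. A minimal sketch of that step follows; the base URL and the helper name `submit_trial_url` are placeholders chosen for illustration.

```python
# Hypothetical base URL, for illustration only; use the server URL you were given.
BASE_URL = "http://example.invalid:8000"

SUBMIT_TRIAL_TEMPLATE = BASE_URL + "/{TASK}/submit_trial/{TOKEN}/{RUN}"


def submit_trial_url(task: str, token: str, run: int) -> str:
    """Fill the {TASK}, {TOKEN} and {RUN} placeholders of the template."""
    return SUBMIT_TRIAL_TEMPLATE.format(TASK=task, TOKEN=token, RUN=run)


print(submit_trial_url("task1", "TOKEN", 0))
# http://example.invalid:8000/task1/submit_trial/TOKEN/0
```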


# Download Data
To download the data, you can use the **functions defined below**.

The first function below downloads the trial data; the second adapts it to the train data. For the test data, follow the instructions on the [website of the competition](https://sites.google.com/view/mentalriskes2024/evaluation).

In [6]:
def download_messages_trial(task: str, token: str):
    """ Allows you to download the trial data of the task.
        Args:
          task (str): task from which the data is to be retrieved
          token (str): authentication token
    """

    response = requests.get(ENDPOINT_DOWNLOAD_TRIAL.format(TASK=task, TOKEN=token))

    if response.status_code != 200:
        print("Trial - Status Code " + task + ": " + str(response.status_code) + " - Error: " + str(response.text))
    else:
        z = zipfile.ZipFile(io.BytesIO(response.content))
        # exist_ok=True avoids a FileExistsError when the data is downloaded again
        os.makedirs("./data/{task}/trial/".format(task=task), exist_ok=True)
        z.extractall("./data/{task}/trial/".format(task=task))

In [7]:
def download_messages_train(task: str, token: str):
    """ Allows you to download the train data of the task.
        Args:
          task (str): task from which the data is to be retrieved
          token (str): authentication token
    """
    response = requests.get(ENDPOINT_DOWNLOAD_TRAIN.format(TASK=task, TOKEN=token))

    if response.status_code != 200:
        print("Train - Status Code " + task + ": " + str(response.status_code) + " - Error: " + str(response.text))
    else:
        z = zipfile.ZipFile(io.BytesIO(response.content))
        os.makedirs("./data/{task}/train/".format(task=task), exist_ok=True)
        z.extractall("./data/{task}/train/".format(task=task))
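Both download functions share the same step: the response body is a ZIP archive that is unpacked into a task folder. The sketch below isolates that step so it can be tried without the server. `extract_zip_bytes` and the file name `subject1.json` are hypothetical, and the archive is built in memory to stand in for `response.content`.

```python
import io
import os
import tempfile
import zipfile


def extract_zip_bytes(content: bytes, target_dir: str) -> list:
    """Extract an in-memory ZIP archive into target_dir and return the member names."""
    os.makedirs(target_dir, exist_ok=True)  # safe to call again on a second download
    with zipfile.ZipFile(io.BytesIO(content)) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Build a small archive in memory to stand in for response.content.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("subject1.json", "[]")

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "data", "task1", "trial")
    names = extract_zip_bytes(buffer.getvalue(), target)
    extracted = sorted(os.listdir(target))
```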

# Client Server
This class simulates communication with our server. The following code establishes the connection with the server and simulates the GET and POST requests.

**IMPORTANT NOTE:** Please pay attention to the basic functions and remember that it is only a base for your system.

In [8]:
class Client_task1_2:
    """ Client communicating with the official server.
        Attributes:
            token (str): authentication token
            number_of_runs (int): number of systems. Must be 3 in order to advance to the next round.
            tracker (EmissionsTracker): object to calculate the carbon footprint in prediction

    """
    def __init__(self, task:str, token: str, number_of_runs: int, tracker: EmissionsTracker):
        self.task = task
        self.token = token
        self.number_of_runs = number_of_runs
        self.tracker = tracker
        self.relevant_cols = ['duration', 'emissions', 'cpu_energy', 'gpu_energy',
                              'ram_energy','energy_consumed', 'cpu_count', 'gpu_count',
                              'cpu_model', 'gpu_model', 'ram_total_size','country_iso_code']


    def get_messages(self, retries: int, backoff: float) -> List[Dict]:
        """ Allows you to download the test data of the task by rounds.
            Here a GET request is sent to the server to extract the data.
            Args:
              retries (int): maximum number of retries per request
              backoff (float): backoff factor applied between retries
            Returns:
              List[Dict]: messages of the current round, or an empty list on error
        """
        session = requests.Session()
        retries = Retry(
                        total = retries,
                        backoff_factor = backoff,
                        status_forcelist = [500, 502, 503, 504]
                        )
        # Mount the retry adapter for both schemes: the server URL may be plain HTTP
        adapter = HTTPAdapter(max_retries=retries)
        session.mount('http://', adapter)
        session.mount('https://', adapter)

        response = session.get(ENDPOINT_GET_MESSAGES_TRIAL.format(TASK=self.task, TOKEN=self.token)) # ENDPOINT

        if response.status_code != 200:
            print("GET - Task {} - Status Code {} - Error: {}".format(self.task, str(response.status_code), str(response.text)))
            return []
        else:
            return json.loads(response.content)

    def submit_decission(self, messages: List[Dict], emissions: Dict, retries: int, backoff: float):
        """ Allows you to submit the decisions of the task by rounds.
            The POST requests are sent to the server to send predictions and carbon emission data
            Args:
              messages (List[Dict]): Message set of the current round
              emissions (Dict): carbon footprint generated in the prediction
              retries (int): number of calls on the server connection
              backoff (float): time between retries
        """
        decisions_run0 = {}
        decisions_run1 = {}
        decisions_run2 = {}
        type_addiction_list = ["betting", "onlinegaming", "trading"]
        type_addiction_decision = {}

        # You must create the appropriate structure to send the predictions according to each task
        for message in messages:
            decisions_run0[message["nick"]] = random.choice([0,1])
            decisions_run1[message["nick"]] = random.choice([0,1])
            decisions_run2[message["nick"]] = random.choice([0,1])
            type_addiction_decision[message["nick"]] = random.choice(type_addiction_list)

        data1_run0 = {
            "predictions": decisions_run0,
            "emissions": emissions
        }
        data1_run1 = {
            "predictions": decisions_run1,
            "emissions": emissions
        }
        data1_run2 = {
            "predictions": decisions_run2,
            "emissions": emissions
        }
        data2_run0 = {
            "predictions": decisions_run0,
            "types":type_addiction_decision,
            "emissions": emissions
        }
        data2_run1 = {
            "predictions": decisions_run1,
            "types":type_addiction_decision,
            "emissions": emissions
        }
        data2_run2 = {
            "predictions": decisions_run2,
            "types":type_addiction_decision,
            "emissions": emissions
        }

        data1 = []
        data1.append(json.dumps(data1_run0))
        data1.append(json.dumps(data1_run1))
        data1.append(json.dumps(data1_run2))

        data2 = []
        data2.append(json.dumps(data2_run0))
        data2.append(json.dumps(data2_run1))
        data2.append(json.dumps(data2_run2))

        # Session to POST request
        session = requests.Session()
        retries = Retry(
                        total = retries,
                        backoff_factor = backoff,
                        status_forcelist = [500, 502, 503, 504]
                        )
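        # Mount the retry adapter for both schemes: the server URL may be plain HTTP
        adapter = HTTPAdapter(max_retries=retries)
        session.mount('http://', adapter)
        session.mount('https://', adapter)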

        for run in range(0, self.number_of_runs):
            # For each run, new decisions
            response1 = session.post(ENDPOINT_SUBMIT_DECISIONS_TRIAL.format(TASK='task1', TOKEN=self.token, RUN=run), json=[data1[run]]) # ENDPOINT
            if response1.status_code != 200:
                print("POST - Task1 - Status Code {} - Error: {}".format(str(response1.status_code), str(response1.text)))
                return
            else:
                print("POST - Task1 - run {} - Message: {}".format(run, str(response1.text)))

            response2 = session.post(ENDPOINT_SUBMIT_DECISIONS_TRIAL.format(TASK='task2', TOKEN=self.token, RUN=run), json=[data2[run]]) # ENDPOINT
            if response2.status_code != 200:
                print("POST - Task2 - Status Code {} - Error: {}".format(str(response2.status_code), str(response2.text)))
                return
            else:
                print("POST - Task2 - run {} - Message: {}".format(run, str(response2.text)))

            # Save a local copy of what was submitted; create the folders on first use
            round_id = messages[0]["round"]
            for task_name, payload in (("task1", data1[run]), ("task2", data2[run])):
                os.makedirs('./data/preds/{}'.format(task_name), exist_ok=True)
                with open('./data/preds/{}/round{}_run{}.json'.format(task_name, round_id, run), 'w+', encoding='utf8') as json_file:
                    json.dump(payload, json_file, ensure_ascii=False)


    def run_task1_2(self, retries: int, backoff: float):
        """ Main thread
            Args:
              retries (int): number of calls on the server connection
              backoff (float): time between retries
        """
        # Get messages for task1_2
        messages = self.get_messages(retries, backoff)

        # If there are no messages
        if len(messages) == 0:
            print("All rounds processed")
            return

        while len(messages) > 0:
            print(messages)
            print("----------------------- Processing round {}".format(messages[0]["round"]))
            # Save subjects (create the folder on first use)
            os.makedirs('./data/rounds', exist_ok=True)
            with open('./data/rounds/round{}.json'.format(messages[0]["round"]), 'w+', encoding='utf8') as json_file:
                json.dump(messages, json_file, ensure_ascii=False)

            # Calculate emissions for each prediction
            self.tracker.start()

            # Your code

            emissions = self.tracker.stop()
            df = pd.read_csv("emissions.csv")
            measurements = df.iloc[-1][self.relevant_cols].to_dict()

            self.submit_decission(messages, measurements, retries, backoff)

            # One GET request for each round
            messages = self.get_messages(retries, backoff)

        print("All rounds processed")
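The class above sends random decisions. When you plug in your own system, the payload for each run keeps the same shape: a `predictions` dictionary keyed by `nick`, an extra `types` dictionary for task 2, and the `emissions` measurements. The sketch below shows one way to build it from prediction functions. `build_payload`, the sample messages and the sample emissions values are hypothetical, and only the `nick` and `round` message fields that this notebook uses are assumed.

```python
import json
from typing import Callable, Dict, List, Optional


def build_payload(messages: List[Dict],
                  predict: Callable[[Dict], int],
                  emissions: Dict,
                  predict_type: Optional[Callable[[Dict], str]] = None) -> Dict:
    """Build one run's payload: task 1 when predict_type is None, task 2 otherwise."""
    payload = {"predictions": {m["nick"]: predict(m) for m in messages}}
    if predict_type is not None:
        payload["types"] = {m["nick"]: predict_type(m) for m in messages}
    payload["emissions"] = emissions
    return payload


# Hypothetical round with two subjects; real messages come from get_messages().
messages = [{"nick": "subject1", "round": 1}, {"nick": "subject2", "round": 1}]
emissions = {"duration": 0.5, "emissions": 1e-6}

task1 = build_payload(messages, lambda m: 0, emissions)
task2 = build_payload(messages, lambda m: 1, emissions, lambda m: "betting")
print(json.dumps(task2))
```

Passing the predictors as functions keeps the payload layout in one place, so each run can swap in a different model without repeating the dictionary construction.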

# Main

In [9]:
def download_data(task: str, token: str):
    # download_messages_trial(task, token)
    download_messages_train(task, token)

def get_post_data(task: str, token: str):
    # Emissions Tracker Config
    config = {
        "save_to_file": True,
        "log_level": "WARNING",
        "tracking_mode": "process",
        "output_dir": ".",
        "allow_multiple_runs": True
    }
    tracker = EmissionsTracker(**config)

    number_runs = 3 # Max: 3

    # Prediction period
    client_task1_2 = Client_task1_2(task, token, number_runs, tracker)
    client_task1_2.run_task1_2(5, 0.1)

Be careful! In this specific example the GET request uses the task1 name, since the data is the same for task 1 and task 2, while the predictions are uploaded for both tasks.

In [10]:
if __name__ == '__main__':
    download_data("task2", TOKEN)
    # get_post_data("task1",TOKEN)
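With `save_to_file` enabled, the tracker appends one row per tracked run to `emissions.csv`, and `run_task1_2` reads the last row with pandas. As a sketch of the same idea using only the standard library (a swapped-in technique, not the notebook's method), the helper below returns selected columns of the last row. `last_measurements` and the sample rows are hypothetical, and unlike the pandas version every value comes back as a string.

```python
import csv
import os
import tempfile
from typing import Dict, List


def last_measurements(csv_path: str, columns: List[str]) -> Dict[str, str]:
    """Return the selected columns of the last row of a CSV file, as strings."""
    with open(csv_path, newline="", encoding="utf8") as handle:
        rows = list(csv.DictReader(handle))
    if not rows:
        raise ValueError("no measurements in " + csv_path)
    return {name: rows[-1][name] for name in columns}


# Two made-up rows standing in for two tracked rounds.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "emissions.csv")
    with open(path, "w", newline="", encoding="utf8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["duration", "emissions", "cpu_count"])
        writer.writerow(["1.5", "0.001", "2"])
        writer.writerow(["2.5", "0.002", "2"])
    latest = last_measurements(path, ["duration", "emissions"])
```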

In [15]:
!pip install groq

Collecting groq
  Downloading groq-0.22.0-py3-none-any.whl.metadata (15 kB)
Downloading groq-0.22.0-py3-none-any.whl (126 kB)
Installing collected packages: groq
Successfully installed groq-0.22.0


In [17]:
from groq import Groq
import re

In [21]:
class TASK_LLM:
    """ Client communicating with the official server.
        Attributes:
            token (str): authentication token
            number_of_runs (int): number of systems. Must be 3 in order to advance to the next round.
            tracker (EmissionsTracker): object to calculate the carbon footprint in prediction

    """
    def __init__(self, task:str, token: str, number_of_runs: int, tracker: EmissionsTracker):
        self.task = task
        self.token = token
        self.number_of_runs = number_of_runs
        self.tracker = tracker
        self.relevant_cols = ['duration', 'emissions', 'cpu_energy', 'gpu_energy',
                              'ram_energy','energy_consumed', 'cpu_count', 'gpu_count',
                              'cpu_model', 'gpu_model', 'ram_total_size','country_iso_code']


    def get_messages(self, retries: int, backoff: float) -> List[Dict]:
        """ Allows you to download the test data of the task by rounds.
            Here a GET request is sent to the server to extract the data.
            Args:
              retries (int): maximum number of retries per request
              backoff (float): backoff factor applied between retries
            Returns:
              List[Dict]: messages of the current round, or an empty list on error
        """
        session = requests.Session()
        retries = Retry(
                        total = retries,
                        backoff_factor = backoff,
                        status_forcelist = [500, 502, 503, 504]
                        )
        # Mount the retry adapter for both schemes: the server URL may be plain HTTP
        adapter = HTTPAdapter(max_retries=retries)
        session.mount('http://', adapter)
        session.mount('https://', adapter)

        response = session.get(ENDPOINT_GET_MESSAGES_TRIAL.format(TASK=self.task, TOKEN=self.token)) # ENDPOINT

        if response.status_code != 200:
            print("GET - Task {} - Status Code {} - Error: {}".format(self.task, str(response.status_code), str(response.text)))
            return []
        else:
            return json.loads(response.content)

    def submit_decission(self, messages: List[Dict], emissions: Dict, retries: int, backoff: float):
        """ Allows you to submit the decisions of the task by rounds.
            The POST requests are sent to the server to send predictions and carbon emission data
            Args:
              messages (List[Dict]): Message set of the current round
              emissions (Dict): carbon footprint generated in the prediction
              retries (int): number of calls on the server connection
              backoff (float): time between retries
        """
        decisions_run0 = {}
        decisions_run1 = {}
        decisions_run2 = {}
        type_addiction_list = ["betting", "onlinegaming", "trading"]
        type_addiction_decision = {}

        # You must create the appropriate structure to send the predictions according to each task
        for message in messages:
            decisions_run0[message["nick"]] = random.choice([0,1])
            decisions_run1[message["nick"]] = random.choice([0,1])
            decisions_run2[message["nick"]] = random.choice([0,1])
            type_addiction_decision[message["nick"]] = random.choice(type_addiction_list)

        data1_run0 = {
            "predictions": decisions_run0,
            "emissions": emissions
        }
        data1_run1 = {
            "predictions": decisions_run1,
            "emissions": emissions
        }
        data1_run2 = {
            "predictions": decisions_run2,
            "emissions": emissions
        }
        data2_run0 = {
            "predictions": decisions_run0,
            "types":type_addiction_decision,
            "emissions": emissions
        }
        data2_run1 = {
            "predictions": decisions_run1,
            "types":type_addiction_decision,
            "emissions": emissions
        }
        data2_run2 = {
            "predictions": decisions_run2,
            "types":type_addiction_decision,
            "emissions": emissions
        }

        data1 = []
        data1.append(json.dumps(data1_run0))
        data1.append(json.dumps(data1_run1))
        data1.append(json.dumps(data1_run2))

        data2 = []
        data2.append(json.dumps(data2_run0))
        data2.append(json.dumps(data2_run1))
        data2.append(json.dumps(data2_run2))

        # Session to POST request
        session = requests.Session()
        retries = Retry(
                        total = retries,
                        backoff_factor = backoff,
                        status_forcelist = [500, 502, 503, 504]
                        )
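        # Mount the retry adapter for both schemes: the server URL may be plain HTTP
        adapter = HTTPAdapter(max_retries=retries)
        session.mount('http://', adapter)
        session.mount('https://', adapter)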

        for run in range(0, self.number_of_runs):
            # For each run, new decisions
            response1 = session.post(ENDPOINT_SUBMIT_DECISIONS_TRIAL.format(TASK='task1', TOKEN=self.token, RUN=run), json=[data1[run]]) # ENDPOINT
            if response1.status_code != 200:
                print("POST - Task1 - Status Code {} - Error: {}".format(str(response1.status_code), str(response1.text)))
                return
            else:
                print("POST - Task1 - run {} - Message: {}".format(run, str(response1.text)))

            response2 = session.post(ENDPOINT_SUBMIT_DECISIONS_TRIAL.format(TASK='task2', TOKEN=self.token, RUN=run), json=[data2[run]]) # ENDPOINT
            if response2.status_code != 200:
                print("POST - Task2 - Status Code {} - Error: {}".format(str(response2.status_code), str(response2.text)))
                return
            else:
                print("POST - Task2 - run {} - Message: {}".format(run, str(response2.text)))

            # Save a local copy of what was submitted; create the folders on first use
            round_id = messages[0]["round"]
            for task_name, payload in (("task1", data1[run]), ("task2", data2[run])):
                os.makedirs('./data/preds/{}'.format(task_name), exist_ok=True)
                with open('./data/preds/{}/round{}_run{}.json'.format(task_name, round_id, run), 'w+', encoding='utf8') as json_file:
                    json.dump(payload, json_file, ensure_ascii=False)


    def run_task1_2(self, retries: int, backoff: float):
        """ Main thread
            Args:
              retries (int): number of calls on the server connection
              backoff (float): time between retries
        """
        # Get messages for task1_2
        messages = self.get_messages(retries, backoff)

        # If there are no messages
        if len(messages) == 0:
            print("All rounds processed")
            return

        while len(messages) > 0:
            print(messages)
            print("----------------------- Processing round {}".format(messages[0]["round"]))
            # Save subjects (create the folder on first use)
            os.makedirs('./data/rounds', exist_ok=True)
            with open('./data/rounds/round{}.json'.format(messages[0]["round"]), 'w+', encoding='utf8') as json_file:
                json.dump(messages, json_file, ensure_ascii=False)

            # Calculate emissions for each prediction
            self.tracker.start()

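            # Your code
            # NOTE: the values returned here are not used yet; submit_decission
            # still sends random decisions until it is adapted to accept them.
            prediction, addiction_type = self.predict_with_llama(messages)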

            emissions = self.tracker.stop()
            df = pd.read_csv("emissions.csv")
            measurements = df.iloc[-1][self.relevant_cols].to_dict()

            self.submit_decission(messages, measurements, retries, backoff)

            # One GET request for each round
            messages = self.get_messages(retries, backoff)

        print("All rounds processed")

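    def predict_with_llama(self, messages):
        """ Sends the messages of the round to a Llama model through Groq and parses the reply.
            The prompt (in Spanish) asks for 1 if there are signs of gambling addiction,
            followed by the type on the next line, or 0 otherwise.
            Args:
              messages (List[Dict]): Message set of the current round
        """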

        client = Groq(
            api_key= os.getenv("GROQ_API_KEY"),
        )
        # print ("messages", messages)

        prompt = f"""
        A continuación se te dará una serie de mensajes escritos por una persona. Tu tarea es analizar el contenido y determinar si hay indicios de que la persona podría estar mostrando
        comportamientos relacionados con la ludopatía (adicción al juego).

        Tu respuesta debe ser exclusivamente un número:
        - 1 si hay indicios de ludopatía. A continuación, en la siguiente línea, pon el tipo de entre: 'betting', 'onlinegaming', 'trading'
        - 0 si no los hay.

        Ahora evalúa dados los siguientes mensajes:
        {messages}
        """

        chat_completion = client.chat.completions.create(
            messages=[
                {
                    "role": "user",
                    "content": prompt,
                }
            ],
            model="llama3-70b-8192",
        )

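        response = chat_completion.choices[0].message.content
        # Expect a number, optionally followed by the addiction type
        string_match = re.match(r"\s*(\d+)\s*(.*)", response)
        if string_match is None:
            # Always return a (decision, type) pair so that the caller can unpack it
            return 0, None

        number = string_match.group(1)
        type_addiction = string_match.group(2) or None

        return int(number), type_addiction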
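Language models do not always follow the requested output format, so it helps to validate the reply before using it. The sketch below is a stricter, standalone variant of the parsing step. It accepts only 0 or 1, keeps the type only when it is one of the labels listed in the prompt, and always returns a pair. `parse_reply` and `ALLOWED_TYPES` are hypothetical names introduced for illustration.

```python
import re
from typing import Optional, Tuple

# Labels listed in the prompt above.
ALLOWED_TYPES = ("betting", "onlinegaming", "trading")


def parse_reply(reply: str) -> Tuple[int, Optional[str]]:
    """Parse '<0|1>' optionally followed by a type; fall back to (0, None)."""
    match = re.match(r"\s*([01])\b\s*(.*)", reply, flags=re.DOTALL)
    if match is None:
        return 0, None
    if match.group(1) == "0":
        return 0, None
    # Keep the first allowed label that appears in the text after the number.
    label = re.search("|".join(ALLOWED_TYPES), match.group(2).lower())
    return 1, (label.group(0) if label else None)
```

For example, `parse_reply("1\nbetting")` gives `(1, "betting")`, while a free-text reply that does not start with 0 or 1 falls back to `(0, None)`.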

In [24]:
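# The first argument fills {TASK} in the endpoint URL. "LLM" is not one of the
# task names used elsewhere in this notebook ("task1", "task2"), which most
# likely explains the 404 in the output.
t1 = TASK_LLM("LLM", os.getenv("ACCESS_TOKEN"), 3, EmissionsTracker("a"))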
t1.run_task1_2(1,1.5)

[codecarbon INFO @ 23:39:26] [setup] RAM Tracking...
[codecarbon INFO @ 23:39:26] [setup] CPU Tracking...
 Linux OS detected: Please ensure RAPL files exist at \sys\class\powercap\intel-rapl to measure CPU

[codecarbon INFO @ 23:39:27] CPU Model on constant consumption mode: Intel(R) Xeon(R) CPU @ 2.20GHz
[codecarbon INFO @ 23:39:27] [setup] GPU Tracking...
[codecarbon INFO @ 23:39:27] No GPU found.
[codecarbon INFO @ 23:39:27] >>> Tracker's metadata:
[codecarbon INFO @ 23:39:27]   Platform system: Linux-6.1.85+-x86_64-with-glibc2.35
[codecarbon INFO @ 23:39:27]   Python version: 3.11.11
[codecarbon INFO @ 23:39:27]   CodeCarbon version: 2.8.3
[codecarbon INFO @ 23:39:27]   Available RAM : 12.675 GB
[codecarbon INFO @ 23:39:27]   CPU count: 2
[codecarbon INFO @ 23:39:27]   CPU model: Intel(R) Xeon(R) CPU @ 2.20GHz
[codecarbon INFO @ 23:39:27]   GPU count: None
[codecarbon INFO @ 23:39:27]   GPU model: None
[codecarbon INFO @ 23:39:27] Saving emissions data to file /content/emissions.cs

GET - Task LLM - Status Code 404 - Error: {"detail":"Not Found"}
All rounds processed
