# Analysing the proportions in the finetuning dataset

## Libraries

In [23]:
from pathlib import Path
import pandas as pd 
from rich.console import Console
from rich.table import Table
from rich.panel import Panel
from statsmodels.stats.proportion import proportion_confint
from utils.chain_of_thought import CoTColumnBuilder
from utils.experiments_construction import FinetuneDatasetBuilder

## Global variables

In [24]:
DATA_PATH       = Path('../data/finetuning_dataset')
TRAIN_PATH      = DATA_PATH / 'finetuning_train.parquet'
TEST_PATH       = DATA_PATH / 'finetuning_test.parquet'
AUTO_AGREEMENT_PATH  = Path('../data/data_postchecking') / 'auto_agreement_checking.csv'

console = Console()

## Utils

In [25]:
def dataset_summary(df, name):
    """Collect basic statistics about a dataset for the summary table."""
    return {
        "Dataset": name,
        "Rows": len(df),
        "Toxicity": int(df["annotator's conclusion"].sum()),
        "Number of columns": len(df.columns),
        "List of the columns": list(df.columns),
        "Missing values": int(df.isnull().sum().sum())
    }

def print_examples(df, label, n=3):
    """Print up to `n` randomly sampled messages with the given label (1 = toxic, 0 = non-toxic)."""
    subset = df[df["annotator's conclusion"] == label]
    examples = subset.sample(n=min(n, len(subset)), random_state=42)
    for idx, row in examples.iterrows():
        console.print(
            Panel(
                row['content'],
                title=f"{'Toxic' if label else 'Non-Toxic'} Example",
                subtitle=f"Index: {idx}",
                style="bold red" if label else "bold green"
            )
        )

## Load datasets

In [26]:
df_train    = pd.read_parquet(TRAIN_PATH)
df_test     = pd.read_parquet(TEST_PATH)

summaries = [
    dataset_summary(df_train, "Train"),
    dataset_summary(df_test, "Test")
]

table = Table(title="Dataset Summary")
table.add_column("Dataset", style="cyan", no_wrap=True)
table.add_column("Rows", justify="right")
table.add_column("Toxicity", justify="right")
table.add_column("Number of columns", justify="right")
table.add_column("List of the columns", justify="right")
table.add_column("Missing values", justify="right")

for summary in summaries:
    table.add_row(
        summary["Dataset"],
        str(summary["Rows"]),
        f'{summary["Toxicity"]}/{summary["Rows"]} ({summary["Toxicity"] / summary["Rows"]:.2%})',
        str(summary["Number of columns"]),
        ", ".join(summary["List of the columns"]),
        str(summary["Missing values"])
    )

console.print(table)

## Print some examples

In [30]:
console.rule("[bold yellow]Toxic Examples (Train)")
print_examples(df_train, label=1, n=5)

console.rule("[bold yellow]Non-Toxic Examples (Train)")
print_examples(df_train, label=0, n=5)

In [34]:
console.rule("[bold yellow]Toxic Examples (Test)")
print_examples(df_test, label=1, n=3)

console.rule("[bold yellow]Non-Toxic Examples (Test)")
print_examples(df_test, label=0, n=3)

## Analysis of the *human-error* rate

**Toxicity is not well defined; each human has their own perspective on it. There are two main sources of error:**

* A human may be **uncertain** about the toxicity of some content.
* Different humans may have **different definitions** of toxicity.

---

### **Notations:**

Let $\Sigma$ be a finite alphabet, and let $\Sigma^*$ denote the set of all finite strings over $\Sigma$.
Let $L \subseteq \Sigma^*$ be the set of all texts in a given language (e.g., the French language).

We postulate the existence of a **true toxicity function**:

$$
T : L \to \{0,1\}
$$

This represents an objective notion of toxicity (possibly hypothetical or latent).

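However, each human $h \in \mathcal{H}$ has their own **subjective definition** of toxicity, represented by a function:

$$
T_h : L \to [0,1]
$$

where $T_h(s)$ is the probability that $h$ labels $s$ as toxic (a value of $0$ or $1$ means that $h$ never hesitates on $s$). This function is deterministic for each individual human but varies across the population.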

---

### **Uncertainty in Individual Judgment**

When human $h$ is asked to annotate a text $s \in L$, their response is not deterministic, owing to internal uncertainty, mood, context, etc. We therefore model their **observed response** as a Bernoulli random variable:

$$
\hat{T}_h(s) \in \{0,1\}
$$

with:

$$
\mathbb{E}[\hat{T}_h(s)] = T_h(s)
$$

That is, their annotation is a random sample with expectation equal to their internal definition.

If we take repeated, independent annotations from the same human on the same sample (e.g., collected over time), the **law of large numbers** applies:

$$
\frac{1}{n} \sum_{i=1}^n \hat{T}_h^{(i)}(s) \xrightarrow[n \to \infty]{\text{a.s.}} T_h(s)
$$
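
The convergence above can be illustrated with a small simulation. This is only a sketch: the value `p = 0.7` is an assumed expected response $\mathbb{E}[\hat{T}_h(s)]$ for one annotator and one text, chosen for illustration, and the annotations are assumed to be independent.

```python
import random

random.seed(0)

p = 0.7  # assumed value of E[T̂_h(s)] for one annotator h and one text s (illustrative)
checkpoints = (10, 100, 1_000, 10_000)

total = 0
running_means = {}
for n in range(1, max(checkpoints) + 1):
    # One simulated annotation: an independent Bernoulli(p) draw in {0, 1}
    total += 1 if random.random() < p else 0
    if n in checkpoints:
        running_means[n] = total / n

for n, mean in running_means.items():
    print(f"n = {n:>6}: empirical mean = {mean:.3f}")
```

As `n` grows, the empirical mean should settle close to `p`, which is what the law of large numbers predicts for this model.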

---

### **Inter-Human Variation**

Since humans have diverse backgrounds and norms, the mapping:

$$
h \mapsto T_h
$$

can itself be viewed as a **random function** drawn from a distribution over annotators. Then for any $s \in L$, the expected value of $T_h(s)$ across the human population is:

$$
\bar{T}(s) := \mathbb{E}_{h \sim \mathcal{H}}[T_h(s)] \in [0,1]
$$

which defines a **population-level expected toxicity**: a probabilistic consensus that approximates the “true” toxicity $T(s)$.
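
A minimal sketch of how this population-level quantity could be estimated from one annotation per annotator. The population is hypothetical: each annotator's value $T_h(s)$ is assumed to be drawn from a Beta(2, 5) distribution, purely for illustration, and the helper names are not part of the project code.

```python
import random

random.seed(1)

def sample_annotator_value() -> float:
    """Draw T_h(s) for a random annotator h (hypothetical Beta(2, 5) population)."""
    return random.betavariate(2, 5)

def annotate(t_h: float) -> int:
    """One observed annotation: a Bernoulli draw with mean t_h."""
    return 1 if random.random() < t_h else 0

m = 20_000  # number of annotators sampled, one annotation each
votes = [annotate(sample_annotator_value()) for _ in range(m)]
estimate = sum(votes) / m

print(f"estimated population-level toxicity: {estimate:.3f}")
print(f"mean of Beta(2, 5):                  {2 / 7:.3f}")
```

Averaging over annotators mixes both sources of error (individual uncertainty and inter-human variation), so a single annotation per annotator is enough to estimate the population mean under these assumptions.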

--- 

In fact, we can define $T_{\mathcal P}(s)$ directly as the *toxicity* of a culture as being the limit of toxicities by taking $\mathcal P$ instead of $\mathcal H$, therefore we show that there are different scales to talk about toxicity.

In [5]:
# === Load and preprocess data ===
df_auto_agreement = pd.read_csv(AUTO_AGREEMENT_PATH, encoding='utf-8')
first_annotation = "annotator's conclusion"
second_annotation = "2nd annotation score"

In [20]:
# === Wilson CI Computation ===
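def compute_stats(count, total):
    """Return the count, its percentage of `total`, and the Wilson CI bounds (in %)."""
    if total == 0:
        raise ValueError("total must be positive to compute a proportion")
    percent = count / total
    # proportion_confint uses alpha=0.05 by default, i.e. a 95% interval
    ci_low, ci_high = proportion_confint(count, total, method='wilson')
    return {
        "count": count,
        "percent": percent * 100,
        "ci_low": ci_low * 100,
        "ci_high": ci_high * 100
    }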

# === Format function for table output ===
def format_stat(s):
    return f"{s['count']} ( {s['percent']:.2f}% [{s['ci_low']:.2f}–{s['ci_high']:.2f}%] )"
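
For reference, the Wilson score interval requested above has a closed form. The sketch below recomputes it with the standard library only; `wilson_interval` is a hypothetical helper written for illustration, and its result is expected to agree with `proportion_confint(..., method='wilson')` up to floating-point error.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(count: int, total: int, alpha: float = 0.05) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (closed form)."""
    if total <= 0:
        raise ValueError("total must be positive")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided standard normal quantile
    p_hat = count / total
    denom = 1 + z**2 / total
    center = (p_hat + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width

low, high = wilson_interval(50, 100)
print(f"50/100 -> [{low:.4f}, {high:.4f}]")
```

Unlike the plain normal approximation, the interval is centred on a shrunken estimate rather than on `count / total`, which keeps it well behaved for small samples and for proportions near 0 or 1.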

In [21]:
df_first_toxic = df_auto_agreement.loc[df_auto_agreement[first_annotation] == 1, second_annotation]
df_first_non_toxic = df_auto_agreement.loc[df_auto_agreement[first_annotation] == 0, second_annotation]

# Each row of the table: (label, predicate applied to the 2nd-annotation scores)
rows = [
    ("Merged Toxic",     lambda s: s > 0.5),
    ("Yes",              lambda s: s == 1),
    ("Maybe yes",        lambda s: s == 0.75),
    ("Merged Non-Toxic", lambda s: s < 0.5),
    ("Maybe no",         lambda s: s == 0.25),
    ("No",               lambda s: s == 0),
]

table = Table(title="Auto Agreement Checking Summary")
table.add_column("2nd Annotation Choice", style="cyan", no_wrap=True)
table.add_column("First annotated as toxic", justify="right")
table.add_column("First annotated as non-toxic", justify="right")

for label, predicate in rows:
    table.add_row(
        label,
        *(
            format_stat(compute_stats(int(predicate(scores).sum()), len(scores)))
            for scores in (df_first_toxic, df_first_non_toxic)
        )
    )

console.print(table)

## Create several experiments 

We will try several fine-tuning configurations. Each one is identified by concatenating the letters below (e.g., `odal`):
- (r) Random order of the dataset 
    - (e) Same proportion of toxic and non-toxic content (smaller dataset) 
        - (a) With CoT finetuning
        - (b) Without CoT finetuning
    - (d) Different proportion of toxic and non-toxic content (bigger dataset)
        - (a) With CoT finetuning
            - (s) small sample (100)
            - (m) medium sample (1000)
            - (l) large sample (all)
        - (b) Without CoT finetuning
            - (s) small sample (100)
            - (m) medium sample (1000)
            - (l) large sample (all)
- (o) Ordered dataset (Curriculum learning)
    - (e) Same proportion of toxic and non-toxic content (smaller dataset) 
        - (a) With CoT finetuning
        - (b) Without CoT finetuning
    - (d) Different proportion of toxic and non-toxic content (bigger dataset)
        - (a) With CoT finetuning
            - (s) small sample (100)
            - (m) medium sample (1000)
            - (l) large sample (all)
        - (b) Without CoT finetuning
            - (s) small sample (100)
            - (m) medium sample (1000)
            - (l) large sample (all)
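
The letter codes above combine into four-letter split names. The helper below is a hypothetical sketch for decoding such names; it assumes that every combination of letters is generated, which is what the list of loaded experiments later in this notebook suggests.

```python
from itertools import product

# Letter codes taken from the list above, one dictionary per position in the name
CODES = [
    {"r": "random order", "o": "ordered (curriculum)"},
    {"e": "equal toxic/non-toxic proportion", "d": "different proportion"},
    {"a": "with CoT", "b": "without CoT"},
    {"s": "small sample", "m": "medium sample", "l": "large sample"},
]
FIELDS = ("order", "proportion", "cot", "size")

def decode_split_name(name: str) -> dict[str, str]:
    """Translate a four-letter split name such as 'odal' into readable settings."""
    if len(name) != len(CODES):
        raise ValueError(f"expected {len(CODES)} letters, got {name!r}")
    decoded = {}
    for field, letter, options in zip(FIELDS, name, CODES):
        if letter not in options:
            raise ValueError(f"unknown letter {letter!r} for {field}")
        decoded[field] = options[letter]
    return decoded

# All names obtainable by combining one letter per position
all_names = ["".join(letters) for letters in product(*CODES)]

print(len(all_names))
print(decode_split_name("odal"))
```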

We could choose an *automatic curriculum learning* approach, such as those presented in [A Survey on Curriculum Learning](https://ieeexplore.ieee.org/abstract/document/9392296), which has the advantage of taking the model's feedback into account. However, we choose a *predefined* curriculum instead. It relies on the grades that GPT assigned to the dataset, together with statistics on the proportion of toxic content per grade and on the agreement between humans and GPT, which serves as a measure of the difficulty of the task.
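
To make the idea of a predefined curriculum concrete, here is a minimal sketch that orders examples from "easy" to "hard" using the human/GPT agreement rate per grade. It is not the logic of `FinetuneDatasetBuilder`: the field names `gpt_grade`, `human_label` and `gpt_label`, as well as the toy records, are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical records: `gpt_grade`, `human_label` and `gpt_label` are assumed field names
records = [
    {"msg_id": "m1", "gpt_grade": 0, "human_label": 0, "gpt_label": 0},
    {"msg_id": "m2", "gpt_grade": 0, "human_label": 0, "gpt_label": 0},
    {"msg_id": "m3", "gpt_grade": 5, "human_label": 1, "gpt_label": 0},
    {"msg_id": "m4", "gpt_grade": 5, "human_label": 0, "gpt_label": 0},
    {"msg_id": "m5", "gpt_grade": 9, "human_label": 1, "gpt_label": 1},
    {"msg_id": "m6", "gpt_grade": 9, "human_label": 1, "gpt_label": 1},
    {"msg_id": "m7", "gpt_grade": 9, "human_label": 0, "gpt_label": 1},
]

# Agreement rate between humans and GPT for each grade
agree, count = defaultdict(int), defaultdict(int)
for r in records:
    count[r["gpt_grade"]] += 1
    agree[r["gpt_grade"]] += int(r["human_label"] == r["gpt_label"])
agreement_rate = {g: agree[g] / count[g] for g in count}

# Predefined curriculum: grades with the highest agreement (easiest) come first.
# sorted() is stable, so the original order is preserved within a grade.
curriculum = sorted(records, key=lambda r: -agreement_rate[r["gpt_grade"]])

print([r["msg_id"] for r in curriculum])
```

The ordering is fixed before training starts, which is what distinguishes a predefined curriculum from an automatic one that would reorder examples based on the model's feedback.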

In [5]:
df_train = CoTColumnBuilder(df_train).add_cot_column()
df_test = CoTColumnBuilder(df_test).add_cot_column()

In [6]:
print(df_train['cot_text'].iloc[0])

<think>
Explication :
- Sujet : Imagination de dialogues par une personne.
- Sens probable : L'auteur du message critique une personne qui crée des dialogues fictifs ou déformés, suggérant un manque de crédibilité ou une tendance à embellir la réalité.
</think>
<think>
Tons :
Tons perçus : Critique, Moqueur.

Justification : Le terme « s'invente » laisse entendre un jugement négatif sur l'authenticité, tandis que l'expression « lignes de dialogues » peut suggérer une moquerie envers l'inventivité de la personne. Doutes : le ton peut varier selon le contexte.
</think>
<think>
Intentions :
**Intentions principales :**

1. **Critiquer** (certitude élevée)  
   Justification : Le choix des mots « s'invente » indique un jugement négatif sur l'authenticité et remet en question la crédibilité de la personne en cause.

2. **Se moquer** (certitude élevée)  
   Justification : L'expression « lignes de dialogues » attire une attention moqueuse et dévalorisante sur la nature fictive ou embellie de

In [7]:
df_train.to_parquet(TRAIN_PATH, index=False)
df_test.to_parquet(TEST_PATH, index=False)

In [8]:
builder = FinetuneDatasetBuilder(df_train)
all_splits = builder.build_all()

keep_columns = [
    "msg_id", 
    "content", 
    "cot_text",
    "literal_conclusion_annotator",
]

for split_name, split_df in all_splits.items():
    # One sub-directory per letter of the split name, e.g. "odal" -> o/d/a/l
    path_output = DATA_PATH.joinpath(*split_name)
    path_output.mkdir(parents=True, exist_ok=True)
    split_df = split_df[[col for col in keep_columns if col in split_df.columns]]
    split_df.to_csv(path_output / f"train_{split_name}.csv", encoding='utf-8', index=False)

table = Table(title="Finetuning Dataset Splits", show_lines=True)
table.add_column("Split Name", style="cyan", no_wrap=True)
table.add_column("Rows", justify="right")
table.add_column("Toxicity distribution", justify="center")

from itertools import product

# Iterate over every combination: order (random | ordered), proportion (equal | different),
# CoT (with | without), size (small | medium | large)
for order, prop, cot, size in product("ro", "ed", "ab", "sml"):
    split_name = f"{order}{prop}{cot}{size}"
    split_df = all_splits[split_name]
    toxicity_count = (split_df["literal_conclusion_annotator"] == "oui").sum()
    total_count = len(split_df)
    toxicity_distribution = f"{toxicity_count}/{total_count} ({toxicity_count / total_count:.2%})"
    table.add_row(
        split_name,
        str(total_count),
        toxicity_distribution
    )
console.print(table)

## Looking at the `Dataset` format 

Each dataset has the following columns:

| Column                         | Type  | Description                                                                   |
| ------------------------------ | ----- | ----------------------------------------------------------------------------- |
| `msg_id`                       | `str` | Unique ID for the message (anonymized)                                        |
| `content`                      | `str` | Raw message text to analyze (possibly toxic)                                  |
| `cot_text`                     | `str` | Rich, multi-paragraph **Chain-of-Thought analysis** (in Markdown-like format) |
| `literal_conclusion_annotator` | `str` | Final binary decision by annotator (e.g., `"oui"` or `"non"`)                 |
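
A row of such a dataset can be checked against this schema with a few lines of plain Python. The sketch below is illustrative only: `validate_row` is a hypothetical helper, the example row is made up, and the set of allowed labels assumes that `"oui"` and `"non"` are the only two values.

```python
EXPECTED_COLUMNS = ("msg_id", "content", "cot_text", "literal_conclusion_annotator")
ALLOWED_LABELS = {"oui", "non"}  # assumption: the label is strictly binary

def validate_row(row: dict) -> list[str]:
    """Return a list of problems found in one dataset row (empty if the row is valid)."""
    problems = []
    for column in EXPECTED_COLUMNS:
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], str):
            problems.append(f"{column} should be a str, got {type(row[column]).__name__}")
    label = row.get("literal_conclusion_annotator")
    if isinstance(label, str) and label not in ALLOWED_LABELS:
        problems.append(f"unexpected label: {label!r}")
    return problems

# Made-up row used only to exercise the check
example = {
    "msg_id": "anon_msg_000000000000",
    "content": "Bonjour tout le monde",
    "cot_text": "<think>\nExplication : ...\n</think>",
    "literal_conclusion_annotator": "non",
}
print(validate_row(example))
```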

In [11]:
path_of_csv = DATA_PATH / 'o' / 'd' / 'a' / 'l' / 'train_odal.csv'

In [12]:
import os
from pathlib import Path

from datasets import DatasetDict, load_dataset
from huggingface_hub import HfApi, login

In [13]:
# Route HTTP(S) traffic through a local SOCKS5 proxy
proxy_address = "socks5h://127.0.0.1:1080"
os.environ["HTTP_PROXY"] = proxy_address
os.environ["HTTPS_PROXY"] = proxy_address

In [14]:
login(token=Path('../data/confidential/hf_token.txt').read_text().strip())

In [15]:
test_path = DATA_PATH / "finetuning_test.parquet"

# Load shared test set once
shared_test = load_dataset("parquet", data_files=str(test_path))["train"]

# Prepare experiment datasets
experiments = {}

# Search for train_*.csv recursively
for csv_path in DATA_PATH.rglob("train_*.csv"):
    experiment_name = csv_path.stem.replace("train_", "")  # e.g. "odal"
    train_dataset = load_dataset("csv", data_files=str(csv_path))["train"]

    # Create DatasetDict with shared test
    experiments[experiment_name] = DatasetDict({
        "train": train_dataset,
        "test": shared_test
    })

# ✅ Example: print available experiments
print("Loaded experiments:", list(experiments.keys()))
print("Sample from 'odal':", experiments["odal"]["train"][0])


Generating train split: 0 examples [00:00, ? examples/s]

Loaded experiments: ['oebm', 'oebs', 'oebl', 'oeam', 'oeas', 'oeal', 'odbm', 'odbs', 'odbl', 'odam', 'odas', 'odal', 'rebm', 'rebs', 'rebl', 'ream', 'reas', 'real', 'rdbm', 'rdbs', 'rdbl', 'rdam', 'rdas', 'rdal']
Sample from 'odal': {'msg_id': 'anon_msg_cdfe18ce3ff1', 'content': 'Faut les crever ces SALES JUIFS DE MERDE 😂', 'cot_text': '<think>\nExplication :\n**Sujet du message** : Attitude hostile envers un groupe particulier.\n\n**Sens probable** : L\'auteur exprime des sentiments de haine ou de mépris à l\'égard des Juifs, utilisant un langage vulgaire et provocateur. \n\n**Explication** : Le terme "SALES JUIFS" représente une insulte stéréotypée envers les Juifs, et "crever" signifie faire souffrir ou tuer. Utilisation de symboles d’amusement, ici un emoji rieur, renforce le caractère offensant du propos.\n</think>\n<think>\nTons :\nTons perçus : \n\n1. **Hostile** (certitude élevée) : L\'auteur manifeste une haine claire envers un groupe religieux/ethnique, avec des insultes dire

In [16]:
cols_to_keep = ['msg_id', 'content', 'cot_text', 'literal_conclusion_annotator']

# Trim the shared test split to the columns kept in the training splits
trimmed_test = shared_test.remove_columns(
    [col for col in shared_test.column_names if col not in cols_to_keep]
)

# Then build a unified DatasetDict: one train split per experiment plus the shared test split
combined = {f"train_{k}": v["train"] for k, v in experiments.items()}
combined["test"] = trimmed_test

combined_ds = DatasetDict(combined)

In [17]:
combined_ds['train_odal']['cot_text'][0]

'<think>\nExplication :\n**Sujet du message** : Attitude hostile envers un groupe particulier.\n\n**Sens probable** : L\'auteur exprime des sentiments de haine ou de mépris à l\'égard des Juifs, utilisant un langage vulgaire et provocateur. \n\n**Explication** : Le terme "SALES JUIFS" représente une insulte stéréotypée envers les Juifs, et "crever" signifie faire souffrir ou tuer. Utilisation de symboles d’amusement, ici un emoji rieur, renforce le caractère offensant du propos.\n</think>\n<think>\nTons :\nTons perçus : \n\n1. **Hostile** (certitude élevée) : L\'auteur manifeste une haine claire envers un groupe religieux/ethnique, avec des insultes directes.\n2. **Provocateur** (certitude élevée) : L\'usage d\'un langage violent et d\'un emoji humoristique vise à choquer et à provoquer une réaction.\n3. **Méprisant** (certitude élevée) : Le langage dégradant dénote un profond mépris pour le groupe visé.\n\nDoutes : Il est possible qu\'une tentative de sarcasme existe, mais le ton géné

In [18]:
combined_ds.push_to_hub("Naela00/ToxiFrenchFinetuning")

Uploading the dataset shards:   0%|          | 0/1 [00:00<?, ?it/s]

Creating parquet from Arrow format:   0%|          | 0/1 [00:00<?, ?ba/s]

README.md:   0%|          | 0.00/8.78k [00:00<?, ?B/s]

CommitInfo(commit_url='https://huggingface.co/datasets/Naela00/ToxiFrenchFinetuning/commit/ec8d665bfa6b8016484917a86880247004adcf2f', commit_message='Upload dataset', commit_description='', oid='ec8d665bfa6b8016484917a86880247004adcf2f', pr_url=None, repo_url=RepoUrl('https://huggingface.co/datasets/Naela00/ToxiFrenchFinetuning', endpoint='https://huggingface.co', repo_type='dataset', repo_id='Naela00/ToxiFrenchFinetuning'), pr_revision=None, pr_num=None)

In [19]:
from datasets import load_dataset

dataset_path    = "Naela00/ToxiFrenchFinetuning"    # Repository ID of the dataset on the Hugging Face Hub
dataset_name    = "rdal"                            # Name of the configuration (experiment) to load

dataset_splits = load_dataset(dataset_path, name=dataset_name, trust_remote_code=True)

README.md:   0%|          | 0.00/8.78k [00:00<?, ?B/s]

ToxiFrenchFinetuning.py:   0%|          | 0.00/2.38k [00:00<?, ?B/s]

train_rdal-00000-of-00001.parquet:   0%|          | 0.00/51.4M [00:00<?, ?B/s]

test-00000-of-00001.parquet:   0%|          | 0.00/1.26M [00:00<?, ?B/s]

Generating train split: 0 examples [00:00, ? examples/s]

Generating test split: 0 examples [00:00, ? examples/s]

In [20]:
dataset_splits['train']['cot_text'][0]

'<think>\nExplication :\n**Sujet du message** : Activités estivales et plans futurs.\n\n**Résumé** : L\'auteur mentionne qu\'il réalise diverses petites activités pendant l\'été, qu\'il alterne entre ces tâches et l\'écriture de son mémoire, et qu\'il sera au chômage à la rentrée.\n\n**Sens probable** : L\'auteur décrit ses occupations actuelles et indique une situation de chômage à venir. \n\nAucune expression ou terme n\'est ambigu dans ce contexte.\n</think>\n<think>\nTons :\nTons perçus : Neutre, Réaliste, Légèrement préoccupé.\n\nAnalyse : Le ton est neutre en décrivant des faits simples. Le terme "chômage" induit une légère préoccupation sur l\'avenir. Aucune ambiguïté perceptible. Certitude élevée.\n</think>\n<think>\nIntentions :\n1. **Informer** (Certitude élevée) : L\'auteur présente des informations sur ses activités estivales, son travail sur son mémoire et sa situation de chômage, apportant des données factuelles sur sa vie professionnelle.\n\n2. **Partager** (Certitude mo

## Test area