# **DIMEMEX ‚Äî Complete Project Pipeline Notebook**

*Multilingual Meme Translation & Hate Speech Analysis*

---

## **Members**

* **B√°rbara** (Text)
* **Amanda** (Text + Description)
* **Juan David Nieto** (Text + Description + Image)
* **Luisa** (Image)

---

## **Project Overview**

This project analyzes whether **offensive or hate speech content in memes is preserved after translating the memes from Spanish to Portuguese**.

We work with the **DIMEMEX** dataset, which contains:

* Meme **text**
* Meme **description**
* Meme **image**
* Labels: *hate speech*, *inappropriate content*, *neither*

This project has **two main tasks**:

### **Task 1 ‚Äî Translation Quality Evaluation**

Translate the Spanish text to Portuguese and evaluate translation quality using standard NLP metrics:

* **BLEURT**
* **BERTScore**
* **COMET-Kiwi**
* **chrF**

### **Task 2 ‚Äî Hate Speech Detection (Multimodal Fine-Tuning)**

Fine-tune models to classify:

* Hate speech
* Inappropriate content
* Neither

We fine-tune under four input settings:

1. Text
2. Text + Description
3. Text + Description + Image (Multimodal)
4. Image

---

## **üìå Objectives**

1. Evaluate whether offensive content is **maintained or lost** during translation.
2. Compare performance of **original Spanish memes vs. translated Portuguese memes**.
3. Train and evaluate multimodal detectors to classify hate speech.
4. Analyze cases where the label changes across languages.
5. Perform **human qualitative analysis** on inconsistent cases.



# Imports

# üõ†Ô∏è 1. Install Dependencies

In [None]:
%pip install numpy==1.24.3  
%pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
%pip install triton==2.0.0
%pip install transformers==4.46.0  
%pip install peft==0.6.2
%pip install accelerate==0.25.0
%pip install pyarrow==12.0.0  
%pip install datasets==2.14.0
%pip install pillow tqdm scikit-learn matplotlib evaluate

print("üîÑ AGORA REINICIE O KERNEL!")

In [28]:
import os, json, warnings, torch
# Importar evaluate separadamente ap√≥s outros imports
import pandas as pd
import numpy as np
from PIL import Image

from sklearn.model_selection import train_test_split
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    classification_report
)

from datasets import Dataset

# Hugging Face - Transformers (Models, Processors, Configs)
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    AutoModelForVision2Seq,
    TrainingArguments,
    pipeline,
    AutoProcessor,
    Trainer,
)

# Hugging Face - PEFT (LoRA)
from peft import (
    LoraConfig,
    get_peft_model,
    PeftModel
)

from tqdm import tqdm
import matplotlib.pyplot as plt

# Importar evaluate por √∫ltimo
try:
    import evaluate
except ImportError:
    print("Warning: evaluate package not available")

warnings.filterwarnings("ignore")

# üóÇÔ∏è 2. Upload Data

In [2]:

# Montar Drive
# drive.mount('/content/drive')

# === Caminhos dos CSVs (ajuste se preferir) ===
CSV_TRAIN = "../train/dados_espanhol_balanceado.csv"
CSV_VAL   = "../validation/dados_espanhol.csv"
CSV_TEST  = "../test/dados_espanhol.csv"
# === Caminhos das pastas de imagens ===
TRAIN_IMAGES_DIR = "train_images"
VAL_IMAGES_DIR   = "validation_images"
TEST_IMAGES_DIR  = "test_images"
# Carregar CSVs
df_train = pd.read_csv(CSV_TRAIN)
df_val   = pd.read_csv(CSV_VAL)
df_test  = pd.read_csv(CSV_TEST)

print("CSV train:", df_train.shape)
print("CSV val:", df_val.shape)
print("CSV test:", df_test.shape)


CSV train: (1458, 4)
CSV val: (322, 4)
CSV test: (648, 4)


# ‚öôÔ∏è 3. Main Configurations

In [3]:
# --- Model & Training Params ---
MODEL_ID = "HuggingFaceTB/SmolVLM-256M-Instruct"
BATCH_SIZE = 2
EPOCHS = 4
LR = 2e-5
device = "cuda" if torch.cuda.is_available() else "cpu"

# --- Label distribution (string labels only) ---
print("\nüìä Quantidade de exemplos por label:")
print(df_train["label"].value_counts())


üìä Quantidade de exemplos por label:
label
hate speech              600
inappropriate content    472
neither                  386
Name: count, dtype: int64


# üì¶ 4. Prepare Train, Validation, and Test Datasets

In [4]:
# Os dados j√° est√£o divididos

df_train["image_path"] = df_train["image_path"].apply(lambda x: os.path.join(TRAIN_IMAGES_DIR, x))
df_val["image_path"]   = df_val["image_path"].apply(lambda x: os.path.join(VAL_IMAGES_DIR, x))
df_test["image_path"]  = df_test["image_path"].apply(lambda x: os.path.join(TEST_IMAGES_DIR, x))

ds_train = Dataset.from_pandas(df_train.reset_index(drop=True))
ds_val = Dataset.from_pandas(df_val.reset_index(drop=True))
ds_test = Dataset.from_pandas(df_test.reset_index(drop=True))

print(f"Train: {len(ds_train)} | Validation: {len(ds_val)} | Test: {len(ds_test)}")

Train: 1458 | Validation: 322 | Test: 648


# üß© 5. Processor and Pre-processing **(Image)**

In [23]:
# Carregar tokenizer e image_processor separadamente
class SmolVLMProcessor:
    def __init__(self, tokenizer, max_image_size=512):
        self.tokenizer = tokenizer
        self.max_image_size = max_image_size

    def apply_chat_template(self, messages, add_generation_prompt=False):
        if not messages or not isinstance(messages, list):
            return ""
        user_content = messages[0].get("content", [])
        text_content = next((item["text"] for item in user_content if item.get("type") == "text"), "")
        if len(messages) > 1:
            assistant_content = messages[1].get("content", [])
            assistant_text = next((item["text"] for item in assistant_content if item.get("type") == "text"), "")
            return f"User: {text_content}\nAssistant:{'' if add_generation_prompt else ' ' + assistant_text}"
        else:
            return f"User: {text_content}\nAssistant:"

    def __call__(self, text=None, images=None, return_tensors="pt", **kwargs):
        import torch

        inputs = {}

        # Tokeniza√ß√£o segura
        if text is not None:
            text_inputs = self.tokenizer(
                text if isinstance(text, list) else [text],
                return_tensors=return_tensors,
                padding=True,
                truncation=True,
                max_length=151,
                **kwargs
            )
            inputs.update(text_inputs)

        # Processamento manual de imagens
        if images is not None:
            if not isinstance(images, list):
                images = [images]
            image_tensors = []
            for img in images:
                img = img.convert("RGB")
                img = img.resize((self.max_image_size, self.max_image_size))
                img_tensor = torch.tensor(np.array(img)).permute(2, 0, 1)  # HWC -> CHW
                image_tensors.append(img_tensor)
            inputs["pixel_values"] = torch.stack(image_tensors)

        return inputs

In [None]:
# Criar processor customizado
# tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
# processor = SmolVLMProcessor(tokenizer, max_image_size=512)
processor = AutoProcessor.from_pretrained(MODEL_ID)


# Fun√ß√£o para formatar cada exemplo usando o processor customizado
def format_with_original_label(example, label):
    path = example["image_path"]
    try:
        if not os.path.exists(path):
            image = Image.new("RGB", (224, 224), "black")
        else:
            image = Image.open(path).convert("RGB")
            image.thumbnail((512, 512))  
    except Exception:
        image = Image.new("RGB", (224, 224), "black")

    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text", "text": "Clasifica este meme: hate speech, inappropriate content, o neither"}
            ]
        },
        {
            "role": "assistant",
            "content": [
                {"type": "text", "text": label}
            ]
        }
    ]
    
    return {"images": image, "messages": messages, "image_path": path}

# Criar datasets
train_data = [
    format_with_original_label({"image_path": row["image_path"]}, row["label"])
    for _, row in df_train.iterrows()
]

val_data = [
    format_with_original_label({"image_path": row["image_path"]}, row["label"])
    for _, row in df_val.iterrows()
]

ds_train = Dataset.from_list(train_data)
ds_val = Dataset.from_list(val_data)

print("‚úÖ Datasets recriados com sucesso!")
print(f"Colunas: {ds_train.column_names}")

# Testar processor customizado
if len(ds_train) > 0:
    test_prompt = processor.apply_chat_template(ds_train[0]['messages'], add_generation_prompt=False)
    print(f"Exemplo de formata√ß√£o:\n{test_prompt}")


‚úÖ Datasets recriados com sucesso!
Colunas: ['images', 'messages', 'image_path']
Exemplo de formata√ß√£o:
User: Clasifica este meme: hate speech, inappropriate content, o neither
Assistant: neither


# üß† 6. Model + LoRA

In [25]:
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

print("üîß Carregando modelo sem quantiza√ß√£o...")

base_model = AutoModelForVision2Seq.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # ‚Üê Usar bfloat16 (economia de mem√≥ria sem quantiza√ß√£o)
    trust_remote_code=True,
)

# LoRA configura√ß√£o
lora_cfg = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    task_type="CAUSAL_LM",
    bias="none",
    lora_dropout=0.05,
)

print("üîß Aplicando LoRA...")
model = get_peft_model(base_model, lora_cfg)
print("‚úÖ LoRA aplicado")
model.print_trainable_parameters()

üîß Carregando modelo sem quantiza√ß√£o...
üîß Aplicando LoRA...
‚úÖ LoRA aplicado
trainable params: 1,363,968 || all params: 257,848,896 || trainable%: 0.5289795772482191


# üßÆ 7. Training Configuration

In [26]:
class DataCollatorSmolVLM:
    def __init__(self, processor):
        self.processor = processor

    def __call__(self, examples):
        texts = []
        images = []

        for example in examples:
            # Converte a lista de mensagens em string formatada
            text = self.processor.apply_chat_template(example["messages"], add_generation_prompt=False)
            texts.append(text)
            images.append(example["images"])

        # O processor aqui lida com o padding e tensores de imagem automaticamente
        batch = self.processor(text=texts, images=images, return_tensors="pt", padding=True)

        # Configurar labels (mascarando o padding)
        labels = batch["input_ids"].clone()
        labels[labels == self.processor.tokenizer.pad_token_id] = -100
        batch["labels"] = labels

        return batch

data_collator = DataCollatorSmolVLM(processor)

OUTPUT_DIR = "./SmolVLM_DIMEMEX_ImageOnly_third_chance"

# 2. Configurar Argumentos de Treino
training_args = TrainingArguments(
    output_dir=OUTPUT_DIR,
    per_device_train_batch_size=BATCH_SIZE,
    gradient_accumulation_steps=2,          
    num_train_epochs=EPOCHS,
    learning_rate=LR,
    fp16=False,
    bf16=True,            
    tf32=True,            
    logging_steps=10,
    save_strategy="epoch",
    eval_strategy="epoch",
    remove_unused_columns=False,
    report_to="none",
    dataloader_num_workers=4,        
    ddp_find_unused_parameters=False 
)

# 3. Instanciar o Trainer Padr√£o (n√£o SFTTrainer)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=ds_train,
    eval_dataset=ds_val,
    data_collator=data_collator,
)

# 9. Train Model

In [27]:
trainer.train()
print("Fine-tuning complete.")

TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/luisastellet/.local/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/luisastellet/.local/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 54, in fetch
    return self.collate_fn(data)
  File "/tmp/ipykernel_228903/1094426231.py", line 16, in __call__
    batch = self.processor(text=texts, images=images, return_tensors="pt", padding=True)
  File "/tmp/ipykernel_228903/3661737659.py", line 26, in __call__
    text_inputs = self.tokenizer(
TypeError: GPT2TokenizerFast(name_or_path='HuggingFaceTB/SmolVLM-256M-Instruct', vocab_size=49152, model_max_length=8192, is_fast=True, padding_side='right', truncation_side='left', special_tokens={'bos_token': '<|im_start|>', 'eos_token': '<end_of_utterance>', 'unk_token': '<|endoftext|>', 'pad_token': '<|im_end|>', 'additional_special_tokens': ['<fake_token_around_image>', '<image>', '<end_of_utterance>']}, clean_up_tokenization_spaces=False),  added_tokens_decoder={
	0: AddedToken("<|endoftext|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	1: AddedToken("<|im_start|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	2: AddedToken("<|im_end|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	3: AddedToken("<repo_name>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	4: AddedToken("<reponame>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	5: AddedToken("<file_sep>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	6: AddedToken("<filename>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	7: AddedToken("<gh_stars>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	8: AddedToken("<issue_start>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	9: AddedToken("<issue_comment>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	10: AddedToken("<issue_closed>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	11: AddedToken("<jupyter_start>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	12: AddedToken("<jupyter_text>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	13: AddedToken("<jupyter_code>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	14: AddedToken("<jupyter_output>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	15: AddedToken("<jupyter_script>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	16: AddedToken("<empty_output>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49152: AddedToken("<global-img>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49153: AddedToken("<row_1_col_1>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49154: AddedToken("<row_1_col_2>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49155: AddedToken("<row_1_col_3>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49156: AddedToken("<row_1_col_4>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49157: AddedToken("<row_1_col_5>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49158: AddedToken("<row_1_col_6>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49159: AddedToken("<row_2_col_1>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49160: AddedToken("<row_2_col_2>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49161: AddedToken("<row_2_col_3>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49162: AddedToken("<row_2_col_4>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49163: AddedToken("<row_2_col_5>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49164: AddedToken("<row_2_col_6>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49165: AddedToken("<row_3_col_1>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49166: AddedToken("<row_3_col_2>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49167: AddedToken("<row_3_col_3>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49168: AddedToken("<row_3_col_4>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49169: AddedToken("<row_3_col_5>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49170: AddedToken("<row_3_col_6>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49171: AddedToken("<row_4_col_1>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49172: AddedToken("<row_4_col_2>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49173: AddedToken("<row_4_col_3>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49174: AddedToken("<row_4_col_4>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49175: AddedToken("<row_4_col_5>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49176: AddedToken("<row_4_col_6>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49177: AddedToken("<row_5_col_1>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49178: AddedToken("<row_5_col_2>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49179: AddedToken("<row_5_col_3>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49180: AddedToken("<row_5_col_4>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49181: AddedToken("<row_5_col_5>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49182: AddedToken("<row_5_col_6>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49183: AddedToken("<row_6_col_1>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49184: AddedToken("<row_6_col_2>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49185: AddedToken("<row_6_col_3>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49186: AddedToken("<row_6_col_4>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49187: AddedToken("<row_6_col_5>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49188: AddedToken("<row_6_col_6>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49189: AddedToken("<fake_token_around_image>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49190: AddedToken("<image>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49191: AddedToken("<|reserved_special_token_0|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49192: AddedToken("<|reserved_special_token_1|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49193: AddedToken("<|reserved_special_token_2|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49194: AddedToken("<|reserved_special_token_3|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49195: AddedToken("<|reserved_special_token_4|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49196: AddedToken("<|reserved_special_token_5|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49197: AddedToken("<|reserved_special_token_6|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49198: AddedToken("<|reserved_special_token_7|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49199: AddedToken("<|reserved_special_token_8|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49200: AddedToken("<|reserved_special_token_9|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49201: AddedToken("<|reserved_special_token_10|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49202: AddedToken("<|reserved_special_token_11|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49203: AddedToken("<|reserved_special_token_12|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49204: AddedToken("<|reserved_special_token_13|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49205: AddedToken("<|reserved_special_token_14|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49206: AddedToken("<|reserved_special_token_15|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49207: AddedToken("<|reserved_special_token_16|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49208: AddedToken("<|reserved_special_token_17|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49209: AddedToken("<|reserved_special_token_18|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49210: AddedToken("<|reserved_special_token_19|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49211: AddedToken("<|reserved_special_token_20|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49212: AddedToken("<|reserved_special_token_21|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49213: AddedToken("<|reserved_special_token_22|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49214: AddedToken("<|reserved_special_token_23|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49215: AddedToken("<|reserved_special_token_24|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49216: AddedToken("<|reserved_special_token_25|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49217: AddedToken("<|reserved_special_token_26|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49218: AddedToken("<|reserved_special_token_27|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49219: AddedToken("<|reserved_special_token_28|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49220: AddedToken("<|reserved_special_token_29|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49221: AddedToken("<|reserved_special_token_30|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49222: AddedToken("<|reserved_special_token_31|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49223: AddedToken("<|reserved_special_token_32|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49224: AddedToken("<|reserved_special_token_33|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49225: AddedToken("<|reserved_special_token_34|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49226: AddedToken("<|reserved_special_token_35|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49227: AddedToken("<|reserved_special_token_36|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49228: AddedToken("<|reserved_special_token_37|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49229: AddedToken("<|reserved_special_token_38|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49230: AddedToken("<|reserved_special_token_39|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49231: AddedToken("<|reserved_special_token_40|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49232: AddedToken("<|reserved_special_token_41|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49233: AddedToken("<|reserved_special_token_42|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49234: AddedToken("<|reserved_special_token_43|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49235: AddedToken("<|reserved_special_token_44|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49236: AddedToken("<|reserved_special_token_45|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49237: AddedToken("<|reserved_special_token_46|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49238: AddedToken("<|reserved_special_token_47|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49239: AddedToken("<|reserved_special_token_48|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49240: AddedToken("<|reserved_special_token_49|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49241: AddedToken("<|reserved_special_token_50|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49242: AddedToken("<|reserved_special_token_51|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49243: AddedToken("<|reserved_special_token_52|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49244: AddedToken("<|reserved_special_token_53|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49245: AddedToken("<|reserved_special_token_54|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49246: AddedToken("<|reserved_special_token_55|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49247: AddedToken("<|reserved_special_token_56|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49248: AddedToken("<|reserved_special_token_57|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49249: AddedToken("<|reserved_special_token_58|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49250: AddedToken("<|reserved_special_token_59|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49251: AddedToken("<|reserved_special_token_60|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49252: AddedToken("<|reserved_special_token_61|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49253: AddedToken("<|reserved_special_token_62|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49254: AddedToken("<|reserved_special_token_63|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49255: AddedToken("<|reserved_special_token_64|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49256: AddedToken("<|reserved_special_token_65|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49257: AddedToken("<|reserved_special_token_66|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49258: AddedToken("<|reserved_special_token_67|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49259: AddedToken("<|reserved_special_token_68|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49260: AddedToken("<|reserved_special_token_69|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49261: AddedToken("<|reserved_special_token_70|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49262: AddedToken("<|reserved_special_token_71|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49263: AddedToken("<|reserved_special_token_72|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49264: AddedToken("<|reserved_special_token_73|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49265: AddedToken("<|reserved_special_token_74|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49266: AddedToken("<|reserved_special_token_75|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49267: AddedToken("<|reserved_special_token_76|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49268: AddedToken("<|reserved_special_token_77|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49269: AddedToken("<|reserved_special_token_78|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49270: AddedToken("<|reserved_special_token_79|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49271: AddedToken("<|reserved_special_token_80|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49272: AddedToken("<|reserved_special_token_81|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49273: AddedToken("<|reserved_special_token_82|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49274: AddedToken("<|reserved_special_token_83|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49275: AddedToken("<|reserved_special_token_84|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49276: AddedToken("<|reserved_special_token_85|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49277: AddedToken("<|reserved_special_token_86|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49278: AddedToken("<|reserved_special_token_87|>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
	49279: AddedToken("<end_of_utterance>", rstrip=False, lstrip=False, single_word=False, normalized=False, special=True),
} got multiple values for keyword argument 'padding'


# 10. Loss Curves

In [None]:
# Garante que o diret√≥rio de sa√≠da existe
if not os.path.exists(OUTPUT_DIR):
    os.makedirs(OUTPUT_DIR)

# Acessa o hist√≥rico direto da mem√≥ria
logs = trainer.state.log_history

train_steps = [x["step"] for x in logs if "loss" in x]
train_loss = [x["loss"] for x in logs if "loss" in x]

eval_steps = [x["step"] for x in logs if "eval_loss" in x]
eval_loss = [x["eval_loss"] for x in logs if "eval_loss" in x]

plt.figure(figsize=(10, 5))
plt.plot(train_steps, train_loss, label="Train Loss")
plt.plot(eval_steps, eval_loss, label="Eval Loss", marker='o')
plt.title("Training and Evaluation Loss")
plt.xlabel("Steps")
plt.ylabel("Loss")
plt.legend()
plt.grid(True)

# --- SALVAR O GR√ÅFICO ---
plot_path = os.path.join(OUTPUT_DIR, "loss_curve.png")
plt.savefig(plot_path)
print(f"üìâ Gr√°fico de Loss salvo em: {plot_path}")

plt.show()

### **S√≥ pode rodar a c√©lula abaixo quando estiver satisfeito com os resultado! Par√£o n√£o ter vazamento de dados**

# ‚úÖ 11. Final Evaluation and Metrics Summary

In [None]:
# Use o modelo treinado
model = trainer.model
model.eval()

test_predictions = []
test_ground_truth = []
valid_labels = ["hate speech", "inappropriate content", "neither"]

print("üöÄ Rodando avalia√ß√£o final no conjunto de teste...")

# Loop pelo Test Set
for example in tqdm(ds_test):
    try:
        image = Image.open(example['image_path']).convert("RGB")
    except:
        continue

    prompt_text = """"
                Analiza esta imagen de meme y clasif√≠cala en una de estas tres categor√≠as: 
                - hate speech
                - inappropriate content
                - neither
                ¬øCu√°l es la categor√≠a correcta para este meme?
                """

    messages = [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text", "text": prompt_text}
            ]
        }
    ]

    prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
    inputs = processor(text=prompt, images=image, return_tensors="pt")
    inputs = {k: v.to(device) for k, v in inputs.items()}

    with torch.no_grad():
        generated_ids = model.generate(**inputs, max_new_tokens=20, do_sample=False)

    generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

    # Limpeza da resposta
    if "Assistant:" in generated_text:
        prediction = generated_text.split("Assistant:")[-1].strip()
    elif "assistant" in generated_text:
         prediction = generated_text.split("assistant")[-1].strip()
    else:
        prediction = generated_text.strip()

    test_predictions.append(prediction)
    test_ground_truth.append(example['label'])

In [None]:
# --- Processamento e Salvamento dos Resultados ---

# 1. Limpeza e Filtro
cleaned_preds = []
for p in test_predictions:
    p_clean = p.lower().replace(".", "").strip()
    if p_clean in valid_labels:
        cleaned_preds.append(p_clean)
    else:
        cleaned_preds.append("unknown") # Marca erro de gera√ß√£o

# 2. Preparar listas finais
final_preds = []
final_gt = []
for p, g in zip(cleaned_preds, test_ground_truth):
    if p in valid_labels:
        final_preds.append(p)
        final_gt.append(g)

# 3. SALVAR PREDI√á√ïES (CSV)
# Isso cria uma tabela com: Label Real | Predi√ß√£o | Acertou?
df_results = pd.DataFrame({
    "ground_truth": final_gt,
    "prediction": final_preds
})
df_results["correct"] = df_results["ground_truth"] == df_results["prediction"]
csv_path = os.path.join(OUTPUT_DIR, "test_predictions.csv")
df_results.to_csv(csv_path, index=False)
print(f"\nüíæ Tabela de predi√ß√µes salva em: {csv_path}")

# 4. Calcular M√©tricas
report_str = classification_report(final_gt, final_preds, labels=valid_labels)
results = {
    "accuracy": accuracy_score(final_gt, final_preds),
    "f1_weighted": f1_score(final_gt, final_preds, average="weighted"),
    "precision_weighted": precision_score(final_gt, final_preds, average="weighted", zero_division=0),
    "recall_weighted": recall_score(final_gt, final_preds, average="weighted", zero_division=0)
}

# 5. SALVAR RELAT√ìRIO (TXT e JSON)
txt_path = os.path.join(OUTPUT_DIR, "classification_report.txt")
with open(txt_path, "w") as f:
    f.write(report_str)
    f.write("\n\nSummary Metrics:\n")
    f.write(json.dumps(results, indent=4))

json_path = os.path.join(OUTPUT_DIR, "metrics.json")
with open(json_path, "w") as f:
    json.dump(results, f, indent=4)

print(f"üìÑ Relat√≥rios salvos em: {txt_path} e {json_path}")

# Exibir na tela
print(f"\n{len(final_preds)} / {len(test_predictions)} predi√ß√µes v√°lidas.")
print("\nüìä Resultados Finais:")
print(report_str)

# üíæ 12. Save Final Model

In [None]:
# Save the LoRA adapters
model.save_pretrained(OUTPUT_DIR)
# Save the processor
processor.save_pretrained(OUTPUT_DIR)

print(f"\n‚úÖ Fine-tuning complete. Model saved to: {OUTPUT_DIR}")