# RoBERTa full fine-tune (Kaggle)

This notebook runs a full RoBERTa-base fine-tune for the TEXT_BRANCH on Kaggle.

Usage notes:
- Upload `data/iemocap_manifest.jsonl` as a Kaggle Dataset and attach it to the notebook (recommended).
- Enable a GPU accelerator (T4 or better).
- Adjust `--batch_size`, `--accumulation_steps`, and `--max_length` to fit the available GPU memory.
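
Before running anything expensive, it can help to confirm that the session actually has a GPU attached. The following is a minimal sketch that assumes the `nvidia-smi` tool is on the PATH when an NVIDIA accelerator is enabled; the helper name `gpu_summary` is illustrative and not part of the repository.

```python
import shutil
import subprocess


def gpu_summary():
    """Return "name, total memory" lines from nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA tooling on PATH, so most likely no GPU attached
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


print(gpu_summary() or "No GPU detected; enable an accelerator in the notebook settings.")
```

If this prints the fallback message, the flags discussed above are moot until an accelerator is switched on.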

In [None]:
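# 1) Optional: install Python dependencies; numpy, transformers, and accelerate are pinned.
# Kaggle images usually ship a CUDA-enabled PyTorch, so torch is not reinstalled here.
# Note: accelerate==0.20.3 is much older than transformers==4.57.1; if an import fails with a version error, relax the accelerate pin.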
!pip install --quiet numpy==1.24.4 transformers==4.57.1 datasets sentence-transformers accelerate==0.20.3 huggingface-hub
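
After the install it is worth confirming which versions ended up in the environment, since Kaggle's preinstalled packages can interact with the pins. This is a small sketch using only the standard library; the helper name `installed_versions` and the list of packages checked are illustrative choices.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(
    ["numpy", "transformers", "accelerate", "torch"]
).items():
    print(f"{name}: {version or 'not installed'}")
```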

In [None]:
# 2) Clone the repository and change to repo root
!git clone https://github.com/SpeedyLabX/ser-conformer-gat-xai.git
%cd ser-conformer-gat-xai

In [None]:
# 3) Copy your Kaggle Dataset manifest into the repo (example).
# If you added the dataset to the session, the file will be under /kaggle/input/<dataset-name>/...
# Replace <dataset-name> and path as appropriate.
# !cp /kaggle/input/<dataset-name>/iemocap_manifest.jsonl data/iemocap_manifest.jsonl
!ls -l data || true
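
A quick sanity check of the manifest can catch a bad copy before the fine-tune starts. The sketch below only assumes that the file is JSON Lines (one JSON object per line); it makes no assumption about which fields the repository expects, and the helper name `summarize_manifest` is illustrative.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_manifest(path):
    """Return (record count, {top-level key: occurrences}); raise ValueError on bad lines."""
    key_counts = Counter()
    n_records = 0
    with Path(path).open(encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc})") from exc
            if not isinstance(record, dict):
                raise ValueError(f"{path}:{line_no}: expected a JSON object")
            n_records += 1
            key_counts.update(record.keys())
    return n_records, dict(key_counts)


manifest = Path("data/iemocap_manifest.jsonl")
if manifest.exists():
    print(summarize_manifest(manifest))
else:
    print(f"{manifest} not found; copy it from /kaggle/input first.")
```

A key that appears in fewer records than the total count points to rows with missing fields.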

In [None]:
# 4) Run the fine-tune (adjust the arguments to fit your GPU).
# Recommended starting point: --batch_size 8 --accumulation_steps 4 --epochs 3 --max_length 128 --fp16
!python scripts/finetune_roberta.py --manifest data/iemocap_manifest.jsonl --backbone roberta-base --batch_size 8 --accumulation_steps 4 --epochs 3 --max_length 128 --num_class 7 --out_dir /kaggle/working/artifacts/roberta_kaggle --fp16
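
If the run above hits an out-of-memory error, one common approach is to lower `--batch_size` and raise `--accumulation_steps` so their product stays the same. The sketch below assumes the script accumulates gradients in the usual way, so that the effective batch size is `batch_size * accumulation_steps`; that behaviour is not confirmed here, and the helper name `accumulation_steps_for` is illustrative.

```python
import math


def accumulation_steps_for(target_effective_batch, per_step_batch):
    """Smallest number of accumulation steps that reaches the target effective batch."""
    if target_effective_batch < 1 or per_step_batch < 1:
        raise ValueError("batch sizes must be positive integers")
    return max(1, math.ceil(target_effective_batch / per_step_batch))


# The command above uses 8 * 4, an effective batch of 32 under this assumption.
target = 8 * 4
for batch_size in (8, 4, 2):
    steps = accumulation_steps_for(target, batch_size)
    print(f"--batch_size {batch_size} --accumulation_steps {steps}")
```

Reducing `--max_length` is the other lever, at the cost of truncating longer inputs.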

In [None]:
# 5) List artifacts and zip them for download
!ls -la /kaggle/working/artifacts/roberta_kaggle || true
!zip -r /kaggle/working/roberta_kaggle_artifacts.zip /kaggle/working/artifacts/roberta_kaggle || true
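
As an alternative to the `zip` command, the archive can be built from Python, which also avoids storing the full `/kaggle/working/...` prefix inside the zip. This is a sketch using the standard library's `shutil.make_archive`; the helper name `zip_artifacts` is illustrative, and the paths are the ones used in the cells above.

```python
import shutil
from pathlib import Path


def zip_artifacts(artifact_dir, archive_stem):
    """Zip artifact_dir under its own folder name; return the archive path or None."""
    artifact_dir = Path(artifact_dir)
    if not artifact_dir.is_dir():
        return None  # nothing to archive, e.g. the fine-tune did not finish
    return shutil.make_archive(
        str(archive_stem),                 # ".zip" is appended automatically
        "zip",
        root_dir=str(artifact_dir.parent),
        base_dir=artifact_dir.name,        # entries start at "roberta_kaggle/..."
    )


archive = zip_artifacts(
    "/kaggle/working/artifacts/roberta_kaggle",
    "/kaggle/working/roberta_kaggle_artifacts",
)
print(archive or "No artifacts found; check that the fine-tune finished.")
```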