Authors: Salah Eddine Bekhouche, Hichem Telli, Azeddine Benlamoudi, Salah Eddine Herrouz, Abdelmalik Taleb-Ahmed, Abdenour Hadid
Code for the ABAW10 Ambivalence/Hesitancy Challenge. Implements a 6-token fusion architecture with VideoMAE, HuBERT, and RoBERTa-GoEmotions encoders, conflict features, and text-guided late fusion.
# Create conda environment (Python 3.10 recommended)
conda create -n conda3.10 python=3.10 -y
conda activate conda3.10
# Install dependencies
pip install -r requirements.txt
# Install ffmpeg (required for audio loading)
# conda install -c conda-forge ffmpeg

Place the BAH dataset in the data/ folder. Expected structure:
data/
  data/                      # labeled split
    split/                   # train.txt, val.txt, test.txt
    Videos/
    cropped-aligned-faces/
    transcription/
  test_unlabeled/            # challenge test set
    split/
    Videos/
    cropped-aligned-faces/
    transcription/
Obtain the BAH dataset from the ABAW10 Challenge / BAH dataset.
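Once the dataset is in place, a quick sanity check of the layout can save a failed run later. Below is a minimal, hypothetical helper (not part of this repo) that checks the tree shown above; the directory names come from that tree, and `missing_dirs` is our own name:

```python
from pathlib import Path

# Expected subdirectories, relative to data/, per the tree above.
EXPECTED = [
    "data/split",
    "data/Videos",
    "data/cropped-aligned-faces",
    "data/transcription",
    "test_unlabeled/split",
    "test_unlabeled/Videos",
    "test_unlabeled/cropped-aligned-faces",
    "test_unlabeled/transcription",
]

def missing_dirs(data_root="data"):
    """Return the expected subdirectories that are absent under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED if not (root / rel).is_dir()]

if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```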
Run once before training for faster data loading:
conda run -n conda3.10 python scripts/extract_audio.py

We will upload pre-trained weights so you can reproduce our results or retrain the model.
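For reference, the audio-extraction step typically amounts to one ffmpeg call per video. The sketch below only builds such a command (function and argument names are ours; the actual scripts/extract_audio.py may differ); 16 kHz mono matches what HuBERT-Base expects:

```python
def ffmpeg_audio_cmd(video_path, wav_path, sample_rate=16000):
    """Build an ffmpeg command extracting mono audio from a video.

    Hypothetical helper for illustration; run it with subprocess.run(...).
    """
    return [
        "ffmpeg", "-y",            # overwrite existing output
        "-i", str(video_path),     # input video
        "-vn",                     # drop the video stream
        "-ac", "1",                # mono
        "-ar", str(sample_rate),   # resample (16 kHz for HuBERT)
        str(wav_path),
    ]

# Example: subprocess.run(ffmpeg_audio_cmd("clip.mp4", "clip.wav"), check=True)
```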
Simplest option: pass --hf_repo (the checkpoint is downloaded automatically):
# No manual download needed; checkpoint is fetched on first run
python scripts/predict.py \
--hf_repo Bekhouche/ConflictAwareAH \
--data_root data \
--split test --num_windows 5 --output outputs/submission_test.csv
python scripts/predict.py \
--hf_repo Bekhouche/ConflictAwareAH \
--data_root data \
--split test_unlabeled --num_windows 5 --output outputs/submission.csv

Or download the checkpoint first, then point to the local path:
pip install huggingface_hub
python -c "from huggingface_hub import snapshot_download; snapshot_download(repo_id='Bekhouche/ConflictAwareAH', local_dir='checkpoints/ConflictAwareAH')"
python scripts/predict.py \
--checkpoints checkpoints/ConflictAwareAH/best_model.pt \
--split test --num_windows 5 --output outputs/submission_test.csv

Challenge configuration (leaderboard AVGF1 0.715):
CUDA_VISIBLE_DEVICES=0 bash scripts/train.sh \
--unfreeze_top_k 0 \
--label_smoothing 0.1 \
--dropout 0.4 \
--text_blend 0.5

After training, replace <RUN_TIMESTAMP> with your run folder (e.g. outputs/runs/20260314_191324):
python scripts/predict.py \
--checkpoints outputs/runs/<RUN_TIMESTAMP>/best_model.pt \
--split test --num_windows 5 --output outputs/submission_test_eval.csv
python scripts/predict.py \
--checkpoints outputs/runs/<RUN_TIMESTAMP>/best_model.pt \
--split test_unlabeled --num_windows 5 --output outputs/submission.csv

To share your trained model:
pip install huggingface_hub
export HUGGING_FACE_HUB_TOKEN=hf_xxx # from https://huggingface.co/settings/tokens
# From project root (or use absolute path)
python scripts/upload_to_hf.py outputs/runs/20260314_191324 --repo_id your-username/conflict-aware-ah

- Encoders: VideoMAE-Base, HuBERT-Base, RoBERTa-GoEmotions (frozen)
- Fusion: 6-token Transformer (v, a, t, |v−a|, |v−t|, |a−t|) + MLP
- Output: text-guided blend α·σ(text_logit) + (1−α)·σ(full_logit) with α = 0.5
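The 6-token construction and the text-guided blend can be sketched in a few lines. This is an illustrative pure-Python rendering of the formulas above, not the repo's training code; the toy embeddings and function names are ours:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fusion_tokens(v, a, t):
    """Build the 6 fusion tokens fed to the Transformer: the three modality
    embeddings plus the element-wise conflict features |v-a|, |v-t|, |a-t|."""
    absdiff = lambda x, y: [abs(xi - yi) for xi, yi in zip(x, y)]
    return [v, a, t, absdiff(v, a), absdiff(v, t), absdiff(a, t)]

def text_guided_blend(text_logit, full_logit, alpha=0.5):
    """Final score: alpha*sigmoid(text_logit) + (1-alpha)*sigmoid(full_logit)."""
    return alpha * sigmoid(text_logit) + (1.0 - alpha) * sigmoid(full_logit)

# Toy 4-dim embeddings standing in for the encoder outputs
v, a, t = [1.0] * 4, [0.0] * 4, [0.5] * 4
tokens = fusion_tokens(v, a, t)
print(len(tokens))                  # 6 tokens
print(text_guided_blend(0.0, 0.0))  # 0.5 (both sigmoids equal 0.5)
```

With α = 0.5 the text-only head and the full fusion head contribute equally to the final probability.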
@inproceedings{conflictawareah2026,
title={Conflict-Aware Multimodal Fusion for Ambivalence and Hesitancy Recognition},
author={Bekhouche, Salah Eddine and Telli, Hichem and Benlamoudi, Azeddine and Herrouz, Salah Eddine and Taleb-Ahmed, Abdelmalik and Hadid, Abdenour},
year={2026}
}