hijack-lf/FixATE

👁️ FixATE: Fixation-Aligned Tuning for Personalized User Emulation

Datasets: RecGaze, AdSERP


We propose FixATE, a framework that aligns a frozen VLM's visual attention with each user's characteristic gaze pattern through interpretability-based probing and personalized soft prompt tuning, enabling more faithful user simulation in visual recommendation scenarios.

🔍 Overview

Existing LLM-based user simulators perceive recommendations through text or structured metadata, missing the visual attention signals that drive real user behavior. FixATE bridges this gap by:

  1. Probing the VLM's internal visual attention via interpretability operators (Attention Rollout, GLIMPSE, AttnLRP) to obtain slot-level relevance distributions comparable with human fixation.
  2. Learning personalized soft prompts through a factorized basis decomposition, steering the model's attention toward each user's characteristic fixation pattern.
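
The two steps above can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not the repository's implementation: it shows only the Attention Rollout operator (GLIMPSE and AttnLRP are not covered), and the function names, the token-to-slot mapping, and the reading of "factorized basis decomposition" as a per-user weighted sum of shared basis prompts are all assumptions made for the example.

```python
# Hypothetical sketch: names and shapes are illustrative, not FixATE's API.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention_rollout(layers):
    """Attention Rollout over head-averaged, row-stochastic n x n attention maps.

    Each layer is mixed with the identity (0.5 * A + 0.5 * I) to account for the
    residual connection, then the mixed maps are multiplied from first to last layer.
    """
    n = len(layers[0])
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    rollout = identity
    for attn in layers:
        mixed = [[0.5 * attn[i][j] + 0.5 * identity[i][j] for j in range(n)]
                 for i in range(n)]
        rollout = matmul(mixed, rollout)
    return rollout


def slot_relevance(token_scores, token_to_slot, num_slots):
    """Pool token-level relevance into a normalized slot-level distribution.

    token_to_slot[i] is the slot index of token i, or None for tokens that do not
    belong to any slot (for example text tokens). This mapping is an assumption.
    """
    mass = [0.0] * num_slots
    for score, slot in zip(token_scores, token_to_slot):
        if slot is not None:
            mass[slot] += score
    total = sum(mass)
    if total <= 0:
        return [1.0 / num_slots] * num_slots  # fall back to uniform
    return [m / total for m in mass]


def personalized_prompt(coeffs, bases):
    """Assumed factorization: a user's soft prompt is sum_k coeffs[k] * bases[k].

    Each basis is a (prompt_length x embedding_dim) matrix shared across users;
    only the coefficient vector is user-specific.
    """
    rows, dim = len(bases[0]), len(bases[0][0])
    return [[sum(c * b[i][j] for c, b in zip(coeffs, bases)) for j in range(dim)]
            for i in range(rows)]


# Toy example: token 0 is the query token, tokens 1 and 2 lie in slots 0 and 1.
layer = [[0.5, 0.3, 0.2], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
rollout = attention_rollout([layer, layer])
print(slot_relevance(rollout[0], [None, 0, 1], num_slots=2))
```

The slot-level vector printed at the end plays the role of the model attention distribution that is later compared with human fixation; in a real run the attention maps would come from the frozen VLM rather than a toy matrix.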

Dependencies

📦 Data Preparation

Download RecGaze and AdSERP into datasets/ (e.g. RecGaze/, Adserp/).

RecGaze

python preprocessing/recgaze/dataset_preprocess_swipes.py
python preprocessing/recgaze/generate_interface_iamge.py 1-35

The second step reads its raw inputs from datasets/raw/RecGaze/ (summary_feedback.csv, item_features.csv, poster_cache/). Outputs are written to datasets/RecGaze/init_interface_user_gaze(swipes).csv and datasets/RecGaze/interface_iamge/*.png.

AdSERP (optional)

python preprocessing/adserp/build_samples.py --mode both --n 5
python preprocessing/adserp/build_click_aoi_dataset.py --split all

🔧 Training

Run from the repo root. Hyperparameters are in config/common_config.py and operator-specific files (config/attnlrp_config.py, glimpse_config.py, rollout_config.py, attnlrp_config_adserp.py). Put VLM weights under llm_models/ (paths in common_config.py).

RecGaze

python fixate/fixate_training/train_fixate_attnlrp.py      # AttnLRP
python fixate/fixate_training/train_fixate_glimpse.py      # GLIMPSE
python fixate/fixate_training/train_fixate_rollout.py      # Attention Rollout

AdSERP

python fixate/fixate_training/train_fixate_attnlrp_adserp.py

📊 Evaluation

Training scripts write per-run metrics as JSON under outputs/ and checkpoints/ (the exact paths depend on the files in config/). The metrics described here are the ones computed by compute_sample_metrics and aggregated by the trainers; sample-level metrics are micro-averaged over the evaluation set and carry a micro_ prefix in the logs.
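
As a rough illustration of the aggregation step, the sketch below averages per-sample metric dictionaries and adds the micro_ prefix. It assumes that "micro-averaged" means a plain mean over evaluation samples and that every sample reports the same keys; the function name micro_average is hypothetical and is not taken from the repository.

```python
# Hypothetical helper: assumes micro-averaging is a plain mean over samples.

def micro_average(per_sample):
    """Average each metric over all samples and prefix the key with 'micro_'."""
    if not per_sample:
        raise ValueError("need at least one sample")
    count = len(per_sample)
    return {
        f"micro_{name}": sum(sample[name] for sample in per_sample) / count
        for name in per_sample[0]
    }


samples = [
    {"kl_div": 0.2, "cosine_sim": 0.9},
    {"kl_div": 0.4, "cosine_sim": 0.7},
]
print(micro_average(samples))
```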

Attention alignment metrics

These metrics measure how well the normalized model slot-attention vector a matches the normalized human gaze (dwell) vector g over the same slots. choice is the index of the ground-truth clicked slot.

| Metric | Meaning | Better |
| --- | --- | --- |
| KL divergence (kl_div / micro_kl_div) | KL(g ‖ a): how much human gaze g differs from model mass a | Lower |
| JS divergence (js_div / micro_js_div) | Squared Jensen–Shannon distance between g and a | Lower |
| Cosine similarity (cosine_sim / micro_cosine_sim) | Cosine similarity between vectors g and a | Higher |
| CSH@k (CSH@1, CSH@3, CSH@5) | Whether choice is in the top-k slots when ranked by model attention a | Higher |
| TGO@k (TGO@1, TGO@3, TGO@5) | Overlap between top-k by a and top-k by g (implementation normalizes by k for k>1) | Higher |
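
The alignment metrics can be written down compactly for a single sample. The sketch below is an approximation under stated assumptions, not the repository's compute_sample_metrics: it uses natural logarithms, a small smoothing constant EPS, and index order to break ties in the top-k ranking, none of which are confirmed by the source. All function names are hypothetical.

```python
import math

EPS = 1e-12  # assumed smoothing constant to avoid log(0)


def kl_div(g, a):
    """KL(g || a) with natural log; terms with g_i == 0 contribute nothing."""
    return sum(gi * math.log((gi + EPS) / (ai + EPS)) for gi, ai in zip(g, a) if gi > 0)


def js_div(g, a):
    """Jensen-Shannon divergence, i.e. the squared Jensen-Shannon distance."""
    m = [(gi + ai) / 2 for gi, ai in zip(g, a)]
    return 0.5 * kl_div(g, m) + 0.5 * kl_div(a, m)


def cosine_sim(g, a):
    dot = sum(gi * ai for gi, ai in zip(g, a))
    norm = math.sqrt(sum(x * x for x in g)) * math.sqrt(sum(x * x for x in a))
    return dot / norm if norm > 0 else 0.0


def top_k(v, k):
    """Indices of the k largest entries; ties go to the lower index (assumption)."""
    return sorted(range(len(v)), key=lambda i: -v[i])[:k]


def csh_at_k(a, choice, k):
    """1.0 if the clicked slot is among the top-k slots by model attention."""
    return 1.0 if choice in top_k(a, k) else 0.0


def tgo_at_k(a, g, k):
    """Size of the overlap between the two top-k sets, divided by k."""
    return len(set(top_k(a, k)) & set(top_k(g, k))) / k


g = [0.5, 0.3, 0.2]  # human gaze (dwell) distribution over three slots
a = [0.2, 0.5, 0.3]  # model slot-attention distribution
print(kl_div(g, a), js_div(g, a), cosine_sim(g, a))
print(csh_at_k(a, choice=1, k=1), tgo_at_k(a, g, k=2))
```

Dividing the overlap by k keeps TGO@k in [0, 1] for every k; for k = 1 this is the same as a plain match indicator.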

Prediction-level metrics

These metrics depend on the ground-truth choice (clicked slot) and/or the model’s discrete prediction, not only on the distributional alignment between g and a.

| Metric | Meaning | Better |
| --- | --- | --- |
| Log-Loss | Negative log of the softmax probability, over candidate answer tokens, assigned to the true slot (not mass from the saliency / attention map in RecGaze eval) | Lower |
| AUC | One-vs-rest style rank score: other slots vs. choice under that same logit-based slot distribution | Higher |
| Answer Accuracy | Fraction of samples where the model’s generated choice (letter or index) matches the label | Higher |
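
For a single sample, the prediction-level metrics can be sketched as follows. This is an illustrative reading of the descriptions above, not the trainers' code: it assumes one logit per candidate slot, counts ties as half a win in the rank score, and uses hypothetical function names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over the candidate answer-token logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def log_loss(logits, choice):
    """Negative log of the softmax probability assigned to the true slot."""
    return -math.log(softmax(logits)[choice])


def one_vs_rest_auc(scores, choice):
    """Fraction of other slots scored below the true slot (ties count 0.5)."""
    others = [s for i, s in enumerate(scores) if i != choice]
    wins = sum(1.0 if scores[choice] > s else 0.5 if scores[choice] == s else 0.0
               for s in others)
    return wins / len(others)


def answer_accuracy(predictions, labels):
    """Share of samples whose generated choice equals the label."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


logits = [2.0, 0.0, 0.0]  # one logit per candidate slot (assumed layout)
print(log_loss(logits, choice=0), one_vs_rest_auc(logits, choice=0))
print(answer_accuracy(["A", "B", "C"], ["A", "C", "C"]))
```

Because softmax is monotonic, ranking by logits or by softmax probabilities gives the same rank score.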

📁 Project Structure

High-level layout:

├── config/                 # Training hyperparameters & paths
├── datasets/               # RecGaze / AdSERP data
├── fixate/                 # Core library + training scripts (fixate_training/)
├── preprocessing/          # Dataset-specific preprocessing
├── llm_models/             # Local VLM checkpoints (optional path)
├── outputs/                # Metrics / logs
└── requirements.txt

😄Acknowledgements
