hustvl/Food-R1

Food-R1

A Unified Multi-Task Food Vision-Language Model with Reinforcement Learning

Yu Zhu*, Yongkang Li*, Wenjie Zhu, Haoyi Jiang, Wenyu Liu, Wei Yang, Bin Li, Xinggang Wang

Huazhong University of Science and Technology

*Equal contribution. Corresponding author: xgwang@hust.edu.cn


Installation

For SFT, create the SFT environment:

conda env create -f environment.yml
conda activate foodr1-sft

For GRPO, create the GRPO environment:

conda env create -f environment_grpo.yml
conda activate foodr1-grpo

Food-R1 training is built on ms-swift. Clone it and install it in editable mode inside the active environment:

git clone https://github.com/modelscope/ms-swift.git
cd ms-swift
pip install -e .
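
To confirm that the editable install is visible from the active environment, a small check like the one below can help. It is only a sketch: it assumes the distribution is registered under the name ms-swift (the name used on PyPI), and `installed_version` is a helper introduced here for illustration, not part of ms-swift or Food-R1.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "ms-swift" is an assumed distribution name; adjust it if your checkout differs.
    version = installed_version("ms-swift")
    if version is None:
        print("ms-swift not found in this environment")
    else:
        print(f"ms-swift {version} is installed")
```

Run it once in each conda environment you plan to train with, since the two environments are separate.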

Datasets

Food-R1 is trained and evaluated on multiple public food-related datasets. Please download the datasets from their official sources and follow their respective licenses and access policies.

Food-101, VIREO Food-172, Recipe1M, Nutrition5k, FoodDialogues, MM-Food-100K

Note: For Nutrition5k, download only the extracted Nutrition5k images provided with FoodDialogues; the original Nutrition5k image data is not needed.

Training

First, configure the local paths by copying the example environment file:

cp configs/foodr1.env.example local.env

Edit local.env with your ms-swift checkout, model checkpoint, dataset files, image roots, and output directories.
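
Before launching a long run, it may be worth checking that the paths in local.env actually exist. The sketch below assumes the file consists of simple `KEY=VALUE` lines (optionally prefixed with `export`) and does not expand shell variables. `parse_env` and `missing_paths` are hypothetical helpers written for this example, not part of the Food-R1 scripts, and no particular variable names are assumed.

```python
import os
import sys


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        env[key.strip()] = value.strip().strip("'\"")
    return env


def missing_paths(env):
    """Return entries whose value looks like an absolute path that does not exist."""
    return {
        key: value
        for key, value in env.items()
        if value.startswith("/") and not os.path.exists(value)
    }


if __name__ == "__main__":
    env_file = sys.argv[1] if len(sys.argv) > 1 else "local.env"
    with open(env_file, encoding="utf-8") as fh:
        for key, value in missing_paths(parse_env(fh.read())).items():
            print(f"{key}: path not found: {value}")
```

Output directories may legitimately be reported as missing if the training scripts create them on first use, so treat the report as a prompt to double-check rather than a hard failure.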

Run SFT:

FOODR1_ENV=local.env bash scripts/train_sft.sh

Run GRPO on CalorieBench-80K:

FOODR1_ENV=local.env bash scripts/train_grpo.sh

Evaluation

The public release includes evaluation on CalorieBench-80K.

Run inference:

python eval/infer_caloriebench80k.py \
  --model /path/to/foodr1/checkpoint \
  --data /path/to/caloriebench80k_val.json \
  --image_prefix /path/to/caloriebench80k/images \
  --output outputs/pred/caloriebench80k_val \
  --num_processes 8 \
  --batch_size 8

Compute metrics:

python eval/metrics_caloriebench80k.py --input outputs/pred/caloriebench80k_val.json
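
This README does not state which metrics `eval/metrics_caloriebench80k.py` reports. As a rough illustration of how calorie estimates are commonly scored, the sketch below computes mean absolute error (MAE) and mean absolute percentage error (MAPE) over paired predictions and ground-truth values. The function name and the choice of metrics are assumptions made for this example and may differ from the repository's script.

```python
def calorie_errors(predicted, actual):
    """Compute MAE (kcal) and MAPE (%) for paired calorie values.

    Illustrative only: the metrics used by the Food-R1 evaluation script
    may differ. Items with a non-positive ground truth are excluded from
    MAPE to avoid division by zero.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one sample is required")

    abs_errors = [abs(p - a) for p, a in zip(predicted, actual)]
    mae = sum(abs_errors) / len(abs_errors)

    pct_errors = [abs(p - a) / a for p, a in zip(predicted, actual) if a > 0]
    mape = 100.0 * sum(pct_errors) / len(pct_errors) if pct_errors else float("nan")

    return {"mae": mae, "mape": mape}


if __name__ == "__main__":
    # Two made-up samples: predictions of 100 and 250 kcal against 120 and 200 kcal.
    print(calorie_errors([100.0, 250.0], [120.0, 200.0]))
```

MAE is expressed in kilocalories and is dominated by high-calorie dishes, while MAPE normalizes by the true value, so the two are often read together.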

Acknowledgement

We thank the contributors of Qwen2.5-VL, Qwen3-VL, and the public food datasets used in this work. We also acknowledge GPT-4 for supporting the construction and annotation of CalorieBench-80K.

License

This project is released under the Apache License 2.0.
