This is the official repository for the core code of the paper: FrameThinker: Learning to Think with Long Videos via Multi-Turn Frame Spotlighting.
FrameThinker is a novel framework for long-video reasoning that challenges the inefficient, passive methods of traditional models. Instead of processing a fixed set of pre-sampled frames, FrameThinker actively interrogates video content through a multi-turn, iterative process. It intelligently spotlights relevant frame sequences to gather evidence, guided by a Cognitive Consistency Verification (CCV) module that ensures its reasoning is logical and interpretable. Across six challenging benchmarks, FrameThinker achieves an average +10.4% accuracy improvement over the baseline. As a highlight, it surpasses the strong LongVILA-R1 to set a new state-of-the-art on the LongVideo-Reason benchmark, using just 20.6 frames on average.
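The multi-turn loop described above can be sketched in a few lines: each turn the model spotlights additional frames, produces a reasoning step, and a consistency check decides whether to keep that step. This is a minimal, self-contained toy; all names (`frame_thinker_loop`, `verify_consistency`, `Step`) are illustrative assumptions, not the repository's actual API.

```python
from dataclasses import dataclass

@dataclass
class Step:
    """One reasoning step produced by the model (toy stand-in)."""
    text: str
    is_final: bool = False
    answer: str = ""

def verify_consistency(trace, step):
    # Placeholder for Cognitive Consistency Verification (CCV):
    # here we simply reject steps that repeat an earlier one verbatim.
    return step.text not in [s.text for s in trace]

def frame_thinker_loop(frames_available, question, reason_fn, max_turns=5):
    """Iteratively spotlight frames and reason until a final answer.

    `reason_fn(question, evidence, turn)` stands in for the model; a real
    implementation would also let the model choose which frame span to
    spotlight next instead of taking the next frame in order.
    """
    evidence, trace = [], []
    for turn in range(max_turns):
        # Spotlight: gather the next piece of visual evidence.
        if turn < len(frames_available):
            evidence.append(frames_available[turn])
        step = reason_fn(question, evidence, turn)
        # CCV gate: only consistent steps extend the reasoning trace.
        if verify_consistency(trace, step):
            trace.append(step)
        if step.is_final:
            return step.answer, len(evidence)
    return "", len(evidence)
```

Note that the loop terminates early once the model emits a final step, which is how the framework keeps the average number of processed frames low.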
An illustration of the FrameThinker framework.
An example of FrameThinker solving a reasoning task.
An example of a multi-step reasoning process.
- Python==3.10
- vllm==0.9.1
- transformers==4.52.4
- Other dependencies are listed in `requirements.txt`
Our training data is sourced from the following open-source projects:
- LongVideoReason
- Video-R1
- Video-Holmes
- CG-Bench
Training:

```shell
bash examples/agent/train_frame_thinker.sh
```

Merge the FSDP training checkpoints back into a Hugging Face model:

```shell
python merge_script.py \
    --backend fsdp \
    --hf_model_path /path/to/original/hf-model \
    --local_dir /path/to/your/checkpoints \
    --target_dir /path/to/save/merged_hf_model
```

Inference:

```shell
python examples/agent/infer.py
```

We would like to express our sincere gratitude to the open-source community and to the creators of the foundational projects that made this work possible.
Our implementation is built upon the excellent codebases of verl and DeepEyes. Their work provided a strong foundation and significantly accelerated our research. We highly recommend checking out their projects.