Conan: Progressive Learning to Reason Like a Detective over Multi-Scale Visual Evidence

🔥 Updates

🚀 February 20, 2026 - Conan was accepted to CVPR 2026.

🚀 November 17, 2025 — We release the Conan-91k dataset!

🚀 October 25, 2025 — We release the training framework of Conan!

🚀 October 23, 2025 — We release the paper!

🚀 October 20, 2025 — We are excited to release Conan-7B and its accompanying evaluation toolkit, Conan-Eval!

🚀 September 30, 2025 — Conan-SFT-7B has officially landed on Hugging Face!

Introduction

Conan is a Multimodal Large Language Model (MLLM) whose reasoning is modeled on a detective's investigative process. It is designed to:

- Identify multi-scale frames of visual evidence.
- Reason over cross-frame clues to connect information.
- Decide on plausible actions based on its deductions.

🏆 Performance Highlights

(Figure: Conan performance highlights.)

⚙️ Environment Setup

Clone the Repository:

git clone https://github.com/OuyangKun10/Conan.git
cd Conan

Create and Activate Environment:

conda create --name Conan python=3.10
conda activate Conan

Install Dependencies:

cd ms-swift

pip install -e .

🏋️ Training

From the repository root:

cd ms-swift/training_scripts

Textual Reasoning

bash Conan-SFT-Stage1.sh

Multimodal Alignment Reasoning

bash Conan-SFT-Stage2.sh

Vision-centric Reasoning

bash Conan-SFT-Stage3.sh

AIR RLVR

bash Conan-server.sh
bash Conan-AIR-RLVR.sh
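
The three SFT stages above can be chained with a small wrapper. This is a minimal sketch and not part of the repository: the function name `run_sft_stages` is hypothetical, and it assumes that the stages are meant to run in the listed order and that each script exits with a non-zero status on failure. The AIR RLVR step is left out because this README does not say how `Conan-server.sh` and `Conan-AIR-RLVR.sh` should be coordinated.

```shell
# Hypothetical helper (not shipped with Conan): run the three SFT stage
# scripts in order and stop at the first failure.
# Usage: run_sft_stages [directory containing the stage scripts]
run_sft_stages() {
    local dir="${1:-.}"
    local stage

    # Check that every script exists before starting any training.
    for stage in Conan-SFT-Stage1.sh Conan-SFT-Stage2.sh Conan-SFT-Stage3.sh; do
        if [ ! -f "$dir/$stage" ]; then
            echo "missing script: $dir/$stage" >&2
            return 1
        fi
    done

    # Run each stage from inside the script directory; abort on failure.
    for stage in Conan-SFT-Stage1.sh Conan-SFT-Stage2.sh Conan-SFT-Stage3.sh; do
        echo "==> running $stage"
        ( cd "$dir" && bash "$stage" ) || return 1
    done
}
```

From the repository root, this would be invoked as `run_sft_stages ms-swift/training_scripts`.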

📊 Evaluation

The Conan-Eval toolkit supports evaluation on the following benchmarks:

Multi-step Reasoning Benchmarks:
- MMR-V
- Video-Holmes
- VRBench
- VCRBench
- LongVideoReason
- HumanPCR

Long-video Understanding Benchmarks:
- LongVideoBench
- MLVU
- LVBench
- Video-MME

Usage: To run the evaluation, execute:
```
bash run.sh
```

Acknowledgments

We extend our sincere gratitude to the following projects for their invaluable contributions and inspiration:
