📝 Reason Twice: Segmentation via Candidate Discovery and Comparative Reasoning

📋 Overview

This repository contains the official implementation of Rea²Seg, currently supporting two types of candidate mask generators for inference and evaluation:

SESAME / LLaVA-v1.5-7B: trained from scratch based on see-say-segment/sesame.
LENS / Qwen2.5-VL-3B: uses weights trained by hustvl/LENS, with facebook/sam2-hiera-large.

The mask evaluator is finetuned from OpenGVLab/InternVL3-8B.
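The two-stage design described above (a generator proposes candidate masks, an evaluator chooses among them) can be sketched as follows. This is a conceptual illustration only: `Candidate`, `select_best`, and `toy_scorer` are hypothetical names, the toy scorer stands in for the finetuned InternVL3-8B evaluator, and scoring each candidate independently is a simplifying assumption, since this README does not describe how the evaluator compares candidates.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Mask = List[List[int]]  # binary mask as nested lists (1 = foreground)


@dataclass
class Candidate:
    """One candidate mask proposed by a generator (hypothetical container)."""
    name: str
    mask: Mask


def select_best(candidates: Sequence[Candidate],
                scorer: Callable[[Candidate], float]) -> Candidate:
    """Return the candidate that receives the highest score from `scorer`."""
    if not candidates:
        raise ValueError("at least one candidate mask is required")
    return max(candidates, key=scorer)


def foreground_area(mask: Mask) -> int:
    """Count foreground pixels in a binary mask."""
    return sum(sum(row) for row in mask)


def toy_scorer(candidate: Candidate) -> float:
    """Toy stand-in for the learned evaluator: prefers masks covering 2 pixels."""
    return -abs(foreground_area(candidate.mask) - 2)


candidates = [
    Candidate("too_small", [[1, 0], [0, 0]]),
    Candidate("good", [[1, 1], [0, 0]]),
    Candidate("too_large", [[1, 1], [1, 1]]),
]
best = select_best(candidates, toy_scorer)
print(best.name)
```

Keeping the scorer as a plain callable means either generator (SESAME or LENS) could feed the same selection step in this sketch.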

📦 Checkpoints

Download the following checkpoints and place them at the target paths below.

Component	Source	Target Path
SESAME generator	snowball521/Sesame_Generator	`checkpoint/Sesame_Generator`
CLIP vision tower (SESAME only)	openai/clip-vit-large-patch14-336	`checkpoint/clip-vit-large-patch14-336`
LENS CoT generator	OuyBin/LENS_ReasonSeg_CoT	`checkpoint/LENS_ReasonSeg_CoT`
SAM2 large (LENS only)	facebook/sam2-hiera-large	`checkpoint/sam2-hiera-large`
InternVL scorer	snowball521/Internvl3_Scorer_8B	`checkpoint/Internvl3_Scorer_8B`
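As a convenience, the layout in the table above can be checked with a short script before running anything. This is a minimal sketch, not part of the repository: `missing_checkpoints` and `download_missing` are hypothetical helpers, and `download_missing` assumes every source in the table is a Hugging Face Hub repo id that `huggingface_hub.snapshot_download` can fetch.

```python
from pathlib import Path

# Source -> target path, copied from the table above.
CHECKPOINTS = {
    "snowball521/Sesame_Generator": "checkpoint/Sesame_Generator",
    "openai/clip-vit-large-patch14-336": "checkpoint/clip-vit-large-patch14-336",
    "OuyBin/LENS_ReasonSeg_CoT": "checkpoint/LENS_ReasonSeg_CoT",
    "facebook/sam2-hiera-large": "checkpoint/sam2-hiera-large",
    "snowball521/Internvl3_Scorer_8B": "checkpoint/Internvl3_Scorer_8B",
}


def missing_checkpoints(root: Path = Path(".")) -> list:
    """Return the sources whose target directory is absent or empty."""
    missing = []
    for source, target in CHECKPOINTS.items():
        path = root / target
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(source)
    return missing


def download_missing(root: Path = Path(".")) -> None:
    """Fetch missing checkpoints, ASSUMING each source is a Hugging Face Hub repo id."""
    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    for source in missing_checkpoints(root):
        snapshot_download(repo_id=source, local_dir=str(root / CHECKPOINTS[source]))


if __name__ == "__main__":
    for source in missing_checkpoints():
        print(f"missing: {source} -> {CHECKPOINTS[source]}")
```

Run it from the repository root; it only reports missing directories unless `download_missing()` is called explicitly.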

📚 Benchmarks and Datasets

ReasonSeg-SGDR

ReasonSeg-SGDR is a challenging benchmark for reasoning segmentation. It comprehensively evaluates a model’s perception, grounding, and reasoning abilities across multiple dimensions, including discriminative recognition, spatial reasoning, geometric reasoning, and multi-step reasoning, with fine-grained mask generation.

Download ReasonSeg-SGDR from snowball521/ReasonSeg-SGDR, then place it at:

dataset/reason_seg/ReasonSeg-SGDR

Then run `preprocess_reasonseg_sgdr.py` to extract it.

ReasonSeg

Download ReasonSeg from Google Drive, then place it at:

dataset/reason_seg/ReasonSeg

Rea²Seg-16K

Rea²Seg-16K is a large-scale CoT-annotated training dataset for reasoning segmentation. It covers diverse reasoning types and mask granularities, spanning object-level to semantic-level.

Download Rea²Seg-16K from snowball521/Rea2Seg-16K, then place it at:

dataset/reason_seg/Rea2Seg-16K

Then run `preprocess_rea2seg_16k.py` to extract it.

🛠️ Environments

Three conda environments are used:

rea2seg_sesame — SESAME mask generator and demo (Python 3.9).
rea2seg_lens — LENS mask generator and demo (Python 3.10).
internvl_scorer — InternVL mask scoring for evaluation (Python 3.10).

Each demo script runs entirely inside its own generator environment. The eval scripts automatically switch between the generator environment and internvl_scorer.
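One common way to run consecutive stages in different conda environments is `conda run -n <env> ...`. The sketch below illustrates that pattern only; it is an assumption, not a description of how the eval scripts in this repository switch environments. The helper `conda_run` and the two script names in `STAGES` are hypothetical placeholders.

```python
import shlex
from typing import List, Sequence


def conda_run(env: str, argv: Sequence[str]) -> List[str]:
    """Build a command that runs `argv` inside the conda environment `env`."""
    return ["conda", "run", "--no-capture-output", "-n", env, *argv]


# Hypothetical two-stage pipeline. The script names are placeholders and do
# NOT reflect the actual contents of the eval scripts.
STAGES = [
    ("rea2seg_sesame", ["python", "generate_candidates.py"]),  # placeholder
    ("internvl_scorer", ["python", "score_candidates.py"]),    # placeholder
]

commands = [conda_run(env, argv) for env, argv in STAGES]
for cmd in commands:
    # Printing only; pass `cmd` to subprocess.run(cmd, check=True) to execute.
    print(shlex.join(cmd))
```

Building the commands as lists avoids shell quoting issues and keeps each stage's environment explicit.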

Adjust the PyTorch --index-url for your CUDA version if needed.
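The `cu124` suffix in the install commands below corresponds to CUDA 12.4. A minimal sketch of deriving the index URL for another CUDA version follows; `pytorch_index_url` and `pip_install_command` are hypothetical helpers, and not every CUDA tag is published for every PyTorch release, so check the PyTorch installation page for the wheels available for the pinned version.

```python
from typing import List, Sequence


def pytorch_index_url(cuda_version: str) -> str:
    """Map a CUDA version such as "12.1" to the matching PyTorch wheel index URL."""
    major, minor = cuda_version.split(".")[:2]
    return f"https://download.pytorch.org/whl/cu{int(major)}{int(minor)}"


def pip_install_command(packages: Sequence[str], cuda_version: str) -> List[str]:
    """Assemble a pip command that installs `packages` from the CUDA-specific index."""
    return ["pip", "install", *packages, "--index-url", pytorch_index_url(cuda_version)]


cmd = pip_install_command(["torch==2.5.1", "torchvision==0.20.1"], "12.1")
print(" ".join(cmd))
```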

SESAME Environment

conda create -n rea2seg_sesame python=3.9 -y
conda activate rea2seg_sesame
pip install pybind11==2.11.1
pip install -r requirements_rea2seg_sesame.txt

LENS Environment

conda create -n rea2seg_lens python=3.10 -y
conda activate rea2seg_lens
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements_rea2seg_lens.txt

InternVL Scorer Environment

conda create -n internvl_scorer python=3.10 -y
conda activate internvl_scorer
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements_internvl_scorer.txt

🚀 Demo

conda activate rea2seg_sesame
python demo_rea2seg_sesame.py --save_logs

conda activate rea2seg_lens
python demo_rea2seg_lens.py --save_logs

The `--save_logs` flag saves intermediate outputs for all candidate masks, along with visualizations of:

Attention maps from prompt tokens to image tokens inside the mask generator;
Visual feature similarity maps within SAM's ViT encoder.
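To make the second item concrete: a feature similarity map is often computed as the cosine similarity between one patch embedding and every other patch embedding in the encoder's feature grid. The sketch below shows that idea on toy data. It is an assumption about the general technique, not the repository's visualization code, and `cosine` and `similarity_map` are hypothetical names.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def similarity_map(features: List[List[Vector]], row: int, col: int) -> List[List[float]]:
    """Cosine similarity between the patch at (row, col) and every patch in the grid."""
    query = features[row][col]
    return [[cosine(query, patch) for patch in grid_row] for grid_row in features]


# 2x2 grid of toy 2-D patch features.
features = [
    [[1.0, 0.0], [2.0, 0.0]],
    [[0.0, 1.0], [-1.0, 0.0]],
]
sim = similarity_map(features, 0, 0)
for sim_row in sim:
    print(sim_row)
```

In practice the grid would hold high-dimensional ViT patch embeddings and the resulting map would be upsampled and overlaid on the image.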

📊 Evaluation

Evaluation outputs are written to eval_log/ by default.

The evaluation covers:

all ReasonSeg-SGDR categories: discriminative, geometric, multi-step, and spatial;
ReasonSeg test and val splits.

Evaluate SESAME + InternVL

The script generates candidate masks in rea2seg_sesame and then switches to internvl_scorer for scoring:

bash eval_rea2seg_sesame.sh

Evaluate LENS + InternVL

The script generates candidate masks in rea2seg_lens and then switches to internvl_scorer for scoring:

bash eval_rea2seg_lens.sh
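Results on ReasonSeg are commonly reported as gIoU (the mean of per-image IoUs) and cIoU (cumulative intersection over cumulative union). Whether the scripts above report exactly these metrics is an assumption; the sketch below only illustrates how the two differ, using hypothetical helper names and toy masks.

```python
from typing import List, Sequence, Tuple

Mask = Sequence[Sequence[int]]


def intersection_union(pred: Mask, gt: Mask) -> Tuple[int, int]:
    """Pixel counts of the intersection and union of two binary masks."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter, union


def giou_ciou(pairs: List[Tuple[Mask, Mask]]) -> Tuple[float, float]:
    """gIoU: mean of per-image IoUs. cIoU: total intersection / total union."""
    if not pairs:
        raise ValueError("at least one (prediction, ground truth) pair is required")
    per_image = []
    total_inter = total_union = 0
    for pred, gt in pairs:
        inter, union = intersection_union(pred, gt)
        # Assumed convention: empty prediction on an empty target counts as IoU 1.
        per_image.append(inter / union if union else 1.0)
        total_inter += inter
        total_union += union
    giou = sum(per_image) / len(per_image)
    ciou = total_inter / total_union if total_union else 1.0
    return giou, ciou


pairs = [
    ([[1, 1], [0, 0]], [[1, 0], [0, 0]]),  # IoU = 1/2
    ([[1, 1], [1, 1]], [[1, 1], [1, 1]]),  # IoU = 4/4
]
giou, ciou = giou_ciou(pairs)
print(giou, ciou)
```

Because cIoU pools pixels across images, large objects dominate it, while gIoU weights every image equally.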

🙏 Acknowledgements

This project is based on the following open-source projects:

📖 Citation

Feel free to open issues if you have any questions. If you find our project helpful, please give us a star and cite our work.

@misc{gao2026reasontwicesegmentationcandidate,
      title={Reason Twice: Segmentation via Candidate Discovery and Comparative Reasoning}, 
      author={Xinyan Gao and Haoran Hao and Xiangyu Yue},
      year={2026},
      eprint={2606.09303},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2606.09303}, 
}
