snowball521/Rea2Seg

📝 Reason Twice: Segmentation via Candidate Discovery and Comparative Reasoning

[📄 Arxiv] | [🌐 Project Page]

📋 Overview

This repository contains the official implementation of Rea²Seg. It currently supports two candidate mask generators for inference and evaluation: SESAME and LENS.

The mask evaluator is finetuned from OpenGVLab/InternVL3-8B.
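The split between candidate generators and a mask evaluator can be illustrated with a minimal sketch. It assumes each candidate is scored independently and the highest score wins; the actual evaluator may instead compare candidates jointly. `select_best_mask` and `toy_score` are hypothetical names, and `toy_score` is only a stand-in for the finetuned InternVL scorer.

```python
from typing import Callable, List, Sequence, Tuple

Mask = List[List[int]]  # binary mask as nested lists (1 = foreground)


def select_best_mask(
    candidates: Sequence[Mask],
    score_fn: Callable[[Mask], float],
) -> Tuple[int, float]:
    """Return (index, score) of the highest-scoring candidate mask."""
    if not candidates:
        raise ValueError("at least one candidate mask is required")
    scores = [score_fn(mask) for mask in candidates]
    # On ties, max() keeps the earliest candidate.
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return best_index, scores[best_index]


def toy_score(mask: Mask) -> float:
    """Hypothetical stand-in scorer: fraction of foreground pixels."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total if total else 0.0
```

In the real pipeline the scoring function would wrap the InternVL scorer checkpoint rather than a pixel count.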

📦 Checkpoints

Download the following checkpoints and place them at the target paths below.

| Component | Source | Target Path |
| --- | --- | --- |
| SESAME generator | snowball521/Sesame_Generator | checkpoint/Sesame_Generator |
| CLIP vision tower (SESAME only) | openai/clip-vit-large-patch14-336 | checkpoint/clip-vit-large-patch14-336 |
| LENS CoT generator | OuyBin/LENS_ReasonSeg_CoT | checkpoint/LENS_ReasonSeg_CoT |
| SAM2 large (LENS only) | facebook/sam2-hiera-large | checkpoint/sam2-hiera-large |
| InternVL scorer | snowball521/Internvl3_Scorer_8B | checkpoint/Internvl3_Scorer_8B |
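One possible way to fetch all five checkpoints is sketched below. It assumes the Source entries are Hugging Face Hub repository IDs and that the `huggingface_hub` package is installed; `download_checkpoints` is a hypothetical helper, not a script shipped with this repository.

```python
from pathlib import Path

# Source repo ID -> target path, copied from the checkpoint table.
CHECKPOINTS = {
    "snowball521/Sesame_Generator": "checkpoint/Sesame_Generator",
    "openai/clip-vit-large-patch14-336": "checkpoint/clip-vit-large-patch14-336",
    "OuyBin/LENS_ReasonSeg_CoT": "checkpoint/LENS_ReasonSeg_CoT",
    "facebook/sam2-hiera-large": "checkpoint/sam2-hiera-large",
    "snowball521/Internvl3_Scorer_8B": "checkpoint/Internvl3_Scorer_8B",
}


def download_checkpoints(root: str = ".") -> None:
    """Download every checkpoint into its target path under `root`."""
    # Imported lazily so the mapping can be inspected without the package.
    from huggingface_hub import snapshot_download

    for repo_id, target in CHECKPOINTS.items():
        local_dir = Path(root) / target
        local_dir.mkdir(parents=True, exist_ok=True)
        # Assumption: each source is a model repo on the Hugging Face Hub.
        snapshot_download(repo_id=repo_id, local_dir=str(local_dir))
```

Calling `download_checkpoints()` from the repository root would populate `checkpoint/`. If only one generator is needed, the SESAME-only or LENS-only entries can be dropped from the mapping.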

📚 Benchmarks and Datasets

ReasonSeg-SGDR

ReasonSeg-SGDR is a challenging benchmark for reasoning segmentation. It comprehensively evaluates a model’s perception, grounding, and reasoning abilities across multiple dimensions, including discriminative recognition, spatial reasoning, geometric reasoning, and multi-step reasoning, with fine-grained mask generation.

ReasonSeg-SGDR benchmark table

Download ReasonSeg-SGDR from snowball521/ReasonSeg-SGDR, then place it at:

dataset/reason_seg/ReasonSeg-SGDR

Then run preprocess_reasonseg_sgdr.py to extract it.

ReasonSeg

Download ReasonSeg from Google Drive, then place it at:

dataset/reason_seg/ReasonSeg

Rea²Seg-16K

Rea²Seg-16K is a large-scale CoT-annotated training dataset for reasoning segmentation. It covers diverse reasoning types and mask granularities, spanning object-level to semantic-level.

Download Rea²Seg-16K from snowball521/Rea2Seg-16K, then place it at:

dataset/reason_seg/Rea2Seg-16K

Then run preprocess_rea2seg_16k.py to extract it.
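Before running the demo or evaluation, it may help to confirm that every checkpoint and dataset directory is where this README expects it. The sketch below only checks that the directories exist, not their contents; `missing_dirs` is a hypothetical helper.

```python
from pathlib import Path
from typing import Iterable, List

# Target paths listed in the Checkpoints and Datasets sections.
EXPECTED_DIRS = [
    "checkpoint/Sesame_Generator",
    "checkpoint/clip-vit-large-patch14-336",
    "checkpoint/LENS_ReasonSeg_CoT",
    "checkpoint/sam2-hiera-large",
    "checkpoint/Internvl3_Scorer_8B",
    "dataset/reason_seg/ReasonSeg-SGDR",
    "dataset/reason_seg/ReasonSeg",
    "dataset/reason_seg/Rea2Seg-16K",
]


def missing_dirs(root: str, expected: Iterable[str] = EXPECTED_DIRS) -> List[str]:
    """Return the expected relative paths that are not directories under `root`."""
    base = Path(root)
    return [rel for rel in expected if not (base / rel).is_dir()]
```

For example, `missing_dirs(".")` run from the repository root returns an empty list once everything is in place.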

🛠️ Environments

Three conda environments are used:

  • rea2seg_sesame — SESAME mask generator and demo (Python 3.9).
  • rea2seg_lens — LENS mask generator and demo (Python 3.10).
  • internvl_scorer — InternVL mask scoring for evaluation (Python 3.10).

The demo scripts run entirely inside their respective generator environment. The eval scripts automatically switch between the generator environment and internvl_scorer.

Adjust the PyTorch --index-url for your CUDA version if needed.

SESAME Environment

conda create -n rea2seg_sesame python=3.9 -y
conda activate rea2seg_sesame
pip install pybind11==2.11.1
pip install -r requirements_rea2seg_sesame.txt

LENS Environment

conda create -n rea2seg_lens python=3.10 -y
conda activate rea2seg_lens
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements_rea2seg_lens.txt

InternVL Scorer Environment

conda create -n internvl_scorer python=3.10 -y
conda activate internvl_scorer
pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements_internvl_scorer.txt
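After creating the LENS or scorer environment, a quick sanity check can confirm that the pinned PyTorch release was installed and that CUDA is visible. This is a minimal sketch; `pinned_version_ok` and `check_torch` are hypothetical helpers, and `check_torch` must be run inside the activated environment.

```python
def pinned_version_ok(installed: str, pinned: str) -> bool:
    """True if `installed` (e.g. '2.5.1+cu124') matches the pinned release."""
    # Drop the local build tag after '+' before comparing.
    return installed.split("+", 1)[0] == pinned


def check_torch(pinned: str = "2.5.1") -> dict:
    """Report the installed PyTorch version and CUDA status."""
    import torch  # only available inside the activated environment

    return {
        "version": str(torch.__version__),
        "matches_pin": pinned_version_ok(str(torch.__version__), pinned),
        "cuda_build": torch.version.cuda,  # CUDA version torch was built for
        "cuda_available": torch.cuda.is_available(),
    }
```

If `cuda_available` is false while a GPU is present, the wheel was probably built for a different CUDA version than the installed driver supports, which is the case where the `--index-url` should be adjusted.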

🚀 Demo

# SESAME demo
conda activate rea2seg_sesame
python demo_rea2seg_sesame.py --save_logs

# LENS demo
conda activate rea2seg_lens
python demo_rea2seg_lens.py --save_logs

The --save_logs flag saves intermediate outputs for all candidate masks, along with visualizations of:

  • Attention maps from prompt tokens to image tokens inside the mask generator;
  • Visual feature similarity maps within SAM's ViT encoder.

📊 Evaluation

Evaluation outputs are written to eval_log/ by default.

The evaluation covers:

  • all ReasonSeg-SGDR categories: discriminative, geometric, multi-step, and spatial;
  • ReasonSeg test and val splits.

Evaluate SESAME + InternVL

The script generates candidate masks in rea2seg_sesame and then switches to internvl_scorer for scoring:

bash eval_rea2seg_sesame.sh

Evaluate LENS + InternVL

The script generates candidate masks in rea2seg_lens and then switches to internvl_scorer for scoring:

bash eval_rea2seg_lens.sh
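To run both evaluations back to back and then see what was produced, a small driver along these lines could be used. It assumes `bash` and `conda` are available in the calling shell, since the scripts switch environments themselves; `run_evals` and `list_eval_outputs` are hypothetical helpers, and the layout inside eval_log/ is not assumed.

```python
import subprocess
from pathlib import Path
from typing import List

# Evaluation scripts named in this README.
EVAL_SCRIPTS = ["eval_rea2seg_sesame.sh", "eval_rea2seg_lens.sh"]


def run_evals(repo_root: str = ".") -> None:
    """Run each evaluation script in turn, stopping at the first failure."""
    for script in EVAL_SCRIPTS:
        subprocess.run(["bash", script], cwd=repo_root, check=True)


def list_eval_outputs(repo_root: str = ".", log_dir: str = "eval_log") -> List[str]:
    """Return all files under the log directory, relative to it, sorted."""
    base = Path(repo_root) / log_dir
    if not base.is_dir():
        return []
    return sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
    )
```

Because `check=True` raises on a non-zero exit code, a failure in the SESAME evaluation prevents the LENS evaluation from starting.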

🙏 Acknowledgements

This project is based on the following open-source projects:

📖 Citation

Feel free to open issues if you have any questions. If you find our project helpful, please give us a star and cite our work.

@misc{gao2026reasontwicesegmentationcandidate,
      title={Reason Twice: Segmentation via Candidate Discovery and Comparative Reasoning}, 
      author={Xinyan Gao and Haoran Hao and Xiangyu Yue},
      year={2026},
      eprint={2606.09303},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2606.09303}, 
}
