MagicSeg: Open-World Segmentation Pretraining via Counterfactural Diffusion-Based Auto-Generation

Kaixin Cai*, Pengzhen Ren*, Jianhua Han, Yi Zhu, Hang Xu, Jianzhuang Liu, Xiaodan Liang 📧

* Equal contribution. 📧 Corresponding author.

arXiv: https://arxiv.org/abs/2603.19575

💥Updates

2026-03: MagicSeg has been accepted by TPAMI 2026 🎉🎉🎉.
2026-01: We have released the code of MagicSeg 🤗.

🚀 Examples

Sample Results

Each row shows the original image, counterfactual image, and corresponding mask:

Original Image	Counterfactual Image	Mask

Overview

Training Code: Based on ZegCLIP
Generation Code: Pipeline for text, image, and mask generation (based on Grounded-SAM)

Project Structure

MagicSeg/
├── example/                 # Example data
│   ├── images/             # Original images
│   ├── images_neg/         # Negative images (filename + '-neg')
│   └── masks/              # Generated masks
├── generate/               # Generation pipeline
│   ├── generate_texts.py   # Text generation using GPT
│   ├── generate_images.py  # Image generation using SD1.5
│   └── mask_generate.py    # Mask generation using GroundingDINO+SAM
└── train_code/            # Training framework
    └── ZegCLIP-main/      # Modified ZegCLIP codebase
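
The naming convention in the tree above (negative images use the original filename plus '-neg') can be sketched as a small path helper. This is only an illustration: the function name `negative_path` is hypothetical, and it assumes the '-neg' suffix is appended to the file stem, before the extension, and that the negative image keeps the same extension as the original.

```python
from pathlib import Path


def negative_path(image_path, root="example"):
    """Map an original image path to its assumed counterfactual counterpart.

    Assumption: '-neg' is appended to the stem and the extension is unchanged,
    e.g. images/cat_dog.jpg -> images_neg/cat_dog-neg.jpg.
    """
    p = Path(image_path)
    return Path(root) / "images_neg" / f"{p.stem}-neg{p.suffix}"


if __name__ == "__main__":
    print(negative_path("example/images/cat_dog.jpg").as_posix())
```

If the repository stores negatives with a different extension or suffix position, only the f-string in the helper needs to change.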

Training Configuration

Key Modifications

The training framework has been enhanced with the following features:

1. Dynamic Text Feature Construction

File: models/segmentor/zegclip.py
Function: forward_train() method
Features:
- Extracts class names from image filenames (max 2 classes separated by '_')
- Samples additional classes to reach 100 total classes per image
- Constructs new text_feat with shape [bs, 100, dim]
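
The class-selection steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the code in forward_train(): the helper names `classes_from_filename` and `build_class_list`, the `vocabulary` argument, and the handling of a '-neg' suffix are all hypothetical, and the filename stem is assumed to consist only of class names joined by '_'.

```python
import os
import random


def classes_from_filename(path, max_classes=2):
    """Read up to `max_classes` class names from a filename such as 'cat_dog.jpg'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    # Assumption: counterfactual images carry a trailing '-neg' that is not a class name.
    if stem.endswith("-neg"):
        stem = stem[: -len("-neg")]
    names = [part for part in stem.split("_") if part]
    return names[:max_classes]


def build_class_list(path, vocabulary, total=100, rng=None):
    """Return the image's own classes followed by sampled extra classes.

    The result has exactly `total` entries. random.sample raises ValueError
    if the vocabulary is too small to supply the extra classes.
    """
    rng = rng or random.Random()
    positives = classes_from_filename(path)
    pool = [name for name in vocabulary if name not in positives]
    extras = rng.sample(pool, total - len(positives))
    return positives + extras


if __name__ == "__main__":
    vocab = [f"class{i}" for i in range(200)] + ["cat", "dog"]
    names = build_class_list("example/images/cat_dog.jpg", vocab, rng=random.Random(0))
    print(len(names), names[:2])
```

In a training loop, the text embeddings for each image's 100 selected names would then be stacked across the batch, giving the [bs, 100, dim] text_feat described above.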

2. Contrastive Loss

File: models/decode_heads/decode_seg.py
Function: Added cosine similarity loss
Formula: max(0, cos(cls_token, cls_token_neg))
Integration: Added to losses dictionary
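
To make the formula concrete, here is a dependency-free sketch of max(0, cos(cls_token, cls_token_neg)) applied to a batch. It is an illustration rather than the code in decode_seg.py: the function names are hypothetical, the tokens are plain lists of floats instead of tensors, and averaging over the batch is an assumed reduction.

```python
import math


def cosine(a, b, eps=1e-8):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # eps guards against division by zero for all-zero vectors.
    return dot / max(norm_a * norm_b, eps)


def counterfactual_cosine_loss(cls_tokens, cls_tokens_neg):
    """Mean over the batch of max(0, cos(cls_token, cls_token_neg)).

    The loss is zero once the original and counterfactual tokens are
    orthogonal or point in opposite directions, and positive otherwise.
    """
    per_sample = [
        max(0.0, cosine(a, b)) for a, b in zip(cls_tokens, cls_tokens_neg)
    ]
    return sum(per_sample) / len(per_sample)


if __name__ == "__main__":
    batch = [[1.0, 0.0], [1.0, 0.0]]
    batch_neg = [[1.0, 0.0], [-1.0, 0.0]]
    print(counterfactual_cosine_loss(batch, batch_neg))
```

With PyTorch tensors of shape [bs, dim], the same quantity could be written as `torch.nn.functional.cosine_similarity(cls_token, cls_token_neg, dim=-1).clamp(min=0).mean()`, again assuming a mean reduction.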

Training Commands

bash dist_train.sh configs/magicseg/vpt_seg_fully_vit-b_512x512_20k_12_10.py Path/to/magicseg/fully

Generation Pipeline

1. Text Generation

File: generate/generate_texts.py

python generate_texts.py

2. Image Generation

File: generate/generate_images.py

python generate_images.py

3. Mask Generation

File: generate/mask_generate.py

Refer to Grounded-SAM.

Requirements

Refer to ZegCLIP.

Citation

If you use MagicSeg in your research, please cite:

@misc{cai2026magicsegopenworldsegmentationpretraining,
      title={MagicSeg: Open-World Segmentation Pretraining via Counterfactural Diffusion-Based Auto-Generation}, 
      author={Kaixin Cai and Pengzhen Ren and Jianhua Han and Yi Zhu and Hang Xu and Jianzhuang Liu and Xiaodan Liang},
      year={2026},
      eprint={2603.19575},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.19575}, 
}

License

This project is built upon ZegCLIP and Grounded-Segment-Anything. Please refer to the original repositories for licensing information.
