MagicSeg: Open-World Segmentation Pretraining via Counterfactual Diffusion-Based Auto-Generation

Kaixin Cai*, Pengzhen Ren*, Jianhua Han, Yi Zhu, Hang Xu, Jianzhuang Liu, Xiaodan Liang📧

* equal contribution. 📧 corresponding author.

[arXiv](https://arxiv.org/abs/2603.19575)

💥Updates

  • 2026-03: Our MagicSeg has been officially accepted by TPAMI 2026 🎉🎉🎉.
  • 2026-01: We have released the code of MagicSeg 🤗.

🚀 Examples

Sample Results


Each row shows the original image, counterfactual image, and corresponding mask:

| Original Image | Counterfactual Image | Mask |
| --- | --- | --- |
| img1 | img1-neg | img1-mask |
| img2 | img2-neg | img2-mask |
| img3 | img3-neg | img3-mask |
| img4 | img4-neg | img4-mask |

Overview

  1. Training Code: Based on ZegCLIP

  2. Generation Code: Pipeline for text, image, and mask generation (based on Grounded-SAM)

Project Structure

```
MagicSeg/
├── example/                # Example data
│   ├── images/             # Original images
│   ├── images_neg/         # Negative images (filename + '-neg')
│   └── masks/              # Generated masks
├── generate/               # Generation pipeline
│   ├── generate_texts.py   # Text generation using GPT
│   ├── generate_images.py  # Image generation using SD1.5
│   └── mask_generate.py    # Mask generation using GroundingDINO+SAM
└── train_code/             # Training framework
    └── ZegCLIP-main/       # Modified ZegCLIP codebase
```
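The layout above pairs each original image with a counterfactual ("-neg") image and a generated mask. A minimal sketch of collecting those triplets, assuming masks reuse the original filename (the mask naming is an assumption; adjust to the actual layout):

```python
from pathlib import Path

def example_triplets(root="example"):
    # Pair each original image with its counterfactual "-neg" image
    # and its mask, following the example/ folder layout above.
    root = Path(root)
    triplets = []
    for img in sorted((root / "images").glob("*")):
        neg = root / "images_neg" / f"{img.stem}-neg{img.suffix}"
        mask = root / "masks" / img.name  # assumed: mask keeps the original name
        if neg.exists() and mask.exists():
            triplets.append((img, neg, mask))
    return triplets
```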

Training Configuration

Key Modifications

The training framework has been enhanced with the following features:

1. Dynamic Text Feature Construction

  • File: models/segmentor/zegclip.py
  • Function: forward_train() method
  • Features:
    • Extracts class names from image filenames (max 2 classes separated by '_')
    • Samples additional classes to reach 100 total classes per image
    • Constructs new text_feat with shape [bs, 100, dim]
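As a rough illustration of the filename-to-class logic described above (`build_class_list`, `all_classes`, and `num_total` are placeholder names, not the repository's actual identifiers):

```python
import os
import random

def build_class_list(image_path, all_classes, num_total=100):
    # Filenames encode at most two class names separated by '_',
    # e.g. "cat_dog.jpg" -> ["cat", "dog"].
    stem = os.path.splitext(os.path.basename(image_path))[0]
    present = stem.split("_")[:2]
    # Sample additional distractor classes so every image sees num_total
    # classes in total; encoding these yields text_feat of shape
    # [bs, num_total, dim].
    pool = [c for c in all_classes if c not in present]
    distractors = random.sample(pool, num_total - len(present))
    return present + distractors
```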

2. Contrastive Loss

  • File: models/decode_heads/decode_seg.py
  • Function: Added cosine similarity loss
  • Formula: max(0, cos(cls_token, cls_token_neg))
  • Integration: Added to losses dictionary
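The hinge on cosine similarity above can be sketched in PyTorch as follows (tensor and function names are illustrative, not the repository's):

```python
import torch
import torch.nn.functional as F

def contrastive_loss(cls_token, cls_token_neg):
    # Penalize similarity between the class token of the original image
    # and that of its counterfactual: max(0, cos(cls_token, cls_token_neg)).
    cos = F.cosine_similarity(cls_token, cls_token_neg, dim=-1)
    return torch.clamp(cos, min=0.0).mean()
```

The clamp means the loss is zero once the two tokens point in opposite directions, so training only pushes them apart while they remain positively correlated.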

Training Commands

```shell
bash dist_train.sh configs/magicseg/vpt_seg_fully_vit-b_512x512_20k_12_10.py Path/to/magicseg/fully
```

Generation Pipeline

1. Text Generation

File: generate/generate_texts.py

```shell
python generate_texts.py
```

2. Image Generation

File: generate/generate_images.py

```shell
python generate_images.py
```

3. Mask Generation

File: generate/mask_generate.py

Refer to Grounded-SAM for setup and usage.

Requirements

Refer to ZegCLIP for environment requirements.

Citation

If you use MagicSeg in your research, please cite:

```bibtex
@misc{cai2026magicsegopenworldsegmentationpretraining,
      title={MagicSeg: Open-World Segmentation Pretraining via Counterfactural Diffusion-Based Auto-Generation},
      author={Kaixin Cai and Pengzhen Ren and Jianhua Han and Yi Zhu and Hang Xu and Jianzhuang Liu and Xiaodan Liang},
      year={2026},
      eprint={2603.19575},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2603.19575},
}
```

License

This project is built upon ZegCLIP and Grounded-Segment-Anything. Please refer to the original repositories for licensing information.
