Picsart-AI-Research/FlowDIS

FlowDIS: Language-Guided Dichotomous Image Segmentation with Flow Matching

Picsart AI Research (PAIR)


FlowDIS teaser

FlowDIS enables highly accurate foreground segmentation, optionally guided by a text prompt. When ambiguity prevents the model from producing the desired result, the user can specify which elements to retain in the foreground.

News

  • [06/05/2026] 💻 Project page released.
  • [06/05/2026] 📄 Paper has been released on arXiv.
  • [21/02/2026] 🔥 FlowDIS has been accepted to CVPR 2026.

Requirements

  • Python 3.12 (the only version the project is tested on)
  • A CUDA-capable GPU (multi-GPU inference is supported)

Installation

Clone the repository and install the package:

git clone https://github.com/Picsart-AI-Research/FlowDIS
cd FlowDIS
pip install -e .

Models

Model weights are hosted on the Hugging Face Hub at PAIR/FlowDIS. They are downloaded automatically on first run and cached under ~/.cache/huggingface/hub.

To pre-download manually:

from flowdis.util import download_from_hf_hub
root_model_dir = download_from_hf_hub("PAIR/FlowDIS")
print(root_model_dir)
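
To check whether the weights are already cached before a run, the cache folder can be derived from the Hub's standard layout. The sketch below is not part of FlowDIS: `hub_cache_root` and `cached_repo_dir` are hypothetical helper names, and it assumes the default Hugging Face conventions (the `HF_HUB_CACHE`, `HF_HOME`, and `XDG_CACHE_HOME` environment variables, and `models--<org>--<name>` folder naming).

```python
import os
from pathlib import Path


def hub_cache_root(env=None):
    """Return the Hub cache root, honouring HF_HUB_CACHE, HF_HOME and XDG_CACHE_HOME."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    # Default location: ~/.cache/huggingface/hub (or under XDG_CACHE_HOME if set).
    base = Path(env.get("XDG_CACHE_HOME") or "~/.cache").expanduser()
    return base / "huggingface" / "hub"


def cached_repo_dir(repo_id, env=None):
    """Return the folder where a model repo such as "PAIR/FlowDIS" would be cached."""
    return hub_cache_root(env) / ("models--" + repo_id.replace("/", "--"))


if __name__ == "__main__":
    repo_dir = cached_repo_dir("PAIR/FlowDIS")
    status = "present" if repo_dir.is_dir() else "not downloaded yet"
    print(f"{repo_dir} ({status})")
```

A missing folder only means the weights have not been fetched into this cache; it says nothing about a directory passed separately via --root-model-dir.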

Inference

Run inference on a directory of images. If multiple GPUs are available, the workload is automatically split across them.

python inference.py \
    --images-dir /path/to/images \
    --output-dir /path/to/output \
    --prompts-json /path/to/prompts.json \
    --num-steps 2 \
    --resolution 1024
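
One simple way to picture the multi-GPU split is round-robin sharding of the image list, with one shard per device. The sketch below is an illustration only and is not taken from inference.py, whose actual strategy may differ; `shard_round_robin` is a hypothetical helper.

```python
def shard_round_robin(items, num_shards):
    """Split items into num_shards lists; shard i receives items i, i + n, i + 2n, ..."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return [list(items[i::num_shards]) for i in range(num_shards)]


if __name__ == "__main__":
    images = ["a.jpg", "b.png", "c.jpg", "d.jpeg", "e.png"]
    # Hypothetical two-GPU machine: each shard would be processed on its own device.
    for gpu_index, shard in enumerate(shard_round_robin(images, 2)):
        print(f"cuda:{gpu_index} -> {shard}")
```

Round-robin keeps shard sizes within one image of each other, which matters when all images are resized to the same inference resolution and therefore cost roughly the same.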

To use local weights instead of auto-downloading, pass --root-model-dir:

python inference.py \
    --root-model-dir /path/to/models \
    --images-dir /path/to/images \
    --output-dir /path/to/output

Arguments

| Argument | Required | Default | Description |
|---|---|---|---|
| --images-dir | yes | | Directory of input images (.jpg, .jpeg, .png); searched recursively. |
| --output-dir | yes | | Directory where predicted masks (.png) are written. |
| --root-model-dir | no | None | Root directory of pre-downloaded weights. If omitted, weights are fetched from PAIR/FlowDIS on the Hugging Face Hub. |
| --prompts-json | no | None | JSON mapping { "image_filename": "prompt" }. If omitted, empty prompts are used. |
| --num-steps | no | 2 | Number of flow-matching sampling steps. |
| --resolution | no | 1024 | Image resolution used for inference. |
| --num-samples | no | -1 | Limit the number of images processed (-1 means all). |
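
For scripts that wrap or mirror this interface, the documented flags and defaults can be expressed with `argparse`. This is a sketch reconstructed from the argument list above, not the parser used in inference.py; `build_parser` is a hypothetical name and the help strings are paraphrased.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="FlowDIS-style inference arguments (sketch)")
    parser.add_argument("--images-dir", required=True, help="directory of input images")
    parser.add_argument("--output-dir", required=True, help="directory for predicted masks")
    parser.add_argument("--root-model-dir", default=None, help="pre-downloaded weights")
    parser.add_argument("--prompts-json", default=None, help="JSON file mapping image to prompt")
    parser.add_argument("--num-steps", type=int, default=2, help="sampling steps")
    parser.add_argument("--resolution", type=int, default=1024, help="inference resolution")
    parser.add_argument("--num-samples", type=int, default=-1, help="-1 processes all images")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list so the example does not depend on sys.argv.
    args = build_parser().parse_args(["--images-dir", "in", "--output-dir", "out"])
    print(vars(args))
```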

Prompts file format

{
    "image_001.jpg": "a red sports car",
    "image_002.png": "a golden retriever sitting on grass"
}

Pre-generated language prompts for the DIS dataset are available here. Precomputed results for reproducing the paper can be downloaded here.
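
A prompts file for a custom image folder can be generated with a few lines of standard-library Python. The sketch below assumes that keys are bare file names, as in the example above, and that suffix matching is case-insensitive; neither detail is confirmed by the README. `build_prompts` and `write_prompts` are hypothetical helpers.

```python
import json
from pathlib import Path

# Extensions listed for --images-dir; case-insensitive matching is an assumption.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_prompts(images_dir, known_prompts=None):
    """Map each image file name under images_dir to a prompt ("" when none is known)."""
    known_prompts = known_prompts or {}
    prompts = {}
    for path in sorted(Path(images_dir).rglob("*")):  # recursive, like --images-dir
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            prompts[path.name] = known_prompts.get(path.name, "")
    return prompts


def write_prompts(prompts, json_path):
    """Write the mapping as UTF-8 JSON in the layout shown above."""
    text = json.dumps(prompts, indent=4, ensure_ascii=False)
    Path(json_path).write_text(text, encoding="utf-8")
```

Because keys are bare file names in this sketch, two images with the same name in different subfolders would share one entry; renaming such files avoids the ambiguity.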

Demo

An interactive Gradio demo is included under demo/:

python demo/app.py

Hardware requirements:

  • At least 48 GB of GPU memory for inference at 1024×1024 px.
  • At least 80 GB of GPU memory for inference at higher resolutions (such as 2048×2048 px).
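
These guidelines can be turned into a quick pre-flight check. The sketch below encodes only the two figures stated above; applying 48 GB to every resolution up to 1024 and 80 GB to anything larger is an assumption, and both function names are hypothetical. With PyTorch installed, the total memory in bytes can be read from `torch.cuda.get_device_properties(0).total_memory` and divided by 1e9.

```python
def required_gpu_memory_gb(resolution):
    """Return the stated memory guideline in GB for a square inference resolution."""
    # Assumption: 48 GB covers resolutions up to 1024, 80 GB anything larger.
    return 48 if resolution <= 1024 else 80


def fits_on_gpu(total_memory_gb, resolution):
    """Return True when the GPU meets the guideline for the given resolution."""
    return total_memory_gb >= required_gpu_memory_gb(resolution)


if __name__ == "__main__":
    for resolution in (1024, 2048):
        needed = required_gpu_memory_gb(resolution)
        print(f"{resolution}x{resolution}: at least {needed} GB of GPU memory")
```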

Programmatic usage

from PIL import Image
from flowdis import flowdis_predict, load_models

models = load_models(device="cuda")

input_img_path = "path/to/input.jpg"     # Input image path
output_mask_path = "path/to/output.png"  # Path to save the output mask

image = Image.open(input_img_path).convert("RGB")

mask = flowdis_predict(
    image=image,
    prompt="",  # Text prompt
    models=models,
    resolution=1024,
    num_inference_steps=2,
    device="cuda",
)
mask.save(output_mask_path)

License

FlowDIS is licensed under the PicsArt Inc. FlowDIS Model License.

Acknowledgements

This project is built on top of FLUX.1 [schnell] and DIS5K.

BibTeX

If you use our work in your research, please cite our publication:

@article{sargsyan2026flowdis,
  title={{FlowDIS: Language-Guided Dichotomous Image Segmentation with Flow Matching}},
  author={Sargsyan, Andranik and Navasardyan, Shant},
  journal={arXiv preprint arXiv:2605.05077},
  year={2026},
  eprint={2605.05077},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2605.05077}
}
