FlowDIS enables highly accurate foreground segmentation, optionally guided by a text prompt. When ambiguity prevents the model from producing the desired result, the user can specify which elements to retain in the foreground.
- [06/05/2026] 💻 Project page released.
- [06/05/2026] 📄 Paper released on arXiv.
- [21/02/2026] 🔥 FlowDIS has been accepted to CVPR 2026.
- Python 3.12 (the project is tested on 3.12)
- A CUDA-capable GPU (multi-GPU inference is supported)
Clone the repository and install the package:
```shell
git clone https://github.com/Picsart-AI-Research/FlowDIS
cd FlowDIS
pip install -e .
```

Model weights are hosted on the Hugging Face Hub at `PAIR/FlowDIS`. They are downloaded automatically on first run and cached under `~/.cache/huggingface/hub`.
To pre-download manually:
```python
from flowdis.util import download_from_hf_hub

root_model_dir = download_from_hf_hub("PAIR/FlowDIS")
print(root_model_dir)
```

Run inference on a directory of images. If multiple GPUs are available, the workload is automatically split across them.
```shell
python inference.py \
  --images-dir /path/to/images \
  --output-dir /path/to/output \
  --prompts-json /path/to/prompts.json \
  --num-steps 2 \
  --resolution 1024
```

To use local weights instead of auto-downloading, pass `--root-model-dir`:
```shell
python inference.py \
  --root-model-dir /path/to/models \
  --images-dir /path/to/images \
  --output-dir /path/to/output
```

| Argument | Required | Default | Description |
|---|---|---|---|
| `--images-dir` | yes | – | Directory of input images (`.jpg`, `.jpeg`, `.png`); searched recursively. |
| `--output-dir` | yes | – | Directory where predicted masks (`.png`) are written. |
| `--root-model-dir` | no | `None` | Root directory of pre-downloaded weights. If omitted, weights are fetched from `PAIR/FlowDIS` on the Hugging Face Hub. |
| `--prompts-json` | no | `None` | JSON mapping `{ "image_filename": "prompt" }`. If omitted, empty prompts are used. |
| `--num-steps` | no | `2` | Number of flow-matching sampling steps. |
| `--resolution` | no | `1024` | Image resolution used for inference. |
| `--num-samples` | no | `-1` | Limit on the number of images processed (`-1` means all). |
Example `prompts.json`:

```json
{
  "image_001.jpg": "a red sports car",
  "image_002.png": "a golden retriever sitting on grass"
}
```

Pre-generated language prompts for the DIS dataset are available here. Precomputed results for reproducing the paper can be downloaded here.
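If you have a directory of images but no prompts yet, a file in this format can be generated and filled in afterwards. A minimal sketch (`build_prompts_json` is a hypothetical helper; empty-string prompts match the CLI's default behavior when `--prompts-json` is omitted):

```python
import json
from pathlib import Path


def build_prompts_json(images_dir: str, out_path: str) -> dict:
    """Write a prompts.json mapping every image filename to an empty prompt.

    Edit the values afterwards to add per-image text prompts.
    """
    exts = {".jpg", ".jpeg", ".png"}  # extensions accepted by inference.py
    prompts = {
        p.name: ""
        for p in sorted(Path(images_dir).rglob("*"))
        if p.suffix.lower() in exts
    }
    Path(out_path).write_text(json.dumps(prompts, indent=2))
    return prompts
```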
An interactive Gradio demo is included under `demo/`:

```shell
python demo/app.py
```

Hardware requirements:
- At least 48 GB of GPU memory for inference at 1024×1024 px.
- At least 80 GB of GPU memory for inference at higher resolutions (such as 2048×2048 px).
```python
from PIL import Image

from flowdis import flowdis_predict, load_models

# Load all model components onto the GPU once; reuse across images.
models = load_models(device="cuda")

input_img_path = "path/to/input.jpg"     # Input image path
output_mask_path = "path/to/output.png"  # Path to save the output mask

image = Image.open(input_img_path).convert("RGB")
mask = flowdis_predict(
    image=image,
    prompt="",  # Optional text prompt guiding the segmentation
    models=models,
    resolution=1024,
    num_inference_steps=2,
    device="cuda",
)
mask.save(output_mask_path)
```

FlowDIS is licensed under the PicsArt Inc. FlowDIS Model License.
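The same API can be looped over a directory while reusing the loaded models. The sketch below assumes the `load_models` and `flowdis_predict` calls shown above; `segment_directory` and `mask_path_for` are hypothetical helpers, and for brevity only `.jpg` inputs are globbed (the CLI also accepts `.jpeg` and `.png`):

```python
from pathlib import Path


def mask_path_for(image_path: Path, output_dir: Path) -> Path:
    # Predicted masks are written as .png files named after the input image.
    return output_dir / f"{image_path.stem}.png"


def segment_directory(images_dir: str, output_dir: str, prompt: str = "") -> None:
    # Heavy imports stay inside the function so mask_path_for can be
    # imported and tested without flowdis installed.
    from PIL import Image
    from flowdis import flowdis_predict, load_models

    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    models = load_models(device="cuda")  # load once, reuse for every image
    for img_path in sorted(Path(images_dir).glob("*.jpg")):
        image = Image.open(img_path).convert("RGB")
        mask = flowdis_predict(
            image=image,
            prompt=prompt,
            models=models,
            resolution=1024,
            num_inference_steps=2,
            device="cuda",
        )
        mask.save(mask_path_for(img_path, out))
```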
This project is built on top of FLUX.1 [schnell] and DIS5K.
If you use our work in your research, please cite our publication:
```bibtex
@article{sargsyan2026flowdis,
  title={{FlowDIS: Language-Guided Dichotomous Image Segmentation with Flow Matching}},
  author={Sargsyan, Andranik and Navasardyan, Shant},
  journal={arXiv preprint arXiv:2605.05077},
  year={2026},
  eprint={2605.05077},
  archivePrefix={arXiv},
  url={https://arxiv.org/abs/2605.05077}
}
```