DOS: Directional Object Separation in Text Embeddings (AAAI 2026)

This repository contains the official implementation of the paper: DOS: Directional Object Separation in Text Embeddings for Multi-Object Image Generation

Prerequisites

The code was tested on an RTX 3090 and should work on other GPUs with at least 24 GB of VRAM.

conda env create --file environment.yaml
conda activate dos
pip install -e .

Note: Enter a valid Hugging Face token in ./configs/envs.py before running the diffusion models.

How to test

Please refer to the following notebooks:

SDXL: ./notebooks/test_dos_sdxl.ipynb
SD3.5: ./notebooks/test_dos_sd3.5.ipynb

How to run benchmark

DATASET="similar_shapes"

# SDXL
python run_benchmark.py \
--device cuda:0 \
--output_path outputs/performance_comparison/${DATASET}/sdxl \
--dataset ${DATASET} \
--method sdxl \
--seed_range 1 5

# SDXL with DOS
python run_benchmark.py \
--device cuda:0 \
--output_path outputs/performance_comparison/${DATASET}/sdxl_with_dos \
--dataset ${DATASET} \
--method sdxl_with_dos \
--lambda_sep 1.0 \
--seed_range 1 5
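
To run both configurations on one dataset without retyping the flags, the two commands above can be assembled from Python. This is a minimal sketch, not part of the repository: the helper names `build_benchmark_command` and `benchmark_commands` are hypothetical, and the sketch assumes only the flags and output-path layout shown in the commands above.

```python
import shlex


def build_benchmark_command(dataset, method, seed_start=1, seed_end=5,
                            device="cuda:0", lambda_sep=None):
    """Assemble the argument list for one run_benchmark.py invocation.

    Hypothetical helper: it only mirrors the flags shown in this README.
    """
    cmd = [
        "python", "run_benchmark.py",
        "--device", device,
        # Output folder follows the README layout: .../<dataset>/<method>
        "--output_path", f"outputs/performance_comparison/{dataset}/{method}",
        "--dataset", dataset,
        "--method", method,
    ]
    # --lambda_sep appears only in the DOS command in this README.
    if lambda_sep is not None:
        cmd += ["--lambda_sep", str(lambda_sep)]
    cmd += ["--seed_range", str(seed_start), str(seed_end)]
    return cmd


def benchmark_commands(dataset):
    """Return the baseline and DOS commands for one dataset."""
    return [
        build_benchmark_command(dataset, "sdxl"),
        build_benchmark_command(dataset, "sdxl_with_dos", lambda_sep=1.0),
    ]


if __name__ == "__main__":
    for cmd in benchmark_commands("similar_shapes"):
        # Print the command for inspection; to execute it instead, use
        # subprocess.run(cmd, check=True) from the repository root.
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check them against the shell versions before launching a long run.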

How to evaluate

We measure the Success Rate (SR) and Mixture Rate (MR) of the generated images, using gpt-4o-mini as the judge.

python evaluate_with_vlm.py \
--folder outputs/performance_comparison/similar_shapes/sdxl \
--model openai/gpt-4o-mini \
--api_key {your_open_router_api_key}

For more details, please refer to the script evaluate_with_vlm.py.
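
As a rough illustration of how two rates like SR and MR could be aggregated, the sketch below averages per-image verdicts. It is an assumption-laden example rather than the logic of evaluate_with_vlm.py: the `Verdict` class, its two boolean fields, and the `summarize` function are hypothetical, and the actual definitions of SR and MR are those given in the paper and the script.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    """Hypothetical per-image judgment returned by the VLM judge."""
    success: bool  # assumed: the image depicts all prompted objects correctly
    mixture: bool  # assumed: the image shows objects blended together


def summarize(verdicts):
    """Return SR and MR as fractions of images, under the assumptions above."""
    n = len(verdicts)
    if n == 0:
        raise ValueError("no verdicts to summarize")
    # Booleans sum as 0/1, so each sum counts the images flagged True.
    return {
        "SR": sum(v.success for v in verdicts) / n,
        "MR": sum(v.mixture for v in verdicts) / n,
    }


if __name__ == "__main__":
    example = [
        Verdict(success=True, mixture=False),
        Verdict(success=False, mixture=True),
        Verdict(success=True, mixture=False),
        Verdict(success=False, mixture=False),
    ]
    print(summarize(example))
```

Under this reading, a higher SR and a lower MR would both indicate better object separation.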

Citation

@article{byun2025directional,
  title={DOS: Directional Object Separation in Text Embeddings for Multi-Object Image Generation},
  author={Byun, Dongnam and Park, Jungwon and Ko, Jumgmin and Choi, Changin and Rhee, Wonjong},
  journal={arXiv preprint arXiv:2510.14376},
  year={2025}
}
