MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping
MegaStyle is a novel and scalable data curation pipeline that is the first to exploit the consistent T2I style mapping ability of current large generative models to construct an intra-style-consistent, inter-style-diverse, and high-quality style dataset.
Your star is our fuel! We're revving up the engines with it! Check out our project page for more visual results!
- A more diverse and larger-scale style dataset.
MegaStyle-1.4M is a large-scale style dataset built through a scalable pipeline that leverages consistent text-to-image style mapping of Qwen-Image. It combines 170K curated style prompts with 400K content prompts to generate 1.4M high-quality images that share strong intra-style consistency while covering diverse fine-grained styles.
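The crossing of style prompts with content prompts can be sketched as follows. This is an illustrative toy, not MegaStyle's actual code: the prompt lists, the `make_t2i_prompts` helper, and the prompt template are all assumptions standing in for the 170K curated style prompts and 400K content prompts.

```python
# Illustrative sketch: crossing style prompts with content prompts so that
# images generated with the same style prompt share intra-style consistency.
from itertools import product

style_prompts = ["ukiyo-e woodblock print", "low-poly 3D render"]   # stands in for 170K curated styles
content_prompts = ["a cat on a windowsill", "a lighthouse at dusk"]  # stands in for 400K content prompts

def make_t2i_prompts(styles, contents):
    """Pair every style with every content prompt (hypothetical template)."""
    return [f"{content}, in the style of {style}"
            for style, content in product(styles, contents)]

prompts = make_t2i_prompts(style_prompts, content_prompts)
print(len(prompts))  # 2 styles x 2 contents = 4 prompts
```

At full scale, the same cross-product idea pairs each curated style with many content prompts, which is how a comparatively small set of style prompts yields 1.4M images.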

Trained on MegaStyle-1.4M, MegaStyle-FLUX and MegaStyle-Encoder are our models for generalizable style transfer and reliable style similarity measurement.
git clone git@github.com:Tencent/MegaStyle.git
cd ./MegaStyle
conda create -n megastyle python=3.10
conda activate megastyle
pip install diffsynth==1.1.8
- Download the pretrained models of SigLIP and FLUX.1-dev.
- Place the downloaded models under ./models/.
For image style transfer, we provide 50 reference style images from StyleBench in ./ref_styles:
python inference.py --ckpt_path models/megastyle_flux.safetensors --ref_path ./ref_styles
For computing style score:
python style_score.py --ckpt_path models/megastyle_encoder.pth --real_image_path <path/to/image.png> --fake_image_path <path/to/image.png>
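Under the hood, a style score of this kind reduces to comparing style embeddings of the two images. The sketch below illustrates the metric with cosine similarity on toy vectors; the vectors and the `style_score` helper are assumptions, and the real pipeline would substitute MegaStyle-Encoder features for them.

```python
# Minimal sketch of a style score as cosine similarity between style embeddings.
# The toy vectors below stand in for encoder outputs; this only illustrates the metric.
import numpy as np

def style_score(emb_a: np.ndarray, emb_b: np.ndarray) -> float:
    """Cosine similarity in [-1, 1]; higher means more similar styles."""
    a = emb_a / np.linalg.norm(emb_a)
    b = emb_b / np.linalg.norm(emb_b)
    return float(a @ b)

ref = np.array([0.6, 0.8, 0.0])  # toy embedding of the real (reference) image
gen = np.array([0.6, 0.8, 0.0])  # toy embedding of the generated image
print(round(style_score(ref, gen), 3))  # identical embeddings -> 1.0
```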
To train a style transfer model with paired supervision, please download our style dataset, MegaStyle-1.4M, and start training with:
bash FLUX.1-dev.sh # FLUX.1-dev-npu.sh for npu
All assets and code are released under the repository's license unless specified otherwise.
If this work is helpful for your research, please consider citing the following BibTeX entry.
@article{gao2026megastyle,
title={MegaStyle: Constructing Diverse and Scalable Style Dataset via Consistent Text-to-Image Style Mapping},
author={Gao, Junyao and Liu, Sibo and Li, Jiaxing and Sun, Yanan and Tu, Yuanpeng and Shen, Fei and Zhang, Weidong and Zhao, Cairong and Zhang, Jun},
journal={arXiv preprint arXiv:2604.08364},
year={2026}
}
The code is built upon DiffSynth-Studio.
