Our method enables fine-grained control over the intensity of semantic attributes in diffusion models through a plug-and-play value encoder.
Existing text encoders cannot interpret numeric intensities or continuous values; AttriCtrl bridges this gap, allowing precise, interpretable adjustment of aesthetic attributes.
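
To make the idea of a value encoder concrete, here is a minimal, purely illustrative sketch of how a scalar intensity could be turned into a vector and attached to a text-conditioning sequence. It is not AttriCtrl's actual architecture: the sinusoidal features, the `[0, 1]` value range, and the names `encode_value` and `append_value_token` are all assumptions made for this example.

```python
import math
from typing import List


def encode_value(value: float, dim: int = 8) -> List[float]:
    """Map a scalar in [0, 1] to a sinusoidal feature vector of length dim.

    Hypothetical stand-in for a learned value encoder; the real model
    may encode values very differently.
    """
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie in [0, 1]")
    if dim <= 0 or dim % 2 != 0:
        raise ValueError("dim must be a positive even number")
    features = []
    for k in range(dim // 2):
        freq = (2.0 ** k) * math.pi  # doubling frequencies
        features.append(math.sin(freq * value))
        features.append(math.cos(freq * value))
    return features


def append_value_token(text_tokens: List[List[float]], value: float) -> List[List[float]]:
    """Return a new sequence with one extra token that carries the value.

    The text tokens themselves are left unchanged, which mirrors the
    plug-and-play idea: conditioning is extended, not rewritten.
    """
    dim = len(text_tokens[0])
    return text_tokens + [encode_value(value, dim)]
```

Nearby values give nearby vectors, so a downstream model can in principle respond smoothly to the input. A raw number written into a prompt offers no such guarantee.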

Examples of controlling individual aesthetic attributes.

Demonstrations of seamless integration with other frameworks.

AttriCtrl is lightweight, model-agnostic, and achieves continuous controllability without modifying the underlying diffusion backbone.

git clone https://github.com/CD22104/AttriCtrl.git
cd AttriCtrl
pip install -e .

import torch
from diffsynth.pipelines.flux_image_new import FluxImagePipeline, ModelConfig
pipe = FluxImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        # FLUX.1-dev base components
        ModelConfig(model_id="black-forest-labs/FLUX.1-dev", origin_file_pattern="flux1-dev.safetensors"),
        ModelConfig(model_id="black-forest-labs/FLUX.1-dev", origin_file_pattern="text_encoder/model.safetensors"),
        ModelConfig(model_id="black-forest-labs/FLUX.1-dev", origin_file_pattern="text_encoder_2/"),
        ModelConfig(model_id="black-forest-labs/FLUX.1-dev", origin_file_pattern="ae.safetensors"),
        # AttriCtrl checkpoint, loaded alongside the unmodified base model
        ModelConfig(model_id="DiffSynth-Studio/AttriCtrl-FLUX.1-Dev", origin_file_pattern="models/detail.safetensors"),
    ],
)
for i in [0.1, 0.3, 0.5, 0.7, 0.9]:
    # Same prompt and seed on every pass; only the control value changes
    image = pipe(prompt="a cat on the beach", seed=2, value_controller_inputs=[i])
    image.save(f"value_control_{i}.jpg")
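
The loop above can be wrapped in a few small helpers for sweeping an attribute over a range. This is a sketch, not part of the AttriCtrl API: `control_values`, `output_name`, and `sweep` are hypothetical names, and `generate` is assumed to be any callable that accepts the same `prompt`, `seed`, and `value_controller_inputs` keyword arguments as `pipe` and returns an object with a `save` method.

```python
from typing import Callable, Iterable, List


def control_values(steps: int, low: float = 0.0, high: float = 1.0) -> List[float]:
    """Return `steps` evenly spaced control values from low to high, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if not low < high:
        raise ValueError("low must be smaller than high")
    span = high - low
    # Round to hide floating-point noise such as 0.30000000000000004
    return [round(low + span * i / (steps - 1), 4) for i in range(steps)]


def output_name(value: float, prefix: str = "value_control") -> str:
    """Build a file name with a fixed-width value so files sort in order."""
    return f"{prefix}_{value:.2f}.jpg"


def sweep(generate: Callable, values: Iterable[float], prompt: str, seed: int) -> List[str]:
    """Generate one image per control value with a fixed prompt and seed."""
    saved = []
    for value in values:
        image = generate(prompt=prompt, seed=seed, value_controller_inputs=[value])
        name = output_name(value)
        image.save(name)
        saved.append(name)
    return saved
```

With the pipeline loaded as above, `sweep(pipe, control_values(5, 0.1, 0.9), "a cat on the beach", seed=2)` would cover the same five values as the original loop. Keeping the seed fixed means the differences between the images come from the control value alone.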