fix: correct batch handling in _prepare_image_ids for cond_model_input #1

Open
tanishaj12 wants to merge 1 commit into main from tanishaj12-patch-1

@tanishaj12
Owner

What does this PR do?

Fixes huggingface#13811

Description

In train_dreambooth_lora_flux2_img2img.py, _prepare_image_ids was called with a list of B single-image tensors, one per batch element. The function treats each list entry as an additional reference image for a single sample, so it assigned escalating temporal IDs (T=10, T=20, T=30...) across batch elements.

Since each batch element is an independent training sample with exactly one conditional image, all samples should receive the same temporal ID (T=10). The fix computes IDs for a single sample and expands them across the batch dimension.

Before (buggy):

cond_model_input_list = [cond_model_input[i].unsqueeze(0) for i in range(cond_model_input.shape[0])]
cond_model_input_ids = Flux2Pipeline._prepare_image_ids(cond_model_input_list).to(
    device=cond_model_input.device
)
cond_model_input_ids = cond_model_input_ids.view(
    cond_model_input.shape[0], -1, model_input_ids.shape[-1]
)

After (fixed):

cond_model_input_ids = Flux2Pipeline._prepare_image_ids(
    [cond_model_input[0:1]]
).to(device=cond_model_input.device)
cond_model_input_ids = cond_model_input_ids.expand(
    cond_model_input.shape[0], -1, -1
)
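
To make the difference concrete, here is a minimal sketch in plain Python. `prepare_image_ids_sketch` is a hypothetical stand-in, not the diffusers implementation: it only assumes that the i-th list entry receives temporal ID `scale * (i + 1)`, matching the T=10, T=20, T=30... pattern described above. Nested lists stand in for the `(B, N, 4)` ID tensor.

```python
from itertools import product


def prepare_image_ids_sketch(image_shapes, scale=10):
    """Hypothetical stand-in for Flux2Pipeline._prepare_image_ids.

    Assumption: the i-th image in the list gets temporal ID scale * (i + 1).
    Returns a flat list of (t, h, w, l) tuples covering all images.
    """
    ids = []
    for i, (height, width) in enumerate(image_shapes):
        t = scale * (i + 1)
        ids.extend((t, h, w, 0) for h, w in product(range(height), range(width)))
    return ids


batch_size, height, width = 3, 2, 2
tokens_per_image = height * width

# Buggy pattern: one list entry per batch element, then split back into B rows.
flat = prepare_image_ids_sketch([(height, width)] * batch_size)
buggy = [
    flat[b * tokens_per_image:(b + 1) * tokens_per_image]
    for b in range(batch_size)
]
buggy_t = [sorted({t for t, _, _, _ in sample}) for sample in buggy]

# Fixed pattern: IDs computed for one sample and shared by every batch element.
single = prepare_image_ids_sketch([(height, width)])
fixed = [single for _ in range(batch_size)]
fixed_t = [sorted({t for t, _, _, _ in sample}) for sample in fixed]

print(buggy_t)  # [[10], [20], [30]]
print(fixed_t)  # [[10], [10], [10]]
```

In the buggy pattern each batch row ends up with a different temporal ID, while the fixed pattern gives every row T=10. Sharing one set of IDs across rows mirrors what the expand call in the fix does, under the assumption stated above.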

Before submitting

Development

Successfully merging this pull request may close these issues.

Fix incorrect batch handling in _prepare_image_ids usage in train_dreambooth_lora_flux2_img2img.py