
[None][fix] Fix scheduler off-by-one in FLUX pipelines at high resolutions #13091

Merged

karljang merged 1 commit into NVIDIA:main from karljang:fix/flux-160k-scheduler-begin-index
Apr 17, 2026
Conversation

@karljang
Collaborator

@karljang karljang commented Apr 15, 2026

Summary

  • Add set_begin_index(0) after set_timesteps in FLUX.1 and FLUX.2 pipelines to match upstream diffusers behavior
  • Fixes IndexError when running FLUX at high resolutions (e.g. 6400×6400 / 160K+ tokens) with few inference steps

Root Cause

FLUX uses dynamic timestep shifting (mu) that scales linearly with sequence length. At high resolutions (160K+ tokens), mu becomes large enough (~27.5) that all scheduler sigmas collapse to the same value (1.0) and all timesteps become identical (1000.0).
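The collapse can be reproduced with a minimal sketch of the exponential time-shift used by flow-matching schedulers (formula as in diffusers' FlowMatchEulerDiscreteScheduler; the step count and mu values here are illustrative, not taken from the pipelines):

```python
import numpy as np

def time_shift(mu: float, sigmas: np.ndarray) -> np.ndarray:
    # Exponential shift: sigma' = e^mu / (e^mu + (1/sigma - 1)),
    # computed in float64 and cast back to the scheduler's float32.
    return (np.exp(mu) / (np.exp(mu) + (1.0 / sigmas - 1.0))).astype(np.float32)

sigmas = np.linspace(1.0, 0.25, 4, dtype=np.float32)  # 4 inference steps

moderate = time_shift(3.0, sigmas)   # typical resolution: sigmas stay distinct
extreme = time_shift(27.5, sigmas)   # 160K+ tokens: every sigma rounds to 1.0

print(moderate)         # four distinct values
print(extreme)          # [1. 1. 1. 1.]
print(extreme * 1000)   # every timestep becomes 1000.0
```

With mu ≈ 27.5, e^mu ≈ 8.8e11 dwarfs the (1/sigma - 1) term, so every shifted sigma is within ~1e-11 of 1.0 and rounds to exactly 1.0 in float32.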

Without set_begin_index(0), the scheduler's _init_step_index falls back to index_for_timestep(1000.0), which finds that every timestep matches and picks index 1 rather than 0. After N denoising steps, step_index therefore reaches N+1, and the final step's read of self.sigmas[step_index + 1] goes out of bounds.
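The lookup behavior can be seen in a simplified stand-in for the scheduler's index_for_timestep (the pos = 1 tie-break mirrors diffusers, which prefers the second match so that schedules resumed mid-way, e.g. for image-to-image, do not skip a sigma; the example schedules are illustrative):

```python
def index_for_timestep(timestep, schedule_timesteps):
    # Simplified stand-in for the scheduler lookup: collect all matching
    # positions, and prefer the second match when there is more than one.
    indices = [i for i, t in enumerate(schedule_timesteps) if t == timestep]
    pos = 1 if len(indices) > 1 else 0
    return indices[pos]

# Normal schedule: timesteps are unique, so the first step starts at index 0.
print(index_for_timestep(1000.0, [1000.0, 750.0, 500.0, 250.0]))  # 0

# Collapsed schedule: every timestep is 1000.0, so the first step starts at 1.
print(index_for_timestep(1000.0, [1000.0, 1000.0, 1000.0, 1000.0]))  # 1
```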

The upstream diffusers FluxPipeline avoids this by calling self.scheduler.set_begin_index(0) before the denoise loop (diffusers pipeline_flux.py:936), which forces _init_step_index to use index 0.
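A toy scheduler (hypothetical names, not the diffusers API) shows how setting the begin index sidesteps the degenerate lookup entirely:

```python
class TinyScheduler:
    """Toy stand-in for a flow-matching scheduler (illustrative only)."""

    def __init__(self, timesteps):
        self.timesteps = timesteps
        self._begin_index = None

    def set_begin_index(self, begin_index=0):
        self._begin_index = begin_index

    def _init_step_index(self, timestep):
        if self._begin_index is not None:
            return self._begin_index  # fix path: the lookup is never consulted
        matches = [i for i, t in enumerate(self.timesteps) if t == timestep]
        return matches[1] if len(matches) > 1 else matches[0]

collapsed = [1000.0] * 4  # degenerate high-resolution schedule

buggy = TinyScheduler(collapsed)
print(buggy._init_step_index(1000.0))  # 1 -> off by one from the first step

fixed = TinyScheduler(collapsed)
fixed.set_begin_index(0)               # the one-line fix from this PR
print(fixed._init_step_index(1000.0))  # 0
```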

Test Plan

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Fixed scheduler initialization in Flux visual generation models to ensure consistent pipeline behavior across generation runs.

…tions

Add set_begin_index(0) after set_timesteps in FLUX.1 and FLUX.2
pipelines to match upstream diffusers behavior.

Without this, when the dynamic shift parameter mu is large (which
happens at high resolutions like 6400x6400 / 160K+ tokens), all
scheduler timesteps collapse to the same value. The scheduler's
index_for_timestep then picks index 1 instead of 0 for the first
step, causing an IndexError on the final step when it tries to
access sigmas[num_steps + 1].

Signed-off-by: Kanghwan Jang <861393+karljang@users.noreply.github.com>
@karljang karljang requested a review from a team as a code owner April 15, 2026 19:59
@coderabbitai
Contributor

coderabbitai Bot commented Apr 15, 2026

📝 Walkthrough

Walkthrough

Two Flux pipeline implementations now explicitly set the scheduler's begin index to 0 immediately after retrieving timesteps, ensuring consistent scheduler state initialization at the start of pipeline execution.

Changes

Cohort / File(s) Summary
Scheduler Initialization in Flux Pipelines
tensorrt_llm/_torch/visual_gen/models/flux/pipeline_flux.py, tensorrt_llm/_torch/visual_gen/models/flux/pipeline_flux2.py
Added explicit call to self.scheduler.set_begin_index(0) after retrieving timesteps in the forward method to ensure scheduler state is properly initialized.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

  • Title check (✅ Passed): The title clearly and specifically describes the fix being applied: adding scheduler initialization to resolve an off-by-one error in FLUX pipelines at high resolutions.
  • Docstring Coverage (✅ Passed): Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.
  • Description check (✅ Passed): The pull request description comprehensively explains the issue, root cause, solution, and test coverage following the template structure.


@karljang
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #43870 [ run ] triggered by Bot. Commit: 57ef61c Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #43870 [ run ] completed with state FAILURE. Commit: 57ef61c
/LLM/main/L0_MergeRequest_PR pipeline #34322 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@karljang
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #43951 [ run ] triggered by Bot. Commit: 57ef61c Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #43951 [ run ] completed with state SUCCESS. Commit: 57ef61c
/LLM/main/L0_MergeRequest_PR pipeline #34397 completed with status: 'SUCCESS'

CI Report

Link to invocation

@karljang karljang merged commit 7cf851c into NVIDIA:main Apr 17, 2026
11 of 14 checks passed