[https://nvbugs/6162857][fix] Use generation metrics for VisualGen perf sanity #14176

Merged
zhenhuaw-me merged 3 commits into NVIDIA:main from taianz-nv:dev-taianz-bug6162857 on May 29, 2026

Conversation

@taianz-nv
Collaborator

@taianz-nv taianz-nv commented May 15, 2026

Fixes https://nvbugs/6162857 by aligning VisualGen perf sanity with the new benchmark metric schema and ensuring online benchmarks receive real engine-side generation timing.

Description

VisualGen benchmark results now report latency and generation separately in seconds. This PR updates perf sanity validation and OpenSearch extraction to upload both metric families, and uses median generation as the regression metric.

For online image and sync-video benchmarks, trtllm-serve now returns VisualGen engine-side timing through the standard Server-Timing response header. The benchmark client parses that metadata into the result JSON, so generation metrics no longer default to zero. This is non-breaking for regular serving clients: request schemas and response bodies are unchanged, and clients that ignore headers keep the same behavior.
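As a rough illustration of the client side described above, a benchmark client could turn a Server-Timing header into per-metric durations along the following lines. This is a minimal sketch, not the code from this PR: the function name `parse_server_timing` and the metric names `generation` and `denoise` in the usage are assumptions, and it relies on the Server-Timing convention that `dur` is expressed in milliseconds, converting to seconds to match the result JSON. It also splits on commas naively, so a quoted `desc` containing a comma would not be handled.

```python
def parse_server_timing(header_value: str) -> dict[str, float]:
    """Return {metric_name: duration_in_seconds} parsed from a Server-Timing value.

    Hypothetical helper for illustration; entries without a numeric `dur`
    parameter are skipped rather than reported as zero.
    """
    timings: dict[str, float] = {}
    for entry in header_value.split(","):
        parts = [part.strip() for part in entry.split(";")]
        name = parts[0]
        if not name:
            continue
        for param in parts[1:]:
            key, _, value = param.partition("=")
            if key.strip().lower() != "dur":
                continue
            try:
                # Server-Timing durations are milliseconds; convert to seconds.
                timings[name] = float(value.strip().strip('"')) / 1000.0
            except ValueError:
                pass  # malformed duration: leave the metric absent
    return timings


# Example header value with assumed metric names.
print(parse_server_timing('generation;dur=1250.5, denoise;dur=1000;desc="steps"'))
```

Leaving a metric absent when `dur` is missing or malformed, instead of defaulting it to zero, lets downstream validation distinguish "no timing reported" from a genuine measurement.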

Test Coverage

  • python3 -m py_compile for the modified VisualGen serve, benchmark, and perf sanity files.
  • Added endpoint unit coverage for Server-Timing on image and sync-video responses.
  • Pre-commit hooks passed during commit amend.
  • Full VisualGen GPU perf sanity case not run locally.

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@taianz-nv taianz-nv requested a review from a team as a code owner May 15, 2026 06:37
@coderabbitai
Contributor

coderabbitai Bot commented May 15, 2026

📝 Walkthrough

Walkthrough

This PR migrates VisualGen performance sanity test metrics from end-to-end latency to separate latency and generation metrics. Metric definitions, validation contracts, database upload gating, documentation, and test waivers are coordinated to use mean/median/percentile latency and generation fields instead of e2e_latency, with validation enforcing positive generation values and required percentile fields.

Changes

VisualGen metrics migration from e2e_latency to latency/generation

| Layer / File(s) | Summary |
| --- | --- |
| Metric definitions and field mappings (tests/integration/defs/perf/visual_gen_perf_utils.py) | MINIMIZE_METRICS and REGRESSION_METRICS are updated to use mean/median/percentile latency and generation instead of e2e_latency; RESULT_METRIC_PATHS is reworked to map latency and generation fields from VisualGen benchmark JSON, including p90/p99 percentiles. |
| Validation contract and database upload (tests/integration/defs/perf/test_visual_gen_perf_sanity.py) | Database upload is now gated by the OPEN_SEARCH_DB_BASE_URL environment variable; _validate_benchmark_result is updated to require separate latency/generation metric fields, enforce p90/p99 percentiles in both metric categories, and reject zero or negative mean generation values. |
| Documentation and test waiver cleanup (tests/integration/defs/perf/README_test_visual_gen_perf_sanity.md, tests/integration/test_lists/waives.txt) | README now documents latency and generation metrics (read from VisualGen JSON in seconds) and clarifies that latency includes encoding/persistence overhead while generation reflects engine-side time; obsolete test waivers for e2e metric tests are removed. |
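The validation contract summarized above can be sketched roughly as follows. This is an illustrative approximation rather than the actual `_validate_benchmark_result`: the field names come from the review comments on this PR, while the function name `validate_benchmark_result` and the assumption that each `percentiles_*` field is a mapping keyed by `"p90"` and `"p99"` are hypothetical. It also folds in the finiteness check suggested during review.

```python
import math

# Field names as listed in the review comments on this PR.
REQUIRED_FIELDS = (
    "mean_latency", "median_latency", "percentiles_latency",
    "mean_generation", "median_generation", "percentiles_generation",
)
REQUIRED_PERCENTILES = ("p90", "p99")


def validate_benchmark_result(result_data: dict) -> None:
    """Raise ValueError if the result lacks required metrics or has a bad mean_generation."""
    missing = [name for name in REQUIRED_FIELDS if name not in result_data]
    if missing:
        raise ValueError(f"missing metric fields: {missing}")

    for family in ("latency", "generation"):
        # Assumption: percentiles are stored as a mapping such as {"p90": ..., "p99": ...}.
        percentiles = result_data[f"percentiles_{family}"]
        absent = [p for p in REQUIRED_PERCENTILES if p not in percentiles]
        if absent:
            raise ValueError(f"percentiles_{family} is missing {absent}")

    mean_generation = float(result_data["mean_generation"])
    # NaN and +Inf pass a bare `<= 0` comparison, so check finiteness explicitly.
    if not math.isfinite(mean_generation) or mean_generation <= 0:
        raise ValueError(f"mean_generation must be finite and positive, got {mean_generation}")
```

Rejecting a non-positive `mean_generation` is what catches the failure mode this PR targets, where generation metrics silently defaulted to zero.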

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested reviewers

  • jieli-matrix
  • xinhe-nv
  • zhenhuaw-me
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly summarizes the main change: switching to generation metrics for VisualGen performance sanity tests, directly addressing the fix for the referenced bug. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | PR description is comprehensive and follows the template structure with clear sections for Description, Test Coverage, and PR Checklist. |

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/integration/defs/perf/test_visual_gen_perf_sanity.py (1)

512-517: QA list impact looks unchanged for this PR.

This metrics-contract migration does not add/remove test selectors or cases, so test-list entries under tests/integration/test_lists/qa/ do not need updates in this PR.

As per coding guidelines: “If the change adds or materially alters an integration test under tests/integration/defs/... call out whether an entry is needed under tests/integration/test_lists/qa/.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/defs/perf/test_visual_gen_perf_sanity.py` around lines 512
- 517, The change to the metrics list in test_visual_gen_perf_sanity.py (the
added/modified metric names like "mean_latency", "median_latency",
"percentiles_latency", "mean_generation", "median_generation",
"percentiles_generation") does not add/remove integration tests, so explicitly
document that no update to the QA test-list is required: add a one-line comment
in tests/integration/defs/perf/test_visual_gen_perf_sanity.py near the metrics
block (or a short note in the PR description) stating that QA list entries under
tests/integration/test_lists/qa/ are unchanged and no action is needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/integration/defs/perf/test_visual_gen_perf_sanity.py`:
- Around line 549-558: The current validation for result_data["mean_generation"]
accepts NaN and +Inf because float(...) <= 0 evaluates to False for both; update
the check in the test (the block that inspects result_data["mean_generation"])
to explicitly verify finiteness and positivity by using math.isfinite on
float(result_data["mean_generation"]) and raising the same ValueError if the
value is not finite or is <= 0, so that invalid NaN/Inf values are rejected.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 247229f9-a49a-4273-a85e-985e05dc2611

📥 Commits

Reviewing files that changed from the base of the PR and between 849bb4d and 0b28c45.

📒 Files selected for processing (4)
  • tests/integration/defs/perf/README_test_visual_gen_perf_sanity.md
  • tests/integration/defs/perf/test_visual_gen_perf_sanity.py
  • tests/integration/defs/perf/visual_gen_perf_utils.py
  • tests/integration/test_lists/waives.txt
💤 Files with no reviewable changes (1)
  • tests/integration/test_lists/waives.txt

Comment thread tests/integration/defs/perf/test_visual_gen_perf_sanity.py Outdated
@taianz-nv taianz-nv force-pushed the dev-taianz-bug6162857 branch 3 times, most recently from e788658 to 6fcdefa Compare May 15, 2026 08:54
@taianz-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #48572 [ run ] triggered by Bot. Commit: 6fcdefa Link to invocation

@taianz-nv taianz-nv force-pushed the dev-taianz-bug6162857 branch from 6fcdefa to 0e41c6b Compare May 15, 2026 09:27
@tensorrt-cicd
Collaborator

PR_Github #48572 [ run ] completed with state FAILURE. Commit: 6fcdefa
/LLM/main/L0_MergeRequest_PR pipeline #38359 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@taianz-nv
Collaborator Author

/bot cancel

@taianz-nv
Collaborator Author

/bot run

@github-actions

GitHub Bot Help

/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...

Provide a user friendly way for developers to interact with a Jenkins server.

Run /bot [-h|--help] to print this help message.

See details below for each supported subcommand.

Details

run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental) --high-priority]

Launch build/test pipelines. All previously running jobs will be killed.

--reuse-test (optional)pipeline-id (OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline, or from the last pipeline if no pipeline-id is given. If the Git commit ID has changed, this option is always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.

--disable-reuse-test (OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.

--disable-fail-fast (OPTIONAL) : Disable fail fast on build/tests/infra failures.

--skip-test (OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.

--stage-list "A10-PyTorch-1, xxx" (OPTIONAL) : Only run the specified test stages. Supports wildcard * for pattern matching (e.g., "*PerfSanity*" matches all stages containing PerfSanity). Examples: "A10-PyTorch-1, xxx", "PerfSanity". Note: Does NOT update GitHub check status.

--gpu-type "A30, H100_PCIe" (OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.

--test-backend "pytorch, cpp" (OPTIONAL) : Skip test stages which don't match the specified backends. Only supports [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.

--only-multi-gpu-test (OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.

--disable-multi-gpu-test (OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.

--add-multi-gpu-test (OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.

--post-merge (OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.

--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" (OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Supports wildcard * for pattern matching. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx", --extra-stage "Post-Merge".

--detailed-log (OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.

--debug (OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purposes. Note: Specify exactly one stage in the stage-list parameter to access the appropriate container environment. Note: Does NOT update GitHub check status.

--high-priority (OPTIONAL) : Run the pipeline with high priority. This option is restricted to authorized users only and will route the job to a high-priority queue.

kill

kill

Kill all running builds associated with pull request.

skip

skip --comment COMMENT

Skip testing for latest commit on pull request. --comment "Reason for skipping build/test" is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

reuse-pipeline

reuse-pipeline

Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.

@tensorrt-cicd
Collaborator

PR_Github #48581 [ run ] triggered by Bot. Commit: 0e41c6b Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #48581 [ run ] completed with state SUCCESS. Commit: 0e41c6b
/LLM/main/L0_MergeRequest_PR pipeline #38367 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@taianz-nv taianz-nv force-pushed the dev-taianz-bug6162857 branch from 0e41c6b to a01e03f Compare May 18, 2026 06:45
@taianz-nv taianz-nv requested review from a team as code owners May 18, 2026 06:45
@taianz-nv taianz-nv requested a review from schetlur-nv May 18, 2026 06:45
Comment thread tensorrt_llm/serve/visual_gen_metrics.py Outdated
Comment thread tensorrt_llm/serve/visual_gen_metrics.py
@taianz-nv taianz-nv force-pushed the dev-taianz-bug6162857 branch from a01e03f to e009b7e Compare May 19, 2026 02:19
@taianz-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #49040 [ run ] triggered by Bot. Commit: e009b7e Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #49040 [ run ] completed with state SUCCESS. Commit: e009b7e
/LLM/main/L0_MergeRequest_PR pipeline #38777 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@taianz-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #49154 [ run ] triggered by Bot. Commit: e009b7e Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #49154 [ run ] completed with state SUCCESS. Commit: e009b7e
/LLM/main/L0_MergeRequest_PR pipeline #38835 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@taianz-nv taianz-nv force-pushed the dev-taianz-bug6162857 branch from 6749a7b to e4468b6 Compare May 25, 2026 07:56
@taianz-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #50176 [ run ] triggered by Bot. Commit: e4468b6 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #50176 [ run ] completed with state SUCCESS. Commit: e4468b6
/LLM/main/L0_MergeRequest_PR pipeline #39719 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@taianz-nv
Collaborator Author

/bot run

@taianz-nv taianz-nv force-pushed the dev-taianz-bug6162857 branch from e4468b6 to e0fa37a Compare May 26, 2026 02:27
@taianz-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #50270 [ run ] triggered by Bot. Commit: e0fa37a Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #50270 [ run ] completed with state SUCCESS. Commit: e0fa37a
/LLM/main/L0_MergeRequest_PR pipeline #39800 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@taianz-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #50336 [ run ] triggered by Bot. Commit: e0fa37a Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #50336 [ run ] completed with state SUCCESS. Commit: e0fa37a
/LLM/main/L0_MergeRequest_PR pipeline #39863 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@taianz-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #50643 [ run ] triggered by Bot. Commit: e0fa37a Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #50643 [ run ] completed with state SUCCESS. Commit: e0fa37a
/LLM/main/L0_MergeRequest_PR pipeline #40137 completed with status: 'SUCCESS'

CI Report

Link to invocation

taianz-nv added 3 commits May 28, 2026 05:45
Track VisualGen latency and generation from the new benchmark result schema, and use generation as the regression metric while keeping upload gated on OpenSearch configuration.

Signed-off-by: Taian Zhang <taianz@nvidia.com>
Signed-off-by: Taian Zhang <taianz@nvidia.com>
Remove waivers that were accidentally preserved while resolving the upstream rebase conflict so VisualGen perf sanity runs as intended.

Signed-off-by: Taian Zhang <taianz@nvidia.com>
@taianz-nv taianz-nv force-pushed the dev-taianz-bug6162857 branch from e0fa37a to e8b3397 Compare May 28, 2026 05:54
@taianz-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #50705 [ run ] triggered by Bot. Commit: e8b3397 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #50705 [ run ] completed with state FAILURE. Commit: e8b3397
/LLM/main/L0_MergeRequest_PR pipeline #40190 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@taianz-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #50966 [ run ] triggered by Bot. Commit: e8b3397 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #50966 [ run ] completed with state SUCCESS. Commit: e8b3397
/LLM/main/L0_MergeRequest_PR pipeline #40422 completed with status: 'SUCCESS'

CI Report

Link to invocation

@taianz-nv taianz-nv force-pushed the dev-taianz-bug6162857 branch from 5bfb2e2 to e8b3397 Compare May 29, 2026 09:25
@taianz-nv taianz-nv removed the request for review from schetlur-nv May 29, 2026 09:29
@zhenhuaw-me zhenhuaw-me merged commit 3f21a48 into NVIDIA:main May 29, 2026
12 checks passed
zhenhuaw-me added a commit to zhenhuaw-me/TensorRT-LLM that referenced this pull request May 29, 2026
…d_strength to extra_params, clamp seed

Third round of self-review feedback on PR NVIDIA#14733. Three code-fix
threads + one CodeRabbit follow-up; conflict-resolution merge done
in this round too.

Pipeline forward() signature reorder
- Moved `seed: int` to the second positional argument (right after
  `prompt`, or after `image` for `WanImageToVideoPipeline`) across
  flux / flux2 / wan / wan_i2v / ltx2 / ltx2_two_stages, and dropped
  the bare `*,` keyword-only marker. Callers (`infer()` and
  `_run_warmup`) already used kwargs only, so this is a mechanical
  internal-API change.
- Aligned `Cosmos3OmniMoTPipeline.forward` with the same convention —
  it still carried `seed: int = 42` as a stale default.

image_cond_strength moved to per-pipeline extra_params
- Cross-framework check confirmed that diffusers, vllm-omni, and
  SGLang Diffusion all treat conditioning-strength knobs as
  pipeline-specific (diffusers exposes them only on
  `LTXConditionPipeline` / SD img2img / SVD; vllm-omni and SGLang
  route them through generic `diffusers_kwargs` bags). None treats
  `image_cond_strength` as a first-class request field.
- Dropped `image_cond_strength` from `VisualGenParams`, from
  `VideoGenerationRequest`, from the `visual_gen_utils` translation,
  from `_GENERATION_CONFIG_FIELDS`, and from the Wan-defaults
  carve-out (which let `get_wan_default_params` lose the
  `include_i2v` argument entirely).
- Added the field to `LTX2Pipeline.extra_param_specs` (`default=1.0`);
  both `LTX2Pipeline.infer()` and `LTX2TwoStagesPipeline.infer()` now
  read it from `req.params.extra_params["image_cond_strength"]`. Wan
  pipelines reject the key via the existing unknown-`extra_params`
  path.
- Updated the LTX-2 example script so `--image_cond_strength` flows
  into `extra_params` instead of the dropped top-level field, and
  trimmed the serve README's per-request control list.

Seed clamp + N-image RNG semantics doc
- Introduced `MAX_UINT32_SEED = 2**32 - 1` and applied
  `ge=0, le=MAX_UINT32_SEED` on `VisualGenParams.seed` and on the
  three openai_protocol video / image / edit request schemas.
- Coordinator-rank random seed materialization moved from
  `secrets.randbits(63)` to `secrets.randbits(32)` so engine-drawn
  seeds stay inside the OpenAI DALL-E range that vllm-omni adopts.
- Documented the chosen `num_images_per_prompt > 1` semantics in the
  `VisualGenParams.seed` description: diffusers/vllm-omni style
  (single `torch.Generator(seed=s)` driving N latents from one RNG
  stream), not SGLang's `[s, s+1, …]` per-image expansion.
- Three new unit tests pin the range bounds.
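
The seed handling described in this section can be approximated in plain Python as below. This is a sketch under stated assumptions, not the referenced commit's code: the commit applies the bounds declaratively (`ge=0, le=MAX_UINT32_SEED`) on request schemas, whereas the hypothetical `resolve_seed` helper here performs the same range check imperatively and draws a 32-bit seed with `secrets.randbits(32)` when none is supplied.

```python
import secrets
from typing import Optional

MAX_UINT32_SEED = 2**32 - 1


def resolve_seed(seed: Optional[int] = None) -> int:
    """Return a seed in [0, MAX_UINT32_SEED], drawing a random one if none is given.

    Hypothetical helper for illustration only.
    """
    if seed is None:
        # randbits(32) yields an integer in [0, 2**32 - 1], so it is always in range.
        return secrets.randbits(32)
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise TypeError(f"seed must be an int, got {type(seed).__name__}")
    if not 0 <= seed <= MAX_UINT32_SEED:
        raise ValueError(f"seed must be in [0, {MAX_UINT32_SEED}], got {seed}")
    return seed
```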

Conflict resolution
- Rebased the three serve-parity commits onto origin/main, resolving
  conflicts in `openai_video_routes.py` against the
  `build_visual_gen_timing_headers` introduction from NVIDIA#14176. Kept
  the upstream cleanup that uses local `generation` / `denoise`
  variables.

Tests
- `test_visual_gen_params.py`: three obsolete top-level
  image_cond_strength tests replaced with two extra_params variants;
  three new tests for seed bounds.

Signed-off-by: Zhenhua Wang <zhenhuaw@nvidia.com>

5 participants