[test] add LTX-2 distilled T2V SSIM regression test #1240

Open
SolitaryThinker wants to merge 3 commits into main from will/ltx2_ssim

SolitaryThinker commented Apr 17, 2026

Summary

Adds two things that together let LTX-2 start acting as an SSIM regression guard:

  1. fastvideo/tests/ssim/test_ltx2_similarity.py — LTX-2 is the only model family in fastvideo/pipelines/basic/ without an SSIM coverage file. Adds one alongside the other per-family tests so the in-flight API refactor and future LTX-2-specific changes (two-stage refine, gpu_pool upstream, Dynamo backend) have a golden-quality regression guard.

    Test shape matches test_wan_t2v_similarity.py / test_turbodiffusion_similarity.py:

    • Default (CI-friendly): FastVideo/LTX2-Distilled-Diffusers, 512x768, 45 frames, 4 inference steps, sp_size=2 on 2 GPUs, FLASH_ATTN, ltx2_vae_tiling=True.
    • --ssim-full-quality: falls back to ltx2_distilled preset defaults (1024x1536, 121 frames, 8 steps, guidance_scale=1.0).
    • Min acceptable SSIM = 0.93 (matches Wan T2V).
    • REQUIRED_GPUS = 2 at module scope so the Modal SSIM orchestrator schedules correctly.
    • LTX2_DISTILLED_MODEL_TO_PARAMS named so the orchestrator can split by model id.
  2. .agents/skills/seed-ssim-references/ — agent skill that automates the one-time reference-video seeding flow this test (and any future family test) needs. Wraps fastvideo/tests/modal/ssim_test.py --sync-generated-to-volume with a SKILL.md + scripts/seed_ssim.sh launcher that:

    • Runs the test on Modal L40S:4 with --skip-reference-download --no-fail-fast so generation completes even though no reference exists yet.
    • Prints the modal volume get, reference_videos_cli.py copy-local, and reference_videos_cli.py upload commands pre-filled for the run.
    • Documents a verify step that re-runs the test without the skip/fail-fast flags to confirm the uploaded references land.

Test plan

  • File parses (python -c "import ast; ast.parse(...)").
  • Follows the same collection shape as Wan/LongCat/TurboDiffusion — same REQUIRED_GPUS + *_MODEL_TO_PARAMS contract the orchestrator auto-discovers.
  • Skill launcher works end-to-end locally (seed_ssim.sh --help prints; arg checks fire; mock run emits the expected follow-up commands).
  • First run on Modal L40S will fail the SSIM assertion (no reference video yet) — expected. Use the seed-ssim-references skill to capture the generated videos and upload via reference_videos_cli.py upload to seed FastVideo/ssim-reference-videos.
  • After references seed, rebase in-flight API-refactor branches ([feat] [6/n] Improve API: LTX-2 public preset + asset wiring + gpu_pool translation #1239) and confirm the LTX-2 test passes — that's the regression guard this PR unlocks.

LTX-2 was the only model family in fastvideo/pipelines/basic/ without an
SSIM coverage file. Add test_ltx2_similarity.py alongside the other
per-family SSIM tests so the in-flight API refactor and future
LTX-2-specific changes (two-stage refine, gpu_pool upstream, Dynamo
backend) have a golden-quality regression guard.

Parameters:

  Default (CI-friendly):
    model: FastVideo/LTX2-Distilled-Diffusers (8-step distilled)
    resolution: 512x768, num_frames=45, num_inference_steps=4
    sp_size=2 on 2 GPUs, FLASH_ATTN backend
    ltx2_vae_tiling=True for peak-memory safety

  --ssim-full-quality:
    falls back to the ltx2_distilled preset defaults
    (1024x1536, 121 frames, 8 steps, guidance_scale=1.0)

Uses the shared run_text_to_video_similarity_test helper (same pattern
as Wan/TurboDiffusion), so _build_init_kwargs picks up
ltx2_vae_tiling + related tile sizes automatically.

REQUIRED_GPUS = 2 is declared at module scope so the Modal SSIM
orchestrator (fastvideo/tests/modal/ssim_test.py) schedules it
correctly on the L40S:8 runner. LTX2_DISTILLED_MODEL_TO_PARAMS is
named so the orchestrator can split by model id (one subprocess per
entry) for future multi-model LTX-2 coverage.
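The module-scope contract described here can be illustrated with a short sketch. It is an assumption about how discovery could work, not the code in fastvideo/tests/modal/ssim_test.py: `discover_ssim_contract`, the fallback of one GPU, and the dictionary key and values are all hypothetical.

```python
import types


def discover_ssim_contract(module):
    """Return (required_gpus, model_ids) read from a test module's globals.

    Hypothetical helper: the real orchestrator may discover these differently.
    """
    # Assumed fallback of 1 GPU when the module declares nothing.
    required_gpus = getattr(module, "REQUIRED_GPUS", 1)
    model_ids = []
    for name, value in vars(module).items():
        # Any module-level dict named *_MODEL_TO_PARAMS contributes its keys.
        if name.endswith("_MODEL_TO_PARAMS") and isinstance(value, dict):
            model_ids.extend(value)
    return required_gpus, sorted(model_ids)


# Stand-in for test_ltx2_similarity.py; the key and params are illustrative.
fake_module = types.ModuleType("test_ltx2_similarity")
fake_module.REQUIRED_GPUS = 2
fake_module.LTX2_DISTILLED_MODEL_TO_PARAMS = {
    "FastVideo/LTX2-Distilled-Diffusers": {"sp_size": 2},
}

required_gpus, model_ids = discover_ssim_contract(fake_module)
```

With one entry per model id, an orchestrator of this shape could launch one subprocess per key, which is the split the commit message describes.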

Threshold min_acceptable_ssim=0.93 matches Wan T2V.
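
To make the threshold concrete, here is a simplified sketch of an SSIM gate. It computes a single-window (global) SSIM over flat lists of pixel values using the standard constants K1=0.01 and K2=0.03. The shared helper in this repo most likely uses a windowed or multi-scale variant over decoded video frames, so `global_ssim` and `passes_threshold` are hypothetical names for illustration only.

```python
MIN_ACCEPTABLE_SSIM = 0.93  # threshold quoted in this PR


def global_ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM over two equal-length sequences of pixel values."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(x)
    c1 = (0.01 * dynamic_range) ** 2  # stabilises the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilises the contrast/structure term
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) * (a - mean_x) for a in x) / n
    var_y = sum((b - mean_y) * (b - mean_y) for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov_xy + c2)
    denominator = (mean_x * mean_x + mean_y * mean_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


def passes_threshold(reference, generated, minimum=MIN_ACCEPTABLE_SSIM):
    """True when the generated pixels are similar enough to the reference."""
    return global_ssim(reference, generated) >= minimum


reference = [0, 50, 100, 150, 200, 250]
slightly_noisy = [1, 49, 101, 149, 201, 249]
reversed_frame = list(reversed(reference))
```

In this toy setup, small per-pixel noise stays above the threshold while a structurally inverted frame falls well below it, which is the kind of drift the regression guard is meant to catch.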

Reference videos are not in the repo. After this lands on main, run
the test once on an L40S and upload via
`python fastvideo/tests/ssim/reference_videos_cli.py upload --quality-tier all`
to seed FastVideo/ssim-reference-videos. Subsequent runs (including
regression guards for in-flight refactor PRs) auto-download the
references before the test executes.

mergify bot commented Apr 17, 2026

⚠️ PR title format required

Your PR title must start with a type tag in brackets. Examples:

  • [feat] Add new model support
  • [bugfix] Fix VAE tiling corruption
  • [refactor] Restructure training pipeline
  • [perf] Optimize attention kernel
  • [ci] Update test infrastructure
  • [docs] Add inference guide
  • [misc] Clean up configs
  • [new-model] Port Flux2 to FastVideo

Valid tags: feat, feature, bugfix, fix, refactor, perf, ci, doc, docs, misc, chore, kernel, new-model

Please update your PR title and the merge protection check will pass automatically.

mergify bot added the scope: infra label (CI, tests, Docker, build) on Apr 17, 2026

mergify bot commented Apr 17, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🔴 PR merge requirements

Waiting for:

  • #approved-reviews-by>=1
  • check-success=full-suite-passed
  • title~=(?i)^\[(feat|feature|bugfix|fix|refactor|perf|ci|doc|docs|misc|chore|kernel|new.?model)\]
This rule is failing.
  • #approved-reviews-by>=1
  • check-success=full-suite-passed
  • title~=(?i)^\[(feat|feature|bugfix|fix|refactor|perf|ci|doc|docs|misc|chore|kernel|new.?model)\]
  • check-success=fastcheck-passed
  • check-success~=pre-commit

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new SSIM-based similarity test for LTX-2 distilled text-to-video models. The feedback identifies an issue where resolving the device reference folder at the module level can cause errors during test collection on non-GPU machines, suggesting it be moved to a fixture or the test function. Additionally, a suggestion was made to use dictionary methods to reduce redundancy when defining full-quality parameters.


REQUIRED_GPUS = 2

device_reference_folder = resolve_inference_device_reference_folder(logger)

medium

Resolving device_reference_folder at the module level causes the test file to raise a ValueError during import on machines without supported GPUs (e.g., local development environments or CPU-only CI nodes). This prevents pytest from collecting any tests in this file. It is recommended to move this resolution inside the test function or use a fixture to allow for graceful skipping when the environment is unsupported.
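
One way to act on this suggestion is to defer the lookup until first use. The sketch below is an assumption, not the PR's fix: `resolve_inference_device_reference_folder` is replaced by a stand-in that always raises (mimicking a CPU-only node), and `get_device_reference_folder` and `should_skip_ssim_test` are hypothetical names. In the real test, a pytest fixture would call `pytest.skip(...)` where this sketch returns `None`.

```python
import functools
import logging

logger = logging.getLogger(__name__)


def resolve_inference_device_reference_folder(log):
    """Stand-in for the real helper; always fails, as on an unsupported device."""
    raise ValueError("unsupported device")


@functools.lru_cache(maxsize=None)
def get_device_reference_folder():
    """Resolve the folder lazily, on first call, instead of at import time."""
    try:
        return resolve_inference_device_reference_folder(logger)
    except ValueError as exc:
        # Importing the module no longer raises; callers decide how to skip.
        logger.warning("No SSIM reference folder for this device: %s", exc)
        return None


def should_skip_ssim_test():
    """True when the environment cannot run the SSIM comparison."""
    return get_device_reference_folder() is None
```

Because nothing runs at import, pytest can still collect the file on machines without a supported GPU and report the test as skipped rather than erroring.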

Comment on lines +42 to +56
LTX2_DISTILLED_FULL_QUALITY_PARAMS = {
    "num_gpus": LTX2_DISTILLED_PARAMS["num_gpus"],
    "model_path": LTX2_DISTILLED_PARAMS["model_path"],
    "height": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.height,
    "width": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.width,
    "num_frames": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.num_frames,
    "num_inference_steps":
        _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.num_inference_steps,
    "guidance_scale": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.guidance_scale,
    "seed": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.seed,
    "sp_size": LTX2_DISTILLED_PARAMS["sp_size"],
    "tp_size": LTX2_DISTILLED_PARAMS["tp_size"],
    "fps": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.fps,
    "ltx2_vae_tiling": LTX2_DISTILLED_PARAMS["ltx2_vae_tiling"],
}

medium

The construction of LTX2_DISTILLED_FULL_QUALITY_PARAMS is redundant as it manually copies several fields from LTX2_DISTILLED_PARAMS. Using .copy() and .update() would be cleaner and more maintainable.

LTX2_DISTILLED_FULL_QUALITY_PARAMS = LTX2_DISTILLED_PARAMS.copy()
LTX2_DISTILLED_FULL_QUALITY_PARAMS.update({
    "height": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.height,
    "width": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.width,
    "num_frames": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.num_frames,
    "num_inference_steps":
        _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.num_inference_steps,
    "guidance_scale": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.guidance_scale,
    "seed": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.seed,
    "fps": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.fps,
})
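
The following sketch checks that the copy-then-update form yields the same dictionary as the field-by-field form. The values are taken from the PR description where available; the height/width ordering is an assumption, `FullQualityDefaults` is a hypothetical stand-in for the preset object, and `seed`, `fps`, and `tp_size` are omitted because their values are not stated here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FullQualityDefaults:
    # Illustrative stand-in for the ltx2_distilled preset defaults.
    height: int = 1024
    width: int = 1536
    num_frames: int = 121
    num_inference_steps: int = 8
    guidance_scale: float = 1.0


defaults = FullQualityDefaults()

base_params = {
    "num_gpus": 2,
    "model_path": "FastVideo/LTX2-Distilled-Diffusers",
    "height": 512,
    "width": 768,
    "num_frames": 45,
    "num_inference_steps": 4,
    "sp_size": 2,
    "ltx2_vae_tiling": True,
}

# Field-by-field construction, as in the original diff.
manual = {
    "num_gpus": base_params["num_gpus"],
    "model_path": base_params["model_path"],
    "height": defaults.height,
    "width": defaults.width,
    "num_frames": defaults.num_frames,
    "num_inference_steps": defaults.num_inference_steps,
    "guidance_scale": defaults.guidance_scale,
    "sp_size": base_params["sp_size"],
    "ltx2_vae_tiling": base_params["ltx2_vae_tiling"],
}

# Copy-then-update construction, as in the review suggestion.
merged = base_params.copy()
merged.update({
    "height": defaults.height,
    "width": defaults.width,
    "num_frames": defaults.num_frames,
    "num_inference_steps": defaults.num_inference_steps,
    "guidance_scale": defaults.guidance_scale,
})
```

One difference worth keeping in mind: `.copy()` carries over every key in the base dictionary, so any key the field-by-field version deliberately left out would now appear in the full-quality parameters too.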

Wraps the existing fastvideo/tests/modal/ssim_test.py
--sync-generated-to-volume path with a step-by-step SKILL.md and a
thin seed_ssim.sh launcher so new SSIM tests can bootstrap their
HF reference videos without manual Modal wrangling.

Motivation: the new test_ltx2_similarity.py (committed earlier on
this branch) has no HF reference video yet; the next operator needs
a deterministic procedure for generating + uploading the first set.
Generalises beyond LTX-2: any new family test can reuse it verbatim.

Claude Code only scans ~/.claude/skills/ and .claude/skills/ for
user-invocable skills (no skillsPath / skillsDir config option
exists — https://code.claude.com/docs/en/skills.md). Skills in this
repo live under .agents/skills/ so they travel with the repo and
stay under git.

Add a one-shot idempotent sync script that symlinks each
.agents/skills/<name>/ directory into .claude/skills/<name>. Run
after cloning or after adding/removing a skill:

    .agents/scripts/sync-skills.sh

Behavior:
  * relative symlinks (../../.agents/skills/<name>) so the link
    survives moving the clone
  * requires a SKILL.md inside each skill directory to be eligible
  * prunes stale symlinks whose source vanished from .agents/skills/
  * refuses to clobber a pre-existing .claude/skills/<name>/ that
    isn't a symlink (lets the operator keep hand-written skills
    alongside managed ones)
  * prints a linked / unchanged / pruned / skipped summary

Note: .claude/ is gitignored; the symlinks themselves are not
committed. The script is the source of truth for reconstructing
them.
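
The behavior listed above can be sketched in Python for illustration. This is not the contents of .agents/scripts/sync-skills.sh (which is a shell script); `sync_skills` is a hypothetical function that mirrors the described rules under the assumption of a POSIX filesystem.

```python
import os
from pathlib import Path


def sync_skills(repo_root):
    """Link .agents/skills/<name> into .claude/skills/<name>; return a summary."""
    source_dir = Path(repo_root) / ".agents" / "skills"
    target_dir = Path(repo_root) / ".claude" / "skills"
    target_dir.mkdir(parents=True, exist_ok=True)
    summary = {"linked": [], "unchanged": [], "pruned": [], "skipped": []}

    # Relative prefix, so the links survive moving the clone.
    managed_prefix = os.path.join("..", "..", ".agents", "skills", "")

    # Only directories that contain a SKILL.md are eligible.
    eligible = []
    if source_dir.is_dir():
        eligible = sorted(
            p.name for p in source_dir.iterdir() if (p / "SKILL.md").is_file()
        )

    for name in eligible:
        link = target_dir / name
        relative_target = managed_prefix + name
        if link.is_symlink():
            if os.readlink(link) == relative_target:
                summary["unchanged"].append(name)
                continue
            link.unlink()  # symlink pointing elsewhere: re-create it
        elif link.exists():
            # A real file or directory: never clobber hand-written skills.
            summary["skipped"].append(name)
            continue
        link.symlink_to(relative_target, target_is_directory=True)
        summary["linked"].append(name)

    # Prune managed symlinks whose source is no longer eligible.
    for link in sorted(target_dir.iterdir()):
        if (
            link.is_symlink()
            and link.name not in eligible
            and os.readlink(link).startswith(managed_prefix)
        ):
            link.unlink()
            summary["pruned"].append(link.name)

    return summary
```

Running it twice in a row should report the same skills as unchanged on the second pass, which is what makes the sync idempotent.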