[test] add LTX-2 distilled T2V SSIM regression test #1240
SolitaryThinker wants to merge 3 commits into main
LTX-2 was the only model family in fastvideo/pipelines/basic/ without an
SSIM coverage file. Add test_ltx2_similarity.py alongside the other
per-family SSIM tests so the in-flight API refactor and future
LTX-2-specific changes (two-stage refine, gpu_pool upstream, Dynamo
backend) have a golden-quality regression guard.
Parameters:
  Default (CI-friendly):
    model: FastVideo/LTX2-Distilled-Diffusers (8-step distilled)
    resolution: 512x768, num_frames=45, num_inference_steps=4
    sp_size=2 on 2 GPUs, FLASH_ATTN backend
    ltx2_vae_tiling=True for peak-memory safety
  --ssim-full-quality:
    falls back to the ltx2_distilled preset defaults
    (1024x1536, 121 frames, 8 steps, guidance_scale=1.0)
Uses the shared run_text_to_video_similarity_test helper (same pattern
as Wan/TurboDiffusion), so _build_init_kwargs picks up
ltx2_vae_tiling + related tile sizes automatically.
REQUIRED_GPUS = 2 is declared at module scope so the Modal SSIM
orchestrator (fastvideo/tests/modal/ssim_test.py) schedules it
correctly on the L40S:8 runner. LTX2_DISTILLED_MODEL_TO_PARAMS is
named so the orchestrator can split by model id (one subprocess per
entry) for future multi-model LTX-2 coverage.
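As an illustration of the module-scope contract described above, here is a minimal sketch of how such names could be discovered statically. How the real orchestrator finds them is not shown in this PR; the function `discover_module_contract` and the example source string are hypothetical.

```python
import ast


def discover_module_contract(source: str) -> dict:
    """Collect REQUIRED_GPUS and *_MODEL_TO_PARAMS names assigned at module scope.

    Hypothetical helper: the real orchestrator may discover these differently.
    """
    tree = ast.parse(source)
    required_gpus = None
    param_tables = []
    for node in tree.body:  # top-level statements only, so nested scopes are ignored
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if not isinstance(target, ast.Name):
                continue
            if target.id == "REQUIRED_GPUS":
                # literal_eval only accepts plain literals, so nothing is executed
                required_gpus = ast.literal_eval(node.value)
            elif target.id.endswith("_MODEL_TO_PARAMS"):
                param_tables.append(target.id)
    return {"required_gpus": required_gpus, "param_tables": param_tables}


# Example source mimicking the declarations named in this commit message.
EXAMPLE_SOURCE = '''
REQUIRED_GPUS = 2
LTX2_DISTILLED_MODEL_TO_PARAMS = {"FastVideo/LTX2-Distilled-Diffusers": {}}
'''
contract = discover_module_contract(EXAMPLE_SOURCE)
```

Reading the values from the syntax tree rather than importing the module means the test file's heavy imports never run during scheduling, which is one reason a module-scope literal is convenient here.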
Threshold min_acceptable_ssim=0.93 matches Wan T2V.
Reference videos are not in the repo. After this lands on main, run
the test once on an L40S and upload via
`python fastvideo/tests/ssim/reference_videos_cli.py upload --quality-tier all`
to seed FastVideo/ssim-reference-videos. Subsequent runs (including
regression guards for in-flight refactor PRs) auto-download the
references before the test executes.
Code Review
This pull request introduces a new SSIM-based similarity test for LTX-2 distilled text-to-video models. The feedback identifies an issue where resolving the device reference folder at the module level can cause errors during test collection on non-GPU machines, suggesting it be moved to a fixture or the test function. Additionally, a suggestion was made to use dictionary methods to reduce redundancy when defining full-quality parameters.

    REQUIRED_GPUS = 2

    device_reference_folder = resolve_inference_device_reference_folder(logger)

Resolving device_reference_folder at the module level causes the test file to raise a ValueError during import on machines without supported GPUs (e.g., local development environments or CPU-only CI nodes). This prevents pytest from collecting any tests in this file. It is recommended to move this resolution inside the test function or use a fixture to allow for graceful skipping when the environment is unsupported.
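One way to apply the suggestion above is a module-scoped pytest fixture that resolves the folder at test setup instead of at import. This is a sketch under assumptions: the `resolve_inference_device_reference_folder` defined here is a stand-in that always raises `ValueError`, mirroring the failure the review describes, and `resolve_or_none` is a hypothetical helper, not part of the PR.

```python
import logging
from typing import Callable, Optional

import pytest

logger = logging.getLogger(__name__)


def resolve_inference_device_reference_folder(logger: logging.Logger) -> str:
    """Stand-in for the real helper; assumed to raise ValueError on unsupported devices."""
    raise ValueError("unsupported device")


def resolve_or_none(resolver: Callable[[logging.Logger], str]) -> Optional[str]:
    """Call the resolver, returning None instead of raising on unsupported hardware."""
    try:
        return resolver(logger)
    except ValueError as exc:
        logger.warning("No reference folder for this device: %s", exc)
        return None


@pytest.fixture(scope="module")
def device_reference_folder() -> str:
    """Resolve at test setup so that collection succeeds on any machine."""
    folder = resolve_or_none(resolve_inference_device_reference_folder)
    if folder is None:
        pytest.skip("no SSIM reference folder for this device")
    return folder
```

A test that needs the folder would take `device_reference_folder` as an argument. On an unsupported machine it is then reported as skipped rather than breaking collection of the whole file.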

    LTX2_DISTILLED_FULL_QUALITY_PARAMS = {
        "num_gpus": LTX2_DISTILLED_PARAMS["num_gpus"],
        "model_path": LTX2_DISTILLED_PARAMS["model_path"],
        "height": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.height,
        "width": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.width,
        "num_frames": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.num_frames,
        "num_inference_steps":
        _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.num_inference_steps,
        "guidance_scale": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.guidance_scale,
        "seed": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.seed,
        "sp_size": LTX2_DISTILLED_PARAMS["sp_size"],
        "tp_size": LTX2_DISTILLED_PARAMS["tp_size"],
        "fps": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.fps,
        "ltx2_vae_tiling": LTX2_DISTILLED_PARAMS["ltx2_vae_tiling"],
    }

The construction of LTX2_DISTILLED_FULL_QUALITY_PARAMS is redundant as it manually copies several fields from LTX2_DISTILLED_PARAMS. Using .copy() and .update() would be cleaner and more maintainable.

    LTX2_DISTILLED_FULL_QUALITY_PARAMS = LTX2_DISTILLED_PARAMS.copy()
    LTX2_DISTILLED_FULL_QUALITY_PARAMS.update({
        "height": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.height,
        "width": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.width,
        "num_frames": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.num_frames,
        "num_inference_steps":
        _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.num_inference_steps,
        "guidance_scale": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.guidance_scale,
        "seed": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.seed,
        "fps": _LTX2_DISTILLED_FULL_QUALITY_DEFAULTS.fps,
    })

Wraps the existing fastvideo/tests/modal/ssim_test.py --sync-generated-to-volume path with a step-by-step SKILL.md and a thin seed_ssim.sh launcher so new SSIM tests can bootstrap their HF reference videos without manual Modal wrangling.

Motivation: the new test_ltx2_similarity.py (committed earlier on this branch) has no HF reference video yet; the next operator needs a deterministic procedure for generating and uploading the first set. Generalises beyond LTX-2: any new family test can reuse it verbatim.
Claude Code only scans ~/.claude/skills/ and .claude/skills/ for user-invocable skills (no skillsPath / skillsDir config option exists; see https://code.claude.com/docs/en/skills.md). Skills in this repo live under .agents/skills/ so they travel with the repo and stay under git.

Add a one-shot idempotent sync script that symlinks each .agents/skills/<name>/ directory into .claude/skills/<name>. Run after cloning or after adding/removing a skill:

    .agents/scripts/sync-skills.sh

Behavior:
* relative symlinks (../../.agents/skills/<name>) so the link survives moving the clone
* requires a SKILL.md inside each skill directory to be eligible
* prunes stale symlinks whose source vanished from .agents/skills/
* refuses to clobber a pre-existing .claude/skills/<name>/ that isn't a symlink (lets the operator keep hand-written skills alongside managed ones)
* prints a linked / unchanged / pruned / skipped summary

Note: .claude/ is gitignored; the symlinks themselves are not committed. The script is the source of truth for reconstructing them.
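The sync behavior described above can be sketched in Python for illustration. The actual sync-skills.sh is a shell script whose contents are not shown on this page, so the function name `sync_skills`, the returned summary dict, and the handling of a symlink that points somewhere unexpected are all assumptions.

```python
import os
from pathlib import Path

# Relative prefix every managed link points through (seen from .claude/skills/).
MANAGED_PREFIX = os.path.join("..", "..", ".agents", "skills")


def sync_skills(repo_root: Path) -> dict:
    """Mirror .agents/skills/<name>/ into .claude/skills/<name> as relative symlinks."""
    src_root = repo_root / ".agents" / "skills"
    dst_root = repo_root / ".claude" / "skills"
    dst_root.mkdir(parents=True, exist_ok=True)
    summary = {"linked": [], "unchanged": [], "pruned": [], "skipped": []}

    # Prune managed symlinks whose source directory vanished.
    for link in sorted(dst_root.iterdir()):
        if not link.is_symlink():
            continue
        managed = os.readlink(link).startswith(MANAGED_PREFIX + os.sep)
        if managed and not (src_root / link.name).is_dir():
            link.unlink()
            summary["pruned"].append(link.name)

    for skill in sorted(p for p in src_root.iterdir() if p.is_dir()):
        if not (skill / "SKILL.md").is_file():
            continue  # only directories containing a SKILL.md are eligible
        link = dst_root / skill.name
        target = os.path.join(MANAGED_PREFIX, skill.name)
        if link.is_symlink():
            if os.readlink(link) == target:
                summary["unchanged"].append(skill.name)
                continue
            link.unlink()  # assumption: a symlink pointing elsewhere is replaced
        elif link.exists():
            # A real file or directory: leave hand-written skills alone.
            summary["skipped"].append(skill.name)
            continue
        link.symlink_to(target, target_is_directory=True)
        summary["linked"].append(skill.name)
    return summary
```

Calling `sync_skills(Path("."))` from the repository root a second time should report every skill as unchanged, which is the idempotence the commit message describes.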
Summary
Adds two things that together let LTX-2 start acting as an SSIM regression guard:
1. `fastvideo/tests/ssim/test_ltx2_similarity.py`: LTX-2 is the only model family in `fastvideo/pipelines/basic/` without an SSIM coverage file. Adds one alongside the other per-family tests so the in-flight API refactor and future LTX-2-specific changes (two-stage refine, gpu_pool upstream, Dynamo backend) have a golden-quality regression guard.

   Test shape matches `test_wan_t2v_similarity.py` / `test_turbodiffusion_similarity.py`:
   * Default: `FastVideo/LTX2-Distilled-Diffusers`, 512x768, 45 frames, 4 inference steps, `sp_size=2` on 2 GPUs, `FLASH_ATTN`, `ltx2_vae_tiling=True`.
   * `--ssim-full-quality`: falls back to `ltx2_distilled` preset defaults (1024x1536, 121 frames, 8 steps, `guidance_scale=1.0`).
   * `REQUIRED_GPUS = 2` at module scope so the Modal SSIM orchestrator schedules correctly.
   * `LTX2_DISTILLED_MODEL_TO_PARAMS` named so the orchestrator can split by model id.

2. `.agents/skills/seed-ssim-references/`: agent skill that automates the one-time reference-video seeding flow this test (and any future family test) needs. Wraps `fastvideo/tests/modal/ssim_test.py --sync-generated-to-volume` with a SKILL.md + `scripts/seed_ssim.sh` launcher that:
   * Passes `--skip-reference-download --no-fail-fast` so generation completes even though no reference exists yet.
   * Emits the `modal volume get`, `reference_videos_cli.py copy-local`, and `reference_videos_cli.py upload` commands pre-filled for the run.

Test plan
* Test file parses (`python -c "import ast; ast.parse(...)"`).
* Test file follows the `REQUIRED_GPUS` + `*_MODEL_TO_PARAMS` contract the orchestrator auto-discovers.
* Launcher smoke checks (`seed_ssim.sh --help` prints; arg checks fire; mock run emits the expected follow-up commands).
* Use the `seed-ssim-references` skill to capture the generated videos and upload via `reference_videos_cli.py upload` to seed `FastVideo/ssim-reference-videos`.