Release 0.2.0 · hao-ai-lab/FastVideo

What's Changed

Added longcat-video python api examples by @shaoxiongduan in #994
[docs]: add LoRA extraction utilities documentation by @ShreejithSG in #992
[bugfix] Add configs for TurboDiffusion T2V/I2V models by @loaydatrain in #993
[ci] temporarily disable turbodiffusion ssim test by @SolitaryThinker in #1000
[CI] Fixed Turbodiffusion I2V CI by @loaydatrain in #1002
[feat!] Disable FSDP inference by default by @XOR-op in #1001
[misc] [bugfix] unpin 'av' in pyproject by @SolitaryThinker in #1009
[feat] Introduce Cosmos 2.5 Text2World pipeline by @KyleShao1016 in #974
[CI] SSIM tests optimization: load all model weights from Modal persistent Volume by @alexzms in #958
[CI] Fix OOM issues in ssim tests by @SolitaryThinker in #1011
[Bug Fix] Add autograd wrapper for block-sparse attention in fastvideo-kernel + fix CMake extension linking by @alexzms in #1015
[chore] release fastvideo-kernel 0.2.3 by @SolitaryThinker in #1018
[feat] Hooks API and layerwise offloading for all DiTs by @XOR-op in #1006
[kernel] Fix fastvideo-kernel release workflow by @SolitaryThinker in #1019
[kernel] [bugfix] [ci] bump v0.2.4. Fix STA output handling, TurboDiffusion CUDA norm dtypes for fastvideo-kernel unit tests. by @alexzms in #1020
Added LTX-2 Distilled T2V Generation by @shaoxiongduan in #1016
fix: SP for hunyuanvideo 1.5 by @XOR-op in #1026
[fastvideo-kernel] replace map to index with Triton implementation + add vsa benchmark by @alexzms in #1029
[bugfix] add omegaconf as dep. by @SolitaryThinker in #1032
[bugfix] Allow update timesteps for hy1.5 model. by @Davids048 in #1033
[feat] Add HY-World1.5-Bidirectional-480P-I2V by @mignonjia in #1027
[ci] Increase ci test error threshold by @alexzms in #1038
[bugfix] Fix NCCL all_gather contiguity + correct ParallelTiledVAE decode tiling threshold by @KyleShao1016 in #1037
[bugfix]: handle architectural differences while lora extraction by @ShreejithSG in #1035
[bugfix] fix torchvision import by @RandNMR73 in #1039
[docs] Update runpod instructions by @SolitaryThinker in #1043
[docs] Offloading instruction by @XOR-op in #1022
[feat] Add Matrix Game 2.0 training by @H1yori233 in #1017
[docs] Update design overview and add agents tutorial by @SolitaryThinker in #1044
[SP Sharding] Fix SP loss sharding on token axis (thw) with padding; add distributed correctness tests by @alexzms in #1045
fix: sageattn3 installation by @XOR-op in #1050
[bugfix] Double Normalization in Preprocessing Dataset by @H1yori233 in #1055
[Bugfix] [Wan I2V] Fix CLIP Image encoder config by @JerryZhou54 in #1063
[chore]: use higher precision timestamp in logging by @XOR-op in #1062
[feat] Add Cosmos 2.5 I2W/V2W support (staged pipeline + examples) by @KyleShao1016 in #1021
Added Sequence Parallelism for LTX-2 Distilled by @shaoxiongduan in #1036
[misc] upgrade torch to 2.10 by @SolitaryThinker in #1048
[feat] HYworld VAE with cache by @mignonjia in #1057
[misc] Fix naming instruction in runpod.md by @SolitaryThinker in #1067
[refactor] Action module by @XOR-op in #1065
[core] Refactor and centralize our registry for models, pipelines, and sampling params by @SolitaryThinker in #1066
Some minor fixes by @zhisbug in #1068
[Feature] [Hy1.5] Support HY1.5 super-resolution pipeline for 1080p videos by @JerryZhou54 in #1046
[Fix] remove video ratio limitation by @Eigensystem in #1069
more fix and relocate STA arguments to pipeline config by @zhisbug in #1073
[perf]: use CUDA IPC in multiproc executor to avoid serialization overhead by @XOR-op in #1061
readme small fix by @jzhang38 in #1076
[fix]: _compile_conditions regression by @XOR-op in #1077
add AGENTS.md file by @RandNMR73 in #1085
[Misc] [Training] Fixed a bunch of bugs in current training pipeline by @JerryZhou54 in #1084
[Feat] Port LTX2 trainer by @Davids048 in #1074
[Feat] Add Stable Diffusion 3.5 by @Ishxn20 in #1075
[Fix] CI Transformer Tests by @Eigensystem in #1089
[Model] LTX 2 Base by @Davids048 in #1064
[misc] cleanup assets/ and demo/ by @SolitaryThinker in #1091
[feat] Port LingBot-World-Base (Cam) by @H1yori233 in #1081
[misc] update wechat group link by @SolitaryThinker in #1098
[bugfix] Fixed ltx2 base cfg guidance by @shaoxiongduan in #1095
[bugfix] fastvideo-kernel: fix VSA Triton padding NaNs and support q/kv length mismatch by @alexzms in #1094
[kernel] add torch 2.10 to package build matrix by @SolitaryThinker in #1099
[feature] Add Hunyuan-GameCraft model support by @MihirJagtap in #1071
[bugfix] Fix failed kernel publish and SFT regressions by @SolitaryThinker in #1103
[perf] causal MatrixGame optimization by @XOR-op in #1078
[Fix] hunyuan postprocessing issue by @Eigensystem in #1104
[bugfix] fix import PreTrainedModel in stepllm.py by @dsynkd in #1108
Update README.md by @dsynkd in #1110
[Feat] Native dit implementation for SD3.5 by @Ishxn20 in #1093
[Misc] clean up VSA finetuning examples. by @jzhang38 in #1111
[bugfix] get_torch_device and other device calls were being made on non-cuda platforms by @dsynkd in #1107
[misc] add hy-world link to readme by @SolitaryThinker in #1113
Improve Docs by @jzhang38 in #1112
Upstream LTX2 Training by @jzhang38 in #1116
[Misc] Remove StepVideo by @jzhang38 in #1118
small refactor in post-processing to improve efficiency by @RandNMR73 in #1123
[Misc] Remove Teacache by @jzhang38 in #1121
[bugfix] Added ltx2 guidance missing modulation term by @shaoxiongduan in #1100
[Misc] Remove STA by @jzhang38 in #1124
[Feat] Improved CI by @Eigensystem in #1119
[misc] fix hunyuan by @jzhang38 in #1125
migrate uv by @Davids048 in #1127
[fix] preprocessing issue by @Eigensystem in #1134
[bugfix] fix matrix game kv indexing by @SolitaryThinker in #1135
[Misc] Fix memory leakage in VideoGenerator by @jzhang38 in #1132
Py/fix sp by @jzhang38 in #1138
[CI][Feat] launch 2 instance to run ssim by @Eigensystem in #1137
[bugfix]: fix a bug where collect_env was not running properly... by @dsynkd in #1145
[Doc] add doc for inference architecture by @Eigensystem in #1147
Added OpenAI-compatible API server and benchmark script by @AjAnubolu in #1109
[Refactor] SP Mask --> original seq len; HunyuanVideo 1.5 does not need mask by @jzhang38 in #1142
[CI] Add inference performance regression tests by @AjAnubolu in #1140
[CI] PR template by @Eigensystem in #1157
[misc] FlashAttention 4 support by @XOR-op in #1114
feat: Building agent friendly repo by @GindaChen in #1151
[Feat] Add causal Wan pipeline with multi-step denoising by @alexzms in #1161
[feat] Refactor training framework into fastvideo/train by @alexzms in #1159
Py/cleanup by @jzhang38 in #1163
[feat] Self-Forcing methods in refactored training infra by @alexzms in #1164
[feat] pre-commit support 120 col num by @alexzms in #1167
[feat]: Knowledge Distillation training method for ODE-init (KDMethod + KDCausalMethod) by @alexzms in #1166
[docs] Update README with realtime demo announcement by @zhisbug in #1169
[CI] add contributor interaction automation by @Eigensystem in #1170
[bugfix] self-forcing train/validation step mismatch by @H1yori233 in #1173
[misc] update action loading in validation and preprocess by @H1yori233 in #1143
[feat]: add HunyuanVideo model plugin for fastvideo/train framework by @alexzms in #1175
Kandinsky5 lite dit clean by @jaisurya27 in #1088
[misc]: reorganize training configs and add documentation by @alexzms in #1177
[bugfix]: fix I2V preprocessing crash for models without CLIP (Wan2.2 I2V) by @alexzms in #1184
[feat] Job Runner UI by @dsynkd in #1172
Revert "[feat] Job Runner UI" by @Eigensystem in #1188
[bugfix]: fix VAE temporal tiling blend corruption in tiled_encode by @alexzms in #1181
[feat]: overhaul SSIM test infrastructure — partition scheduling, helper migration, CI fixes by @Eigensystem in #1185
[ci] CI infrastructure cleanup and workflow reorganization (1/2) by @Eigensystem in #1186
[ci] Merge Queue, label system overhaul, and slash commands (2/2) by @Eigensystem in #1187
[ci] Add approval and pre-commit checks to merge protections by @Eigensystem in #1190
[ci] CI follow-up: gate checks, issue label unification, draft PR skip by @Eigensystem in #1193
[ci] Fix Merge Queue immediate dequeue by @Eigensystem in #1196
[ci] Fix Merge Queue requeue and draft PR pre-commit skip by @Eigensystem in #1197
ci: upgrade configuration to current format by @mergify[bot] in #1194
[ci] Replace Merge Queue with auto-merge — reduce CI complexity by @Eigensystem in #1200
[ci] Fix fork PR checkout for /test and Full Suite triggers by @Eigensystem in #1202
[ci] Add TEST_SCOPE routing for clean single-test execution by @Eigensystem in #1203
[ci] Trigger pre-commit on /test slash commands by @Eigensystem in #1205
[ci] Post pre-commit status to PR commit SHA by @Eigensystem in #1206
[ci] Add statuses:write permission for /test pre-commit by @Eigensystem in #1207
[ci] Remove Mergify ready-label race condition by @Eigensystem in #1208
[ci] Fix /merge to directly trigger Full Suite + simplify rebase conditions by @Eigensystem in #1209
[ci] Add retry for flaky tests and fix stale SSIM references by @Eigensystem in #1210
[ci] Ignore legacy reference videos when checking for HF download by @Eigensystem in #1211
[ci] Fix jq crash when Buildkite build env is null by @Eigensystem in #1212
[ci] Use pull_request_target for Full Suite trigger by @Eigensystem in #1213
[ci] Add direct test retry with check overwrite and aggregate status refresh by @Eigensystem in #1214
[ci] Use update instead of rebase for auto branch sync by @Eigensystem in #1215
[feat] add gen3c (cosmos-7b) model and pipeline support by @vishruthb in #1059
[feat] Job Runner UI by @Eigensystem in #1189
ci: upgrade configuration to current format by @mergify[bot] in #1216
[Feature] Add BSA (Bidirectional Sparse Attention) inference backend by @Satyam-53 in #1174
[feat] [1/n] API improvements: add initial files for new fastvideo public API by @SolitaryThinker in #1218
[perf]: Eliminate CPU-GPU synchronization bottlenecks in training pipeline by @rich7420 in #1217
[bugfix] Fixing LoRA distillation training distributed checkpointing bug by @klhhhhh in #1192
[feat] [2/n] Improve API: add initial support in video_generator by @SolitaryThinker in #1220
[feat] [3/n] Improve API: extend support to cli by @SolitaryThinker in #1226
[feat] [4/n] Improve API: refactor sampling param and merge with presets by @SolitaryThinker in #1234
[misc] small cleanup for API handling by @SolitaryThinker in #1235
[feat] [5/n] Improve API: wire ServeConfig.default_request into OpenAI serving by @SolitaryThinker in #1237
[feat] [5.5/n] Improve API: streaming server config surface + serve dispatch by @SolitaryThinker in #1238
[test] add LTX-2 distilled T2V SSIM regression test by @SolitaryThinker in #1240
[feat] [6/n] Improve API: LTX-2 public preset + asset wiring + gpu_pool translation by @SolitaryThinker in #1239
[feat] Add typed LTX-2 continuation state and streaming session store by @SolitaryThinker in #1250
[bugfix]: normalize uint8 pil_image in I2V VAE encoding by @Davids048 in #1249
[feat] Streaming WebSocket server skeleton (single generator + fMP4) by @SolitaryThinker in #1251
[docs]: clarify real_score_guidance_scale CFG parameterization by @alexzms in #1256
[Perf] Skip bool-mask round-trip in block-sparse VSA attention by @Godmook in #1243
[bugfix] Fix modal remote functions crash container on sys exit in CI remote functions by @Satyam-53 in #1261
[ci] add CPU unit tests for fastvideo.train load_run_config by @alexzms in #1264
[bugfix]: fix SP deadlock in negative prompt encoding during training by @alexzms in #1178
[ci] add CPU unit tests for train checkpoint utilities in fastvideo.train by @alexzms in #1265
[feat] Cosmos 2.5 training support in fastvideo.train by @alexzms in #1224
[feat] Stable Audio Open 1.0: T2A + A2A + RePaint inpainting (native) by @SolitaryThinker in #1260
[ci] add CPU unit tests for train callback system in fastvideo.train by @alexzms in #1267
[misc] cleanup: grad-norm asserts, dead offload file, callback names by @alexzms in #1268
[refactor] tests/local_tests: organize by model family by @SolitaryThinker in #1269
[bugfix]: classify stable_audio fields in schema parity inventory by @SolitaryThinker in #1275
[ci] pre-commit: drop stale excludes + document agent lint flow by @SolitaryThinker in #1276
[bugfix] Update fa import by @Davids048 in #1271
[docs] add hierarchical AGENTS.md per-directory guidance by @SolitaryThinker in #1278
[ci] Add CI Performance Regression Tracking Changes by @Satyam-53 in #1248
[misc] pin torch to 2.11.0 by @SolitaryThinker in #1277
[misc]: standardize install instructions on uv pip install by @SolitaryThinker in #1279
[feat] Improve API: streaming server GpuPool + worker subprocess by @SolitaryThinker in #1257
[feat] Improve API: streaming prompt enhancer with LLMProvider abstraction by @SolitaryThinker in #1258
[feat] Improve API: streaming auxiliaries (safety, rewrite, logger, mock) by @SolitaryThinker in #1284
[feat] Improve API: streaming router (multi-replica load balancer + ws proxy) by @SolitaryThinker in #1286
[ci] Replace flaky LTX-2 pixel SSIM with latent-slice cosine regression by @Godmook in #1253
[infra]: Add activation trace hooks for pipeline debugging by @SolitaryThinker in #1293
[feat] Add fastvideo.eval video evaluation suite by @shaoxiongduan in #1305
[feat]: Loader umbrella-repo support + optional component dirs by @SolitaryThinker in #1294
[infra]: New skill - decompose-pipeline-pr by @SolitaryThinker in #1303
[ci] mergify: accept [skill]/[skills] and [infra] PR title tags by @SolitaryThinker in #1309
[feat]: add LongCat bidirectional finetuning support by @aryan5v in #1244
[misc]: import add-model skill stack to .agents/skills/ by @SolitaryThinker in #1308
[misc] attention hot-path cleanup + denoising loop hoists by @alexzms in #1272
[feat] FastVideo World Model Training by @H1yori233 in #1179
[feat] Add Cosmos 2.5 T2W training pipeline (LoRA + full fine-tune) by @Mister-Raggs in #1227
[docs] Dreamverse 01/14: Add integration provenance by @Davids048 in #1324
[infra]: MagiHuman housekeeping (gitignore, codespell, skills index) (1/8) by @SolitaryThinker in #1295
[docs] Dreamverse 02/14: Add app documentation by @Davids048 in #1325
[feat] Dreamverse 03/14: Add backend skeleton by @Davids048 in #1326
[feat]: T5-Gemma encoder for MagiHuman pipeline (2/8) by @SolitaryThinker in #1296
[feat] Dreamverse 04/14: Add session and prompt logic by @Davids048 in #1327
[feat]: MagiHuman DiT (transformer) port + parity tests (3/8) by @SolitaryThinker in #1297
[feat]: MagiHuman pipeline stages (4/8) by @SolitaryThinker in #1298
[feat] Dreamverse 05/14: Add streaming runtime by @Davids048 in #1328
[feat]: MagiHuman pipeline orchestrator + 10-test parity battery (5/8) by @SolitaryThinker in #1299
[docs]: MagiHuman provenance - AGENTS.md, JOURNAL.md, lessons (6/8) by @SolitaryThinker in #1300
[infra]: MagiHuman checkpoint conversion + push scripts (7/8) by @SolitaryThinker in #1301
[feat] Dreamverse 06/14: Add frontend scaffold by @Davids048 in #1329
[ci] add GPU model loading tests for fastvideo.train (PR 4/9) by @alexzms in #1274
[bugfix] MatrixGame2 SF distillation under gradient checkpointing by @H1yori233 in #1340
feat: FP4 Flash Attention 4 for Blackwell GPUs by @Edenzzzz in #1221
[feat] Dreamverse 07/14: Add frontend session UI by @Davids048 in #1330
[feat] Dreamverse 08/14: Add frontend media and E2E coverage by @Davids048 in #1331
[infra] Dreamverse 09/14: Add Docker and launch scripts by @Davids048 in #1332
[misc]: empty __init__.py files with no logic by @SolitaryThinker in #1346
[misc]: PR-1225 sync — housekeeping (1/12) by @SolitaryThinker in #1347
[feat] eval: async VideoPool + metric streamlines by @shaoxiongduan in #1320
[feat] Dreamverse 10/14: Add serving API contracts by @Davids048 in #1333
[infra] Dreamverse 11/14: Add NVFP4 quantization support by @Davids048 in #1334
[feat]: Add NVFP4QAT quantization config (Attn-QAT 2/12) by @SolitaryThinker in #1348
[feat] Dreamverse 12/14: Add LTX2 refine and upsampler support by @Davids048 in #1335
[docs] Add copy page action by @Davids048 in #1351
[misc]: Add Dreamverse deploy skill frontmatter by @Davids048 in #1353
[feat] Dreamverse 13/14: Activate LTX2 integration by @Davids048 in #1336
[perf] shallow-copy VSA attn_metadata in train model plugins by @alexzms in #1342
[perf] Dreamverse 14/14: Add LTX2 profile speedups by @Davids048 in #1337
[feat]: Add NVFP4QAT linear layer (Attn-QAT 3/12) by @SolitaryThinker in #1350
[misc]: demote ROCm-unavailable startup message to DEBUG by @SolitaryThinker in #1360
[feat] eval: add audio metrics by @shaoxiongduan in #1352
[infra] [dreamverse]: add instruction to install nasm and update ffmpeg installer to work in plain venv by @SolitaryThinker in #1361
[feat] Add minimal LoRA finetuning support to the YAML training stack by @radicalyyyahaha in #1242
[misc] Rename MatrixGame to MatrixGame2 by @H1yori233 in #1357
[fix] Fix causal self-forcing attention settings by @mignonjia in #1355
[feat]: add FastLTX-2.3 Gradio demo package (draft) by @Davids048 in #1247
[perf] Mark LayerwiseOffloadHook entry points torch.compiler.disable (remove per-layer graph break) by @Mister-Raggs in #1365
[Bugfix] FP4 FA4 installation fix by @Edenzzzz in #1367
[bugfix]: shrink Dreamverse Docker context by @Davids048 in #1368
[ci] Add Dreamverse Docker image workflow by @Davids048 in #1369
[ci] Component time performance + reseed hf baseline skill by @Satyam-53 in #1292
[feat] Optimize distributed weight loading in multi-node training by @Edenzzzz in #572
[docs] Document performance benchmark workflow by @Satyam-53 in #1376
[docs] Document enable_torch_compile (+ A/B example) by @Mister-Raggs in #1366
[feat]: Attn-QAT inference + training backends (deadcode) (Attn-QAT 4/12) by @SolitaryThinker in #1358
chore: pin dreamverse npm deps to address Dependabot alerts by @SolitaryThinker in #1359
[infra] Add Dreamverse Modal UI image build by @Davids048 in #1381
[infra] Use npm for Dreamverse web builds by @Davids048 in #1385
[perf]: register FA2/FA3 default flash_attn_func as a torch.library custom op by @Mister-Raggs in #1373
[ci] Add DreamVerse app CI tests by @Davids048 in #1386
[refactor]: shared attention infra additions for QAT-compat (Attn-QAT 5/12) by @SolitaryThinker in #1383
[refactor] eval: consolidate FVD into common.fvd, remove benchmarks/fvd by @shaoxiongduan in #1380
[ci] add per-method single-step training tests for fastvideo.train by @alexzms in #1343
[feat] VSA-256 fastpath on Blackwell via FA4 CuTe block-sparse attention by @alexzms in #1354
[docs]: Wire activation trace into mkdocs nav + perf/troubleshooting by @SolitaryThinker in #1304
[docs]: surface activation-trace utility in add-model skills by @SolitaryThinker in #1399
[feat] eval: input ergonomics + Evaluator features + bug fixes by @shaoxiongduan in #1392
[bugfix] Fix Dreamverse Modal compile warmup latency by @Davids048 in #1394
[docs]: highlight Dreamverse deployment paths + add Server B200 (SSH) guide by @alexzms in #1409
[feat] Add MatrixGame3.0 by @H1yori233 in #1201
[feat] LTX-2.3 transformer support (config-gated extension of LTX-2) by @alexzms in #1397
[feat] LTX-2.3 audio: BWE vocoder path by @alexzms in #1398
[bugfix]: dreamverse modal bypasses ENTRYPOINT — set ffmpeg env + key check by @alexzms in #1413
[perf] Add Adaptive Guidance (CFG gating) for stale-uncond reuse by @rich7420 in #1372
[ci] Add additional Dreamverse UI tests by @kevin314 in #1417
[bugfix] Fix STFT dtype mismatch by @kevin314 in #1419
[bugfix] LTX2: honor video_position_offset_sec in the DiT by @H1yori233 in #1422
[feat] LoRA controls and integration for Dreamverse by @H1yori233 in #1420
[feat] dreamverse: sequence parallelism for serving by @shaoxiongduan in #1424
[bugfix] tests: include ltx2_3_base in expected LTX2 preset set (#1427) by @Mister-Raggs in #1428
[chore]: unpin runtime deps in pyproject.toml by @SolitaryThinker in #1431
[chore]: release v0.2.0 by @SolitaryThinker in #1432

New Contributors

@mignonjia made their first contribution in #1027
@Ishxn20 made their first contribution in #1075
@dsynkd made their first contribution in #1108
@AjAnubolu made their first contribution in #1109
@GindaChen made their first contribution in #1151
@jaisurya27 made their first contribution in #1088
@mergify[bot] made their first contribution in #1194
@vishruthb made their first contribution in #1059
@Satyam-53 made their first contribution in #1174
@rich7420 made their first contribution in #1217
@klhhhhh made their first contribution in #1192
@Godmook made their first contribution in #1243
@aryan5v made their first contribution in #1244
@Mister-Raggs made their first contribution in #1227
@radicalyyyahaha made their first contribution in #1242

Full Changelog: v0.1.7...v0.2.0
