support longcat-image block offload with 2 mgr #977
Conversation
Code Review
This pull request introduces CPU offloading capabilities for the LongCat Image model, specifically implementing block-level offloading. A new LongCatImageOffloadTransformerInfer class is added to manage asynchronous weight prefetching and double-buffering for transformer blocks. The LongCatImageTransformerInfer class is refactored to use a dispatching infer_func, and the main LongCatImageTransformerModel is updated to conditionally use the offload-enabled infer class and manage model/block movement between CPU and GPU. Weight classes (LongCatImageDoubleBlockWeights, LongCatImageSingleBlockWeights) are modified to support creating dedicated CUDA buffers for offloading. A new configuration file and a shell script are included to enable and demonstrate this feature. Feedback points out an unused self.block_idx attribute and a potential device mismatch error if an unsupported offload_granularity is configured.
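The asynchronous prefetching and double-buffering described above can be sketched in plain Python. Strings stand in for CUDA weight buffers, and the names (`BlockOffloadManager`, `prefetch`, `swap`, `run_blocks`) are illustrative, not the PR's actual API; in the real implementation the copies happen asynchronously on a side CUDA stream.

```python
# Minimal sketch of block-level double-buffered offloading: while block i
# computes on the active buffer, block i+1's weights are prefetched into
# the other buffer, so host-to-device copies overlap with compute.

class BlockOffloadManager:
    def __init__(self, num_buffers=2):
        self.buffers = [None] * num_buffers  # stand-ins for CUDA weight buffers
        self.active = 0                      # index of the buffer being computed on
        self.need_init_first_buffer = True

    def prefetch(self, block_weights):
        # Copy the next block's CPU weights into the inactive buffer
        # (on a separate stream in the real implementation).
        nxt = (self.active + 1) % len(self.buffers)
        self.buffers[nxt] = block_weights

    def swap(self):
        # Make the prefetched buffer the active one for the next block.
        self.active = (self.active + 1) % len(self.buffers)


def run_blocks(mgr, cpu_blocks):
    computed = []
    # The very first buffer must be filled synchronously, once per pass.
    if mgr.need_init_first_buffer:
        mgr.buffers[mgr.active] = cpu_blocks[0]
        mgr.need_init_first_buffer = False
    for idx in range(len(cpu_blocks)):
        if idx + 1 < len(cpu_blocks):
            mgr.prefetch(cpu_blocks[idx + 1])  # overlap copy with compute
        computed.append(mgr.buffers[mgr.active])  # "compute" on active buffer
        if idx + 1 < len(cpu_blocks):
            mgr.swap()
    return computed
```

If the buffering is correct, each block computes on exactly the weights that were staged for it, i.e. `run_blocks` visits the blocks in order.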
```python
current_stream = torch_device_module.current_stream()
self.offload_manager_double.compute_stream.wait_stream(current_stream)
for block_idx in range(len(blocks.double_blocks)):
    self.block_idx = block_idx
```
```python
if self.cpu_offload and self.offload_granularity == "block":
    self.transformer_infer_class = LongCatImageOffloadTransformerInfer
else:
    self.transformer_infer_class = LongCatImageTransformerInfer
```
The logic here only handles offload_granularity == "block". If cpu_offload is enabled but offload_granularity is set to something else (e.g., "phase"), it falls back to the base LongCatImageTransformerInfer. However, the infer method (lines 92-98) only handles "model" and "block" granularities. If a different granularity is provided, the weights will remain on CPU while computation is attempted on GPU, leading to a device mismatch error. Consider adding a check or defaulting to a supported mode.
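One way to implement the suggested guard is to validate `offload_granularity` up front and fail fast rather than silently falling back to the non-offload path. The helper below is a hedged sketch (the function and the set of supported values mirror the snippet and comment above, but are illustrative, not the PR's code):

```python
# Supported granularities per the review comment: "model" moves the whole
# model between CPU and GPU; "block" uses the double-buffered offload path.
SUPPORTED_OFFLOAD_GRANULARITIES = {"model", "block"}


def select_transformer_infer_class(cpu_offload: bool, offload_granularity: str) -> str:
    """Return the infer-class name to use, raising on unsupported configs."""
    if not cpu_offload:
        return "LongCatImageTransformerInfer"
    if offload_granularity not in SUPPORTED_OFFLOAD_GRANULARITIES:
        # Fail fast instead of leaving weights on CPU and hitting a
        # device-mismatch error deep inside the forward pass.
        raise ValueError(
            f"Unsupported offload_granularity {offload_granularity!r}; "
            f"expected one of {sorted(SUPPORTED_OFFLOAD_GRANULARITIES)}"
        )
    if offload_granularity == "block":
        return "LongCatImageOffloadTransformerInfer"
    return "LongCatImageTransformerInfer"  # "model": weights moved wholesale
```

Raising a `ValueError` at construction time surfaces the misconfiguration immediately, with a message naming the supported modes, instead of a confusing CUDA device-mismatch error at inference time.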
```python
for block_idx in range(len(blocks.double_blocks)):
    self.block_idx = block_idx

    if self.offload_manager_double.need_init_first_buffer:
```
I took a look, and it seems that if the buffer is re-initialized whenever block id == 0 at the start of each step, the `wait_stream` above becomes unnecessary and the result is still correct. My guess is that when the previous step ends, some `swap_blocks` copy has not yet finished before the next step begins, which is why the wait is needed. I don't think it noticeably affects speed though, so we can merge first.
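The ordering hazard the comment describes can be illustrated with a pure-Python analogy: a background thread stands in for the offload stream's in-flight `swap_blocks` copy, and an `Event.wait()` plays the role of `compute_stream.wait_stream(current_stream)`. This is a sketch of the synchronization idea only, not the PR's code:

```python
import threading


def prefetch_then_compute(use_wait: bool) -> str:
    """Simulate a compute step that may race with an async buffer swap."""
    buffer = {"weights": "stale"}   # buffer left over from the previous step
    copy_done = threading.Event()

    def swap_blocks():
        # Stand-in for the previous step's asynchronous weight copy.
        buffer["weights"] = "fresh"
        copy_done.set()

    copier = threading.Thread(target=swap_blocks)
    copier.start()
    if use_wait:
        # Analogous to compute_stream.wait_stream(current_stream):
        # order the pending copy before this step's compute.
        copy_done.wait()
    result = buffer["weights"]      # "compute" reads the buffer here
    copier.join()
    return result
```

With the wait in place the compute side always observes the freshly swapped weights; without it, the read races the copy and may see stale data, which matches the reviewer's hypothesis about why `wait_stream` is needed at step boundaries.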
pip install ruff pre-commit
Force-pushed from e3bcba7 to 5c52346
done
No description provided.