Rebase to latest slime by knlnguyen1802 · Pull Request #12 · SamitHuang/slime

knlnguyen1802 · 2026-05-17T14:19:07Z

No description provided.

* Draft router design
* Add vllm router
* Add router to script
* Fix gpu memory utilization
* Fix output token ids
* Add more nccl flag
* Fix bug

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

gemini-code-assist

Code Review

This pull request introduces vLLM as a first-class rollout backend for Slime, refactoring the codebase to support backend-agnostic requests and responses through new adapter interfaces. Significant additions include a managed vLLM engine lifecycle, a translation sidecar for protocol compatibility, specialized weight synchronization mechanisms for both colocated and distributed deployments, and a Megatron patch for chunked gradient coalescence. Review feedback highlights the need to restore configuration validation logic and remove an unconditional override of the rollout offloading setting. Furthermore, the reviewer suggests replacing hardcoded file paths in the new shell scripts with variables to enhance environment portability.

gemini-code-assist · 2026-05-17T14:24:56Z

+        # if hasattr(hf_config, hf_config_name) and hasattr(args, megatron_config_name):
+        #     if not compare_fn(getattr(hf_config, hf_config_name), getattr(args, megatron_config_name)):
+        #         errors.append(
+        #             f"{hf_config_name} in hf config {getattr(hf_config, hf_config_name)} is not equal to "
+        #             f"{megatron_config_name} {getattr(args, megatron_config_name)}, please check the config."
+        #         )


The validation logic comparing Hugging Face config values with Megatron arguments has been commented out. This is risky as it can lead to silent misconfigurations that are hard to debug at runtime. If the validation was causing issues, it should be fixed or adapted rather than completely disabled. Please consider restoring this validation to ensure configuration correctness.
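One way to restore the check without hard failures on missing attributes is sketched below. It reuses the names visible in the commented-out diff (`hf_config`, `args`, `compare_fn`, `errors`), but the function name `validate_hf_against_megatron`, the `name_pairs` argument, and the sample values are assumptions for illustration, not code from this PR.

```python
from types import SimpleNamespace


def validate_hf_against_megatron(hf_config, args, name_pairs, compare_fn=lambda a, b: a == b):
    """Return one error string per mismatched (hf_name, megatron_name) pair.

    Hypothetical helper: name_pairs is an assumed list of attribute-name pairs.
    """
    errors = []
    for hf_name, megatron_name in name_pairs:
        # Skip pairs where either side lacks the attribute, like the original guard.
        if not (hasattr(hf_config, hf_name) and hasattr(args, megatron_name)):
            continue
        hf_value = getattr(hf_config, hf_name)
        megatron_value = getattr(args, megatron_name)
        if not compare_fn(hf_value, megatron_value):
            errors.append(
                f"{hf_name} in hf config {hf_value} is not equal to "
                f"{megatron_name} {megatron_value}, please check the config."
            )
    return errors


# Illustrative values only; they are not taken from the PR.
hf_config = SimpleNamespace(hidden_size=896, num_attention_heads=14)
args = SimpleNamespace(hidden_size=896, num_attention_heads=16)
pairs = [("hidden_size", "hidden_size"), ("num_attention_heads", "num_attention_heads")]
errors = validate_hf_against_megatron(hf_config, args, pairs)
print(errors)
```

Collecting the messages in a list, as the original snippet did, lets the caller report every mismatch at once instead of stopping at the first one.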

gemini-code-assist · 2026-05-17T14:24:56Z

        args.offload_train = False
    if args.offload_rollout is None:
        args.offload_rollout = False
+    args.offload_rollout = False 


The offload_rollout argument is unconditionally set to False. This seems to override any user-provided value and disables the feature. If this is intentional, it should be documented. Otherwise, this line should be removed to respect the user's configuration.
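A minimal sketch of the defaulting behavior this comment asks for: fill in `False` only when the option was left unset, and never overwrite an explicit value. The wrapper function `apply_offload_defaults` is hypothetical, and the `is None` guard on `offload_train` is an assumption, since the diff shows only part of that branch.

```python
from types import SimpleNamespace


def apply_offload_defaults(args):
    """Fill in False only where the user left the option unset (None)."""
    # Assumed guard: the diff context shows only the assignment for offload_train.
    if args.offload_train is None:
        args.offload_train = False
    if args.offload_rollout is None:
        args.offload_rollout = False
    # No unconditional assignment afterwards, so explicit values survive.
    return args


explicit = apply_offload_defaults(SimpleNamespace(offload_train=None, offload_rollout=True))
unset = apply_offload_defaults(SimpleNamespace(offload_train=None, offload_rollout=None))
print(explicit.offload_rollout, unset.offload_rollout)
```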

gemini-code-assist · 2026-05-17T14:24:56Z

+PYTHONPATH=/root/Megatron-LM python tools/convert_hf_to_torch_dist.py \
+    ${MODEL_ARGS[@]} \
+    --hf-checkpoint /root/Qwen2.5-0.5B-Instruct \
+    --save /root/Qwen2.5-0.5B-Instruct_torch_dist/


This script contains hardcoded paths (e.g., /root/Megatron-LM, /root/Qwen2.5-0.5B-Instruct). This reduces portability and makes it difficult for other users to run the script in different environments. Consider using environment variables or command-line arguments to specify these paths to improve reusability.
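As a rough illustration of the suggestion, the sketch below builds the same conversion command from environment variables and falls back to the current paths when they are unset. The variable names `MEGATRON_PATH`, `HF_CHECKPOINT`, and `TORCH_DIST_SAVE` and the helper `build_convert_command` are assumptions, not part of the PR; the same idea applies directly in the shell script with `${VAR:-default}`.

```python
import os


def build_convert_command(model_args, env=None):
    """Return (child_env, argv) for the conversion step with overridable paths.

    Hypothetical helper; the environment variable names are assumptions.
    """
    env = os.environ if env is None else env
    megatron_path = env.get("MEGATRON_PATH", "/root/Megatron-LM")
    hf_checkpoint = env.get("HF_CHECKPOINT", "/root/Qwen2.5-0.5B-Instruct")
    # Default save directory mirrors the script: "<checkpoint>_torch_dist/".
    save_dir = env.get("TORCH_DIST_SAVE", hf_checkpoint.rstrip("/") + "_torch_dist/")
    child_env = {"PYTHONPATH": megatron_path}
    argv = [
        "python", "tools/convert_hf_to_torch_dist.py",
        *model_args,
        "--hf-checkpoint", hf_checkpoint,
        "--save", save_dir,
    ]
    return child_env, argv


# model_args stands in for the script's MODEL_ARGS array; left empty here.
child_env, argv = build_convert_command(
    [], env={"HF_CHECKPOINT": "/data/models/Qwen2.5-0.5B-Instruct"}
)
print(child_env, argv)
```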

gemini-code-assist · 2026-05-17T14:24:56Z

+   --hf-checkpoint /root/Qwen2.5-0.5B-Instruct/
+   --ref-load /root/Qwen2.5-0.5B-Instruct_torch_dist/


This script contains hardcoded paths (e.g., /root/Qwen2.5-0.5B-Instruct/, /root/gsm8k/...). This reduces portability and makes it difficult for other users to run the script in different environments. It's recommended to use environment variables or command-line arguments for these paths.

gemini-code-assist · 2026-05-17T14:24:56Z

+   --hf-checkpoint /root/Qwen2.5-0.5B-Instruct/
+   --ref-load /root/Qwen2.5-0.5B-Instruct_torch_dist/


This script contains hardcoded paths (e.g., /root/Qwen2.5-0.5B-Instruct/, /root/gsm8k/...). This reduces portability and makes it difficult for other users to run the script in different environments. It's recommended to use environment variables or command-line arguments for these paths.

knlnguyen1802 · 2026-05-18T02:46:24Z

Latest result

[Figure: Reward, colocated training]

[Figure: Reward, non-colocated training]

SamitHuang and others added 30 commits March 2, 2026 16:32

temp save rfc

08ce80b

Signed-off-by: SamitHuang <285365963@qq.com>

add plan

3af806e

Signed-off-by: SamitHuang <285365963@qq.com>

update

48fbde3

Signed-off-by: SamitHuang <285365963@qq.com>

qwen2.5 0.5b non-colocate (first attempt ok, but nccl error later)

f8ceed6

Signed-off-by: samithuang <285365963@qq.com>

add convert script

2caa4a0

add setup doc

8caa8ba

fix nccl error by NcclBridge subprocess

25ee005

Add rollout backend client and test qwen2.5-0.5b non-colocate training

09f534a

Add rollout backend client and test qwen2.5-0.5b non-colocate training

eliminate gpu to cpu weight transfer

ab7eb0b

Signed-off-by: samithuang <285365963@qq.com>

Eliminate intermediate CPU tensors for faster weight transfer

411e2d2

Eliminate intermediate CPU tensors

Revise weight synchronization strategy in goal plan

546d2ad

Reorder weight synchronization support for colocate and non-colocate scenarios in the goal plan.

Plan refactor vllm/sglang

c154078

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Code implemented

1a2dcf5

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix bug

8addb37

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix bug

9689a97

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix bug

3753432

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix port

f1e7554

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix config

91cc780

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix bug MOE weight sync

be1ecd4

Fix bug vllm transfer weight

e7216d8

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix weight sync

2343647

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix

8498b7b

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix config

8a41184

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Change name config

acc9690

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Resolve review

90934e6

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Try colocated vllm weight

6b4c373

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

[Draft] Local runable dev

49d760f

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix offload train

ee2702b

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix offload train

cb77fc5

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

knlnguyen1802 and others added 12 commits April 22, 2026 14:46

Fix offload_rollout

ba06919

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix vllm offload

3491aae

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix offload traing

af2bfb4

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix offload weight

14aa09a

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix offload weight

d82d1d0

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix CI: update rollout_data_postprocess plugin contract for new call site (THUDM#1902)

c8aaf01

Co-authored-by: jingshenghang <shenghang.jing@aminer.cn>

Patch Megatron TP grad coalesce to chunked all-reduce (THUDM#1899)

5b326e6

fix: harden retool rollout against multi-turn / retry desync (THUDM#1861)

41dc3b6

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Fix log file

a7a3ee1

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Rebase

e7dbc8d

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix import engine group

b015d72

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

Fix rebase code

26f979d

Signed-off-by: knlnguyen1802 <knlnguyen1802@gmail.com>

gemini-code-assist Bot reviewed May 17, 2026

SamitHuang merged commit 9146f9f into SamitHuang:rebase May 18, 2026
1 check failed

SamitHuang merged 42 commits into SamitHuang:rebase from knlnguyen1802:rebase-vllm

4 participants
