I'm trying to reproduce the training process of DeepCoder-1.5B, but I found that the base model at https://wandb.ai/mluo/deepcoder/runs/s3lpnxwa/overview is "checkpoints/deepscaler/deepscaler-code-32k-easy/global_step_320/actor/checkpoint", which already has a relatively high val score (~23 on test_lcb).
How can I improve from r1-distilled-qwen-1.5b (~16-17) to the performance of this checkpoint? Thanks for your advice~