Merged
6 changes: 3 additions & 3 deletions blog/2025-11-19-miles.md
@@ -5,13 +5,13 @@ date: "November 19, 2025"
previewImg: /images/blog/miles/miles.jpg
---

-> A journey of a thousand miles begins with a single step.
+> *A journey of a thousand miles begins with a single step.*

We're excited to introduce Miles, an enterprise-facing reinforcement learning framework designed for large-scale MoE training and production workloads. This introductory chapter will be the beginning of a series of tech blogs.

Miles is forked from slime, the lightweight RL framework that has quietly powered many of today’s post-training pipelines and large MoE training runs. Building on slime’s foundation, Miles aims to deliver a smooth and controllable RL experience for teams that need reliability and scale in real-world deployments.

-The GitHub link for Miles can be found here: https://github.com/radixark/miles
+The GitHub link for Miles can be found [here](https://github.com/radixark/miles).

## 🧠 Starting Point: slime - A Lightweight and Customizable RL Framework

@@ -64,7 +64,7 @@ In RL, freezing the draft model prevents it from following the target model poli

### Miscellaneous Updates

-Enhance the FSDP training backend; allow deploying the rollout subsystem independently outside the framework; debug utilities such as more metrics, post-hoc analyzers, and enhancing profilers; gradually refactor the code to further enhance it; A formal mathematics (Lean) example is provided with SFT/RL scripts.
+Enhance the FSDP training backend; allow deploying the rollout subsystem independently outside the framework; debug utilities such as more metrics, post-hoc analyzers, and enhancing profilers; gradually refactor the code to further enhance it; A formal mathematics (Lean) example is provided with [SFT/RL scripts](https://github.com/radixark/miles/tree/main/examples/formal_math/single_round).

## 🚧 Towards the Future: Our Roadmap
