diff --git a/src/scheduling/README.md b/src/scheduling/README.md
index ce6118c2..13b1cf13 100644
--- a/src/scheduling/README.md
+++ b/src/scheduling/README.md
@@ -2,8 +2,17 @@
This directory implements a two-phase scheduler for distributed LLM inference:
-- **Phase 1 — Layer allocation**: assign contiguous decoder layer ranges to nodes and rebalance in place.
-- **Phase 2 — Request routing**: compute an end-to-end, minimum-latency path across the assigned node ranges.
+### Phase 1 — Layer allocation
+
+Assign contiguous decoder layer ranges to nodes and rebalance in place, as illustrated below:
+
+
+
+### Phase 2 — Request routing
+
+Compute an end-to-end, minimum-latency path across the assigned node ranges, as illustrated below:
+
+
The main entrypoint is `scheduling.scheduler.Scheduler`, which orchestrates allocation, dynamic joins/leaves, health checks, and routing.
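
As a rough illustration of the two phases described above, the following is a minimal sketch, not the code in `scheduling.scheduler`. The names `Assignment`, `allocate_layers`, and `route`, the capacity weights, and the per-range latency figures are all hypothetical and chosen only for this example. Phase 1 is approximated by splitting the layers into contiguous ranges proportional to capacity (rebalancing is omitted). Phase 2 is approximated by a dynamic program over layer boundaries that picks the cheapest chain of ranges.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    node: str
    start: int         # first layer hosted (inclusive)
    end: int           # one past the last layer hosted (exclusive)
    latency_ms: float  # hypothetical cost of running [start, end) on this node


def allocate_layers(capacities, num_layers):
    """Phase 1 sketch: give each node a contiguous range sized by its capacity."""
    total = sum(capacities.values())
    ranges, start = {}, 0
    names = list(capacities)
    for i, name in enumerate(names):
        if i == len(names) - 1:
            end = num_layers  # last node absorbs any rounding remainder
        else:
            share = round(num_layers * capacities[name] / total)
            end = min(num_layers, start + share)
        ranges[name] = (start, end)
        start = end
    return ranges


def route(assignments, num_layers):
    """Phase 2 sketch: cheapest chain of ranges covering layers [0, num_layers)."""
    best = [math.inf] * (num_layers + 1)  # best[k]: min latency to finish layers [0, k)
    prev = [None] * (num_layers + 1)      # assignment used to reach boundary k
    best[0] = 0.0
    # Processing by start layer ensures best[a.start] is final before it is used.
    for a in sorted(assignments, key=lambda a: a.start):
        if a.start < 0 or a.end <= a.start or a.end > num_layers:
            continue  # skip empty or out-of-bounds ranges
        cost = best[a.start] + a.latency_ms
        if cost < best[a.end]:
            best[a.end], prev[a.end] = cost, a
    if math.isinf(best[num_layers]):
        return math.inf, []  # no chain of ranges covers every layer
    path, k = [], num_layers
    while k > 0:
        path.append(prev[k].node)
        k = prev[k].start
    return best[num_layers], path[::-1]
```

For example, `allocate_layers({"a": 2.0, "b": 1.0, "c": 1.0}, 8)` splits eight layers into three contiguous ranges, and `route` can then be given those ranges plus any overlapping replicas to choose between them. A real scheduler would also need to account for inter-node network latency, which this sketch ignores.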