refactor(engine): extract WeightUploadOrchestrator (init_weights) to its own TU #312

Closed
kekzl wants to merge 3 commits into main from refactor/engine-extract-weight-upload

@kekzl kekzl commented May 20, 2026

Summary

Move `Engine::init_weights` (~230 LOC) from `src/runtime/engine.cpp` to `src/runtime/engine_weight_upload.cpp`. Mechanical TU split, body byte-identical.
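A minimal sketch of what a mechanical TU split looks like, shown as one file with comments marking where each part would live. Only `Engine::init_weights` and the file names come from this PR; the members `tensor_sizes_`, `bytes_uploaded_`, and the accessor `bytes_uploaded()` are hypothetical stand-ins, and the body here is illustrative rather than the real ~230 LOC.

```cpp
#include <cstddef>
#include <vector>

// ---- engine.h: the declaration does not change in a TU split ----------
class Engine {
public:
    bool init_weights();                 // definition moves to the new TU
    std::size_t bytes_uploaded() const;  // hypothetical accessor, stays in engine.cpp
private:
    std::vector<std::size_t> tensor_sizes_{16, 32, 64};  // hypothetical state
    std::size_t bytes_uploaded_ = 0;                     // hypothetical state
};

// ---- engine_weight_upload.cpp: new TU, would #include "engine.h" ------
// The member function body is moved verbatim; only its file changes.
bool Engine::init_weights() {
    for (std::size_t n : tensor_sizes_) {
        bytes_uploaded_ += n;
    }
    return bytes_uploaded_ > 0;
}

// ---- engine.cpp: what remains, would #include "engine.h" --------------
std::size_t Engine::bytes_uploaded() const {
    return bytes_uploaded_;
}
```

Because the class definition in the header is untouched, callers and the ABI are unaffected; the linker resolves `Engine::init_weights` from whichever object file defines it.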

Stacking

Stack chain: #310 → #311 → this.

LOC

`engine.cpp` 2752 → 2522 (-230). New file 253 LOC.

Test plan

  • `make build` green
  • `make verify-fast` green
  • Pre-push hook verify-fast green
  • Combined spec + code-quality reviewer ✅ Approved

Phase 4 Task 3 of `docs/superpowers/specs/2026-05-20-architecture-refactor-roadmap-design.md`.

🤖 Generated with Claude Code

kekzl and others added 3 commits May 20, 2026 13:17
Anonymous-namespace and file-scope static helpers move to a new header
src/runtime/engine_internal.h so subsequent per-subsystem extractions
(Tasks 2-7 of Phase 4) can share them across translation units.

No behavior change.
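
Helpers in an anonymous namespace or marked file-scope `static` have internal linkage, so a second translation unit cannot call them. One common way to share them, sketched below, is to move them into a header as `inline` functions. The namespace `engine_internal`, the helper `align_up`, and the two caller functions are hypothetical; the actual contents of `engine_internal.h` are not shown in this PR.

```cpp
// ---- engine_internal.h (sketch) ----------------------------------------
#ifndef ENGINE_INTERNAL_SKETCH_H
#define ENGINE_INTERNAL_SKETCH_H

#include <cstddef>

namespace engine_internal {

// `inline` allows the same definition to appear in every TU that includes
// this header without violating the one-definition rule.
// Rounds n up to the next multiple of alignment (alignment must be > 0).
inline std::size_t align_up(std::size_t n, std::size_t alignment) {
    return (n + alignment - 1) / alignment * alignment;
}

}  // namespace engine_internal

#endif  // ENGINE_INTERNAL_SKETCH_H

// ---- engine_weight_upload.cpp would #include "engine_internal.h" -------
std::size_t padded_weight_bytes(std::size_t raw) {
    return engine_internal::align_up(raw, 256);
}

// ---- engine_kv_cache_init.cpp would #include "engine_internal.h" -------
std::size_t padded_kv_bytes(std::size_t raw) {
    return engine_internal::align_up(raw, 4096);
}
```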

Phase 4 of docs/superpowers/specs/2026-05-20-architecture-refactor-roadmap-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Moves Engine::init_apply_debug_raw_overrides_,
Engine::init_resolve_kv_dtype_policy_,
Engine::init_resolve_ssm_dtype_,
Engine::init_resolve_fp8_prefill_,
Engine::init_resolve_quant_flags_, and
Engine::init_compute_max_seq_len_ (~320 LOC across 6 methods) to
src/runtime/engine_init_resolver.cpp.

Declarations in engine.h unchanged. Bodies byte-identical.

Phase 4 of docs/superpowers/specs/2026-05-20-architecture-refactor-roadmap-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…its own TU

Moves Engine::init_weights (~230 LOC) from src/runtime/engine.cpp to
src/runtime/engine_weight_upload.cpp. Declaration in engine.h unchanged.
Body byte-identical.

Phase 4 of docs/superpowers/specs/2026-05-20-architecture-refactor-roadmap-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

kekzl commented May 20, 2026

Superseded by #334 — rebased onto current main since this branch couldn't be force-pushed via the harness.

@kekzl kekzl closed this May 20, 2026
kekzl added a commit that referenced this pull request May 20, 2026
* refactor(engine): extract WeightUploadOrchestrator (init_weights) to its own TU

Moves Engine::init_weights (~230 LOC) from src/runtime/engine.cpp to
src/runtime/engine_weight_upload.cpp. Declaration in engine.h unchanged.
Body byte-identical.

Phase 4 of docs/superpowers/specs/2026-05-20-architecture-refactor-roadmap-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(engine): extract KVCacheInitializer (init_kv_cache) to its own TU

Moves Engine::init_kv_cache (~255 LOC) to src/runtime/engine_kv_cache_init.cpp.

Declaration in engine.h unchanged. Body byte-identical.

Phase 4 of docs/superpowers/specs/2026-05-20-architecture-refactor-roadmap-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(engine): extract WorkspaceBuilder + Warmup + banned-token list

Combines three small init-tail methods into one TU:
  - Engine::init_features (~77 LOC)
  - Engine::build_banned_token_list (~85 LOC)
  - Engine::warmup (~91 LOC)

All three run during engine init after weights + KV cache are set up;
colocating them keeps the init-tail logic in one file. Declarations in
engine.h unchanged. Bodies byte-identical.

Phase 4 of docs/superpowers/specs/2026-05-20-architecture-refactor-roadmap-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(engine): extract Scheduler to its own TU

Moves 13 step/prefill/decode methods (~1250 LOC, the bulk of engine.cpp)
to src/runtime/engine_scheduler.cpp:
  - step, step_async_graph_resume, step_schedule
  - supports_chunked_prefill_, resolve_prefill_chunk_size_
  - step_prefill, prefill_allocate_kv_blocks_,
    prefill_upload_metadata_, step_prefill_one
  - step_decode, decode_build_inference_state_,
    step_decode_forward, step_decode_process_outputs

All declarations in engine.h unchanged. Bodies byte-identical.

This is the largest single drop in Phase 4 of the architecture refactor.

Phase 4 of docs/superpowers/specs/2026-05-20-architecture-refactor-roadmap-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(engine): extract Sampling helpers + Stop controller — final split

Moves 6 decode-loop tail methods to src/runtime/engine_sampling_stop.cpp:
  - is_stop_token, track_think_state, should_stop (stop check)
  - fill_sampling_params, upload_penalties, fill_recurrent_state (per-request setup)

Two related clusters colocated because they share the per-request
state-passing pattern (Request& + InferenceState&) and run in the
decode loop tail.

This is the FINAL per-subsystem extraction of Phase 4. After this PR,
engine.cpp is at ~570 LOC — well under the <=800 LOC plan target.

Declarations in engine.h unchanged. Bodies byte-identical.

Phase 4 of docs/superpowers/specs/2026-05-20-architecture-refactor-roadmap-design.md

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(arch): close Phase 4 of refactor roadmap

Phase 4 (split up Engine.cpp) is done. PRs that landed:
- #310 engine_internal.h shared helpers
- #311 InitResolver (~319 LOC)
- #312 WeightUploadOrchestrator (~230 LOC)
- #313 KVCacheInitializer (~255 LOC)
- #314 WorkspaceBuilder + Warmup + banned-tokens (~247 LOC combined)
- #315 Scheduler — 13 step/prefill/decode methods (~1244 LOC biggest)
- #316 Sampling + Stop helpers (~206 LOC) — final per-subsystem extraction

Outcome: engine.cpp 3112 → 570 LOC façade (target ≤800, beaten by ~30 %).

Plan deviated from "named class with constructor injection" to
"TU-split-only" — same pattern as Phase 3. Multi-week pure-functional
class redesign avoided. Promoting to named classes preserved as
opportunistic future refactor.
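
For contrast, a rough sketch of what the deferred "named class with constructor injection" shape could look like. Everything here is hypothetical: `DeviceAllocator`, the `upload` method, and `EngineFacade` are invented for illustration, and only the name `WeightUploadOrchestrator` appears in the roadmap.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical dependency that the orchestrator needs.
struct DeviceAllocator {
    std::size_t allocated = 0;
    void allocate(std::size_t bytes) { allocated += bytes; }
};

// Named class: its dependency is injected through the constructor instead
// of being reached through Engine's private members.
class WeightUploadOrchestrator {
public:
    explicit WeightUploadOrchestrator(DeviceAllocator& alloc) : alloc_(alloc) {}

    // Allocates space for every tensor and returns the total byte count.
    std::size_t upload(const std::vector<std::size_t>& tensor_bytes) {
        std::size_t total = 0;
        for (std::size_t n : tensor_bytes) {
            alloc_.allocate(n);
            total += n;
        }
        return total;
    }

private:
    DeviceAllocator& alloc_;
};

// The engine becomes a thin facade that delegates to the named class.
class EngineFacade {
public:
    explicit EngineFacade(DeviceAllocator& alloc) : uploader_(alloc) {}

    bool init_weights(const std::vector<std::size_t>& tensor_bytes) {
        return uploader_.upload(tensor_bytes) > 0;
    }

private:
    WeightUploadOrchestrator uploader_;
};
```

The TU-split-only approach skips this redesign: methods stay members of `Engine` and keep access to its state, which is why the bodies could be moved byte-identical.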

Closeout updates:
1. docs/superpowers/specs/...-roadmap-design.md — Phase 4 status
   with PR list + deviation note.
2. docs/architecture.md — rewrite engine.cpp wound to reflect the
   new 570-LOC façade + 6 per-subsystem TU layout.

Phase 5 (Layers and APIs: VRAM owner, RuntimeConfig de-globalize,
Public API dedupe, 1 GiB S-matrix wound) is the FINAL phase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(engine): remove stray closing brace in engine_scheduler.cpp

CI failure on PR #337 / v2 #337 (= #325 chain head for scheduler):
syntax error at engine_scheduler.cpp:1297 due to extra '}' after the
namespace-scope function closing brace. Introduced during the local
rebase conflict resolution when I replaced the scheduler function
bodies with HEAD versions via sed cascade.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>