Sprint 45: extract cli/commands.py + per-domain dispatchers #18
Merged
Conversation
The dispatcher previously did 'from dlm.preference import build_judge' (re-export). Tests monkeypatch the canonical 'dlm.preference.judge.build_judge' path; using the canonical import in the dispatcher keeps function-local attribute lookup aligned with what tests patch.
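A minimal sketch of why the canonical import matters, assuming a `run_preference` dispatcher shape (the request field is hypothetical):

```python
# Import-time binding: this captures the function object when the dispatcher
# module is first imported. A later monkeypatch of
# dlm.preference.judge.build_judge (or of the package re-export
# dlm.preference.build_judge) rebinds a module attribute, not this local
# name, so the dispatcher would keep calling the original function.
from dlm.preference import build_judge

# Call-time binding: only the judge module is bound here; build_judge is
# looked up as an attribute on every call, so
# monkeypatch.setattr("dlm.preference.judge.build_judge", fake) is visible.
from dlm.preference import judge as _judge_mod

def run_preference(request):
    judge = _judge_mod.build_judge(request.judge_spec)  # judge_spec is hypothetical
    ...
```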
Lifts build_backend + load + generate out of the CLI for text-only bases. VL and audio paths still live in prompt.py CLI helpers; a follow-up phase splits them into modality-aware dispatchers.
….dispatch:run_train

Lifts the hardware probe → manifest bootstrap → phase orchestration sequence out of the CLI. Watch loop, RPC probe server, multi-GPU accelerate launcher dispatch, and license interactive prompt stay CLI-side. Dotted imports in the dispatcher keep tests' monkeypatches on dlm.hardware.doctor and dlm.train.preference.phase_orchestrator.run_phases visible at call time.
Two bugs combined to make `dlm prompt --backend mlx` produce base-model behavior even with a fully-trained PEFT LoRA adapter:

1. `target_modules` from PEFT is bare (`q_proj`), but mlx-lm's `linear_to_lora_layers` matches `named_modules()` keys inside each transformer block via exact equality. The FQN within a block is `self_attn.q_proj`, so no keys ever matched and `linear_to_lora_layers` silently left the model un-wrapped.
2. PEFT and mlx-lm use different LoRA tensor layouts: PEFT `lora_A = [r, in]`, `lora_B = [out, r]`; mlx-lm `lora_a = [in, r]`, `lora_b = [r, out]`. mlx-lm's `model.load_weights(strict=False)` silently skipped the mismatched shapes, leaving zero overlay.

The user-visible failure was "trained model behaves identically to base" — surfaced during the audit-13 follow-up Finding 04 direct-query smoke test.
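A hedged sketch of the conversion the fix implies, built only from the facts above: the module-name remap and the transposes come from this PR, while the helper names, the numpy plumbing, and the flat weight-dict shape are assumptions (full PEFT key-prefix munging is elided).

```python
import numpy as np

# PEFT names target modules bare ("q_proj"); mlx-lm's linear_to_lora_layers
# compares block-relative FQNs by exact equality, so remap to the
# self_attn/mlp convention before wrapping.
BARE_TO_BLOCK_FQN = {
    "q_proj": "self_attn.q_proj",
    "k_proj": "self_attn.k_proj",
    "v_proj": "self_attn.v_proj",
    "o_proj": "self_attn.o_proj",
    "gate_proj": "mlp.gate_proj",
    "up_proj": "mlp.up_proj",
    "down_proj": "mlp.down_proj",
}

def convert_lora_tensors(peft_weights: dict[str, np.ndarray]) -> dict[str, np.ndarray]:
    """Transpose PEFT LoRA tensors into mlx-lm's layout.

    PEFT:   lora_A = [r, in],  lora_B = [out, r]
    mlx-lm: lora_a = [in, r],  lora_b = [r, out]

    Without the transpose, model.load_weights(strict=False) silently skips
    every mismatched shape and the overlay is empty.
    """
    converted = {}
    for key, tensor in peft_weights.items():
        if "lora_A" in key:
            converted[key.replace("lora_A", "lora_a")] = tensor.T
        elif "lora_B" in key:
            converted[key.replace("lora_B", "lora_b")] = tensor.T
        else:
            converted[key] = tensor
    return converted
```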
Even with the conversion fix, an unconvertible adapter (architecture whose layers don't follow the self_attn/mlp convention) would still fall through to base-model output silently. Add a post-load guard that walks the model's `trainable_parameters` and raises `MlxConversionError` when zero `lora_a`/`lora_b` parameters are present. Surfaces the failure as a clear message pointing at `--backend pytorch` instead of letting the trained adapter behave identically to the base.
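A sketch of that guard, assuming mlx-lm's nested parameter tree and `mlx.utils.tree_flatten`; the exception class here is a stand-in for the project's own `MlxConversionError`:

```python
from mlx.utils import tree_flatten

class MlxConversionError(RuntimeError):
    """Stand-in for the project's exception of the same name."""

def assert_lora_overlay_applied(model) -> None:
    """Fail loudly if the adapter load left zero LoRA parameters behind."""
    # tree_flatten yields dotted (name, value) pairs such as
    # ("layers.0.self_attn.q_proj.lora_a", array).
    lora_params = [
        name
        for name, _ in tree_flatten(model.trainable_parameters())
        if name.endswith("lora_a") or name.endswith("lora_b")
    ]
    if not lora_params:
        raise MlxConversionError(
            "adapter conversion produced zero lora_a/lora_b parameters; "
            "this architecture may not follow the self_attn/mlp convention. "
            "Try --backend pytorch."
        )
```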
Lifts each server target's prepare → smoke → finalize chain out of the CLI into a typed dispatcher. CLI just builds a Request, calls the runner, and renders. Smoke failure surfaces as a populated 'smoke' field with ok=False (and manifest_path=None), so the CLI keeps full control of exit codes. Dotted import of dlm.export.targets keeps existing test fixture monkeypatches visible at call time.
Lifts the adapter-dir resolution + prepare_llama_server_export + smoke chain out of the CLI's llama-server branch. CLI just builds a LlamaServerPostExportRequest, calls run_llama_server_post_export, and renders the typed result. VendoringError + ExportError still propagate to the CLI for target-specific banner formatting.
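A minimal sketch of the dispatcher shape both export comments describe. Only `run_llama_server_post_export`, `prepare_llama_server_export`, the dotted `dlm.export.targets` import, and the smoke/`manifest_path` contract come from the PR; the dataclasses and the other helpers are assumptions:

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class SmokeOutcome:
    ok: bool
    detail: str = ""

@dataclass(frozen=True)
class LlamaServerPostExportResult:
    smoke: SmokeOutcome | None
    manifest_path: Path | None  # None whenever the smoke check fails

def run_llama_server_post_export(request) -> LlamaServerPostExportResult:
    # Dotted import at call time so existing test fixture monkeypatches on
    # dlm.export.targets stay visible.
    from dlm.export import targets

    adapter_dir = targets.resolve_adapter_dir(request)         # hypothetical helper
    export = targets.prepare_llama_server_export(adapter_dir)  # name from the PR
    smoke = targets.run_smoke(export)                          # hypothetical helper

    if not smoke.ok:
        # Failure is data, not an exception: the CLI reads smoke.ok and keeps
        # full control of exit codes. VendoringError/ExportError from the
        # calls above still propagate for target-specific banner formatting.
        return LlamaServerPostExportResult(smoke=smoke, manifest_path=None)
    return LlamaServerPostExportResult(
        smoke=smoke,
        manifest_path=export.manifest_path,  # assumed attribute on the export object
    )
```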
Each new dispatcher module now has a tests/unit/ peer that drives its branches directly, so the per-package coverage gates (store, train, inference, export) stay at 100% without depending on CLI tests' indirect coverage. Modules covered: dlm.inference.dispatch, dlm.train.dispatch, dlm.store.bootstrap, dlm.store.show, dlm.export.entry.
…tion

# Conflicts:
#	src/dlm/cli/commands.py
#	src/dlm/replay/store.py
The preference dispatcher uses dotted import 'from dlm.preference import judge as _judge_mod; _judge_mod.build_judge(...)'. Tests must patch 'dlm.preference.judge.build_judge' (canonical) for late attribute lookup to see the patch — patches on the package re-export 'dlm.preference.build_judge' are invisible to the dispatcher. Caught by Ubuntu CI on PR #18.
Summary
- Splits `cli/commands.py` (4677 LOC) into the `cli/commands/` package: 22 per-command submodules + `_shared.py` helpers + a slim `__init__.py` of re-exports.
- Per-domain dispatchers for `metrics`, `synth`, `preference`, `init`, `show`, `prompt`, `train`. Each command now builds a typed `Request`, calls a `run_*` dispatcher, and renders a typed `Result`. Dotted imports throughout so test monkeypatches resolve at call time.
- `dlm.export.entry`: `run_vllm_target_export`, `run_mlx_serve_target_export`, `run_llama_server_post_export`. CLI is now thin glue around prepare → smoke → finalize for each runtime target.
- Each new dispatcher module gets a direct unit-test peer (`tests/unit/{inference,train,store,export}/test_*.py`), so the per-package coverage gates stay at 100% without depending on indirect CLI test coverage.

The branch also carries a few non-Sprint-45 fixes that landed during the same window: an MLX-PEFT-adapter silent-corruption fix (f7f0450, 931f6bb, 4d133cf), probe-marker normalization across replay/synth/gate parsers, and audit-13 follow-up findings moved into the versioned docs tree.

Test plan
- `uv run pytest tests/unit/` — 4200 pass, 4 skip.
- `./scripts/coverage-gates.sh` — all 16 gates at 100%.
- `./scripts/pregate.sh` — clean (ruff, format, mypy, unit, advisory checks).
- `dlm init --base smollm2-135m` → scaffolded `.dlm` + provisioned store; `dlm show` (text + `--json`) → both render correctly through `gather_store_view`.
- `dlm prompt` skipped (needs a trained adapter; out of scope for a refactor smoke).