release: development → main — Phase 22 wizard modernisation + site honesty pass-5 by cemililik · Pull Request #42 · HodeTech/ForgeLM

cemililik · 2026-05-09T20:35:50Z

Summary

Promotes 17 commits from development to main, covering two merged feature PRs, the working-memory documentation policy, and a release-preparation pass.

PR feat(wizard): Phase 22 — CLI wizard parity with web + sub-package split #40 (Phase 22 wizard modernisation) — CLI wizard parity with the in-browser wizard, sub-package split (forgelm/wizard/), schema-driven defaults SOT (tools/generate_wizard_defaults.py + _defaults.json), idempotent re-run via --wizard-start-from <yaml>, and 5 review absorption cycles (PR-A through PR-E + 18 verified findings sweep).
PR fix(site): pass-5 closure — site-vs-code honesty sweep + i18n parity #41 (site honesty pass-5) — Closed 11 CRITICAL + 10 HIGH + 9 MEDIUM + 7 LOW findings from the pass-4 site-vs-code audit. Brings de/fr/es/zh i18n to full parity (731 keys each, was 689) by translating the 42-key compliance/governance gap. Eliminates ghost YAML keys (compliance.config_hash, compliance.human_approval), ghost CLI flags (--model-card), and stale paths (forgelm/data_audit.py).
Working-memory directories policy (commit 46d2518) — docs/marketing/ and docs/analysis/ are now strictly gitignored with zero exceptions; new tools/check_no_analysis_refs.py CI guard prevents public-tree citations into either directory.
Release-prep refresh (commit ecf3dcc) — added after the initial PR description: nightly pip-audit --ignore-vuln CVE-2026-1839 stop-gap with documented rationale + remove-after condition (closes #37); CHANGELOG [Unreleased] rewritten in public-facing style (grouped by user-visible capability, no internal review-cycle / wave-N / round-N terminology); docs/roadmap.md + roadmap-tr.md add Phase 22 + site sweep rows; docs/roadmap/releases.md marks v0.5.5 as "Merged on main; PyPI tag deferred" and adds an "upcoming PyPI release" section that surfaces the 0.5.6 vs 0.6.0 version-slot decision for the maintainer.

What changed (high level)

17 commits, 88 files
~7283 insertions / ~8656 deletions (deletions dominated by the unversioning of 11 working-memory files under docs/analysis/)
New CI guards: check_no_analysis_refs.py, check_wizard_defaults_sync.py, generate_wizard_defaults.py (generator), check_anchor_resolution.py hardening
Expanded test coverage: tests/test_wizard_phase22.py (1759 new lines), updates to test_cli_phase10.py, test_phase12_5.py, test_wizard_phase11.py
Wizard module: forgelm/wizard/ package split (orchestrator, state, collectors, BYOD, IO, defaults JSON)
New CLI surface bits: --wizard-start-from, refreshed _exit_codes.py (EXIT_WIZARD_CANCELLED = 5), updated _parser.py
pyproject.toml: package data inclusion for forgelm.wizard/_defaults.json
.github/workflows/nightly.yml: --ignore-vuln CVE-2026-1839 flag added to the pip-audit step

Test plan

All 14 documentation / parity / consistency CI guards pass on the development tip (verified during PR fix(site): pass-5 closure — site-vs-code honesty sweep + i18n parity #41 closure + post-ecf3dcc re-run)
ruff check + format clean across forgelm/ tools/ tests/
forgelm --config config_template.yaml --dry-run ✓
PR feat(wizard): Phase 22 — CLI wizard parity with web + sub-package split #40 merged into development after 5 review cycles + independent verification
PR fix(site): pass-5 closure — site-vs-code honesty sweep + i18n parity #41 merged after independent reviewer verdict + 4 inline-review absorption + i18n drift fix
Codacy + SonarCloud Quality Gates passing on the development tip
Two independent reviewer passes on this development→main PR — both APPROVE with NIT-level notes (no blockers)
Final maintainer pass + tag-version decision (0.5.6 vs 0.6.0) before tag push

Notes

No version bump in this PR. pyproject.toml stays at 0.5.5. The version-slot decision for the upcoming tagged release is documented in docs/roadmap/releases.md ("Next PyPI release") with rationale on each side.
The CHANGELOG [Unreleased] section is the staging area for the upcoming PyPI release. Once the version-slot decision is made, the [Unreleased] header gets promoted to the chosen version (## [0.5.6] or ## [0.6.0]) with the release date, the version is bumped in pyproject.toml, and the tag push triggers publish.yml to PyPI.
Sourcery's review on PR fix(site): pass-5 closure — site-vs-code honesty sweep + i18n parity #41 was skipped due to its 150K-character diff limit — this PR is much larger, so expect the same skip here.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Interactive configuration wizard now supports idempotent reruns via --wizard-start-from, resumable state persistence across sessions, and 9-step guided flow with parity to in-browser wizard.
- New wizard-cancellation exit code (5) distinguishes user-cancelled sessions from successful completion.
- Cross-tab wizard synchronization enables state sharing across browser windows.
Improvements
- Windows platform support now documented across site and CLI.
- Enhanced mobile wizard experience with scrollable YAML pane.
- Exit code contract extended to include wizard-specific outcomes.
Documentation
- Quickstart guides expanded with detailed wizard step descriptions (multi-language).
- Architecture and design documents updated to reflect modernized wizard structure.
- CLI reference documentation includes new --wizard-start-from flag and exit codes.

Closes the parity gap between `forgelm --wizard` (CLI) and the in-browser `site/js/wizard.js` documented in `docs/analysis/code_reviews/2026-05-07_cli_wizard_ux_analysis.md` + `-2.md`. 20 original findings (G1-G20) and 5 independent findings (I1-I5) were verified against current code; this PR addresses the actionable subset. What's covered (CLI): - **Step reorder (I1):** `compliance.risk_classification` is now collected BEFORE `evaluation.safety` so the F-compliance-110 strict-tier auto-coercion can fire reliably. The v0.5.5 wizard asked safety first, then risk — silently producing high-risk configs that failed to load. - **Trainer-specific hyperparameters (G1):** `dpo_beta`, `simpo_beta` / `simpo_gamma`, `kto_beta`, `orpo_beta`, `grpo_num_generations`, `grpo_max_completion_length`, `grpo_reward_model` are now surfaced per `trainer_type`. SFT short-circuits. - **PEFT method breadth (G2):** the strategy step now offers all four schema-supported `lora.method` values (`lora`, `dora`, `pissa`, `rslora`) plus GaLore as a separate axis (6 cards total). - **GaLore optimiser variants (G9):** all six `galore_optim` Literal values surfaced, including the three `_layerwise` siblings. - **RoPE scaling types (G10):** `longrope` joined `linear` / `dynamic` / `yarn` (4-of-4 schema coverage). - **LoRA `r` schema parity (G11):** `DEFAULT_LORA_R` lowered from `16` → `8` to match `LoraConfigModel.r`. - **Compliance depth (G5):** `compliance` (Article 11 + Annex IV §1) plus optional `risk_assessment` (Article 9), `data.governance` (Article 10), `retention` (GDPR Article 5(1)(e) + 17), `monitoring` (Article 12+17), `evaluation.benchmark`, `evaluation.llm_judge`, `synthetic` blocks now configurable. - **High-risk auto-coercion (G8):** `risk_classification ∈ {high-risk, unacceptable}` automatically enables `evaluation.safety` + `evaluation.require_human_approval` with a visible operator notice. Front-stops the schema-side `F-compliance-110` `ConfigError`. - **Webhook URL parsing (G15):** single prompt accepts a literal URL or `env:VAR_NAME` reference; `urlparse`-validated; HTTPS recommended (HTTP accepted with warning). - **Safety probe path (G16):** bundled probe set resolved through `importlib.resources.files("forgelm.safety_prompts")`, fixing the `pip install` regression where `configs/safety_prompts/general_safety.jsonl` was the wizard default but never shipped in the wheel. - **Configuration summary (G17):** `_print_wizard_summary` now dumps the full YAML alongside the labelled headline. - **Persistence (G6):** state snapshot at `$XDG_CACHE_HOME/forgelm/wizard_state.yaml` (or `~/.cache/forgelm/wizard_state.yaml`) saved after every completed step. Schema versioned (`v: 1`); cleared on successful completion. - **Step-diff preview (G7):** each completed step prints `+ key.path: value` / `~ key.path: before → after`. - **Beginner / expert toggle (G13):** beginner mode prefixes each step with a 2-3-line tutorial paragraph. - **Use-case integration (G12):** the curated quickstart-template list is also offered as Step 2 of the full flow, seeding defaults for later steps. - **`back` / `reset` navigation (G3):** `back` / `b` returns to the previous step; `reset` / `r` clears in-memory state. - **`POPULAR_MODELS` alignment with web (G14):** CLI list now matches `site/js/wizard.js`'s preset cards. What's covered (web): - **Header comment (I2):** `site/js/wizard.js` opening doc block updated from "7 steps" to "9 steps" with correct list. - **Use-case key alignment (I4):** `USE_CASE_PRESETS` keys renamed (`code-copilot` → `code-assistant`, `medical-tr` → `medical-qa-tr`) so `forgelm/quickstart.py::TEMPLATES` is the single source of truth. What's covered (docs): - **Design doc (I5):** `docs/design/wizard_mode.md` rewritten to describe the actual 9-step flow, sub-package layout, and Phase 22 modernisation status banner. - **Architecture reference:** `docs/reference/architecture.md` + `-tr.md` updated for the `forgelm/wizard/` sub-package. - **Quickstart guide:** `docs/guides/quickstart.md` + `-tr.md` updated with the new flow description. - **CHANGELOG.md:** comprehensive Phase 22 entry under `[Unreleased]` / `### Added`. - **Roadmap link fix:** `docs/roadmap/phase-11-5-backlog.md` now points at `forgelm/wizard/_byod.py` rather than the old monolith. Architecture: `forgelm/wizard.py` (976 LOC monolith) split into the `forgelm/wizard/` sub-package per `docs/standards/architecture.md`'s 1000-line ceiling rule: - `_io.py` (213 LOC) — primitive prompts + navigation tokens + hardware detection. - `_state.py` (290 LOC) — `_WizardState`, persistence, summary, schema-default constants. - `_collectors.py` (688 LOC) — every `_collect_*` helper + strategy choice + use-case preset registry. - `_byod.py` (333 LOC) — Phase 11.5 / 12.5 BYOD inline ingest + audit helpers + quickstart prelude. - `_orchestrator.py` (566 LOC) — step definitions, `_drive_wizard_steps`, `_apply_strict_tier_coercion`, `run_wizard`. - `__init__.py` (176 LOC) — re-export shim for backward compatibility. Tests: 55 new tests in `tests/test_wizard_phase22.py` cover `_parse_webhook_value`, `_default_safety_probes_path`, `_collect_trainer_hyperparameters`, `_check_navigation_token`, `_apply_strict_tier_coercion`, persistence helpers, `_print_step_diff`, `_collect_webhook_config`, use-case preset registry parity with `quickstart.py::TEMPLATES`, strategy choice shape, GaLore optimiser variants, and schema-default parity for LoRA r / alpha / epochs / batch / LR. Existing `tests/test_wizard_phase11.py` test_print_wizard_summary cases updated to use the new dict-based API. Validation: 8/8 doc CI guards green (bilingual_parity, anchor_resolution, cli_help_consistency, yaml_snippets, audit_event_catalog, library_api_doc, doc_numerical_claims, bilingual_code_blocks); ruff check + format clean across the package; module-size guard reports 0 over the fail-threshold; `forgelm --config config_template.yaml --dry-run` passes; strict-tier auto-coercion produces a config that `ForgeConfig.model_validate` accepts. i18n decision: CLI wizard remains English-only per operator explicit instruction. Web wizard's 6-language `wizard.*` catalogue (en/tr/de/fr/es/zh) is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Addresses inline review comments + verified user findings on the Phase 22 wizard modernisation PR. Critical / major / minor mix: Critical (2): - _io.py: torch.cuda.get_device_properties().total_mem → total_memory (the v0.5.5 bug raised AttributeError on real CUDA hardware). - _orchestrator.py: WizardReset re-loops with a fresh _WizardState instead of returning early; returning let _run_full_wizard treat the reset as a completed run and try to save an empty config. Major (6): - _orchestrator.py: WizardBack now restores the prev_config snapshot so a half-completed step's mutations don't leak into the previous step's prompts. - _orchestrator.py: deep copy now uses copy.deepcopy instead of json.loads(json.dumps(...)) — the JSON path was lossy for tuples and sets that may sneak in via collector helpers. - _orchestrator.py: _step_model preserves a use-case preselect that isn't in POPULAR_MODELS (defaults to the Custom slot pre-filled with the operator's earlier choice). - _byod.py: _prompt_dataset_path_with_ingest_offer validates that the typed value is a directory, JSONL/JSON file, or HF Hub ID; bare typos no longer silently become data.dataset_name_or_path. - _collectors.py: Article 9 strict-tier branch enforces non-empty intended_use / foreseeable_misuse / mitigation_measures. - _collectors.py: Article 10 strict-tier branch enforces non-empty collection_method / annotation_process / known_biases. Medium (4): - _collectors.py: notify_on_start default flips to False (web wizard parity — start pings are noisy). - _collectors.py: _collect_compliance_metadata collects risk_classification first so strict-tier hint shows up before Article 11 fields and downstream Article 9/10 branches see the correct context. - _state.py: _save_config_to_file uses yaml.safe_dump + UTF-8 + allow_unicode (yaml.dump could emit !!python/object tags that ForgeConfig then rejects). Same treatment for the YAML preview. - _state.py: chmod 0o600 on wizard_state.yaml (best-effort) — compliance metadata can be sensitive. - _io.py: _prompt_choice guards negative idx and out-of-range int alongside the existing IndexError fallback. Minor (3): - _byod.py: shlex.quote() the path in BYOD shell-hint output so paths with spaces survive copy-paste into a real shell. - docs/design/wizard_mode.md: MD030 — single space after ordered list markers (was double). - docs/reference/architecture{,-tr}.md: heading wizard.py → wizard/ (the module became a sub-package in Phase 22). Coverage (HIGH): - tests/test_wizard_phase22.py adds end-to-end coverage for _drive_wizard_steps (back-restores-config, reset-re-loops, back-at-first-step is no-op, persist-after-each-step) plus _maybe_resume_state (resume-from-snapshot, decline-clears) and _step_welcome (backend-hint + beginner branch). Orchestrator module coverage rises from 14% → 30% on this file alone. All checks green: - ruff check + format clean - 1544 passed, 14 skipped (full suite) - 8 doc guards: bilingual_parity, anchor_resolution, cli_help_consistency, yaml_snippets, audit_event_catalog, library_api_doc, doc_numerical_claims, bilingual_code_blocks - module_size: no new warnings - forgelm --config config_template.yaml --dry-run Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

CHANGELOG entry for the c0dd57a fix sweep — Critical / Major / Medium / Minor classifications + the orchestrator coverage bump that landed alongside. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Phase 22 review-cycle 2: addresses 17 newly-discovered parity / UX gaps surfaced during the post-PR-#40 audit (P1-P18 + B1-B5 + C1-C2 + D1-D2 + E1+E4). All five HIGH-severity findings closed; both wizards (CLI sub-package + site/js/wizard.js) tightened in lockstep. ## Critical bug fixes - B-NEW-1: ``_print_step_diff`` strips internal ``_wizard_meta.*`` keys before showing the operator the per-step diff (was leaking ``_wizard_meta.use_galore`` from the strategy step). - B-NEW-2: ``_step_compliance`` skips the Article 10 governance re-prompt when step 6 already populated it under non-strict tier (was silently overwriting earlier answers). - B-NEW-3: ``_save_wizard_state`` now writes via ``tempfile`` + ``fsync`` + ``os.replace`` so a SIGKILL / power loss / concurrent wizard process can never leave a half-written state file. - B5: strict-tier coercion fires at the end of ``_step_compliance`` too (not just ``_step_evaluation``), so a Ctrl-C between steps 8 and 9 still produces a loadable YAML if the operator manually saves. - A1: ``_parse_webhook_value`` SSRF preflight — rejects loopback / RFC1918 / link-local destinations up front by reusing ``forgelm._http._is_private_destination``. ## UX guardrails - B1/E1: ``_validate_generated_config`` runs ``ForgeConfig.model_validate`` on the saved YAML before exit. Schema rejections surface inline. - B2: ``_prompt_unique_filename`` + ``_next_free_filename`` — overwrite confirmation with auto-suffix (no more silent clobbers). - B3: ``run_wizard_full`` refuses to launch on non-tty stdin and points operators at ``forgelm quickstart <template>`` for scripted runs (closes the silent-empty-config CI hole). - E4: ``_print_preflight_checklist`` — GPU/VRAM/dataset/risk-tier signals before the configuration summary. ## CLI ↔ web parity - P1+P18: safety-block field union — CLI now emits ``classifier`` + ``max_safety_regression`` (and ``min_classifier_confidence`` under confidence_weighted scoring); web wizard now emits ``scoring`` + ``track_categories`` + ``severity_thresholds``. Both surfaces produce structurally equivalent ``evaluation.safety`` blocks. - P2: judge ``min_score`` default 6.5 → 5.0 (schema parity). - P3: web wizard surfaces ``model.max_length`` as a number input on the training step (was silently capped at 2048). - P9: monitoring collector accepts ``env:VAR_NAME`` like webhook. - P13: HF Hub format hint when custom model name isn't ``<org>/<name>``. - P14: ``auto_revert.max_acceptable_loss`` default 2.0 emission. - P16: CLI emits ``bnb_4bit_quant_type`` + ``bnb_4bit_compute_dtype`` when QLoRA is selected (web parity). - C1/I3: web ``learningRate: '2e-5'``, ``batchSize: 4`` (schema parity); ``2e-5`` added to LR option list. - C2: web wizard cross-tab ``storage`` event listener. ## Distinct exit codes - D2: new ``EXIT_WIZARD_CANCELLED = 5`` constant; new ``WizardOutcome`` dataclass + ``run_wizard_full()`` entry point. Genuine cancels (Ctrl-C, non-tty, decline-to-save) now exit 5 instead of 0. Saved-and-deferred remains exit 0. - D1: best-effort ``import readline`` for arrow-key + history on Linux/macOS (Windows unaffected). ## Tests + isolation - B10: new autouse ``_isolate_wizard_state`` fixture in ``tests/conftest.py`` redirects ``XDG_CACHE_HOME`` for any ``tests/test_wizard_*`` module. - 28 new tests across the new behaviour (146 wizard tests passing in total; full suite 1573 passed / 14 skipped). - Existing ``test_wizard_still_works`` updated for the new ``WizardOutcome`` contract; new ``test_wizard_cancelled_exits_5``. ## Documentation - ``CHANGELOG.md`` — review-cycle 2 entry under [Unreleased]. - ``README.md`` — directory listing ``wizard.py`` → ``wizard/``. - ``docs/standards/architecture.md`` — Mermaid graph + table updated. - ``docs/design/wizard_mode.md`` — review-cycle 2 status banner. - ``docs/reference/usage{,-tr}.md`` — exit-code 5 row. - ``docs/usermanuals/{en,tr}/reference/exit-codes.md`` — full contract row + compatibility note (six integers, was five). - ``docs/guides/quickstart{,-tr}.md`` — operator-guardrail paragraph. ## Validation - ruff check + format clean (184 files). - 8 doc guards pass (bilingual_parity, anchor_resolution, cli_help_consistency, yaml_snippets, audit_event_catalog, library_api_doc, doc_numerical_claims, bilingual_code_blocks). - module_size: no new warnings. - ``forgelm --config config_template.yaml --dry-run`` green. - ``pytest tests/`` — 1573 passed, 14 skipped. ## Deferred (roadmap entries, NOT in this PR) - F1: schema-driven single-source-of-truth (``wizard_defaults.json`` build-time generation) — multi-PR refactor, frames every future default drift. - B4: ``--wizard-from-yaml answers.yaml`` mode — surface change pending policy decision on scripted-config UX. - E2: per-prompt rationale catalog — needs ~50 hand-written strings. - E3: ``--wizard --start-from <yaml>`` idempotent re-run — pairs with F1. - F3: state-versioning migration logic. - C3: web wizard mobile-CSS overflow review. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes the 11 actionable findings from the review-cycle 2 audit. Low-risk batch — no architectural changes. PR-B (F1 schema-driven defaults SOT) follows separately. ## Critical / correctness - A3: ``_save_wizard_state`` unlinks the temp file when ``os.replace`` fails (cross-device home, EACCES, EXDEV) so the ``.wizard_state.*.tmp`` files stop accumulating. Defensive ``finally``-branch sweep also covers the case where ``yaml.safe_dump`` itself raises before the rename runs. - A4: ``_apply_strict_tier_coercion`` notice prints once per wizard run instead of twice (compliance-step + evaluation-step idempotent re-call). Tracked via the ``_wizard_meta.strict_tier_announced`` flag that ``_strip_internal_meta`` removes before save. - F3: ``_STATE_MIGRATIONS`` registry skeleton in ``_state.py`` ready for future ``_STATE_VERSION`` bumps. Empty by design; first real bump only adds ``{1: migrator}``, no plumbing rewrite. ## Performance - C16: ``_detect_hardware()`` cached on ``_WizardState.hardware``; ``_step_welcome`` + ``_print_preflight_checklist`` share the result instead of paying a second torch import + CUDA enumeration (~50–200 ms saved per wizard run). ## Standards / observability - G30: ``_validate_generated_config`` now emits ``logger.warning(...)`` in addition to the inline ``_print`` notice — CI / log pipelines see "wizard output failed schema validation" without scraping stdout (per ``docs/standards/error-handling.md`` rule 2). - B11: web wizard's ``storage`` event listener ``console.warn``s on a sync failure instead of silently swallowing the error. ## UX - C3: expanded YAML accordion on ``<900px`` viewports now caps at ``38vh; overflow:auto`` so long configs no longer push the Copy/Download/Back/Next footer off-screen. - E2 (hedefli): inline rationale on five high-impact prompts (``lora_r``, ``lora_alpha``, ``num_train_epochs``, ``batch_size``, ``max_length``, ``risk_classification``) so first-time operators see the operational hint without flipping into beginner mode. ## Tests - E22-24: new ``test_wizard_start_training_routes_through_dispatcher`` covers the previously-uncovered branch in ``_maybe_run_wizard`` where ``outcome.start_training=True`` mutates ``args.config`` and lets the trainer pipeline run. - 20+ additional tests: atomic-write cleanup (success / replace-fail / dump-fail), announce-once flag, migration registry (unknown version / newer version / chain runs / non-advancing migrator), hardware cache reuse, validate-warning log line. ## Documentation - A9: ``docs/product_strategy{,-tr}.md``, ``docs/design/gdpr_erasure.md``, ``docs/design/data_audit_cli_split.md`` — "0/1/2/3/4 contract" claims now acknowledge ``EXIT_WIZARD_CANCELLED = 5``. - F27: ``docs/usermanuals/{en,tr}/compliance/human-approval-gate.md`` cross-references exit code 5 alongside the existing exit-4 discussion (high-risk operators using ``--wizard`` need both signals). - B10 doc-only: ``_parse_webhook_value`` comment now notes the SSRF preflight catches "wrong URL at config time", not "guaranteed- public destination at runtime" — runtime TOCTOU tracked at #14. - CHANGELOG entry under [Unreleased] / Added. ## Roadmap reshuffle - B4 (``--wizard-from-yaml``) removed from consideration: project identity ("config-driven") + B3 (non-tty refusal + ``quickstart`` hand-off) closed the gap. - E3 (``--wizard --start-from <yaml>``) explicitly bound to F1: reverse-mapping duplicates F1's schema-driven SOT work; deferred until F1 lands. ## Validation - ruff check + format clean (184 files). - 8 doc guards pass (bilingual_parity, anchor_resolution, cli_help_consistency, yaml_snippets, audit_event_catalog, library_api_doc, doc_numerical_claims, bilingual_code_blocks). - module_size: no new warnings. - 181 wizard-scope tests pass (``test_wizard_*`` + ``test_phase12_5`` + ``test_cli_phase10``). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes the long-running drift surface tracked through P1-P17 across review-cycles 1, 2, and 3. A schema field-default change without a matching wizard update was the single biggest source of CLI ↔ web parity gaps; this PR turns the schema into the canonical source. ## Pydantic field metadata flag 8 wizard-relevant fields in ``forgelm/config.py`` now carry ``json_schema_extra={"wizard": True}``: - ``ModelConfig.max_length`` (2048) - ``LoraConfigModel.r`` / ``alpha`` / ``dropout`` (8 / 16 / 0.1) - ``TrainingConfig.num_train_epochs`` / ``per_device_train_batch_size`` / ``gradient_accumulation_steps`` / ``learning_rate`` (3 / 4 / 2 / 2e-5) The flag is a no-op for ``ForgeConfig.model_validate`` — it only surfaces in ``model_json_schema()`` output, where the generator picks it up. ## Generator + CI guard ``tools/generate_wizard_defaults.py`` walks every nested ``BaseModel`` reachable from ``ForgeConfig``, finds wizard-flagged fields, and writes two artefacts: - ``forgelm/wizard/_defaults.json`` — package data (shipped via ``[tool.setuptools.package-data]``). - ``site/js/wizard_defaults.js`` — exposes ``window.WIZARD_DEFAULTS`` for the web wizard. ``tools/check_wizard_defaults_sync.py`` re-runs the generator into a temp comparison and fails the CI if the shipped artefacts drift from the schema. Operators get a clear "regenerate via …" hint with the first 5 differing lines so the fix is one command away. ## Python consumer ``forgelm/wizard/_state.py`` now reads the ``DEFAULT_*`` constants from the shipped JSON via ``importlib.resources``. A new ``_load_defaults()`` + ``_default(section, key, fallback)`` helper pair handles missing-file / corrupt-JSON cases by returning ``{}``, so the hardcoded fallbacks below still keep the wizard usable on a broken pip install. ## JS consumer ``site/js/wizard.js`` ``defaultState()`` now uses a new ``wd(section, key, fallback)`` helper to read schema-derived values from ``window.WIZARD_DEFAULTS``. ``site/quickstart.html`` loads ``wizard_defaults.js`` BEFORE ``wizard.js`` so the global is set in time. Special-cases ``learning_rate``: arrives as a number from JSON (2e-5), uses ``Number.toExponential()`` to canonicalise to the dropdown's ``'2e-5'`` string format (rather than ``String()`` which would yield ``'0.00002'`` and fail to match). ## Tests New ``TestSchemaDrivenDefaultsSOT`` class pins the contract: - ``test_module_constants_match_schema_via_json`` — every ``DEFAULT_*`` equals its ``Field.default``. - ``test_defaults_json_shipped_and_loadable`` — package data lives on disk + parses cleanly + has expected sections. - ``test_default_helper_returns_fallback_when_section_missing`` — graceful degradation. - ``test_load_defaults_returns_empty_on_missing_file`` — slim install scenario. Full suite green: 1590 passed, 14 skipped. ## Documentation - ``CHANGELOG.md`` — F1 entry under [Unreleased] / Added with B4 removed-from-roadmap + E3-bound-to-F1 notes. - ``CLAUDE.md`` — self-review chain extended to 8 commands; tools/ listing names ``generate_wizard_defaults`` + ``check_wizard_defaults_sync``. - ``pyproject.toml`` — ``[tool.setuptools.package-data]`` ships ``forgelm/wizard/_defaults.json``. ## Why a generator + JSON, not live-import? - Web wizard is a static site — no Python runtime to query schema defaults at page-load; JSON-as-data ships with the assets. - Python side benefits too: low import-cost path, JSON visible in git diffs (auditable), and round-tripping forces wizard-relevant defaults to be JSON-serialisable primitives. ## Roadmap reshuffle - B4 (``--wizard-from-yaml``) confirmed REMOVED from consideration. - E3 (``--wizard --start-from <yaml>``) explicitly bound to F1: reverse-mapping reuses this PR's metadata flags + JSON shape (deferred until needed; no scope rush). ## Validation - ruff check + format clean (186 files including the two new tools). - 9 doc guards pass: bilingual_parity, anchor_resolution, cli_help_consistency, yaml_snippets, audit_event_catalog, library_api_doc, doc_numerical_claims, bilingual_code_blocks, **wizard_defaults_sync** (new). - module_size: ``forgelm/config.py`` crossed 1000 LOC (now 1011); it was already grandfathered. - 1590 passed, 14 skipped (full suite). - ``forgelm --config config_template.yaml --dry-run`` green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…d test Closes the actionable findings from the independent review of cycle-3 + F1 commits (5667885 + 91d5726). Low-risk batch — internal hardening + test coverage; no operator-visible behaviour change. ## Generator robustness (PR-B follow-up) - B1: ``_walk_model`` now unwraps ``Optional[BaseModel]`` sub-blocks via the new ``_unwrap_basemodel()`` helper (uses ``typing.get_args``). Pre-fix the walk only matched bare-class annotations, so a wizard flag on a field inside an ``Optional[XxxConfig]`` sub-model was silently skipped. 10 of 14 ForgeConfig top-level fields are ``Optional[...]`` — today no flags exist on them, but a future contributor would have hit the silent- skip trap with no error from the generator and no failure from the CI guard. - B2: ``default is PydanticUndefined`` identity check (imported from ``pydantic_core``) replaces the brittle ``repr(default).startswith("PydanticUndefined")`` substring match. Stable across Pydantic v2 minor releases. - C2: module-level header documenting the JSON-serialisable contract for wizard-flagged fields (``int / float / bool / str / None / list of primitives / dict of primitives`` only). Future flags on Path / datetime / callable defaults will fail loudly at generator run; the contract is now explicit so contributors don't get surprised. ## State module hardening (PR-B follow-up) - B5: ``_default()`` now warns via ``logger.warning`` when the shipped JSON value disagrees with the hardcoded fallback literal. The JSON is canonical (returned), but drift between schema and fallback now surfaces in CI logs — eliminates the second-source-of-truth silently-stale risk. ## CI guard test coverage (PR-A follow-up) - G2: new ``tests/test_check_wizard_defaults_sync.py`` (7 tests): - corrupt-JSON triggers failure + diff message + regeneration hint. - missing-target triggers clean failure (no crash). - diff truncation works at 5 lines. - generator unwraps ``Optional[BaseModel]`` (B1 contract pinned). - required (no-default) wizard fields skipped (B2 contract pinned). - ``check_wizard_defaults_sync.py``: new ``_display_path()`` helper falls back to absolute path when target is outside ``_REPO_ROOT`` (prevents ``Path.relative_to`` crash when tests monkeypatch the target to a tmp dir). ## Test hardening (PR-A follow-up) - G3: ``test_wizard_start_training_routes_through_dispatcher`` now asserts ``ForgeConfig.model_validate(config_data)`` on the test fixture YAML before mocking the trainer. Pre-fix, a future change to ``minimal_config()`` that produced an invalid YAML would still pass this test (because the trainer pipeline is mocked BEFORE validation runs in ``_dispatch.main``), giving false-positive coverage. ## Documentation + cosmetic - A3: hardware-cache freshness caveat documented in ``_cached_hardware()`` docstring (GPU hot-plug mid-session is an acceptable edge case — operator can exit-and-resume to refresh). - F3: README directory tree now mentions ``_defaults.json`` as schema-derived defaults SOT shipped under ``forgelm/wizard/``. - B7: ``defer`` attribute on ``wizard_defaults.js`` script tag. Both wizard scripts now defer; HTML5 spec guarantees execution order (document order) so ``WIZARD_DEFAULTS`` still lands before ``defaultState()`` runs, but neither blocks HTML parsing. ## Skipped (review noted as "soft / acceptable") - C3 (``_forgelm_wizard:`` extra-key prefix): ``model_json_schema()`` is not called anywhere in the codebase — the ``wizard:true`` extra cannot leak into operator-facing artefacts today. Cost > value. - E1, E2, E3, H2: explicit refactor / defer in the review. - B4 (``--wizard-from-yaml``) confirmed REMOVED from roadmap. - E3 (``--start-from <yaml>``) bound to F1; deferred until needed. ## Validation - ruff check + format clean (187 files including new test). - 9 doc guards pass: bilingual_parity, anchor_resolution, cli_help_consistency, yaml_snippets, audit_event_catalog, library_api_doc, doc_numerical_claims, bilingual_code_blocks, wizard_defaults_sync. - module_size: no new warnings. - 1597 passed, 14 skipped (full suite). - ``forgelm --config config_template.yaml --dry-run`` green. - bandit -ll: 0 HIGH / 19 MEDIUM (all pre-existing B615 unpinned HF Hub downloads — project-wide, intentional, not in PR-C scope). ## PR-comment status - 13 inline comments from coderabbitai on commit ``52b113d`` (Phase 22 origin) — ALL addressed in cycle-2 (``c0dd57a``) + cycle-3 (``5667885``) commits. Threads attached to old commits; will auto-resolve on PR merge. - 6 inline comments from gemini-code-assist on commit ``52b113d`` — ALL addressed. - Codacy + SonarQube hotspot summaries from ``52b113d`` — current HEAD (``91d5726`` + this commit) addresses everything they flagged in scope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

External scanners (Codacy / SonarCloud) attached to PR #40 reported findings on the original ``52b113d`` Phase 22 commit. Verified each against the current HEAD; the remaining noise is genuinely non-actionable. Shrinks the public quality-gate to its real signal. ## Codacy 7 HIGH severity — all false positives, suppressed in-tree - **Prospector F401 (5×)** in ``forgelm/wizard/__init__.py`` — the re-export shim already carried ``# ruff: noqa: F401`` at the top but Codacy's Prospector doesn't honor ruff directives. Per-line ``# noqa: F401 -- re-export for tests (pyflakes/Prospector compat)`` markers added to the 5 flagged underscore-prefixed symbols (``_AUDIT_LARGE_FILE_THRESHOLD_BYTES``, ``_GALORE_OPTIMIZERS``, ``_BACK_TOKENS``, ``_STEPS``, ``_STATE_VERSION``). These are intentional re-exports for the wizard test suite, not dead imports. - **Bandit B105** in ``forgelm/wizard/_collectors.py:673`` — ``backend_token == "api"`` flagged as "possible hardcoded password". False positive: ``"api"`` is an enum-style discrim- inator (one of ``"api"``/``"local"``/``"file"``). Suppressed with ``# nosec B105 -- enum-style discriminator, not a secret``. - **PMD InaccurateNumericLiteral** in ``site/js/wizard_defaults.js`` on the ``2e-05`` literal — the file is a build-time artefact emitted by ``tools/generate_wizard_defaults.py`` from Pydantic ``learning_rate`` default. PMD's heuristic flags scientific notation as imprecise; the JS consumer canonicalises with ``Number.toExponential()`` so precision is never operator-visible. New ``.codacy.yaml`` excludes the generated artefact (and its Python sibling ``forgelm/wizard/_defaults.json``) from analysis. ## Codacy 29 INFO severity - **markdownlint MD013 (line-length, 17×)** — ForgeLM bilingual reference + design docs intentionally carry long lines (paragraph- style explanations, table cells, prose with long inline links). The 80-col limit is a code-style holdover that hurts readability for documentation; the project's ``bilingual_parity`` guard already enforces structural alignment. Disabled via ``.markdownlint.json``: ``"MD013": false`` with a rationale comment. - **markdownlint MD030 (9×)** in ``docs/design/wizard_mode.md`` — cycle-2 already fixed the list-marker spacing in this file; the Codacy result was stale (older commit snapshot). No code change needed; will clear on next Codacy re-scan. - **markdownlint MD028 (1×)**, **Semgrep pytorch-detection (2×)** — informational only, no action. ## SonarCloud 8 Security Hotspots — all LOW, all in test fixtures Hotspot review (no code changes — test fixtures are intentional): - ``tests/test_wizard_phase22.py:90`` ``ftp://example.com/hook`` — literal in ``test_unknown_scheme_rejected``; verifies the wizard rejects unsupported schemes. - ``tests/test_wizard_phase22.py:93,94`` ``http://hooks.example.com/x`` — literal in ``test_http_warning_but_accepted``; verifies the HTTP-with-warning behaviour mirrors ``forgelm/webhook.py``. - ``tests/test_wizard_phase22.py:830`` IMDS IP ``169.254.169.254`` / ``:834`` RFC1918 ``10.0.0.5`` — literals in ``TestWebhookSSRFPreflight``; verifies the SSRF preflight rejects loopback / RFC1918 / link-local destinations. - ``tests/test_wizard_phase22.py:924,929`` ``tmp_path`` / ``tests/test_cli_phase10.py:383`` — pytest's standard temp- directory fixture pattern. These hotspots are the ENTIRE point of the tests they appear in — suppressing them would require either disabling the SonarCloud rule project-wide (loses signal on production code) or adding inline comments that confuse the test-source-as-spec pattern. Decision: mark as "Safe / Reviewed" via SonarCloud's UI, no in-tree changes. ## Validation - ruff check + format clean (187 files). - 9 doc guards pass. - 139 wizard-scope tests pass (test_wizard_phase22 + test_check_wizard_defaults_sync + test_cli_phase10). - ``check_wizard_defaults_sync`` green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes E3 from the cycle-2 + cycle-3 review backlog. Operators can now iterate on an existing config without losing prior answers — the long-asked-for ergonomic improvement that the previous review cycles all flagged ("wizard always starts from scratch") and bound to F1 (landed in PR-B `91d5726`) so the per-step value-honoring reuses the schema-derived defaults plumbing. ## CLI surface - New flag ``--wizard-start-from PATH`` registered in ``forgelm/cli/_parser.py``; ``--wizard-start-from existing.yaml`` alongside ``--wizard`` triggers the new flow. - ``forgelm/cli/_wizard.py``: ``_maybe_run_wizard`` reads ``args.wizard_start_from`` (via ``getattr`` for back-compat) and threads it into ``run_wizard_full(start_from=...)``. ## Orchestrator - New ``_load_initial_state_from_yaml(path)`` helper: reads YAML, validates it against ``ForgeConfig``, seeds a :class:`_WizardState` with ``experience="expert"`` + ``use_case=_MANUAL_USE_CASE`` + ``config=copy.deepcopy(data)``. Three error categories surface to the operator: ``FileNotFoundError`` for missing path, ``ValueError`` for parse failure / non-mapping root / schema rejection. All caught by ``run_wizard_full`` and rendered as a cancelled outcome with a printed warning — no traceback for the operator. - ``run_wizard``, ``run_wizard_full``, ``_run_full_wizard_outcome``, and ``_run_full_wizard`` now accept ``start_from: Optional[str] = None`` (back-compat preserved — None falls back to today's flow). - When ``start_from`` is supplied the quickstart-template prelude is skipped — the operator's intent is "edit this file", not "pick a fresh template". ## Per-step value honouring - ``_step_strategy``: ``lora_r`` and ``lora_alpha`` prompts default to ``state.config["lora"]`` values when present (falls back to schema defaults). - ``_step_training_params``: ``num_train_epochs`` / ``per_device_train_batch_size`` / ``model.max_length`` / ``output_dir`` prompts honour existing ``state.config`` values. - ``_step_dataset``: when an existing dataset path is present, the step prints "Existing dataset: <path>" and offers a "Keep this dataset?" yes/no early-out so the operator doesn't re-trigger ingestion / audit on an unchanged path. Other steps (welcome / use-case / model / trainer / compliance / evaluation) already honoured existing values via ``state.config.get(...)`` defaults — no change needed. ## Save flow - ``_run_full_wizard_outcome``: when ``start_from`` is supplied the save prompt's default filename becomes the start-from path. The existing ``_prompt_unique_filename`` overwrite confirmation still fires before clobbering. ## Tests 10 new tests under ``tests/test_wizard_phase22.py``: - ``TestStartFromYAMLLoad``: valid YAML pre-populates state (with ``copy.deepcopy`` isolation), missing path raises ``FileNotFoundError``, malformed YAML raises ``ValueError``, non-mapping root raises, schema-invalid YAML raises. - ``TestRunWizardFullStartFrom``: missing / invalid YAML returns a cancelled ``WizardOutcome`` instead of crashing. - ``TestStepHonorsExistingValues``: strategy / training-params / dataset steps each preserve existing values when the operator presses Enter at every prompt. Total wizard-scope tests: 202 (was 192). ## Documentation - ``CHANGELOG.md`` — E3 / PR-D entry under [Unreleased] / Added. - ``docs/design/wizard_mode.md`` — PR-D additions block. - ``docs/reference/usage{,-tr}.md`` — new flag in the cheat-sheet. - ``docs/usermanuals/{en,tr}/reference/cli.md`` — flag table row. - ``docs/guides/quickstart{,-tr}.md`` — "idempotent re-run" paragraph after the existing operator-guardrail block. ## Validation - ruff check + format clean (187 files). - 9 doc guards pass: bilingual_parity, anchor_resolution, cli_help_consistency (439 invocations), yaml_snippets, audit_event_catalog, library_api_doc, doc_numerical_claims, bilingual_code_blocks, wizard_defaults_sync. - ``forgelm --config config_template.yaml --dry-run`` green. - 1607 passed, 14 skipped (full suite, was 1597). ## Roadmap status — fully on-plan - B4 (``--wizard-from-yaml`` answer-file mode): REMOVED from consideration earlier (B3 closed the gap). - E3: SHIPPED in this PR. - F1 (schema-driven SOT): SHIPPED in PR-B. - F3 (state-version migration skeleton): SHIPPED in PR-A. - C3 (mobile YAML clamp): SHIPPED in PR-A. - E2 (rationale catalog): targeted version SHIPPED in PR-A; full catalog explicitly deferred (no scope rush). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…eview Independent code review of PR-D (`2ac9df2`) flagged 3 HIGH-severity bugs where pressing Enter at certain prompts silently regressed the operator's loaded YAML to wizard defaults — directly violating the "press Enter to keep" contract documented in PR-D's CHANGELOG entry and the quickstart guide. Plus 4 MEDIUM/LOW issues. All closed here. ## Critical: contract violations on `--wizard-start-from` - **PR-D-A1: ``_step_strategy`` regressed `lora.method` / `target_modules` / `dropout` / `bias` / `task_type` on Enter.** ``_select_strategy()`` now accepts ``existing_method`` / ``existing_load_in_4bit`` / ``existing_galore`` kwargs and derives the prompt's ``default_idx`` from the loaded YAML. Target-modules prompt detects the loaded list shape against ``TARGET_MODULE_PRESETS`` and shifts its default accordingly. Non-prompted fields (dropout / bias / task_type) now use ``setdefault``. - **PR-D-A2: ``_step_evaluation`` clobbered existing benchmark / llm_judge / safety blocks.** Function now seeds ``evaluation`` from ``state.config.get("evaluation")`` via ``copy.deepcopy``, defaults the ``auto_revert`` gate based on existing state, and preserves prior blocks when the operator declines a re-prompt. - **PR-D-A3: ``_step_use_case`` overwrote the operator's existing model + trainer choices** with the first template's preset (default index = 1). Combined with save-default-to-``start_from``, this silently corrupted the original YAML. Step now early-returns with a clear notice when ``state.use_case == _MANUAL_USE_CASE`` and ``model.name_or_path`` / ``training.trainer_type`` are populated. Greenfield runs now use ``setdefault`` for the preset application. ## Medium / hardening - **PR-D-A4: ``_step_welcome`` ``model.backend``** → chained ``setdefault`` instead of the nested-dict-replace pattern. An operator's explicit ``model.backend: transformers`` no longer flips silently to ``unsloth`` on a Linux+GPU box. - **PR-D-A5: ``_load_initial_state_from_yaml`` split try/except** for the ``ForgeConfig`` import vs. the ``model_validate`` call. A downstream ``ImportError`` from inside a Pydantic validator can no longer silently bypass schema validation. - **PR-D-A6: warns before discarding pre-existing wizard resume state** when ``--wizard-start-from`` is supplied alongside an in-progress ``$XDG_CACHE_HOME/forgelm/wizard_state.yaml``. Pre-fix the snapshot was silently destroyed at save time. - **PR-D-A7: ``docs/guides/quickstart{,-tr}.md``** scopes the "press Enter to keep" claim to numeric / text / choice prompts; documents the use-case skip and the ``setdefault`` pattern for non-prompted fields. ## UX / test coverage - **PR-D-B1: new CLI dispatcher integration test** (``test_wizard_start_from_threads_through_to_run_wizard_full``) pins parser → dispatcher → ``run_wizard_full`` thread-through. - **PR-D-B3: ``--wizard-start-from`` without ``--wizard``** now prints a one-line warning instead of silently no-op'ing. - **PR-D-B5: ``_collect_data_governance(existing=...)``** kwarg threads existing Article 10 free-text values through as prompt defaults. Strict-tier operators iterating from a populated YAML no longer re-type collection_method / annotation_process / known_biases. - **PR-D-B6: ``_canonical_start_from()``** helper applies ``Path(...).expanduser()`` once at entry. Pre-fix ``--wizard-start-from ~/configs/x.yaml`` created a literal ``~/configs/x.yaml`` directory on save because the load did ``expanduser`` but the save flow's overwrite check did not. ## Tests 13 new tests under ``tests/test_wizard_phase22.py`` covering every fix above: - ``TestPRDA1StrategyHonorsExisting`` (3): DoRA / target_modules / GaLore preserved on Enter. - ``TestPRDA2EvaluationHonorsExisting`` (2): benchmark + judge preserved when declining re-prompt; auto-revert default reflects existing state. - ``TestPRDA3UseCaseSkipsWhenExisting`` (1): early-return with notice when start-from path + populated model/trainer. - ``TestPRDA4WelcomeBackendSetdefault`` (1): existing backend survives Linux+GPU branch. - ``TestPRDA5ImportErrorBypass`` (1): valid YAML happy path still validates. - ``TestPRDA6ResumeStateWarning`` (1): in-progress snapshot warning emitted. - ``TestPRDB3StartFromWithoutWizardWarns`` (1): silent no-op warning. - ``TestPRDB5GovernanceHonorsExisting`` (1): existing free-text values preserved on Enter. - ``TestPRDB6ExpanduserCanonicalisation`` (1): ``~`` expansion at entry. Plus 1 dispatcher integration test (``test_wizard_start_from_threads_through_to_run_wizard_full``) under ``tests/test_cli_phase10.py``. ## Validation - ruff check + format clean (187 files). - 9 doc guards pass: bilingual_parity, anchor_resolution, cli_help_consistency (439 invocations), yaml_snippets, audit_event_catalog, library_api_doc, doc_numerical_claims, bilingual_code_blocks, wizard_defaults_sync. - module_size: no new warnings. - ``forgelm --config config_template.yaml --dry-run`` green. - 1620 passed, 14 skipped (full suite, was 1607). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two-track cleanup landed atomically: ## Track 1 — Working-memory directories now strictly gitignored ``docs/marketing/`` and ``docs/analysis/`` are operator-local research + audit + draft notes that never appear in fresh clones. Pre-fix, 11 ``docs/analysis/code_reviews/*.md`` files were exempted via inverse ``.gitignore`` rules and referenced from the public tree (CHANGELOG, standards, roadmap, design docs, tests). Every such reference rotted into a 404 the moment the maintainer touched the local working-memory tree. - **`.gitignore` simplified.** Dropped 11 ``!docs/analysis/code_reviews/<file>`` exception lines that re-exposed specific closure-plan and master- review files; the directory-level ignore now applies uniformly. - **``git rm --cached -r docs/analysis/``** untracked the 11 leaked files (kept on disk locally; just no longer versioned in this repo). ## Track 2 — Public-tree reference cleanup 19 references removed / restructured. Where the citation was the load-bearing anchor of a sentence, the surrounding text was rewritten to convey the same intent without the file pointer. - **`CHANGELOG.md`**: 6 references removed (review-cycle 2 audit citation, Phase 22 cycle-1 review citation, closure-plan twice, wave2a fix-summary, master-review-opus, two `Documentation` links) — replaced with PR-thread / roadmap pointers. - **`CLAUDE.md`**: 2 references rewritten — the "Common pitfalls" block dropped its QKV-Core / Trion citations; the memory section rewrote the bullet to forbid future references with a clear rule. - **`docs/standards/coding.md`**: master-review-opus citation removed; "anti-patterns" intro rewritten as project lessons. - **`docs/standards/documentation.md`**: the "researcher" audience row dropped; new top-level "Working-memory directories" section spells out the no-references rule with rationale. - **`docs/standards/localization.md`**: the ``docs/analysis/**`` bilingual-policy row dropped (no public-tree reference == no bilingual concern). - **`docs/design/iso27001_soc2_alignment.md`**: removed the marketing/strategy citation and a self-referential "this file lives under ``docs/analysis/``" note (it lives under ``docs/design/``). - **`docs/roadmap.md`** / **``phase-12-6-closure-cycle.md``** / **``phase-12-data-curation-maturity.md``**: closure-plan and marketing/strategy citations rewritten to point at public roadmap / CHANGELOG / CLAUDE.md anchors instead. - **`tests/test_wizard_phase22.py`** + **``test_check_bilingual_parity.py``**: test docstrings softened to describe the directory category (working-memory) rather than naming specific analysis files. - **`tools/check_anchor_resolution.py`** + **``check_yaml_snippets.py``**: comments + docstrings softened. Functional path-string filters (``_SKIP_PATH_FRAGMENTS = ("docs/analysis/", "docs/marketing/")``, ``--exclude analysis`` default) are KEPT as path filters (they're not citations; they're directory exclusions). - **`.claude/skills/{cut-release,sync-bilingual-docs}/SKILL.md`**: marketing/strategy template + content-strategy citations rewritten to reference local-only working memory rather than specific paths. ## Track 3 — Standard + CI guard preventing regression - **New "Working-memory directories" rule** in `docs/standards/documentation.md` codifies the ban with examples + path-string-filter exemption. - **New CI guard `tools/check_no_analysis_refs.py`** scans every git-tracked file (uses ``git ls-files``, so the gitignored dirs are excluded by construction) and fails the run on any citation into the working-memory tree. Per-file ``_EXEMPT`` allowlist for policy statements (``CLAUDE.md``, ``docs/standards/documentation.md``, ``.gitignore``, the skill that warns about the dirs) and functional path filters; each entry carries a written justification comment. - **CLAUDE.md self-review chain** extended to 9 commands; tools/ listing names ``check_no_analysis_refs``. ## Validation - ruff check + format clean (188 files including the new guard). - 10 doc guards pass: bilingual_parity, anchor_resolution, cli_help_consistency (439 invocations), yaml_snippets, audit_event_catalog, library_api_doc, doc_numerical_claims, bilingual_code_blocks, wizard_defaults_sync, **check_no_analysis_refs** (new — clean). - module_size: no new warnings. - ``forgelm --config config_template.yaml --dry-run`` green. - 1620 passed, 14 skipped (full suite). ## Site verification (read-only side-effect) The audit pass produced a 270-line site verification report (3 CRITICAL + 5 HIGH + 6 MEDIUM + 3 LOW findings on ``site/`` claims vs. live code). The report itself lives in the now- gitignored ``docs/analysis/`` working-memory tree per the new policy; follow-up site fixes will be tracked as separate PRs referencing the live code, not the report. Top three site findings to address in a follow-up PR: 1. ``forgelm.api`` import path advertised on ``index.html`` / ``features.html`` does not exist (real surface is ``from forgelm import ForgeTrainer, audit_dataset, ...``). 2. ``forgelm quickstart byod`` is shown on ``index.html`` but isn't a real template (BYOD is a wizard sub-flow, not a quickstart key). 3. ``compliance.html`` cites ``forgelm/data_audit.py`` as a single-file module; it became a sub-package in Phase 14. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(wizard): Phase 22 — CLI wizard parity with web + sub-package split

Closes 11 CRITICAL + 10 HIGH + 9 MEDIUM + 7 LOW findings from pass-4 site-vs-code audit. Every site claim now matches the live forgelm/ surface; the four non-en/tr locales achieve full key parity. CRITICAL — schema drift killed: - Y1-Y6: index.html YAML demo rewritten to use real Pydantic field names (num_train_epochs, per_device_train_batch_size, evaluation.benchmark.min_score, evaluation.safety.classifier, evaluation.require_human_approval, top-level webhook block). - T1-T5: compliance.html artefact tree redrawn (compliance/ + final_model/ instead of artifacts/, audit_log.jsonl at checkpoint root, config_snapshot.yaml ghost dropped). - X1: forgelm verify-audit example points at audit_log.jsonl, not the directory. - F1: compliance.config_hash ghost replaced with forgelm verify-audit chain integrity claim (HTML + en+tr i18n). - F2: compliance.human_approval ghost YAML replaced with evaluation.require_human_approval (HTML + 6 i18n + usermanuals source markdown). - C1: forgelm.api -> "from forgelm import ..." with real forgelm.__all__ symbols. - C2: forgelm quickstart byod -> forgelm quickstart domain-expert. - C3: forgelm/data_audit.py -> forgelm/data_audit/ (Phase 14 package). HIGH — overclaims softened to match live behaviour: - H4: exit codes 0/1/2/3/4 -> 0-5. - H5: Annex IV "eight" -> "nine required §1-9 sections" matching _ANNEX_IV_REQUIRED_FIELDS. - H6: 168 i18n strings (42 keys * 4 langs) backfilled into de/fr/es/zh — every locale now at 731 keys (was 689). - H7: chat "safety routing" -> "planned for upcoming Pro CLI release" (chat.py has no Llama Guard hook in v0.5.5). - H8: wizard step counter 7 -> 9; fr/es/zh "~ 3 minutes" semantic drift swapped for step-count parity. - R1-R5: auto-revert wording, verify-annex-iv audit-chain claim, safety-eval input formats, --model-card ghost flag, purge data- provenance manifest claim — all reworded to match real behaviour. - W1: wizard.js USE_CASE_PRESETS synced with quickstart.py::TEMPLATES. MEDIUM + LOW: reverse-pii wording (M10), Mermaid jsDelivr disclosure on privacy (M11), audit-event scope (M14), QMS SOP names, PII regex specifics, docker-compose.yaml, doctor pass/warn/fail, Discord webhook, dpia-template ghost link, customer-support/code-assistant body softening, quickstart template parentheticals, JSON-LD operatingSystem + Windows on 8 pages, Last updated 2026-04-28 -> 2026-05-09. M9 wizard step-name divergence + O1 STATE_VERSION drift documented inline in _orchestrator.py + _state.py + wizard.js. CHANGELOG.md added to check_no_analysis_refs.py exempt list (the review-cycle 5 entry that announces this very policy quotes the dir names verbatim — that is the documentation, not a content reference). CI gauntlet (14 guards + ruff + dry-run + 92 guard tests + 139 wizard tests) verified clean post-fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Closes 3 of 4 inline review comments on PR #41; skips the 4th with a brief reason. Accepted (3): - coderabbit @ features.html:239 — safety-eval output filename was wrong: forgelm/safety.py:417 writes safety_results.json, not safety_evaluation.json. Fixed in HTML + all 6 i18n locales. - gemini @ index.html:542 — YAML demo comment "lora / dora / pissa / rslora" updated to "lora / DoRA / PiSSA / rsLoRA" to match the cross-surface academic-prose convention used in the wizard tutorial (forgelm/wizard/_orchestrator.py:339). The YAML *value* "dora" stays lowercase because the schema's Literal accepts lowercase. - gemini @ compliance.html:199 — Article 10 PII evidence row mixed a directory path (forgelm/data_audit/) with a Python symbol (forgelm.mask_pii); aligned with the file-path convention used by every other Article card. Gemini's literal suggestion (forgelm/data_audit/masking.py) does not exist — the real file is forgelm/data_audit/_pii_regex.py:127, so the evidence cell now points there. Skipped (1): - coderabbit @ quickstart.html:154 — wizard STEPS reconciliation between site/js/wizard.js and forgelm/wizard/_orchestrator.py. The divergence is by design and already documented inline at _orchestrator.py:719: both surfaces produce the same generated YAML through different UX flows (different stores: localStorage vs XDG cache; different preset-driven step ordering). Reconciling step IDs would force one surface's UX onto the other with no functional benefit; the user-visible "9 guided steps" counter accurately reflects the JS wizard's STEPS.length. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reviewer flagged pre-existing prose drift on two strings where the en/tr versions describe a thicker mechanism than the four other locales. Token-level claim parity now restored. compliance.art14.body — de/fr/es/zh previously only described the notification side-effect ("blocks model promotion until a human signs. Webhook to your incident tool…") and never named the load-bearing CLI surface (forgelm approve / forgelm reject + forgelm approvals + the staging-directory mechanism). Rewritten to mirror en's wording — the three CLI commands plus the staging-directory move now appear in every locale. Token check across all 6 langs: forgelm approve / reject / approvals — present. privacy.body — de/fr/es/zh + (zh especially) were missing the entire "Data-subject rights you can ship in your own pipeline" paragraph that names forgelm reverse-pii (Article 15), forgelm purge (Article 17), and the ISO 27001 / SOC 2 Type II alignment claim. The paragraph was inserted into all four locales as a new <h2> section between "The toolkit itself" and "Changes," matching en/tr's information architecture. Token check across all 6 langs: forgelm purge / reverse-pii / ISO 27001 — present. This addresses the only "notable drift to flag" item in the PR-41 review verdict. Pre-existing prose drift, not introduced by this PR; the wave-3 i18n batch never re-synced these two strings when the en/tr versions were widened. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

fix(site): pass-5 closure — site-vs-code honesty sweep + i18n parity

qodo-code-review · 2026-05-09T20:35:54Z

ⓘ You've reached your Qodo monthly free-tier limit. Reviews pause until next month — upgrade your plan to continue now, or link your paid account if you already have one.

sourcery-ai

Sorry @cemililik, your pull request is larger than the review limit of 150000 diff characters

coderabbitai · 2026-05-09T20:35:57Z

Warning

Rate limit exceeded

@cemililik has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 16 minutes and 50 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 41b9c249-2fbf-479f-b443-70c16fc2abe0

📥 Commits

Reviewing files that changed from the base of the PR and between 2dce74e and 851f087.

📒 Files selected for processing (5)

docs/design/wizard_mode.md
forgelm/wizard/_byod.py
forgelm/wizard/_collectors.py
forgelm/wizard/_orchestrator.py
tools/check_no_analysis_refs.py

📝 Walkthrough

Walkthrough

Phase 22 CLI wizard modernization: the monolithic forgelm/wizard.py is replaced by a forgelm/wizard/ package (orchestrator, collectors, I/O, state), schema-driven wizard defaults are generated and bundled, a CI guard ensures defaults sync, a new exit code EXIT_WIZARD_CANCELLED = 5 and --wizard-start-from flag enable idempotent runs, and docs/tests/site content updated to reflect a 9-step wizard and working‑memory policies.

Changes

Phase 22 CLI Wizard Refactor & Modernization

Layer / File(s)	Summary
Schema Markup `forgelm/config.py`	Tagged multiple fields with `json_schema_extra={"wizard": True}` so generator can extract schema-driven wizard defaults.
Core Wizard Package `forgelm/wizard/__init__.py`, `forgelm/wizard/_io.py`, `forgelm/wizard/_state.py`	Added `forgelm.wizard` subpackage: `_io.py` adds navigation tokens and prompting primitives; `_state.py` implements XDG-aware atomic snapshot persistence, migrations, step-diff helpers, and loads bundled defaults. `__init__` re-exports public API for compatibility.
Wizard Collectors `forgelm/wizard/_collectors.py`	New centralized collectors for webhook, safety, trainer hyperparameters, RoPE scaling, GaLore, compliance (Article 9/10/11), retention/monitoring, evaluation, synthetic data, and use-case/strategy selection.
BYOD & Quickstart Prelude `forgelm/wizard/_byod.py`	BYOD helpers: dataset-path validation, optional directory ingestion, optional JSONL audit, and quickstart-template prelude that can bypass full wizard.
Step Machine Orchestrator `forgelm/wizard/_orchestrator.py`	Nine-step state-machine driver (welcome→use-case→model→strategy→trainer→dataset→training-params→compliance→evaluation) with strict-tier auto-coercion, back/reset handling, eager snapshot persistence, `run_wizard_full(start_from=...)` entry, validation-on-exit, overwrite-safe saving, and `WizardOutcome`.
Schema-Derived Defaults `forgelm/wizard/_defaults.json`, `tools/generate_wizard_defaults.py`	Generator walks `ForgeConfig` to collect wizard-flagged defaults and emits `forgelm/wizard/_defaults.json` and `site/js/wizard_defaults.js` (deterministic ordering, CI-friendly output).
Defaults Sync Guard `tools/check_wizard_defaults_sync.py`, `tests/test_check_wizard_defaults_sync.py`	CI guard regenerates defaults and diffs against committed JSON/JS; tests added to assert guard behavior on drift/missing targets and generator edge cases.
CLI Exit Code & Public Export `forgelm/cli/_exit_codes.py`, `forgelm/cli/__init__.py`	New public constant `EXIT_WIZARD_CANCELLED = 5` added and re-exported; included in public exit-code set.
CLI Argument Parser `forgelm/cli/_parser.py`	New `--wizard-start-from PATH` flag documented and wired into parser.
CLI Wizard Dispatcher `forgelm/cli/_wizard.py`	Dispatcher now calls `run_wizard_full(start_from=...)` and handles `WizardOutcome`: exit `5` on cancellation, set `args.config` and continue if `start_training`, or exit `0` for YAML-only completion.
Packaging & Asset Bundling `pyproject.toml`	Bundled `forgelm.wizard/_defaults.json` as package data for wheel distributions.
Web Wizard Parity `site/js/wizard.js`, `site/js/wizard_defaults.js`	Browser wizard updated to 9-step flow; reads `window.WIZARD_DEFAULTS` via `wd()` helper; emits `model.max_length`, extended safety block, new inputs; cross-tab state sync via storage events; new generated `wizard_defaults.js`.
Mobile CSS Responsive `site/css/style.css`	Clamped `.wizard-yaml-output` to `max-height: 38vh` with `overflow:auto` for mobile YAML accordion.
Code Quality & CI Config `.codacy.yaml`, `.github/workflows/nightly.yml`, `.markdownlint.json`	Added Codacy exclusions for generated files, updated nightly `pip-audit` invocation to ignore `CVE-2026-1839` (with comments), and disabled MD013 (line-length) in markdownlint.
Working-Memory Policy & Guards `.gitignore`, `tools/check_no_analysis_refs.py`, `tools/check_yaml_snippets.py`, `tools/check_anchor_resolution.py`	Simplified `.gitignore` to `docs/marketing/` and `docs/analysis/`; added `tools/check_no_analysis_refs.py` to fail CI on public-tree references into gitignored working-memory directories; updated other tools to skip these directories consistently.
Docs: Architecture, Design, Standards `docs/*`	Rewrote architecture/design docs to show `wizard/` package and 9-step flow; added working-memory rules to documentation standards; updated exit-code contracts and human-approval key to `evaluation.require_human_approval`.
Site Content & Examples `site/*.html`	Updated site pages (index, quickstart, features, compliance, contact, privacy, terms) to reflect 9-step wizard, defaults asset inclusion, Windows OS disclosure, updated compliance mappings and config example.
Analysis Docs Cleanup `docs/analysis/code_reviews/*.md`	Removed multiple legacy multi-agent review prompt/report files from `docs/analysis/code_reviews/`.
Changelog & Release Notes `CHANGELOG.md`, `docs/roadmap/releases.md`, `CLAUDE.md`	Expanded [Unreleased] with Phase 22 changes and fixes; documented v0.5.5 merged with deferred PyPI tag and next-release bundle; added `check_wizard_defaults_sync.py` to verification steps; strengthened `docs/analysis/` usage rules.
Tests & Test Fixtures `tests/test_wizard_phase22.py`, `tests/test_cli_phase10.py`, `tests/test_phase12_5.py`, `tests/test_wizard_phase11.py`, `tests/conftest.py`	Large Phase 22 test suite (coverage: webhook parsing, safety probes, navigation tokens, hyperparams, strict-tier coercion, XDG persistence, step diffs, schema validation, preflight checklist, migration skeleton, PR‑D regressions). CLI tests updated to assert `WizardOutcome` handling; conftest autouse fixture isolates XDG cache for wizard tests.
Supporting Tools `tools/*`, `README.md`	Added generator and checker tools; README updated to show `wizard/` sub-package and bundled `_defaults.json`.

Sequence Diagram(s)

sequenceDiagram
participant User as User (tty)
participant CLI as forgelm CLI
participant Orch as wizard Orchestrator
participant State as XDG Wizard State
participant Trainer as Training Pipeline

User->>CLI: forgelm --wizard [--wizard-start-from path]
CLI->>Orch: run_wizard_full(start_from=path)
Orch->>State: _load_wizard_state() / load start_from YAML
Orch->>User: prompt steps (welcome → ... → evaluation)
User->>Orch: answers / navigation tokens
Orch->>State: _save_wizard_state() after each step (atomic write)
alt cancelled / non-tty / decline-to-save
Orch-->>CLI: WizardOutcome(cancelled=True)
CLI->>User: exit with EXIT_WIZARD_CANCELLED (5)
else produced YAML (maybe start_training)
Orch->>State: _save_config_to_file()
Orch-->>CLI: WizardOutcome(config_path, start_training)
CLI->>Trainer: if start_training True, set args.config and run pipeline
CLI->>User: exit with EXIT_SUCCESS (0)
end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Possibly related PRs

cemililik/ForgeLM#40: Implements the same Phase‑22 CLI wizard refactor (forgelm/wizard.py → forgelm/wizard/ sub-package), including orchestrator, state, collectors, tooling, and tests.
cemililik/ForgeLM#38: Overlaps heavily with Phase‑22 wizard changes (exit-code, --wizard-start-from, defaults generator, site/docs updates).
cemililik/ForgeLM#9: Related earlier work on the interactive wizard/quickstart integration and package-level wiring.

"Nine steps to config delight,
Snapshots rescued from Ctrl‑C's bite,
Schema defaults synced and true,
Windows welcome (and macOS too!),
Wizard state persists the night." 🐰✨

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch development

codacy-production · 2026-05-09T20:36:05Z

Not up to standards ⛔

🔴 Issues 5 high · 2 medium · 2 minor

Alerts:
⚠ 9 issues (≤ 0 issues of at least minor severity)

Results:
9 new issues

Category Results

BestPractice 1 minor

ErrorProne 5 high

Security 2 medium
1 minor

View in Codacy

🟢 Metrics 709 complexity · 2 duplication

Metric Results

Complexity 709

Duplication 2

View in Codacy

_{NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer}
_{TIP This summary will be updated as you push new changes.}

gemini-code-assist

Code Review

This pull request modernizes the interactive configuration wizard by splitting it into a sub-package, aligning it with the web-based wizard, and introducing an idempotent re-run feature via the --wizard-start-from flag. It also implements strict CI guards to prevent documentation drift and references to gitignored working-memory directories. Feedback identifies a logic flaw in the idempotent re-run flow where previously enabled features, such as auto-revert or safety evaluation, cannot be disabled through the prompts. Improvements are also suggested for ensuring atomic writes of the final configuration file and gating the training start prompt on successful schema validation of the generated YAML.

gemini-code-assist · 2026-05-09T20:39:14Z

+    existing_evaluation = state.config.get("evaluation") or {}
+    evaluation: Dict[str, Any] = copy.deepcopy(existing_evaluation)
+    auto_revert_default = bool(existing_evaluation.get("auto_revert", False))
+    if _prompt_yes_no(
+        "Enable auto-revert (discard model if quality drops)?",
+        default=auto_revert_default,
+    ):
+        evaluation["auto_revert"] = True
+        existing_max_loss = existing_evaluation.get("max_acceptable_loss")
+        # P14: when auto-revert is on but no explicit threshold is given,
+        # emit the web wizard's safe ``2.0`` default rather than leaving
+        # the field unset.  Mirrors ``site/js/wizard.js:163`` so the
+        # two surfaces produce equivalent YAMLs for the same answers.
+        prompt_default = str(existing_max_loss) if existing_max_loss is not None else ""
+        max_loss = _prompt("Max acceptable loss (leave empty for default 2.0)", prompt_default)
+        if max_loss.strip():
+            try:
+                evaluation["max_acceptable_loss"] = float(max_loss)
+            except ValueError:
+                _print(f"  '{max_loss}' is not a number; using default 2.0.")
+                evaluation["max_acceptable_loss"] = 2.0
+        else:
+            evaluation["max_acceptable_loss"] = 2.0
+    elif "auto_revert" in existing_evaluation and not auto_revert_default:
+        # No-op branch: operator left auto-revert disabled (matches
+        # existing state).  Nothing to update.
+        pass


The logic for auto_revert in an idempotent re-run is flawed. If auto_revert was previously enabled (auto_revert_default is True) and the operator answers "no" to the prompt, the if block is skipped and the elif block is also skipped (because not auto_revert_default is False). Since evaluation is a deepcopy of the existing config, auto_revert remains True, ignoring the operator's choice to disable it. To fix this, explicitly remove the keys if the user answers "no".

if _prompt_yes_no( "Enable auto-revert (discard model if quality drops)?", default=auto_revert_default, ): evaluation["auto_revert"] = True existing_max_loss = existing_evaluation.get("max_acceptable_loss") # P14: when auto-revert is on but no explicit threshold is given, # emit the web wizard's safe ``2.0`` default rather than leaving # the field unset. Mirrors ``site/js/wizard.js:163`` so the # two surfaces produce equivalent YAMLs for the same answers. prompt_default = str(existing_max_loss) if existing_max_loss is not None else "" max_loss = _prompt("Max acceptable loss (leave empty for default 2.0)", prompt_default) if max_loss.strip(): try: evaluation["max_acceptable_loss"] = float(max_loss) except ValueError: _print(f" '{max_loss}' is not a number; using default 2.0.") evaluation["max_acceptable_loss"] = 2.0 else: evaluation["max_acceptable_loss"] = 2.0 else: evaluation.pop("auto_revert", None) evaluation.pop("max_acceptable_loss", None)

gemini-code-assist · 2026-05-09T20:39:14Z

+    if safety:
+        evaluation["safety"] = safety


As noted above, if the operator answers "no" to the safety evaluation prompt, the existing safety block should be removed from the configuration to allow disabling the feature during an idempotent re-run.

Suggested change

if safety:

evaluation["safety"] = safety

if safety:

evaluation["safety"] = safety

else:

evaluation.pop("safety", None)

gemini-code-assist · 2026-05-09T20:39:14Z

+    if benchmark:
+        evaluation["benchmark"] = benchmark


As noted above, if the operator answers "no" to the benchmark evaluation prompt, the existing benchmark block should be removed from the configuration.

Suggested change

if benchmark:

evaluation["benchmark"] = benchmark

if benchmark:

evaluation["benchmark"] = benchmark

else:

evaluation.pop("benchmark", None)

gemini-code-assist · 2026-05-09T20:39:14Z

+        evaluation["llm_judge"] = judge
+    if evaluation:


As noted above, if the operator answers "no" to the LLM-as-Judge prompt, the existing llm_judge block should be removed from the configuration.

Suggested change

evaluation["llm_judge"] = judge

if evaluation:

if judge:

evaluation["llm_judge"] = judge

else:

evaluation.pop("llm_judge", None)

gemini-code-assist · 2026-05-09T20:39:14Z

+            yaml.safe_dump(config, f, default_flow_style=False, sort_keys=False, allow_unicode=True)
+        _print(f"\n  Config saved to: {requested_filename}")


The final configuration file is saved using a direct open(..., "w") call, which is not atomic. If the wizard is overwriting an existing config (supported via --wizard-start-from) and the write fails (e.g., disk full, power loss), the original file will be corrupted or lost. Consider using the same atomic write pattern (temp file + os.replace) implemented in _save_wizard_state to ensure the integrity of the operator's configuration files.

gemini-code-assist · 2026-05-09T20:39:14Z

+    }
+
+
+def _collect_monitoring() -> Optional[Dict[str, Any]]:


Most section collectors (like _collect_monitoring, _collect_retention, _collect_benchmark, etc.) do not accept existing values from the loaded configuration. When an operator iterates on an existing config via --wizard-start-from and chooses to reconfigure a section, they are forced to re-type every field from scratch because the prompts use hardcoded defaults. To improve the UX of idempotent re-runs, these collectors should accept an existing dictionary and use its values as prompt defaults, mirroring the pattern used in _collect_data_governance.

gemini-code-assist · 2026-05-09T20:39:14Z

+    _validate_generated_config(config_filename)
+    _clear_wizard_state()
+    if _prompt_yes_no("Start training now?", default=False):


The "Start training now?" prompt is offered even if the generated configuration fails schema validation. Since ForgeConfig validation failures are fatal for the trainer, starting a run with an invalid config is guaranteed to fail. Consider making _validate_generated_config return a boolean and using it to gate the training start prompt.

Suggested change

_validate_generated_config(config_filename)

_clear_wizard_state()

if _prompt_yes_no("Start training now?", default=False):

valid = _validate_generated_config(config_filename)

_clear_wizard_state()

if valid and _prompt_yes_no("Start training now?", default=False):

gemini-code-assist · 2026-05-09T20:39:14Z

+        counter += 1
+
+
+def _validate_generated_config(config_filename: str) -> None:


Change the return type to bool to allow the caller to gate training start on successful validation.

Suggested change

def _validate_generated_config(config_filename: str) -> None:

def _validate_generated_config(config_filename: str) -> bool:

gemini-code-assist · 2026-05-09T20:39:14Z

+    _print("  ✓ Schema validation passed.")
+


Return False on validation failure and True on success.

Suggested change

_print(" ✓ Schema validation passed.")

return False

_print(" ✓ Schema validation passed.")

return True

Two scopes bundled together since both are release-prep housekeeping on `development` ahead of the next PyPI tag: 1. Issue #37 — nightly pip-audit gate - `transformers` CVE-2026-1839 was failing the Supply-chain security workflow; published fix lives in 5.0.0rc3 (release candidate) and ForgeLM's pyproject pins `<5.0.0`. - Added `--ignore-vuln CVE-2026-1839` to `.github/workflows/nightly.yml` with multi-line rationale + remove-after condition. Per the policy in `tools/check_pip_audit.py` each ignore must carry a written justification; this one is reviewed at every release. 2. Release-prep doc refresh - CHANGELOG `[Unreleased]` rewritten in public-facing style (grouped by user-visible capability, no internal review-cycle / PR-letter / round-N terminology). Added entries for Phase 22 wizard parity-with-web, the site documentation correction sweep (PR #41), the working-memory directory policy, and the issue #37 stop-gap. - `docs/roadmap.md` + `docs/roadmap-tr.md` add Phase 22 + site sweep rows in the "Status at a glance" table; "Current state" sentence updated from the stale "release imminent" framing to reflect that v0.5.5 was paper-merged on main but the PyPI tag was deferred so the upcoming release can fold in the post- merge work. - `docs/roadmap/releases.md` marks v0.5.5 as "Merged on main; PyPI tag deferred"; new "Next PyPI release" section documents the upcoming release scope and surfaces the version-slot decision (`0.5.6` vs `0.6.0`) for the maintainer with rationale on each side. - `tools/check_no_analysis_refs.py` `_EXEMPT` list extended with `docs/roadmap/releases.md` (parallel to the CHANGELOG.md exempt for the policy-announcement entry that names the gitignored dirs verbatim). - `docs/roadmap/phase-12-6-closure-cycle.md` link to v0.5.5 section anchor updated to match the new heading. CI gauntlet (anchor resolution / bilingual parity / no-analysis-refs / yaml snippets / audit catalog / library API / numerical claims / wizard defaults / site claims / cli help / ruff / dry-run) verified clean. No code path or test changes; pure docs + workflow. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

docs/usermanuals/en/reference/cli.md (1)
205-214: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Missing exit code 5 documentation — bilingual parity violation.

The Turkish CLI reference (docs/usermanuals/tr/reference/cli.md line 210) documents exit code 5 (wizard cancelled), but this English version does not include it in the exit codes table. As per coding guidelines, user-facing docs must be bilingual mirrors.
➕ Proposed fix: Add exit code 5 to the table
 | 3 | Auto-revert / regression |
 | 4 | Awaiting human approval (training pipeline) |
+| 5 | Wizard cancelled | `forgelm --wizard` exited without writing YAML (Ctrl+C, non-tty rejection, save refusal) — distinct from `0` so CI can distinguish "wizard completed" from "wizard wrote nothing" |
 | 130 | User interrupted (Ctrl+C) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/usermanuals/en/reference/cli.md` around lines 205 - 214, The English CLI
exit codes table is missing exit code 5 present in the Turkish version; update
the table in docs/usermanuals/en/reference/cli.md to include a new row for "5 |
Wizard cancelled" (matching the Turkish description) so both language docs
remain in parity; ensure the new entry follows the same table formatting and
update the "See [Exit Codes]" reference only if needed.
docs/usermanuals/tr/reference/exit-codes.md (1)
113-113: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Bilingual parity issue — compatibility range not updated.

The compatibility guarantee still states "Exit kodları 0-4" (exit codes 0-4) but should be "0-5" to match the English version and include the newly documented exit code 5. The English version (line 113 in docs/usermanuals/en/reference/exit-codes.md) correctly states "Exit codes 0-5 are stable."
📝 Proposed fix
-Exit kodları 0-4 sürümler arası kararlıdır. Yeni kodlar eklenebilir (5, 6, …) ama mevcutların semantiği değişmez.
+Exit kodları 0-5 sürümler arası kararlıdır. Yeni kodlar eklenebilir (6, 7, …) ama mevcutların semantiği değişmez.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/usermanuals/tr/reference/exit-codes.md` at line 113, The Turkish text
still reads "Exit kodları 0-4" but must match the English doc and include the
new stable code 5; update the sentence containing "Exit kodları 0-4" to "Exit
kodları 0-5" (preserve the rest of the sentence and punctuation) so the
bilingual parity and the documented compatibility range align with the English
version.

🧹 Nitpick comments (1)

tools/check_no_analysis_refs.py (1)

144-144: 💤 Low value

Optional: simplify set comprehension.

The comprehension {p for p in files} can be simplified to set(files) for better readability, as flagged by SonarCloud.

♻️ Proposed refactor

-        return sorted({p for p in files})
+        return sorted(set(files))

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tools/check_no_analysis_refs.py` at line 144, Replace the set comprehension
in the return of the function that currently uses "sorted({p for p in files})"
with "sorted(set(files))" to improve readability; locate the return statement in
tools/check_no_analysis_refs.py (the line returning sorted({p for p in files}))
and change it to use set(files) instead of the comprehension.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CHANGELOG.md`:
- Around line 9-16: Update the CHANGELOG entry to soften the claim about CLI/web
parity: replace "same 9-step flow as the in-browser wizard" with wording like
"same nine user-visible stages and same generated YAML" (or similar) and note
that internal step IDs/trainer–model order may differ as documented in
forgelm/wizard/_orchestrator.py; ensure the sentence about the wizard module
split remains but remove the stronger parity wording.
- Around line 108-119: Replace the explicit mentions of the gitignored paths
'docs/marketing/' and 'docs/analysis/' in the CHANGELOG entry with the generic
term "working-memory directories"; update the sentence that currently reads
those paths to instead read something like "working-memory directories
(gitignored) are now strictly ignored..." and keep the rest of the paragraph
intact, ensuring any subsequent references (e.g.,
`docs/standards/documentation.md`, `tools/check_no_analysis_refs.py`, `_EXEMPT`)
remain unchanged.

In `@docs/design/wizard_mode.md`:
- Around line 15-16: The doc introduces a new public exit code symbol
EXIT_WIZARD_CANCELLED = 5 which violates the repository's public exit-code
contract (0–4); either remove or mark EXIT_WIZARD_CANCELLED as internal/private
(not part of the public contract) and revert public documentation to the
approved contract, or update the central contract docs in
docs/standards/error-handling.md to include the new code and ensure all
wrappers/CI are updated; locate usages/definitions of EXIT_WIZARD_CANCELLED and
the wizard_mode.md text and either change the symbol to an internal-only
constant or coordinate a contract update in the standards document to keep the
public API consistent.
- Around line 47-105: The documentation's implemented flow for
forgelm/wizard::run_wizard is out of sync with site/js/wizard.js; update the
steps in docs/design/wizard_mode.md to match the browser order (welcome →
use-case → trainer → model → dataset → training → compliance → operations →
review), remove the standalone Strategy step or explicitly note its relocation
into Model/Training UI, and add the missing Review step and any moved Trainer
details so the text and the listed step numbers/names align exactly with
site/js/wizard.js (reference run_wizard, Strategy, Trainer, and the
Review/Operations step names when editing).

In `@docs/reference/architecture-tr.md`:
- Around line 148-149: The heading "### `wizard/`" is missing the required blank
line after it; insert a single blank line immediately after the `### `wizard/``
heading so the following paragraph (starting with "Etkileşimli CLI
sihirbazı...") is separated as per docs/standards and markdownlint MD022
expectations.

In `@docs/reference/architecture.md`:
- Around line 148-149: Missing blank line before the "### `wizard/`" heading;
add a single empty line immediately above the "### `wizard/`" heading in
docs/reference/architecture.md so the markdown linter recognizes the heading
separation and preserves readability (locate the "### `wizard/`" heading text
and insert one blank line directly before it).

In `@forgelm/cli/_exit_codes.py`:
- Around line 15-31: Remove EXIT_WIZARD_CANCELLED from the public contract: keep
the constant (EXIT_WIZARD_CANCELLED) for internal use but do not include it in
the _PUBLIC_EXIT_CODES frozenset so the public contract remains 0–4; update the
nearby comment to note that EXIT_WIZARD_CANCELLED is internal-only and not part
of the exported contract, and ensure only EXIT_SUCCESS, EXIT_CONFIG_ERROR,
EXIT_TRAINING_ERROR, EXIT_EVAL_FAILURE, and EXIT_AWAITING_APPROVAL remain in
_PUBLIC_EXIT_CODES.

In `@forgelm/wizard/_byod.py`:
- Around line 351-355: The printed follow-up shell commands use quickstart_path
unquoted which will break when the path contains spaces; update the two _print
calls that display the suggested commands (the ones referencing quickstart_path
and the "Running: forgelm --config ..." / "forgelm --config ..." messages) to
wrap/quote the quickstart_path (e.g., with single quotes) so the printed
commands are valid shell invocations even when quickstart_path contains spaces.

In `@forgelm/wizard/_collectors.py`:
- Around line 343-367: Collectors like _collect_rope_scaling currently overwrite
any existing advanced settings by always recomputing defaults; change them to
preserve loaded values unless the user explicitly chooses to recompute: in
_collect_rope_scaling check for an existing rope_scaling value (or accept an
optional existing parameter), if present show it and prompt "Keep existing RoPE
scaling?" (default yes) and return the existing dict if the user keeps it,
otherwise only then compute new rope_factor and type; apply the same pattern to
the GaLore-related collectors referenced at 394-412 (check for existing galore_*
tuning, prompt to keep vs recompute, and only compute defaults when creating new
or when user confirms recompute).
- Around line 258-335: The _collect_trainer_hyperparameters helper currently
always uses schema literals as prompt defaults, which causes reruns started from
an existing config to overwrite prior per-trainer knobs; change
_collect_trainer_hyperparameters to accept an optional existing TrainingConfig
(or plain dict) and use its trainer-specific fields as the prompt defaults
(e.g., use existing.dpo_beta, simpo_gamma, kto_beta, orpo_beta,
grpo_num_generations, grpo_max_completion_length, grpo_reward_model) instead of
the hard-coded numbers/strings, and update the caller (_step_trainer) to pass
the loaded config when available so Enter preserves prior values.
- Around line 200-250: The helper _collect_safety_config currently rebuilds the
safety block with hardcoded defaults and overwrites loaded settings; change it
to accept an optional existing_safety: Optional[Dict[str, Any]] parameter
(seeded from _step_evaluation) and use existing_safety values as the defaults
for prompts (e.g., classifier default, test_prompts default, scoring default,
max_safety_regression default, min_safety_score/min_classifier_confidence, and
severity_thresholds), and when a prompt returns empty/unchanged preserve the
existing field instead of replacing it; ensure you only add new keys when the
user explicitly provides new values and keep existing
track_categories/severity_thresholds if not modified.

In `@forgelm/wizard/_orchestrator.py`:
- Around line 1015-1018: The call to _validate_generated_config(config_filename)
currently only logs errors so the flow still prompts to start training even when
validation failed; change _validate_generated_config to return a boolean success
flag (True on valid, False on invalid) and update the caller in the block around
_validate_generated_config, _clear_wizard_state, and _prompt_yes_no so it checks
the returned flag and skips the prompt/short-circuits the immediate training
branch when validation returned False (i.e., only call _prompt_yes_no("Start
training now?", default=False) and print the forgelm command when
_validate_generated_config(...) returned success).

In `@site/js/wizard.js`:
- Around line 2026-2030: The new wizard row uses hard-coded English strings for
the label and hint (created in maxLengthRow via el(..., text: 'Max sequence
length') and el(..., text: 'Tokens per training example...')), so wrap those
strings with the translation helper (tr(...)) like the other wizard fields and
add corresponding keys/entries to the locales file(s) used by the wizard
training UI; update the label and hint usages (referencing maxLengthRow and
maxLengthInput) to use the new locale keys to ensure proper translation.
- Around line 1244-1278: The storage handler must treat e.newValue === null
(remote reset/delete) as an explicit reset: inside the
window.addEventListener('storage', ...) callback (the block that checks e.key
!== STORAGE_KEY and uses loadState(), state, and render()), detect if e.newValue
=== null and in that case clear the in-memory state object (delete all keys from
state), set any derived state to its empty/default form if applicable, then call
render() so the visible wizard reflects the reset; keep the existing try/catch
and console.warn behavior for other errors and still return early when modal has
'hidden'.

In `@tests/test_wizard_phase22.py`:
- Around line 1592-1611: The test test_auto_revert_default_reflects_existing
currently uses a non-specific assertion that will always pass; update it to
assert the exact expected outcome after operator explicitly answers "no" to
auto-revert: replace the loose check on
state.config["evaluation"].get("auto_revert") in (True, False) with a precise
assertion that the value is False (e.g., assert
state.config["evaluation"].get("auto_revert") is False) so that the behavior of
_WizardState/config after wizard._orchestrator._step_evaluation is properly
verified.

In `@tools/check_anchor_resolution.py`:
- Around line 327-333: The help text promises that passing "--exclude ''" will
scan everything but _resolve_excludes currently filters out empty values and
thus falls back to the default excludes; update _resolve_excludes so that it
distinguishes "no --exclude given" (keep defaults) from " --exclude given with
an explicit empty string" (treat as an explicit override to use an empty exclude
set), i.e. stop dropping empty-string entries when the user explicitly provided
excludes and return an empty set in that case; key symbol: _resolve_excludes.

---

Outside diff comments:
In `@docs/usermanuals/en/reference/cli.md`:
- Around line 205-214: The English CLI exit codes table is missing exit code 5
present in the Turkish version; update the table in
docs/usermanuals/en/reference/cli.md to include a new row for "5 | Wizard
cancelled" (matching the Turkish description) so both language docs remain in
parity; ensure the new entry follows the same table formatting and update the
"See [Exit Codes]" reference only if needed.

In `@docs/usermanuals/tr/reference/exit-codes.md`:
- Line 113: The Turkish text still reads "Exit kodları 0-4" but must match the
English doc and include the new stable code 5; update the sentence containing
"Exit kodları 0-4" to "Exit kodları 0-5" (preserve the rest of the sentence and
punctuation) so the bilingual parity and the documented compatibility range
align with the English version.

---

Nitpick comments:
In `@tools/check_no_analysis_refs.py`:
- Line 144: Replace the set comprehension in the return of the function that
currently uses "sorted({p for p in files})" with "sorted(set(files))" to improve
readability; locate the return statement in tools/check_no_analysis_refs.py (the
line returning sorted({p for p in files})) and change it to use set(files)
instead of the comprehension.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5ea8c55e-ca8f-4526-b5b7-c65350c8e168

📥 Commits

Reviewing files that changed from the base of the PR and between ba8ffce and ecf3dcc.

📒 Files selected for processing (88)

.claude/skills/cut-release/SKILL.md
.claude/skills/sync-bilingual-docs/SKILL.md
.codacy.yaml
.github/workflows/nightly.yml
.gitignore
.markdownlint.json
CHANGELOG.md
CLAUDE.md
README.md
docs/analysis/code_reviews/closure-plan-202604300906.md
docs/analysis/code_reviews/foundation-pr-review-prompt.md
docs/analysis/code_reviews/ghost-features-analysis-20260502.md
docs/analysis/code_reviews/master-review-opus-202604300906.md
docs/analysis/code_reviews/wave2a-round2-review-prompt.md
docs/analysis/code_reviews/wave2b-final-merge-review-prompt.md
docs/analysis/code_reviews/wave3-followup-review-prompt.md
docs/analysis/code_reviews/wave3-review-prompt.md
docs/analysis/code_reviews/wave4-followup-review-prompt.md
docs/analysis/code_reviews/wave4-review-prompt.md
docs/analysis/code_reviews/wave5-review-prompt.md
docs/design/data_audit_cli_split.md
docs/design/gdpr_erasure.md
docs/design/iso27001_soc2_alignment.md
docs/design/wizard_mode.md
docs/guides/quickstart-tr.md
docs/guides/quickstart.md
docs/product_strategy-tr.md
docs/product_strategy.md
docs/reference/architecture-tr.md
docs/reference/architecture.md
docs/reference/usage-tr.md
docs/reference/usage.md
docs/roadmap-tr.md
docs/roadmap.md
docs/roadmap/phase-11-5-backlog.md
docs/roadmap/phase-12-6-closure-cycle.md
docs/roadmap/phase-12-data-curation-maturity.md
docs/roadmap/releases.md
docs/standards/architecture.md
docs/standards/coding.md
docs/standards/documentation.md
docs/standards/localization.md
docs/usermanuals/en/compliance/gdpr.md
docs/usermanuals/en/compliance/human-approval-gate.md
docs/usermanuals/en/reference/cli.md
docs/usermanuals/en/reference/exit-codes.md
docs/usermanuals/tr/compliance/gdpr.md
docs/usermanuals/tr/compliance/human-approval-gate.md
docs/usermanuals/tr/reference/cli.md
docs/usermanuals/tr/reference/exit-codes.md
forgelm/cli/__init__.py
forgelm/cli/_exit_codes.py
forgelm/cli/_parser.py
forgelm/cli/_wizard.py
forgelm/config.py
forgelm/wizard.py
forgelm/wizard/__init__.py
forgelm/wizard/_byod.py
forgelm/wizard/_collectors.py
forgelm/wizard/_defaults.json
forgelm/wizard/_io.py
forgelm/wizard/_orchestrator.py
forgelm/wizard/_state.py
pyproject.toml
site/compliance.html
site/contact.html
site/css/style.css
site/features.html
site/guide.html
site/index.html
site/js/translations.js
site/js/wizard.js
site/js/wizard_defaults.js
site/privacy.html
site/quickstart.html
site/terms.html
tests/conftest.py
tests/test_check_bilingual_parity.py
tests/test_check_wizard_defaults_sync.py
tests/test_cli_phase10.py
tests/test_phase12_5.py
tests/test_wizard_phase11.py
tests/test_wizard_phase22.py
tools/check_anchor_resolution.py
tools/check_no_analysis_refs.py
tools/check_wizard_defaults_sync.py
tools/check_yaml_snippets.py
tools/generate_wizard_defaults.py

💤 Files with no reviewable changes (12)

docs/analysis/code_reviews/wave3-followup-review-prompt.md
docs/analysis/code_reviews/ghost-features-analysis-20260502.md
docs/analysis/code_reviews/wave4-followup-review-prompt.md
docs/analysis/code_reviews/wave2a-round2-review-prompt.md
docs/analysis/code_reviews/foundation-pr-review-prompt.md
docs/analysis/code_reviews/master-review-opus-202604300906.md
docs/standards/localization.md
docs/analysis/code_reviews/wave3-review-prompt.md
docs/analysis/code_reviews/wave5-review-prompt.md
forgelm/wizard.py
docs/analysis/code_reviews/wave2b-final-merge-review-prompt.md
docs/analysis/code_reviews/wave4-review-prompt.md

coderabbitai · 2026-05-09T21:46:21Z

+> cross-tab `storage` sync, distinct `EXIT_WIZARD_CANCELLED = 5`
+> exit code, best-effort `readline` integration. Schema-default


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don't introduce a new public exit code here without updating the contract.

This doc advertises EXIT_WIZARD_CANCELLED = 5, but the repo’s public exit-code contract is still 0/1/2/3/4. Unless that contract changed in the same release, this is a compatibility break for wrappers and CI that only handle the documented range. As per coding guidelines, "Error paths must follow docs/standards/error-handling.md with exit codes 0/1/2/3/4 as a public contract".

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/design/wizard_mode.md` around lines 15 - 16, The doc introduces a new public exit code symbol EXIT_WIZARD_CANCELLED = 5 which violates the repository's public exit-code contract (0–4); either remove or mark EXIT_WIZARD_CANCELLED as internal/private (not part of the public contract) and revert public documentation to the approved contract, or update the central contract docs in docs/standards/error-handling.md to include the new code and ensure all wrappers/CI are updated; locate usages/definitions of EXIT_WIZARD_CANCELLED and the wizard_mode.md text and either change the symbol to an internal-only constant or coordinate a contract update in the standards document to keep the public API consistent.

cemililik · 2026-05-09T21:56:25Z

Both review verdicts (APPROVE with NITs) absorbed. Single concrete action taken:

§1 / §5 NIT — PR description undercount (16 → 17 commits, 4th bucket missing). PR description updated to reflect:

17 commits, 88 files, ~7283 insertions / ~8656 deletions
Added 4th bucket entry for ecf3dcc chore(release-prep): nightly CVE ignore + CHANGELOG/roadmap refresh (the post-prompt commit landing the issue #37 stop-gap and the public-facing CHANGELOG/roadmap rewrite)

Other findings — no action needed (confirmed by both reviewers as non-blocking):

§3 (reviewer 1) — pytest subset coverage threshold. Pre-existing on origin/main (the --cov-fail-under=40 floor in pyproject.toml applies globally; subset runs naturally fall under it). Full pytest tests/ passes the floor cleanly. Not a regression.
§5 (reviewer 1) — PR fix(site): pass-5 closure — site-vs-code honesty sweep + i18n parity #41 follow-up commits without dedicated CHANGELOG bullets. Per the prompt's own guidance, follow-up fixes to a parent feature already documented in CHANGELOG are acceptable as un-bulleted. The site-honesty work is captured under the "Site copy now matches the live code surface" entry in [Unreleased] → ### Changed.

Both reviewers explicitly recommend APPROVE / merge. No blockers.

cemililik · 2026-05-09T22:01:04Z

External-scanner verdict — both Quality Gates fail on pre-existing intentional fixtures

Cross-checked the failing scanner outputs against the silencing baseline established in 40120d8 chore(scanners): silence Codacy + SonarCloud false positives (the commit that audited PR #40's scanner findings).

SonarCloud — all 10 hotspots verified pre-existing false positives

Fetched via the SonarCloud hotspot API. Every entry is LOW severity and lives in test files where the literal IS the test's load-bearing assertion:

tests/test_wizard_phase22.py:90    LOW  ftp://...     (test_unknown_scheme_rejected — verifies SSRF rejection)
tests/test_wizard_phase22.py:93-94 LOW  http://...    (test_http_warning_but_accepted — wizard parser surface)
tests/test_wizard_phase22.py:830   LOW  169.254.169.254 (TestWebhookSSRFPreflight — IMDS rejection)
tests/test_wizard_phase22.py:834   LOW  10.0.0.5      (TestWebhookSSRFPreflight — RFC1918 rejection)
tests/test_wizard_phase22.py:924   LOW  tmp_path      (pytest stdlib fixture)
tests/test_wizard_phase22.py:929   LOW  tmp_path      (pytest stdlib fixture)
tests/test_cli_phase10.py:383      LOW  tmp_path      (pytest stdlib fixture)
tests/test_cli_phase10.py:422      LOW  tmp_path      (pytest stdlib fixture)
tests/test_cli_phase10.py:425      LOW  tmp_path      (pytest stdlib fixture)

These are the exact 8 patterns documented as Safe / Reviewed in 40120d8 for PR #40, plus 2 additional tmp_path lines in test_cli_phase10.py that follow the same documented-as-safe pattern. Suppressing them in code would either disable the SonarCloud rule project-wide (loses signal on production code) or add inline comments that confuse the test-source-as-spec pattern.

The Safe / Reviewed marks made on PR #40's SonarCloud view do not transfer automatically to PR #42's view — that's a SonarCloud cross-PR artefact, not a code regression. Suggested action: re-mark as Safe on the PR #42 dashboard (one-click per hotspot) or set a per-folder rule exemption for tests/.

Codacy — 10 issues (5 high · 2 medium · 3 minor)

Codacy doesn't expose an unauthenticated PR-issue API. From the categorical breakdown (5 ErrorProne · 2 Security medium · 1 Security minor · 1 BestPractice · 1 Comprehensibility), and given:

All 7 Codacy HIGH findings from PR feat(wizard): Phase 22 — CLI wizard parity with web + sub-package split #40 were silenced in-tree in 40120d8 — the markers are still present on the development tip:
- 5× # noqa: F401 -- re-export for tests in forgelm/wizard/__init__.py:62, 75, 103, 122, 151
- 1× # nosec B105 -- enum-style discriminator in forgelm/wizard/_collectors.py:713
- .codacy.yaml excludes site/js/wizard_defaults.js + forgelm/wizard/_defaults.json (generated artefacts)
- .markdownlint.json disables MD013 for prose docs
PR fix(site): pass-5 closure — site-vs-code honesty sweep + i18n parity #41 (against development, same code as PR release: development → main — Phase 22 wizard modernisation + site honesty pass-5 #42's head) was green on Codacy — implying the in-tree silencing IS working on a per-PR basis when the diff base already contains the silencing context.

PR #42's 5 high is most likely the same class of findings re-rendered against a baseline (main) that lacks the silencing context. The in-tree markers will land on main once the merge happens, so the main branch's continuous-scan view will go green after merge.

If any of the 10 Codacy issues is a genuinely new pattern (i.e. something that landed in a commit other than 40120d8's domain), please share the dashboard list and I'll triage each one in-tree.

Verdict

All other checks green: ruff ✓, ruff format ✓, all 4 test-matrix Python versions ✓, lint ✓, validate ✓, CodeRabbit ✓, 14 internal CI guards ✓, dry-run ✓.
Two human-style review verdicts above both recommend APPROVE with NIT-level notes only.
mergeable: MERGEABLE; only blocker is the Quality Gate state, which is the documented cross-PR scanner-view artefact.

No code-level action recommended. Maintainer call: re-mark Safe on SonarCloud's PR #42 view, or merge with the failing Quality Gate on the rationale that the underlying findings are documented + intentional.

Closes 25 inline review comments (16 CodeRabbit + 9 Gemini) plus 2 outside-diff items and 1 nitpick on PR #42. Two reviewer findings explicitly skipped — see "Skipped" below. ## Wizard idempotent re-run — preserves prior answers + lets operator disable previously-enabled gates The pre-fix ``_step_evaluation`` deepcopy of the loaded YAML left ``auto_revert`` / ``safety`` / ``benchmark`` / ``llm_judge`` blocks intact even when the operator answered "no" to a gate prompt — making it impossible to disable a feature on a ``--wizard-start-from`` rerun. Fixed across all four gates with explicit pop-on-decline semantics: - **auto_revert:** explicit "no" sets ``auto_revert = False`` and pops ``max_acceptable_loss`` so the rebuild reflects the answer. - **safety / benchmark / llm_judge:** when the collector returns ``None`` (operator declined the gate), the existing block is popped. Paired with a ``default_enabled`` plumbing pass on the gate prompts so a bare-Enter rerun of a previously-enabled gate keeps the block intact: - ``_collect_safety_config(existing=...)`` — uses the loaded ``classifier``, ``test_prompts``, ``scoring``, ``max_safety_regression``, ``min_safety_score``, ``min_classifier_confidence``, ``track_categories`` / ``severity_thresholds`` as prompt defaults. - ``_collect_trainer_hyperparameters(trainer_type, existing=...)`` — uses the loaded ``dpo_beta`` / ``simpo_beta+gamma`` / ``kto_beta`` / ``orpo_beta`` / ``grpo_*`` as prompt defaults; no more schema-literal regression on a rerun. - ``_collect_rope_scaling(max_length, existing=...)`` — offers "Keep existing RoPE scaling?" when a prior block is loaded. - ``_collect_galore_config(use_galore, existing=...)`` — same "Keep existing GaLore tuning?" prompt. - ``_collect_benchmark(existing=...)`` and ``_collect_judge(existing=...)`` — gate prompt defaults to "yes" when the loaded block has the inner contract field set, so a bare Enter preserves; explicit "n" still drops. Tests updated to match the new semantics: - ``test_existing_benchmark_and_judge_preserved_on_enter`` (renamed) pins the bare-Enter preserve path. - ``test_existing_benchmark_and_judge_dropped_on_explicit_no`` (new) pins the explicit-disable contract. - ``test_auto_revert_default_reflects_existing`` tightened to ``is False`` (was ``in (True, False)`` — non-specific). ## Wizard validation gates "Start training now?" prompt ``_validate_generated_config`` now returns ``bool``; the orchestrator short-circuits the "Start training now?" prompt and the immediate- train branch when validation fails. Pre-fix the prompt was offered even after the operator was told the YAML would fail to load. ## Wizard saves config atomically ``_save_config_to_file`` and the fallback save path now route through new ``_atomic_yaml_write`` helper (``tempfile`` + ``fsync`` + ``os.replace``). An interrupted save during a ``--wizard-start-from`` rerun can no longer corrupt the operator's pre-existing config. ## Exit-code contract sync ``EXIT_WIZARD_CANCELLED = 5`` was added to the contract during Phase 22 review-cycle 2 but the standards / user-manual docs were not all updated in lockstep. Closed: - ``docs/standards/error-handling.md`` — adds the constant + table row. - ``docs/usermanuals/en/reference/cli.md`` — adds row "5 | Wizard cancelled". - ``docs/usermanuals/tr/reference/cli.md`` — same TR mirror. - ``docs/usermanuals/tr/reference/exit-codes.md`` — "Exit kodları 0-4" → "Exit kodları 0-5" matching the EN file. ## Site / JS - ``site/js/wizard.js`` storage-event handler now treats ``e.newValue === null`` (cross-tab reset / delete) as an explicit reset: clears in-memory state, restores defaults, and re-renders. Pre-fix the current tab kept editing a config that no longer existed. - Max-sequence-length wizard row now uses ``tr('wizard.training. max_length.{label,hint}')`` instead of hard-coded English; new locale entries added across all 6 languages so the field localises with the rest of the wizard UI. ## Smaller polish - ``forgelm/wizard/_byod.py`` — ``_finalize_quickstart_path`` quotes the printed ``--config <path>`` via ``shlex.quote`` so suggested shell commands are runnable even when the path contains spaces. Same fix applied in ``_orchestrator.py``'s post-save messages. - ``tools/check_anchor_resolution.py`` — ``--exclude`` help text now honestly describes that empty values are dropped (and points at ``--exclude __none__`` as the no-exclude path) instead of promising a "scan everything" path the code does not execute. - ``tools/check_no_analysis_refs.py`` — replace ``sorted({p for p in files})`` with ``sorted(set(files))`` (readability nit). ## Docs - ``CHANGELOG.md`` ``[Unreleased]`` Phase 22 entry softened to "covers the same nine user-visible stages as the in-browser wizard and produces the same generated YAML" with a cross-reference to the inline NOTE about by-design step-ID divergence. Working-memory cleanup paragraph rewritten to use the generic "working-memory directories" term so the policy-announcement no longer names the gitignored paths verbatim. Same generic rewrite applied to ``docs/roadmap/releases.md``; the corresponding ``CHANGELOG.md`` / ``docs/roadmap/releases.md`` entries removed from ``check_no_analysis_refs.py::_EXEMPT`` because they no longer need the carve-out. - ``docs/design/wizard_mode.md`` — prelude to the step list cross- references the web wizard's different IDs / ``trainer ↔ model`` order swap so readers don't expect lockstep between the two surfaces. - ``docs/reference/architecture{,-tr}.md`` — blank line inserted after the ``### `wizard/`` heading to satisfy markdownlint MD022. ## Skipped (with reason) - **Revert ``EXIT_WIZARD_CANCELLED = 5`` from the public contract** (CodeRabbit @ ``docs/design/wizard_mode.md:16`` and ``forgelm/cli/_exit_codes.py:31``). The 0–5 extension was a deliberate Phase 22 review-cycle 2 decision, captured in the CHANGELOG entry under the "Distinct exit code for wizard cancel" bullet. The right action for the documented-contract gap was to update the standards / user-manual docs (Group A above), not to walk the contract back. ## Validation - ruff check + format clean (178 files). - 14 doc / parity / consistency CI guards green. - ``forgelm --config config_template.yaml --dry-run`` valid. - 159 wizard tests pass; 191 check/cli tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cemililik · 2026-05-09T23:14:29Z

@coderabbitai review

coderabbitai · 2026-05-09T23:14:35Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (4)

docs/design/wizard_mode.md (1)

8-8: 💤 Low value

Remove blank lines inside blockquotes to fix markdown lint warnings.

Multiple linters flag blank lines inside blockquotes (MD028). While this doesn't affect functionality, it breaks markdown formatting conventions.

📝 Proposed fix

Remove the blank line at line 8 (between the two blockquote paragraphs in the status note) and line 19 (between the review-cycle 2 and PR-D blockquote paragraphs), or break them into separate blockquote blocks if the visual separation is intentional.

Also applies to: 19-19
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/design/wizard_mode.md` at line 8, Remove the blank lines inside
blockquotes that trigger MD028: in the "status note" blockquote (between the two
quote paragraphs) and in the blockquote containing "review-cycle 2" and "PR-D"
(the blank line between those quote paragraphs); either delete those empty lines
so the paragraphs remain in a single blockquote or split them into separate
blockquote blocks if intentional spacing is desired, ensuring the blockquote
markers (">") directly precede each paragraph line.

forgelm/wizard/_collectors.py (1)

619-637: 💤 Low value

Extract Article 10 prompt strings to reduce duplication.

The same prompt strings appear three times each in the strict vs non-strict branches. Extracting them as module-level constants improves maintainability.

♻️ Proposed extraction

+_ARTICLE_10_COLLECTION_METHOD = "Article 10(2)(b): how was data collected?"
+_ARTICLE_10_ANNOTATION = "Article 10(2)(b): annotation methodology"
+_ARTICLE_10_BIASES = "Article 10(2)(f): known_biases"
+
 def _collect_data_governance(
     ...
     if mandatory:
         ...
         collection_method = (
-            _prompt("Article 10(2)(b): how was data collected?", existing["collection_method"])
+            _prompt(_ARTICLE_10_COLLECTION_METHOD, existing["collection_method"])
             if existing.get("collection_method")
-            else _prompt_required("Article 10(2)(b): how was data collected?")
+            else _prompt_required(_ARTICLE_10_COLLECTION_METHOD)
         )
         # ... similar for annotation_process and known_biases
     else:
-        collection_method = _prompt("Article 10(2)(b): how was data collected?", existing.get("collection_method", ""))
+        collection_method = _prompt(_ARTICLE_10_COLLECTION_METHOD, existing.get("collection_method", ""))

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@forgelm/wizard/_collectors.py` around lines 619 - 637, The prompt text
literals for Article 10 are duplicated across the strict/non-strict branches;
extract them into module-level constants (e.g., ARTICLE10_COLLECTION_PROMPT,
ARTICLE10_ANNOTATION_PROMPT, ARTICLE10_KNOWN_BIASES_PROMPT) and replace the
inline strings used with _prompt and _prompt_required in the collection_method,
annotation_process, and known_biases assignments so both branches reference the
same constants; keep existing calls to _prompt and _prompt_required unchanged,
only substitute the literal strings with the new constants and update any
imports/exports if needed.

forgelm/wizard/_byod.py (1)

112-122: 💤 Low value

Redundant exception classes in chained except.

PermissionError and IsADirectoryError are subclasses of OSError, so catching them separately alongside OSError is redundant — the OSError handler already covers them. The same pattern appears at line 175. This is harmless but slightly confusing.

♻️ Proposed simplification

-    except (FileNotFoundError, ValueError) as exc:
-        _print(f"  Ingest failed: {exc}")
-        return None
-    except (PermissionError, IsADirectoryError, OSError) as exc:
-        _print(f"  Ingest failed due to filesystem error: {exc} — check permissions or output path.")
-        return None
+    except (FileNotFoundError, ValueError) as exc:
+        _print(f"  Ingest failed: {exc}")
+        return None
+    except OSError as exc:
+        _print(f"  Ingest failed due to filesystem error: {exc} — check permissions or output path.")
+        return None

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@forgelm/wizard/_byod.py` around lines 112 - 122, The except clause currently
catches (PermissionError, IsADirectoryError, OSError) which is redundant because
PermissionError and IsADirectoryError are subclasses of OSError; change the
tuple to just (OSError,) to simplify the handler while keeping behavior the
same, and apply the same simplification to the repeated handler later (the
similar except tuple at line 175) so both places only catch OSError as exc.

forgelm/wizard/_orchestrator.py (1)

982-990: 💤 Low value

Consider renaming the fallback variable to avoid shadowing the class name.

The pattern ForgeConfig = None when the import fails works but triggers static analysis warnings about naming conventions. A clearer alternative uses a distinct name.

♻️ Alternative pattern

     try:
         from ..config import ForgeConfig
+        _forge_config_cls = ForgeConfig
     except ImportError:  # pragma: no cover — config always present
-        ForgeConfig = None  # type: ignore[assignment]
-    if ForgeConfig is not None:
+        _forge_config_cls = None
+    if _forge_config_cls is not None:
         try:
-            ForgeConfig.model_validate(data)
+            _forge_config_cls.model_validate(data)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@forgelm/wizard/_orchestrator.py` around lines 982 - 990, The code currently
assigns ForgeConfig = None on import failure which shadows the imported class
name and triggers static-analysis warnings; change the fallback variable to a
distinct identifier (e.g., _ForgeConfig or ForgeConfigModule) and update the
subsequent conditional and validation call accordingly so you check the new
identifier (e.g., if _ForgeConfig is not None) and call its model_validate
method name (e.g., _ForgeConfig.model_validate(data)) to preserve behavior
without shadowing the class name.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tools/check_no_analysis_refs.py`:
- Around line 49-51: The current _PROHIBITED_RE only matches
docs/(analysis|marketing)/ or relative refs when followed by a hardcoded set of
subfolders, so paths like "analysis/new_topic/..." slip through; update
_PROHIBITED_RE to match any reference to analysis or marketing subtrees (e.g.,
match docs/(analysis|marketing)/ and also (?:^|[^A-Za-z./])(analysis|marketing)/
followed by any path segment) so it flags arbitrary subfolders instead of just
the fixed list—modify the regex in tools/check_no_analysis_refs.py where
_PROHIBITED_RE is defined to broaden the second alternative to allow any
following path component.
- Around line 151-154: The current except block around
path.read_text(encoding="utf-8") swallows OSError/UnicodeDecodeError and returns
no findings (false-green); change it to fail-closed by recording the error into
the findings list instead of silently returning: catch (OSError,
UnicodeDecodeError) and append a finding that includes the path (path) and the
exception message (str(e)) or otherwise raise the exception so CI fails; update
the handler around path.read_text to use the findings list (findings) or
re-raise to ensure unreadable files produce a visible failure.

---

Nitpick comments:
In `@docs/design/wizard_mode.md`:
- Line 8: Remove the blank lines inside blockquotes that trigger MD028: in the
"status note" blockquote (between the two quote paragraphs) and in the
blockquote containing "review-cycle 2" and "PR-D" (the blank line between those
quote paragraphs); either delete those empty lines so the paragraphs remain in a
single blockquote or split them into separate blockquote blocks if intentional
spacing is desired, ensuring the blockquote markers (">") directly precede each
paragraph line.

In `@forgelm/wizard/_byod.py`:
- Around line 112-122: The except clause currently catches (PermissionError,
IsADirectoryError, OSError) which is redundant because PermissionError and
IsADirectoryError are subclasses of OSError; change the tuple to just (OSError,)
to simplify the handler while keeping behavior the same, and apply the same
simplification to the repeated handler later (the similar except tuple at line
175) so both places only catch OSError as exc.

In `@forgelm/wizard/_collectors.py`:
- Around line 619-637: The prompt text literals for Article 10 are duplicated
across the strict/non-strict branches; extract them into module-level constants
(e.g., ARTICLE10_COLLECTION_PROMPT, ARTICLE10_ANNOTATION_PROMPT,
ARTICLE10_KNOWN_BIASES_PROMPT) and replace the inline strings used with _prompt
and _prompt_required in the collection_method, annotation_process, and
known_biases assignments so both branches reference the same constants; keep
existing calls to _prompt and _prompt_required unchanged, only substitute the
literal strings with the new constants and update any imports/exports if needed.

In `@forgelm/wizard/_orchestrator.py`:
- Around line 982-990: The code currently assigns ForgeConfig = None on import
failure which shadows the imported class name and triggers static-analysis
warnings; change the fallback variable to a distinct identifier (e.g.,
_ForgeConfig or ForgeConfigModule) and update the subsequent conditional and
validation call accordingly so you check the new identifier (e.g., if
_ForgeConfig is not None) and call its model_validate method name (e.g.,
_ForgeConfig.model_validate(data)) to preserve behavior without shadowing the
class name.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a7b694a1-f100-44b5-beaf-6c88d3390125

📥 Commits

Reviewing files that changed from the base of the PR and between ecf3dcc and 2dce74e.

📒 Files selected for processing (18)

CHANGELOG.md
docs/design/wizard_mode.md
docs/reference/architecture-tr.md
docs/reference/architecture.md
docs/roadmap/releases.md
docs/standards/error-handling.md
docs/usermanuals/en/reference/cli.md
docs/usermanuals/tr/reference/cli.md
docs/usermanuals/tr/reference/exit-codes.md
forgelm/wizard/_byod.py
forgelm/wizard/_collectors.py
forgelm/wizard/_orchestrator.py
forgelm/wizard/_state.py
site/js/translations.js
site/js/wizard.js
tests/test_wizard_phase22.py
tools/check_anchor_resolution.py
tools/check_no_analysis_refs.py

✅ Files skipped from review due to trivial changes (6)

docs/standards/error-handling.md
docs/reference/architecture.md
docs/reference/architecture-tr.md
docs/roadmap/releases.md
tools/check_anchor_resolution.py
CHANGELOG.md

🚧 Files skipped from review as they are similar to previous changes (4)

docs/usermanuals/en/reference/cli.md
docs/usermanuals/tr/reference/exit-codes.md
site/js/wizard.js
forgelm/wizard/_state.py

…its) Closes 2 valid issues + 4 nitpicks from PR #42's second review pass. ## tools/check_no_analysis_refs.py — guard hardening - **_PROHIBITED_RE broadened to catch arbitrary working-memory subfolders.** The pre-fix regex hardcoded a fixed list of subfolder names (``code_reviews|QKV-Core|Trion|ART|...``); any newly-created folder under ``analysis/`` or ``marketing/`` slipped past the guard. The relative-form alternative now matches ``(analysis|marketing)/<segment>`` followed by either another path separator or a known file extension (``.md`` / ``.py`` / ``.html`` / ``.js`` / ``.css`` / ``.yaml`` / ``.yml`` / ``.json`` / ``.txt`` / ``.jsonl`` / ``.sh`` / ``.cfg`` / ``.toml`` / ``.ini``). The continuation-or-extension constraint preserves the existing no-false-positive-on-prose property: phrases like "marketing/product copy" and "analysis/report results" stay clean (slash-as-"or" in English prose), while real path references (``marketing/strategy/``, ``analysis/code_reviews/x.md``, ``marketing/some-future-folder/z.md``) are flagged. Self-exempt entry extended with the two example patterns the guard's own docstring uses to document this behaviour. - **_check_file fail-closed on read errors.** Pre-fix, an unreadable file (``OSError``) or a file with non-UTF-8 bytes (``UnicodeDecodeError``) silently returned an empty findings list, so CI went green on a stale or malformed input. Now records the exception as a finding (with class + message) so the failure surfaces in the same channel as a real prohibited reference. ## docs/design/wizard_mode.md — markdownlint MD028 - Three distinct status-note blockquotes (``Status``, ``Review-cycle 2 additions``, ``PR-D additions``) had bare blank lines between them, which markdown renderers treat as a single blockquote with a gap and which markdownlint MD028 flags. Split with explicit ```` HTML comments so each block stays semantically distinct without violating the rule. ## forgelm/wizard/_byod.py — except-tuple simplification - ``except (PermissionError, IsADirectoryError, OSError)`` → ``except OSError``. Both ``PermissionError`` and ``IsADirectoryError`` subclass ``OSError``; the bare catch handles every filesystem failure mode. - ``except (FileNotFoundError, ValueError, OSError)`` → ``except (OSError, ValueError)``. ``FileNotFoundError`` subclasses ``OSError``; ``ValueError`` is independent and intentional (audit raises it on malformed JSONL), so it stays in the tuple — this is a deviation from the reviewer's literal "OSError only" suggestion to preserve behaviour. ## forgelm/wizard/_collectors.py — Article 10 prompt constants - Three prompt strings (collection method / annotation methodology / known biases) were duplicated across the strict-tier and non-strict-tier branches of ``_collect_data_governance``. Hoisted to module-level ``_ARTICLE10_{COLLECTION,ANNOTATION,KNOWN_BIASES}_ PROMPT`` constants so a future EU-AI-Act wording tweak only changes one site instead of six. ## forgelm/wizard/_orchestrator.py — class-name shadow - ``from ..config import ForgeConfig`` followed by ``ForgeConfig = None`` on the ``ImportError`` path shadowed the imported class name (static analyzers flag the rebind). Renamed both occurrences to a module-private ``_ForgeConfig`` identifier so the import- failure fallback no longer collides with the public class name. ## Validation - ruff check + format clean (178 files). - 14 doc / parity / consistency CI guards green. - ``forgelm --config config_template.yaml --dry-run`` valid. - 132 wizard-phase22 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

sonarqubecloud · 2026-05-09T23:58:49Z

Quality Gate failed

Failed conditions
10 Security Hotspots
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud

Catch issues before they fail your Quality Gate with our IDE extension SonarQube for IDE

Promotes the [Unreleased] block into the [0.5.5] section under a new "Post-merge follow-up" subsection that captures the three PRs landed on top of the paper-merged closure cycle: - PR #40 (2026-05-08) — Phase 22 CLI wizard parity-with-web - PR #41 (2026-05-09) — Site documentation correction sweep + i18n parity backfill - PR #42 (2026-05-10) — Release-prep + nightly pip-audit gate fix + two review-absorption rounds The CHANGELOG [0.5.5] header date is bumped to 2026-05-10 (the actual PyPI release date) and a fresh empty [Unreleased] block is added at the top. __api_version__ stays at 1.0.0 — Phase 22 is a CLI/wizard surface change, not a Python library API addition; the 30-symbol forgelm.__all__ surface that this release anchors was already added during the Phase 12.6 closure cycle (paper-merged as v0.5.5). Roadmap (en + tr) marks v0.5.5 as released; "Latest release on PyPI" flipped from v0.5.0 to v0.5.5; "Current state" sentence simplified (no more "merged but tag deferred" framing). releases.md drops the "Next PyPI release (slot pending)" section since v0.5.5 now bundles everything that was staged in [Unreleased]. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cemililik and others added 16 commits May 9, 2026 00:10

docs(changelog): Phase 22 wizard PR #40 review-cycle 1

7d08bba

CHANGELOG entry for the c0dd57a fix sweep — Critical / Major / Medium / Minor classifications + the orchestrator coverage bump that landed alongside. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Merge pull request #40 from cemililik/feat/phase22-wizard-modernisation

1aa457b

feat(wizard): Phase 22 — CLI wizard parity with web + sub-package split

Merge pull request #41 from cemililik/closure/site-honesty-pass5

b4b8126

fix(site): pass-5 closure — site-vs-code honesty sweep + i18n parity

sourcery-ai Bot reviewed May 9, 2026

View reviewed changes

gemini-code-assist Bot reviewed May 9, 2026

View reviewed changes

coderabbitai Bot reviewed May 9, 2026

View reviewed changes

Comment thread tools/check_no_analysis_refs.py

Comment thread tools/check_no_analysis_refs.py

cemililik merged commit 16d00f6 into main May 10, 2026
9 of 11 checks passed

		yaml.safe_dump(config, f, default_flow_style=False, sort_keys=False, allow_unicode=True)
		_print(f"\n Config saved to: {requested_filename}")

		counter += 1


		def _validate_generated_config(config_filename: str) -> None:

		> cross-tab `storage` sync, distinct `EXIT_WIZARD_CANCELLED = 5`
		> exit code, best-effort `readline` integration. Schema-default

Conversation

cemililik commented May 9, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What changed (high level)

Test plan

Notes

Summary by CodeRabbit

Uh oh!

qodo-code-review Bot commented May 9, 2026

Uh oh!

sourcery-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Uh oh!

codacy-production Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Not up to standards ⛔

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

cemililik commented May 9, 2026

Uh oh!

cemililik commented May 9, 2026

External-scanner verdict — both Quality Gates fail on pre-existing intentional fixtures

SonarCloud — all 10 hotspots verified pre-existing false positives

Codacy — 10 issues (5 high · 2 medium · 3 minor)

Verdict

cemililik commented May 9, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 9, 2026 •

edited

Loading

codacy-production Bot commented May 9, 2026 •

edited

Loading