diff --git a/docs/AI-discoverability-guide.md b/docs/AI-discoverability-guide.md index 9c87848..99084f5 100644 --- a/docs/AI-discoverability-guide.md +++ b/docs/AI-discoverability-guide.md @@ -309,7 +309,7 @@ tool:m-cli cmd:m-cli#test cmd:m-cli#ci.init module:m-stdlib#STDJSON -doc:m-tools#gap-analysis +doc:m-dev-tools#m-tool-gap-analysis data:m-standard#grammar-surface workflow:tdd_inner_loop task:workflow.scaffold_new_project diff --git a/docs/AI-discoverability-plan.md b/docs/AI-discoverability-plan.md index 144a66f..22ae365 100644 --- a/docs/AI-discoverability-plan.md +++ b/docs/AI-discoverability-plan.md @@ -117,7 +117,7 @@ For each repo, in order: | `tree-sitter-m-vscode` | exists | `package.json` already declares it; wrap | needs add | tier 3 | | `m-stdlib-vscode` | needs check | `package.json` + manifest discovery config | needs add | tier 3 | | `m-cli-extras` | unknown | dump plugin entry points to JSON | needs add | tier 3 | -| `m-tools` | exists | **archived** — emit `status: archived` only | n/a | optional | +| `m-tools` | archived (upstream) | **rehosted** under [`.github/docs/history/`](history/); dropped from `tools.json` | n/a | resolved | Until tier 1 emits machine-readable artifacts, the org catalog is fiction. @@ -418,8 +418,11 @@ Goal: freshness, link-check, license-reconcile gates running weekly in CI. `CONTRIBUTING.md` is ~30 lines pointing at each repo's own contribution guide. 4. **No history/archive routing in the catalog.** `m-tools` is archived; - either rehost its docs in `.github/docs/history/` or drop them from - `tools.json`. Agents care about the current shape. + its three design docs are rehosted under + [`.github/docs/history/`](history/) and routed via `task_index.json`'s + `history` category as `doc:m-dev-tools#` typed IDs. The + `tool:m-tools` entry is dropped from `tools.json` — agents care about + the current shape, not retired tools. 5. 
**No human/AI documentation split per repo.** One `AGENTS.md` per repo; AI-specific sections marked inline. Two parallel docs always drift.
diff --git a/docs/history/README.md b/docs/history/README.md
new file mode 100644
index 0000000..3fe36da
--- /dev/null
+++ b/docs/history/README.md
@@ -0,0 +1,55 @@
+# Historical documents
+
+Frozen, in-org copies of design documents from now-archived repositories in
+the `m-dev-tools` org. The original repos remain read-only on GitHub; these
+copies exist so the *why* behind the current org shape stays discoverable
+inside `.github` itself, immune to upstream pruning or renames.
+
+These documents are **not maintained**. They reflect the state of the world
+at the moment they were imported. For the *current* shape of the org, start
+at [`profile/README.md`](../../profile/README.md) and
+[`profile/tools.json`](../../profile/tools.json).
+
+## Contents
+
+| Document | Source | Imported from commit | Why it's preserved |
+|---|---|---|---|
+| [`m-tool-gap-analysis.md`](m-tool-gap-analysis.md) | `m-dev-tools/m-tools/docs/m-tool-gap-analysis.md` | [`16fe3f7`](https://github.com/m-dev-tools/m-tools/commit/16fe3f7dc6982070809cd1d8290d01fedc5905ac) (2026-04-27) | The Go/Rust/Python toolchain comparison that produced the `m ` design and `m-cli`'s CLI ergonomics. |
+| [`m-tooling-tier1.md`](m-tooling-tier1.md) | `m-dev-tools/m-tools/docs/m-tooling-tier1.md` | [`16fe3f7`](https://github.com/m-dev-tools/m-tools/commit/16fe3f7dc6982070809cd1d8290d01fedc5905ac) (2026-04-27) | The scoped Tier-1 strategy that defined what `m-cli` shipped first (fmt / lint / test / coverage / watch / LSP). |
+| [`gap-analysis-and-remediation-strategy.md`](gap-analysis-and-remediation-strategy.md) | `m-dev-tools/m-tools/docs/gap-analysis-and-remediation-strategy.md` | [`16fe3f7`](https://github.com/m-dev-tools/m-tools/commit/16fe3f7dc6982070809cd1d8290d01fedc5905ac) (2026-04-27) | The phased remediation roadmap that produced both `m-cli` and `m-stdlib`.
|
+
+## Provenance policy
+
+- **Imported verbatim**, with a single `> Archived snapshot.` banner added
+  after each H1 to make the rehosting fact visible inline.
+- **No rewrites, no link-rot patching**, except where a *sibling-doc* link
+  pointed at a file we did not rehost — those links were retargeted at the
+  archived upstream repo (read-only) so they still resolve.
+- **Typed IDs** for these documents live under
+  [`profile/task_index.json`](../../profile/task_index.json) (category
+  `history`). The grammar is `doc:m-dev-tools#`.
+
+## Adding a new historical doc
+
+Trigger: another repo in the org is archived and contains design rationale
+that future agents/contributors will benefit from reading.
+
+1. Copy the file(s) verbatim into this directory.
+2. Add the `> Archived snapshot.` banner immediately after the H1, citing
+   the source repo, source commit hash, and date.
+3. Append a row to the table above.
+4. Add a `doc:m-dev-tools#` typed ID to `task_index.json` under
+   the `history` category, with an `intent` line that names the
+   plain-English question the doc answers.
+5. Run `make validate-catalog` to confirm the typed IDs validate.
+6. Open a PR titled `chore(history): rehost /`.
+
+## Not on this list
+
+- **`m-tools/docs/implementation.md`** — implementation log; superseded by
+  `m-cli/docs/evolution.md` and `m-cli/docs/plans/m-cli-history-and-evolution.md`.
+- **`m-tools/docs/ydb-dev-tools-gap-analysis.md`** — 10-line stub; no
+  preserved content worth rehosting.
+
+Both remain reachable in the archived `m-tools` repo on GitHub for anyone
+who wants the deeper context.
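Reviewer sketch (not part of the patch): the banner rule in the "Adding a new historical doc" checklist lends itself to an automated check. The helper below is hypothetical; `has_archive_banner` and `BANNER_PREFIX` are illustrative names, and it assumes the banner is a blockquote that opens with the bold `Archived snapshot.` marker, as in the rehosted documents.

```python
# Hypothetical check: is the first non-blank line after the H1 the archive banner?
BANNER_PREFIX = "> **Archived snapshot.**"  # assumed banner opening, per the rehosted docs


def has_archive_banner(markdown: str) -> bool:
    """Return True if the archive banner immediately follows the first H1."""
    lines = markdown.splitlines()
    for index, line in enumerate(lines):
        if line.startswith("# "):  # first H1 heading
            for following in lines[index + 1:]:
                if following.strip():  # first non-blank line after the H1
                    return following.startswith(BANNER_PREFIX)
            return False  # H1 was the last non-blank line
    return False  # no H1 found at all
```

A check like this could run next to `make validate-catalog`, though how that target is wired up is not shown in this diff.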
diff --git a/docs/history/gap-analysis-and-remediation-strategy.md b/docs/history/gap-analysis-and-remediation-strategy.md
new file mode 100644
index 0000000..85cdbc7
--- /dev/null
+++ b/docs/history/gap-analysis-and-remediation-strategy.md
@@ -0,0 +1,1313 @@
+# M Tools — Gap Analysis and Remediation Strategy
+
+> **Archived snapshot.** Imported verbatim from [`m-dev-tools/m-tools`](https://github.com/m-dev-tools/m-tools) — source commit [`16fe3f7`](https://github.com/m-dev-tools/m-tools/commit/16fe3f7dc6982070809cd1d8290d01fedc5905ac) (2026-04-27), before that repo was archived. Preserved as the original phased remediation roadmap that scoped what became `m-cli` and `m-stdlib`. **Not maintained.** For the *current* shape of the org, start at [`profile/README.md`](../../profile/README.md).
+
+**Document type:** Strategic planning
+**Scope:** Developer toolchain for the M (MUMPS) language
+**Audience:** Developers building productivity tools for the M ecosystem
+**Sibling document:** [`implementation.md`](https://github.com/m-dev-tools/m-tools/blob/main/docs/implementation.md) — what's actually shipped *(not rehosted; resolves to the archived m-tools repo, which remains read-only on GitHub)*
+
+---
+
+## Scope and portability
+
+This document analyses the developer experience for the M (MUMPS) programming language. M itself is a portable, ISO-standardised language — implementations include InterSystems IRIS, YottaDB, GT.M, and historically others. The *toolchain* this analysis recommends (linters, formatters, AST tools, package managers) operates over `.m` source code and is portable in principle to any conformant M runtime.
+
+In practice this project uses **YottaDB as the foundation runtime** for two reasons:
+
+1. **YottaDB is open source under AGPL-3.0**, which makes it the most portable foundation for non-commercial / community tooling — anyone can install it and run the full toolchain end-to-end without licence negotiation.
A toolchain bound to a closed-source runtime would be unreproducible for most contributors and unusable in CI without per-developer licensing. Open source means the toolchain is genuinely portable in the practical sense, not just the theoretical sense.
+2. **YottaDB's command-line surface is mature and well-documented.** `mupip` (database management), `gde` (global directory editor), `lke` (lock examination), `dse` (database structure editor), and the `ydb` runtime — together with the `%XCMD` mechanism for one-shot M execution — give a concrete substrate to integrate with. These vendor tools are not wrapped or renamed by this project; they are used directly, with their own `--help` as canonical documentation.
+
+Where this matters in the analysis: tool *recommendations* (e.g., "auto-formatter using `tree-sitter-m`") are M-portable. Tool *implementations* that touch a runtime (e.g., the test runner, coverage instrumentation, the trace tail) are YottaDB-bound today and would need a runtime-adapter layer to run against IRIS or other implementations. The shell-level naming convention reflects this split: portable analysis commands use `m `; runtime-bound commands use `ydb `. See [implementation.md](https://github.com/m-dev-tools/m-tools/blob/main/docs/implementation.md) for the canonical command map and as-built status.
+
+---
+
+## Table of Contents
+
+- [1. Introduction — The Problem](#1-introduction--the-problem)
+- [2. Comprehensive Gap Analysis](#2-comprehensive-gap-analysis)
+- [3. 
Strategic Recommendations](#3-strategic-recommendations)
+  - [3.1 Tier 1 — Immediate (high impact, low effort)](#31-tier-1--immediate-high-impact-low-effort)
+  - [3.2 Tier 2 — Short term (high impact, medium effort)](#32-tier-2--short-term-high-impact-medium-effort)
+  - [3.3 Tier 3 — Medium term (medium impact, medium/high effort)](#33-tier-3--medium-term-medium-impact-mediumhigh-effort)
+  - [3.4 Tier 4 — Long term / aspirational](#34-tier-4--long-term--aspirational)
+- **[Addendum A: Technology-Optimal Remediation Strategy](#addendum-a-technology-optimal-remediation-strategy)**
+  - [A.1 — The Foundation Problem: MUMPS Needs a Parser](#a1--the-foundation-problem-mumps-needs-a-parser)
+  - [A.2 — Technology Selection Matrix](#a2--technology-selection-matrix)
+  - [A.3 — The Database Layer: ZWR Format as Universal Interface](#a3--the-database-layer-zwr-format-as-universal-interface)
+  - [A.4 — The Instrumentation Layer: Observability Without a Profiler](#a4--the-instrumentation-layer-observability-without-a-profiler)
+  - [A.5 — Per-Gap Remediation (Major Gaps 🔴)](#a5--per-gap-remediation-major-gaps-)
+  - [A.6 — Per-Gap Remediation (Moderate Gaps 🟡)](#a6--per-gap-remediation-moderate-gaps-)
+- **[Addendum B: Prioritized Sequence of Remediation (Post-Parser)](#addendum-b-prioritized-sequence-of-remediation-post-parser)**
+  - [B.1 — Sequencing Principles](#b1--sequencing-principles)
+  - [B.2 — Phase 1: Canonicalise the Codebase](#b2--phase-1-canonicalise-the-codebase)
+  - [B.3 — Phase 2: Catch Bugs Before Runtime](#b3--phase-2-catch-bugs-before-runtime)
+  - [B.4 — Phase 3: Replace Approximations with Truth](#b4--phase-3-replace-approximations-with-truth)
+  - [B.5 — Phase 4: Interactive Surfaces (No Parser Dep)](#b5--phase-4-interactive-surfaces-no-parser-dep)
+  - [B.6 — Phase 5: Ecosystem Layer](#b6--phase-5-ecosystem-layer)
+  - [B.7 — Cross-Cutting: Umbrella Dispatcher Rename](#b7--cross-cutting-umbrella-dispatcher-rename)
+  - [B.8 — Sequence 
Summary](#b8--sequence-summary)
+- **[Appendix B: Gold Standard — Top 5 Language Toolchains](#appendix-b-gold-standard--top-5-language-toolchains)**
+  - [B.1 Python](#b1-python)
+  - [B.2 JavaScript / TypeScript](#b2-javascript--typescript)
+  - [B.3 Go](#b3-go)
+  - [B.4 Rust](#b4-rust)
+  - [B.5 Java](#b5-java)
+- **[Appendix C: What Ships with YottaDB (Foundation Runtime)](#appendix-c-what-ships-with-yottadb-foundation-runtime)**
+  - [C.1 Runtime and Interactive Tools](#c1-runtime-and-interactive-tools)
+  - [C.2 MUPIP — Database Management Utility](#c2-mupip--database-management-utility)
+  - [C.3 Auxiliary Utilities](#c3-auxiliary-utilities)
+  - [C.4 MUMPS Intrinsic Debugging Commands](#c4-mumps-intrinsic-debugging-commands)
+  - [C.5 Percent-Sign Utility Routines](#c5-percent-sign-utility-routines)
+
+---
+
+## 1. Introduction — The Problem
+
+### Background
+
+MUMPS (Massachusetts General Hospital Utility Multi-Programming System), now standardized as M, is a programming language and integrated hierarchical database that has been in continuous production use since 1966. It powers the majority of the world's large-scale healthcare IT infrastructure — Epic Systems, MEDITECH, the U.S. Department of Veterans Affairs' VistA system, and many others collectively manage hundreds of millions of patient records in MUMPS databases. M is implemented by several runtimes: InterSystems IRIS (commercial), YottaDB (open source, the foundation used here), GT.M (the open-source ancestor of YottaDB), and historically by several other vendors.
+
+Despite this operational scale and longevity, the developer experience around M has received comparatively little investment in tooling. The language itself predates virtually every modern software development practice: unit testing, continuous integration, static analysis, code coverage, package management, and automated formatting all emerged decades after M was in widespread use.
As a result, the ecosystem of developer productivity tools that mainstream language communities take for granted simply does not exist in the M world.
+
+### The Core Problem
+
+A developer arriving at an M codebase from Python, Go, JavaScript, Rust, or Java faces a jarring regression in developer experience. The gap is not merely cosmetic — it affects every stage of the development lifecycle:
+
+**Edit:** No formatter exists. Code style is enforced only by convention and discipline. There is no equivalent of `black`, `gofmt`, or `prettier` to keep a codebase consistent without manual effort.
+
+**Lint:** The only available static check is syntax validation (`zcompile`). There is no analysis of logic errors, unused variables, unreachable code, missing QUIT statements, undefined labels, or style violations. Python's `ruff`/`pylint`, Go's `golangci-lint`, and Rust's `clippy` all catch categories of bugs before runtime; M has nothing comparable.
+
+**Test:** A test framework (`TESTRUN.m`) exists in this project, but the tooling around it is primitive. There is no way to run a single test case without running the entire suite, no coverage measurement, no test history, and no parallel execution. The `make watch` command reruns *all* tests on every file save — a workflow that degrades as the test suite grows.
+
+**Debug:** M has built-in debugging commands (`ZBREAK`, `ZSTEP`, `ZSHOW`) but they are interactive and require entering the runtime manually. There is no scriptable debugger, no conditional breakpoint wrapper, and no integration with any IDE debugger protocol.
+
+**Observe:** The integrated database is both a strength and an observability challenge. Globals are persistent and shared across processes, which makes it easy to accidentally carry test state between runs. There is no tool to snapshot the database state before a test, compare it after, or reliably reset it to a known fixture. The trace log (`^trace`) exists but cannot be tailed live.
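Reviewer sketch (not part of the archived document): the snapshot-and-compare gap in the **Observe** paragraph can be illustrated with a plain-text diff of two global dumps. This is a minimal sketch under stated assumptions: the dumps are in ZWR format (one `^global(subscripts)=value` record per line after the header lines), produced for instance with `mupip extract`. The function names are hypothetical, and records whose values span multiple lines are not handled.

```python
# Hypothetical helpers for diffing two ZWR-format global dumps.

def split_zwr_record(line: str) -> tuple[str, str]:
    """Split a record at the first '=' that sits outside double quotes."""
    in_quotes = False
    for index, char in enumerate(line):
        if char == '"':
            in_quotes = not in_quotes  # a doubled quote ("") toggles twice, so state is preserved
        elif char == "=" and not in_quotes:
            return line[:index], line[index + 1:]
    raise ValueError(f"not a ZWR record: {line!r}")


def parse_zwr(text: str) -> dict[str, str]:
    """Map each global reference to its value; header and blank lines are skipped."""
    nodes = {}
    for line in text.splitlines():
        if line.startswith("^"):  # assumption: only data records start with '^'
            reference, value = split_zwr_record(line)
            nodes[reference] = value
    return nodes


def diff_snapshots(before: str, after: str) -> dict[str, list[str]]:
    """Report which global nodes were added, removed, or changed between two dumps."""
    old, new = parse_zwr(before), parse_zwr(after)
    return {
        "added": sorted(ref for ref in new if ref not in old),
        "removed": sorted(ref for ref in old if ref not in new),
        "changed": sorted(ref for ref in new if ref in old and new[ref] != old[ref]),
    }
```

A test wrapper could take one dump before a suite and one after, then fail the run if `added` or `changed` lists any global outside an allowed set. That policy is an illustration, not something the document specifies.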
+
+**Integrate:** There are no pre-commit hooks, no CI pipeline script, no coverage gate, and no automated quality check that runs before code is committed or deployed.
+
+### Why This Matters
+
+The consequence of this tooling gap is not merely inconvenience. It means that:
+
+1. **Bugs that would be caught automatically in other ecosystems reach manual testing** — or production. A Go developer's `go vet` or a Python developer's `mypy` catches entire categories of errors before a single test runs. In M, these categories are only caught when the faulty code path is manually exercised.
+
+2. **The feedback loop is slower and more manual.** A Rust developer running `cargo watch -x test` gets sub-second feedback on every save. An M developer runs `make test`, waits for all 11 suites, and manually reads output. As the codebase grows, this degrades.
+
+3. **Onboarding new developers is harder.** Modern languages have toolchains that enforce consistency and provide guardrails. M has neither, so new developers must learn conventions that are undocumented and unenforced.
+
+4. **The barrier to contribution is higher.** Open-source projects with good tooling (formatters, lint gates, coverage requirements) attract more contributors because the bar for a "correct" contribution is clear and automatically checkable.
+
+### The Strategic Opportunity
+
+YottaDB's runtime is mature, performant, and POSIX-compliant. The runtime provides powerful hooks — `%XCMD` for one-shot execution, `$ZHOROLOG` for microsecond timing, `ZSHOW` for full process introspection, `mupip extract` for database export, and a straightforward routine compilation model. These are the building blocks of a complete developer toolchain. What is missing is the shell toolchain layer that assembles these primitives into a coherent, ergonomic developer experience comparable to what Python, Go, and Rust developers have.
Because YottaDB is open source, every layer of this toolchain is reproducible without licence negotiation — a property no closed-source M runtime can offer.
+
+This document surveys what currently exists, maps the complete gap against the toolchains of the five most popular programming languages (see [Appendix B](#appendix-b-gold-standard--top-5-language-toolchains) for the per-language reference tables), and proposes a prioritized roadmap of shell tools that can be built now using existing YottaDB capabilities.
+
+---
+
+## 2. Comprehensive Gap Analysis
+
+This chapter maps every significant developer toolchain category against four reference points: what the gold standard provides (synthesized from the toolchains of Python, JavaScript/TypeScript, Go, Rust, and Java — see [Appendix B](#appendix-b-gold-standard--top-5-language-toolchains) for the per-language tables), what YottaDB ships with natively (see [Appendix C](#appendix-c-what-ships-with-yottadb-foundation-runtime)), what this project has built (see [implementation.md](https://github.com/m-dev-tools/m-tools/blob/main/docs/implementation.md) for the live status), and the remaining gap with severity.
+
+**Severity key:** 🔴 Major gap (daily pain) · 🟡 Moderate gap (occasional friction) · 🟢 Minor gap or N/A
+
+**Status legend:** ✅ shipped (Tier 1–3) · 🟢 unblocked (parser foundation now exists in [`tree-sitter-m`](https://github.com/rafael5/tree-sitter-m); Tier 4 tool not yet built) · ⏸ deferred (no parser dep; awaiting demand) · 🟢/🟡/🔴 = original severity
+
+| Category | Gold Standard | YDB Native | This Project | Original Sev | Status |
+|----------|--------------|------------|--------------|--------------|--------|
+| **Syntax check** | Per-file, fast, exit-code | `zcompile` via `%XCMD` | `ycheck` | 🟢 | ✅ shipped (with known exit-code bug — see TODO.md) |
+| **Interactive REPL** | History, completion, multiline | `ydb` direct mode (bare) | `yeval` (single expression) | 🟡 | ⏸ Tier 4 (`yrepl` — needs prompt_toolkit) |
+| **Lint — style** | Configurable style rules | Nothing | Nothing | 🔴 | 🟢 unblocked (tree-sitter-m AST visitor) |
+| **Lint — logic** | Unused vars, unreachable code, missing returns | Nothing | Nothing | 🔴 | 🟢 unblocked (tree-sitter-m + CFG analysis) |
+| **Lint — deep** | Data flow, type errors, null safety | Nothing | Nothing | 🔴 | 🟢 unblocked (tree-sitter-m + whole-program call graph) |
+| **Auto-formatter** | Zero-config, deterministic | Nothing | Nothing | 🔴 | 🟢 unblocked (tree-sitter-m CST pretty-printer) |
+| **Type checking** | Full static type analysis | N/A (untyped language) | N/A | 🟢 | N/A by language design |
+| **Run all tests** | `make test` / `cargo test` | Nothing | `make test` | 🟢 | ✅ pre-existing |
+| **Run one suite** | Select by name/path | Nothing | `ytest ` | 🔴 | ✅ Tier 1 |
+| **Run one test** | Select individual test case | Nothing | `ytest