v0.15.0
Added — extensions declare diagnostic codes; the host validates [diagnostics.rules] against them (#659)
Extension-emitted [diagnostics.rules] entries (<namespace>.<code>) are now schema-validated against the resolved registry, so a misspelled or undeclared code is reported instead of silently matching nothing.
- Schema declares codes. lex_extension::schema::Schema gains an optional diagnostics list: each entry carries code, an optional description, and a default_severity (defaulting to warning). The field is additive: schemas that omit it load with no declared codes, and built-in lex.* schemas declare none (they surface diagnostics through lex-analysis, not the extension code path).
- Registry accessor. Registry::declared_diagnostic_codes(namespace) aggregates and de-dupes the codes a namespace's schemas declare, returning None for an unregistered namespace so callers can distinguish "unknown namespace" from "known namespace, undeclared code".
- Host validation. lex_fmt::validate_extension_diagnostic_rules classifies each rule key: an unregistered namespace passes (rules may be staged ahead of installing the extension), a declared code passes, and an undeclared code under a registered namespace is reported as a dead letter with a closest-match "did you mean … ?" suggestion and the list of declared codes. The LSP runs this after registry boot and surfaces findings via window/showMessage.
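The three-way classification described above can be sketched roughly as follows. This is an illustration only, not the lex_fmt implementation: the RuleKeyStatus enum, the classify_rule_key and edit_distance helpers, the map-of-sets registry, and the choice to split the key at its last dot are all assumptions made for the example.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical outcome of checking one `<namespace>.<code>` rule key.
#[derive(Debug, PartialEq)]
enum RuleKeyStatus {
    /// Namespace is not registered; the rule may be staged ahead of install.
    UnregisteredNamespace,
    /// The namespace declares this code.
    Declared,
    /// Registered namespace, undeclared code: a dead letter.
    DeadLetter { suggestion: Option<String>, declared: Vec<String> },
}

/// Levenshtein edit distance using two rolling rows.
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    let mut prev: Vec<usize> = (0..=b.len()).collect();
    for (i, ca) in a.iter().enumerate() {
        let mut cur = vec![i + 1];
        for (j, cb) in b.iter().enumerate() {
            let substitute = prev[j] + usize::from(ca != cb);
            let value = substitute.min(prev[j + 1] + 1).min(cur[j] + 1);
            cur.push(value);
        }
        prev = cur;
    }
    prev[b.len()]
}

/// Classify a rule key against a registry of namespace -> declared codes.
fn classify_rule_key(
    registry: &BTreeMap<String, BTreeSet<String>>,
    key: &str,
) -> RuleKeyStatus {
    // Assumption: codes contain no dots, so the last dot separates the
    // namespace (which may itself be dotted) from the code.
    let Some((namespace, code)) = key.rsplit_once('.') else {
        // No dot at all: not an extension key, so this sketch lets it pass.
        return RuleKeyStatus::UnregisteredNamespace;
    };
    let Some(codes) = registry.get(namespace) else {
        return RuleKeyStatus::UnregisteredNamespace;
    };
    if codes.contains(code) {
        return RuleKeyStatus::Declared;
    }
    // Closest declared code by edit distance feeds the "did you mean" hint.
    let suggestion = codes
        .iter()
        .min_by_key(|candidate| edit_distance(candidate, code))
        .cloned();
    RuleKeyStatus::DeadLetter {
        suggestion,
        declared: codes.iter().cloned().collect(),
    }
}
```

Returning a distinct status for the unregistered-namespace case mirrors the None-versus-Some split of the registry accessor, which is what lets staged rules pass without a warning.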
Fixed — paragraph lines split only by indentation no longer merge on format (#699)
A paragraph whose continuation lines were merely more-indented (alignment / hanging indent) was split into separate sibling paragraphs by the parser, then re-merged into one paragraph after formatting normalized the indent — a silent semantic change across a round-trip. Such hanging-indent continuations now fold back into the paragraph at parse time (real blank-line breaks are preserved).
Changed — removed the open-form data marker; unrecognized :: label lines are kept as text (#700)
There was never a real "open form" of a data marker. A :: label with no closing :: was classified as a distinct token that no grammar rule consumed, so the parser silently dropped such lines (and a definition whose sole body was one collapsed into a paragraph). Following Lex's rule that anything unrecognized becomes a paragraph — be forgiving, never lose content — these lines now classify as paragraph text.
- New diagnostic. An unclosed-annotation Warning flags a paragraph line shaped like :: label … with no closing ::, so authors know it looks like metadata but is treated as content. Configurable via [diagnostics.rules].
- LineType::DataLine and its classifier are removed. Closed-form :: label :: (annotations, verbatim closings) is unchanged.
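As a rough illustration of the shape the unclosed-annotation warning looks for, the check below treats a line as suspicious when it opens with :: plus some label text and never closes with a second ::. The function name and the exact matching rules are assumptions for the example; the real classifier may differ in details such as whitespace handling.

```rust
/// Returns true when a line is shaped like `:: label ...` with no closing `::`.
/// Hypothetical helper, not the lex-analysis implementation.
fn looks_like_unclosed_annotation(line: &str) -> bool {
    let trimmed = line.trim();
    // The line must open with the `::` marker.
    let Some(rest) = trimmed.strip_prefix("::") else {
        return false;
    };
    // Some label text must follow the opening marker.
    let rest = rest.trim_start();
    if rest.is_empty() {
        return false;
    }
    // A closed form carries a second `::` somewhere after the label.
    !rest.contains("::")
}
```

Under these assumptions, a line such as ":: note remember this" would be flagged while ":: note ::" would not.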
Fixed — table column alignment is read from the markdown separator row (#702)
The separator row's colon hints (:--- left, ---: right, :---: center) were detected only to be discarded; alignment was sourced solely from the :: table align=… :: parameter. Markdown-style aligned tables now keep their alignment across a format round-trip. The explicit align= parameter still overrides the separator row.
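A minimal sketch of reading the colon hints, assuming a pipe-delimited separator row. The Align enum and the function names are made up for the illustration and are not the parser's own types.

```rust
/// Column alignment derived from a separator cell (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Align {
    Left,
    Right,
    Center,
    Unspecified,
}

/// Parse one separator cell such as `:---`, `---:`, `:---:` or `---`.
fn parse_separator_cell(cell: &str) -> Option<Align> {
    let cell = cell.trim();
    let left = cell.starts_with(':');
    let right = cell.ends_with(':');
    // Strip at most one colon from each side; only dashes may remain.
    let dashes = cell.strip_prefix(':').unwrap_or(cell);
    let dashes = dashes.strip_suffix(':').unwrap_or(dashes);
    if dashes.is_empty() || !dashes.chars().all(|c| c == '-') {
        return None;
    }
    Some(match (left, right) {
        (true, true) => Align::Center,
        (true, false) => Align::Left,
        (false, true) => Align::Right,
        (false, false) => Align::Unspecified,
    })
}

/// Parse a whole separator row; returns None if any cell is not a separator.
fn parse_separator_row(row: &str) -> Option<Vec<Align>> {
    let row = row.trim();
    let row = row.strip_prefix('|').unwrap_or(row);
    let row = row.strip_suffix('|').unwrap_or(row);
    row.split('|').map(parse_separator_cell).collect()
}

/// The explicit `align=` parameter, when present, overrides the separator row.
fn effective_alignment(explicit: Option<Align>, from_row: Align) -> Align {
    explicit.unwrap_or(from_row)
}
```

The last helper encodes the precedence stated above: the separator row is only consulted when no explicit align= value is given.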
Fixed — annotation parameters keep their comma separator on format (#703)
The Lex serializer joined annotation parameters with a space instead of a comma, so :: warning type=critical, id=123 :: re-serialized as :: warning type=critical id=123 :: and re-parsed to a single parameter. Parameters now re-emit comma-separated.
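The round-trip failure is easy to reproduce with a toy serializer and parser. Both functions below are hypothetical stand-ins written for this note, assuming parameters are split on commas when parsed.

```rust
/// Serialize an annotation with comma-separated parameters (toy version).
fn serialize_annotation(label: &str, params: &[(&str, &str)]) -> String {
    let joined = params
        .iter()
        .map(|(key, value)| format!("{key}={value}"))
        .collect::<Vec<_>>()
        .join(", ");
    if joined.is_empty() {
        format!(":: {label} ::")
    } else {
        format!(":: {label} {joined} ::")
    }
}

/// Parse a raw parameter string by splitting on commas (toy version).
fn parse_params(raw: &str) -> Vec<(String, String)> {
    raw.split(',')
        .filter_map(|part| part.trim().split_once('='))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}
```

With a comma-splitting parser, the space-joined form "type=critical id=123" yields one parameter whose value is "critical id=123", which is the silent change the fix removes.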
Changed — extension diagnostic codes carry their namespace on the wire (#657)
Follow-up to #636: [diagnostics.rules] now accepts extension-emitted codes (acme.task-due-date-missing, mit.plasma-specs.invalid-version) alongside built-ins, configured identically.
- Wire-format change (breaking). DiagnosticKind::Handler::code() returns the namespace-prefixed form: a handler emitting code: Some("foo") under namespace acme produces wire code = "acme.foo" (previously bare "foo"). When the handler omits a code, the fallback is per-namespace "acme.diagnostic" rather than the old global literal "handler.diagnostic". Any consumer matching on the bare wire code for an extension diagnostic must update to the dotted form.
- [diagnostics.rules] extension keys. DiagnosticsRulesConfig gains an extra: BTreeMap<String, RuleConfig> map. Any key under [diagnostics.rules] that doesn't match a built-in field flows into extra via #[serde(flatten)]; clapfig's strict-mode validator sees them as consumed (the same #[serde(flatten)] is pushed onto the confique-generated Layer field via #[config(layer_attr(...))]). Tradeoff: typo detection for built-in field names is sacrificed at this attribute level, so a misspelled missing_footote now lands in extra instead of erroring. Schema-based validation (deferred, follow-up issue) will restore typo detection once extension schemas declare their codes.
- lookup_by_code semantics. Resolution order is named built-in field → extra map → None. Built-ins always win: a stray extra entry with a built-in code does not override the typed surface.
- apply_rules semantics are unchanged: allow drops the diagnostic, warn keeps intrinsic severity, deny upgrades to Error. Identical code path to built-ins; extension support is purely a lookup-table extension.
- Lenient. Any string key is accepted into extra without further validation. Entries that never match anything are harmless (ESLint / Clippy convention).
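The wire-code prefixing and the lookup order can be illustrated with a small stand-in. The RulesSketch struct, the Rule enum, and the use of "spellcheck" as the only built-in field are simplifications assumed for this example; the real DiagnosticsRulesConfig has one typed field per built-in code.

```rust
use std::collections::BTreeMap;

/// Namespace-prefixed wire code, with the per-namespace fallback.
fn wire_code(namespace: &str, handler_code: Option<&str>) -> String {
    match handler_code {
        Some(code) => format!("{namespace}.{code}"),
        None => format!("{namespace}.diagnostic"),
    }
}

/// Hypothetical rule verb.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Rule {
    Allow,
    Warn,
    Deny,
}

/// Stand-in for the typed config plus the flattened `extra` map.
struct RulesSketch {
    spellcheck: Rule,
    extra: BTreeMap<String, Rule>,
}

impl RulesSketch {
    /// Resolution order: built-in field, then `extra`, then None.
    fn lookup_by_code(&self, code: &str) -> Option<Rule> {
        match code {
            // Built-ins are matched first, so a stray `extra` entry
            // with the same key can never override them.
            "spellcheck" => Some(self.spellcheck),
            other => self.extra.get(other).copied(),
        }
    }
}
```

Because the built-in arm is checked before the map, shadowing a built-in from extra is impossible by construction rather than by a runtime check.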
Added — configurable diagnostic rules via [diagnostics.rules] (#636)
Closes the v1 loop on the diagnostic-configuration system. .lex.toml gains a [diagnostics.rules] block with one field per built-in diagnostic code, and the LSP server honours those rules when publishing diagnostics.
- Configuration surface. DiagnosticsConfig gains a rules: DiagnosticsRulesConfig nested struct. Each field carries its description as a doc comment and its intrinsic severity as the #[config(default)]. Schema-validation codes nest under [diagnostics.rules.schema]. lexd config gen emits the full annotated catalog automatically.
- Severity verbs. Each code accepts "allow" (suppress emission), "warn" (keep intrinsic LSP severity), or "deny" (upgrade to Error).
- Code centralisation. DiagnosticKind::code() and SchemaValidationKind::code() return the on-the-wire diagnostic code (formerly hard-coded inside lex-lsp::to_lsp_diagnostic). DiagnosticsRulesConfig::lookup_by_code resolves codes → rule entries.
- Runtime wiring. A new apply_rules function in lex-analysis filters and remaps diagnostics. The LSP server reads [diagnostics.rules] from .lex.toml and applies the registry to every analysis pass before publishing, so editor squiggles honour the configuration immediately.
- Drift test. A test in lex-analysis iterates every built-in DiagnosticKind variant and asserts lookup_by_code(kind.code()) returns Some, so adding a new diagnostic without a matching config field fails CI.
- Breaking. The previous diagnostics.spellcheck = bool knob is replaced by [diagnostics.rules].spellcheck = "warn" | "allow" | "deny". Existing .lex.toml files using the boolean form fail strict-key validation and must migrate.
- Out of scope for v1. Extension-emitted codes (<namespace>.<code>) pass through untouched until the extra map surface lands. Per-document and per-region annotation overrides (v2 / v3 of #636) ship in follow-up work. CLI lexd lint rendering is its own work-stream.
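The verb semantics can be sketched as a filter-and-remap pass. The types and the closure-based lookup below are assumptions for illustration; the real apply_rules in lex-analysis works on the crate's own diagnostic and config types.

```rust
/// Hypothetical severity levels.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Severity {
    Error,
    Warning,
    Info,
}

/// Hypothetical rule verbs: "allow", "warn", "deny".
#[derive(Debug, Clone, Copy, PartialEq)]
enum Verb {
    Allow,
    Warn,
    Deny,
}

/// Hypothetical diagnostic carrying its wire code and severity.
#[derive(Debug, Clone, PartialEq)]
struct Diagnostic {
    code: String,
    severity: Severity,
}

/// Filter and remap diagnostics according to the configured verbs.
fn apply_rules_sketch(
    diagnostics: Vec<Diagnostic>,
    lookup: impl Fn(&str) -> Option<Verb>,
) -> Vec<Diagnostic> {
    diagnostics
        .into_iter()
        .filter_map(|mut diagnostic| match lookup(&diagnostic.code) {
            // "allow" suppresses emission entirely.
            Some(Verb::Allow) => None,
            // "deny" upgrades the diagnostic to an error.
            Some(Verb::Deny) => {
                diagnostic.severity = Severity::Error;
                Some(diagnostic)
            }
            // "warn" keeps the intrinsic severity; codes with no rule
            // pass through untouched.
            Some(Verb::Warn) | None => Some(diagnostic),
        })
        .collect()
}
```

Note that "warn" does not force the Warning level: a diagnostic that is intrinsically Info stays Info.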
Added — lex-extension-host::GitFetcher real shell-out implementation (#650)
The git transport is no longer a stub. GitFetcher::fetch shells out to git clone --depth=1 to populate the destination directory. Honors uri.rev as --branch <ref> (branch or tag) and uri.subdir to extract a subdirectory of the repo as the schema root. The .git/ directory is stripped after clone — the cache only holds schema content.
Both registered schemes (git: and git+ssh:) route to this fetcher. URL forms accepted are whatever git clone accepts: https://...git, git@host:owner/repo.git, file:///path/to/bare, plus git+ssh://... (preserved verbatim — git treats it as a synonym for ssh://). Spec §3.3 / §6.3 cover the URL surface and the choice to shell out rather than embed libgit2.
No new dependencies — std::process::Command is the entire surface. Auth is whatever git clone would honor at the command line (SSH agent, OS keychain credential helpers, gh auth setup-git, gitconfig-declared SSO providers); there is no Lex-side credential knob. GIT_TERMINAL_PROMPT=0 is set on the spawned process so a missing credential helper surfaces as a clean error rather than blocking the boot path. git must be in PATH; if it's missing the fetcher returns FetchError::Other with an actionable message pointing at the path: / --ext-schema escape hatches.
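A minimal sketch of how such a command might be assembled with std::process::Command. The function name and parameters are hypothetical; only the git clone --depth=1 invocation, the --branch handling, and GIT_TERMINAL_PROMPT=0 come from the notes above.

```rust
use std::path::Path;
use std::process::Command;

/// Build (but do not run) a shallow-clone command.
/// Hypothetical helper; the real GitFetcher may order or extend the arguments.
fn build_clone_command(url: &str, rev: Option<&str>, dest: &Path) -> Command {
    let mut command = Command::new("git");
    command.arg("clone").arg("--depth=1");
    // A branch or tag is passed through as `--branch <ref>`.
    if let Some(rev) = rev {
        command.arg("--branch").arg(rev);
    }
    command.arg(url).arg(dest);
    // Fail fast instead of blocking on an interactive credential prompt.
    command.env("GIT_TERMINAL_PROMPT", "0");
    command
}
```

Running the command (for example with output()) and reading its stderr is left out here, since that part depends on git being present in PATH.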
Git's stderr is classified into typed FetchError variants:
- FetchError::Network: connectivity failures (DNS, connection refused/timeout, unreachable).
- FetchError::UpstreamStatus: auth-shaped failures (permission denied, authentication failed, repository not found; the GitHub/GitLab APIs use the last as a private-repo not-authorised signal too).
- FetchError::Other: everything else, carrying git's raw stderr verbatim (unknown ref, corrupted upstream, "not a git repository", etc.).
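One way to picture the classification is a substring match over git's stderr. The enum below is a simplified stand-in for FetchError, and the matched phrases are assumptions about typical git messages, not the list the fetcher actually uses.

```rust
/// Simplified stand-in for the typed fetch errors.
#[derive(Debug, PartialEq)]
enum FetchErrorKind {
    Network,
    UpstreamStatus,
    Other(String),
}

/// Classify git's stderr by looking for well-known phrases.
fn classify_git_stderr(stderr: &str) -> FetchErrorKind {
    // Assumed phrases; real git output varies by version and transport.
    const NETWORK: [&str; 4] = [
        "could not resolve host",
        "connection refused",
        "connection timed out",
        "network is unreachable",
    ];
    const AUTH: [&str; 3] = [
        "permission denied",
        "authentication failed",
        "repository not found",
    ];
    let lower = stderr.to_lowercase();
    if NETWORK.iter().any(|phrase| lower.contains(*phrase)) {
        FetchErrorKind::Network
    } else if AUTH.iter().any(|phrase| lower.contains(*phrase)) {
        FetchErrorKind::UpstreamStatus
    } else {
        // Everything else keeps the raw stderr verbatim.
        FetchErrorKind::Other(stderr.to_string())
    }
}
```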
is_immutable_rev returns true for SHA-shaped refs (^[0-9a-f]{7,40}$) and tag-shaped refs (optional v prefix + <digits>.<digits> + optional suffix). The cache treats these as cacheable indefinitely; branch names and None are mutable and expire after the 24-hour TTL.
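The two shapes can be checked without a regex dependency. This is a sketch of the rule as worded above, with hypothetical helper names; the real is_immutable_rev may accept or reject edge cases differently.

```rust
/// SHA-shaped: 7 to 40 lowercase hex characters.
fn is_sha_shaped(rev: &str) -> bool {
    (7..=40).contains(&rev.len())
        && rev.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Tag-shaped: optional `v`, then `<digits>.<digits>`, then any suffix.
fn is_tag_shaped(rev: &str) -> bool {
    let rest = rev.strip_prefix('v').unwrap_or(rev);
    let Some((major, tail)) = rest.split_once('.') else {
        return false;
    };
    let major_ok = !major.is_empty() && major.bytes().all(|b| b.is_ascii_digit());
    // The minor part needs at least one leading digit; the rest is suffix.
    let minor_ok = tail.bytes().next().is_some_and(|b| b.is_ascii_digit());
    major_ok && minor_ok
}

/// Branch names and a missing rev are treated as mutable.
fn is_immutable_rev_sketch(rev: Option<&str>) -> bool {
    rev.is_some_and(|r| is_sha_shaped(r) || is_tag_shaped(r))
}
```

A caveat of any shape-based check: a branch that happens to be named with seven or more hex characters (for example "defaced") is indistinguishable from a short SHA.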
What this enables in lex.toml:
- [labels.X] git = "git@internal.example.com:docs/lex-labels.git": private repos work end-to-end, inheriting the user's git credential setup.
- The via = "git" knob on github: / gitlab: URL templates, which gives the forge shorthands a private-repo path.
- Self-hosted git over any transport git understands (HTTPS, SSH, git://, file://).
Spec §11.2 mirror/fallback URLs are intentionally out of scope.
Added — lex-extension-host::HttpsFetcher real network implementation (#649)
The HTTPS transport is no longer a stub. HttpsFetcher::fetch performs a single HTTPS GET (sync, via ureq with rustls + webpki-roots), detects the archive format from Content-Type with URL-extension fallback, and extracts tar.gz or zip archives into the destination directory. Honors uri.subdir for archives that wrap content in a top-level directory (the GitHub tarball API does this).
Path-traversal defence at the extraction layer: archive members with .. components or absolute paths are rejected; tarball symlink/hardlink entries and zip symlinks (detected via S_IFLNK in unix_mode) are skipped on both archive paths (schema directories are pure data; allowing archive-shipped symlinks would expand the trust surface). Response size is capped at 256 MiB to defend against pathological servers, with a separate 64 KiB cap on 4xx/5xx error-response bodies so a hostile server can't OOM us on the diagnostic path either. Connect (30s) and read (120s) timeouts are set on the ureq agent so a stalled upstream can't hang the resolver indefinitely. Successful response bodies are streamed to a temp file rather than buffered into memory, so peak resident memory stays bounded even at the 256 MiB cap. subdir matching uses a component-windowed search, so nested paths (subdir = "src/labels") work as well as single-component ones.
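The member-path check and the component-windowed subdir search can be sketched with std::path alone. Both function names are hypothetical, and the behaviour on edge cases (for instance which occurrence of subdir wins) is an assumption of this example rather than a statement about HttpsFetcher.

```rust
use std::path::{Component, Path, PathBuf};

/// Accept only non-empty relative paths made of normal components.
/// Rejects `..`, absolute paths, and (on Windows) drive prefixes.
fn is_safe_member_path(member: &Path) -> bool {
    let mut components = member.components().peekable();
    components.peek().is_some()
        && components.all(|c| matches!(c, Component::Normal(_) | Component::CurDir))
}

/// Find `subdir` as a contiguous run of components inside `member` and
/// return the remainder after the first such run.
fn path_after_subdir(member: &Path, subdir: &Path) -> Option<PathBuf> {
    let parts: Vec<Component> = member.components().collect();
    let needle: Vec<Component> = subdir.components().collect();
    if needle.is_empty() {
        return Some(member.to_path_buf());
    }
    // Slide a window of the needle's length over the member's components.
    let start = parts
        .windows(needle.len())
        .position(|window| window == needle.as_slice())?;
    Some(parts[start + needle.len()..].iter().collect())
}
```

Matching on whole components rather than on substrings is what keeps subdir = "src/labels" from matching a directory such as "src/labels-old".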
New deps: ureq (sync HTTP client, tokio-free), flate2 + tar (gzipped tarballs), zip (zip archives). All gated behind the new https-fetcher cargo feature on lex-extension-host (default-on for lex-cli, lex-lsp, and lex-fmt, off for wasm builds where the underlying ring/getrandom 0.2 chain doesn't compile to wasm32-unknown-unknown). With the feature off, HttpsFetcher::fetch returns FetchError::Unimplemented. All deps sit on the resolver path only, so consumers that don't use remote namespaces don't pay the cost at boot.
The header knob for Authorization / custom header pass-through (spec §6.2) is plumbing-ready in the fetcher but not yet exposed via lex-config; that's the follow-up tracked in #651.
Changed — lex-extension-host resolver factored into transports + URL templates (#648)
Restructures the namespace resolver to match the new extending-lex-stores.lex companion spec. The previous model registered four peer fetchers (GithubFetcher, GitlabFetcher, HttpsFetcher, GitSshFetcher); the new model registers two transport fetchers (HttpsFetcher, GitFetcher) covering three schemes (https, git, git+ssh) and adds a URL-template layer (github:, gitlab:) that expands forge shorthands into transport URIs before dispatch.
Public-API changes (visible to direct lex-extension-host consumers):
- Removed: resolve::fetcher::GithubFetcher, resolve::fetcher::GitlabFetcher.
- Renamed: resolve::fetcher::GitSshFetcher → resolve::fetcher::GitFetcher. The renamed fetcher claims both git: and git+ssh: schemes.
- Added: resolve::ResolveError::UnknownScheme gains a scheme: String field that names the actual missing transport (after template expansion, if any) for clearer diagnostics.
No behaviour change for end users: every stub still returns FetchError::Unimplemented; per-transport network implementations are tracked at #562 (now rescoped from "implement four fetchers" to "implement two fetchers + two templates").
Added — reference anchoring in HTML / Markdown serializers (references-general.lex §2.3)
The babel HTML and Markdown serializers now honour Lex's implicit reference anchors instead of always linking a bracketed reference to itself.
- Inline word anchor (§2.3.1). A link-like inline reference (Url/File/Session/General) wraps its anchored word (the preceding word by default, or the following word when the reference is first on the line) and the bracketed reference no longer renders as literal [...] text. Example: the project website [https://lex.ing] today → HTML: the project <a href="https://lex.ing">website</a> today; Markdown: the project [website](https://lex.ing) today.
- Whole-element anchor (§2.3.2). A reference line targeting an element's head line wraps that head line in the link: session title (<h2><a …>Title</a></h2>), list item (<li><a …>Water</a></li>), definition term and verbatim subject (trailing colon excluded), and a plain paragraph line. The reference line itself emits no separate output.
- Self-link (§2.3.2). A reference line with no element directly above renders as a standalone link of its own text, spliced into the document at its source position.
- Marker-style references unchanged (§2.3.4). Footnotes [1], citations [@key], and annotation references [::label] keep their existing marker rendering and are never given a word or whole-element anchor.
Anchors are read from lex-core's authoritative resolution (ReferenceInline.word_anchor and Document::reference_lines()); the previous in-babel anchor heuristic (common/links.rs) is removed. IR Verbatim gains a subject_href field carrying the verbatim-subject link through to the serializers.
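To make the inline word-anchor rule concrete, here is a toy renderer for a single bracketed URL on one line. It is an illustration only: the serializers read anchors from lex-core's resolution rather than re-deriving them from text, and this sketch does no HTML escaping and assumes space-separated words.

```rust
/// Render one `[target]` reference as an HTML link around its anchored word:
/// the preceding word by default, the following word if the reference
/// comes first on the line. Hypothetical helper for illustration.
fn render_word_anchor_html(line: &str) -> String {
    let (Some(open), Some(close)) = (line.find('['), line.find(']')) else {
        return line.to_string();
    };
    if close < open {
        return line.to_string();
    }
    let target = &line[open + 1..close];
    let before = line[..open].trim_end();
    let after = line[close + 1..].trim_start();
    if before.is_empty() && after.is_empty() {
        // Nothing to anchor; this sketch leaves the line untouched.
        return line.to_string();
    }
    let link = |word: &str| format!("<a href=\"{target}\">{word}</a>");
    let mut parts: Vec<String> = Vec::new();
    if before.is_empty() {
        // Reference is first on the line: wrap the following word.
        let (word, rest) = after.split_once(' ').unwrap_or((after, ""));
        parts.push(link(word));
        parts.push(rest.to_string());
    } else {
        // Default: wrap the word just before the reference.
        let (head, word) = before.rsplit_once(' ').unwrap_or(("", before));
        parts.push(head.to_string());
        parts.push(link(word));
        parts.push(after.to_string());
    }
    parts.retain(|part| !part.is_empty());
    parts.join(" ")
}
```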