Summary
A standalone MyST Markdown linting tool that validates syntax and provides deterministic error detection for MyST documents. This tool would support the translation sync pipeline and can be used independently for quality assurance across all QuantEcon lecture repositories.
Background
During evaluation of the action-translation-sync tool (v0.6.0), human reviewer @HumphreyYang identified markdown syntax errors in translator output that were not caught:
- Missing space after #### in headings (e.g., ####Title instead of #### Title)
- Incorrect code block delimiters
- Math block delimiter mismatches
While we have added LLM-based syntax checking to prompts as a first line of defense, a deterministic linting tool would provide:
- 100% reliable detection (vs ~90% from LLM)
- No API costs
- Fast validation
- CI/pre-commit integration
Proposed Solution
Build myst-lint as a standalone tool that wraps markdownlint (5.3M monthly downloads, actively maintained) with MyST-specific extensions.
Architecture
┌─────────────────────────────────────────────────────────────┐
│ myst-lint │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌─────────────────────────────────┐ │
│ │ markdownlint │ + │ MyST Custom Rules (optional) │ │
│ │ (core rules) │ │ - directive validation │ │
│ │ - MD018 (ATX) │ │ - role validation │ │
│ │ - MD031 (code) │ │ - math delimiter matching │ │
│ │ - MD040 (lang) │ │ - code-cell validation │ │
│ └─────────────────┘ └─────────────────────────────────┘ │
│ │
├─────────────────────────────────────────────────────────────┤
│ CLI: myst-lint <file.md> [options] │
│ API: import { lintMyST } from "myst-lint" │
└─────────────────────────────────────────────────────────────┘
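A minimal sketch of how the programmatic API in the diagram could combine the two rule sets. Only the name `lintMyST` comes from this proposal. The `LintIssue` and `LintRule` types, the parameter list, and the naive `atxSpaceRule` stand-in for MD018 are assumptions for illustration; a real implementation would presumably delegate the core rules to markdownlint.

```typescript
// Hypothetical types: only the name `lintMyST` appears in the proposal.
interface LintIssue {
  rule: string;
  line: number; // 1-based line number
  message: string;
}
type LintRule = (lines: string[]) => LintIssue[];

// Naive stand-in for MD018. Unlike markdownlint, it does not skip code blocks.
const atxSpaceRule: LintRule = (lines) => {
  const issues: LintIssue[] = [];
  lines.forEach((text, index) => {
    if (/^#{1,6}[^#\s]/.test(text)) {
      issues.push({
        rule: "MD018",
        line: index + 1,
        message: "No space after hash on atx style heading",
      });
    }
  });
  return issues;
};

// Run the core rules, then the optional MyST rules, and report in line order.
function lintMyST(
  source: string,
  coreRules: LintRule[],
  mystRules: LintRule[] = []
): LintIssue[] {
  const lines = source.split(/\r?\n/);
  const issues: LintIssue[] = [];
  coreRules.concat(mystRules).forEach((rule) => {
    rule(lines).forEach((issue) => issues.push(issue));
  });
  return issues.sort((a, b) => a.line - b.line);
}
```

Keeping the MyST rules as a separate, optional argument mirrors the "(optional)" box in the diagram: callers that only need the core checks can omit it.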
Integration Points
Source Repo ──► [myst-lint] ──► Valid input guaranteed
                    │
                    ▼
Translator ──► [myst-lint] ──► Block sync if errors
                    │
                    ▼
Target Repo PR ◄── Sync ◄── Valid translated content
        │
        └──► [Evaluator] ──► Failsafe LLM check (advisory)
- Source validation: Lint English source before translation (assume valid input)
- Output validation: Lint translator output before sync (catch translator-introduced errors)
- CI integration: Pre-commit hooks for lecture repositories
- Evaluator failsafe: LLM-based check as backup (already implemented)
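The gating logic implied by these integration points can be sketched as a small decision function. Everything here is hypothetical: the `PipelineReport` shape, the `decideSync` name, and the choice to stop when the source itself has lint errors are assumptions, not behavior confirmed for action-translation-sync.

```typescript
// Hypothetical summary of one pipeline run; field names are illustrative.
interface PipelineReport {
  sourceErrors: number;      // lint errors in the English source
  translationErrors: number; // lint errors in the translator output
  evaluatorWarnings: number; // advisory findings from the LLM evaluator
}

interface SyncDecision {
  sync: boolean;
  reason: string;
}

function decideSync(report: PipelineReport): SyncDecision {
  // Assumption: invalid source is rejected so the translator sees valid input.
  if (report.sourceErrors > 0) {
    return { sync: false, reason: "source has lint errors" };
  }
  // Translator-introduced errors block the sync.
  if (report.translationErrors > 0) {
    return { sync: false, reason: "translator output has lint errors" };
  }
  // Evaluator findings are advisory only and never block.
  const note = report.evaluatorWarnings > 0 ? "clean lint, advisory warnings present" : "clean";
  return { sync: true, reason: note };
}
```

The evaluator count changes only the reported reason, which matches its role as a failsafe rather than a gate.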
Core markdownlint Rules (Built-in)
These rules work out of the box with MyST:
| Rule | Description | Catches |
| --- | --- | --- |
| MD018 | No space after hash on ATX heading | ####Title → error |
| MD031 | Fenced code blocks surrounded by blank lines | Structure issues |
| MD040 | Fenced code blocks should have language | Missing language spec |
| MD047 | Files should end with single newline | Formatting |
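As an illustration, a configuration that enables only the four rules in the table might look like the following. The keys follow markdownlint's documented config format (`default` toggles every rule, and individual rules are switched by their MD code); the `mystLintConfig` name and the decision to start from `default: false` are assumptions, since the proposal does not yet say which rules the final config would disable.

```typescript
// Hypothetical starting config: turn everything off, then opt in to the
// rules listed in the table above.
const mystLintConfig: Record<string, boolean> = {
  default: false, // disable all rules unless explicitly enabled below
  MD018: true,    // no space after hash on ATX heading
  MD031: true,    // fenced code blocks surrounded by blank lines
  MD040: true,    // fenced code blocks should have a language
  MD047: true,    // file should end with a single newline
};

// Serialized form, e.g. for a JSON config file checked into a lecture repo.
const mystLintConfigJson: string = JSON.stringify(mystLintConfig, null, 2);
```

An opt-in list like this is the conservative choice for Phase 1: rules are added only once they are known not to conflict with MyST syntax.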
MyST Custom Rules (Phase 2)
Extend markdownlint with MyST-specific validation:
| Rule | Description |
| --- | --- |
| myst-directive-known | Validate {directive-name} is recognized |
| myst-directive-options | Check :option: value syntax |
| myst-role-syntax | Validate {role}`target` format |
| myst-math-delimiters | Check $$ pairs are balanced |
| myst-code-cell-tags | Validate code-cell tag syntax |
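To make one of these concrete, here is a rough sketch of `myst-math-delimiters` written in the shape markdownlint expects for custom rules (an object with `names`, `description`, `tags`, `parser`, and a `function` that receives `params` and an `onError` callback). The local `RuleParams` and `RuleError` interfaces are simplified assumptions covering only the fields used here, and the line-based fence tracking is deliberately naive: it ignores escaped dollar signs and nested fences of different lengths.

```typescript
// Simplified, assumed subsets of the objects markdownlint passes to a rule.
interface RuleParams {
  lines: string[];
}
interface RuleError {
  lineNumber: number; // 1-based
  detail?: string;
}

const mystMathDelimiters = {
  names: ["myst-math-delimiters"],
  description: "Block math $$ delimiters should be balanced",
  tags: ["myst", "math"],
  parser: "none" as const, // line-based rule, no token stream needed
  function: (params: RuleParams, onError: (error: RuleError) => void): void => {
    let openLine = 0; // line of the unmatched opening $$, or 0 when balanced
    let inFence = false;
    params.lines.forEach((text, index) => {
      // Toggle on fence lines so $$ inside code blocks is not counted.
      if (/^(`{3,}|~{3,})/.test(text.trim())) {
        inFence = !inFence;
        return;
      }
      if (inFence) {
        return;
      }
      const count = (text.match(/\$\$/g) || []).length;
      if (count % 2 === 1) {
        // An odd number of $$ on a line opens or closes a math block.
        openLine = openLine === 0 ? index + 1 : 0;
      }
    });
    if (openLine !== 0) {
      onError({ lineNumber: openLine, detail: "Opening $$ has no matching closing $$" });
    }
  },
};
```

Reporting the line of the unmatched opener, rather than the end of the file, points the translator or reviewer at the block that needs fixing.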
Implementation Plan
Phase 1: Core Tool (MVP)
- Wrap markdownlint with MyST-friendly config
- Disable rules that conflict with MyST syntax
- CLI and programmatic API
- Integration with action-translation-sync
Phase 2: MyST Custom Rules
- Implement directive/role validation
- Math delimiter checking
- Code-cell validation
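For directive validation, one possible starting point is a scan of fence openers against an allowlist. This is only a sketch: the `findUnknownDirectives` name, the `KNOWN_DIRECTIVES` list, and the regular expression are assumptions, and the list shown is a small illustrative subset rather than the full set of MyST directives.

```typescript
// Illustrative subset only; a real rule would need the full directive list,
// including any custom directives used in the lecture repositories.
const KNOWN_DIRECTIVES: string[] = ["note", "warning", "code-cell", "math", "figure"];

interface UnknownDirective {
  line: number; // 1-based line number of the opening fence
  name: string;
}

function findUnknownDirectives(
  source: string,
  known: string[] = KNOWN_DIRECTIVES
): UnknownDirective[] {
  // Matches an opening backtick or colon fence followed by {directive-name}.
  const opener = /^\s*(?:`{3,}|:{3,})\{([^}\s]+)\}/;
  const unknown: UnknownDirective[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    const match = opener.exec(text);
    if (match && known.indexOf(match[1]) === -1) {
      unknown.push({ line: index + 1, name: match[1] });
    }
  });
  return unknown;
}
```

A check like this would catch a translator mangling a directive name (for example turning `{note}` into a misspelled variant), which the core markdownlint rules treat as an ordinary code block language.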
Phase 3: CI Integration
- Pre-commit hooks
- GitHub Actions workflow
- VS Code extension recommendations
Technical Details
Base Package: markdownlint v0.39.0
- 5.3M monthly downloads
- Actively maintained (last release Oct 2025)
- Supports custom rules via micromark parser
- Works with MyST out of the box (no false positives on directives)
Testing Performed:
$ echo "####NoSpaceHeading" | npx markdownlint-cli --stdin
stdin:1:1 MD018/no-missing-space-atx No space after hash on atx style heading
MyST directives ({note}, {code-cell}, etc.) are parsed as ordinary fenced code blocks, so they trigger no false positives.
Priority
MEDIUM - The LLM-based syntax checking provides good coverage now. This tool would provide deterministic guarantees and enable broader use across the organization.