Enhancement: Interactive Shell with Syntax Highlighting#580
Merged
tnaum-ms merged 18 commits intorel/release_0.8.0from Apr 15, 2026
Merged
Enhancement: Interactive Shell with Syntax Highlighting#580tnaum-ms merged 18 commits intorel/release_0.8.0from
tnaum-ms merged 18 commits intorel/release_0.8.0from
Conversation
Create monarchRules.ts with JavaScript Monarch tokenizer rules extracted from Monaco Editor, extended with DocumentDB-specific token categories: - BSON constructors (ObjectId, ISODate, NumberLong, etc.) - Shell commands (show, use, it, exit, etc.) - $-prefixed query/aggregation operators ($gt, $match, etc.) Regex patterns are inlined directly rather than using @name references, eliminating the need for runtime regex source string resolution. Rule types use named properties instead of positional tuples for clarity.
Implement tokenize() function that runs Monarch tokenizer rules against a plain string and returns an array of TokenSpan objects. Features: - State stack management with max depth guard (32) - Regex anchoring via sticky flag with WeakMap cache - Case-based lookups into named arrays (keywords, bsonConstructors, etc.) - Group-based actions for multi-capture-group rules - Adjacent token merging for cleaner output - Single-entry memoization for cursor-only movements 45 tests covering: keywords, strings, template literals, numbers (int, float, hex, octal, binary), comments (line, block, JSDoc), regex literals, BSON constructors, DocumentDB operators, shell commands, mixed expressions, identifiers, delimiters, operators, escapes, caching.
Implement colorizeInput() that converts token spans to ANSI-colorized strings for terminal display. Color palette matches ShellOutputFormatter: - Keywords, BSON constructors: Cyan - Strings: Green - Numbers, DocumentDB operators, escape sequences: Yellow - Comments: Gray - Shell commands: Magenta - Regex, invalid strings: Red - Identifiers, delimiters: uncolored (default) 21 tests covering all token types, uncolored tokens, gaps, trailing text, and full-line integration.
Replace per-character ANSI echo with centralized reRenderLine() method: - Add optional 'colorize' callback to ShellInputHandlerCallbacks - Store _promptWidth for cursor positioning - reRenderLine() moves to input start, writes colorized buffer, erases leftover chars, repositions cursor - Simplify all buffer mutation methods (insertCharacter, handleBackspace, handleDelete, clearBeforeCursor, clearAfterCursor, deleteWordBeforeCursor, replaceLineWith) to just update buffer/cursor then call reRenderLine() - Cursor-only movements (moveCursorLeft/Right/To, wordLeft/Right) unchanged - Update 2 test assertions from exact write output to buffer-based checks All 50 ShellInputHandler tests + 30 DocumentDBShellPty tests pass.
- Create colorizeShellInput() convenience function that combines tokenizer and colorizer in one call - Wire up colorize callback in DocumentDBShellPty constructor, gated by the existing documentDB.shell.display.colorOutput setting - Add 12 integrated end-to-end tests covering: keyword highlighting, partial keywords, strings, BSON constructors, DocumentDB operators, shell commands, numbers, backspace re-highlighting, history recall, Ctrl+U clear, mixed expressions, disabled colorize All 9 shell test suites (253 tests) pass.
- Add file-level eslint-disable for no-useless-escape in monarchRules.ts (vendored regex patterns from Monaco preserved as-is) - Regenerate l10n bundle - Mark all completion checklist items as done in plan
…de-documentdb into dev/tnaum/query-syntax-highlight-shell # Conflicts: # src/documentdb/shell/DocumentDBShellPty.ts # src/documentdb/shell/ShellInputHandler.ts
Contributor
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adds real-time syntax highlighting for the DocumentDB interactive shell input by introducing a lightweight Monarch tokenizer runner, a token→ANSI colorizer, and switching input editing to a full-line re-render approach.
Changes:
- Added Monarch-based tokenization + ANSI token colorization pipeline for shell input.
- Updated
ShellInputHandlerto re-render the full line on buffer edits (enables highlighting + consistent cursor positioning). - Added unit + integration tests covering tokenization, colorization, and end-to-end shell typing behavior.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/documentdb/shell/shellHighlighter.test.ts | New integrated tests validating highlighting via ShellInputHandler + real colorizer pipeline |
| src/documentdb/shell/highlighting/tokenColorizer.ts | New token-span → ANSI colorized string mapper |
| src/documentdb/shell/highlighting/tokenColorizer.test.ts | Unit tests for token→ANSI mapping behavior |
| src/documentdb/shell/highlighting/monarchRunner.ts | New lightweight Monarch state machine executor + caching |
| src/documentdb/shell/highlighting/monarchRunner.test.ts | Unit tests for tokenizer correctness + caching behavior |
| src/documentdb/shell/highlighting/monarchRules.ts | Vendored/extended Monarch rules (JS + DocumentDB token categories) |
| src/documentdb/shell/highlighting/colorizeShellInput.ts | Convenience wrapper combining tokenize + colorize |
| src/documentdb/shell/ShellInputHandler.ts | Adds optional colorize, prompt width tracking, and centralized reRenderLine() |
| src/documentdb/shell/ShellInputHandler.test.ts | Updates assertions to account for re-render output format |
| src/documentdb/shell/DocumentDBShellPty.ts | Wires shell input colorization behind the existing color output setting |
| docs/ai-and-plans/shell-syntax-highlighting/0-plan.md | Design/implementation plan and manual test checklist for the feature |
The regexpesc pattern contained a trailing space in the second alternative ([...] |c[A-Z]), requiring a space after escaped regex control characters. This caused valid regex escapes like \( or \[ to fall through to the regexp.invalid rule when not followed by a space. Addresses PR review I-01 and Copilot comment C-02.
The comment stated that @name references are resolved by the executor, but all references were inlined directly per the documented deviation. Updated to reflect the actual implementation.
The memoization cache only compared input strings but ignored the rules parameter. Calling tokenize() with different rules but the same input would incorrectly return a cached result from the previous ruleset. Added a cachedRules reference check (rules === cachedRules) so the cache is only hit when both input and rules match. Addresses PR review I-03 and Copilot comments C-01.
Renamed snake_case function to camelCase per TypeScript naming conventions and the project's ESLint configuration.
Added a comment noting that key order in actionCases objects matters because resolveCases iterates Object.entries() and returns the first matching array. Reordering keys would silently change token classification priority (e.g., shellCommands must be checked before keywords).
Fixes two issues with the full-line re-render approach: I-10: reRenderLine() now handles wrapped input. When prompt + buffer exceeds the terminal width, the cursor is moved up to the prompt row before the carriage return. Uses \x1b[J (erase to end of screen) instead of \x1b[K (erase to end of line) to clear leftover wrapped rows. Cursor repositioning computes both row and column offsets. I-11: Cursor math now uses terminalDisplayWidth() instead of String.length, correctly handling CJK full-width characters (2 columns), emoji, surrogate pairs, and combining marks. Extracted terminalDisplayWidth() and isWideCharacter() from ShellGhostText.ts into a shared module (terminalDisplayWidth.ts) so both ShellGhostText and ShellInputHandler use the same logic. Added setColumns() to ShellInputHandler; wired from DocumentDBShellPty's open() and setDimensions() so terminal width is always in sync.
insertText() and replaceText() now use reRenderLine() instead of manual ANSI echo sequences. This ensures syntax highlighting is applied when: - Tab completion inserts text - Ghost text is accepted - Replacement completions (bracket notation, quoted field paths) are applied rewriteCurrentLine() (used after showing a completion list) now delegates to renderCurrentLine() which uses the colorize callback, so the re-drawn buffer preserves highlighting. Added renderCurrentLine() as a public method on ShellInputHandler for PTY-controlled re-renders.
Marked all addressed issues as fixed with commit hashes in the summary table. Added a Fix Log section documenting each commit's changes. I-05 and I-13 deferred to follow-up work.
Collaborator
Author
PR Review Fixes AppliedAddressed all actionable issues from the PR review and all 4 Copilot reviewer comments. All 398 shell tests pass, lint and prettier are clean. Commits
Not addressed (deferred)
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Adds real-time syntax highlighting to the interactive shell input line. As the user types commands like
db.users.find({ age: { $gt: 25 } }), the terminal renders keywords, strings, numbers, operators, BSON constructors, and$-prefixed query operators in distinct ANSI colors.Approach
Reuses the JavaScript Monarch tokenizer rules from Monaco Editor (vendored + extended), running them in the extension host via a lightweight state-machine executor. Zero new dependencies.
What's Included
monarchRules.ts$-prefixed operator categoriesmonarchRunner.tstokenColorizer.tsShellOutputFormatterpaletteShellInputHandler.ts,DocumentDBShellPty.tsreRenderLine(), replacing per-character echo;colorizecallback wiringcolorizeShellInput.tsColor Mapping
$-operators, escape sequencesshow,use,it, etc.)Gating
Highlighting is gated by the existing
documentDB.shell.display.colorOutputsetting. When disabled, thecolorizecallback returns the input unchanged (no ANSI codes emitted).Tests
ShellInputHandler.test.tsandDocumentDBShellPty.test.tstests updated and passingdocs/ai-and-plans/shell-syntax-highlighting/0-plan.mdPlan
See docs/ai-and-plans/shell-syntax-highlighting/0-plan.md for the full implementation plan including deviations and alternatives analyzed.