Skip to content

Enhancement: Interactive Shell with Syntax Highlighting#580

Merged
tnaum-ms merged 18 commits intorel/release_0.8.0from
dev/tnaum/query-syntax-highlight-shell
Apr 15, 2026
Merged

Enhancement: Interactive Shell with Syntax Highlighting#580
tnaum-ms merged 18 commits intorel/release_0.8.0from
dev/tnaum/query-syntax-highlight-shell

Conversation

@tnaum-ms
Copy link
Copy Markdown
Collaborator

@tnaum-ms tnaum-ms commented Apr 14, 2026

Summary

Adds real-time syntax highlighting to the interactive shell input line. As the user types commands like db.users.find({ age: { $gt: 25 } }), the terminal renders keywords, strings, numbers, operators, BSON constructors, and $-prefixed query operators in distinct ANSI colors.

Approach

Reuses the JavaScript Monarch tokenizer rules from Monaco Editor (vendored + extended), running them in the extension host via a lightweight state-machine executor. Zero new dependencies.

What's Included

Work Item File(s) Description
WI-1 monarchRules.ts Vendored JS Monarch tokenizer rules extended with BSON constructors, shell commands, and $-prefixed operator categories
WI-2 monarchRunner.ts Lightweight Monarch state-machine executor (no Monaco/DOM dependency)
WI-3 tokenColorizer.ts Token-to-ANSI 16-color mapper matching ShellOutputFormatter palette
WI-4 ShellInputHandler.ts, DocumentDBShellPty.ts Full-line re-rendering via reRenderLine(), replacing per-character echo; colorize callback wiring
Convenience colorizeShellInput.ts Single-call wrapper combining tokenizer + colorizer

Color Mapping

Color ANSI Applied To
Cyan 36 JS keywords, BSON constructors
Green 32 Strings
Yellow 33 Numbers, $-operators, escape sequences
Magenta 35 Shell commands (show, use, it, etc.)
Gray 90 Comments
Red 31 Regex literals, unterminated strings
Default Identifiers, delimiters, brackets

Gating

Highlighting is gated by the existing documentDB.shell.display.colorOutput setting. When disabled, the colorize callback returns the input unchanged (no ANSI codes emitted).

Tests

  • 80 new tests across 3 test suites (monarchRunner, tokenColorizer, shellHighlighter integration)
  • Existing ShellInputHandler.test.ts and DocumentDBShellPty.test.ts tests updated and passing
  • Manual test plan included in docs/ai-and-plans/shell-syntax-highlighting/0-plan.md

Plan

See docs/ai-and-plans/shell-syntax-highlighting/0-plan.md for the full implementation plan including deviations and alternatives analyzed.

Create monarchRules.ts with JavaScript Monarch tokenizer rules extracted
from Monaco Editor, extended with DocumentDB-specific token categories:
- BSON constructors (ObjectId, ISODate, NumberLong, etc.)
- Shell commands (show, use, it, exit, etc.)
- $-prefixed query/aggregation operators ($gt, $match, etc.)

Regex patterns are inlined directly rather than using @name references,
eliminating the need for runtime regex source string resolution.

Rule types use named properties instead of positional tuples for clarity.
Implement tokenize() function that runs Monarch tokenizer rules against
a plain string and returns an array of TokenSpan objects. Features:
- State stack management with max depth guard (32)
- Regex anchoring via sticky flag with WeakMap cache
- Case-based lookups into named arrays (keywords, bsonConstructors, etc.)
- Group-based actions for multi-capture-group rules
- Adjacent token merging for cleaner output
- Single-entry memoization for cursor-only movements

45 tests covering: keywords, strings, template literals, numbers (int,
float, hex, octal, binary), comments (line, block, JSDoc), regex
literals, BSON constructors, DocumentDB operators, shell commands,
mixed expressions, identifiers, delimiters, operators, escapes, caching.
Implement colorizeInput() that converts token spans to ANSI-colorized
strings for terminal display. Color palette matches ShellOutputFormatter:
- Keywords, BSON constructors: Cyan
- Strings: Green
- Numbers, DocumentDB operators, escape sequences: Yellow
- Comments: Gray
- Shell commands: Magenta
- Regex, invalid strings: Red
- Identifiers, delimiters: uncolored (default)

21 tests covering all token types, uncolored tokens, gaps, trailing
text, and full-line integration.
Replace per-character ANSI echo with centralized reRenderLine() method:
- Add optional 'colorize' callback to ShellInputHandlerCallbacks
- Store _promptWidth for cursor positioning
- reRenderLine() moves to input start, writes colorized buffer, erases
  leftover chars, repositions cursor
- Simplify all buffer mutation methods (insertCharacter, handleBackspace,
  handleDelete, clearBeforeCursor, clearAfterCursor, deleteWordBeforeCursor,
  replaceLineWith) to just update buffer/cursor then call reRenderLine()
- Cursor-only movements (moveCursorLeft/Right/To, wordLeft/Right) unchanged
- Update 2 test assertions from exact write output to buffer-based checks

All 50 ShellInputHandler tests + 30 DocumentDBShellPty tests pass.
- Create colorizeShellInput() convenience function that combines
  tokenizer and colorizer in one call
- Wire up colorize callback in DocumentDBShellPty constructor, gated
  by the existing documentDB.shell.display.colorOutput setting
- Add 12 integrated end-to-end tests covering: keyword highlighting,
  partial keywords, strings, BSON constructors, DocumentDB operators,
  shell commands, numbers, backspace re-highlighting, history recall,
  Ctrl+U clear, mixed expressions, disabled colorize

All 9 shell test suites (253 tests) pass.
- Add file-level eslint-disable for no-useless-escape in monarchRules.ts
  (vendored regex patterns from Monaco preserved as-is)
- Regenerate l10n bundle
- Mark all completion checklist items as done in plan
@tnaum-ms tnaum-ms changed the title Dev/tnaum/query syntax highlight shell Enhancement: Interactive Shell with Syntax Highlighting Apr 14, 2026
@tnaum-ms tnaum-ms added this to the 0.8.0 milestone Apr 14, 2026
…de-documentdb into dev/tnaum/query-syntax-highlight-shell

# Conflicts:
#	src/documentdb/shell/DocumentDBShellPty.ts
#	src/documentdb/shell/ShellInputHandler.ts
@tnaum-ms tnaum-ms marked this pull request as ready for review April 15, 2026 11:20
@tnaum-ms tnaum-ms requested a review from a team as a code owner April 15, 2026 11:20
@tnaum-ms tnaum-ms requested a review from Copilot April 15, 2026 11:20
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds real-time syntax highlighting for the DocumentDB interactive shell input by introducing a lightweight Monarch tokenizer runner, a token→ANSI colorizer, and switching input editing to a full-line re-render approach.

Changes:

  • Added Monarch-based tokenization + ANSI token colorization pipeline for shell input.
  • Updated ShellInputHandler to re-render the full line on buffer edits (enables highlighting + consistent cursor positioning).
  • Added unit + integration tests covering tokenization, colorization, and end-to-end shell typing behavior.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/documentdb/shell/shellHighlighter.test.ts New integrated tests validating highlighting via ShellInputHandler + real colorizer pipeline
src/documentdb/shell/highlighting/tokenColorizer.ts New token-span → ANSI colorized string mapper
src/documentdb/shell/highlighting/tokenColorizer.test.ts Unit tests for token→ANSI mapping behavior
src/documentdb/shell/highlighting/monarchRunner.ts New lightweight Monarch state machine executor + caching
src/documentdb/shell/highlighting/monarchRunner.test.ts Unit tests for tokenizer correctness + caching behavior
src/documentdb/shell/highlighting/monarchRules.ts Vendored/extended Monarch rules (JS + DocumentDB token categories)
src/documentdb/shell/highlighting/colorizeShellInput.ts Convenience wrapper combining tokenize + colorize
src/documentdb/shell/ShellInputHandler.ts Adds optional colorize, prompt width tracking, and centralized reRenderLine()
src/documentdb/shell/ShellInputHandler.test.ts Updates assertions to account for re-render output format
src/documentdb/shell/DocumentDBShellPty.ts Wires shell input colorization behind the existing color output setting
docs/ai-and-plans/shell-syntax-highlighting/0-plan.md Design/implementation plan and manual test checklist for the feature

Comment thread src/documentdb/shell/highlighting/monarchRunner.ts
Comment thread src/documentdb/shell/highlighting/monarchRunner.ts
Comment thread src/documentdb/shell/highlighting/monarchRules.ts Outdated
Comment thread src/documentdb/shell/DocumentDBShellPty.ts
The regexpesc pattern contained a trailing space in the second alternative
([...] |c[A-Z]), requiring a space after escaped regex control characters.
This caused valid regex escapes like \( or \[ to fall through to the
regexp.invalid rule when not followed by a space.

Addresses PR review I-01 and Copilot comment C-02.
The comment stated that @name references are resolved by the executor,
but all references were inlined directly per the documented deviation.
Updated to reflect the actual implementation.
The memoization cache only compared input strings but ignored the rules
parameter. Calling tokenize() with different rules but the same input
would incorrectly return a cached result from the previous ruleset.

Added a cachedRules reference check (rules === cachedRules) so the cache
is only hit when both input and rules match.

Addresses PR review I-03 and Copilot comments C-01.
Renamed snake_case function to camelCase per TypeScript naming conventions
and the project's ESLint configuration.
Added a comment noting that key order in actionCases objects matters
because resolveCases iterates Object.entries() and returns the first
matching array. Reordering keys would silently change token classification
priority (e.g., shellCommands must be checked before keywords).
Fixes two issues with the full-line re-render approach:

I-10: reRenderLine() now handles wrapped input. When prompt + buffer
exceeds the terminal width, the cursor is moved up to the prompt row
before the carriage return. Uses \x1b[J (erase to end of screen) instead
of \x1b[K (erase to end of line) to clear leftover wrapped rows.
Cursor repositioning computes both row and column offsets.

I-11: Cursor math now uses terminalDisplayWidth() instead of
String.length, correctly handling CJK full-width characters (2 columns),
emoji, surrogate pairs, and combining marks.

Extracted terminalDisplayWidth() and isWideCharacter() from
ShellGhostText.ts into a shared module (terminalDisplayWidth.ts) so both
ShellGhostText and ShellInputHandler use the same logic.

Added setColumns() to ShellInputHandler; wired from DocumentDBShellPty's
open() and setDimensions() so terminal width is always in sync.
insertText() and replaceText() now use reRenderLine() instead of manual
ANSI echo sequences. This ensures syntax highlighting is applied when:
- Tab completion inserts text
- Ghost text is accepted
- Replacement completions (bracket notation, quoted field paths) are applied

rewriteCurrentLine() (used after showing a completion list) now delegates
to renderCurrentLine() which uses the colorize callback, so the re-drawn
buffer preserves highlighting.

Added renderCurrentLine() as a public method on ShellInputHandler for
PTY-controlled re-renders.
Marked all addressed issues as fixed with commit hashes in the summary
table. Added a Fix Log section documenting each commit's changes.
I-05 and I-13 deferred to follow-up work.
@tnaum-ms
Copy link
Copy Markdown
Collaborator Author

PR Review Fixes Applied

Addressed all actionable issues from the PR review and all 4 Copilot reviewer comments. All 398 shell tests pass, lint and prettier are clean.

Commits

Commit Issue(s) Summary
eba9286 I-01, C-02 Removed spurious space in regexpesc regex — was causing valid regex escapes like \( to mis-tokenize
cad3ae1 I-02 Fixed misleading comment about @name resolution (all references were inlined)
af0a427 I-03, C-01 Added cachedRules identity check to tokenizer cache
6c35419 I-04 Renamed input_indexOfinputIndexOf (camelCase convention)
50c0173 I-08 Documented key order dependency in actionCases objects
6c2e7e4 I-10, I-11 Made reRenderLine() wrap-aware (cursor-up, \x1b[J, row/col positioning). Extracted terminalDisplayWidth() to shared module; all cursor math now uses display width instead of String.length
63c8f4f I-12 Routed insertText(), replaceText(), and rewriteCurrentLine() through reRenderLine() so highlighting applies on every buffer mutation

Not addressed (deferred)

  • I-05 — Linear array scan for keyword lookup (not blocking; O(n) on 47 items is fine for <200 char input)
  • I-13 — Additional regression tests for wrapped lines, Unicode width, and completion redraw (separate PR)

@tnaum-ms tnaum-ms merged commit a430900 into rel/release_0.8.0 Apr 15, 2026
2 checks passed
@tnaum-ms tnaum-ms deleted the dev/tnaum/query-syntax-highlight-shell branch April 15, 2026 12:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

2 participants