Branch main
title: main branch
summary: "The only branch: the published MSL.Lexi regex lexer with the VocabularyBuilder API, maximal-munch scanning, and Math and SQL-like sample parsers."
tags: [lexi, branch, main, lexer, tokenizer, regex, csharp]
created: 2026-06-24
status: draft
dotnet: [net6.0, net7.0, net8.0]
build-status: builds
`main` is the only branch and contains the whole project: a regex-driven, allocation-light lexer for .NET. You declare a vocabulary of Match and Ignore regex patterns mapped to integer token ids with `VocabularyBuilder`, then call `Lexer.NextMatch` to pull tokens left to right using maximal munch: the longest match wins, with the lowest index breaking ties. It is built to feed simple recursive-descent parsers, as demonstrated by the bundled math and predicate sample parsers and their REPLs.
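The maximal-munch rule described above can be illustrated with a small standalone sketch. This is not the MSL.Lexi API: `Rule`, `Token`, `Keep`, `Skip`, and `Tokenize` are hypothetical names built directly on `System.Text.RegularExpressions`, and the real library's ref-struct types and signatures may differ. The sketch only shows the selection rule: try every pattern at the current offset, keep the longest match, and let the earliest-declared pattern win a tie.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in types for illustration; NOT the MSL.Lexi types.
public readonly record struct Rule(int Id, Regex Regex, bool Ignore);
public readonly record struct Token(int Id, string Text, int Offset);

public static class MaximalMunchSketch
{
    // \G anchors the pattern at the offset passed to Regex.Match,
    // so a rule can only match exactly where the scan currently is.
    private static Regex Anchor(string pattern) =>
        new Regex(@"\G(?:" + pattern + ")", RegexOptions.CultureInvariant);

    // A rule whose matches are emitted as tokens with the given id.
    public static Rule Keep(int id, string pattern) => new(id, Anchor(pattern), false);

    // A rule whose matches are consumed but not emitted (e.g. whitespace).
    public static Rule Skip(string pattern) => new(-1, Anchor(pattern), true);

    public static List<Token> Tokenize(string text, IReadOnlyList<Rule> rules)
    {
        var tokens = new List<Token>();
        int offset = 0;
        while (offset < text.Length)
        {
            int bestLength = 0;
            int bestRule = -1;
            for (int i = 0; i < rules.Count; i++)
            {
                var m = rules[i].Regex.Match(text, offset);
                // Only a strictly longer match replaces the current best,
                // so the lowest rule index wins a tie. Empty matches are
                // ignored, which guarantees the scan always advances.
                if (m.Success && m.Length > bestLength)
                {
                    bestLength = m.Length;
                    bestRule = i;
                }
            }
            if (bestRule < 0)
                throw new FormatException($"No rule matches at offset {offset}.");
            if (!rules[bestRule].Ignore)
                tokens.Add(new Token(rules[bestRule].Id, text.Substring(offset, bestLength), offset));
            offset += bestLength;
        }
        return tokens;
    }
}
```

With the rules `Keep(0, "if")`, `Keep(1, @"[A-Za-z_]\w*")`, `Keep(2, @"\d+")`, and `Skip(@"\s+")`, the input `if iffy 42` yields three tokens: `if` with id 0 (the keyword and identifier rules tie at length 2, and the lower index wins), `iffy` with id 1 (the identifier match is longer than the keyword match), and `42` with id 2.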
The tokenization pipeline described in Architecture (source: Lexi):
- `Lexer` and `VocabularyBuilder`: the scanner and its fluent configuration.
- `Pattern`, `Symbol`, `Source`, `MatchResult`: the ref-struct token and scanning types.
- `CommonPatterns`: the reusable identifier, literal, and whitespace regexes.
Builds clean with the installed .NET SDK: `dotnet build` restored and compiled all three target frameworks (net6.0, net7.0, net8.0) with no warnings or errors. Tests were not run. Last touched June 2024. The project is finished and stable: a published NuGet package (MSL.Lexi v2.2.2) with a test suite, CI workflows, and two working sample parsers. The one loose end is a `// todo` in `CommonPatterns.CharacterLiteral` noting that the char-literal pattern does not yet handle escape sequences.
- Architecture — the tokenization pipeline this branch implements.