From ed28792504a57ae50efc2e53562320daee9387ca Mon Sep 17 00:00:00 2001
From: matt rice
Date: Sun, 3 May 2026 18:11:11 -0700
Subject: [PATCH] Add docs on lex syntax

---
 doc/src/SUMMARY.md   |  1 +
 doc/src/lexsyntax.md | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)
 create mode 100644 doc/src/lexsyntax.md

diff --git a/doc/src/SUMMARY.md b/doc/src/SUMMARY.md
index 8e73b3305..6df7be33e 100644
--- a/doc/src/SUMMARY.md
+++ b/doc/src/SUMMARY.md
@@ -7,6 +7,7 @@
     - [Extensions](lexextensions.md)
     - [Hand-written lexers](manuallexer.md)
     - [Start States](start_states.md)
+    - [Syntax](lexsyntax.md)
 - [Parsing](parsing.md)
     - [Yacc compatibility](yacccompatibility.md)
     - [Extensions](yaccextensions.md)
diff --git a/doc/src/lexsyntax.md b/doc/src/lexsyntax.md
new file mode 100644
index 000000000..fc7f445e9
--- /dev/null
+++ b/doc/src/lexsyntax.md
@@ -0,0 +1,33 @@
+The syntax of the lrlex `.l` format aims to be familiar to users of POSIX lex.
+It uses the same basic structure as lex:
+
+```
+Definitions
+%%
+Rules
+%%
+User Subroutines
+```
+
+## Definitions
+
+Within the definitions section, you can add an optional `%grmtools` section,
+documented in [Extensions](./lexextensions.md).
+
+## Rules
+
+Each rule consists of the following elements, in sequence:
+
+1. Optional start state: a name given between angle brackets, for example `<STATE>`, documented in [Start States](./start_states.md).
+2. Regex: the syntax is defined by the Rust [regex](https://docs.rs/regex) crate, with optional escaping for any character.
+3. Separator space: any horizontal space character in the Unicode `Pattern_White_Space` character set.
+4. Optional state operator: given between angle brackets, documented in [Start States](./start_states.md).
+   + `<+STATE>` pushes the state `STATE` onto the top of the stack.
+   + `<-STATE>` pops the state off the top of the stack.
+   + `<STATE>` replaces the state stack with the state `STATE`.
+5. Token name: any sequence of non-space characters between double quotes (the characters may themselves include double quotes), for example `"token"`.
+6. End of line: finishes the rule.
+
+## User Subroutines
+
+Since lrlex doesn't support user actions, it doesn't support User Subroutines either.
\ No newline at end of file