This extension adds support for an EBNF-like syntax (Extended Backus-Naur Form) to Visual Studio Code.
- Syntax highlighting + semantic highlighting
- Basic error checking
  - Syntax errors
  - Undefined symbols
  - Duplicate symbols
- Go to definition
- Find all references
- Document symbols (go to symbol, outline)
- Basic code completion
  - Rule names
- Hover information
  - Rule name and definition
- Code folding
- Railroad diagram generation
This extension implements a simple and strict-ish version of EBNF. The grammar is described in its own syntax in ebnf.ebnf.
Rules are defined using the assignment operator (=). The left-hand side is the rule name, and the right-hand side is an expression. Rules must end with a semicolon (;).
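As a minimal sketch of the rule shape described above (the rule names and literal are invented for illustration):

```ebnf
(* name on the left, expression on the right, semicolon at the end *)
greeting = "hello" ;

(* the expression can be a reference to another rule *)
message = greeting ;
```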
Rule names can start with any letter, digit, or underscore. They can also contain hyphens, but not as the first character. Rule names are case-sensitive.
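Going by the naming rules above, each of the following should be accepted as a separate rule. The names and literals are hypothetical and only meant to show the allowed shapes:

```ebnf
identifier = "a" ;
Identifier = "b" ;  (* distinct from identifier: names are case-sensitive *)
_private   = "c" ;  (* leading underscore *)
2nd-item   = "d" ;  (* leading digit, hyphen inside the name *)
```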
Expressions are made up of terms and operators. Terms are either literals, references to other rules (by name), special cases, groups, or ranges. Operators are used to combine terms into more complex expressions.
Literals are enclosed in single quotes or double quotes. They can contain any character except the quote character used to enclose them. There is no escape mechanism, so you can't use a single quote inside a single-quoted literal, or a double quote inside a double-quoted literal. How to interpret sequences like \n is up to the reader. Both literals and special cases can be multiline.
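A short sketch of the quoting rules above, with invented rule names. Because there is no escaping, the usual workaround is to pick the quote style that does not appear in the text:

```ebnf
quoted-word = 'say "hi"' ;  (* double quotes inside a single-quoted literal *)
contraction = "it's" ;      (* single quote inside a double-quoted literal *)
newline     = "\n" ;        (* stored as written; the meaning is left to the reader *)
banner      = "first line
second line" ;              (* a literal may span several lines *)
```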
Special cases are used to describe content that cannot be easily expressed using the other terms. They are enclosed in question marks (?) and can span multiple lines.
? any character ?
? valid UTF-8 ?
There are three different types of groups:
- Parentheses (group) are only used to group terms together.
- Brackets (optional) indicate that the content inside is optional, i.e. it can appear zero or one times.
- Braces (repetition) indicate that the content inside can appear zero or more times.
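The three group types can be sketched in one small grammar. The rule names are made up for illustration, and the example also uses the range, concatenation, and alternation syntax described in this document:

```ebnf
sign     = "+" | "-" ;
digit    = "0".."9" ;

(* optional sign, one digit, then zero or more further digits *)
integer  = [ sign ], digit, { digit } ;

(* parentheses only group: a digit followed by either suffix *)
measure  = digit, ( "px" | "em" ) ;
```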
Ranges are used to define a set of contiguous characters. They are composed of two strings joined by two dots (..).
The syntax does not define precisely what a range "is"; it should be obvious from the endpoints what the range represents. For example, "A".."Z" is probably the set of uppercase letters, while "0".."9" is probably the set of digits.
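As an illustration, ranges can be used like any other term, including inside larger expressions. The rule names here are hypothetical:

```ebnf
digit     = "0".."9" ;
upper     = "A".."Z" ;

(* a digit, or a letter from a to f in either case *)
hex-digit = digit | "a".."f" | "A".."F" ;
```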
The concatenation operator is the comma (,). It does not define what whitespace is allowed between terms; it is assumed that the reader knows what is and isn't allowed.
"A", "B", "C" (* probably "ABC" *)
"fn", name, "()" (* probably "fn foo()" *)
The alternation operator is the pipe (|). It is used to define a set of possible choices for a term.
"A" | "B" | "C" (* "A", "B", or "C" *)
"A", ( "B" | "C" ) (* "AB" or "AC" *)
The exclusion operator is the minus sign (-). It takes the set of possible choices on its left and excludes whatever is matched by the term on its right.
letter = "A".."Z" ;
not_z = letter - "Z" ; (* "A".."Y" *)