canndrew/malk-lexer

malk-lexer

A Unicode lexer for use as a first pass when writing a parser.

The main function exported by this library is lex, which takes a &str and a table of valid symbols and converts the input string to a token tree.

The kinds of token recognized by the lexer are:

  • Idents: A string starting with an XID_Start character followed by a sequence of XID_Continue characters.
  • Whitespace: Any sequence of whitespace characters.
  • Brackets: Any opening bracket character, its corresponding closing bracket, and the tokens in between, returned as a sub-tree.
  • Symbols: Any string that appears in the symbol table provided to lex.
  • Strings: A string enclosed in either " or ', which may contain escaped characters.
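
The token kinds above can be illustrated with a small, self-contained sketch. This is not the malk-lexer API: the Token enum, lex_sketch, and the helper functions are hypothetical names invented for illustration, the bracket set is limited to (), [] and {}, and XID_Start / XID_Continue are only approximated with char::is_alphabetic and char::is_alphanumeric from the standard library. It assumes the longest matching symbol wins and that brackets nest into a sub-tree.

```rust
// Hypothetical token type for illustration only; not the real malk-lexer types.
#[derive(Debug, PartialEq)]
enum Token {
    Ident(String),
    Whitespace(String),
    Symbol(String),
    Str(String),
    // Opening bracket character plus the tokens found before its closer.
    Bracket(char, Vec<Token>),
}

// Assumption: only ASCII bracket pairs are handled in this sketch.
fn closing(open: char) -> Option<char> {
    match open {
        '(' => Some(')'),
        '[' => Some(']'),
        '{' => Some('}'),
        _ => None,
    }
}

// Std-only approximations of XID_Start and XID_Continue.
fn is_ident_start(c: char) -> bool {
    c.is_alphabetic()
}
fn is_ident_continue(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

// Advance `pos` past every leading character that satisfies `pred`.
fn take_while(src: &str, pos: &mut usize, pred: impl Fn(char) -> bool) {
    while let Some(c) = src[*pos..].chars().next() {
        if !pred(c) {
            break;
        }
        *pos += c.len_utf8();
    }
}

// Longest non-empty entry of the symbol table that prefixes `rest`.
fn longest_symbol<'a>(rest: &str, symbols: &[&'a str]) -> Option<&'a str> {
    symbols
        .iter()
        .copied()
        .filter(|s| !s.is_empty() && rest.starts_with(*s))
        .max_by_key(|s| s.len())
}

// Lex until `close` is consumed (or until end of input when `close` is None).
fn lex_until(src: &str, pos: &mut usize, symbols: &[&str], close: Option<char>) -> Result<Vec<Token>, String> {
    let mut out = Vec::new();
    while let Some(c) = src[*pos..].chars().next() {
        let start = *pos;
        if Some(c) == close {
            *pos += c.len_utf8();
            return Ok(out);
        } else if let Some(end) = closing(c) {
            *pos += c.len_utf8();
            let inner = lex_until(src, pos, symbols, Some(end))?;
            out.push(Token::Bracket(c, inner));
        } else if c.is_whitespace() {
            take_while(src, pos, char::is_whitespace);
            out.push(Token::Whitespace(src[start..*pos].to_string()));
        } else if is_ident_start(c) {
            *pos += c.len_utf8();
            take_while(src, pos, is_ident_continue);
            out.push(Token::Ident(src[start..*pos].to_string()));
        } else if c == '"' || c == '\'' {
            *pos += c.len_utf8();
            let mut s = String::new();
            loop {
                let ch = src[*pos..].chars().next().ok_or("unterminated string")?;
                *pos += ch.len_utf8();
                if ch == c {
                    break;
                }
                if ch == '\\' {
                    let esc = src[*pos..].chars().next().ok_or("unterminated escape")?;
                    *pos += esc.len_utf8();
                    // Only \n and \t are translated; other escapes yield the character itself.
                    s.push(match esc { 'n' => '\n', 't' => '\t', other => other });
                } else {
                    s.push(ch);
                }
            }
            out.push(Token::Str(s));
        } else {
            match longest_symbol(&src[*pos..], symbols) {
                Some(sym) => {
                    *pos += sym.len();
                    out.push(Token::Symbol(sym.to_string()));
                }
                None => return Err(format!("unexpected character {:?} at byte {}", c, *pos)),
            }
        }
    }
    match close {
        Some(end) => Err(format!("missing closing {:?}", end)),
        None => Ok(out),
    }
}

// Entry point of the sketch: a &str and a symbol table in, a token tree out.
fn lex_sketch(src: &str, symbols: &[&str]) -> Result<Vec<Token>, String> {
    let mut pos = 0;
    lex_until(src, &mut pos, symbols, None)
}
```

In this sketch, calling lex_sketch("foo(bar)", &[]) produces an Ident followed by a Bracket whose sub-tree holds the inner Ident. Identifiers are tried before symbols, so a symbol table entry that looks like an identifier would never match here; a real lexer may well order these checks differently.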

Patches welcome!

About

A Unicode lexer written in Rust
