Syntax highlighting improvements #127
The current syntax highlighting solution uses a simple lexer to generate a stream of tokens. Supporting a new language means writing a new implementation of the lexer with behaviour specific to that language. Color values are also hard-coded right now.
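To make the shape of the problem concrete, here is a rough sketch of what a per-language lexer with hard-coded colors tends to look like. This is not Iota's actual code: `TokenKind`, `Token`, `Lexer`, `RustLexer`, and `color` are all hypothetical names, and the ANSI color numbers are arbitrary.

```rust
// Hypothetical sketch, not Iota's real implementation.

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum TokenKind {
    Keyword,
    Number,
    Ident,
    Other,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Token {
    pub kind: TokenKind,
    pub text: String,
}

/// Every supported language needs its own implementation of this trait.
pub trait Lexer {
    fn tokenize(&self, src: &str) -> Vec<Token>;
}

pub struct RustLexer;

impl Lexer for RustLexer {
    fn tokenize(&self, src: &str) -> Vec<Token> {
        // Assumed, deliberately tiny keyword list.
        const KEYWORDS: [&str; 4] = ["fn", "let", "mut", "return"];
        let chars: Vec<char> = src.chars().collect();
        let mut tokens = Vec::new();
        let mut i = 0;
        while i < chars.len() {
            let start = i;
            let c = chars[i];
            let kind = if c.is_alphabetic() || c == '_' {
                // Consume a whole word, then decide keyword vs. identifier.
                while i < chars.len() && (chars[i].is_alphanumeric() || chars[i] == '_') {
                    i += 1;
                }
                let word: String = chars[start..i].iter().collect();
                if KEYWORDS.contains(&word.as_str()) {
                    TokenKind::Keyword
                } else {
                    TokenKind::Ident
                }
            } else if c.is_ascii_digit() {
                while i < chars.len() && chars[i].is_ascii_digit() {
                    i += 1;
                }
                TokenKind::Number
            } else {
                // Whitespace and punctuation become single-character tokens.
                i += 1;
                TokenKind::Other
            };
            tokens.push(Token { kind, text: chars[start..i].iter().collect() });
        }
        tokens
    }
}

/// Colors baked into the code (ANSI foreground codes, chosen arbitrarily).
pub fn color(kind: TokenKind) -> u8 {
    match kind {
        TokenKind::Keyword => 33,
        TokenKind::Number => 36,
        TokenKind::Ident => 37,
        TokenKind::Other => 0,
    }
}
```

Calling `RustLexer.tokenize("let x = 42")` yields one token per word, number, or single other character. Both the language rules and the colors live in code, so changing either means recompiling.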
I'd like to look into a better solution. Regular expressions are one way this could go (syntect could be interesting). Another way would be to implement some kind of AST-based parser, which I think would allow for much more granular and meaningful highlighting.
Basically, I'm not sure what the best approach is, but I don't think the way it is now is it. The current solution works, but I want something better.
The comments in #88 will be useful for this.
I was researching this a while ago, while working on an implementation for issue #88. I just never took the time to actually do it. One thing that I did spend time on was researching and comparing implementations of grammar parsers for syntax highlighting.
Now it could be because I'm biased towards Textadept's way, but after looking over how Vim and Sublime Text have gone about it, I've come to the realization that using an AST parser defined by a PEG generator would be best. Right now there are two generators written in Rust that come to mind for this: rusty-peg and oak. Either of these would make it possible to write a lexer definition that is highly expressive, has leaner semantics, and is less of a resource hog for the editor at hand.
The reason I bring this up is ease of use: both defining a lexer for Iota and implementing the parser should be simple enough for contributors to refactor and modify. Using a PEG generator would make lexer definitions very easy to write. And I've noticed that PEG-generated lexers, like Scintillua's, can be less taxing on the editor at hand.
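For anyone unfamiliar with PEGs, here is a minimal hand-rolled sketch of the idea. It does not use rusty-peg or oak, and I'm not claiming their APIs look like this; `Peg`, `parse`, and `classify` are hypothetical names, and the grammar is a toy. The point is just that rules are plain data built from ordered choice, sequence, repetition, and lookahead.

```rust
// Hypothetical sketch of PEG-style matching; not the rusty-peg or oak API.

pub enum Peg {
    Lit(&'static str),   // literal string
    Range(char, char),   // one character in an inclusive range
    Seq(Vec<Peg>),       // all parts, in order
    Choice(Vec<Peg>),    // ordered choice: the first alternative that matches wins
    Many1(Box<Peg>),     // greedy one-or-more
    Not(Box<Peg>),       // negative lookahead: consumes nothing
}

impl Peg {
    /// Returns the number of bytes consumed if the expression matches
    /// at the start of `input`, or `None` if it does not match.
    pub fn parse(&self, input: &str) -> Option<usize> {
        match self {
            Peg::Lit(s) => {
                if input.starts_with(*s) { Some(s.len()) } else { None }
            }
            Peg::Range(lo, hi) => input
                .chars()
                .next()
                .filter(|c| *lo <= *c && *c <= *hi)
                .map(|c| c.len_utf8()),
            Peg::Seq(parts) => {
                let mut pos = 0;
                for p in parts {
                    pos += p.parse(&input[pos..])?;
                }
                Some(pos)
            }
            Peg::Choice(alts) => alts.iter().find_map(|a| a.parse(input)),
            Peg::Many1(inner) => {
                let mut pos = inner.parse(input)?;
                while let Some(n) = inner.parse(&input[pos..]) {
                    if n == 0 {
                        break; // guard against looping on empty matches
                    }
                    pos += n;
                }
                Some(pos)
            }
            Peg::Not(inner) => {
                if inner.parse(input).is_some() { None } else { Some(0) }
            }
        }
    }
}

/// Classifies the token at the start of `input` using a toy grammar.
/// Returns the highlight class and the number of bytes matched.
pub fn classify(input: &str) -> Option<(&'static str, usize)> {
    let ident_char = || {
        Peg::Choice(vec![Peg::Range('a', 'z'), Peg::Range('A', 'Z'), Peg::Lit("_")])
    };
    let rules = vec![
        // A keyword must not be followed by another identifier character.
        (
            "keyword",
            Peg::Seq(vec![
                Peg::Choice(vec![Peg::Lit("fn"), Peg::Lit("let")]),
                Peg::Not(Box::new(ident_char())),
            ]),
        ),
        ("number", Peg::Many1(Box::new(Peg::Range('0', '9')))),
        ("ident", Peg::Many1(Box::new(ident_char()))),
    ];
    rules
        .iter()
        .find_map(|(name, rule)| rule.parse(input).map(|n| (*name, n)))
}
```

With this toy grammar, `classify("let x")` matches the keyword rule, while `classify("letter")` falls through to the identifier rule because of the lookahead. A generator would let the same rules be written as a grammar instead of nested constructors, which is where the ease of definition comes from.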
In any case, here's an old talk about the power of PEG generators, which is still relevant to the issue at hand and can probably speak better to their advantages over a regex-based parser.