Keeping error messages relevant while parsing with GenericToken and IgnoreEOL = true #177

Closed
mcclown opened this issue May 23, 2020 · 12 comments

Comments


mcclown commented May 23, 2020

Hi, great project, thanks for creating it.

I'm implementing a parser for a language where statements are delimited by EOL.

I've set Lexer(IgnoreEOL), added a GenericToken.SugarToken("\r\n") to my enum and handled EOL appropriately in my Parser class.

I'm hitting an issue when there are errors in the ParseResult. The errors treat my input code as if it had no line breaks: everything is reported on line 0, as though all lines had been concatenated into one. Is there a better way to handle this, so that the error messages are still possible to follow?

Here are some snippets from my code, to show what I've done.

    public class Parser
    {

        [Production("sequence: statement*")]
        public void Sequence(List<TurnipRoot> statements)
        { 
        }

        [Production("statement: declaration (EOL)+ [d]")]
        [Production("statement: rule (EOL)+ [d]")]
        [Production("statement: override (EOL)+ [d]")]
        public void Statement(TurnipRoot statement)
        {

        }

...

    [Lexer(IgnoreEOL = false)]
    public enum Token 
    {
        [Lexeme(GenericToken.SugarToken, "\r\n")]
        EOL,
...
Owner

b3b00 commented May 25, 2020

That is quite normal: if you choose to manage EOL yourself, CSLY's lexer has no way to know that it has hit a new line and so never increments the line counter. Indeed, it "thinks" that all the source code is one line.
One way to manage it would be to tag EOL tokens as end-of-line tokens. The lexer would then be able to increment the line counter and, perhaps in some hackier way, maintain the column counter.

For your language this would be something like:

    [Lexer(IgnoreEOL = false)]
    public enum Token 
    {
        [Lexeme(GenericToken.SugarToken, "\r\n", isEndOfLine:true)]
        EOL,

What do you think about it?

This is a major change to the lexer (both Generic and Regex should be modified).
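
To make the line-counter idea above concrete, here is a minimal sketch of the bookkeeping involved. It is not CSLY code: `LineTracker` and `Locate` are hypothetical names, and the sketch only assumes that the end-of-line marker is a known string such as "\r\n" and that positions are zero-based.

```csharp
using System;

// Hypothetical helper, not part of CSLY: maps a character offset to a
// zero-based (line, column) pair by counting the end-of-line markers
// that end at or before that offset.
public static class LineTracker
{
    public static (int Line, int Column) Locate(string source, int offset, string eol = "\r\n")
    {
        if (string.IsNullOrEmpty(eol))
        {
            throw new ArgumentException("The end-of-line marker must not be empty.", nameof(eol));
        }

        int line = 0;       // number of complete lines before the offset
        int lineStart = 0;  // offset of the first character of the current line

        while (true)
        {
            int next = source.IndexOf(eol, lineStart, StringComparison.Ordinal);
            // Stop when there is no further marker, or when the next marker
            // ends after the offset we are trying to locate.
            if (next < 0 || next + eol.Length > offset)
            {
                break;
            }
            line++;
            lineStart = next + eol.Length;
        }

        return (line, offset - lineStart);
    }
}
```

A lexer that treats EOL as an ordinary token never performs the `line++` step, which is why every error ends up on line 0.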

Owner

b3b00 commented May 25, 2020

I've started looking at it, but I don't have much time right now, so you'll probably have to wait a little.

Owner

b3b00 commented May 25, 2020

In fact, there is already an IsLineEnding parameter on the Lexeme attribute, but it does not manage the line counter correctly.

Owner

b3b00 commented May 25, 2020

You can start testing a fix on branch feature/#177-error-messages-and-EOL-tokens.
It manages line numbers, but column numbers still need some additional work.

Your lexer should look like:

    [Lexer(IgnoreEOL = false)]
    public enum Token 
    {
        [Lexeme(GenericToken.SugarToken, "\r\n", IsLineEnding = true)]
        EOL,

Author

mcclown commented May 25, 2020

Thanks mate! That really helps with working through errors in my parser. I'll test it out.

@mcclown mcclown changed the title Keeping error messages relevant white parsing with GenericToken and IgnoreEOL = true Keeping error messages relevant while parsing with GenericToken and IgnoreEOL = true May 25, 2020
Owner

b3b00 commented May 25, 2020

@mcclown, I've just pushed a better fix for line and column computation when EOLs are not ignored. You can check it on branch feature/#177-error-messages-and-EOL-tokens.

I will wait for your approval to close the issue, as I don't have a "real world" parser to check it completely. By the way, would you mind sharing your parser? I am always interested in the ways CSLY is used.

Author

mcclown commented May 25, 2020

That's working as expected, thank you for the quick turnaround. I'll send you an email with some details of my parser.

@mcclown mcclown closed this as completed May 25, 2020
Author

mcclown commented May 25, 2020

Whoops, didn't mean to close that until the fix had been merged. Sorry!

@mcclown mcclown reopened this May 25, 2020
Author

mcclown commented May 25, 2020

Just noticed one thing: the line numbers are 0-indexed, i.e. an error on line 7 is reported as being on line 6.

Owner

b3b00 commented May 26, 2020

Yes indeed, the generic lexer is 0-based (both lines and columns). I think that's not an issue, and a CSLY client can manage the shift easily if needed.
Now merging.
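
For anyone who wants 1-based positions in their messages, the shift can be done on the client side. The following is only a sketch: `PositionFormatter` and `ToDisplay` are hypothetical names, not part of CSLY, and the only assumption is that the position arrives as a zero-based line and column.

```csharp
// Hypothetical client-side helper, not part of CSLY: converts a zero-based
// (line, column) position into the one-based form most editors display.
public static class PositionFormatter
{
    public static string ToDisplay(int zeroBasedLine, int zeroBasedColumn)
    {
        // Add 1 to each coordinate so that the first line and the first
        // column are both reported as 1.
        return $"line {zeroBasedLine + 1}, column {zeroBasedColumn + 1}";
    }
}
```

With this, the error reported on line 6 in the case above would be displayed as line 7 via `PositionFormatter.ToDisplay(6, 0)`.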

@b3b00 b3b00 closed this as completed May 26, 2020
b3b00 added a commit that referenced this issue May 26, 2020
b3b00 added a commit that referenced this issue May 26, 2020
…om:b3b00/csly into feature/#177-error-messages-and-EOL-tokens

* 'feature/#177-error-messages-and-EOL-tokens' of github.com:b3b00/csly:
  bugfix #175 :  column counter
b3b00 added a commit that referenced this issue May 26, 2020
Owner

b3b00 commented May 26, 2020

AppVeyor is failing on this branch for some mysterious reason. I will look at it, but for now the only way to get the fix is to use a CSLY clone. Sorry for the inconvenience.

Owner

b3b00 commented May 26, 2020

@mcclown, a new NuGet package is available as 2.6.0.
