How to do good syntax error handling? #76

Closed
joagre opened this issue Mar 15, 2024 · 2 comments

joagre commented Mar 15, 2024

Hi,
I can use special rules to catch common errors and report the row they occur on. I keep track of rows and store the count in auxil:

_ <- (WS / Comments)*
__ <- (WS / Comments)+
WS <- [ \t\r\n] {
    if ($0[0] == '\n') {
        auxil->row++;
    }
}
Comments <- SingleLineComment / BlockComment
SingleLineComment <- "//" (!EOL .)* EOL?
EOL <- ("\r\n" / "\n" / "\r") { auxil->row++; }
BlockComment <- "/*" (BlockCommentContent / EOL)* "*/"
BlockCommentContent <- (!("*/" / EOL) .)

I can then use a special rule to catch a common error, e.g.

Block <- e:Expr { $$ = CN(BLOCK, 1, e); } ( _ CommaSeparator _ e:Expr { AC($$, e); })*
CommaSeparator <- ("," / ";") {
    if (strcmp($0, ";") == 0) {
        fprintf(stderr, "%d: Use ',' to separate expressions in blocks\n", auxil->row);
    }
}

But with unexpected syntax errors everything breaks down and I cannot point out which row the error occurred on.

As a workaround I added the following:

    static int ROW = 1;

    static int satie_getchar(satie_auxil_t* _auxil) {
        int c = getchar();
        if (c == '\n') {
            ROW++;
        }
        return c;
    }

    static void satie_error(satie_auxil_t* auxil) {
        panic("Syntax error near line %d", ROW);
    }

It works and I have re-invented awk-like error handling. :-) It's crude though.
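The same reader-side counter can carry a column as well. Below is a minimal sketch, assuming the reader callback sees every input byte exactly once and that a tab or a byte of a multibyte character counts as one column. `read_pos_t`, `read_pos_init()` and `read_pos_advance()` are hypothetical names for illustration, not part of PackCC.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper type for illustration; not part of PackCC. */
typedef struct {
    int row; /* 1-based line of the next byte to be read */
    int col; /* 1-based column of the next byte to be read */
} read_pos_t;

static void read_pos_init(read_pos_t *p) {
    p->row = 1;
    p->col = 1;
}

/* Call once for every byte handed to the parser; returns c unchanged
 * so it can wrap the reader call directly. */
static int read_pos_advance(read_pos_t *p, int c) {
    assert(p != NULL);
    if (c == '\n') {
        p->row++;
        p->col = 1;      /* a new line starts at column 1 */
    } else if (c != EOF) {
        p->col++;        /* every other byte advances one column */
    }
    return c;
}
```

In the workaround above this would replace the `ROW` counter: the reader returns `read_pos_advance(&pos, getchar())` and the error handler prints `pos.row` and `pos.col`. The position reported is how far the reader has got, which can be ahead of the token that actually caused the failure.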

Ideally I would like to point out syntax errors very precisely with both row and column info.

I haven't been able to figure out how to do that. Any hints?

Cheers
/Joakim

Owner

arithy commented Apr 14, 2024

The TinyC example may help with counting rows and columns precisely.
It uses a customized PCC_GETCHAR() macro together with the text reader function system__read_source_file(). While fetching bytes from the input text, this function records the byte position of each line break by calling append_line_head_(). The parsing position in the input text can be obtained from the predefined variables $0s and $0e (see README.md). The function compute_line_and_column_() then computes the row and column numbers from the line break positions and the parsing position. If multibyte characters need not be supported, the code below

count_characters_(obj->source.text.p, obj->source.line.p[i - 1], pos) + 1

can be simplified to

pos - obj->source.line.p[i - 1] + 1

If multibyte characters are not a concern, the input text need not be kept in memory as the example does. For error reporting, the example uses system__handle_syntax_error().
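The approach described above can be sketched roughly as follows. This is not the TinyC code itself: `line_table_t` and the `line_table_*` functions are hypothetical names, only `'\n'` is treated as a line break (so "\r\n" works but a lone "\r" does not), and columns are counted in bytes. The `pos` argument is assumed to be a byte offset into the input, such as the value of `$0s`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical table of line-start offsets; not a PackCC or TinyC type. */
typedef struct {
    size_t *head; /* head[i] = byte offset at which line i + 1 begins */
    size_t len;
    size_t cap;
} line_table_t;

static void line_table_push(line_table_t *t, size_t offset) {
    if (t->len == t->cap) {
        size_t cap = t->cap ? t->cap * 2 : 16;
        size_t *p = realloc(t->head, cap * sizeof *p);
        if (p == NULL) abort();
        t->head = p;
        t->cap = cap;
    }
    t->head[t->len++] = offset;
}

static void line_table_init(line_table_t *t) {
    t->head = NULL;
    t->len = 0;
    t->cap = 0;
    line_table_push(t, 0); /* line 1 always starts at offset 0 */
}

/* Call for each byte as it is read; offset is that byte's position. */
static void line_table_feed(line_table_t *t, int c, size_t offset) {
    if (c == '\n') line_table_push(t, offset + 1);
}

/* Convert a byte offset to a 1-based row and column by binary search
 * for the last line start that is <= pos. */
static void line_table_locate(const line_table_t *t, size_t pos,
                              size_t *row, size_t *col) {
    size_t lo = 0, hi = t->len;
    assert(t->len > 0);
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (t->head[mid] <= pos) lo = mid; else hi = mid;
    }
    *row = lo + 1;
    *col = pos - t->head[lo] + 1; /* the byte-based simplification */
}
```

Because only the line-start offsets are stored, the input text itself does not have to be kept once it has been read. Supporting multibyte characters would require keeping the text and counting characters between the line start and `pos` instead of subtracting offsets.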

Owner

arithy commented May 20, 2024

@joagre, I'm wondering whether my answer was what you wanted. If not, let me know.
I'll close this issue in a week if there is no reply. Feel free to reopen it if you need to.

arithy closed this as completed May 29, 2024