How to do good syntax error handling? #76

joagre · 2024-03-15T00:05:31Z

Hi,
I can use special rules to catch common errors and point out which row they occur on. I keep track of the current row and store it in auxil:

_ <- (WS / Comments)*
__ <- (WS / Comments)+
WS <- [ \t\r\n] {
    if ($0[0] == '\n') {
        auxil->row++;
    }
}
Comments <- SingleLineComment / BlockComment
SingleLineComment <- "//" (!EOL .)* EOL?
EOL <- ("\r\n" / "\n" / "\r") { auxil->row++; }
BlockComment <- "/*" (BlockCommentContent / EOL)* "*/"
BlockCommentContent <- (!("*/" / EOL) .)

I can then use a special rule to catch a common error, e.g.

Block <- e:Expr { $$ = CN(BLOCK, 1, e); } ( _ CommaSeparator _ e:Expr { AC($$, e); })*
CommaSeparator <- ("," / ";") {
    if (strcmp($0, ";") == 0) {
        fprintf(stderr, "%d: Use ',' to separate expressions in blocks\n", auxil->row);
    }
}

But with unexpected syntax errors everything breaks down and I cannot point out which row the error occurred on.

As a workaround I added the following:

    static int ROW = 1;

    static int satie_getchar(satie_auxil_t* _auxil) {
        int c = getchar();
        if (c == '\n') {
            ROW++;
        }
        return c;
    }

    static void satie_error(satie_auxil_t* auxil) {
        panic("Syntax error near line %d", ROW);
    }

It works and I have re-invented awk-like error handling. :-) It's crude though.

Ideally I would like to point out syntax errors very precisely with both row and column info.

I haven't been able to figure out how to do that. Any hints?

Cheers
/Joakim


arithy · 2024-04-14T07:20:32Z

The TinyC example might help you count rows and columns precisely.
It uses a customized PCC_GETCHAR() macro with the text reader function system__read_source_file(). While fetching bytes from the input text, this function records the byte position of each line break by calling append_line_head_(). The parsing position in the input text is available through the predefined variables $0s and $0e (see README.md). The function compute_line_and_column_() then computes the row and column numbers from the line break positions and the parsing position. If multibyte characters need not be supported, the code below

count_characters_(obj->source.text.p, obj->source.line.p[i - 1], pos) + 1

can be simplified to

pos - obj->source.line.p[i - 1] + 1

Unless multibyte characters are considered, the input text need not be kept in memory as the example does. For error reporting, the example uses system__handle_syntax_error().
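
The approach described above (record where each line starts while reading, then map a byte offset to a row and column) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the TinyC code itself: the names line_table_t, line_table_init, line_table_feed, line_table_locate and utf8_column are hypothetical, and the byte offset passed to line_table_locate is assumed to come from something like $0s or from a counter kept by the reader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper type (not part of PackCC or the TinyC example):
 * head[i] is the byte offset at which line i+1 starts. */
typedef struct {
    size_t *head;
    size_t len;
    size_t cap;
} line_table_t;

static int line_table_push(line_table_t *t, size_t offset) {
    if (t->len == t->cap) {
        size_t cap = t->cap ? t->cap * 2 : 16;
        size_t *p = realloc(t->head, cap * sizeof *p);
        if (p == NULL) return -1;  /* out of memory; table is unchanged */
        t->head = p;
        t->cap = cap;
    }
    t->head[t->len++] = offset;
    return 0;
}

/* Call once before reading: line 1 starts at byte offset 0. */
static int line_table_init(line_table_t *t) {
    t->head = NULL;
    t->len = 0;
    t->cap = 0;
    return line_table_push(t, 0);
}

/* Call for every byte handed to the parser, for example from a custom
 * PCC_GETCHAR. `offset` is the byte offset of `c` in the input. */
static int line_table_feed(line_table_t *t, int c, size_t offset) {
    return (c == '\n') ? line_table_push(t, offset + 1) : 0;
}

/* Map a byte offset to a 1-based line and a 1-based column in bytes.
 * Binary search for the last line head that is <= pos. */
static void line_table_locate(const line_table_t *t, size_t pos,
                              size_t *line, size_t *col) {
    assert(t->len > 0);  /* line_table_init must have been called */
    size_t lo = 0, hi = t->len;
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (t->head[mid] <= pos) lo = mid; else hi = mid;
    }
    *line = lo + 1;
    *col = pos - t->head[lo] + 1;
}

/* Column in characters for UTF-8 input: count the bytes between the line
 * head and pos that are not continuation bytes (10xxxxxx). This variant
 * needs the input text to be kept in memory. */
static size_t utf8_column(const char *text, size_t line_head, size_t pos) {
    size_t n = 0;
    for (size_t i = line_head; i < pos; i++) {
        if (((unsigned char)text[i] & 0xC0) != 0x80) n++;
    }
    return n + 1;
}
```

For the input "a, b;\nc d\n", feeding every byte and then locating byte offset 8 (the 'd') gives line 2, column 3. Because the table is filled at read time but queried with a parse position, the reported location follows the parser and not the read-ahead, which is the main weakness of counting newlines directly in the getchar hook.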

arithy · 2024-05-20T11:20:12Z

@joagre, did my answer give you what you were looking for? If not, let me know.
I'll close this issue in a week if there is no reply. Feel free to reopen it if you need to.

arithy closed this as completed May 29, 2024
