Char class in comment gives m4 "ERROR: end of file in string" #553
Comments
Do you get the same error if you misspell a character class name anywhere
else in your lexer?
…-Joe
On Fri, Feb 17, 2023, 15:36 Alex Mohr ***@***.***> wrote:
The following input to flex 2.6.4 gives an m4 error:
%%
A { return 'A'; }
/*
* Bug: [[:alnum:]_]
*/
%%
> flex bug.ll
/bin/m4:stdin:1315: ERROR: end of file in string
The error disappears if I remove the underscore character from the
comment, like * Bug: [[:alnum:]]
|
Well, the contents of a comment should not affect the output. But FWIW using that char class in the grammar works fine:
(If I add the underscore back into the comment like [[:alnum:]_] the error returns.) |
Okay, that's weird. I notice you named the file .ll. Is that for c++?
(Shouldn't matter, but weird is weird.) Any command line switches or other
options needed to reproduce this?
…On Fri, Feb 17, 2023, 17:58 Alex Mohr ***@***.***> wrote:
Well, the contents of a comment should not affect the output. But FWIW
using that char class in the grammar works fine:
%%
[[:alnum:]_] { return 'Z'; }
/*
* Bug: [[:alnum:]]
*/
%%
(If I add the underscore back into the comment like [[:alnum:]_] the
error returns.)
|
No, that's just what the file happens to be named. I renamed it to bug.l. I'm invoking flex with no arguments other than the input file. I built flex 2.6.4 into /usr/local by
And just to say it, this isn't just a spurious error; flex's output is truncated and invalid. I can work around it by modifying my comments, but it seems like a bona fide bug in flex's comment handling, so I wanted to report it. |
Here's a slightly more reduced repro. This fails:
This works:
|
There is something crucial about having additional characters after the |
First, sorry for saying [[:alnum:]_] was a misspelling. I was holding on to a false notion that character class names in flex included the outer square brackets, probably because of the next thing.
Second, I found the problem but I can't fix it right now. It's peculiar to comment handling, as you noticed. Flex wraps comments in its customized M4 quotes, which happen to be [[ and ]]. Because the character classes aren't being scanned and replaced in the comments, M4 reads the brackets around them as quotation marks. This is usually okay when they are balanced (e.g. [[:alnum:]]). It leads to the error you saw when they look like unbalanced quotes (e.g. [[:alnum:]_]). Options:
Sorry the comment quoting makes this edge case complicated. |
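To make the balanced-versus-unbalanced point concrete, here is a minimal sketch in Python of how a scanner with nested [[ and ]] quote delimiters would count the two comment strings from this thread. It is a simplified model for illustration only, not m4's or flex's actual code; the function name unclosed_quote_depth is hypothetical.

```python
def unclosed_quote_depth(text, lquote="[[", rquote="]]"):
    """Return how many quotes are still open after scanning text.

    Simplified model of m4-style nested quoting (an assumption for
    illustration, not the real m4 or flex implementation).
    """
    depth = 0
    i = 0
    while i < len(text):
        if depth > 0 and text.startswith(rquote, i):
            # Inside a quote, a closing delimiter ends one nesting level.
            depth -= 1
            i += len(rquote)
        elif text.startswith(lquote, i):
            # An opening delimiter starts (or nests) a quote.
            depth += 1
            i += len(lquote)
        else:
            i += 1
    return depth


for comment in ("[[:alnum:]]", "[[:alnum:]_]"):
    print(comment, "->", unclosed_quote_depth(comment))
```

Under this model the first string ends at depth 0 and the second at depth 1: in [[:alnum:]_] the underscore separates the two closing brackets, so no ]] ever appears and the quote stays open until end of input, which matches the "end of file in string" message.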
No worries -- thanks for taking a look. I can easily work around this. For what it's worth, this example works in version 2.5.39, so the bug was introduced somewhere between 2.5.39 and 2.6.4. |
Drat! Now it's a bug instead of an oddity.
|
Think I found it. Mainly for my reference when writing a test & patch: we aren't escaping m4qstart and m4qend in the COMMENT_DISCARD condition the same way we are in COMMENT. I think that's the source of this. I'll write tests based on the cases above, thanks for those! |
Nope, none of that worked. @gitamohr, exactly what example did you test in 2.5.39? I'm trying to reproduce a working test from your comments above and finding no differences between 2.5.39 and HEAD. %% Flex accepts g but dies on h. Here's what's up: However! The following construction works for long comments in 2.5.39 and HEAD: i {; } /* Outcomes: |
I just tried the shortest example from above:
cheers! |
In this thread: I show myself to be an idiot. I have my trusty, old "2.5.39" folder connected to the 2.6.4 tag for some reason. Beg your pardon. Be back with better results shortly. |
Well, I'm back where we started. I see the issue, but I can't fix it for a while.
Flex sees the comment after A's action as a "CODE_COMMENT". Those aren't m4 quoted the same way as other comments because quoting them causes other problems. Until we get rid of the m4 dependency, I can't change the behavior back to what you came to expect in 2.5.39 without breaking other functionality. That said, you can use the constructions I provided above instead. I'm about done with the tests for them so we'll notice before losing any more comment functionality. |
Yep no worries, as I've mentioned this is no real impediment; just something I noticed. |
fixed by #557 |
Any idea if https://stackoverflow.com/questions/78157667/error-end-of-file-in-string-error-coming-from-m4-when-using-flex is related to this? |