[Bug] Java.flex example file can't handle most negative int litera [sf#95] #97

Closed
lsf37 opened this Issue Feb 15, 2015 · 9 comments

lsf37 commented Feb 15, 2015

Reported by mcspanky on 2008-07-04 03:26 UTC
The examples/java/java.flex scanner can't handle the integer literal -2147483648, because it treats this as two tokens (a unary minus followed by a positive integer literal), and the second token is not a valid integer literal: it is too big to fit into an int.

This is the same representation problem that causes Math.abs(int) to return negative values when called with Integer.MIN_VALUE.
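
A minimal Java sketch of the representation problem described above. The class name `MinIntDemo` and the helper `fitsInInt` are illustrative only and are not part of the JFlex example.

```java
public class MinIntDemo {
    /** Returns true if the unsigned digit string parses as an int on its own. */
    static boolean fitsInInt(String digits) {
        try {
            Integer.parseInt(digits);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // 2147483648 is one past Integer.MAX_VALUE, so it only fits with the sign attached.
        System.out.println(fitsInInt("2147483648"));         // false
        System.out.println(Integer.parseInt("-2147483648")); // -2147483648
        // Negating MIN_VALUE overflows back to MIN_VALUE, hence the Math.abs quirk.
        System.out.println(Math.abs(Integer.MIN_VALUE) == Integer.MIN_VALUE); // true
    }
}
```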

@lsf37 lsf37 changed the title from Java.flex example file can't handle most negative int litera to [Bug] Java.flex example file can't handle most negative int litera [sf#95] Feb 15, 2015

@lsf37 lsf37 added this to the jflex bug milestone Feb 15, 2015

@lsf37 lsf37 closed this Feb 15, 2015


lsf37 commented Feb 15, 2015

Commented by lsf37 on 2008-07-27 07:21 UTC
Logged In: YES
user_id=93534
Originator: NO

This is now fixed in revision 381 of the repository.

I've added a "-"? to the definition of decimal, hex and oct literals. This only works if there is no whitespace between the minus and the digits, but my reading of the standard is that - 2147483648 (with a white space between the minus and the number) is to be lexed as unary minus and a (too large) int literal and should therefore be rejected.

Workaround for current release: add "-"? to the definition of the Dec, Hex, and Oct Literal macros.


lsf37 commented Feb 15, 2015

Updated by lsf37 on 2008-07-27 07:21 UTC

  • status: open --> open-fixed

lsf37 commented Feb 15, 2015

Commented by nobody on 2008-07-27 17:28 UTC
Logged In: NO

Will this still properly parse x = y-3 ? Wouldn't the RHS lex as two tokens, an identifier and a literal, with no operator?

In my project, I solved this by having the lexer return null for the too-big literal, and leave it up to the parser to turn unary-minus-then-literal into a literal, with special case code if the literal value is null.
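
The concern about `x = y-3` can be reproduced with a small longest-match scanner. This is a sketch under assumptions: the rule table and the class name `SignedLiteralDemo` are hypothetical stand-ins for the scanner spec, and `java.util.regex` is used in place of a generated JFlex scanner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SignedLiteralDemo {
    // Hypothetical rules mimicking the "-"? workaround: the minus is part of the literal.
    private static final String[][] RULES = {
        {"INT",   "-?[0-9]+"},
        {"IDENT", "[A-Za-z_][A-Za-z_0-9]*"},
        {"MINUS", "-"},
        {"EQ",    "="},
    };

    /** Longest-match scan; on ties the earlier rule wins. */
    static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) { pos++; continue; }
            String bestName = null;
            int bestEnd = pos;
            for (String[] rule : RULES) {
                Matcher m = Pattern.compile(rule[1]).matcher(input);
                m.region(pos, input.length());
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = rule[0];
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("unexpected character at " + pos);
            }
            tokens.add(bestName + "(" + input.substring(pos, bestEnd) + ")");
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The binary minus is swallowed by the literal, leaving no operator token.
        System.out.println(lex("x = y-3")); // [IDENT(x), EQ(=), IDENT(y), INT(-3)]
    }
}
```

Because the scanner always prefers the longest match, `-3` wins over a lone `-`, so the parser sees an identifier followed directly by a literal.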


lsf37 commented Feb 15, 2015

Commented by lsf37 on 2008-08-13 07:09 UTC
Logged In: YES
user_id=93534
Originator: NO

You are perfectly right, of course, this breaks normal parsing of binary minus.

I have reverted the previous "fix".

I'd propose to match the number 2147483648 explicitly and return -1 in this case. Since all other numbers must be non-negative, this should uniquely indicate Integer.MIN_VALUE if it occurs together with unary minus in the parser. (Very similar to what you suggest.)

Does this sound better?

Cheers,
Gerwin
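
A rough sketch of how the sentinel proposal above might look, split into a lexer-side and a parser-side helper. All names here (`SentinelDemo`, `literalValue`, `negate`, `plain`) are hypothetical. The sketch is limited to decimal literals, since hex and octal literals such as 0xFFFFFFFF can themselves denote -1 in Java.

```java
public class SentinelDemo {
    /** Lexer side: value of an unsigned decimal literal, or -1 as a sentinel for 2147483648. */
    static int literalValue(String digits) {
        if (digits.equals("2147483648")) {
            return -1; // sentinel: only legal as the operand of unary minus
        }
        return Integer.parseInt(digits); // throws NumberFormatException if too large
    }

    /** Parser side: fold "unary minus, literal" into a single value. */
    static int negate(int literalValue) {
        return literalValue == -1 ? Integer.MIN_VALUE : -literalValue;
    }

    /** Parser side: a literal that is not under unary minus must not be the sentinel. */
    static int plain(int literalValue) {
        if (literalValue == -1) {
            throw new NumberFormatException("integer literal too large");
        }
        return literalValue;
    }

    public static void main(String[] args) {
        System.out.println(negate(literalValue("2147483648"))); // -2147483648
        System.out.println(negate(literalValue("3")));          // -3
    }
}
```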


lsf37 commented Feb 15, 2015

Updated by lsf37 on 2008-08-13 07:10 UTC

  • status: open-fixed --> open-accepted

lsf37 commented Feb 15, 2015

Commented by mcspanky on 2008-08-13 10:55 UTC
Logged In: YES
user_id=461433
Originator: YES

Yes, that sounds great! Is this scanner used by an example CUP parser? Perhaps that parser should be modified as well.

Best,
Martin


lsf37 commented Feb 15, 2015

Commented by lsf37 on 2009-01-31 06:54 UTC
Hi Martin,

I've now fixed this in revision r383 in the repository. The solution of matching just 2147483648 separately didn't work out so nicely after all, because it leads to shift/reduce conflicts in the parser. I now just match "-2147483648" separately. This can only conflict with input like x -2147483648, which is not a valid expression either way: lexed as "x" "-" "2147483648" the literal is too big, and lexed as "x" "-2147483648" an operator is missing. This solution has the advantage that the (negative) number can be passed as an integer like any other literal.

Cheers,
Gerwin
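
The effect of matching "-2147483648" as its own token can be sketched with a small longest-match scanner. This is an illustration under assumptions, not the code from r383: the rule table and the class name `MinIntTokenDemo` are hypothetical, and `java.util.regex` stands in for the generated scanner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MinIntTokenDemo {
    // Hypothetical rules: only the exact text -2147483648 carries its sign into the literal.
    private static final String[] NAMES = {"INT", "INT", "IDENT", "MINUS", "EQ"};
    private static final Pattern[] PATTERNS = {
        Pattern.compile("-2147483648"),
        Pattern.compile("[0-9]+"),
        Pattern.compile("[A-Za-z_][A-Za-z_0-9]*"),
        Pattern.compile("-"),
        Pattern.compile("="),
    };

    /** Longest-match scan; on ties the earlier rule wins. */
    static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) { pos++; continue; }
            int best = -1;
            int bestEnd = pos;
            for (int i = 0; i < PATTERNS.length; i++) {
                Matcher m = PATTERNS[i].matcher(input);
                m.region(pos, input.length());
                if (m.lookingAt() && m.end() > bestEnd) {
                    best = i;
                    bestEnd = m.end();
                }
            }
            if (best < 0) {
                throw new IllegalArgumentException("unexpected character at " + pos);
            }
            tokens.add(NAMES[best] + "(" + input.substring(pos, bestEnd) + ")");
            pos = bestEnd;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lex("x = -2147483648")); // [IDENT(x), EQ(=), INT(-2147483648)]
        System.out.println(lex("x = y-3"));         // [IDENT(x), EQ(=), IDENT(y), MINUS(-), INT(3)]
        // No operator token is produced here, so a parser would reject the input.
        System.out.println(lex("x -2147483648"));   // [IDENT(x), INT(-2147483648)]
    }
}
```

Ordinary binary minus is unaffected because no other literal rule accepts a leading sign.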


lsf37 commented Feb 15, 2015

Updated by lsf37 on 2009-01-31 06:54 UTC

  • status: open-accepted --> open-fixed

lsf37 commented Feb 15, 2015

Updated by lsf37 on 2009-01-31 13:12 UTC

  • status: open-fixed --> closed

@lsf37 lsf37 added bug and removed bug labels Feb 17, 2015

@lsf37 lsf37 modified the milestone: jflex bug Feb 17, 2015

@regisd regisd added this to the 1.4.3 milestone Nov 4, 2017
