assembler: parsing error for identifiers starting with "INF" #54

Closed
skochinsky opened this issue Nov 30, 2015 · 2 comments

@skochinsky

Simple example:
.class A
.super java/lang/Object

  .field public static final INFO_TYPE_SERIAL_NUMBER S

  .field public static final INFO_TYPE_SUBJECT S

  .field public static final INFO_TYPE_SUBJECT_ALTERNATIVE_NAME S

  .field public static final INFO_TYPE_SUBJECT_RAW S

.end class

produces:

Syntax error at line 4: unexpected token u'INF'
Expected: SYNTHETIC, OP_FIELD, OP_INT, OP_CLASS, TOP, OBJECT, OP_WIDE, OP_LOOKUPSWITCH, OP_CLASS_INT, OP_LBL, PROTECTED, STATIC, SAME, METHODTYPE, OP_NONE, SAME_EXTENDED, NULL, PARAMETER, FINAL, LOCALS, WORD, DEFAULT, INVOKEDYNAMIC, SAME_LOCALS_1_STACK_ITEM_EXTENDED, UTF8, CPINDEX, OP_METHOD_INT, PRIVATE, CHOP, TO, APPEND, INTEGER, ARRAY, STACK, FULL, STRING, OP_DYNAMIC, IS, ENUM, UNINITIALIZEDTHIS, METHOD, FIELD, OP_LDC2, OP_TABLESWITCH, OP_LDC1, METHODHANDLE, OP_METHOD, SAME_LOCALS_1_STACK_ITEM, UNINITIALIZED, FROM, STRING_LITERAL, INT, INTERFACEMETHOD, FLOAT, OP_INT_INT, OP_NEWARR, CLASS, TRANSIENT, VOLATILE, DOUBLE, USING, LONG, PUBLIC, NAMEANDTYPE
Found: DOUBLE_LITERAL
Current stack: [$end, sep, classwithends, version_opt, class_directive_lines, classdec, superdec, interfacedecs, class_directive_lines, topitems, LexToken(D_FIELD,u'.field',5,38), fflags, LexToken(FINAL,u'final',5,59)]

The following change in tokenize.py seems to fix it:

float_base = r'''(?:
    [Nn][Aa][Nn]|                                       #Nan
    [-+]?(?:                                            #Inf and normal both use sign
        [Ii][Nn][Ff]\b|                                   #Inf
        \d+\.\d*(?:[eE][+-]?\d+)?|                         #decimal float
        \d+[eE][+-]?\d+|                                   #decimal float with no fraction (exponent mandatory)
        0[xX][0-9a-fA-F]*\.[0-9a-fA-F]+[pP][+-]?\d+        #hexadecimal float
        )
    )
'''

(The only change is the \b added after [Ii][Nn][Ff], on line 4 of the pattern.)
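
To illustrate why the word boundary matters, here is a minimal standalone sketch that compiles the pattern with and without the \b and tries both on one of the field names from the example. It assumes the pattern is compiled with re.VERBOSE (suggested by the inline comments, but not confirmed here); build_float_base and matched_prefix are hypothetical helpers, not functions from tokenize.py.

```python
import re

def build_float_base(inf_boundary):
    # inf_boundary=True appends \b after Inf, mirroring the proposed change.
    inf = r'[Ii][Nn][Ff]' + (r'\b' if inf_boundary else '')
    return r'''(?:
    [Nn][Aa][Nn]|                                       # NaN
    [-+]?(?:                                            # Inf and normal both use sign
        ''' + inf + r'''|                               # Inf
        \d+\.\d*(?:[eE][+-]?\d+)?|                      # decimal float
        \d+[eE][+-]?\d+|                                # decimal float, exponent mandatory
        0[xX][0-9a-fA-F]*\.[0-9a-fA-F]+[pP][+-]?\d+     # hexadecimal float
        )
    )
'''

# Assumption: the real tokenizer compiles this pattern in verbose mode.
original = re.compile(build_float_base(False), re.VERBOSE)
patched = re.compile(build_float_base(True), re.VERBOSE)

def matched_prefix(pattern, text):
    # Return the text matched at the start of the input, or None.
    m = pattern.match(text)
    return m.group(0) if m else None

print(matched_prefix(original, 'INFO_TYPE_SERIAL_NUMBER'))  # INF
print(matched_prefix(patched, 'INFO_TYPE_SERIAL_NUMBER'))   # None
print(matched_prefix(patched, '-Inf'))                      # -Inf
print(matched_prefix(patched, 'NAN_VALUE'))                 # NAN
```

Without the boundary, the float pattern consumes the first three characters of the identifier, which is consistent with the DOUBLE_LITERAL token reported above. The last line suggests that the NaN branch, which has no \b in the quoted pattern, would still match the start of a name such as NAN_VALUE; whether that surfaces as a parse error depends on how the tokenizer orders its rules.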

@Storyyeller
Owner

Good catch. Should I change the way infinity/nan are represented in the next version to avoid the ambiguity?

@Storyyeller
Owner

In the next version, I'm planning to require that Infinity/NaN begin with a sign to avoid the ambiguity. I'm also requiring that all tokens are whitespace separated (which was supposed to be true already, but apparently Ply doesn't work that way).
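
As a rough sketch of how those two rules could remove the ambiguity (this is an illustration, not the actual implementation in the next version): split each line on whitespace first, then accept a special float only when the whole token is a signed Infinity or NaN. The exact spellings accepted, the token kind names, and the classify/tokenize helpers are all assumptions.

```python
import re

# Assumption: special floats are spelled exactly "Infinity" or "NaN"
# and must carry an explicit sign.
SPECIAL_FLOAT = re.compile(r'[-+](?:Infinity|NaN)')

def classify(token):
    # fullmatch requires the entire token to match, so a prefix such as
    # "INF" inside an identifier can never be split off as its own token.
    return 'FLOAT_SPECIAL' if SPECIAL_FLOAT.fullmatch(token) else 'OTHER'

def tokenize(line):
    # Tokens are delimited purely by whitespace.
    return [(classify(tok), tok) for tok in line.split()]

print(tokenize('.field public static final INFO_TYPE_SUBJECT S'))
print(tokenize('+Infinity -NaN Infinity'))
```

Under these assumptions INFO_TYPE_SUBJECT stays a single ordinary token, and an unsigned Infinity is treated as an ordinary word rather than a float.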
