This repository has been archived by the owner on Sep 9, 2024. It is now read-only.

fix(huff_lexer): Bytes are being incorrectly lexed #15

Closed
clabby opened this issue Jun 2, 2022 · 2 comments
Labels
lexer Huff Lexer Changes.

Comments

@clabby
Member

clabby commented Jun 2, 2022

Overview

Right now (master @ 5fd1e32), the huff_lexer incorrectly lexes bytes / hex number tokens.

Failing Test

```rust
#[test]
fn parses_bytes() {
    let source = "0xDDF252AD1BE2C89B69C2B068FC378DAA952BA7F163C4A11628F55A4DF523B3EF";
    let mut lexer = Lexer::new(source);
    assert_eq!(lexer.source, source);

    // Try to parse the bytes token (assuming that it is a single token)
    let tok = lexer.next();
    let unwrapped = tok.unwrap().unwrap();
    let bytes_span = Span::new(0..source.len());
    assert_eq!(unwrapped, Token::new(TokenKind::Ident(source), bytes_span));
    assert_eq!(lexer.span, bytes_span);

    // We covered the whole source
    assert_eq!(lexer.span.end, source.len());
    assert!(lexer.eof);
    assert!(lexer.next().is_none());
}
```

Expected Result

tok = Token { kind: Ident("0xDDF252AD1BE2C89B69C2B068FC378DAA952BA7F163C4A11628F55A4DF523B3EF"), span: Span { start: 0, end: 66 } }

Or a dedicated TokenKind variant for bytes / hex tokens.

Actual Result

tok = Token { kind: Num(0), span: Span { start: 0, end: 1 } }

Suspected Reason for Issue

The lexer dyn_consumes all valid ASCII digits when it comes across what it thinks is a TokenKind::Num. When dyn_consume hits the second character of a bytes / hex number token, x, it stops, splitting the token into two parts: TokenKind::Num(0) and TokenKind::Ident("xDDF252AD1BE2C89B69C2B068FC378DAA952BA7F163C4A11628F55A4DF523B3EF"). Some extra logic needs to be added to this match arm in order to correctly lex these tokens.
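The split can be reproduced with a minimal standalone sketch (hypothetical helper functions, not the actual huff_lexer code or its dyn_consume API): a naive pass that consumes only ASCII digits stops at the x, while a fixed pass that special-cases a 0x prefix consumes the whole run of hex digits as one literal.

```rust
/// Simplified token kinds for the sketch (not huff_lexer's TokenKind).
#[derive(Debug, PartialEq)]
enum Tok<'a> {
    Num(&'a str),
    Hex(&'a str),
}

/// Naive: consume only ASCII digits, so "0xDD..." stops after the "0".
fn lex_naive(src: &str) -> (Tok<'_>, &str) {
    let end = src
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(src.len());
    (Tok::Num(&src[..end]), &src[end..])
}

/// Fixed: on a "0x" prefix, consume the full run of hex digits as one token.
fn lex_fixed(src: &str) -> (Tok<'_>, &str) {
    if let Some(rest) = src.strip_prefix("0x") {
        let end = rest
            .find(|c: char| !c.is_ascii_hexdigit())
            .unwrap_or(rest.len());
        let tok_end = 2 + end; // include the "0x" prefix in the span
        return (Tok::Hex(&src[..tok_end]), &src[tok_end..]);
    }
    let end = src
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(src.len());
    (Tok::Num(&src[..end]), &src[end..])
}

fn main() {
    let src = "0xDDF252AD1BE2C89B69C2B068FC378DAA952BA7F163C4A11628F55A4DF523B3EF";
    // Naive lexing splits the literal: Num("0") plus a dangling "xDDF2..." remainder.
    println!("naive: {:?}", lex_naive(src));
    // Fixed lexing yields the whole 66-character literal as a single token.
    println!("fixed: {:?}", lex_fixed(src));
}
```

The real fix would live in the TokenKind::Num match arm: after consuming a leading 0, peek for an x and, if present, switch to consuming hex digits into a single literal token.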

@clabby clabby added the lexer Huff Lexer Changes. label Jun 2, 2022
@sudovirtual
Member

will look into this - it should lex them as literals

@sudovirtual
Member

#21
