This repository has been archived by the owner on Sep 9, 2024. It is now read-only.

fix(huff_lexer): Bytes are being incorrectly lexed #15

Closed
clabby opened this issue Jun 2, 2022 · 2 comments
Labels
lexer Huff Lexer Changes.

Comments

@clabby
Member

clabby commented Jun 2, 2022

Overview

Right now (master @ 5fd1e32), the huff_lexer incorrectly lexes bytes / hex number tokens.

Failing Test

```rust
#[test]
fn parses_bytes() {
    let source = "0xDDF252AD1BE2C89B69C2B068FC378DAA952BA7F163C4A11628F55A4DF523B3EF";
    let mut lexer = Lexer::new(source);
    assert_eq!(lexer.source, source);

    // Try to parse the bytes token (assuming that it is a single token)
    let tok = lexer.next();
    let unwrapped = tok.unwrap().unwrap();
    let bytes_span = Span::new(0..source.len());
    assert_eq!(unwrapped, Token::new(TokenKind::Ident(source), bytes_span));
    assert_eq!(lexer.span, bytes_span);

    // We covered the whole source
    assert_eq!(lexer.span.end, source.len());
    assert!(lexer.eof);
    assert!(lexer.next().is_none());
}
```

Expected Result

tok = Token { kind: Ident("0xDDF252AD1BE2C89B69C2B068FC378DAA952BA7F163C4A11628F55A4DF523B3EF"), span: Span { start: 0, end: 66 } }

Or a dedicated TokenKind variant for bytes / hex tokens.

Actual Result

tok = Token { kind: Num(0), span: Span { start: 0, end: 1 } }

Suspected Reason for Issue

The lexer dyn_consumes all valid ASCII digits when it comes across what it thinks is a TokenKind::Num. When dyn_consume hits the second character of a bytes / hex number token, x, it stops, splitting the token into two parts: TokenKind::Num(0) and TokenKind::Ident("xDDF252AD1BE2C89B69C2B068FC378DAA952BA7F163C4A11628F55A4DF523B3EF"). Some extra logic needs to be added to this match arm in order to correctly lex these tokens.
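The split can be reproduced with a minimal standalone sketch (hypothetical helper functions, not the actual huff_lexer code or its dyn_consume API): a naive pass that consumes only ASCII digits stops at the x, while a fixed pass that special-cases a 0x prefix consumes the whole run of hex digits as one literal.

```rust
/// Simplified token kinds for the sketch (not huff_lexer's TokenKind).
#[derive(Debug, PartialEq)]
enum Tok<'a> {
    Num(&'a str),
    Hex(&'a str),
}

/// Naive: consume only ASCII digits, so "0xDD..." stops after the "0".
fn lex_naive(src: &str) -> (Tok<'_>, &str) {
    let end = src
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(src.len());
    (Tok::Num(&src[..end]), &src[end..])
}

/// Fixed: on a "0x" prefix, consume the full run of hex digits as one token.
fn lex_fixed(src: &str) -> (Tok<'_>, &str) {
    if let Some(rest) = src.strip_prefix("0x") {
        let end = rest
            .find(|c: char| !c.is_ascii_hexdigit())
            .unwrap_or(rest.len());
        let tok_end = 2 + end; // include the "0x" prefix in the span
        return (Tok::Hex(&src[..tok_end]), &src[tok_end..]);
    }
    let end = src
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(src.len());
    (Tok::Num(&src[..end]), &src[end..])
}

fn main() {
    let src = "0xDDF252AD1BE2C89B69C2B068FC378DAA952BA7F163C4A11628F55A4DF523B3EF";
    // Naive lexing splits the literal: Num("0") plus a dangling "xDDF2..." remainder.
    println!("naive: {:?}", lex_naive(src));
    // Fixed lexing yields the whole 66-character literal as a single token.
    println!("fixed: {:?}", lex_fixed(src));
}
```

The real fix would live in the TokenKind::Num match arm: after consuming a leading 0, peek for an x and, if present, switch to consuming hex digits into a single literal token.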

@clabby clabby added the lexer Huff Lexer Changes. label Jun 2, 2022
@sudovirtual
Member

will look into this - it should lex them as literals

@sudovirtual
Member

#21
