Skip to content

Commit

Permalink
refactor(parser): macro for ASCII byte handlers (#2066)
Browse files Browse the repository at this point in the history
As discussed on #2046, it wasn't ideal to have `unsafe {
lexer.consume_ascii_char() }` in every byte handler. It also wasn't
great to have a safe function `consume_ascii_char()` which could cause
UB if called incorrectly (so wasn't really safe at all).

This PR achieves the same objective of #2046, but using a macro to
define byte handlers for ASCII chars, which builds in the assertion that
next char is guaranteed to be ASCII.

Before #2046:

```rs
const SPS: ByteHandler = |lexer| {
  lexer.consume_char();
  Kind::WhiteSpace
};
```

After this PR:

```rs
ascii_byte_handler!(SPS(lexer) {
  lexer.consume_char();
  Kind::WhiteSpace
});
```

i.e. The body of the handlers are unchanged from how they were before
#2046.

This expands to:

```rs
const SPS: ByteHandler = |lexer| {
  unsafe {
    let s = lexer.current.chars.as_str();
    assert_unchecked!(!s.is_empty());
    assert_unchecked!(s.as_bytes()[0] < 128);
  }
  lexer.consume_char();
  Kind::WhiteSpace
};
```

But due to the assertions the macro inserts, `consume_char()` is now
optimized for ASCII characters, and reduces to a single instruction. So
the `consume_ascii_char()` function introduced by #2046 is unnecessary,
and can be removed again.

The "boundary of unsafe" is moved to a new function `handle_byte()`
which `read_next_token()` calls. `read_next_token()` is responsible for
upholding the safety invariants, which include ensuring that
`ascii_byte_handler!()` macro is not being misused (that last part is
strictly speaking a bit of a cheat, but...).

I am not a fan of macros, as they're not great for readability. But in
this case I don't think it's *too* bad, because:

1. The macro is well-documented.
2. It's not too clever (only one syntax is accepted).
3. It's used repetitively in a clear pattern, and once you've understood
one, you understand them all.

What do you think? Does this strike a reasonable balance between
readability and safety?
  • Loading branch information
overlookmotel committed Jan 17, 2024
1 parent 18a58d4 commit 8d5f5b8
Showing 1 changed file with 217 additions and 201 deletions.
Loading

0 comments on commit 8d5f5b8

Please sign in to comment.