Unreachable branch in LUTs are still linked to #385

Closed
RustyYato opened this issue Apr 8, 2024 · 0 comments · Fixed by #386

RustyYato commented Apr 8, 2024

I have some code like this:

use logos::Logos;

#[derive(Logos)]
#[logos(source = [u8])]
enum Token {
    // NOTE: This is needed because logos has dot_matches_newline(false) set for regex_syntax (which is the default)
    #[token("\n")]
    Newline,
    #[regex(b".", priority = 0)]
    UnknownByte,
}

This lexer should never be able to produce an error, so I use the uninhabited error type enum LexerError {}. Its Default impl is written so that any surviving call to it causes a linker error in release mode:

impl Default for LexerError {
    #[cfg(not(debug_assertions))]
    fn default() -> Self {
        extern "C" {
            fn __lexer_error_unreachable_default() -> !;
        }

        // force a linker error
        unsafe { __lexer_error_unreachable_default() }
    }

    #[cfg(debug_assertions)]
    fn default() -> Self {
        panic!("It is impossible for the lexer to error")
    }
}

This would work if the generated LUT did not include the error branch. For some reason LLVM is unable to optimize that branch out, so the call to LexerError::default() survives and the link fails. I suspect this is because the LUT is stored in a static, which tends to be an optimization barrier.

To fix this, the error branch simply shouldn't be generated if it is unreachable.
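
As a rough illustration of the idea, here is a minimal hand-written sketch of a table-driven dispatch for the Token enum above. It is not logos' actual generated code, and the names Jump, build_lut, LUT and lex_byte are hypothetical. The point is only that when every byte is known to match some token, the table's entry type needs no error variant, so no error arm exists and nothing references LexerError::default().

```rust
// Hypothetical stand-in for generated lexer code; NOT what logos emits.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Token {
    Newline,
    UnknownByte,
}

// Entry type of the lookup table. Every byte maps to a token, so there is
// deliberately no `Error` variant here.
#[derive(Clone, Copy)]
enum Jump {
    Newline,
    Other,
}

// Build the 256-entry table at compile time.
const fn build_lut() -> [Jump; 256] {
    let mut lut = [Jump::Other; 256];
    lut[b'\n' as usize] = Jump::Newline;
    lut
}

// Stored in a static, as in the scenario described above.
static LUT: [Jump; 256] = build_lut();

// Dispatch on the table entry. The match is exhaustive without an error arm,
// so the optimizer has no dead error path left to remove.
fn lex_byte(byte: u8) -> Token {
    match LUT[byte as usize] {
        Jump::Newline => Token::Newline,
        Jump::Other => Token::UnknownByte,
    }
}
```

With this shape, the absence of the error path no longer depends on LLVM seeing through the static; it is guaranteed by construction.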
