Faster Turtle Parsing using Compile Time Regexes #307

Merged
2 commits merged on Mar 20, 2020

Commits on Jan 30, 2020

  1. Implemented a Faster Tokenizer

    - We now have two tokenizers, one using Google's RE2 and one using hanickadot's
     (Hana Dusikova's) CTRE (compile-time regex) library.
    
    - The CTRE tokenizer is faster but currently only supports prefixes that
       contain only ASCII characters.
    
    - For exactly this reason, it has to be explicitly activated in the settings file.
    
    - Added unit tests for the CTRE tokenizer.
      Some of them are commented out because they test the currently
      unsupported non-ASCII prefixes.
    
    - Implemented the regexes for using correct prefixes in CTRE in the UTF8RegexTest.cpp file, but they are not used in the actual code because
      they increase the compile time by an unacceptable amount.
    
    - Included the two parsing modes (CTRE with relaxed prefixes and Google RE2 as before) in the IndexBuilderMain.
    joka921 committed Jan 30, 2020
    fdc36ff
  2. Rebased to the PipelinedIndexBuild PR

    Also use the CTRE parser in the e2e tests.
    joka921 committed Jan 30, 2020
    511a711
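
To illustrate what "prefixes that contain only ASCII characters" means for the tokenizer, here is a minimal sketch. It is not the code from this PR: it uses `std::regex` as a stand-in so that it compiles with the standard library alone, whereas the PR uses CTRE and RE2. The function names `matchAsciiPnameNs` and `demoAsciiPrefixMatch` are hypothetical, and the pattern is an assumed ASCII-only restriction of the Turtle grammar rule `PNAME_NS ::= PN_PREFIX? ':'`, which may differ from the regexes actually used.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper, not taken from the PR. Returns the length of the
// prefix name (PNAME_NS, e.g. "rdf:") at the very start of `input`,
// or 0 if the input does not start with one.
// Assumption: only ASCII characters are allowed inside the prefix.
inline std::size_t matchAsciiPnameNs(const std::string& input) {
  // ASCII-only restriction of PN_PREFIX? ':' from the Turtle grammar:
  // first character is a letter, inner characters may include '.',
  // and the last character before ':' must not be '.'.
  static const std::regex pattern(
      R"(([A-Za-z]([A-Za-z0-9_.-]*[A-Za-z0-9_-])?)?:)");
  std::smatch match;
  // match_continuous anchors the match at the start of the input,
  // which is what a tokenizer needs.
  if (std::regex_search(input, match, pattern,
                        std::regex_constants::match_continuous)) {
    return static_cast<std::size_t>(match.length(0));
  }
  return 0;
}

// Small usage example for the sketch above.
inline void demoAsciiPrefixMatch() {
  assert(matchAsciiPnameNs("rdf:type") == 4);     // matches "rdf:"
  assert(matchAsciiPnameNs(":localName") == 1);   // empty prefix, matches ":"
  assert(matchAsciiPnameNs("\xC3\xA9:x") == 0);   // UTF-8 "é:" is rejected
}
```

A prefix such as `é:` is valid Turtle, so a tokenizer restricted in this way rejects some legal input. That trade-off is why the commit message says the faster mode must be switched on explicitly.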