Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.Sign up
GitHub is where the world builds software
Millions of developers and companies build, ship, and maintain their software on GitHub — the largest and most advanced development platform in the world.
regexp: implement backtracking search #4154
In the spirit of RE2's multiple implementations, one possible way to make regexp faster for easy regexps is to use actual backtracking. We'd only do this for those regexps where we can compute that backtracking is okay. It would be something like the BitState execution in RE2 but perhaps the details would be different. We do have a reliable stack, unlike in C++.
I have been wondering how C++ RE2 could be 13 times faster than Go in the benchmark shootout. Now, I kind of know. C++ is dynamically assessing the cost of the execution and chooses the algo that fits most. Almost like how SQL is executed in a database. I read your papers regarding regexp multiple times last week and read the source of the current "regexp" package. The currently code implements Thompson's algo faithfully. I could not have imagined a better way to improve it. But adding "backtracking" to the assessment will do! I hate to see Go's current regexp is slower than PHP and Python in easy cases. Once "backtracking" is added, more Rubist, Python writer and PHP writer would have one less excuse to switch over. Thanks for your outstanding papers that make me understand regexp more clearly! Please add "backtracking" into the coming releases. Best Regards, GoPro