go 1.8, linux, amd64
Recently I have been using regular expressions a lot in Go. According to the benchmarksgame, Go's regexp package is very slow, about 5x-10x slower than in many other languages.
It seems that there are many type assertions in regexp; am I right? Is there any solution for this?
The benchmark game is, as the name suggests, a game: one that is won by tuning a specific program until it performs well on the specific input used in the benchmark and/or, in this case, by using highly optimized regexp libraries.
The page you linked, for example, shows that C++ is slower than PHP, Hack, Node.js, and Typescript.
Is it, though?
This issue is, IMHO, a little too broad to be actionable. "Make the regexp engine faster": well, yes, everyone likes fast regexp engines. We already have a few more specific issues about that.
@ALTree > The benchmark game is, as the name suggest, a game.
No. The page shows a C++ program that is slower than PHP, Hack, Node.js and TypeScript programs.
regex-redux is a new task, based on regexdna.
Try re-writing some of the old Go regex-dna programs to perform the new substitutions.
As you say, in a language like Python, the regexp engine is implemented in C. So you are comparing a somewhat tuned Go implementation with a highly tuned C implementation. You also need to consider the characteristics of the engine. Go has chosen to follow the re2 path (not surprising, since Russ Cox is a major author of both Go and re2). re2 has much better performance characteristics than some other regexp engines, in that it never has an exponential slowdown, but that comes at a cost for other regexps (https://swtch.com/~rsc/regexp/). Even then, the C++ re2 implementation has many optimizations that are not in the Go implementation. See, for example, #11646.
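To illustrate the "never has an exponential slowdown" point, here is a minimal sketch using the pathological pattern described in the linked article: n copies of `a?` followed by n copies of `a`, matched against a string of n `a` characters. The helper name `pathological` and the chosen values of n are only illustrative assumptions. Backtracking engines tend to take time exponential in n on this input, while Go's regexp package should stay fast; exact timings will vary by machine.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"time"
)

// pathological is a hypothetical helper that builds the pattern
// ^(a?){n copies}(a){n copies}$ by plain string repetition, together
// with an input of n 'a' characters that the pattern matches.
func pathological(n int) (*regexp.Regexp, string) {
	pattern := "^" + strings.Repeat("a?", n) + strings.Repeat("a", n) + "$"
	return regexp.MustCompile(pattern), strings.Repeat("a", n)
}

func main() {
	for _, n := range []int{10, 20, 30} {
		re, input := pathological(n)
		start := time.Now()
		matched := re.MatchString(input)
		// Elapsed time is printed for comparison only; no particular
		// value is guaranteed.
		fmt.Printf("n=%d matched=%v elapsed=%v\n", n, matched, time.Since(start))
	}
}
```

Running the same pattern through a backtracking engine at n=30 is a simple way to see the trade-off being described: predictable worst-case behavior in exchange for less speed on ordinary patterns.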
So: can regexp be sped up? Yes. But it will take work. I'm not sure whether to bother leaving this issue open or not.