New std.uni meets std.regex #2001
Woot, this fixes issue 11784
so when do we pull?
@andralex
You will need to accompany your fixes with proofs. (Just kidding!!)
@andralex A simple benchmark for std.regex is no big secret and has been lying around for ages here: And some patterns to use with it:
May I close this while you work on it?
Okay, I had my share of auto-tester)
So, does you re-opening the PR mean it will be ready to be pulled soon? (The description still says it's not)
@Orvid Yes, it's getting ready for prime time. I'm investigating the last bits for signs of potential regressions, hence the "DO NOT PULL" label is still here. In a day or so I will either close it for more work, or remove the note and do the last adjustments.
Okay, 2 CTFE bugs were worked around; this should finally be ready to go.
Seems to be passing. Okay to merge?
Would be awesome.. Actually wait a sec, the last commit has a few stupid typos in text messages.
Fixed. Kenji as usual saves my day: a recent fix nearly halved the memory usage.
Auto-merge toggled on
few steps more
Savings amount to ~290 KB on 32-bit.
Actually add a test case, the issue was fixed as part of the set of commits that precede this one.
Simplify set construction.
…ode. Also use Gallop search policy for CodepointSets, as it's closer to the common cases for merging charsets.
Cover even more, but in 5 separate compiler runs. A few cases still hit CTFE bugs.
Pull updated, auto-merge toggled off
Now I wish GitHub had a pull request update diff view.
@klickverbot It's a rebase, since a recent fix foobared auto-merge for std.regex
Auto-merge toggled on
New std.uni meets std.regex
Fantastic!
This pull request introduced a regression:
I've been preparing this for over half a year now.
This puts our std.regex on the new rails of the recently revamped std.uni, which now has enough mojo to power a Unicode-aware regular expression engine. With one minor exception, all of it goes through the normal "user-land" public API.
UPDATE: indeed fixes issue 11784.
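To make the std.uni/std.regex relationship concrete, here is a minimal sketch (not code from this pull request) that matches a Unicode script class through std.regex and then queries the same property directly from std.uni. The helper name `firstCyrillicRun` is hypothetical, and the sketch assumes `\p{Cyrillic}` resolves to the same code point set that `unicode.Cyrillic` exposes.

```d
import std.regex : regex, matchFirst;
import std.uni : unicode, CodepointSet;

// Hypothetical helper: returns the first run of Cyrillic letters in `text`,
// or an empty string if there is none.
string firstCyrillicRun(string text)
{
    auto re = regex(`\p{Cyrillic}+`);   // script class, resolved via Unicode data
    auto m = matchFirst(text, re);
    return m.empty ? "" : m.hit;
}

void main()
{
    assert(firstCyrillicRun("abc привет def") == "привет");
    assert(firstCyrillicRun("plain ascii") == "");

    // The same script is available from std.uni's public API as a set.
    CodepointSet cyr = unicode.Cyrillic;
    assert(cyr['п']);    // Cyrillic letter is a member
    assert(!cyr['p']);   // Latin letter is not
}
```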
TODOs:
1. Regressed ctRegex: it runs out of memory on some patterns that previously passed just fine, in particular blocks 2 & 3 of the CT-tests on win32. Memory usage is something out of my control. I improved it a bit by simplifying std.uni abstractions. Tests are now split into 7 groups instead of 4 but cover more cases.
UPDATE: Only 5, with recent CTFE fix.
2. May have regressed regex compilation speed due to less straightforward construction. Seems just fine and even a bit better.
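For readers unfamiliar with the distinction behind the two TODO items, the sketch below (an illustration, not a test from this pull request) compiles the same pattern once with `ctRegex`, which runs the regex compiler under CTFE and is where the compile-time memory pressure comes from, and once with `regex` at run time. The pattern and input string are made up for the example.

```d
import std.regex : ctRegex, regex, matchFirst;

void main()
{
    // Pattern processed at compile time; the compiler's CTFE engine does the work.
    auto ct = ctRegex!(`(\d+)-(\d+)`);
    // Same pattern, compiled when the program runs.
    auto rt = regex(`(\d+)-(\d+)`);

    auto a = matchFirst("issue 11784-2001", ct);
    auto b = matchFirst("issue 11784-2001", rt);

    // Both engines are expected to agree on the match and its captures.
    assert(a.hit == b.hit);
    assert(a[1] == "11784" && a[2] == "2001");
    assert(b[1] == "11784" && b[2] == "2001");
}
```

Both forms go through the same `matchFirst` call, so switching between them is a one-line change if a pattern proves too costly to compile under CTFE.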