Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some simple std.regex optimization #1553

Merged
merged 1 commit into from Dec 28, 2013

Conversation

DmitryOlshansky
Copy link
Member

This picks a couple of low-hanging fruits to optimize R-T & (a bit) C-T regex.

First is inlining. DMD basically never inlines stuff that is in the library and is not a template. It doesn't matter if it has the source for it (obviously DMD has all of Phobos source).
Currently the only way out is marking everything as empty-arg template and this gets us the rough analog of 'explicitly inlined' in C++.

Anyhow - around 30% of speed gain is obtained by adding a couple of empty brackets here and there alone. Or rather I got them back (in the days of FReD it was always inlined).

UPDATE:
The latter was addressed in a separate pull.

Second is removal of indirect call in the outer loop search loop. Basically it had to pick if it can do fast (kicked) search or if it should just use char by char crawling. Being outer loop I thought it wouldn't cost much to use an indirect call. Yet test runs on line by line matching indicate a noticeble speed up of around ~2% in the new version.

P.S. See e.g. this sample (one of examples that are currently difficult for std.regex) https://gist.github.com/blackwhale/6470498

@braddr
Copy link
Member

braddr commented Sep 6, 2013

The inliner isn't as bad as you suggest. If merely doing what you did by turning simple functions into no-arg templates affects inlining, then please produce a separate bug report against dmd. That needs to be solved generally, not by working around it like this.

@DmitryOlshansky
Copy link
Member Author

If merely doing what you did by turning simple functions into no-arg templates affects inlinining

It does else I wouldn't spent a better part of an evening to figure it out.

That needs to be solved generally, not by working around it like this.

Here you go:
http://d.puremagic.com/issues/show_bug.cgi?id=10985

The worst thing would be if that was the intended behaviour and modules that are merely imported are not usable for inlining (they are not used ATM see the report).

@DmitryOlshansky
Copy link
Member Author

I'd be glad to have this tiny workaround pulled untill we can have:
dlang/dmd#2561

Which AFAICT doesn't help any compiler other then DMD(?).

@DmitryOlshansky
Copy link
Member Author

Removed 2nd change. Now this pull is all about working around the bug.
http://d.puremagic.com/issues/show_bug.cgi?id=10985

Marked all cases as such.

@DmitryOlshansky
Copy link
Member Author

@andralex FYI

@andralex
Copy link
Member

@blackwhale noice

Not sure what's the best route to take right now. One thing I've learned is that if one one wants maximum speed, gdc and ldc must be tried. Does this change affect them?

@DmitryOlshansky
Copy link
Member Author

@andralex

Not sure what's the best route to take right now. One thing I've learned is that if one one wants maximum speed, gdc and ldc must be tried. Does this change affect them?

It does. Originally I tried this patch with LDC and got a nice speed up. Can't find exact figures but it's around 30-50%. What I do recall is that it puts us back in the game in one particular benchmark (linked in pull description).

P.S. The pull that would address the bug in the compiler
dlang/dmd#2561
apparently only works with DMD's AST-based inliner.
Neither of GDC or LDC uses it AFAIK.

@ibuclaw
Copy link
Member

ibuclaw commented Dec 26, 2013

@blackwhale - that's right. As you've noticed with the LDC speed up, cross module inlining does not happen in GDC either (we may have the code loaded up in the frontend AST, but that is no good if your inliner is in the backend).

@dnadlinger
Copy link
Member

@andralex, @blackwhale: As I noted in that Bugzilla issue, I'm not even sure what the best way to address this issue would even be. If we generate code for all the important modules just so that we might be able to inline a few functions later, we'd effectively make separate compilation impossible/pointless. I've been meaning to experiment a bit more with LTO in LDC (shipping druntime/Phobos as bitcode libraries, …), now would probably be a good point for somebody to do so.

@DmitryOlshansky
Copy link
Member Author

@klickverbot From this I gather that we ought to make 1-liners of Phobos inlinable at all costs as it's not clear when/if other compilers are going to be able to inline normal functions across modules. At the moment it appears to me that the only reliable way to do so is 0-arg template workaround applied judiciously throughout Phobos. We surely need a better solution long-term.

@dnadlinger
Copy link
Member

@blackwhale: Speaking of other compilers, in optimized builds LDC actually uses the DMD inliner to determine which extra functions from imported modules to emit, so that they can then be inlined by the LLVM inliner. So, if DMD is able to inline a Phobos function that LDC doesn't catch, please report this as a performance problem for the time being.

In the long term, however, I'd really like to get away from this model, as running the "late" semantic3 passes on modules that are not actually emitted has been a source of AST breakage problems in the past, and of course it's somewhat pointless to use an inferior inliner to determine what the better inliner is able to catch…

@andralex
Copy link
Member

I'm fine with pulling this in unless there are significant objections to the contrary. Definitely there's some TODO that has been formalized as a bug report. @ibuclaw @klickverbot?

@dnadlinger
Copy link
Member

I'm fine with merging this too.

andralex added a commit that referenced this pull request Dec 28, 2013
@andralex andralex merged commit 762bf5c into dlang:master Dec 28, 2013
@andralex
Copy link
Member

no mercy was offered in merging this request

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants