-
Notifications
You must be signed in to change notification settings - Fork 367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BREAKING] sync combine with select #2158
Conversation
Just as a reference in 0.20.2 we have:
(which is not super fast, but still takes 2x less time) |
I have made some optimizations to reduce compilation cost (in particular dropped
|
Last commit is performance related only, after it:
and on master
The problem was that the check 20c9c88#diff-23657e51a9cc9e627fc153ba1e6e04c1L982 was very expensive (I earlier thought it should be optimized out by the compiler). This signals type instability risk, but it seems we do not have it, so this is strange:
|
So now the PR is faster than master? That type instability is indeed unexpected. Have you checked with (BTW, I'd rather use |
Co-Authored-By: Milan Bouchet-Valat <nalimilan@club.fr>
PR is faster than master and is ~ as fast as 0.20.2. I will check @code_warntype when we stabilize the design.
Right (I checked both but in general I will report |
Co-Authored-By: Milan Bouchet-Valat <nalimilan@club.fr>
Do you think we should allow using selectors and renaming in
would be allowed? |
That would be quite natural with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this. I went over and think I understand the logic but don't have any substantive comments about the implementation. This looks good to me.
A method to add would be
which would |
I plan it for a separate PR as this is already big enough I think. |
Co-Authored-By: Milan Bouchet-Valat <nalimilan@club.fr>
We have a very strange error at https://travis-ci.org/github/JuliaData/DataFrames.jl/jobs/669620787?utm_medium=notification&utm_source=github_status. It seems totally unrelated and theoretically impossible to occur. I am on x64 Linux and never had it. I will re-run the tests to see what happens. |
The error did not repeat - @nalimilan - do you think we should worry about this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool.
The error is annoying but without a way to reproduce it I guess we can just hope we'll be able to identify it at some point. I remember I bumped into Julia bugs before in this code, you probably hit a rare path that wasn't exercised before.
Co-Authored-By: Milan Bouchet-Valat <nalimilan@club.fr>
Co-Authored-By: Milan Bouchet-Valat <nalimilan@club.fr>
I have committed your Timing (with unrolling, but before
Timing of
Timing of
As you can see |
OK, we will go for Generated timings:
The only drawback is when we would pass hundreds of variables, but this is a different issue - I will add such a note to the docs. |
This reverts commit 8f0caf7.
I am leaving here the
However, the tests show that it is better to do manual unrolling for some reason for low number of arguments (and I think that for more than 4 arguments I will leave the |
For |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let's keep the manual unrolling then. Can you just add a comment explaining that @generated
is slower for a low number of columns? I guess this is including compilation times, right? (That's indeed something we care about, but that isn't always considered when writing package code.)
I will add a comment. Actually it was worse - it was allocating more and was slightly slower even after first compilation and also compilation for some reason was invoked twice for the same signature (I have put In summary - this is a strange situation (indicating that I will rebase it against "transform" PR and add some more tests before merging. |
Well - a last part is updates to the manual. Just pushed. |
Co-Authored-By: Milan Bouchet-Valat <nalimilan@club.fr>
I will merge this PR when CI passes. Thank you for all the discussions! |
Anything that was broken here now should be fixed by a separate PR. Thank you for a fantastic joint effort. |
Fixes #2156.
I have initially implemented the functionality discussed in #2156 and added
keepkeys
kwarg. Here is an example:I have not updated the documentation nor tests so CI should fail.
Also I have left most of the internals unchanged because:
by
NamedTuple
to a function instead of auto-splattingThe major problem I will have to look into is that the current implementation significantly strains the compiler unfortunately. Probably some more
@nospecialize
calls should be added (@nalimilan you might have some experience with it so can you please comment if you have tips to share)?