-
Notifications
You must be signed in to change notification settings - Fork 176
Improve DeltaList behavior for large projects #335
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve DeltaList behavior for large projects #335
Conversation
this.added = newList.stream().filter(i -> !oldList.contains(i)).collect(Collectors.toList()); | ||
this.removed = oldList.stream().filter(i -> !newList.contains(i)).collect(Collectors.toList()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to use HashSet
s and avoid copying then removing a lot of elements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be better to use the old for loop instead of streams?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could sure.
hey @jorsol , any chance you could have a look at this PR? |
I have been experimenting with a large code base to try to improve Quarkus build time and I stumbled upon the fact that DeltaList was very inefficient in these cases. In this commit, I keep the ordering even if not actually important. I will push further optimizations in a second commit that we can discuss further. Fixes apache#334
@desruisseaux @gnodet ping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I adjusted the commits:
|
It looks fine to me as well. |
This pull request targets the 3.x branch as I think this support has been rewritten in 4.
Here is the situation prior to this patch with a massive project (~40k source files):
The first commit keeps the ordering as is and make sure we write the file sorted and the outputs of
DeltaList
are sorted.The behavior is exactly the same as before.
Here is the situation after the first commit:
The second commit is a further optimization, which might not be useful and only sorts when displaying the output. The file is written unsorted.
It doesn't change much tbh so we might as well not include the second commit.
I did the work to be comprehensive so I thought I would present it.
Here is the situation after the second commit:
Opening as a draft so that we decide what to do about the second commit: my advice would be to drop it but YMMV
BTW, it might be possible to optimize further but given this code is readable and give good results, I think it's good enough.
Fixes #334
To make clear that you license your contribution under
the Apache License Version 2.0, January 2004
you have to acknowledge this by using the following check-box.