Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Fixing performance while calling containsAll over list #2272

Merged
merged 3 commits into from
Apr 20, 2023

Conversation

saurabh-rahate
Copy link
Contributor

@saurabh-rahate saurabh-rahate commented Apr 5, 2023

Improving performance in case of

list.containsAll(anotherList)

by using HashSet wrapper around one of the lists

new HashSet<>(list).containsAll(anotherList)

This improves time complexity.

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Apr 5, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: saurabh-rahate / name: Saurabh Rahate (67c83d2)

@cf-gitbot
Copy link

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/184880763

The labels on this github issue will be updated when the story is started.

@saurabh-rahate saurabh-rahate marked this pull request as ready for review April 5, 2023 18:52
Copy link
Member

@strehle strehle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok for me

@strehle
Copy link
Member

strehle commented Apr 12, 2023

@Tallicia @bruce-ricard any concerns

@Tallicia
Copy link
Contributor

The 8 test look good and the optimization does as well.

Copy link
Contributor

@Tallicia Tallicia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement.

@Tallicia
Copy link
Contributor

Would be nice to see performance tests to see if sets are of large enough size in practice to know the real world performance improvement.

Copy link
Contributor

@bruce-ricard bruce-ricard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for making performance improvements.

To be perfectly frank, I am not convinced that this change will improve performance in the general practical world. I don't know if most objects don't have one or two roles. And this change forces the creation of 2 new hashsets every single time.

Also, we cannot find any usage of that equals method in the UAA code base. I don't think that this change will have any impact at all.

Additionally, I am not convinced that Java doesn't already use HashSets internally to resolve containsAll calls on List object.

@Tallicia and me

@Tallicia Tallicia added in progress in_review The PR is currently in review and removed unscheduled labels Apr 13, 2023
@strehle
Copy link
Member

strehle commented Apr 20, 2023

I have tested this and created an extra PR #2294
With a test of role about 10000 the difference is 300 ms compared to 30 ms , in case of more e.g. 100000xxx it sometimes runs into many seconds, so that I decided to stay only with 10000
So it brings the expected result from times described here:
https://www.baeldung.com/java-hashset-arraylist-contains-performance
List -> O(mn)
HashSet -> O(n)

Finally, I will merge this, so thanks for your PR

@strehle strehle merged commit bc6623c into cloudfoundry:develop Apr 20, 2023
@cf-gitbot cf-gitbot added delivered accepted Accepted the issue and removed delivered labels Jun 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
accepted Accepted the issue in_review The PR is currently in review
Projects
Development

Successfully merging this pull request may close these issues.

None yet

5 participants