Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set size and characteristics for Spliterator used with Tags and KeyValues #3409

Merged
merged 1 commit into from Sep 16, 2022

Conversation

kilink
Copy link
Contributor

@kilink kilink commented Sep 13, 2022

Tags and KeyValues were inheriting their spliterator method from Iterable, which returns a Spliterator with unknown size and no characteristics set.

Tags and KeyValues were inheriting their spliterator method from Iterable, which returns
a Spliterator with unknown size and no characteristics set.
Copy link
Member

@shakuzen shakuzen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the pull request. This does seem better in principle and we should do it. I'm curious, though, if you found this due to some negative effect in your code using this.

@shakuzen shakuzen added the enhancement A general enhancement label Sep 13, 2022
@shakuzen shakuzen added this to the 1.10 backlog milestone Sep 13, 2022
@kilink
Copy link
Contributor Author

kilink commented Sep 13, 2022

Thank you for the pull request. This does seem better in principle and we should do it. I'm curious, though, if you found this due to some negative effect in your code using this.

I did not observe any actual problems due to this, but it's easy to demonstrate how it can lead to inefficiencies. It is something I stumbled across and seems to be fairly common for code that extends Iterable. I get the sense that it's almost never a good idea to rely on the default spliterator method from Iterable!

Here are a couple of examples of why it can be good to propagate the exact size:

  1. Calling count on a Stream will just return the size directly if it's known, otherwise it's forced to do the reduction / iteration.
  2. Calling trySplit on a Spliterator of unknown size will allocate an array of 1024 elements by default, which is likely much larger than a typical set of tags; granted, I think trySplit is more commonly used for parallel streams, but I have seen some code that makes use of trySplit even for non-parallel streams.

I didn't comb through all of the internals of Stream, but there are likely other optimizations that are made possible by propagating the exact size.

@kilink
Copy link
Contributor Author

kilink commented Sep 13, 2022

I forgot to mention that an additional benefit to overriding the spliterator method from Iterable, is that even though Tags has its own stream method, users may still call spliterator directly, in which case they'd also get a spliterator with no characteristics set (SORTED, DISTINCT, etc). That means such operations would also be needlessly inefficient.

@shakuzen shakuzen modified the milestones: 1.10 backlog, 1.10.0-RC1 Sep 14, 2022
@shakuzen
Copy link
Member

Thank you for the added details. That makes sense. I was wondering if there were examples of places in code outside of this repository that Tags spliterator is being used, so I can learn about some usage in users' code that I might not have thought about. It's a bit hard to do a GitHub search for these things given the generic method names, and of course there is a lot of code that isn't in GitHub that may be using this.

@shakuzen shakuzen merged commit 23962fd into micrometer-metrics:main Sep 16, 2022
@shakuzen shakuzen changed the title Override default spliterator method in Tags and KeyValues Set size and characteristics for Spliterator used with Tags and KeyValues Sep 16, 2022
@shakuzen shakuzen added the performance Issues related to general performance label Sep 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A general enhancement performance Issues related to general performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants