Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datasketches perf: SketchAggregatorFactory.combine(..) returns Union object now #3471

Merged
merged 1 commit into from
Oct 5, 2016

Conversation

himanshug
Copy link
Contributor

so that it can be reused across multiple combine(..) calls which are extensively used for merging in various places.

improvement would depend upon the type of query and amount of data being merged at brokers, for some of the timeseries queries it brought down the response time from 10 secs to 1.5 secs.

existing UTs in the datasketches module cover the change.

Note: I can move the milestone to 0.9.2 if this gets merged in time or else, we will cherry-pick it in our internal distro.

…it can be reused across multiple combine(..) calls
@himanshug himanshug added this to the 0.9.3 milestone Sep 20, 2016
Copy link
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 assuming rolling updates still work (it looks like they should)

public void serialize(Union union, JsonGenerator jgen, SerializerProvider provider)
throws IOException, JsonProcessingException
{
jgen.writeBinary(union.getResult(true, null).toByteArray());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that serialized format is unchanged, so rolling updates are still possible, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, changes here are totally backwards compatible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @himanshug

@fjy
Copy link
Contributor

fjy commented Oct 4, 2016

👍 @himanshug what are the perf benefits?

I think this is fine for 0.9.2

@cheddar
Copy link
Contributor

cheddar commented Oct 5, 2016

👍

These changes produce an 80% speed-up on some timeseries queries that we were running. Given that the old groupBy aggregates using the aggregator (which builds a Union implicitly and keeps unioning into it), I don't think this will greatly affect performance there. But it will significantly speed up timeseries, topN and any other query that depends on the combine() function for the merge step.

@himanshug
Copy link
Contributor Author

moving to 0.9.2 since it is ready to be merged.

@himanshug himanshug modified the milestones: 0.9.2, 0.9.3 Oct 5, 2016
@gianm gianm merged commit 1523de0 into apache:master Oct 5, 2016
@gianm
Copy link
Contributor

gianm commented Oct 5, 2016

@himanshug can you please do a backport PR to 0.9.2?

himanshug added a commit to himanshug/druid that referenced this pull request Oct 5, 2016
…it can be reused across multiple combine(..) calls (apache#3471)
@himanshug himanshug mentioned this pull request Oct 5, 2016
gianm pushed a commit that referenced this pull request Oct 5, 2016
…it can be reused across multiple combine(..) calls (#3471) (#3540)
@gianm gianm mentioned this pull request Dec 4, 2016
fundead pushed a commit to fundead/druid that referenced this pull request Dec 7, 2016
…it can be reused across multiple combine(..) calls (apache#3471)
@himanshug himanshug deleted the sketch_combine_union branch January 3, 2017 16:24
seoeun25 pushed a commit to seoeun25/incubator-druid that referenced this pull request Jan 10, 2020
seoeun25 pushed a commit to seoeun25/incubator-druid that referenced this pull request Feb 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants