Optimized intersection of rollup job #228

Closed
wants to merge 1 commit into from

pbberlin

Some metrics (nodes) took more than 15 minutes to process.

There were over a hundred thousand iterations over range(coarseArchive['retention']).

Each iteration looped over the entire list of overflowDatapoints, which contains thousands of entries.

The intention is to find intersections between time intervals.

I added a primitive optimization:
I compute the timestamp boundaries of overflowDatapoints once, before the big loop.
Then, in each iteration, I check those boundaries for an intersection before iterating over overflowDatapoints.

Now we can process ~3 nodes per second.
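
The idea can be sketched roughly as follows. This is a minimal illustration, not the actual patch: the function name `rollup`, its parameters, the `(timestamp, value)` tuple layout, and the averaging step are all assumptions made for the example. Only the boundary check before the inner scan reflects the change described above.

```python
def rollup(overflow_datapoints, coarse_start, coarse_step, retention):
    """Average overflow points into coarse slots (hypothetical helper).

    overflow_datapoints: list of (timestamp, value) tuples (assumed layout).
    Returns a dict mapping slot start timestamp -> average value.
    """
    if not overflow_datapoints:
        return {}

    # Compute the timestamp boundaries once, before the big loop.
    timestamps = [ts for ts, _ in overflow_datapoints]
    lo, hi = min(timestamps), max(timestamps)

    result = {}
    for i in range(retention):
        slot_start = coarse_start + i * coarse_step
        slot_end = slot_start + coarse_step  # exclusive upper bound

        # Cheap intersection test: a slot entirely outside [lo, hi]
        # cannot contain any overflow point, so skip the inner scan.
        if slot_end <= lo or slot_start > hi:
            continue

        values = [v for ts, v in overflow_datapoints
                  if slot_start <= ts < slot_end and v is not None]
        if values:
            result[slot_start] = sum(values) / len(values)
    return result
```

With this check, the full scan of the overflow list only happens for slots that overlap the overflow time range; every other iteration costs two comparisons.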

@pbberlin pbberlin closed this Mar 12, 2014
@pbberlin pbberlin deleted the patch-1 branch March 13, 2014 17:31
jraby pushed a commit to datacratic/carbon that referenced this pull request Jun 19, 2014