Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

List of possible query2 optimizations #148

Open
johan-bjareholt opened this issue Jul 8, 2020 · 4 comments
Open

List of possible query2 optimizations #148

johan-bjareholt opened this issue Jul 8, 2020 · 4 comments
Labels
enhancement New feature or request

Comments

@johan-bjareholt
Copy link
Member

johan-bjareholt commented Jul 8, 2020

Processing optimizations:

  • Have the list of events sorted by timestamp, then we could binary search the events intersecting. Doing so would significantly improve performance of transforms like filter_period_intersect and the union_event_split in aw-transform: Add union_events_split #179 .

Memory optimizations:

  • Wrap Vec<Event> or whole queries in an allocation arena
  • Reduce the amount of clones in our transforms (already been improved quite a bit the past few months)
  • Add proper lifetimes on variables in query2 to avoid a clone on each variable assignment and reference to it (See Add explicit cloning to query2 #119)

Something else?

@johan-bjareholt johan-bjareholt added the enhancement New feature or request label Jul 8, 2020
@johan-bjareholt johan-bjareholt changed the title Lower RAM usage on large queries List of possible query2 optimizations Nov 1, 2020
@johan-bjareholt
Copy link
Member Author

johan-bjareholt commented Nov 5, 2020

@ErikBjare After I benchmarked #183 I realized that memory copying was incredibly fast already and the improvement was still barely noticeable in an extreme benchmark scenario I made up.

I started working on adding tracing instead and got something working quickly, heres the result from a month view (it's running in debug mode though, so significanly slower than usual)

Nov 05 23:31:57.034 TRACE query::function_call::query_bucket: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=256ms
Nov 05 23:31:57.138 TRACE query::function_call::flood: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=103ms
Nov 05 23:31:57.179 TRACE query::function_call::query_bucket: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=4.33ms
Nov 05 23:31:57.181 TRACE query::function_call::flood: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=1.48ms
Nov 05 23:31:57.183 TRACE query::function_call::filter_keyvals: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=1.07ms
Nov 05 23:31:57.730 TRACE query::function_call::filter_period_intersect: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=517ms
Nov 05 23:31:58.503 TRACE query::function_call::categorize: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=677ms
Nov 05 23:31:59.007 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=400ms
Nov 05 23:31:59.069 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=14.3ms
Nov 05 23:31:59.121 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=39.2ms
Nov 05 23:31:59.128 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=397µs
Nov 05 23:31:59.485 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=317ms
Nov 05 23:31:59.531 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=305µs
Nov 05 23:31:59.532 TRACE query::function_call::limit_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=364µs
Nov 05 23:31:59.558 TRACE query::function_call::limit_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=20.2ms
Nov 05 23:31:59.753 TRACE query::function_call::sum_durations: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=144ms
Nov 05 23:31:59.799 TRACE query::function_call::query_bucket: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=698µs
Nov 05 23:31:59.800 TRACE query::function_call::flood: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=150µs
Nov 05 23:32:00.003 TRACE query::function_call::filter_keyvals: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=152ms
Nov 05 23:32:00.062 TRACE query::function_call::filter_period_intersect: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=10.9ms
Nov 05 23:32:00.066 TRACE query::function_call::split_url_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=139µs
Nov 05 23:32:00.066 TRACE query::function_call::concat: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=129µs
Nov 05 23:32:00.066 TRACE query::function_call::sort_by_timestamp: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=113µs
Nov 05 23:32:00.269 TRACE query::function_call::query_bucket: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=203ms
Nov 05 23:32:00.338 TRACE query::function_call::flood: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=68.9ms
Nov 05 23:32:00.532 TRACE query::function_call::filter_keyvals: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=133ms
Nov 05 23:32:07.393 TRACE query::function_call::filter_period_intersect: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=6.78s
Nov 05 23:32:07.740 TRACE query::function_call::split_url_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=261ms
Nov 05 23:32:08.030 TRACE query::function_call::concat: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=181ms
Nov 05 23:32:08.150 TRACE query::function_call::sort_by_timestamp: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=120ms
Nov 05 23:32:08.568 TRACE query::function_call::split_url_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=317ms
Nov 05 23:32:09.043 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=323ms
Nov 05 23:32:09.125 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=18.5ms
Nov 05 23:32:09.176 TRACE query::function_call::limit_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=24.5ms
Nov 05 23:32:09.505 TRACE query::function_call::merge_events_by_keys: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=266ms
Nov 05 23:32:09.565 TRACE query::function_call::sort_by_duration: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=2.17ms
Nov 05 23:32:09.570 TRACE query::function_call::limit_events: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=2.04ms
Nov 05 23:32:09.788 TRACE query::function_call::sum_durations: aw_query::functions::qfunctions: close time.busy=0.00ns time.idle=175ms
Nov 05 23:32:10.049 DEBUG interpret: aw_query::interpret: close time.busy=0.00ns time.idle=13.3s

There is one huge outlier here and that's one call to filter_period_intersect which takes 6.78 seconds out of 13.3 seconds total.

The previous filter_period_intersects are much faster so I assume that this is when we do an intersection between the window events and the browser events as that's when we have the most amount of events.

EDIT: Tried running the tracing with my refcounting too, the same query took 13.0 seconds instead of 13.3 seconds so it's probably not worth it considering the increased complexity.

@johan-bjareholt
Copy link
Member Author

johan-bjareholt commented Nov 7, 2020

Apparently the calculate_endtime() function is incredibly slow, I was able to do a 5x optimization by caching the endtime for the filter events in each loop. Will push a PR soon.

@johan-bjareholt
Copy link
Member Author

Same query on exact same data in the web-ui in the above example went from 13.3 seconds to 7.53 seconds where the big filter_period_intersect went from 6.78 seconds to 1.24 seconds.

I'd consider that a huge win, changes are in #184

@johan-bjareholt
Copy link
Member Author

Wrapping our event lists in allocation arenas should significantly improve memory usage, would probably completely fix the "leak-like" behavior of the allocator keeping a lot more memory after big queries than it needs to.

Exactly how to do this in a nice manner in Rust though I have to investigate a bit more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant