Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize out unnecessary joins across autogenerated instance keys #3107

Closed
teh-cmc opened this issue Aug 25, 2023 · 3 comments · Fixed by #3377
Closed

Optimize out unnecessary joins across autogenerated instance keys #3107

teh-cmc opened this issue Aug 25, 2023 · 3 comments · Fixed by #3377
Labels
📉 performance Optimization, memory use, etc 🔍 re_query affects re_query itself

Comments

@teh-cmc
Copy link
Member

teh-cmc commented Aug 25, 2023

The store will auto-generate monotically increasing instance keys when they are not passed in by the user. This allows re_query's join logic to "just work" no matter what but it also means that in most cases we're computing a costly join across 2 components of the same size, both with autogenerated instance keys, even though we already know the answer to that join.

@teh-cmc teh-cmc added 🔍 re_query affects re_query itself 📉 performance Optimization, memory use, etc labels Aug 25, 2023
@teh-cmc
Copy link
Member Author

teh-cmc commented Aug 25, 2023

My proposed solution here would be to completely drop the notion of autogenerated instance keys altogether (which would fix a bunch of other issues).

When we get two components to join and none of them have keys, then we know we can blindly pair them.

@teh-cmc teh-cmc self-assigned this Aug 25, 2023
@teh-cmc
Copy link
Member Author

teh-cmc commented Aug 26, 2023

The solution proposed above induces way too many cascading changes resulting from having to make cluster keys optional (internally), since everything has been built on the assumption that they should always be present (and rightly so).

A less invasive solution would be to indicate in the query results whether or not the returned instance keys were autogenerated (which we know since all autogenerated keys of a given length point to the same arrow buffer).

@emilk
Copy link
Member

emilk commented Sep 20, 2023

I think this is high-priority for 0.10, since we have introduces potential performance regressions in 0.9 because of it (#3363)

@teh-cmc teh-cmc self-assigned this Sep 20, 2023
teh-cmc added a commit that referenced this issue Sep 20, 2023
What the title says; see the commits for details, it's pretty trivial.

Going from a ~62ms space view build time to ~40ms in the OPF example, so
pretty nice gains overall.

OPF, before:
![image
(17)](https://github.com/rerun-io/rerun/assets/2910679/700fe397-7fa2-4f4c-b661-07b9dfb30937)

OPF, after:

![image](https://github.com/rerun-io/rerun/assets/2910679/4b4ae9db-ee6b-4318-a0ed-ea25e08facdd)


* Fixes #3107
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
📉 performance Optimization, memory use, etc 🔍 re_query affects re_query itself
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants