Visible time range queries skip random chunks of data (for 2D points) #5686

Closed
roym899 opened this issue Mar 26, 2024 · 5 comments · Fixed by #6177
Labels
🪳 bug Something isn't working 🔍 re_query affects re_query itself

@roym899
Collaborator

roym899 commented Mar 26, 2024

Describe the bug
It seems like visible time range queries sometimes miss chunks, i.e., one or more consecutive datapoints. I have observed this with the drone example (with Points3D) before, but had trouble reproducing it reliably in that case. I believe this has been happening since caching was introduced.

To Reproduce
Check this rrd file.

It contains sampled images over time (logged as colored 2D points) and the image changes every second. When setting the visibility history to -1s and scrolling along the timeline, black parts without any data appear despite data being logged at these times. The sections with missing data change when restarting the viewer, but there always seem to be one or more. Somewhat easy to find by just scrolling along the axis looking for black parts to appear.

visibletimerange.mp4

The issue can be seen at around 0:18; before that, I'm just showing what the data looks like.
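
To make the reproduction setting concrete, here is a minimal Python sketch of how an offset-based visible time range such as -1s could resolve against the time cursor. This is not Rerun's API: `resolve_visible_range` and `query_range` are hypothetical helpers, and timestamps are plain floats in seconds. A correct query should return every datapoint logged inside the resolved window, so an empty result while data exists there is the symptom described above.

```python
from bisect import bisect_left, bisect_right


def resolve_visible_range(cursor, start_offset, end_offset=0.0):
    """Turn cursor-relative offsets into an absolute, inclusive time range."""
    return (cursor + start_offset, cursor + end_offset)


def query_range(sorted_times, time_range):
    """Return all timestamps inside the inclusive range (uncached ground truth)."""
    lo, hi = time_range
    return sorted_times[bisect_left(sorted_times, lo):bisect_right(sorted_times, hi)]


# Hypothetical data: one datapoint every 0.25 s for 10 s.
times = [i * 0.25 for i in range(41)]

# Visibility history of -1 s with the time cursor at t = 5 s.
visible = resolve_visible_range(cursor=5.0, start_offset=-1.0)
points = query_range(times, visible)
```

Comparing a cached query against a ground-truth scan like `query_range` is one simple way to detect the missing chunks automatically.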

Desktop (please complete the following information):

  • OS: Ubuntu 22.04

Rerun version

rerun_py 0.14.1 [rustc 1.74.0 (79e9716c9 2023-11-13), LLVM 17.0.4] x86_64-unknown-linux-gnu release-0.14.1 74f1c23, built 2024-02-29T11:00:55Z

@roym899 roym899 added 🪳 bug Something isn't working 👀 needs triage This issue needs to be triaged by the Rerun team labels Mar 26, 2024
@Wumpf Wumpf added this to the Triage milestone Mar 26, 2024
@Wumpf Wumpf added 🔍 re_query affects re_query itself and removed 👀 needs triage This issue needs to be triaged by the Rerun team labels Mar 26, 2024
@Wumpf Wumpf changed the title from "Visible time range queries skip random chunks of data" to "Visible time range queries skip random chunks of data (for 2D points)" Mar 26, 2024
@teh-cmc
Member

teh-cmc commented Mar 26, 2024

Oooo, nice catch. That's interesting -- off the top of my head this is very likely to be caused by the crazy last-minute shenanigans that went in for read recursivity etc.

We're not too far away from landing the new cached range APIs at this point so maybe the best approach here is to wait for that and see if that fixes it (among many other things...) 🤔

@teh-cmc teh-cmc self-assigned this Mar 26, 2024
@teh-cmc
Member

teh-cmc commented Apr 29, 2024

Somehow, I can still reproduce this on main, where all the APIs have been rewritten from scratch.

I can also reproduce it in single-threaded mode.

Not quite sure what's going on yet, but the API is clearly returning 0 results when it shouldn't.

@roym899
Collaborator Author

roym899 commented Apr 29, 2024

Fwiw, I think it might be affected by how quickly I start scrolling around in the data. If I just open the rrd file and don't touch the viewer for a while, it happens less often than when I immediately scroll around while the data is still being loaded.

@teh-cmc
Member

teh-cmc commented Apr 29, 2024

Interesting, might be related to invalidation then.

@teh-cmc teh-cmc modified the milestones: Triage, 0.16 Apr 30, 2024
@teh-cmc
Member

teh-cmc commented Apr 30, 2024

I have an automated reproduction of this. It's a pretty humongous bug right in the middle of the range cache, hard to unsee once you've seen it 😄

I've implemented the same exact bug twice, which is why it exists both in the old and the new. I'm that consistent.

It can only affect offset-based range queries and requires the user to jump the time cursor in a specific pattern to happen, which is why it has gone unnoticed until now.
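
The thread does not spell out the actual defect in re_query's range cache; the test in the fix PR is the authoritative description. Purely as an illustration of how a cache can return zero results only after a particular jump pattern, here is a toy Python model. Everything in it is an assumption made for the example: the `NaiveRangeCache` class, its single-span design, and the integer timestamps.

```python
class NaiveRangeCache:
    """Toy cache that remembers ONE contiguous time span (hypothetical design)."""

    def __init__(self, store):
        self.store = sorted(store)  # ground-truth timestamps
        self.span = None            # (lo, hi) the cache believes is fully cached
        self.cached = set()         # timestamps that were actually fetched

    def _fetch(self, lo, hi):
        return {t for t in self.store if lo <= t <= hi}

    def query(self, lo, hi):
        if self.span is None:
            self.cached |= self._fetch(lo, hi)
            self.span = (lo, hi)
        else:
            s_lo, s_hi = self.span
            # Fetch only the parts of the query that lie outside the span.
            if lo < s_lo:
                self.cached |= self._fetch(lo, min(hi, s_lo))
            if hi > s_hi:
                self.cached |= self._fetch(max(lo, s_hi), hi)
            # BUG: widening the span also claims the never-fetched gap
            # between the old span and a disjoint new query.
            self.span = (min(lo, s_lo), max(hi, s_hi))
        return sorted(t for t in self.cached if lo <= t <= hi)


cache = NaiveRangeCache(range(10))
first = cache.query(1, 2)   # cursor at 2 with a history of -1
second = cache.query(7, 8)  # jump far ahead: span becomes (1, 8)
third = cache.query(4, 5)   # jump back into the gap: nothing was fetched there
```

In this toy model `third` comes back empty even though the store holds data at 4 and 5, and the outcome depends entirely on the order of the jumps.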

Should have the PR soon, if the headache finally goes away...

teh-cmc added a commit that referenced this issue May 3, 2024
A cheap fix for #5686, both in terms of added code and runtime
performance, as it only kicks in in very particular circumstances.
The test added in this PR explains the situation better than words ever
could.

The proper solution here would be to implement proper bucketing in the
range cache (#4810), but that's a much bigger change that I don't want
to get into until after the upcoming drastic changes to the datastore.

- Fixes #5686
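
As a rough idea of what bucketing in a range cache could mean, the Python sketch below tracks several disjoint cached spans instead of a single one and merges them only when they overlap or touch. This is an assumption for illustration: the design discussed in #4810 is not described in this thread, and `BucketedRangeCache` is a hypothetical name, not Rerun code.

```python
class BucketedRangeCache:
    """Toy cache that tracks disjoint cached spans (hypothetical design)."""

    def __init__(self, store):
        self.store = sorted(store)  # ground-truth timestamps
        self.buckets = []           # sorted, disjoint (lo, hi) spans fully cached
        self.cached = set()         # timestamps that were actually fetched

    def _fetch(self, lo, hi):
        return {t for t in self.store if lo <= t <= hi}

    @staticmethod
    def _merge(spans):
        merged = []
        for lo, hi in sorted(spans):
            if merged and lo <= merged[-1][1]:  # overlaps or touches previous span
                merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
            else:
                merged.append((lo, hi))
        return merged

    def query(self, lo, hi):
        # Fetch the requested span unless a single bucket already covers it.
        if not any(b_lo <= lo and hi <= b_hi for b_lo, b_hi in self.buckets):
            self.cached |= self._fetch(lo, hi)
            self.buckets = self._merge(self.buckets + [(lo, hi)])
        return sorted(t for t in self.cached if lo <= t <= hi)


cache = BucketedRangeCache(range(10))
cache.query(1, 2)           # cursor at 2 with a history of -1
cache.query(7, 8)           # jump far ahead: a second, separate bucket
gap = cache.query(4, 5)     # the gap is not covered, so it gets fetched
bridge = cache.query(2, 4)  # touches two buckets, which then merge
```

Because a gap is never marked as cached until it has been fetched, the jump pattern no longer matters. The sketch refetches a whole query span on a partial miss, which trades some redundant work for simplicity.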