Implement recording/last-modified-at aware garbage collection #4183

Merged
merged 6 commits into from Nov 9, 2023

Conversation


@teh-cmc teh-cmc commented Nov 8, 2023

Commit by commit, there's renaming involved!

GC will now focus on the oldest-modified recording first.
Tried a lot of fancy things, but a lot of stress testing has shown that nothing worked as well as doing this the dumb way.
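The "dumb way" described above could be sketched roughly as follows. This is a minimal illustration only, not the actual re_viewer code: the `StoreDb` struct, its fields, and `pick_gc_target` are all hypothetical stand-ins, and the timestamp is a plain integer for simplicity.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a recording store; not the real re_viewer type.
struct StoreDb {
    /// Illustrative "last modified" timestamp (e.g. nanos since some epoch).
    last_modified_at: u64,
    size_bytes: u64,
}

/// Pick the recording that was modified least recently: no fancy heuristics,
/// just oldest-modified first.
fn pick_gc_target(store_dbs: &HashMap<String, StoreDb>) -> Option<&String> {
    store_dbs
        .iter()
        .min_by_key(|(_, db)| db.last_modified_at)
        .map(|(id, _)| id)
}
```

The appeal of this approach is that it is trivially predictable under stress: the GC always drains the recording the user is least likely to be actively looking at.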

Speaking of stress testing, the scripts I've used are now committed in the repository. Make sure to try them out when modifying the GC code 😬.

In general, the GC handles stress much better than I thought/hoped:

  • many_medium_sized_single_row_recordings.py, many_medium_sized_many_rows_recordings.py & many_large_many_rows_recordings.py all behave pretty nicely, something like this:
23-11-08_16.41.47.patched.mp4
  • many_large_single_row_recordings.py on the other hand is still a disaster (watch til the end, this slowly devolves into a blackhole):
23-11-08_17.00.12.patched.mp4

This is not a new problem (not to me at least 😬): large recordings with very few rows have always been a nightmare for the GC (not specifically the DataStore GC, but the GC as a whole throughout the entire app).
I've never had time to investigate why, but now we have an issue for it at least:


Checklist

  • I have read and agree to the Contributor Guide and the Code of Conduct
  • I've included a screenshot or gif (if applicable)
  • I have tested demo.rerun.io (if applicable)
  • The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG

@teh-cmc teh-cmc added the `⛃ re_datastore` (affects the datastore itself), `📺 re_viewer` (affects re_viewer itself), and `include in changelog` labels Nov 8, 2023
@teh-cmc teh-cmc marked this pull request as ready for review November 8, 2023 16:21
};

let store_dbs = &mut self.store_bundle.store_dbs;
if store_dbs.len() <= 1 {
Member

Surprised to see this return early when there is 1 store_db? Is this guaranteed to be some kind of special store that we don't want to GC? Please clarify with a comment if so.

Member Author

Uh-oh. No that's just me having shuffled things around one time too many 😬 Nice catch.

Member Author

Hmmm, I went with the following... opinions welcome.

commit 8e6b8853b3751037989e80c26704fbf3d0438a64
Author: Clement Rey <cr.rey.clement@gmail.com>
Date:   Wed Nov 8 19:21:51 2023 +0100

    always GC, but dont remove the last one

diff --git a/crates/re_viewer/src/store_hub.rs b/crates/re_viewer/src/store_hub.rs
index e17fd0d0dd..3457abc1b6 100644
--- a/crates/re_viewer/src/store_hub.rs
+++ b/crates/re_viewer/src/store_hub.rs
@@ -222,9 +222,6 @@ impl StoreHub {
         };
 
         let store_dbs = &mut self.store_bundle.store_dbs;
-        if store_dbs.len() <= 1 {
-            return;
-        }
 
         let Some(store_db) = store_dbs.get_mut(&store_id) else {
             if cfg!(debug_assertions) {
@@ -239,9 +236,21 @@ impl StoreHub {
         let store_size_after =
             store_db.store().timeless_size_bytes() + store_db.store().temporal_size_bytes();
 
+        // No point keeping an empty recording around.
+        if store_db.is_empty() {
+            self.remove_recording_id(&store_id);
+            return;
+        }
+
         // Running the GC didn't do anything.
-        // That's because all that's left in that store is protected rows: it's time to remove it entirely.
-        if store_size_before == store_size_after {
+        //
+        // That's because all that's left in that store is protected rows: it's time to remove it
+        // entirely, unless it's the last recording still standing, in which case we're better off
+        // keeping some data around to show the user rather than a blank screen.
+        //
+        // If the user needs the memory for something else, they will get it back as soon as they
+        // log new things anyhow.
+        if store_size_before == store_size_after && store_dbs.len() > 1 {
             self.remove_recording_id(&store_id);
         }
 


@jleibs jleibs left a comment


Mostly seems like a net improvement relative to today.

However, there's one edge case I'm a bit worried about: when you have several incoming recordings in parallel, you really do want to distribute your GCs as before. In that case, "last modified" is going to jump around somewhat unpredictably as new data comes into the system.

I'm trying to think if there's something we could do with an "overlap" metric. Basically if all recordings are considered overlapping, then we spread out our GC evenly. Otherwise we GC the oldest recording, as implemented here.
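The overlap metric floated above could be sketched like this. Again purely illustrative, assumed names only (`ModWindow`, `all_overlap` are not real re_viewer types): each recording gets a first/last-modified window, and all windows overlap exactly when the latest start is no later than the earliest end.

```rust
/// Hypothetical per-recording modification window (illustrative timestamps).
#[derive(Clone, Copy)]
struct ModWindow {
    first_modified_at: u64,
    last_modified_at: u64,
}

/// A set of intervals all share a common point iff max(start) <= min(end).
/// If that holds, the recordings can be treated as concurrent and GC'd evenly;
/// otherwise, GC the oldest-modified one as implemented in this PR.
fn all_overlap(windows: &[ModWindow]) -> bool {
    let max_start = windows.iter().map(|w| w.first_modified_at).max();
    let min_end = windows.iter().map(|w| w.last_modified_at).min();
    match (max_start, min_end) {
        (Some(s), Some(e)) => s <= e,
        _ => true, // no recordings: vacuously overlapping
    }
}
```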

@teh-cmc teh-cmc merged commit b8ea6af into main Nov 9, 2023
37 checks passed
@teh-cmc teh-cmc deleted the cmc/gc_old_recordings branch November 9, 2023 08:57
@teh-cmc teh-cmc mentioned this pull request Dec 12, 2023
18 tasks
Labels
include in changelog ⛃ re_datastore affects the datastore itself 📺 re_viewer affects re_viewer itself
Successfully merging this pull request may close these issues.

Garbage collection should be aware of app_id/recording_id semantics
2 participants