Rework places maintenance code #5115
Comments
For that second part, I think we should try one more time to get a good interruption system. What if [...]
This seems like a nice simplification of the current [...]
The main drawback I see is that it becomes harder to interrupt a specific operation, rather than a type of operation. However, it doesn't seem like we need that.
Stepping back a little, I think the interruption requirement comes from mozilla-mobile/fenix#7227 (comment)
So this seems to be met by our existing interrupt support - we don't actually care what's running, just interrupt the world. But then the discussion turned into making sure we don't interrupt the "wrong" operation - but is there actually a wrong operation at shutdown? Or is there a non-shutdown use-case I've forgotten about? @jonalmeida / @csadilek, what are the actual constraints here?
@mhammond Yes, I agree that interrupting the world would be fine in case the app is being shut down, but the case we wanted to discuss with you is this:
Ideally, we should then cancel the maintenance operation and run again the next time the device is idle. If we keep it running we risk slowing down the app / device. If we interrupt via the single [...]
If we're reasonably confident that a single [...]
I am currently leaning towards not interrupting (unless we can make sure it's the right operation), but wondering what you think? Do we have a rough idea how long a single execution of [...]
So seems like we're leaning towards not interrupting. What if we updated the result of the operation to give you some metrics that could be useful for both scheduling purposes and double checking that this isn't leading to pathological cases? I'm thinking we can give a result with the following fields:
AFAICT there's no chance of ending up in a bad state.
I don't have an estimate, but I think Grisha's analysis in fenix#7227 is correct. In almost all cases it should take a very short time, but in rare cases it might take a longer time. I think the most likely time is the very first VACUUM we run after not running it for a while.
@bendk yes, if you and @mhammond agree as well. I think this would be a good starting point.
That sounds good. Then we can record telemetry for it and land in Nightly (maybe even behind a feature flag first). This will allow us to look at the data esp. total time taken, and we can still decide if we need to add interrupting logic later. I expect the data to be significantly different on Release though (more diverse devices), but even then we will learn and can adjust.
What is the target size in this case? /cc @jonalmeida
Whatever value you pass in to us. Desktop uses 75 MiB which seems like a good starting point (https://searchfox.org/mozilla-central/source/toolkit/components/places/PlacesExpiration.sys.mjs#47)
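To make the "whatever value you pass in" point concrete, here is a minimal sketch of a caller-supplied size limit. The constant and function names are hypothetical and not taken from app-services; only the 75 MiB figure comes from the comment above.

```rust
/// Hypothetical constant: the 75 MiB starting point mentioned above, in bytes.
pub const DEFAULT_DB_SIZE_LIMIT_BYTES: u64 = 75 * 1024 * 1024;

/// Hypothetical helper: returns how many bytes the database is over the
/// caller-supplied limit, or `None` if it is at or under the limit and
/// no pruning would be needed.
pub fn bytes_over_limit(db_size_bytes: u64, limit_bytes: u64) -> Option<u64> {
    db_size_bytes
        .checked_sub(limit_bytes)
        .filter(|over| *over > 0)
}
```

A consumer would pass its own limit, for example `bytes_over_limit(current_size, DEFAULT_DB_SIZE_LIMIT_BYTES)`, and only prune when the result is `Some(_)`.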
One slight tweak after writing this code: I think we should record metrics on time taken, but not return it as part of the metrics struct. The reason is that Glean's timing distribution metrics require you to use its timer: you can't just pass it a specific value to record, nor can you get the value that it recorded. |
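A rough sketch of that split, assuming a result struct that carries no duration and a timer that owns its own clock. Everything here is hypothetical: the field names are illustrative, and `OperationTimer` / `LocalTimer` are stand-ins written for this example, not Glean's API.

```rust
use std::time::{Duration, Instant};

/// Hypothetical result type. Note there is no "time taken" field:
/// timing is recorded by the timer, not returned to the caller.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MaintenanceMetrics {
    pub pruned_visits: bool,
    pub db_size_before: u64,
    pub db_size_after: u64,
}

/// Stand-in for a telemetry timer that measures elapsed time itself.
pub trait OperationTimer {
    fn start(&mut self);
    fn stop_and_accumulate(&mut self);
}

/// In-memory timer used only for illustration and testing.
pub struct LocalTimer {
    started: Option<Instant>,
    pub samples: Vec<Duration>,
}

impl LocalTimer {
    pub fn new() -> Self {
        LocalTimer { started: None, samples: Vec::new() }
    }
}

impl OperationTimer for LocalTimer {
    fn start(&mut self) {
        self.started = Some(Instant::now());
    }

    fn stop_and_accumulate(&mut self) {
        // Record a sample only if `start` was called first.
        if let Some(t0) = self.started.take() {
            self.samples.push(t0.elapsed());
        }
    }
}

/// Wraps the maintenance call with the timer and returns the metrics
/// unchanged, so the duration never passes through the result struct.
pub fn timed_maintenance<T: OperationTimer>(
    timer: &mut T,
    run: impl FnOnce() -> MaintenanceMetrics,
) -> MaintenanceMetrics {
    timer.start();
    let metrics = run();
    timer.stop_and_accumulate();
    metrics
}
```

The consumer keeps ownership of the timer, which matches the constraint described above: the duration is accumulated where the timer lives and the library only reports size and pruning information.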
AFAICT all the app-services code is ready now that #5123 has been merged. Closing this one.
A while back we updated the run_maintenance function to prune old history if the places DB was over a certain limit. It looks like Fenix wants to start using it, but we've identified a few areas that could be improved:
[...] run_maintenance() call. The interrupt() method will interrupt any operation, so it's not really safe for consumers to call it if they want to only interrupt maintenance. We should add support for interruption that's limited to certain kinds of operations.
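One way interruption limited to a kind of operation could look is sketched below. This is a cooperative, flag-based illustration only: `OperationKind`, `InterruptFlags`, and `run_in_steps` are hypothetical names, and the sketch does not model how the places code actually interrupts SQLite work.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical operation kinds; real code would likely have more.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OperationKind {
    Maintenance,
    Other,
}

/// Error returned when an operation notices its kind was interrupted.
#[derive(Debug, PartialEq, Eq)]
pub struct Interrupted;

/// One shared flag per operation kind. Cloning shares the same flags.
#[derive(Clone, Default)]
pub struct InterruptFlags {
    maintenance: Arc<AtomicBool>,
    other: Arc<AtomicBool>,
}

impl InterruptFlags {
    fn flag(&self, kind: OperationKind) -> &AtomicBool {
        match kind {
            OperationKind::Maintenance => &*self.maintenance,
            OperationKind::Other => &*self.other,
        }
    }

    /// Requests interruption of every operation of the given kind only.
    pub fn interrupt(&self, kind: OperationKind) {
        self.flag(kind).store(true, Ordering::SeqCst);
    }

    /// Called by an operation between units of work.
    pub fn check(&self, kind: OperationKind) -> Result<(), Interrupted> {
        if self.flag(kind).load(Ordering::SeqCst) {
            Err(Interrupted)
        } else {
            Ok(())
        }
    }
}

/// Toy operation: performs `steps` units of work, checking its own
/// kind's flag before each one. Returns the number of steps completed.
pub fn run_in_steps(
    flags: &InterruptFlags,
    kind: OperationKind,
    steps: usize,
) -> Result<usize, Interrupted> {
    let mut done = 0;
    for _ in 0..steps {
        flags.check(kind)?;
        done += 1;
    }
    Ok(done)
}
```

With this shape, calling `flags.interrupt(OperationKind::Maintenance)` stops maintenance at its next check while operations of other kinds keep running. The trade-off noted in the discussion still applies: this targets a type of operation, not one specific call.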