
Re-enable storage maintenance call #7227

Closed
grigoryk opened this issue Dec 16, 2019 · 34 comments · Fixed by #27689, fork-house/fenix#14, Leland-Takamine/fenix#159 or nathanmkaya/fenix#108

Comments

@grigoryk
Contributor

grigoryk commented Dec 16, 2019

In #2765 we added periodic storage maintenance calls that run once a day (or so), during startup.

In #6937 we saw that this interferes with our fennec migration code, which also runs on startup. So, maintenance was disabled.

In #6938 I proposed a solution for this problem, but it doesn't apply now that we're running migration code in a background service.

This issue is to re-enable the maintenance call in a way that doesn't interfere with other parts of the browser that may need to acquire higher levels of SQL locks (reserved+).
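For context, the shape of the original approach from #2765 was roughly a once-a-day check guarded by a stored timestamp, run during startup. The sketch below is only illustrative; the preference key and the `runMaintenance` hook are placeholders, not Fenix's actual code.

```kotlin
import android.content.Context
import java.util.concurrent.TimeUnit

// Illustrative only: the preference key and `runMaintenance` parameter are
// placeholders, not the implementation that #2765 actually landed.
private const val LAST_MAINTENANCE_PREF = "last_storage_maintenance_ms"

suspend fun maybeRunStorageMaintenance(context: Context, runMaintenance: suspend () -> Unit) {
    val prefs = context.getSharedPreferences("storage_maintenance", Context.MODE_PRIVATE)
    val lastRun = prefs.getLong(LAST_MAINTENANCE_PREF, 0L)
    val now = System.currentTimeMillis()
    // Run at most once per day; maintenance uses the places writer connection,
    // which needs reserved+ SQLite locks and can block other writers (e.g. migration).
    if (now - lastRun >= TimeUnit.DAYS.toMillis(1)) {
        runMaintenance()
        prefs.edit().putLong(LAST_MAINTENANCE_PREF, now).apply()
    }
}
```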


@grigoryk grigoryk added eng:health Improve code health performance Possible performance wins labels Dec 16, 2019
@boek boek added P2 Upcoming release S2 Major Functionality/product severely impaired and a satisfactory workaround doesn't exist labels Dec 18, 2019
@boek
Contributor

boek commented Dec 18, 2019

@grigoryk are we tracking this anywhere? Or is this a nice-to-have in the future?

@mcomella mcomella added this to Needs prioritization in Performance, front-end roadmap Dec 26, 2019
@mcomella mcomella moved this from Needs prioritization to Needs prioritization: waiting in Performance, front-end roadmap Jan 21, 2020
@mcomella
Contributor

Perf self-triage: this doesn't seem like it'll land in GA, so I'm moving it to low impact so we don't have to track it.

@mcomella mcomella moved this from Needs prioritization: waiting to Low impact (unprioritized) in Performance, front-end roadmap Feb 24, 2020
@mcomella mcomella removed the performance Possible performance wins label Aug 3, 2020
@mcomella mcomella removed this from Low impact (unprioritized) in Performance, front-end roadmap Aug 3, 2020
@mcomella
Contributor

mcomella commented Aug 3, 2020

This isn't inherently a performance problem, so I'm removing it from our board. However, the proposed solution may have perf implications; if that's the case, please let us know. Thanks!

@mcomella mcomella added the performance Possible performance wins label Aug 19, 2020
@mcomella mcomella added this to Needs prioritization in Performance, front-end roadmap via automation Aug 19, 2020
@mcomella
Contributor

mcomella commented Aug 19, 2020

From the original issue:

@grigoryk indicated that it does not [run maintenance], which will result in worse perf, more disk usage, etc.

Re-adding to our board because this may impact long term app performance.

I'd guess the solution should happen via WorkManager when the app isn't running. If it does happen while the app is running, please don't put it on the startup critical path: it should probably live in the visualCompletenessQueue if it runs during startup.
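A minimal sketch of the in-app variant, assuming Fenix's visualCompletenessQueue is backed by A-C's `RunWhenReadyQueue`; the `runMaintenance` hook and coroutine scope are stand-ins, not the real components.

```kotlin
import kotlinx.coroutines.CoroutineScope
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.launch
import mozilla.components.support.utils.RunWhenReadyQueue

// Defers the maintenance call until visual completeness has been reported,
// keeping it off the startup critical path.
fun scheduleMaintenanceAfterVisualCompleteness(
    visualCompletenessQueue: RunWhenReadyQueue,
    ioScope: CoroutineScope = CoroutineScope(Dispatchers.IO),
    runMaintenance: suspend () -> Unit,
) {
    visualCompletenessQueue.runIfReadyOrQueue {
        ioScope.launch { runMaintenance() }
    }
}
```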

@mcomella mcomella moved this from Needs prioritization to Top 10 Inter-Team bugs in Performance, front-end roadmap Aug 19, 2020
@jonalmeida jonalmeida assigned jonalmeida and grigoryk and unassigned jonalmeida Sep 28, 2020
@grigoryk grigoryk removed this from ⏳ Sprint Backlog in A-C: Android Components Sprint Planning Oct 5, 2020
@stale

stale bot commented Mar 27, 2021

See: #17373 This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Mar 27, 2021
@stale stale bot closed this as completed Apr 3, 2021
Performance, front-end roadmap automation moved this from Top 10 Inter-Team bugs to Done Apr 3, 2021
@mcomella mcomella added this to Needs prioritization in Performance, front-end roadmap via automation Apr 5, 2021
@mcomella mcomella removed the wontfix label Apr 5, 2021
@mcomella
Contributor

mcomella commented Apr 5, 2021

@grigoryk I noticed stalebot closed this but this seems important. What do you think the urgency/priority is on this issue? If high, do you think it's something the team would have cycles to work on?

@mhammond
Contributor

Thought about two more issues worth considering:

* I think we'd want to use a dedicated writer for this though, so we don't interrupt/cancel the wrong operation by mistake when using a shared writer.

I don't think a dedicated writer can work without the risk of the "other" writer seeing "database is locked" errors. I do agree we need to work out how not to interrupt the wrong operation, though (I think Ben was suggesting a way to "scope" the interrupt, but I haven't thought it through).

* We should also make sure we don't run this every time the device is idle. Maybe just once a day?

The call only removes 6 visits each time it is called, so doing it once per day seems unlikely to effectively keep a cap on the number of visits.

@csadilek
Contributor

csadilek commented Aug 30, 2022

The call only removes 6 visits each time it is called, so doing it once per day seems unlikely to effectively keep a cap on the number of visits.

OK, makes sense! But running it every time the device is idle might also not be ideal for battery consumption. If you all have a suggestion of how often we should ideally run this, please let us know. We could start with a compromise and introduce some upper limit of runs per day otherwise.

@mhammond
Contributor

It's certainly true that we never call it today, so ramping up slowly should in theory have some small benefit. Another challenge is that a world where this has been in place and called regularly (i.e., there will usually be just a few records to purge) is quite different to the world we have today (where there will be many as it catches up).

IOW, calling it daily is still better than what we do today, so let's do that! But it does open the question of telemetry etc.: how will we ever know when we've found the sweet spot?

@bendk
Contributor

bendk commented Aug 31, 2022

I don't think using a dedicated writer will be able to work without the risk of the "other" writer seeing "database is locked" errors. I do agree we need to work out how to not interrupt the wrong operation though (and I think Ben was suggesting a way to "scope" the interrupt, but I haven't thought it through)

I didn't have a plan for how to stop this, so maybe adding another writer is the simplest way to move forward. My one concern is that the number of connections continues to grow as we add more interruptible operations. If that happens, we should probably come up with a different solution.

OK, makes sense! But running it every time the device is idle might also not be ideal for battery consumption. If you all have a suggestion of how often we should ideally run this, please let us know. We could start with a compromise and introduce some upper limit of runs per day otherwise.

I like the idea of starting once per day and increasing based on telemetry. Maybe we count how many devices are still over the size target after running maintenance and try to get that down to some small-ish number.

As for how often: maybe we could return a boolean indicating whether the DB is still above the size target after running maintenance. If that's false, then just run it once a day (maybe even less often if the app isn't used every day). If it's true, then run it more often, with some upper limit.
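A sketch of that scheme, assuming a hypothetical maintenance call that reports whether the DB is still over the size target; the concrete intervals are placeholders.

```kotlin
import java.util.concurrent.TimeUnit

// `runMaintenanceAndCheckSize` is hypothetical: it runs one maintenance pass
// and returns true if the DB is still above the size target afterwards.
suspend fun nextMaintenanceDelayMs(runMaintenanceAndCheckSize: suspend () -> Boolean): Long {
    val stillOverTarget = runMaintenanceAndCheckSize()
    return if (stillOverTarget) {
        // Still catching up: run again sooner, subject to some upper limit per day.
        TimeUnit.HOURS.toMillis(4)
    } else {
        // At or below the target: once a day is enough (or less if the app isn't used daily).
        TimeUnit.DAYS.toMillis(1)
    }
}
```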

@Mugurell
Contributor

Saw this next in our backlog, but it seems like there are still ongoing discussions about the plan and a dependency on AS?
Not sure what the next steps are for Fenix. @csadilek @mhammond @bendk.

@mhammond
Contributor

mhammond commented Oct 10, 2022

I thought we had a plan written somewhere in one place but can't find it. I believe we agreed:

  • We'd start small, running once per day, with a target size of 75 MiB
  • We accept that this is not interruptible because we expect it to complete very quickly. If we don't already, it would be fine to interrupt the "writer" connection at shutdown, which would interrupt any executing operations, possibly including this one; that's fine too.
  • Christian suggested that it might make sense to put this behind a feature flag in Nightly.
  • We have already landed telemetry in app-services in #5123, so as long as we have a fairly recent app-services we should get telemetry. Our team will monitor that and offer advice for next steps (which will hopefully be to run more often than once per day).

I'm sure @bendk or @csadilek will correct what I got wrong or add what I forgot.
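Put together, the starting point could look roughly like the sketch below. The `runMaintenance(dbSizeLimit)` shape and the feature-flag check are assumptions based on this discussion, not the final A-C API.

```kotlin
// 75 MiB target size agreed above; passing a UInt `dbSizeLimit` mirrors the
// shape discussed for app-services but is an assumption here.
val DB_SIZE_LIMIT: UInt = 75u * 1024u * 1024u

suspend fun runDailyStorageMaintenance(
    maintenanceFeatureEnabled: Boolean, // e.g. a Nightly-only feature flag
    runMaintenance: suspend (dbSizeLimit: UInt) -> Unit,
) {
    if (!maintenanceFeatureEnabled) return
    // Not interruptible on its own: it is expected to complete quickly, and
    // interrupting the writer connection at shutdown covers it anyway.
    runMaintenance(DB_SIZE_LIMIT)
}
```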

@csadilek
Contributor

csadilek commented Oct 11, 2022

Thanks @mhammond. Yes, I think this captures it all. The rest of the discussion happened in mozilla/application-services#5115.

@Mugurell Please sync up with @jonalmeida and @kycn though, because I think they had planned to look into this soon, or possibly already started.

@kycn kycn self-assigned this Oct 12, 2022
kycn added a commit to kycn/fenix that referenced this issue Nov 3, 2022
Re-enable storage maintenance call by introducing WorkManager worker on A-C side and consuming it from Fenix.
The work request is periodic and the repeat interval is 24h. It requires the device to be idle and not to have
low battery. This feature is available only for Nightly for now.
@github-actions github-actions bot added the eng:reopen-for-qa Reopens and tags the issue for QA needed when the issue is merged label Nov 3, 2022
@kycn
Contributor

kycn commented Nov 3, 2022

Status update:
A PeriodicWorkRequest is introduced in A-C with a 24h repeat interval.
It runs when the following constraints are met: setRequiresBatteryNotLow(true) and setRequiresDeviceIdle(true). Fenix enqueues this request to WorkManager if the visualCompletenessQueue is ready.
The related PRs: Fenix and A-C.
The feature is behind a feature flag and is enabled only for the Debug and Nightly variants.
@csadilek @jonalmeida @mhammond @bendk
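For reference, the scheduling described above maps to standard WorkManager APIs roughly as in the sketch below; the worker class and unique-work name are illustrative, not the exact A-C implementation.

```kotlin
import android.content.Context
import androidx.work.Constraints
import androidx.work.CoroutineWorker
import androidx.work.ExistingPeriodicWorkPolicy
import androidx.work.PeriodicWorkRequestBuilder
import androidx.work.WorkManager
import androidx.work.WorkerParameters
import java.util.concurrent.TimeUnit

// Illustrative worker; the real A-C worker does the places maintenance call.
class StorageMaintenanceWorker(
    context: Context,
    params: WorkerParameters,
) : CoroutineWorker(context, params) {
    override suspend fun doWork(): Result {
        // Call into storage maintenance here (omitted in this sketch).
        return Result.success()
    }
}

fun scheduleStorageMaintenance(context: Context) {
    val constraints = Constraints.Builder()
        .setRequiresBatteryNotLow(true)
        .setRequiresDeviceIdle(true) // API 23+
        .build()

    val request = PeriodicWorkRequestBuilder<StorageMaintenanceWorker>(24, TimeUnit.HOURS)
        .setConstraints(constraints)
        .build()

    // KEEP avoids rescheduling (and resetting the period) on every startup.
    WorkManager.getInstance(context).enqueueUniquePeriodicWork(
        "StorageMaintenanceWorker.periodicWork",
        ExistingPeriodicWorkPolicy.KEEP,
        request,
    )
}
```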

@mergify mergify bot closed this as completed in #27689 Nov 3, 2022
mergify bot pushed a commit that referenced this issue Nov 3, 2022
Performance, front-end roadmap automation moved this from Perf P2-ish (unordered) to Done Nov 3, 2022
@github-actions github-actions bot added this to the 108 milestone Nov 3, 2022
@github-actions github-actions bot reopened this Nov 3, 2022
@github-actions github-actions bot added eng:qa:needed QA Needed and removed eng:reopen-for-qa Reopens and tags the issue for QA needed when the issue is merged labels Nov 3, 2022
Performance, front-end roadmap automation moved this from Done to Needs triage Nov 3, 2022
@jonalmeida
Contributor

No QA testing is needed here.

Performance, front-end roadmap automation moved this from Needs triage to Done Nov 8, 2022
@jonalmeida jonalmeida removed the eng:qa:needed QA Needed label Nov 8, 2022
JohanLorenzo pushed a commit to mozilla-mobile/firefox-android that referenced this issue Feb 14, 2023