fix: prevent duplicate backups on container restart#240
Merged
Conversation
7de7b9b to
5050a12
Compare
Seed _last_poll from the newest backup file on disk during BackupCollector init. This way should_poll() correctly waits until the configured interval has passed since the last actual backup, even after a container restart. Previous approach (checking backup age in collect()) had two issues flagged in review: - A skipped collect still advanced the schedule via record_success() - Manual backups suppressed the next automatic backup The seed approach avoids both: _last_poll is set once on init from disk state, and only the orchestrator updates it after a real collect() run.
5050a12 to
d475977
Compare
Owner
Author
Reworked approach (v2)Changed from skip-in-collect to seed-on-init after review: Old approach: Check backup age in
New approach: Seed
Tests now verify |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
The
BackupCollectorcreates a backup immediately on every container startup because_last_pollis initialized to0.0— meaningshould_poll()always returnsTrueon the first tick. In environments where the container restarts frequently (e.g. via Dockhand auto-updates during active development), this leads to multiple backups per day despite a 24h interval being configured.Observed on production instance (
v2026-03-14.726):5 backups in 7 hours with a daily interval configured — each one triggered by a container restart.
Fix
On init,
_seed_last_poll()reads the newest backup file's modification timestamp from disk and sets_last_pollaccordingly. This wayshould_poll()correctly waits until the configured interval has elapsed since the last actual backup, even after a container restart.Design decision: The seed uses the newest backup file regardless of whether it was created by the scheduled collector or a manual "Backup Now" click. This means a manual backup can shift the automatic schedule after a restart. This is intentional — the guarantee is "at least one backup every <interval>", not "backups at a fixed time of day". A manual backup is a valid backup, and waiting another full interval after it is operationally correct.
Changes
app/modules/backup/collector.py_seed_last_poll()in__init__, documented design decisiontests/test_collectors.pyTestBackupCollectortests covering seed behavior,should_poll(), and manual backup inclusionTest plan
should_poll()returns True → backup createdshould_poll()returns False → no duplicate