Full Maintenance: snapshot GC failure: error running snapshot gc: unable to find in-use content ID: error walking snapshot tree #1402
Comments
FYI - we debugged this on Slack. The issue was caused by a rogue index blob write that happened after the epoch was already finalized. The good news is that we were able to leverage the epoch structure and roll back to the last-known-good epoch.
Here's the sequence of index blob writes showing that index for epoch 226 was written while we were in epoch 228:
Reduces likelihood of kopia#1402 until we fix the root cause.
Digging deeper, the client that caused this bad write slept for a very long time:
Timeline of that one snapshot session:
17:11:52.946168 wrote-pack q580acdabce2b47ca63ba648f1e951be7-s4bd4ee9a844b2be7109 1941010
The dual time measurement is described in https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md
The fix is to discard the hidden monotonic time component of time.Time by converting to unix time and back. Reviewed usage of clock.Now() and replaced it with clock.WallClockTime() or time.Now() as appropriate.
The problem in kopia#1402 was that the passage of time was measured using monotonic time and not wall clock time. When the computer goes to sleep, the monotonic reading stays continuous, with no jump for the time spent asleep, while wall clock time makes a leap when the computer wakes up. That leap is the behavior that the epoch manager (and most other components in Kopia) relies upon.
Fixes kopia#1402
The dual time measurement is described in https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md
The fix is to discard the hidden monotonic time component of time.Time by converting to unix time and back. Reviewed usage of clock.Now() and replaced it with timetrack.StartTimer() when measuring time.
The problem in #1402 was that the passage of time was measured using monotonic time and not wall clock time. When the computer goes to sleep, the monotonic reading stays continuous, with no jump for the time spent asleep, while wall clock time makes a leap when the computer wakes up. That leap is the behavior that the epoch manager (and most other components in Kopia) relies upon.
Fixes #1402
Co-authored-by: Julio Lopez <julio+gh@kasten.io>
hello!
i've run into another issue running my weekly maintenance.
kopia contents verify doesn't find any errors, but kopia maintenance run --full ends up throwing:
this is on v0.9.2 on macOS 12 beta, as installed from homebrew. repository is hosted on wasabi, and has five clients across three timezones, all using KopiaUI. three are windows and two are mac. i'm the only one that uses the CLI and i only use it when using it with my "archival" config file or running maintenance.
kopia blob stats
kopia contents stats
amusingly, while kopia contents verify returns no errors, kopia contents show k619061c02b693321c886c285b314084d returns content not found.
let me know what next steps should be, i'm around on slack, thanks again for the awesome software and all your help!