Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Full Maintenance: snapshot GC failure: error running snapshot gc: unable to find in-use content ID: error walking snapshot tree #1402

Closed
zyradyl opened this issue Oct 17, 2021 · 4 comments · Fixed by #1437

Comments

@zyradyl
Copy link
Contributor

zyradyl commented Oct 17, 2021

hello!

i've run into another issue running my weekly maintenance. kopia contents verify doesn't find any errors, but kopia maintenance run --full ends up throwing:

ERROR snapshot GC failure: error running snapshot gc: unable to find in-use content ID: error walking snapshot tree: error verifying k619061c02b693321c886c285b314084d: error getting content info for k619061c02b693321c886c285b314084d: content not found

this is on v0.9.2 on macOS 12 beta, as installed from homebrew. repository is hosted on wasabi, and has five clients across three timezones, all using KopiaUI. three are windows and two are mac. i'm the only one that uses the CLI and i only use it when using it with my "archival" config file or running maintenance.

kopia blob stats

Count: 161784
Total: 3.5 TB
Average: 21.4 MB
Histogram:

        0 between 0 B and 10 B (total 0 B)
        3 between 10 B and 100 B (total 42 B)
     1619 between 100 B and 1 KB (total 274.5 KB)
     4173 between 1 KB and 10 KB (total 17 MB)
     2227 between 10 KB and 100 KB (total 50 MB)
      743 between 100 KB and 1 MB (total 264.1 MB)
     4460 between 1 MB and 10 MB (total 15.9 GB)
   148556 between 10 MB and 100 MB (total 3.4 TB)

kopia contents stats

Count: 3722100
Total Bytes: 3.6 TB
Total Packed: 3.4 TB (compression 6.3%)
By Method:
  (uncompressed)         count: 2765744 size: 3.2 TB
  zstd                   count: 952514 size: 408.8 GB packed: 182.4 GB compression: 55.4%
  zstd-best-compression  count: 3383 size: 4.3 GB packed: 3.5 GB compression: 18.5%
  zstd-fastest           count: 459 size: 1.6 GB packed: 1.3 GB compression: 16.5%
Average: 1 MB
Histogram:

        0 between 0 B and 10 B (total 0 B)
    13956 between 10 B and 100 B (total 1 MB)
   877340 between 100 B and 1 KB (total 465.7 MB)
   812603 between 1 KB and 10 KB (total 2.8 GB)
   609853 between 10 KB and 100 KB (total 24.9 GB)
   561270 between 100 KB and 1 MB (total 189.9 GB)
   847078 between 1 MB and 10 MB (total 3.2 TB)
        0 between 10 MB and 100 MB (total 0 B)

amusingly while kopia contents verify returns no errors, kopia contents show k619061c02b693321c886c285b314084d returns content not found.

let me know what next steps should be, i'm around on slack, thanks again for the awesome software and all your help!

@jkowalski
Copy link
Contributor

jkowalski commented Oct 18, 2021

FYI - we debugged this on Slack. The issue was caused by a rogue index blob write that happened after the epoch was already finalized.

The good news is we were able to leverage the epoch structure and rolled back to a last-known-good epoch.
This should have not happened, but clearly something is missing - we likely have a bug somewhere in the epoch manager I have logs to dig through, will post updates here.

@jkowalski
Copy link
Contributor

Here's the sequence of index blob writes showing that index for epoch 226 was written while we were in epoch 228:

==> epoch 226 started

2021-10-16 16:30:35 CDT xn226_18ceac9bae6576150dd1dcd3a5cda7b7-s7594687f7ae1b502109-c1              17683 
2021-10-16 16:31:08 CDT xn226_ceed2247f3f93f06e28c4ac2ccbc168e-s1097767a6ee74e92109-c1               6150 
2021-10-16 16:40:13 CDT xn226_42cfb574b3911b1ef78c49860c1d8db1-s4b048b07360c7eb7109-c1                143 
2021-10-16 16:40:17 CDT xn226_67db09178b60f15c3c337176ccbf366d-s1421441276b8c78e109-c1               4153 
2021-10-16 16:44:02 CDT xn226_d0918686acc8e7e01c87850160669293-s836922e6e1dcc17b109-c1              22631 
2021-10-16 16:46:13 CDT xn226_28135c40afd82f5e89d8f86af23054b4-se65af9077130f3a2109-c1              22291 
2021-10-16 16:46:55 CDT xn226_04e6da16c71381c2a831b42d7ddce673-s6c4c1f882e75b46e109-c1               1615 
2021-10-16 16:50:16 CDT xn226_ea70a7212f87738b5eb36d84596fc04a-s345b35a98a12f929109-c1               3541 
2021-10-16 16:50:35 CDT xn226_8a2661ebbe3a45359b2678cf36ee8482-s7e55bb28a93ca2b8109-c1              32482 
2021-10-16 17:00:19 CDT xn226_69253e019f7ca1bc315eb03440d870a1-sd0f11185e68d788c109-c1               8743 
2021-10-16 17:00:39 CDT xn226_e2a379dbdfad5dc33b98c6e69f2bb01f-s3df437958f31ce5e109-c1              19809 
2021-10-16 17:01:15 CDT xn226_3636a03a6a6cd7b554aca207a158a84a-s9b29d76dcdc3d24a109-c1              16511 
2021-10-16 17:01:31 CDT xn226_02c2c53e5d98771438e1006f0e109ba6-sd72a809ee58884c2109-c1                143 
2021-10-16 17:10:16 CDT xn226_40950df658daba34f8e8165a6e30c7dd-s75f263b30eb254e2109-c1               4221 
2021-10-16 17:10:47 CDT xn226_9bd7736e990c27a42d1a9d9586004409-s3799885c1081e890109-c1              23676 
2021-10-16 17:16:15 CDT xn226_75365dfc223ca0612f42e0070e960293-sbda89f7d9e2c9e1c109-c1              11402 
2021-10-16 17:20:14 CDT xn226_72e6e37442006edb482bb2a6a49869b9-sb3eb2186c2849686109-c1                143 
2021-10-16 17:20:15 CDT xn226_e3973671cc471971ea3d51f8151065bc-sa95c7fd67a866c38109-c1               3065 
2021-10-16 17:20:44 CDT xn226_06ebd5d2c6b03ee685f60cbfcf26e603-s5b8738e539f9bb9c109-c1              15151 
2021-10-16 17:30:15 CDT xn226_5fbad871553cb309c5f0f7bab4fad037-sd5f6e7e23a005b2f109-c1               5887 
2021-10-16 17:30:51 CDT xn226_1f46ed4ce642f4bf5166895dd17f2478-s716c997949a5e271109-c1              14233 
2021-10-16 17:31:15 CDT xn226_e66ffa4bbd19414ec2c2f64a3cbc91db-s5f950db4cfdbc579109-c1              13458 
2021-10-16 17:32:07 CDT xn226_66d3e982d4c501d07b54f4d2453cd378-sbd42279cf9f12472109-c1               1615 
2021-10-16 17:40:12 CDT xn226_8d65cc8a5fb5ee29a8d3d45656978fa6-s0f9384a66c5c8163109-c1                143 
2021-10-16 17:40:15 CDT xn226_975ce284d9a719b7d2d6d5f370c14d19-s445f7df456729a0a109-c1               3745 
2021-10-16 17:40:28 CDT xn226_fec4e0ebdf1bb50f42651ff4db2b4ec2-s600031a385802514109-c1               8301 
2021-10-16 17:46:19 CDT xn226_b1f2ab6a17bff36f20212214ea303af4-s6d8556f552fb0c74109-c1              12576 
2021-10-16 17:50:15 CDT xn226_2b0411b7923bf1632b4f9438dd8dde9c-se0e6925fd4f8f716109-c1               4153 
2021-10-16 17:50:28 CDT xn226_2ae1b66205ee8c191ac136ab8280272c-sbb738f7716bedb01109-c1               7927 
2021-10-16 18:00:15 CDT xn226_12a44035d89b27b1dad17c68076a6e0c-s0e76f5b25ca6653f109-c1               4187 
2021-10-16 18:00:28 CDT xn226_29560341f5848c0475063d15326b3e5c-scd3e94e8317e80d3109-c1               5921 
2021-10-16 18:01:04 CDT xn226_5dc870365fc66b1deb48edaca63f9076-s0aeb358f7190e6f6109-c1              11674 
2021-10-16 18:01:17 CDT xn226_a5c9d8020588a166019716b65ed67d2e-s089b81b25dc8f009109-c1                143 
2021-10-16 18:10:15 CDT xn226_06ba437304cd5b66ce5eca204136bafc-s251fa80dba72cf2a109-c1               3201 
2021-10-16 18:10:28 CDT xn226_336fa6cee7fb026bc91819c471e66852-scb724a4d0831d914109-c1               5207 
2021-10-16 18:16:26 CDT xn226_abcd564b675580fe3c6c56fc384e4504-s512cb70dbfd9ecb0109-c1              16103 
2021-10-16 18:20:12 CDT xn226_ba366e0039f4c248a0c776fdeaa6897d-s15554e2a32155f37109-c1                143 
2021-10-16 18:20:17 CDT xn226_c6978ae941160182a11aa7b4ede4d462-sb124f684c787930c109-c1               3643 
2021-10-16 18:20:31 CDT xn226_6eeb73736f764cdb8c1c6a41e848a0d2-sf994fb4eab4ea885109-c1               9609 
2021-10-16 18:22:16 CDT xn226_0d4cf5e55f07cbb9e66225ab33451952-s3d3fd41e996b3464109-c1               1707 
2021-10-16 18:30:16 CDT xn226_f840222cbadf74781d1cddf5a57332c5-s5944e7a050332426109-c1               8131 
2021-10-16 18:30:28 CDT xn226_ef1846a6a0e5280202ada89799b7937a-s89b83a136c987453109-c1               5921 
2021-10-16 18:31:19 CDT xn226_58a37d42e467125ebec9d62aa1a5d407-se51ea6de5ec47436109-c1              23345 
2021-10-16 18:40:12 CDT xn226_5a773f3b81fafa18b9d3136016df6e86-s7e4bd855906730f9109-c1                143 
2021-10-16 18:40:15 CDT xn226_d4c9de0e52dd1a7a27b169b024495ac9-s4758993a856a3027109-c1               3983 
2021-10-16 18:40:28 CDT xn226_1ade0e506d2078d9aea25ba149dab2f5-s339700ce055c6dd9109-c1               5173 
2021-10-16 18:46:18 CDT xn226_f5cedc48efa5999326527a1e6d821590-sa9c7b801fc834741109-c1              16001 
2021-10-16 18:50:18 CDT xn226_df2e1a59406da06cbd8173d377c0553d-s71aeaa44e8309332109-c1               7621 
2021-10-16 18:50:30 CDT xn226_3d9585901d8ec909065ab9fd7277d0f3-s61a631b2c88ccdac109-c1               8539 
2021-10-16 19:00:15 CDT xn226_762948c11d02b5d908408c78af672573-se0c5aba69bbfc3fa109-c1               4153 
2021-10-16 19:00:47 CDT xn226_85bf5eeebf058c92172ae6d02b4b310a-s59bc766bb81d2f55109-c1              16885 
2021-10-16 19:01:04 CDT xn226_ade17eaa9eb094b85a32b00cdf3459b5-s04eab2bc2573b4c0109-c1              13426 
2021-10-16 19:01:17 CDT xn226_b2c5027cc0bdf91a86690f1c1ca438a4-seb848a71d941ad52109-c1                143 
2021-10-16 19:04:43 CDT xn226_c25d81395bdea36175b16753539c7de2-sd34e4efa97237b44109-c1              27000 
2021-10-16 19:10:16 CDT xn226_b9c83a2e48df78704ed78f876343de68-s4ab23f8744465d52109-c1               3609 
2021-10-16 19:10:17 CDT xn226_6a3dacaf5cb4776c5c22340f65ff2652-s7688879a2e08e93f109-c1               1615 
2021-10-16 19:10:50 CDT xn226_19d21a8eb65cda7c917c26b4ddc5c29e-s4e7edb305e5fde7b109-c1              42435 
2021-10-16 19:10:52 CDT xn226_0ce238725dfe61acde1131ee5ab13064-se96e55d6be119bc9109-c1               1615 
2021-10-16 19:16:07 CDT xn226_5ac19fb89b236e1362834d037b949f81-sf564e6c046430bec109-c1              20879 
2021-10-16 19:20:14 CDT xn226_cd07a9818b243980a8c1dcfdc7490716-s949c4a915d7105fe109-c1                143 
2021-10-16 19:20:16 CDT xn226_803f20053a5f8200e9ac4554adb2a314-s193e334c3fa234ea109-c1               3745 
2021-10-16 19:30:16 CDT xn226_62571ffc03e8a3a9200830c4c8d583df-s2a91bc655bedcd22109-c1               5717 
2021-10-16 19:30:58 CDT xn226_5c2090006364dc2d01279ce4d3b26266-sa8655c77619b8f34109-c1               4382 
2021-10-16 19:40:11 CDT xn226_9c80f5e3012bdb1cc7e138d28b0f6340-sedd8174e555a39bf109-c1                143 
2021-10-16 19:40:15 CDT xn226_c7caf6c967759946ac8b2df44ed440c7-sa01973888b93da58109-c1               4221 
2021-10-16 19:50:17 CDT xn226_6a6704b6ebd6f6537a583353a90ef2ad-sd2933830e6018134109-c1               7859 
2021-10-16 19:51:22 CDT xn226_6a2ef23073062fe91687580227a5409a-sbc8e6f34f7eedb02109-c1              17522 
2021-10-16 20:21:35 CDT xn226_923b180c8270dd2af63e3a3663be1d88-s061c2294ace76214109-c1               5105 
2021-10-16 20:37:02 CDT xn226_8947bcabdc7965383ae7b1608aa88b42-s6aef0ec7177939d2109-c1              23735 
2021-10-16 20:39:00 CDT xn226_64d67863fb4186f1379a612e08b02ea5-se0718bde146027e4109-c1               5309 
2021-10-16 20:46:30 CDT xn226_2603e3428d35fac046fea9996a7af5e2-sb1f0221dd1797eb7109-c1               7655 
2021-10-16 20:47:17 CDT xn226_1c987ff1bd02c883e1de990ed504c351-sf1f5c074e799d341109-c1              11717 
2021-10-16 20:47:18 CDT xn226_9e5ce2ad6b9c409bb93b1fa4cbdf7b4c-s2e3acac108bf2918109-c1               1615 
2021-10-16 20:47:55 CDT xn226_bd65eb808f7ac248a3efac5ec3ae5086-sd7368b1a76ed2186109-c1               7553 
2021-10-16 20:50:41 CDT xn226_3f5cd2cf8a6fe91c8bd5c5acc72159cb-s0537e0def192a75a109-c1               8811 
2021-10-16 20:52:41 CDT xn226_d9b136b356db752b8aca5728d840cfb3-s9cdecace7f750e84109-c1              10953 
2021-10-16 20:52:43 CDT xn226_ca4383da43798c1a028fadc6d81a3c5d-s35b6d45481909ca3109-c1               1615 
2021-10-16 20:53:28 CDT xn226_6bb5e94fb62d65d9404a4b1217d4d59b-s858d8b7dcb5666b9109-c1                143 
2021-10-16 20:54:28 CDT xn226_595ec31ed661933aa6c175d1ca78ba8f-sd034483aea98e0ab109-c1               5751 
2021-10-16 21:00:17 CDT xn226_75ed1283a46b0e17e7382e4c8a71c55e-s69f281f326a7ca1a109-c1                143 
2021-10-16 21:00:17 CDT xn226_fb9bff1644fa29e29ad5b068c12ed292-sdeab259e307351e2109-c1              10987 
2021-10-16 21:00:43 CDT xn226_d3129903a2a927ddfb8e47baf52fe872-s15c167dff0103276109-c1               9389 
2021-10-16 21:01:29 CDT xn226_5b2946a954dcbcd60a3d5792d0a1bc0c-s9ad23c7caed32173109-c1              12091 
2021-10-16 21:10:15 CDT xn226_b8e1c55792f4d6d2fdb385ebaf4043ac-sd0d60c4a0f1da481109-c1               3507 
2021-10-16 21:16:05 CDT xn226_cbd81e44ce8a3aca46de845130fe0d77-sbaf1d9385814d583109-c1              15602 
2021-10-16 21:20:12 CDT xn226_49fc5cce087ffe0487d30a554d3c8fdd-s63ea981ee97d6ddc109-c1                143 
2021-10-16 21:20:15 CDT xn226_9727933ee226cfc1b108d8a83014dc5f-s67b7034a98b4f188109-c1               3677 
2021-10-16 21:29:55 CDT xn226_9d190188a385b48341e7c064ff49ea3c-sce869c610439bb4d109-c1              17905 
2021-10-16 21:30:16 CDT xn226_d3577ba0861d79a0d426f5f369b314fb-sf89b781371b5e75c109-c1               7043 
2021-10-16 21:30:56 CDT xn226_db5d1e611c6650e70a84ed109afec304-scfa4e21c30a54e79109-c1               7825 
2021-10-16 21:31:20 CDT xn226_bf82b22f6d4362612fcb6e0dc482eb84-s5fe958d47827736c109-c1              14648 
2021-10-16 21:33:24 CDT xn226_cc43a58caedcf3e433015a1bce0e3494-s3cc0c257975aa8d5109-c1               1707 
2021-10-16 21:40:13 CDT xn226_f15a25e25f77ea917bcafb80d8b937f8-s365c9d46887cfa82109-c1                143 
2021-10-16 21:40:17 CDT xn226_4e50cd6599c4ae3bde4e4e7953a297f9-sc354e6ce6f2076ef109-c1               4153 
2021-10-16 21:40:49 CDT xn226_b3485b03b9ed29e77c98b85f325207ec-se2ddeb8ef4f5bfac109-c1              22563 
2021-10-16 21:47:04 CDT xn226_f9c89dbcd5e307cbd34ad0648eec5b4e-sa47f158cad245960109-c1              12939 
2021-10-16 22:00:59 CDT xn226_f33b12945c2445f6449ee19d5a8b8b7f-saa72e30a93496591109-c1               4654 
2021-10-16 22:01:11 CDT xn226_69857e4a23cd3bb39af8b034d0906918-s6ccb897b4c1a8bae109-c1                143 
2021-10-16 22:22:02 CDT xn226_5c196b7c742d913ad66e10f7b0669e37-sb81f7cee7372d75f109-c1               8573 
2021-10-16 22:22:41 CDT xn226_106a9839fa9c618b0d921b91ab23ff95-s81cdac7fd7c7b52a109-c1               2385 
2021-10-17 00:24:32 CDT xn226_02f7c5ea9c7de06f5f806d4451303e49-s023a4543362c92cf109-c1               7850 

==> epoch 227 started

2021-10-17 01:46:28 CDT xn227_5fd53f28bf2fbdd494a36e5e7e924446-sa5bf5dcc38068ab1109-c1               7383 
2021-10-17 04:06:15 CDT xn227_9a3942c4f55e34bedf91f8bdb6a77d39-s7c53c55a43da67bd109-c1              11275 
2021-10-17 05:16:37 CDT xn227_942c35b134d8efb0526b6317fc9237c3-s39c61714723f56ca109-c1               5513 
2021-10-17 06:35:05 CDT xn227_04913a79e8866def2b3966b775637ea4-s62ad9c8ae2840445109-c1               6091 
2021-10-17 06:35:45 CDT xn227_12dfaf589d477678ba2cec86a2d9c121-s97aa82b486fa3bf6109-c1               4085 
2021-10-17 08:47:03 CDT xn227_c3b0547f66a4949a4600bc051a7ed0c0-s3eccecf9a0678fef109-c1               4765 
2021-10-17 08:47:05 CDT xn227_e84ff59a7dbf1845dfea1c378f03ea18-sfcf3d98b72b015af109-c1               1615 
2021-10-17 09:08:02 CDT xn227_e9daf01f6890037186f78f458cf92c9a-sa8002209861ce7ef109-c1              26403 
2021-10-17 09:08:47 CDT xn227_8d1351772fcb9e62f0be05c764c3cfa3-s6c5bacfd70b6a567109-c1               6397 
2021-10-17 09:10:40 CDT xn227_b889c2d9aa44b2a0d169aaef184d5c28-s92166233800fa73c109-c1               4867 
2021-10-17 09:44:49 CDT xn227_bee896c9c1aa743773083a0bb1fe83db-s2e076f0c5653e61a109-c1              23710 
2021-10-17 09:50:42 CDT xn227_31df76b77c098a2dbcc2a3c886540189-s2f66ab9aeff18213109-c1              12857 
2021-10-17 10:00:43 CDT xn227_460f868871a2df5fd1468484d12c1b61-s0b820c15be466361109-c1               9627 
2021-10-17 10:10:41 CDT xn227_1ffd4137c0b29ceaf1dceffbf54b0c9e-sb027d513449f2def109-c1              10375 
2021-10-17 10:20:49 CDT xn227_e8cd5c06be8f3465768effc174304910-sdb3ec725d754c035109-c1              28708 
2021-10-17 10:30:47 CDT xn227_379ee488f6e4caee7a7e69b2d93429a1-s60c2d8099adcd9c6109-c1              18789 
2021-10-17 10:40:44 CDT xn227_13510110c09da0ccab8b193c33d70985-s235b7c9576336c21109-c1              22207 
2021-10-17 10:50:54 CDT xn227_c67d6448083d54c9968f20c20a883e4a-sf78f969a5d2e9160109-c1              13859 
2021-10-17 11:00:54 CDT xn227_683e463c091c60b62b5006fea9196a48-sdc86dd3abb0ffebd109-c1               7145 
2021-10-17 11:10:53 CDT xn227_75eaca41dc4986eacb27df03bf695502-sf966d274446f0a7f109-c1               7791 
2021-10-17 11:21:19 CDT xn227_cceafb20ad50f5841e1bebb1057d8c98-s09e101a9a9a79525109-c1              21498 

==> epoch 228 started

2021-10-17 11:31:13 CDT xn228_1472bf4f751723aa37da4b2141770994-se548b629d73bfcfe109-c1              16069 
2021-10-17 11:31:15 CDT xn228_c2a4c82a65282872fdb5865c30e39d8e-s560e52b18440b72a109-c1               1615 
2021-10-17 11:40:52 CDT xn228_cfc71fab53bb4132a336acfa50ec1ad7-s4d1a0d243fb0778a109-c1               8607 
2021-10-17 11:51:24 CDT xn228_bb5489a550e1d10552d76aa0fce9152d-saa0feeb38ac21d42109-c1              19460 
2021-10-17 12:00:48 CDT xn228_f8f11c74d5e00287bf9e5c89b5757245-s103b7bc12871b28c109-c1              19095 
2021-10-17 12:09:02 CDT xn228_24fc7a4d9e36a36f7dfcc55f9f6084eb-sb9567991e311a283109-c1                143 
2021-10-17 12:10:45 CDT xn228_c74945e1da0617e5526a1b7169c43e4c-secd526740bd0818d109-c1              14251 
2021-10-17 12:11:49 CDT xn228_466e66f7081467a1ab9b0d24c38c8626-s13d7339fdfee476f109-c1              13945 

==> stray index for epoch 226 written while we're already in 228
2021-10-17 12:11:53 CDT xn226_4e79092c1fbd7898062cd040b5300f47-s4bd4ee9a844b2be7109-c1               5207 

2021-10-17 12:12:13 CDT xn228_a80bfadde61944d523fd7ecedadb593d-s21f57db626ab1a99109-c1               8437 
2021-10-17 12:16:01 CDT xn228_2ed386f3778dac365e0ff20d8ae3198c-sf0f49290a1d3c820109-c1               3983 
2021-10-17 12:20:17 CDT xn228_e1327593d59c30a3670673b0838d2ea4-s218f33ea2ed8ee3f109-c1                143 
2021-10-17 12:20:46 CDT xn228_2504089933a62bd6be18dcf58660d21e-s8b418be0b0def3f8109-c1              57320 
2021-10-17 12:20:57 CDT xn228_b7d4c6d052faa1a9a8373d6c5930efdd-s434ab1b8dad8ed77109-c1              38942 
2021-10-17 12:30:19 CDT xn228_4d2bbe4ceb7d5f2144e7f8f3e44324de-s16b4ede04735abec109-c1              11225 
2021-10-17 12:30:51 CDT xn228_953f3c5b34815714e56ac6c7620377a3-sef2816832ed9384a109-c1              16579 

2021-10-17 12:31:07 CDT xn228_6986579fae3ab8e5f15d62c90bb0589b-s9cb887b1c67285d1109-c1              10407 
2021-10-17 12:40:12 CDT xn228_fb1358b29dec791da4e9de6dd93b58b4-s4c29a91c08d74f11109-c1                143 
2021-10-17 12:40:13 CDT xn228_ffb842debb7672854751aba4d296a703-sf0dd0d18e8ed42a1109-c1               1615 
2021-10-17 12:40:16 CDT xn228_16a639441128dd292eac1f7a4a810667-s78911df587fdd799109-c1               2215 
2021-10-17 12:40:18 CDT xn228_1e8e2683dc3c98f3de731f25fbb8cbe7-sfa7260af949fc40e109-c1               1615 
2021-10-17 12:40:51 CDT xn228_f38f4d672f5954a7d8e35b9a0385f537-s2278ca4cf6056b2d109-c1              23617 
2021-10-17 12:46:21 CDT xn228_5ba39f60ed66967182e0455e4019d224-s9fa9f5ccdbb15245109-c1               8215 
2021-10-17 12:50:17 CDT xn228_09120d1cd58371f40e3a6c946c463395-s17cadbf0fdd8fa16109-c1               3881 
2021-10-17 12:50:49 CDT xn228_3ce1574c52639b4e8c5001be4d6bb0be-sa561cc0456eabb70109-c1              20081 
2021-10-17 13:00:16 CDT xn228_16623447d0d177e633fc49c99404af23-s60e6f4f18a7c1a28109-c1               5513 
2021-10-17 13:01:10 CDT xn228_e7155b83d71a98d9a699e5972670fc4a-sce44ba8ddbbb8fa8109-c1              14453 
2021-10-17 13:01:23 CDT xn228_c7b87245a6c44b8bb013c8557fa03e2e-s4ebb3f9a485269b8109-c1                143 
2021-10-17 13:10:18 CDT xn228_abb60c938d6a700229ef10bb501c6277-s9cab81b25ec4a38f109-c1               3813 
2021-10-17 13:16:31 CDT xn228_9dd6f67f0482391ade078a29ab0cf85d-se2b4c05d2888b995109-c1              21414 
2021-10-17 13:20:15 CDT xn228_c25a12a111a77ba5a46c2a49b19d6c75-s09eb6fb193a8ddef109-c1                143 
2021-10-17 13:20:17 CDT xn228_78f6b44a10f2b89b66596d100e08bc7b-s4ff9a80e5c9e4937109-c1               4697 
2021-10-17 13:30:16 CDT xn228_e01c49c22c49bf9dc50048f6d9323bbd-s4424fc98ed429b8c109-c1               3983 
2021-10-17 13:31:05 CDT xn228_65d86ebc68d47de172b1232fb6e2032f-sd2625b2094272224109-c1              14165 
2021-10-17 13:40:12 CDT xn228_6a1030edfe343dacbf19f321a0aa6832-s7df575a67de46d9b109-c1                143 
2021-10-17 13:40:13 CDT xn228_1b95d13f2ead0998cf6bec0e4dead8bc-sd96b68e752566d5c109-c1               1615 
2021-10-17 13:40:16 CDT xn228_9534a92fc9c9e136c62a0267b9081c30-sd665c20cbdf1af6e109-c1               4459 
2021-10-17 13:40:17 CDT xn228_1deb1b13558fe7b208f6bfe164354a46-s6a6cd90477f9f156109-c1               1615 
2021-10-17 13:43:56 CDT xn228_47521fc85ed797f1c6c3beda0a30815d-s3c787433b520c692109-c1              24016 
2021-10-17 13:46:09 CDT xn228_99a9c665cd5dfd473154c5e8b79094e3-s4ae0bba8a2d03d93109-c1               6014 
2021-10-17 13:50:16 CDT xn228_51a2d8353fe349d7ebe234bef751c107-s6162b4079073acfa109-c1               5547 
2021-10-17 14:00:14 CDT xn228_63218a649baf1e93e6ac2df1cd7f5ca8-s764f099f01bc6ff5109-c1                143 
2021-10-17 14:00:16 CDT xn228_1f080d74d4f1689bcbdbd085227d19f0-s2730f9ba7c554fd1109-c1               4323 
2021-10-17 14:01:34 CDT xn228_2c2610bd6c4725f35c846fdc2246fa0e-s028d810191be4502109-c1              17869 
2021-10-17 14:10:16 CDT xn228_9988bbb04336b7c56352d158591fd59e-s815ae20fa49d780b109-c1               4153 
2021-10-17 14:12:11 CDT xn228_4a72c3da172e2ce1e6fcaf2d48bae158-s2c7f0081ae7e0eb3109-c1              13009 
2021-10-17 14:13:01 CDT xn228_73ab1bde046b9e0146f305cdde137a4f-sdbe42f759dad1a7f109-c1              12619 
2021-10-17 14:13:02 CDT xn228_aea85d85873525550271fa29c85955c6-se5fc8158cb426364109-c1               1615 
2021-10-17 14:16:16 CDT xn228_2d20fc1b0628b1ba3fd42b2285782188-s1cbe13c0f9dd0d5d109-c1              19594 
2021-10-17 14:20:13 CDT xn228_989cc185c4eae27eb937e242ada62419-sc85a7b058f37a238109-c1                143 
2021-10-17 14:20:16 CDT xn228_0f9df5a0827af8471cfde4fb0c9d753e-sd3136cc42869f533109-c1               7213 
2021-10-17 14:20:55 CDT xn228_05954be3aa96cbff0170baf5c9b75ce2-s081168fdf5f26896109-c1              20956 
2021-10-17 14:30:17 CDT xn228_f6ea44fb8c2fe8a1f94714be6b23c2cc-se936d522a4111297109-c1               3745 
2021-10-17 14:30:48 CDT xn228_ca5ee335b0c2189d20ec714c54e5fc17-s20b897c7a32678bf109-c1              28123 
2021-10-17 14:30:51 CDT xn228_ec51e60b80367aec6acf2ea3641ec9db-sab8a1a73290095cd109-c1               1615 
2021-10-17 14:31:06 CDT xn228_950d2b1af049f22c56d21a13b58ad1fd-s5cffc47ec8804359109-c1               5955 
2021-10-17 14:40:17 CDT xn228_44067c448188fbb9ab59a837cdf025d0-s6793cefbcdcc78ba109-c1                143 
2021-10-17 14:40:17 CDT xn228_dd081844b68128b131d94789f787ebe7-sbec5e2c1ac169578109-c1               4493 
2021-10-17 14:40:56 CDT xn228_1129ebb68bf81c03f9cb0c9f25a2994b-s943a1d032535c573109-c1              24322 
2021-10-17 14:46:16 CDT xn228_078c08a4273a64c53a0914c335b5e9c2-s7268269fd1a87b8b109-c1              15287 
2021-10-17 14:50:17 CDT xn228_282d8ab711ee0659dfeb0d320cf7bb5d-s4bc99d499d150ed0109-c1               3575 
2021-10-17 14:50:46 CDT xn228_263e22c546b4facee277e08a45ad412f-s91d6984c4ac4590c109-c1              10477 
2021-10-17 15:00:19 CDT xn228_4459d12bcc6ffa5348f3f6527de9b063-s151c36cac3928c5a109-c1              12245 
2021-10-17 15:00:46 CDT xn228_a3588377099dc8da913fc924130f6f16-s3f19684640038290109-c1               6533 
2021-10-17 15:01:13 CDT xn228_fbc5c7a98575bc69b37507481335a8b9-scd3fabd3c4d86b58109-c1              13009 
2021-10-17 15:01:29 CDT xn228_ba8c2998cfebc9940f3a6fc75272bca0-se88a7c5ff1c774a2109-c1                143 
2021-10-17 15:10:16 CDT xn228_5d8a69332cf25923b8dbecb09ef1a05e-s9554ef3139b7e44e109-c1               3813 
2021-10-17 15:10:50 CDT xn228_f4a472f12f10f03339fc75dd841f2c9c-sb3c6d9ff7d6638be109-c1              10774 
2021-10-17 15:16:18 CDT xn228_9fd365bcfb06cd5e95f47c19e44f47b9-sb7054440f737073a109-c1              13535 
2021-10-17 15:20:17 CDT xn228_f78bb478a244626132594b24c43e6761-sa5f2dba1ca9ce213109-c1               4969 
2021-10-17 15:20:24 CDT xn228_1a3bfd34efa395220bfc4bbcaba925a7-s07f73605e1601438109-c1                143 
2021-10-17 15:20:25 CDT xn228_4f2dd6147b481ee157e88022915d8122-sc4e73be6b6922635109-c1               1615 
2021-10-17 15:20:46 CDT xn228_f86a6679f469f76d076c54ecb0086ee8-s21b183b2da2b0f7b109-c1               7383 
2021-10-17 15:30:17 CDT xn228_3f7d5843ed30814ecfb7b707eb060b69-s5d67b3470be7af9f109-c1               3541 
2021-10-17 15:30:56 CDT xn228_41690ffec04e4357fc66926a2e98a8c0-s3d80d26f9fec8c2c109-c1              16375 
2021-10-17 15:31:39 CDT xn228_acd0a2b5166cd321f7bba6400af5fd63-sc653b6ce1b4ac13f109-c1              23110 
2021-10-17 15:40:15 CDT xn228_0833d9e08a6785e4809e82e382424f5e-sae46d6547b057575109-c1                143 
2021-10-17 15:40:16 CDT xn228_1c8243011acf7babe8e98d15b3b14f3c-sfe163982a594ff5a109-c1               3507 
2021-10-17 15:40:50 CDT xn228_59bcd5e048598185d5d48cbfb5d2580a-s93625ab4636f8e44109-c1              10749 
2021-10-17 15:46:28 CDT xn228_86a37076c9dea8d83fc91c3ad954d6fd-s4a2734c5b45a882d109-c1              28847 
2021-10-17 15:50:17 CDT xn228_9d7b9d664e1e61b96ecac8025e4fb6d1-s5cf754d8df756740109-c1               3541 
2021-10-17 15:50:51 CDT xn228_c879a655cd385eef609d3ee6da4926fd-s0a684222901ae969109-c1              11343 
2021-10-17 16:00:13 CDT xn228_27ca599f13fabb7fb0601e4ce7717882-s595ef52ba20dc0ce109-c1                143 
2021-10-17 16:00:16 CDT xn228_668359012555ee892c0e6df933c05242-s156c7fe09ee79671109-c1               3745 
2021-10-17 16:00:47 CDT xn228_6c1bf54cf926809c4f3abe1d31e683a2-s2942e03349f037d1109-c1              14217 
2021-10-17 16:01:37 CDT xn228_18c18936760cc400bbec08f0b149bdcd-s6dd158ce195c39d3109-c1              24783 
2021-10-17 16:02:52 CDT xn228_dd1335068ae03ae24c8f0687055c5daa-s97d2a64ab18f8c15109-c1               1615 
2021-10-17 16:10:17 CDT xn228_5db1793ebcc17eaefa232987fe648512-s2bab902d4e968bd1109-c1               4357 
2021-10-17 16:10:49 CDT xn228_422f264526aa1ae187788ba0cabee908-sc22158482cf454a9109-c1               9967 
2021-10-17 16:16:40 CDT xn228_580f7f9e0df16772fd85e59e2a34f2dd-sdedbbcf8123190f1109-c1              12740 
2021-10-17 16:20:15 CDT xn228_dc7238f188aa7da902e08b21bb0e0a94-s038df1cb160c8059109-c1                143 
2021-10-17 16:20:20 CDT xn228_ac9132b05618f4a31f6d873964581a2f-sc87ee1f474fe7721109-c1               6771 
2021-10-17 16:20:47 CDT xn228_e4610c596fb95f3b6910e1a9ad7de2f2-sf71615819b6c1c01109-c1               6635 
2021-10-17 16:30:23 CDT xn228_f482ad2d6d4ef01db03db50283fb2b4e-s9cfba3ee205a40e7109-c1               3609 
2021-10-17 16:30:43 CDT xn228_031430cbc42d6c6a0296f7f35eb3645e-s02800a14769c0aa1109-c1              17352 
2021-10-17 16:31:09 CDT xn228_182346558aa713405319a43b0e6d6432-s660c392f66ed63aa109-c1               6150 
2021-10-17 16:40:17 CDT xn228_caed636ce5aa87102cdb370cd1e4d6ef-s25bd35bbdb5fc01e109-c1                143 
2021-10-17 16:40:22 CDT xn228_6b013e232b18255a097093752c7ed63a-s37322cbd5dd1f5ca109-c1               5479 
2021-10-17 16:40:40 CDT xn228_9554fc4f9f30ce171cc08a73e9a997a5-sa2b484e3eb48d451109-c1              22495 
2021-10-17 16:47:05 CDT xn228_76ff19c4f4a67d9190325abe983684cd-sd5d6de325fbacece109-c1              25946 
2021-10-17 16:50:19 CDT xn228_be0bfd08ce1b396913eefaf6e44a5d57-s35cf8800e408230c109-c1               3473 
2021-10-17 16:52:15 CDT xn228_2d65f571c3b250eafc1452df010375b0-s6ee6776c6dc9e843109-c1                617 
2021-10-17 16:53:02 CDT xn228_0802e9995d97812e8e0606b0031d3997-s0d70e0c665f76404109-c1               1615 
2021-10-17 16:54:00 CDT xn228_1ca43f61d2efa13358ab95435f22efe8-s881a3f7cbb756efd109-c1              12456 
2021-10-17 16:54:01 CDT xn228_b64d06fc3fbfb226aa37587acb2169dd-s4b0eac1b038d6083109-c1               1615 
2021-10-17 17:00:20 CDT xn228_2aac2221bd987a08d7a316a0b21b30fc-s2c8945930eb773a5109-c1               5955 
2021-10-17 17:01:07 CDT xn228_c46312b132b9c66850ac9c464e8a00c2-sbde5a9975bfefa28109-c1              11615 
2021-10-17 17:01:10 CDT xn228_4f9c94320fdb44c48b63e76ad4fc84f5-s26a61fdec473fc28109-c1              15813 
2021-10-17 17:01:23 CDT xn228_3b9eefae970b9d07b2847dd2f57bfea7-s7d3eadbd6773ddb3109-c1                143 
2021-10-17 17:10:19 CDT xn228_8cef543ddc1c2499b4970a23e39c354b-s2e013ab3c112331d109-c1               3643 
2021-10-17 17:12:41 CDT xn228_1ef8ac8b9e0177075284ed413d891cad-s6d2be7757a7a442f109-c1              74828 
2021-10-17 17:16:16 CDT xn228_b7b0d41eb151180efd269ede0ba56b19-s1c346d67e6a60270109-c1               9142 

==> compaction of epoch 226 deliberately ignored xn226_4e79092c1fbd7898062cd040b5300f47 because it came in late.
2021-10-17 17:16:40 CDT xs226_857fb0696782b0401ec62373d0a5c53b-s906b2b3c43bf7d21-c1         842941 

2021-10-17 17:20:15 CDT xn228_64fe6ab36ca04bd379f2212a1ca919f8-s8e77eb325241463c109-c1                143 
2021-10-17 17:20:19 CDT xn228_546b89b98c70eaa40e118ca25eaa469b-sfabe80e05315c569109-c1               4867 
2021-10-17 17:21:47 CDT xn228_39d2fbd1f784b505791bc7d098acc8d0-s3c6c302d1db5db72109-c1              48249 
2021-10-17 17:30:16 CDT xn228_05b3a806b6b253cb5a0f85b9fdad3543-se9e02de80fa7ac81109-c1               4527 
2021-10-17 17:33:17 CDT xn228_0cf6ffbd942d77515466e7239b091ce1-s9f93473e631167db109-c1              30337 
2021-10-17 17:35:57 CDT xn228_625a834e9fcfb31f044d16047b4dc1ad-sbb7f2f3a1cc7c8fb109-c1             655104 

jkowalski added a commit to jkowalski/kopia that referenced this issue Oct 18, 2021
Reduces likelihood of kopia#1402 until we fix the root cause.
jkowalski added a commit that referenced this issue Oct 18, 2021
Reduces likelihood of #1402 until we fix the root cause.
jkowalski added a commit to jkowalski/kopia that referenced this issue Oct 19, 2021
@jkowalski
Copy link
Contributor

Digging deeper, the client that caused this bad write slept for a very long time:

2021-10-17 02:00:16.774465 use-new-committed-index xn226_fb9bff1644fa29e29ad5b068c12ed292-sdeab259e307351e2109-c1
2021-10-17 02:10:14.962895 use-new-committed-index xn226_b8e1c55792f4d6d2fdb385ebaf4043ac-sd0d60c4a0f1da481109-c1
2021-10-17 02:20:15.057839 use-new-committed-index xn226_9727933ee226cfc1b108d8a83014dc5f-s67b7034a98b4f188109-c1
2021-10-17 02:30:16.225384 use-new-committed-index xn226_d3577ba0861d79a0d426f5f369b314fb-sf89b781371b5e75c109-c1
2021-10-17 02:40:17.042279 use-new-committed-index xn226_4e50cd6599c4ae3bde4e4e7953a297f9-sc354e6ce6f2076ef109-c1

==> slept for >14h

2021-10-17 17:11:53.154113 use-new-committed-index xn226_4e79092c1fbd7898062cd040b5300f47-s4bd4ee9a844b2be7109-c1
2021-10-17 17:12:13.195954 use-new-committed-index xn228_a80bfadde61944d523fd7ecedadb593d-s21f57db626ab1a99109-c1
2021-10-17 17:20:45.894912 use-new-committed-index xn228_2504089933a62bd6be18dcf58660d21e-s8b418be0b0def3f8109-c1
2021-10-17 17:30:18.688951 use-new-committed-index xn228_4d2bbe4ceb7d5f2144e7f8f3e44324de-s16b4ede04735abec109-c1

@jkowalski
Copy link
Contributor

Timeline of that one snapshot session:

  • started at 04:44:33 local time
  • slept 04:44:33.306409
  • woke up at 06:11:32.860909
  • slept at 06:12:03.592298
  • woke up at 12:45:03.478499
  • slept at 12:45:34.200079
  • woke up at 17:11:52.946168

17:11:52.946168 wrote-pack q580acdabce2b47ca63ba648f1e951be7-s4bd4ee9a844b2be7109 1941010
17:11:53.154113 use-new-committed-index xn226_4e79092c1fbd7898062cd040b5300f47-s4bd4ee9a844b2be7109-c1

jkowalski added a commit to jkowalski/kopia that referenced this issue Oct 22, 2021
The dual time measurement is described in
https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md

The fix is to discard hidden monotonic time component of time.Time
by converting to unix time and back.

Reviewed usage of clock.Now() and replaced with clock.WallClockTime()
or time.Now() as appropriate.

The problem in kopia#1402 was that passage of time was measured using
the monotonic time and not wall clock time. When the computer goes
to sleep, monotonic time is still monotonic while wall clock time makes
a leap when the computer wakes up. This is the behavior that
epoch manager (and most other compontents in Kopia) rely upon.

Fixes kopia#1402
jkowalski added a commit to jkowalski/kopia that referenced this issue Oct 22, 2021
The dual time measurement is described in
https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md

The fix is to discard hidden monotonic time component of time.Time
by converting to unix time and back.

Reviewed usage of clock.Now() and replaced with timetrack.StartTimer()
when measuring time.

The problem in kopia#1402 was that passage of time was measured using
the monotonic time and not wall clock time. When the computer goes
to sleep, monotonic time is still monotonic while wall clock time makes
a leap when the computer wakes up. This is the behavior that
epoch manager (and most other compontents in Kopia) rely upon.

Fixes kopia#1402
julio-lopez pushed a commit that referenced this issue Oct 22, 2021
The dual time measurement is described in
https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md

The fix is to discard hidden monotonic time component of time.Time
by converting to unix time and back.

Reviewed usage of clock.Now() and replaced with timetrack.StartTimer()
when measuring time.

The problem in #1402 was that passage of time was measured using
the monotonic time and not wall clock time. When the computer goes
to sleep, monotonic time is still monotonic while wall clock time makes
a leap when the computer wakes up. This is the behavior that
epoch manager (and most other compontents in Kopia) rely upon.

Fixes #1402

Co-authored-by: Julio Lopez <julio+gh@kasten.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants