
Checkpoint replaying very very slow #5569

Closed
mxmorin opened this issue Aug 4, 2022 · 6 comments

Comments

@mxmorin

mxmorin commented Aug 4, 2022

Versions:

thanos, version 0.27.0 (branch: HEAD, revision: f68bb366d18557a5347e6b38013f363ad5be05a4)
  build user:       root@708554d4fa46
  build date:       20220705-14:57:42
  go version:       go1.17.11
  platform:         linux/amd64

OS: RHEL8

Architecture:
Hashring (2 routers and 3 receivers with replica=3)

Command line:
/usr/bin/thanos receive --http-address 0.0.0.0:19904 --grpc-address 0.0.0.0:19903 --remote-write.address 0.0.0.0:19291 --label=thanos_replica="p1thanosprod02 " --tsdb.path=/projet/data/thanos/receive --tsdb.retention=1d --objstore.config-file=/etc/thanos/objstore.yaml --receive.default-tenant-id=prod

Object Storage Provider: S3
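
For reference, the file passed to --objstore.config-file follows the usual Thanos S3 client layout; a minimal sketch of what /etc/thanos/objstore.yaml would look like here (bucket, endpoint and credentials are placeholders, not the real values):

type: S3
config:
  bucket: "<bucket-name>"
  endpoint: "<s3-endpoint>"
  access_key: "<access-key>"
  secret_key: "<secret-key>"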

What happened:
After upgrading to 0.27, each restart of the receiver takes more than 5 minutes per tenant.

What you expected to happen:
Fast WAL replay.

Full logs to relevant components:

Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.874729523Z caller=receive.go:481 component=receive msg="hashring has changed; server is not ready to receive web requests"
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.874738059Z caller=receive.go:563 component=receive msg="updating storage"
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.874786265Z caller=multitsdb.go:311 component=receive component=multi-tsdb tenant=prod msg="opening TSDB"
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.87503383Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659520800000 maxt=1659528000000 ulid=01G9HWN8RCCDYYYRX0E998QM6J
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.875081097Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659528000000 maxt=1659535200000 ulid=01G9J3GZZF4KCZXX7H04AFY8A9
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.875107989Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659535200000 maxt=1659542400000 ulid=01G9JACQH1E7WDXKYGMA6QRA9E
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.875132359Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659542400000 maxt=1659549600000 ulid=01G9JH8EFZ47ST4AEQVASCDMMY
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.875152892Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659549600000 maxt=1659556800000 ulid=01G9JR45Q4S79TDXGXQE443GMW
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.875172316Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659556800000 maxt=1659564000000 ulid=01G9JYZWYS6JPS062RAMVG7S95
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.875191784Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659564000000 maxt=1659571200000 ulid=01G9K5VM6WA3THBP7N0T2WWDAN
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.875210939Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659571200000 maxt=1659578400000 ulid=01G9KCQAYK2DTMVHDRPWG2PKHH
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.875232684Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659578400000 maxt=1659585600000 ulid=01G9KKK2854PC83E81W22GMPR1
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.875251863Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659585600000 maxt=1659592800000 ulid=01G9KTESFFPTBXZTABEVBE41B1
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.875270352Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659592800000 maxt=1659600000000 ulid=01G9M1AGXV7V25TKD9ZA7GPR3G
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.875289027Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659600000000 maxt=1659607200000 ulid=01G9M86862RR25NW64SR4YNV1E
Aug 04 13:14:38 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:38.875311507Z caller=repair.go:57 component=receive component=multi-tsdb tenant=prod msg="Found healthy block" mint=1659607200000 maxt=1659611617558 ulid=01G9M8Z6TTT1CMBYDVNFYXDSPP
Aug 04 13:14:39 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:39.110533063Z caller=head.go:493 component=receive component=multi-tsdb tenant=prod msg="Replaying on-disk memory mappable chunks if any"
Aug 04 13:14:39 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:39.1114571Z caller=head.go:527 component=receive component=multi-tsdb tenant=prod msg="On-disk memory mappable chunks replay completed" duration=886.864µs
Aug 04 13:14:39 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:14:39.111484133Z caller=head.go:533 component=receive component=multi-tsdb tenant=prod msg="Replaying WAL, this may take a while"
Aug 04 13:19:45 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:19:45.058705918Z caller=head.go:569 component=receive component=multi-tsdb tenant=prod msg="WAL checkpoint loaded"
Aug 04 13:19:47 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:19:47.44320823Z caller=head.go:604 component=receive component=multi-tsdb tenant=prod msg="WAL segment loaded" segment=30317 maxSegment=30323
Aug 04 13:19:48 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:19:48.963708983Z caller=head.go:604 component=receive component=multi-tsdb tenant=prod msg="WAL segment loaded" segment=30318 maxSegment=30323
Aug 04 13:19:50 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:19:50.558080352Z caller=head.go:604 component=receive component=multi-tsdb tenant=prod msg="WAL segment loaded" segment=30319 maxSegment=30323
Aug 04 13:19:52 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:19:52.336362645Z caller=head.go:604 component=receive component=multi-tsdb tenant=prod msg="WAL segment loaded" segment=30320 maxSegment=30323
Aug 04 13:19:52 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:19:52.385878218Z caller=head.go:604 component=receive component=multi-tsdb tenant=prod msg="WAL segment loaded" segment=30321 maxSegment=30323
Aug 04 13:19:52 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:19:52.943759516Z caller=head.go:604 component=receive component=multi-tsdb tenant=prod msg="WAL segment loaded" segment=30322 maxSegment=30323
Aug 04 13:19:52 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:19:52.943949553Z caller=head.go:604 component=receive component=multi-tsdb tenant=prod msg="WAL segment loaded" segment=30323 maxSegment=30323
Aug 04 13:19:52 p1thanosprod02 thanos[3417605]: level=info ts=2022-08-04T11:19:52.943968195Z caller=head.go:610 component=receive component=multi-tsdb tenant=prod msg="WAL replay completed" checkpoint_replay_duration=5m5.947265781s wal_replay_duration=7.88520992s total_replay_duration=5m13.833397584s

This is due to the checkpoint replay (see checkpoint_replay_duration=5m5.947265781s).

@BoringCat

I have the same issue on v0.26.0 and v0.27.0 (and maybe v0.24.0? I forget, and I have lost the logs).

I have a large dataset averaging 48,130 samples per second, and I need to keep it for 7 days for fast queries.
With 5 receivers and replica=3, there are about 17,465,271,337.6 samples stored in TSDB. Every time I need to run kubectl rollout restart (to change pod resource limits or to upgrade the version), it takes about 10 minutes on "Replaying WAL".

@matej-g
Collaborator

matej-g commented Aug 5, 2022

Yeah, this is increasingly becoming an issue for us as well; replays on some replicas for some tenants can take ten(s) of minutes.

I guess it has been like this for at least a few versions (so my assumption is that it is not related to any recent change), if it is even related to Thanos changes at all and not simply to users handling more data. Either way, this requires more investigation and ideas on how to alleviate it.

I was also looking at the experimental snapshot-on-shutdown feature in TSDB (https://ganeshvernekar.com/blog/prometheus-tsdb-snapshot-on-shutdown/) and trying it in receive, but I'd need to take a closer look at that feature first.

@fpetkovski
Contributor

Memory snapshotting seems to be a TSDB option. Maybe it's just a matter of setting this flag when starting receiver TSDBs?
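
For context, the option in question is EnableMemorySnapshotOnShutdown in the upstream tsdb.Options. A minimal sketch of setting it, assuming nothing beyond the upstream Prometheus tsdb package (the plumbing into receive itself is not shown and would still need a flag):

package main

import (
	"fmt"

	"github.com/prometheus/prometheus/tsdb"
)

func main() {
	// Start from the upstream defaults, as the receiver does for its per-tenant TSDBs.
	opts := tsdb.DefaultOptions()

	// Experimental: write an in-memory snapshot of the head on shutdown so the
	// next start-up can restore series from it instead of replaying the whole
	// WAL checkpoint.
	opts.EnableMemorySnapshotOnShutdown = true

	fmt.Printf("memory snapshot on shutdown: %v\n", opts.EnableMemorySnapshotOnShutdown)
}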

@yeya24
Contributor

yeya24 commented Aug 20, 2022

prometheus/prometheus#10973 has already been merged to the Prometheus main branch. I think this is also worth trying, to see if it helps.

@matej-g
Collaborator

matej-g commented Nov 1, 2022

I have now added the possibility to use snapshot-on-shutdown in #5836, and the change @yeya24 mentioned has already landed in Thanos, so we should see improvements. Please try either or both solutions and let us know if you see improvements!
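
As a sketch of how to try it with the command line from the original report (assuming the flag from #5836 is exposed as --tsdb.memory-snapshot-on-shutdown; check thanos receive --help on your build to confirm the exact name):

/usr/bin/thanos receive \
  --http-address 0.0.0.0:19904 \
  --grpc-address 0.0.0.0:19903 \
  --remote-write.address 0.0.0.0:19291 \
  --tsdb.path=/projet/data/thanos/receive \
  --tsdb.retention=1d \
  --tsdb.memory-snapshot-on-shutdown \
  --objstore.config-file=/etc/thanos/objstore.yaml \
  --receive.default-tenant-id=prod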

@yeya24
Contributor

yeya24 commented Jan 27, 2023

I will close this issue since the improvements from upstream Prometheus are already included.

@yeya24 yeya24 closed this as completed Jan 27, 2023