
Ingester Panic on shutdown when using boltdb-shipper #2609

Closed
primeroz opened this issue Sep 9, 2020 · 1 comment · Fixed by #2613

Comments

primeroz (Contributor) commented Sep 9, 2020

Describe the bug
Tested with grafana/loki:1.6.1 (noticed with this version; I have not tried older ones) and also with grafana/loki:k32-164f5cd, as requested by ewelch on Slack.

Whenever I do a rolling restart of my ingesters, I can see all of them panic with:

ingester-1 ingester level=info ts=2020-09-09T15:15:05.789748298Z caller=signals.go:55 msg="=== received SIGINT/SIGTERM ===\n*** exiting"
ingester-1 ingester level=info ts=2020-09-09T15:15:05.790101645Z caller=lifecycler.go:421 msg="lifecycler loop() exited gracefully" ring=ingester
ingester-1 ingester level=info ts=2020-09-09T15:15:05.790134341Z caller=lifecycler.go:710 msg="changing instance state from" old_state=ACTIVE new_state=LEAVING ring=ingester
ingester-1 ingester level=info ts=2020-09-09T15:15:05.790400011Z caller=module_service.go:90 msg="module stopped" module=runtime-config
ingester-1 ingester level=info ts=2020-09-09T15:15:05.800942679Z caller=lifecycler.go:749 msg="transfers are disabled"
ingester-1 ingester level=info ts=2020-09-09T15:15:36.344199809Z caller=lifecycler.go:473 msg="instance removed from the KV store" ring=ingester
ingester-1 ingester level=info ts=2020-09-09T15:15:36.344284404Z caller=module_service.go:90 msg="module stopped" module=ingester
ingester-1 ingester level=info ts=2020-09-09T15:15:36.344415348Z caller=module_service.go:90 msg="module stopped" module=memberlist-kv
ingester-1 ingester level=info ts=2020-09-09T15:15:36.344789138Z caller=table_manager.go:82 msg="stopping table manager"
ingester-1 ingester level=info ts=2020-09-09T15:15:36.344835775Z caller=table_manager.go:158 msg="uploading tables"
ingester-1 ingester level=info ts=2020-09-09T15:15:36.344927265Z caller=table.go:213 msg="uploading table loki_index_18514"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.556499152Z caller=table.go:243 msg="finished uploading table loki_index_18514"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.556563069Z caller=table.go:301 msg="cleaning up unwanted dbs from table loki_index_18514"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.55687916Z caller=table_manager.go:82 msg="stopping table manager"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.556910447Z caller=table_manager.go:158 msg="uploading tables"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.556968228Z caller=table.go:213 msg="uploading table loki_index_18514"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.557011781Z caller=table.go:243 msg="finished uploading table loki_index_18514"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.557026962Z caller=table.go:301 msg="cleaning up unwanted dbs from table loki_index_18514"
ingester-1 ingester panic: close of closed channel
ingester-1 ingester 
ingester-1 ingester goroutine 682 [running]:
ingester-1 ingester github.com/cortexproject/cortex/pkg/chunk/local.(*BoltIndexClient).Stop(0xc000340cd0)
ingester-1 ingester     /src/loki/vendor/github.com/cortexproject/cortex/pkg/chunk/local/boltdb_index_client.go:119 +0x54
ingester-1 ingester github.com/grafana/loki/pkg/storage/stores/shipper.(*Shipper).Stop(0xc00029e600)
ingester-1 ingester     /src/loki/pkg/storage/stores/shipper/shipper_index_client.go:190 +0x46
ingester-1 ingester github.com/cortexproject/cortex/pkg/chunk/storage.(*cachingIndexClient).Stop(0xc000341f90)
ingester-1 ingester     /src/loki/vendor/github.com/cortexproject/cortex/pkg/chunk/storage/caching_index_client.go:68 +0x4a
ingester-1 ingester github.com/cortexproject/cortex/pkg/chunk.(*baseStore).Stop(0xc0008f3500)
ingester-1 ingester     /src/loki/vendor/github.com/cortexproject/cortex/pkg/chunk/chunk_store.go:120 +0x6c
ingester-1 ingester github.com/cortexproject/cortex/pkg/chunk.compositeStore.Stop(0x0, 0x0, 0xc000738630, 0x2, 0x2)
ingester-1 ingester     /src/loki/vendor/github.com/cortexproject/cortex/pkg/chunk/composite_store.go:194 +0x60
ingester-1 ingester github.com/grafana/loki/pkg/loki.(*Loki).initStore.func1(0x0, 0x0, 0x3, 0xc000dedf78)
ingester-1 ingester     /src/loki/pkg/loki/modules.go:287 +0x3b
ingester-1 ingester github.com/cortexproject/cortex/pkg/util/services.(*BasicService).main(0xc00057e480)
ingester-1 ingester     /src/loki/vendor/github.com/cortexproject/cortex/pkg/util/services/basic_service.go:184 +0x2fc
ingester-1 ingester created by github.com/cortexproject/cortex/pkg/util/services.(*BasicService).StartAsync.func1
ingester-1 ingester     /src/loki/vendor/github.com/cortexproject/cortex/pkg/util/services/basic_service.go:96 +0xa8
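The panic itself is a plain double close of a Go channel: BoltIndexClient.Stop closes an internal channel, and closing an already-closed channel always panics with "close of closed channel". A minimal sketch of the failure mode and the usual guard (illustrative names only, not Loki's actual code):

```go
package main

import (
	"fmt"
	"sync"
)

// indexClient mimics (in spirit only) an index client whose Stop closes a
// done channel to signal background goroutines to exit.
type indexClient struct {
	done     chan struct{}
	stopOnce sync.Once // makes Stop idempotent
}

func newIndexClient() *indexClient {
	return &indexClient{done: make(chan struct{})}
}

// Stop is safe to call any number of times: only the first call closes
// the channel, so a second caller no longer panics.
func (c *indexClient) Stop() {
	c.stopOnce.Do(func() { close(c.done) })
}

func main() {
	c := newIndexClient()
	c.Stop()
	c.Stop() // without sync.Once this second call would panic
	fmt.Println("stopped cleanly")
}
```

sync.Once is the idiomatic guard for shutdown paths where several owners may end up calling Stop on the same object.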

For reference, this is my ingester manifest and config.

I did not see any OOMKills, nor are any of the local PVCs full (I only mention this because I run with smaller resources than the upstream loki mixin would set).

To Reproduce
Steps to reproduce the behavior:

  1. Start Loki v1.6.1 on Kubernetes using the upstream loki mixin
  2. Do a rolling restart of the ingester StatefulSet: kubectl rollout restart -n loki statefulset/ingester
  3. Watch it panic on shutdown

Expected behavior
Clean shutdown

Environment:

  • Infrastructure: Kubernetes (GKE 1.15.12-gke.9)
  • Deployment tool: tanka + loki mixin@1e08530cb5d87c147ac76abc4d00f57710522689
primeroz (Contributor, Author) commented:

For reference, in Slack it was suggested this might be a bug caused by having 2 entries in my schema, because when I upgraded to 1.6 I changed the new indexes to use a 24h period.
