
Ingester Panic on shutdown when using boltdb-shipper #2609

Closed
primeroz opened this issue Sep 9, 2020 · 1 comment · Fixed by #2613

Comments

primeroz (Contributor) commented Sep 9, 2020

Describe the bug
Tested with grafana/loki:1.6.1 (noticed with this version; I have not tried older ones) and also with grafana/loki:k32-164f5cd, as requested by ewelch on Slack.

Whenever I do a rolling restart of my ingesters, I can see all of them panic with:

ingester-1 ingester level=info ts=2020-09-09T15:15:05.789748298Z caller=signals.go:55 msg="=== received SIGINT/SIGTERM ===\n*** exiting"
ingester-1 ingester level=info ts=2020-09-09T15:15:05.790101645Z caller=lifecycler.go:421 msg="lifecycler loop() exited gracefully" ring=ingester
ingester-1 ingester level=info ts=2020-09-09T15:15:05.790134341Z caller=lifecycler.go:710 msg="changing instance state from" old_state=ACTIVE new_state=LEAVING ring=ingester
ingester-1 ingester level=info ts=2020-09-09T15:15:05.790400011Z caller=module_service.go:90 msg="module stopped" module=runtime-config
ingester-1 ingester level=info ts=2020-09-09T15:15:05.800942679Z caller=lifecycler.go:749 msg="transfers are disabled"
ingester-1 ingester level=info ts=2020-09-09T15:15:36.344199809Z caller=lifecycler.go:473 msg="instance removed from the KV store" ring=ingester
ingester-1 ingester level=info ts=2020-09-09T15:15:36.344284404Z caller=module_service.go:90 msg="module stopped" module=ingester
ingester-1 ingester level=info ts=2020-09-09T15:15:36.344415348Z caller=module_service.go:90 msg="module stopped" module=memberlist-kv
ingester-1 ingester level=info ts=2020-09-09T15:15:36.344789138Z caller=table_manager.go:82 msg="stopping table manager"
ingester-1 ingester level=info ts=2020-09-09T15:15:36.344835775Z caller=table_manager.go:158 msg="uploading tables"
ingester-1 ingester level=info ts=2020-09-09T15:15:36.344927265Z caller=table.go:213 msg="uploading table loki_index_18514"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.556499152Z caller=table.go:243 msg="finished uploading table loki_index_18514"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.556563069Z caller=table.go:301 msg="cleaning up unwanted dbs from table loki_index_18514"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.55687916Z caller=table_manager.go:82 msg="stopping table manager"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.556910447Z caller=table_manager.go:158 msg="uploading tables"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.556968228Z caller=table.go:213 msg="uploading table loki_index_18514"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.557011781Z caller=table.go:243 msg="finished uploading table loki_index_18514"
ingester-1 ingester level=info ts=2020-09-09T15:15:37.557026962Z caller=table.go:301 msg="cleaning up unwanted dbs from table loki_index_18514"
ingester-1 ingester panic: close of closed channel
ingester-1 ingester 
ingester-1 ingester goroutine 682 [running]:
ingester-1 ingester github.com/cortexproject/cortex/pkg/chunk/local.(*BoltIndexClient).Stop(0xc000340cd0)
ingester-1 ingester     /src/loki/vendor/github.com/cortexproject/cortex/pkg/chunk/local/boltdb_index_client.go:119 +0x54
ingester-1 ingester github.com/grafana/loki/pkg/storage/stores/shipper.(*Shipper).Stop(0xc00029e600)
ingester-1 ingester     /src/loki/pkg/storage/stores/shipper/shipper_index_client.go:190 +0x46
ingester-1 ingester github.com/cortexproject/cortex/pkg/chunk/storage.(*cachingIndexClient).Stop(0xc000341f90)
ingester-1 ingester     /src/loki/vendor/github.com/cortexproject/cortex/pkg/chunk/storage/caching_index_client.go:68 +0x4a
ingester-1 ingester github.com/cortexproject/cortex/pkg/chunk.(*baseStore).Stop(0xc0008f3500)
ingester-1 ingester     /src/loki/vendor/github.com/cortexproject/cortex/pkg/chunk/chunk_store.go:120 +0x6c
ingester-1 ingester github.com/cortexproject/cortex/pkg/chunk.compositeStore.Stop(0x0, 0x0, 0xc000738630, 0x2, 0x2)
ingester-1 ingester     /src/loki/vendor/github.com/cortexproject/cortex/pkg/chunk/composite_store.go:194 +0x60
ingester-1 ingester github.com/grafana/loki/pkg/loki.(*Loki).initStore.func1(0x0, 0x0, 0x3, 0xc000dedf78)
ingester-1 ingester     /src/loki/pkg/loki/modules.go:287 +0x3b
ingester-1 ingester github.com/cortexproject/cortex/pkg/util/services.(*BasicService).main(0xc00057e480)
ingester-1 ingester     /src/loki/vendor/github.com/cortexproject/cortex/pkg/util/services/basic_service.go:184 +0x2fc
ingester-1 ingester created by github.com/cortexproject/cortex/pkg/util/services.(*BasicService).StartAsync.func1
ingester-1 ingester     /src/loki/vendor/github.com/cortexproject/cortex/pkg/util/services/basic_service.go:96 +0xa8
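The panic itself is a plain double close of a Go channel: BoltIndexClient.Stop closes an internal channel, and closing an already-closed channel always panics with "close of closed channel". A minimal sketch of the failure mode and the usual guard (illustrative names only, not Loki's actual code):

```go
package main

import (
	"fmt"
	"sync"
)

// indexClient mimics (in spirit only) an index client whose Stop closes a
// done channel to signal background goroutines to exit.
type indexClient struct {
	done     chan struct{}
	stopOnce sync.Once // makes Stop idempotent
}

func newIndexClient() *indexClient {
	return &indexClient{done: make(chan struct{})}
}

// Stop is safe to call any number of times: only the first call closes
// the channel, so a second caller no longer panics.
func (c *indexClient) Stop() {
	c.stopOnce.Do(func() { close(c.done) })
}

func main() {
	c := newIndexClient()
	c.Stop()
	c.Stop() // without sync.Once this second call would panic
	fmt.Println("stopped cleanly")
}
```

sync.Once is the idiomatic guard for shutdown paths where several owners may end up calling Stop on the same object.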

For reference, this is my ingester manifest and config.

I did not see any OOMKills, nor are any of the local PVCs full (I only mention this because I run with smaller resources than the upstream loki mixin would set).

To Reproduce
Steps to reproduce the behavior:

  1. Start Loki v1.6.1 on Kubernetes using the upstream loki mixin
  2. Do a rolling restart of the ingester StatefulSet: kubectl rollout restart -n loki statefulset/ingester
  3. Watch it panic on shutdown

Expected behavior
Clean shutdown

Environment:

  • Infrastructure: Kubernetes (GKE 1.15.12-gke.9)
  • Deployment tool: tanka + loki mixin@1e08530cb5d87c147ac76abc4d00f57710522689
primeroz (Contributor, Author) commented:

For reference, in Slack it was suggested this might be a bug caused by having 2 entries in my schema, because when I upgraded to 1.6 I changed the new indexes to use a 24h period.
