v3.0.0: loki backend SIGSEGV if index_gateway.mode: ring #12270

Open
awoimbee opened this issue Mar 20, 2024 · 6 comments
Labels
type/bug Something is not working as expected

@awoimbee

Describe the bug
Running version grafana/loki:main-0bf894b, loki-backend (replicas: 1) crashes:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x288 pc=0x223f470]

goroutine 1 [running]:
github.com/grafana/loki/pkg/loki.(*Loki).updateConfigForShipperStore(0xc000638be0?)
	/src/loki/pkg/loki/modules.go:709 +0xb0
github.com/grafana/loki/pkg/loki.(*Loki).initBloomStore(0xc000d3c000)
	/src/loki/pkg/loki/modules.go:663 +0x68
github.com/grafana/dskit/modules.(*Manager).initModule(0xc000c86720, {0x7ffe92a04bb1, 0x7}, 0x0?, 0x42?)
	/src/loki/vendor/github.com/grafana/dskit/modules/modules.go:136 +0x1f7
github.com/grafana/dskit/modules.(*Manager).InitModuleServices(0x0?, {0xc000ce2990, 0x1, 0x40d39a?})
	/src/loki/vendor/github.com/grafana/dskit/modules/modules.go:108 +0xd8
github.com/grafana/loki/pkg/loki.(*Loki).Run(0xc000d3c000, {0x0?, {0x4?, 0x3?, 0x4751b00?}})
	/src/loki/pkg/loki/loki.go:431 +0x9d

Workaround: edit the ConfigMap and change index_gateway.mode from ring to simple.
Note that I use tsdb; whether or not storage_config contains a boltdb config makes no difference.

Environment:

  • Infrastructure: Kubernetes
  • Deployment tool: Helm
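
The small fault address in the trace above (addr=0x288) is consistent with a field being read at a fixed offset through a nil struct pointer. The following is a minimal sketch of that failure pattern only, not Loki's code: `ringManager`, `app`, and `readMode` are hypothetical names, and the padding merely places the field at a non-zero offset for illustration.

```go
package main

import "fmt"

// ringManager is a hypothetical stand-in for a component that only
// exists after the module that creates it has been initialized.
type ringManager struct {
	_    [0x288]byte // padding so Mode sits at a non-zero offset (illustrative)
	Mode string
}

// app holds a pointer that stays nil if the creating module never ran.
type app struct {
	ring *ringManager
}

// readMode dereferences a.ring without a nil check. The deferred recover
// turns the resulting runtime panic into an error so the example can
// continue instead of crashing the process.
func (a *app) readMode() (mode string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return a.ring.Mode, nil
}

func main() {
	a := &app{}

	// ring is still nil here, so the field read panics and is recovered.
	_, err := a.readMode()
	fmt.Println("before init:", err)

	// Once the pointer is set, the same call succeeds.
	a.ring = &ringManager{Mode: "ring"}
	mode, err := a.readMode()
	fmt.Println("after init:", mode, err)
}
```

Without the recover, the first call would terminate the program with a nil pointer dereference panic, which is the behaviour seen in the crashing pod.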
@awoimbee awoimbee changed the title main branch: loki backend sigsev if index_gateway.mode: ring main branch: loki backend SIGSEGV if index_gateway.mode: ring Mar 20, 2024
@JStickler JStickler added the type/bug Something is not working as expected label Mar 25, 2024
@awoimbee
Author

Closing, since there have been some releases in the meantime. If it still happens, I'll reopen.

@Nissou31

This happened to me today while deploying Loki 3.0.0 in simple scalable mode; only the backend pod is affected.

@awoimbee awoimbee reopened this May 2, 2024
@awoimbee awoimbee changed the title main branch: loki backend SIGSEGV if index_gateway.mode: ring v3.0.0: loki backend SIGSEGV if index_gateway.mode: ring May 2, 2024
@alexandergoncharovaspecta

Same problem here. The only difference is that I have 3 pods: 2 are OK and 1 is in CrashLoopBackOff.

k8 logs -n observability loki-backend-1 -c loki
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x288 pc=0x22f02b0]

goroutine 1 [running]:
github.com/grafana/loki/v3/pkg/loki.(*Loki).updateConfigForShipperStore(0xc0006d5ea0?)
	/src/loki/pkg/loki/modules.go:755 +0xb0
github.com/grafana/loki/v3/pkg/loki.(*Loki).initBloomStore(0xc000cab500)
	/src/loki/pkg/loki/modules.go:715 +0x68
github.com/grafana/dskit/modules.(*Manager).initModule(0xc0004f2f90, {0x7fffb01fda84, 0x7}, 0x1?, 0xc00096e1e0?)
	/src/loki/vendor/github.com/grafana/dskit/modules/modules.go:136 +0x1f7
github.com/grafana/dskit/modules.(*Manager).InitModuleServices(0x0?, {0xc00097ca80, 0x1, 0xc0005a9b30?})
	/src/loki/vendor/github.com/grafana/dskit/modules/modules.go:108 +0xd8
github.com/grafana/loki/v3/pkg/loki.(*Loki).Run(0xc000cab500, {0x0?, {0x4?, 0x3?, 0x4912940?}})
	/src/loki/pkg/loki/loki.go:453 +0x9d
main.main()
	/src/loki/cmd/loki/main.go:122 +0x113b

@chaudum
Contributor

chaudum commented May 3, 2024

@alexandergoncharovaspecta Can you provide your config?


I am able to reproduce the bug on the release-3.0.x branch using

$ ./cmd/loki/loki -target=backend -index-gateway.mode=ring
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x288 pc=0x22efff0]

goroutine 1 [running]:
github.com/grafana/loki/v3/pkg/loki.(*Loki).updateConfigForShipperStore(0xc0008b8960?)
	/home/christian/sandbox/grafana/loki/pkg/loki/modules.go:755 +0xb0
github.com/grafana/loki/v3/pkg/loki.(*Loki).initBloomStore(0xc0007c9500)
	/home/christian/sandbox/grafana/loki/pkg/loki/modules.go:715 +0x68
github.com/grafana/dskit/modules.(*Manager).initModule(0xc00063c780, {0x7fffab192a32, 0x7}, 0x1?, 0xc000eb8d20?)
	/home/christian/sandbox/grafana/loki/vendor/github.com/grafana/dskit/modules/modules.go:136 +0x1f7
github.com/grafana/dskit/modules.(*Manager).InitModuleServices(0x0?, {0xc000a0dc20, 0x1, 0xc000eb8bd0?})
	/home/christian/sandbox/grafana/loki/vendor/github.com/grafana/dskit/modules/modules.go:108 +0xd8
github.com/grafana/loki/v3/pkg/loki.(*Loki).Run(0xc0007c9500, {0x0?, {0x4?, 0x3?, 0x493d3e0?}})
	/home/christian/sandbox/grafana/loki/pkg/loki/loki.go:453 +0x9d
main.main()
	/home/christian/sandbox/grafana/loki/cmd/loki/main.go:122 +0x113b

chaudum added a commit that referenced this issue May 3, 2024
The bloom store initialisation updates the shipper configuration
which in turn requires the index gateway ring to be initialized in case
`-index-gateway.mode` is set to `ring`.

Therefore the `BloomStore` module needs to depend on the
`IndexGatewayRing` module.

Fixes #12270

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>
@alexandergoncharovaspecta

Yes

Source: loki/templates/config.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: loki
  namespace: observability
  labels:
    helm.sh/chart: loki-6.3.4
    app.kubernetes.io/name: loki
    app.kubernetes.io/instance: loki
    app.kubernetes.io/version: "3.0.0"
    app.kubernetes.io/managed-by: Helm
data:
  config.yaml: |

auth_enabled: false
chunk_store_config:
  chunk_cache_config:
    background:
      writeback_buffer: 500000
      writeback_goroutines: 1
      writeback_size_limit: 500MB
    default_validity: 0s
    memcached:
      batch_size: 4
      parallelism: 5
    memcached_client:
      addresses: dnssrvnoa+_memcached-client._tcp.loki-chunks-cache.observability.svc
      consistent_hash: true
      max_idle_conns: 72
      timeout: 2000ms
common:
  compactor_address: 'http://loki-backend:3100'
  path_prefix: /var/loki
  replication_factor: 3
  storage:
    azure:
      account_key: ${LOKI_AZURE_ACCOUNT_KEY}
      account_name: ${LOKI_AZURE_ACCOUNT_NAME}
      container_name: chunks
      use_federated_token: false
      use_managed_identity: false
frontend:
  scheduler_address: ""
  tail_proxy_url: http://loki-querier.observability.svc.cluster.local:3100
frontend_worker:
  scheduler_address: ""
index_gateway:
  mode: ring
limits_config:
  allow_structured_metadata: false
  max_cache_freshness_per_query: 10m
  max_query_parallelism: 32
  max_query_series: 100000
  query_timeout: 300s
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  retention_period: 720h
  split_queries_by_interval: 15m
  tsdb_max_query_parallelism: 512
  volume_enabled: true
memberlist:
  join_members:
  - loki-memberlist
pattern_ingester:
  enabled: false
querier:
  max_concurrent: 16
query_range:
  align_queries_with_step: true
  cache_results: true
  results_cache:
    cache:
      background:
        writeback_buffer: 500000
        writeback_goroutines: 1
        writeback_size_limit: 500MB
      default_validity: 12h
      memcached_client:
        addresses: dnssrvnoa+_memcached-client._tcp.loki-results-cache.observability.svc
        consistent_hash: true
        timeout: 500ms
        update_interval: 1m
query_scheduler:
  max_outstanding_requests_per_tenant: 32768
ruler:
  storage:
    azure:
      account_key: ${LOKI_AZURE_ACCOUNT_KEY}
      account_name: ${LOKI_AZURE_ACCOUNT_NAME}
      container_name: ruler
      use_federated_token: false
      use_managed_identity: false
    type: azure
runtime_config:
  file: /etc/loki/runtime-config/runtime-config.yaml
schema_config:
  configs:
  - from: "2024-02-29"
    index:
      period: 24h
      prefix: loki_index_
    object_store: azure
    schema: v13
    store: tsdb
server:
  grpc_listen_port: 9095
  http_listen_port: 3100
  http_server_read_timeout: 600s
  http_server_write_timeout: 600s
storage_config:
  boltdb_shipper:
    index_gateway_client:
      server_address: dns+loki-backend-headless.observability.svc.cluster.local:9095
  hedging:
    at: 250ms
    max_per_second: 20
    up_to: 3
  tsdb_shipper:
    index_gateway_client:
      server_address: dns+loki-backend-headless.observability.svc.cluster.local:9095
tracing:
  enabled: false

chaudum added a commit that referenced this issue May 6, 2024
The bloom store initialisation updates the shipper configuration which in turn requires the index gateway ring to be initialised in case `-index-gateway.mode` is set to `ring`.

Therefore the `BloomStore` module needs to depend on the `IndexGatewayRing` module.

Fixes #12270

Signed-off-by: Christian Haudum <christian.haudum@gmail.com>