Promtail: panic: duplicate metrics collector registration attempted (loki_push_api) #10796

Closed
jcreixell opened this issue Oct 5, 2023 · 4 comments · Fixed by #10798

Comments

@jcreixell

Describe the bug

Dynamically reloading promtail's configuration when using loki_push_api leads to a panic caused by double registration of metrics. This also indirectly affects Grafana Agent.
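
The failure mode can be illustrated without Promtail at all. The sketch below is a simplified stand-in for a Prometheus-style registry, written with the standard library only. `registry`, `mustRegister`, `newPushTarget`, `reload`, and the metric name are hypothetical names chosen for illustration, not Promtail or client_golang code. It assumes a component whose constructor registers its metrics on every call against a registry that outlives the reload, which is the shape the stack traces in this issue suggest.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a toy stand-in for a process-wide metrics registry.
// Like prometheus.Registry.MustRegister, mustRegister panics when the
// same collector name is registered twice.
type registry struct {
	mu    sync.Mutex
	names map[string]struct{}
}

func newRegistry() *registry {
	return &registry{names: make(map[string]struct{})}
}

func (r *registry) mustRegister(name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.names[name]; ok {
		panic("duplicate metrics collector registration attempted")
	}
	r.names[name] = struct{}{}
}

// newPushTarget models a component whose constructor registers its
// metrics every time it is called (hypothetical, for illustration).
func newPushTarget(reg *registry) {
	reg.mustRegister("push_target_requests_total")
}

// reload rebuilds the component against the same registry and reports
// a panic as an error instead of crashing the process.
func reload(reg *registry) (err error) {
	defer func() {
		if p := recover(); p != nil {
			err = fmt.Errorf("reload panicked: %v", p)
		}
	}()
	newPushTarget(reg)
	return nil
}

func main() {
	reg := newRegistry()
	fmt.Println("initial start:", reload(reg))
	fmt.Println("after reload: ", reload(reg))
}
```

In this model the first call succeeds and the second one returns an error wrapping the panic message, because the registry still holds the name registered during the initial start.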

To Reproduce
Steps to reproduce the behavior:

  1. Run promtail with the following configuration:
server:
  http_listen_port: 9080
  grpc_listen_port: 0
  enable_runtime_reload: true
positions:
  filename: /tmp/positions.yaml
client:
  url: http://ip_or_hostname_where_Loki_run:3100/api/prom/push
scrape_configs:
  - job_name: remote_logs
    loki_push_api:
      server:
        http_listen_port: 3500
        grpc_listen_port: 0
  2. Change something in the config while promtail is running (otherwise the reload is a no-op)
  3. curl http://localhost:9080/reload

A panic then shows up in the logs.

Expected behavior
Promtail shouldn't panic on reload.

Environment:
Tested directly from the source on the latest release (commit 3b4bd12).

Screenshots, Promtail config, or terminal output

level=error ts=2023-10-05T12:29:01.554072921Z caller=promtail.go:289 msg="Error reloading config" err="config has not changed"
level=warn ts=2023-10-05T12:29:01.55411467Z caller=logging.go:126 traceID=68fb6a732b3fa2ed msg="GET /reload (500) 844.528µs Response: \"failed to reload config: config has not changed\\n\" ws: false; Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7; Accept-Encoding: gzip, deflate, br; Accept-Language: en-US,en;q=0.9; Cache-Control: max-age=0; Connection: keep-alive; Sec-Ch-Ua: \"Chromium\";v=\"116\", \"Not)A;Brand\";v=\"24\", \"Google Chrome\";v=\"116\"; Sec-Ch-Ua-Mobile: ?0; Sec-Ch-Ua-Platform: \"Linux\"; Sec-Fetch-Dest: document; Sec-Fetch-Mode: navigate; Sec-Fetch-Site: none; Sec-Fetch-User: ?1; Upgrade-Insecure-Requests: 1; User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36; "
panic: duplicate metrics collector registration attempted

goroutine 242 [running]:
github.com/prometheus/client_golang/prometheus.(*Registry).MustRegister(0x36c5479?, {0xc0017a57d0?, 0x1, 0xb?})
	/home/jorge/workspace/src/github.com/grafana/loki/vendor/github.com/prometheus/client_golang/prometheus/registry.go:405 +0x78
github.com/grafana/loki/clients/pkg/promtail/wal.NewWatcherMetrics({0x3feac60, 0x5d1aba0})
	/home/jorge/workspace/src/github.com/grafana/loki/clients/pkg/promtail/wal/watcher_metrics.go:73 +0xa79
github.com/grafana/loki/clients/pkg/promtail/client.NewManager(0x0?, {0x3fcfdc0, 0xc001894050}, {0x40c3880000000000, 0x2710, 0x0, 0x1, 0x0, 0x0, 0x0}, ...)
	/home/jorge/workspace/src/github.com/grafana/loki/clients/pkg/promtail/client/manager.go:61 +0x85
github.com/grafana/loki/clients/pkg/promtail.(*Promtail).reloadConfig(0xc000ca37c0, 0xc000ca6600)
	/home/jorge/workspace/src/github.com/grafana/loki/clients/pkg/promtail/promtail.go:170 +0x8cc
github.com/grafana/loki/clients/pkg/promtail.(*Promtail).reload(0xc000ca37c0)
	/home/jorge/workspace/src/github.com/grafana/loki/clients/pkg/promtail/promtail.go:286 +0xa5
github.com/grafana/loki/clients/pkg/promtail.(*Promtail).watchConfig(0xc000ca37c0)
	/home/jorge/workspace/src/github.com/grafana/loki/clients/pkg/promtail/promtail.go:271 +0x3d1
created by github.com/grafana/loki/clients/pkg/promtail.(*Promtail).Run in goroutine 1
	/home/jorge/workspace/src/github.com/grafana/loki/clients/pkg/promtail/promtail.go:214 +0xcd
@jcreixell
Author

Semi-related: Grafana Agent shows the following similar error when reloading a config with loki_push_api:

2023/10/05 14:49:34 http: panic serving 127.0.0.1:34356: duplicate metrics collector registration attempted
goroutine 322 [running]:
net/http.(*conn).serve.func1()
        /usr/lib/go/src/net/http/server.go:1868 +0xb9
panic({0x6b7dd80?, 0xc0039e44a0?})
        /usr/lib/go/src/runtime/panic.go:920 +0x270
github.com/opentracing-contrib/go-stdlib/nethttp.MiddlewareFunc.func5.1()
        /home/jorge/workspace/pkg/mod/github.com/opentracing-contrib/go-stdlib@v1.0.0/nethttp/server.go:150 +0x11e
panic({0x6b7dd80?, 0xc0039e44a0?})
        /usr/lib/go/src/runtime/panic.go:914 +0x21f
github.com/prometheus/client_golang/prometheus.(*Registry).MustRegister(0xc00382e7b0?, {0xc002090160?, 0x1, 0x0?})
        /home/jorge/workspace/pkg/mod/github.com/prometheus/client_golang@v1.16.0/prometheus/registry.go:405 +0x78
github.com/prometheus/client_golang/prometheus/promauto.Factory.NewGaugeVec({{0x8543b70?, 0xcbc6fa0?}}, {{0xc00382e7b0, 0x14}, {0x0, 0x0}, {0x7624738, 0xf}, {0x7735aa8, 0x2b}, ...}, ...)
        /home/jorge/workspace/pkg/mod/github.com/prometheus/client_golang@v1.16.0/prometheus/promauto/auto.go:308 +0x1bc
github.com/grafana/dskit/server.NewServerMetrics({{0xc00382e7b0, 0x14}, 0x0, {0x75f1162, 0x3}, {0x0, 0x0}, 0xdac, 0x0, {0x75f1162, ...}, ...})
        /home/jorge/workspace/pkg/mod/github.com/grafana/dskit@v0.0.0-20230829141140-06955c011ffd/server/metrics.go:30 +0x15a
github.com/grafana/dskit/server.New({{0xc00382e7b0, 0x14}, 0x0, {0x75f1162, 0x3}, {0x0, 0x0}, 0xdac, 0x0, {0x75f1162, ...}, ...})
        /home/jorge/workspace/pkg/mod/github.com/grafana/dskit@v0.0.0-20230829141140-06955c011ffd/server/server.go:224 +0x58
github.com/grafana/loki/clients/pkg/promtail/targets/lokipush.(*PushTarget).run(0xc0039016e0)
        /home/jorge/workspace/pkg/mod/github.com/grafana/loki@v1.6.2-0.20231004111112-07cbef92268a/clients/pkg/promtail/targets/lokipush/pushtarget.go:85 +0x338
github.com/grafana/loki/clients/pkg/promtail/targets/lokipush.NewPushTarget({0x84f4ac0?, 0xc002c9b090}, {0x85180e0?, 0xc001e01840}, {0x0, 0x0, 0x0}, {0xc00208e7e0, 0xb}, 0xc002b90a80)
        /home/jorge/workspace/pkg/mod/github.com/grafana/loki@v1.6.2-0.20231004111112-07cbef92268a/clients/pkg/promtail/targets/lokipush/pushtarget.go:64 +0x325
github.com/grafana/loki/clients/pkg/promtail/targets/lokipush.NewPushTargetManager({0x8543860, 0xc002cd2210}, {0x84f4ac0?, 0xc002c9b090}, {0x85180b8, 0xc0039e7360}, {0xc002cadb00?, 0x1, 0x1})
        /home/jorge/workspace/pkg/mod/github.com/grafana/loki@v1.6.2-0.20231004111112-07cbef92268a/clients/pkg/promtail/targets/lokipush/pushtargetmanager.go:47 +0x2b0
github.com/grafana/loki/clients/pkg/promtail/targets.NewTargetManagers({0x84f5d80?, 0xc002c9dcc0?}, {0x8543860, 0xc002cd2210}, {0x84f4ac0, 0xc002c9b090}, {0x2540be400, {0xc00399de00, 0x2a}, 0x0, ...}, ...)
        /home/jorge/workspace/pkg/mod/github.com/grafana/loki@v1.6.2-0.20231004111112-07cbef92268a/clients/pkg/promtail/targets/manager.go:222 +0x1cc5
github.com/grafana/loki/clients/pkg/promtail.(*Promtail).reloadConfig(0xc002c9dcc0, 0xc0020ca000)
        /home/jorge/workspace/pkg/mod/github.com/grafana/loki@v1.6.2-0.20231004111112-07cbef92268a/clients/pkg/promtail/promtail.go:187 +0xc0d
github.com/grafana/loki/clients/pkg/promtail.New({{{0xee6b280, 0xee6b280}}, {{{0x0, 0x0}, 0x0, {0x0, 0x0}, {0x0, 0x0}, 0x0, ...}, ...}, ...}, ...)
        /home/jorge/workspace/pkg/mod/github.com/grafana/loki@v1.6.2-0.20231004111112-07cbef92268a/clients/pkg/promtail/promtail.go:110 +0x2dd
github.com/grafana/agent/pkg/logs.(*Instance).ApplyConfig(0xc002d070b0, 0xc002c9cbe0, {{0xee6b280, 0xee6b280}, {0xc002b222c0, 0x1, 0x1}}, 0x68?)

@hainenber
Contributor

I can reproduce the above issue. Additionally, reloading via SIGHUP also causes a panic.

@bboreham
Contributor

bboreham commented Oct 9, 2023

The one here:

github.com/grafana/dskit/server.New({{0xc00382e7b0, 0x14}, 0x0, {0x75f1162, 0x3}, {0x0, 0x0}, 0xdac, 0x0, {0x75f1162, ...}, ...})
        /home/jorge/workspace/pkg/mod/github.com/grafana/dskit@v0.0.0-20230829141140-06955c011ffd/server/server.go:224 +0x58
github.com/grafana/loki/clients/pkg/promtail/targets/lokipush.(*PushTarget).run(0xc0039016e0)
        /home/jorge/workspace/pkg/mod/github.com/grafana/loki@v1.6.2-0.20231004111112-07cbef92268a/clients/pkg/promtail/targets/lokipush/pushtarget.go:85 +0x338

could be avoided by calling NewWithMetrics here:

srv, err := server.New(t.config.Server)

once for the whole program.
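
One way to read this suggestion: build the metrics once and hand the same instance to every server that a reload creates, instead of letting each constructor register a fresh set. The sketch below shows that shape with the standard library only. `serverMetrics`, `newServerMetrics`, `newServerWithMetrics`, and the toy `registry` are hypothetical stand-ins that mirror the roles of dskit's `NewServerMetrics` and `NewWithMetrics`. It is an illustration of the pattern under those assumptions, not the change made in #10798.

```go
package main

import "fmt"

// registry is a toy stand-in for a metrics registry that rejects
// duplicate names, as a Prometheus registry does.
type registry struct{ names map[string]bool }

func (r *registry) mustRegister(name string) {
	if r.names[name] {
		panic("duplicate metrics collector registration attempted")
	}
	r.names[name] = true
}

// serverMetrics holds the metrics shared by every server instance.
type serverMetrics struct{ requests int }

// newServerMetrics registers the metrics; call it once per program.
func newServerMetrics(reg *registry) *serverMetrics {
	reg.mustRegister("server_requests_total")
	return &serverMetrics{}
}

// server is a hypothetical component that is rebuilt on each reload.
type server struct {
	port    int
	metrics *serverMetrics
}

// newServerWithMetrics reuses metrics created elsewhere, so it never
// touches the registry itself.
func newServerWithMetrics(port int, m *serverMetrics) *server {
	return &server{port: port, metrics: m}
}

func main() {
	reg := &registry{names: map[string]bool{}}
	metrics := newServerMetrics(reg) // once for the whole program

	// Simulate three config reloads; each builds a new server.
	for _, port := range []int{3500, 3501, 3502} {
		srv := newServerWithMetrics(port, metrics)
		srv.metrics.requests++
		fmt.Printf("server on port %d shares metrics %p\n", srv.port, srv.metrics)
	}
	fmt.Println("requests counted across reloads:", metrics.requests)
}
```

Because the constructor used on reload never registers anything, rebuilding the server any number of times leaves exactly one registration in the registry.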

MichelHollands added a commit that referenced this issue Oct 12, 2023
…er reloaded (#10798)

**What this PR does / why we need it**:
Prevent Promtail panicking after getting reloaded.

**Which issue(s) this PR fixes**:
Fixes #10796 

**Special notes for your reviewer**:

**Checklist**
- [x] Reviewed the
[`CONTRIBUTING.md`](https://github.com/grafana/loki/blob/main/CONTRIBUTING.md)
guide (**required**)
- [ ] Documentation added
- [ ] Tests updated
- [x] `CHANGELOG.md` updated
- [ ] If the change is worth mentioning in the release notes, add
`add-to-release-notes` label
- [ ] Changes that require user attention or interaction to upgrade are
documented in `docs/sources/setup/upgrade/_index.md`
- [ ] For Helm chart changes bump the Helm chart version in
`production/helm/loki/Chart.yaml` and update
`production/helm/loki/CHANGELOG.md` and
`production/helm/loki/README.md`. [Example
PR](d10549e)

---------

Co-authored-by: Michel Hollands <42814411+MichelHollands@users.noreply.github.com>
@svarcinator

I encountered a similar error after requesting http://localhost:9081/reload.
I got this error:

panic: duplicate metrics collector registration attempted

goroutine 190 [running]:
github.com/prometheus/client_golang/prometheus.(*Registry).MustRegister(0x3e7294d?, {0xc001950390?, 0x1, 0xb?})
        /drone/src/vendor/github.com/prometheus/client_golang/prometheus/registry.go:405 +0x78
github.com/grafana/loki/clients/pkg/promtail/wal.NewWatcherMetrics({0x478b7b0, 0x648ec00})
        /drone/src/clients/pkg/promtail/wal/watcher_metrics.go:73 +0xa79
github.com/grafana/loki/clients/pkg/promtail/client.NewManager(0x0?, {0x4770b20, 0xc000a58af0}, {0x40c3880000000000, 0x2710, 0x0, 0x1, 0x0, 0x0, 0x0}, ...)
        /drone/src/clients/pkg/promtail/client/manager.go:61 +0x85
github.com/grafana/loki/clients/pkg/promtail.(*Promtail).reloadConfig(0xc0016e65a0, 0xc000097180)
        /drone/src/clients/pkg/promtail/promtail.go:170 +0x88c
github.com/grafana/loki/clients/pkg/promtail.(*Promtail).reload(0xc0016e65a0)
        /drone/src/clients/pkg/promtail/promtail.go:286 +0xa5
github.com/grafana/loki/clients/pkg/promtail.(*Promtail).watchConfig(0xc0016e65a0)
        /drone/src/clients/pkg/promtail/promtail.go:271 +0x3d1
created by github.com/grafana/loki/clients/pkg/promtail.(*Promtail).Run in goroutine 1
        /drone/src/clients/pkg/promtail/promtail.go:214 +0xcd

I have the following versions of loki and promtail:

loki, version 2.9.3 (branch: HEAD, revision: 2535f9bede)
  build user:       root@998f10a08814
  build date:       2023-12-11T19:17:52Z
  go version:       go1.21.3
  platform:         windows/amd64
  tags:             netgo

promtail, version 2.9.3 (branch: HEAD, revision: 2535f9bede)
  build user:       root@998f10a08814
  build date:       2023-12-11T19:17:52Z
  go version:       go1.21.3
  platform:         windows/amd64
  tags:             netgo

rhnasc pushed a commit to inloco/loki that referenced this issue Apr 12, 2024
…er reloaded (grafana#10798)