
Panic from /metrics during startup #7454

Closed
carlpett opened this issue Jun 25, 2020 · 3 comments · Fixed by #7501
Comments

@carlpett
Member

What did you do?
Restart Prometheus

What did you expect to see?
Clean startup

What did you see instead? Under which circumstances?
A panic and stack trace during a call to /metrics before startup had finished. (That the upstream Prometheus server scraped /metrics even though this Prometheus had not yet signalled readiness appears to be a bug in our service discovery in that setup.)
{"caller":"web.go:99","client":"10.86.14.4:43362","component":"web","err":"runtime error: invalid memory address or nil pointer dereference","level":"error","msg":"panic while serving request","stack":"<see decoded stack below>","url":"/metrics"}

Full logs are supplied below; here's the stack:

goroutine 166 [running]:
github.com/prometheus/prometheus/web.withStackTracer.func1.1(0x2f3b7e0, 0xc0004ccc60, 0xc0000f6600)
	/app/web/web.go:98 +0xb5
panic(0x2334740, 0x42d4ba0)
	/usr/local/go/src/runtime/panic.go:969 +0x166
github.com/prometheus/prometheus/tsdb.(*Head).MinTime(...)
	/app/tsdb/head.go:1407
github.com/prometheus/prometheus/tsdb.newDBMetrics.func3(0x0)
	/app/tsdb/db.go:228 +0x80
github.com/prometheus/client_golang/prometheus.(*valueFunc).Write(0xc0001fab40, 0xc0001c0620, 0xc000113040, 0x3)
	/app/vendor/github.com/prometheus/client_golang/prometheus/value.go:76 +0x2f
github.com/prometheus/client_golang/prometheus.processMetric(0x2f53020, 0xc0001fab40, 0xc000112ed0, 0xc000112f00, 0x0, 0x0, 0x0)
	/app/vendor/github.com/prometheus/client_golang/prometheus/registry.go:595 +0x9b
github.com/prometheus/client_golang/prometheus.(*Registry).Gather(0xc0001385f0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/app/vendor/github.com/prometheus/client_golang/prometheus/registry.go:489 +0x87c
github.com/prometheus/client_golang/prometheus/promhttp.HandlerFor.func1(0x7f98a29be170, 0xc0000a1180, 0xc0000f6900)
	/app/vendor/github.com/prometheus/client_golang/prometheus/promhttp/http.go:126 +0x93
net/http.HandlerFunc.ServeHTTP(0xc000492850, 0x7f98a29be170, 0xc0000a1180, 0xc0000f6900)
	/usr/local/go/src/net/http/server.go:2012 +0x44
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerInFlight.func1(0x7f98a29be170, 0xc0000a1180, 0xc0000f6900)
	/app/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:40 +0xab
net/http.HandlerFunc.ServeHTTP(0xc0005c3b90, 0x7f98a29be170, 0xc0000a1180, 0xc0000f6900)
	/usr/local/go/src/net/http/server.go:2012 +0x44
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1(0x7f98a29be170, 0xc0000a1130, 0xc0000f6900)
	/app/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:100 +0xda
net/http.HandlerFunc.ServeHTTP(0xc0005c3cb0, 0x7f98a29be170, 0xc0000a1130, 0xc0000f6900)
	/usr/local/go/src/net/http/server.go:2012 +0x44
net/http.HandlerFunc.ServeHTTP(0xc00018bd40, 0x7f98a29be170, 0xc0000a1130, 0xc0000f6900)
	/usr/local/go/src/net/http/server.go:2012 +0x44
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerResponseSize.func1(0x7f98a29be170, 0xc0000a10e0, 0xc0000f6900)
	/app/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:196 +0xea
net/http.HandlerFunc.ServeHTTP(0xc0005c3e30, 0x7f98a29be170, 0xc0000a10e0, 0xc0000f6900)
	/usr/local/go/src/net/http/server.go:2012 +0x44
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func2(0x7f98a29be170, 0xc0000a10e0, 0xc0000f6900)
	/app/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:76 +0xb2
net/http.HandlerFunc.ServeHTTP(0xc0005c3f50, 0x7f98a29be170, 0xc0000a10e0, 0xc0000f6900)
	/usr/local/go/src/net/http/server.go:2012 +0x44
github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1(0x2f85560, 0xc0000a1040, 0xc0000f6900)
	/app/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:100 +0xda
github.com/prometheus/prometheus/web.setPathWithPrefix.func1.1(0x2f85560, 0xc0000a1040, 0xc0000f6800)
	/app/web/web.go:1142 +0x1c6
github.com/prometheus/common/route.(*Router).handle.func1(0x2f85560, 0xc0000a1040, 0xc0000f6700, 0x0, 0x0, 0x0)
	/app/vendor/github.com/prometheus/common/route/route.go:83 +0x27f
github.com/julienschmidt/httprouter.(*Router).ServeHTTP(0xc000183f80, 0x2f85560, 0xc0000a1040, 0xc0000f6700)
	/app/vendor/github.com/julienschmidt/httprouter/router.go:387 +0xc37
github.com/prometheus/common/route.(*Router).ServeHTTP(0xc00018a0e0, 0x2f85560, 0xc0000a1040, 0xc0000f6700)
	/app/vendor/github.com/prometheus/common/route/route.go:121 +0x4c
net/http.(*ServeMux).ServeHTTP(0xc0005fe440, 0x2f85560, 0xc0000a1040, 0xc0000f6700)
	/usr/local/go/src/net/http/server.go:2387 +0x1a5
github.com/opentracing-contrib/go-stdlib/nethttp.MiddlewareFunc.func5(0x2f83ae0, 0xc0001ce0e0, 0xc0000f6600)
	/app/vendor/github.com/opentracing-contrib/go-stdlib/nethttp/server.go:140 +0x5f6
net/http.HandlerFunc.ServeHTTP(0xc0007869f0, 0x2f83ae0, 0xc0001ce0e0, 0xc0000f6600)
	/usr/local/go/src/net/http/server.go:2012 +0x44
github.com/prometheus/prometheus/web.withStackTracer.func1(0x2f83ae0, 0xc0001ce0e0, 0xc0000f6600)
	/app/web/web.go:103 +0x8b
net/http.HandlerFunc.ServeHTTP(0xc000786a20, 0x2f83ae0, 0xc0001ce0e0, 0xc0000f6600)
	/usr/local/go/src/net/http/server.go:2012 +0x44
net/http.serverHandler.ServeHTTP(0xc00081a000, 0x2f83ae0, 0xc0001ce0e0, 0xc0000f6600)
	/usr/local/go/src/net/http/server.go:2807 +0xa3
net/http.(*conn).serve(0xc000360640, 0x2f8cfa0, 0xc000694d00)
	/usr/local/go/src/net/http/server.go:1895 +0x86c
created by net/http.(*Server).Serve
	/usr/local/go/src/net/http/server.go:2933 +0x35c

Environment

  • System information:
    Linux 4.14.138+ x86_64 (Container running on Google Kubernetes Engine)

  • Prometheus version:

prometheus, version 2.19.1 (branch: HEAD, revision: eba3fdc)
build user: root@62700b3d0ef9
build date: 20200618-16:35:26
go version: go1.14.4

Logs

{"caller":"main.go:337","level":"info","msg":"Starting Prometheus","ts":"2020-06-25T05:05:22.144Z","version":"(version=2.19.1, branch=HEAD, revision=eba3fdcbf0d378b66600281903e3aab515732b39)"}
{"build_context":"(go=go1.14.4, user=root@62700b3d0ef9, date=20200618-16:35:26)","caller":"main.go:338","level":"info","ts":"2020-06-25T05:05:22.144Z"}
{"caller":"main.go:339","host_details":"(Linux 4.14.138+ #1 SMP Tue Sep 3 02:58:08 PDT 2019 x86_64 prometheus-player-metrics-0 (none))","level":"info","ts":"2020-06-25T05:05:22.144Z"}
{"caller":"main.go:340","fd_limits":"(soft=1048576, hard=1048576)","level":"info","ts":"2020-06-25T05:05:22.144Z"}
{"caller":"main.go:341","level":"info","ts":"2020-06-25T05:05:22.144Z","vm_limits":"(soft=unlimited, hard=unlimited)"}
{"caller":"main.go:678","level":"info","msg":"Starting TSDB ...","ts":"2020-06-25T05:05:22.240Z"}
{"address":"0.0.0.0:9090","caller":"web.go:524","component":"web","level":"info","msg":"Start listening for connections","ts":"2020-06-25T05:05:22.240Z"}
{"caller":"web.go:99","client":"10.86.14.4:43362","component":"web","err":"runtime error: invalid memory address or nil pointer dereference","level":"error","msg":"panic while serving request","stack":"<see decoded stack above>","url":"/metrics"}
{"caller":"http: panic serving 10.86.14.4:43362","component":"web","level":"error","msg":"runtime error: invalid memory address or nil pointer dereference","ts":"2020-06-25T05:05:32.640Z"}
{"caller":"head.go:645","component":"tsdb","level":"info","msg":"Replaying WAL and on-disk memory mappable chunks if any, this may take a while","ts":"2020-06-25T05:05:33.096Z"}
...
@brian-brazil
Contributor

@codesome Are we not handling metrics collection properly before the TSDB is ready?

@codesome codesome self-assigned this Jun 25, 2020
@codesome
Member

We register the metrics before we assign the Head

db.metrics = newDBMetrics(db, r)

And the panic is here

return float64(db.head.MinTime())

But opening the DB won't return successfully unless the Head has been assigned here

db.head, err = NewHead(r, l, wlog, rngs[0], dir, db.chunkPool, opts.StripeSize, opts.SeriesLifecycleCallback)

So if a DB is opened successfully, I am not sure how this panic is possible. Needs more investigation.
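
To make the failure mode concrete, here is a minimal, self-contained sketch (the types and names are invented for illustration, not the actual Prometheus TSDB code): a GaugeFunc is registered while the field its closure dereferences is still nil, and the next Gather panics with the same nil pointer dereference seen in the stack trace above.

```go
package main

import (
	"github.com/prometheus/client_golang/prometheus"
)

// Stand-in types for illustration; not the real Prometheus TSDB.
type head struct{ minTime int64 }

func (h *head) MinTime() int64 { return h.minTime }

type db struct{ head *head }

func main() {
	d := &db{} // head is not assigned yet, as while the DB is still being opened

	reg := prometheus.NewRegistry()
	// Registered before d.head is set, mirroring the ordering described above.
	reg.MustRegister(prometheus.NewGaugeFunc(prometheus.GaugeOpts{
		Name: "example_head_min_time",
		Help: "Minimum time of the head block.",
	}, func() float64 {
		// Nil pointer dereference if evaluated before d.head is assigned.
		return float64(d.head.MinTime())
	}))

	// A scrape of /metrics ends up here: Gather evaluates the closure above
	// and the nil dereference panics, as in the stack trace in this issue.
	_, _ = reg.Gather()
}
```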

@codesome
Member

codesome commented Jul 2, 2020

I think what I mentioned above is indeed the cause: we register the metrics before we assign the head, and a scrape in that window panics (the web server starts serving /metrics before the DB open has returned, as the logs above show). The solution is as simple as registering after assigning the head block (but before the WAL replay).
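
For illustration, a hedged sketch of that ordering (all names, types, and signatures below are stand-ins, not the actual Prometheus code or the eventual fix): assign the head first, then register the metrics that dereference it, and only then start the slow WAL replay.

```go
package tsdbsketch

import "github.com/prometheus/client_golang/prometheus"

// Stand-in types and helpers; invented for the sketch, not the real TSDB API.
type head struct{}

func newHead() (*head, error)    { return &head{}, nil }
func (h *head) MinTime() int64   { return 0 }
func (h *head) replayWAL() error { return nil }

type dbMetrics struct{}

type db struct {
	head    *head
	metrics *dbMetrics
}

// newDBMetrics registers gauges whose closures dereference d.head, so d.head
// must already be non-nil when this is called.
func newDBMetrics(d *db, r prometheus.Registerer) *dbMetrics {
	r.MustRegister(prometheus.NewGaugeFunc(prometheus.GaugeOpts{
		Name: "example_head_min_time",
	}, func() float64 { return float64(d.head.MinTime()) }))
	return &dbMetrics{}
}

// openDB sketches the proposed ordering: assign the head, then register the
// metrics, and only then do the slow WAL replay.
func openDB(r prometheus.Registerer) (*db, error) {
	d := &db{}

	h, err := newHead()
	if err != nil {
		return nil, err
	}
	d.head = h // head is set before any metric closure can observe it

	d.metrics = newDBMetrics(d, r) // safe: an early scrape now sees a non-nil head

	if err := d.head.replayWAL(); err != nil { // slow startup work happens last
		return nil, err
	}
	return d, nil
}
```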

prometheus locked as resolved and limited conversation to collaborators on Nov 22, 2021