
Prometheus OOM for lots of short time series #3780

Closed
extraaa2 opened this issue Feb 1, 2018 · 7 comments

extraaa2 commented Feb 1, 2018

Hi all,
I have been experimenting with Prometheus and trying to configure it so that it works for our use case.
What would be the optimal settings for this case, and what caused the OOM issue?

What did you do?
We want to store around 50 million short time series (time series that do not appear regularly) for a long period (let's say 5 years).
The number of time series is currently growing exponentially, and we expect growth to flatten out at around 50 million time series.
The time series are pulled from a custom server written in NodeJS using the siimon/prom-client npm library.
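
For illustration, a minimal sketch of the kind of exporter code involved (the metric name and label values here are made up for this example, not our real code), using siimon/prom-client:

const client = require('prom-client');

// A gauge with an unbounded label value (one per entity): every new
// entity_id creates a brand-new time series on the Prometheus side.
const entityValue = new client.Gauge({
  name: 'entity_value',
  help: 'Last observed value for one short-lived entity',
  labelNames: ['entity_id'],
});

// Each distinct entity_id becomes its own series; most series only
// receive a handful of samples before they stop appearing.
entityValue.labels('entity-00000001').set(42);
entityValue.labels('entity-00000002').set(17);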

What did you expect to see?
Prometheus ingesting the time series, at least for a period of time until the storage fills up.

What did you see instead? Under which circumstances?
Prometheus runs out of memory after 10 minutes. The container is run with --restart always. After a few restarts, the container appears to be up in docker ps -a, but docker stats shows no activity, and I can no longer enter the container with docker exec -it <DOCKER_ID> /bin/sh.
I should also mention that requests to the HTTP API return Service Unavailable from the beginning.

Environment
- 24GB Ram
- 8 cores
- 500 GB HDD
- Docker version 17.09.1-ce, build 19e2cf6
- Starting the Prometheus container with:

sudo docker run --restart always -d -p 9090:9090 -u root -v /data/prometheus/config/prometheus.yml:/etc/prometheus/prometheus.yml -v /data/prometheus/data:/prometheus prom/prometheus --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.retention=34176h --storage.tsdb.max-block-duration=1d
  • System information:

    Linux 4.15.0-041500rc3-lowlatency x86_64

  • Prometheus version:

      prometheus, version 2.0.0 (branch: HEAD, revision: 0a74f98628a0463dddc90528220c94de5032d1a0)
      build user:       root@615b82cb36b6
      build date:       20171108-07:11:59
      go version:       go1.9.2
    
  • Prometheus configuration file:

global:
  scrape_interval:     10s 
  evaluation_interval: 10s 
scrape_configs:
  - job_name: 'k_prometheus'

    metrics_path: "/metrics"

    scrape_interval: 10s
    scrape_timeout: 10s

    static_configs:
      - targets: ['172.24.11.98:8080']
  • Logs:
level=info ts=2018-02-01T10:13:41.872332029Z caller=main.go:215 msg="Starting Prometheus" version="(version=2.0.0, branch=HEAD, revision=0a74f98628a0463dddc90528220c94de5032d1a0)"
level=info ts=2018-02-01T10:13:41.873505911Z caller=main.go:216 build_context="(go=go1.9.2, user=root@615b82cb36b6, date=20171108-07:11:59)"
level=info ts=2018-02-01T10:13:41.873838088Z caller=main.go:217 host_details="(Linux 4.15.0-041500rc3-lowlatency #201712110230 SMP PREEMPT Mon Dec 11 02:33:33 UTC 2017 x86_64 d4ee3ece019c (none))"
level=info ts=2018-02-01T10:13:41.879244751Z caller=web.go:380 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2018-02-01T10:13:41.879167177Z caller=main.go:314 msg="Starting TSDB"
level=info ts=2018-02-01T10:13:41.879230809Z caller=targetmanager.go:71 component="target manager" msg="Starting target manager..."
fatal error: runtime: out of memory

runtime stack:
runtime.throw(0x1bc42ef, 0x16)
	/usr/local/go/src/runtime/panic.go:605 +0x95
runtime.sysMap(0xc7a7540000, 0xbb00000, 0x7f3ea71f3000, 0x298cc78)
	/usr/local/go/src/runtime/mem_linux.go:216 +0x1d0
runtime.(*mheap).sysAlloc(0x2973560, 0xbb00000, 0x7f3ea71f3850)
	/usr/local/go/src/runtime/malloc.go:470 +0xd7
runtime.(*mheap).grow(0x2973560, 0x5d80, 0x0)
	/usr/local/go/src/runtime/mheap.go:887 +0x60
runtime.(*mheap).allocSpanLocked(0x2973560, 0x5d80, 0x298cc88, 0x7f3ea72219b8)
	/usr/local/go/src/runtime/mheap.go:800 +0x334
runtime.(*mheap).alloc_m(0x2973560, 0x5d80, 0xc420560100, 0x4151ac)
	/usr/local/go/src/runtime/mheap.go:666 +0x118
runtime.(*mheap).alloc.func1()
	/usr/local/go/src/runtime/mheap.go:733 +0x4d
runtime.systemstack(0xc420565f08)
	/usr/local/go/src/runtime/asm_amd64.s:360 +0xab
runtime.(*mheap).alloc(0x2973560, 0x5d80, 0xc420010100, 0x414814)
	/usr/local/go/src/runtime/mheap.go:732 +0xa1
runtime.largeAlloc(0xbb00000, 0x7f45ef160001, 0xc7a73c9c18)
	/usr/local/go/src/runtime/malloc.go:827 +0x98
runtime.mallocgc.func1()
	/usr/local/go/src/runtime/malloc.go:722 +0x46
runtime.systemstack(0xc4204964b8)
	/usr/local/go/src/runtime/asm_amd64.s:344 +0x79
runtime.mstart()
	/usr/local/go/src/runtime/proc.go:1135

goroutine 168 [running]:
runtime.systemstack_switch()
	/usr/local/go/src/runtime/asm_amd64.s:298 fp=0xc4205615e0 sp=0xc4205615d8 pc=0x459f70
runtime.mallocgc(0xbb00000, 0x1a27b40, 0xc7a74f9b01, 0xc4205616d0)
	/usr/local/go/src/runtime/malloc.go:721 +0x7ae fp=0xc420561688 sp=0xc4205615e0 pc=0x410e8e
runtime.newarray(0x1a27b40, 0x110000, 0x20)
	/usr/local/go/src/runtime/malloc.go:853 +0x60 fp=0xc4205616b8 sp=0xc420561688 pc=0x4111f0
runtime.makeBucketArray(0x191cde0, 0x295a814, 0xc7a020bea0, 0xc79b933ea0)
	/usr/local/go/src/runtime/hashmap.go:927 +0xf5 fp=0xc420561700 sp=0xc4205616b8 pc=0x40a135
runtime.hashGrow(0x191cde0, 0xc4203ba4e0)
	/usr/local/go/src/runtime/hashmap.go:951 +0xa3 fp=0xc420561760 sp=0xc420561700 pc=0x40a2f3
runtime.mapassign_fast32(0x191cde0, 0xc4203ba4e0, 0x565eac9, 0xc7a74fa868)
	/usr/local/go/src/runtime/hashmap_fast.go:485 +0x1b2 fp=0xc4205617c0 sp=0xc420561760 pc=0x40c562
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*indexReader).readSymbols(0xc42019e380, 0x5, 0x0, 0x0)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/index.go:662 +0x356 fp=0xc420561980 sp=0xc4205617c0 pc=0x1599b56
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.newIndexReader(0xc420015360, 0x1f, 0x288d880, 0xc4203fa5a0, 0xc42035b900)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/index.go:611 +0x1d3 fp=0xc420561a20 sp=0xc420561980 pc=0x1598ee3
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.OpenBlock(0xc420015360, 0x1f, 0x288d880, 0xc4203fa5a0, 0x0, 0xc4201da200, 0x20)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/block.go:163 +0xea fp=0xc420561aa8 sp=0xc420561a20 pc=0x157a09a
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*DB).reload(0xc42008add0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/db.go:477 +0x381 fp=0xc420561cf8 sp=0xc420561aa8 pc=0x1587ad1
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.Open(0x1bada03, 0x5, 0x2882400, 0xc4203fe0c0, 0x2893700, 0xc420050480, 0xc4203fe300, 0xc4203fe0c0, 0xc4202ba100, 0x0)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/db.go:215 +0x4d1 fp=0xc420561e68 sp=0xc420561cf8 pc=0x1585641
github.com/prometheus/prometheus/storage/tsdb.Open(0x1bada03, 0x5, 0x2882400, 0xc4203fe0c0, 0x2893700, 0xc420050480, 0xc4205f8168, 0x0, 0x0, 0x0)
	/go/src/github.com/prometheus/prometheus/storage/tsdb/tsdb.go:143 +0x2a2 fp=0xc420561ef0 sp=0xc420561e68 pc=0x15b7652
main.main.func3(0xc42008e720, 0x2882400, 0xc420396330, 0xc4205f8000, 0xc42042a440)
	/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:316 +0x222 fp=0xc420561fb8 sp=0xc420561ef0 pc=0x16e9162
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2337 +0x1 fp=0xc420561fc0 sp=0xc420561fb8 pc=0x45cb31
created by main.main
	/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:311 +0x3ea8

goroutine 1 [chan receive, 3 minutes]:
main.main()
	/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:353 +0x4154

goroutine 4 [syscall, 3 minutes]:
os/signal.signal_recv(0x0)
	/usr/local/go/src/runtime/sigqueue.go:131 +0xa6
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:22 +0x22
created by os/signal.init.0
	/usr/local/go/src/os/signal/signal_unix.go:28 +0x41

goroutine 19 [chan receive]:
github.com/prometheus/prometheus/vendor/github.com/golang/glog.(*loggingT).flushDaemon(0x2968240)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/golang/glog/glog.go:879 +0x9f
created by github.com/prometheus/prometheus/vendor/github.com/golang/glog.init.0
	/go/src/github.com/prometheus/prometheus/vendor/github.com/golang/glog/glog.go:410 +0x203

goroutine 250 [chan receive, 3 minutes]:
github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.muxListener.Accept(...)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/cmux.go:184
github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.(*muxListener).Accept(0xc4203beae0, 0x1c377e8, 0xc4205e59e0, 0x289b500, 0xc4203beae0)
	<autogenerated>:1 +0x65
github.com/prometheus/prometheus/vendor/google.golang.org/grpc.(*Server).Serve(0xc4205e59e0, 0x289b500, 0xc4203beae0, 0x0, 0x0)
	/go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/server.go:432 +0x189
github.com/prometheus/prometheus/web.(*Handler).Run.func6(0xc4205e59e0, 0x289b500, 0xc4203beae0, 0xc420432800)
	/go/src/github.com/prometheus/prometheus/web/web.go:458 +0x46
created by github.com/prometheus/prometheus/web.(*Handler).Run
	/go/src/github.com/prometheus/prometheus/web/web.go:457 +0xda5

goroutine 91 [select, 3 minutes, locked to thread]:
runtime.gopark(0x1c3ba30, 0x0, 0x1baf3f7, 0x6, 0x18, 0x1)
	/usr/local/go/src/runtime/proc.go:287 +0x12c
runtime.selectgo(0xc42046df50, 0xc42008e360)
	/usr/local/go/src/runtime/select.go:395 +0x1149
runtime.ensureSigM.func1()
	/usr/local/go/src/runtime/signal_unix.go:511 +0x220
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:2337 +0x1

goroutine 167 [chan receive, 3 minutes]:
main.main.func2(0xc42008e240, 0xc42008e180, 0xc4205f8000, 0x2882400, 0xc420396330, 0xc4206a8e60, 0x5, 0x5, 0xc420432800)
	/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:289 +0x4c
created by main.main
	/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:288 +0x3e36

goroutine 171 [select, 3 minutes]:
github.com/prometheus/prometheus/notifier.(*Notifier).Run(0xc4201b62d0)
	/go/src/github.com/prometheus/prometheus/notifier/notifier.go:309 +0x100
created by main.main
	/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:337 +0x3fe3

goroutine 249 [chan receive, 3 minutes]:
github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.muxListener.Accept(...)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/cmux.go:184
github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.(*muxListener).Accept(0xc4203bedc0, 0xc42007a010, 0x18f3440, 0x294f710, 0x1b6abc0)
	<autogenerated>:1 +0x65
net/http.(*Server).Serve(0xc42014a000, 0x289b500, 0xc4203bedc0, 0x0, 0x0)
	/usr/local/go/src/net/http/server.go:2695 +0x1b2
github.com/prometheus/prometheus/web.(*Handler).Run.func5(0xc42014a000, 0x289b500, 0xc4203bedc0, 0xc420432800)
	/go/src/github.com/prometheus/prometheus/web/web.go:453 +0x46
created by github.com/prometheus/prometheus/web.(*Handler).Run
	/go/src/github.com/prometheus/prometheus/web/web.go:452 +0xd59

goroutine 174 [IO wait, 3 minutes]:
internal/poll.runtime_pollWait(0x7f45ef10ff70, 0x72, 0xffffffffffffffff)
	/usr/local/go/src/runtime/netpoll.go:173 +0x57
internal/poll.(*pollDesc).wait(0xc4204c0098, 0x72, 0xc420a51a00, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:85 +0xae
internal/poll.(*pollDesc).waitRead(0xc4204c0098, 0xffffffffffffff00, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:90 +0x3d
internal/poll.(*FD).Accept(0xc4204c0080, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:335 +0x1e2
net.(*netFD).accept(0xc4204c0080, 0x1, 0x28a2260, 0xc42008e838)
	/usr/local/go/src/net/fd_unix.go:238 +0x42
net.(*TCPListener).accept(0xc42000c590, 0x403a73, 0xc42008e7e0, 0xc420a51cc8)
	/usr/local/go/src/net/tcpsock_posix.go:136 +0x2e
net.(*TCPListener).Accept(0xc42000c590, 0xc420a51cc8, 0xc420374480, 0x1a33fa0, 0x1a33fa0)
	/usr/local/go/src/net/tcpsock.go:247 +0x49
github.com/prometheus/prometheus/vendor/golang.org/x/net/netutil.(*limitListener).Accept(0xc4203bea00, 0xc420a51d20, 0x16a3f1a, 0x4592e0, 0xc420a51d60)
	/go/src/github.com/prometheus/prometheus/vendor/golang.org/x/net/netutil/listen.go:30 +0x53
github.com/prometheus/prometheus/vendor/github.com/mwitkow/go-conntrack.(*connTrackListener).Accept(0xc4203beac0, 0x1c34c30, 0xc4206a1340, 0x28a3640, 0xc4203894d0)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/mwitkow/go-conntrack/listener_wrapper.go:86 +0x37
github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.(*cMux).Serve(0xc4206a1340, 0x0, 0x0)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/cmux.go:124 +0x95
github.com/prometheus/prometheus/web.(*Handler).Run(0xc420432800, 0x289cb40, 0xc4206a0280, 0x0, 0x0)
	/go/src/github.com/prometheus/prometheus/web/web.go:463 +0xdbc
main.main.func4(0xc42008e780, 0xc420432800, 0x289cb40, 0xc4206a0280)
	/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:351 +0x3f
created by main.main
	/go/src/github.com/prometheus/prometheus/cmd/prometheus/main.go:351 +0x413a

goroutine 48 [select]:
github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.(*SegmentWAL).run(0xc42041a000, 0x2540be400)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal.go:704 +0x3ee
created by github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb.OpenSegmentWAL
	/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/tsdb/wal.go:244 +0x7c0

goroutine 177 [chan receive (nil chan), 3 minutes]:
github.com/prometheus/prometheus/prompb.RegisterAdminHandlerFromEndpoint.func1.1(0x7f45ef110030, 0xc42007a010, 0xc4200cfa00, 0xc42007a9c0, 0x9)
	/go/src/github.com/prometheus/prometheus/prompb/rpc.pb.gw.go:67 +0x4c
created by github.com/prometheus/prometheus/prompb.RegisterAdminHandlerFromEndpoint.func1
	/go/src/github.com/prometheus/prometheus/prompb/rpc.pb.gw.go:66 +0x1b7

goroutine 176 [select, 3 minutes]:
github.com/prometheus/prometheus/vendor/google.golang.org/grpc.(*addrConn).transportMonitor(0xc4200cfba0)
	/go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/clientconn.go:908 +0x1de
github.com/prometheus/prometheus/vendor/google.golang.org/grpc.(*ClientConn).resetAddrConn.func1(0xc4200cfba0)
	/go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/clientconn.go:637 +0x1c8
created by github.com/prometheus/prometheus/vendor/google.golang.org/grpc.(*ClientConn).resetAddrConn
	/go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/clientconn.go:628 +0x749

goroutine 225 [IO wait, 3 minutes]:
internal/poll.runtime_pollWait(0x7f45ef10feb0, 0x72, 0x0)
	/usr/local/go/src/runtime/netpoll.go:173 +0x57
internal/poll.(*pollDesc).wait(0xc4203ec618, 0x72, 0xffffffffffffff00, 0x288bbc0, 0x287f5b8)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:85 +0xae
internal/poll.(*pollDesc).waitRead(0xc4203ec618, 0xc4209c8000, 0x8000, 0x8000)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:90 +0x3d
internal/poll.(*FD).Read(0xc4203ec600, 0xc4209c8000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:126 +0x18a
net.(*netFD).Read(0xc4203ec600, 0xc4209c8000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/fd_unix.go:202 +0x52
net.(*conn).Read(0xc42013e038, 0xc4209c8000, 0x8000, 0x8000, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/net.go:176 +0x6d
bufio.(*Reader).Read(0xc4209a0060, 0xc4209d8038, 0x9, 0x9, 0x0, 0x0, 0x0)
	/usr/local/go/src/bufio/bufio.go:213 +0x30b
io.ReadAtLeast(0x2881080, 0xc4209a0060, 0xc4209d8038, 0x9, 0x9, 0x9, 0x0, 0x4119e6, 0xc4209e2da0)
	/usr/local/go/src/io/io.go:309 +0x86
io.ReadFull(0x2881080, 0xc4209a0060, 0xc4209d8038, 0x9, 0x9, 0xc4209d8028, 0x0, 0x0)
	/usr/local/go/src/io/io.go:327 +0x58
github.com/prometheus/prometheus/vendor/golang.org/x/net/http2.readFrameHeader(0xc4209d8038, 0x9, 0x9, 0x2881080, 0xc4209a0060, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/prometheus/prometheus/vendor/golang.org/x/net/http2/frame.go:237 +0x7b
github.com/prometheus/prometheus/vendor/golang.org/x/net/http2.(*Framer).ReadFrame(0xc4209d8000, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/prometheus/prometheus/vendor/golang.org/x/net/http2/frame.go:492 +0xa4
github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport.(*framer).readFrame(0xc4209b6060, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport/http_util.go:608 +0x2f
github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport.(*http2Client).reader(0xc4209dc000)
	/go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport/http2_client.go:1080 +0x47
created by github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport.newHTTP2Client
	/go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport/http2_client.go:267 +0xbe4

goroutine 226 [select, 3 minutes]:
github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport.(*http2Client).controller(0xc4209dc000)
	/go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport/http2_client.go:1168 +0x142
created by github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport.newHTTP2Client
	/go/src/github.com/prometheus/prometheus/vendor/google.golang.org/grpc/transport/http2_client.go:297 +0xd1a

goroutine 251 [IO wait, 3 minutes]:
internal/poll.runtime_pollWait(0x7f45ef10fdf0, 0x72, 0x0)
	/usr/local/go/src/runtime/netpoll.go:173 +0x57
internal/poll.(*pollDesc).wait(0xc420376298, 0x72, 0xffffffffffffff00, 0x288bbc0, 0x287f5b8)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:85 +0xae
internal/poll.(*pollDesc).waitRead(0xc420376298, 0xc4209d8100, 0x9, 0x9)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:90 +0x3d
internal/poll.(*FD).Read(0xc420376280, 0xc4209d81f8, 0x9, 0x9, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:126 +0x18a
net.(*netFD).Read(0xc420376280, 0xc4209d81f8, 0x9, 0x9, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/fd_unix.go:202 +0x52
net.(*conn).Read(0xc42013e230, 0xc4209d81f8, 0x9, 0x9, 0x0, 0x0, 0x0)
	/usr/local/go/src/net/net.go:176 +0x6d
github.com/prometheus/prometheus/vendor/golang.org/x/net/netutil.(*limitListenerConn).Read(0xc4203894a0, 0xc4209d81f8, 0x9, 0x9, 0xc420420720, 0x4, 0x4)
	<autogenerated>:1 +0x5a
github.com/prometheus/prometheus/vendor/github.com/mwitkow/go-conntrack.(*serverConnTracker).Read(0xc4203894d0, 0xc4209d81f8, 0x9, 0x9, 0x400000801, 0x80100000000, 0x4)
	<autogenerated>:1 +0x5a
github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.(*bufferedReader).Read(0xc42026ab18, 0xc4209d81f8, 0x9, 0x9, 0x0, 0x10, 0xc420420750)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/buffer.go:42 +0x123
io.ReadAtLeast(0x2882280, 0xc42026ab18, 0xc4209d81f8, 0x9, 0x9, 0x9, 0x411118, 0x10, 0x1a42760)
	/usr/local/go/src/io/io.go:309 +0x86
io.ReadFull(0x2882280, 0xc42026ab18, 0xc4209d81f8, 0x9, 0x9, 0xb216b66b5e2cd701, 0xefff100000004, 0x4)
	/usr/local/go/src/io/io.go:327 +0x58
github.com/prometheus/prometheus/vendor/golang.org/x/net/http2.readFrameHeader(0xc4209d81f8, 0x9, 0x9, 0x2882280, 0xc42026ab18, 0x0, 0x0, 0xc420420750, 0x0)
	/go/src/github.com/prometheus/prometheus/vendor/golang.org/x/net/http2/frame.go:237 +0x7b
github.com/prometheus/prometheus/vendor/golang.org/x/net/http2.(*Framer).ReadFrame(0xc4209d81c0, 0x288dd00, 0xc420420750, 0x0, 0x0)
	/go/src/github.com/prometheus/prometheus/vendor/golang.org/x/net/http2/frame.go:492 +0xa4
github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.matchHTTP2Field(0x2882280, 0xc42026ab18, 0x1bb63d3, 0xc, 0x1bbb24b, 0x10, 0x7f45ef1211c0)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/matchers.go:145 +0x14e
github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.HTTP2HeaderField.func1(0x2882280, 0xc42026ab18, 0xc4203894d0)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/matchers.go:111 +0x59
github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.(*cMux).serve(0xc4206a1340, 0x28a3640, 0xc4203894d0, 0xc42008e840, 0xc420420670)
	/go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/cmux.go:143 +0x228
created by github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux.(*cMux).Serve
	/go/src/github.com/prometheus/prometheus/vendor/github.com/cockroachdb/cmux/cmux.go:133 +0x16a
level=info ts=2018-02-01T10:17:10.115235174Z caller=main.go:215 msg="Starting Prometheus" version="(version=2.0.0, branch=HEAD, revision=0a74f98628a0463dddc90528220c94de5032d1a0)"
level=info ts=2018-02-01T10:17:10.115404678Z caller=main.go:216 build_context="(go=go1.9.2, user=root@615b82cb36b6, date=20171108-07:11:59)"
level=info ts=2018-02-01T10:17:10.115559884Z caller=main.go:217 host_details="(Linux 4.15.0-041500rc3-lowlatency #201712110230 SMP PREEMPT Mon Dec 11 02:33:33 UTC 2017 x86_64 d4ee3ece019c (none))"
level=info ts=2018-02-01T10:17:10.11830406Z caller=web.go:380 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=info ts=2018-02-01T10:17:10.118357301Z caller=targetmanager.go:71 component="target manager" msg="Starting target manager..."
level=info ts=2018-02-01T10:17:10.118435332Z caller=main.go:314 msg="Starting TSDB"

@brian-brazil
Contributor

Prometheus is not an event logging system, and attempting to use it that way will result in issues as you have discovered. I'd suggest looking at something like the ELK stack.


extraaa2 commented Feb 1, 2018

Hello,

Thank you for the fast response. I understand that our time series are not continuous, but I would like to use Prometheus's time series capabilities rather than the ELK stack.
Do you think the issue is caused by the large cardinality of label key-value pairs for a single metric, or by the speed of the storage hardware?

@brian-brazil
Contributor

This isn't going to work out; Prometheus simply is not designed for this use case. Initial estimates indicate that you'd need at least 200GB of RAM to support this, and even then performance is likely to be poor.


extraaa2 commented Feb 1, 2018

How is the 200GB of RAM calculated? I am asking so we can figure out whether there is something else we could do to still use Prometheus.
I have only found how RAM is approximated for previous versions of Prometheus, not for 2.0.

@brian-brazil
Contributor

I just did a quick back-of-the-envelope calculation. Prometheus is not intended for such high cardinalities of short-lived data.
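
For context, one way such a back-of-the-envelope figure can be reached (a rough sketch only, assuming on the order of a few kilobytes of memory per active head series; the real per-series cost varies with label sizes and churn and is not precisely documented for 2.0):

50,000,000 series × ~4 KB per active series ≈ 200 GB of RAM

and that is before any query, compaction, or index overhead.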

@brian-brazil
Contributor

This is expected behaviour.


lock bot commented Mar 22, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited conversation to collaborators Mar 22, 2019