Crash recovery uses too much memory compared to target-heap-size #3038

Closed
JorritSalverda opened this issue Aug 8, 2017 · 7 comments

@JorritSalverda (Contributor)

What did you do?

I restarted Prometheus with the default heap size and the 50% memory headroom suggested by the documentation, using this flag:

-storage.local.target-heap-size=2147483648

And the following values for the Kubernetes pod:

resources.requests.memory: 3072Mi
resources.limits.memory: 3072Mi
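
For context, this is roughly how that flag and the memory settings fit together in the container spec (a minimal sketch, not the actual manifest; the config-file path is an assumption):

    containers:
      - name: prometheus
        image: prom/prometheus:v1.7.1
        args:
          - -config.file=/etc/prometheus/prometheus.yml  # assumed mount path for the config
          - -storage.local.target-heap-size=2147483648   # 2GiB target heap (the default)
        resources:
          requests:
            memory: 3072Mi   # 2GiB target heap + ~50% headroom
          limits:
            memory: 3072Mi

In other words, 2147483648 bytes is 2GiB, and the 3072Mi request/limit is that heap size plus the suggested ~50% headroom.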

What did you expect to see?

The memory usage to stay within reasonable limits so the Prometheus pod can recover.

What did you see instead? Under which circumstances?

When Prometheus starts and runs crash recovery, I get errors like the following, and Kubernetes kills the pod with OOMKilled as the reason:

level=error msg="Error updating archived time range for fingerprint 00d5856f9d626ba1: write/prometheus/archived_fingerprint_to_timerange/001392.log: cannot allocate memory" source="storage.go:1791
level=error msg="Error dropping persisted chunks: read /prometheus/00/d5f1e1c2dd8195.db: cannot allocate memory" source="storage.go:1779
level=error msg="Error updating archived time range for fingerprint 00d5f1e1c2dd8195: write /prometheus/archived_fingerprint_to_timerange/001392.log: cannot allocate memory" source="storage.go:1791

This happens even with a memory limit of 4x the target heap size (8192Mi in my case).

Environment

The official Docker container prom/prometheus:v1.7.1 running on GKE version 1.7.2 with a 200Gi pd-ssd persistent volume, and the following CPU resources for the pod:

resources.requests.cpu: 1500m
resources.limits.cpu: 5500m
  • System information:
Linux 4.4.52+ x86_64
  • Prometheus version:
prometheus, version 1.7.1 (branch: master, revision: 3afb3fffa3a29c3de865e1172fb740442e9d0133)
  build user:       root@0aa1b7fc430d
  build date:       20170612-11:44:05
  go version:       go1.8.3
  • Prometheus configuration file:

The default config as shown at https://github.com/prometheus/prometheus/blob/master/documentation/examples/prometheus-kubernetes.yml, with the only difference being:

    global:
      scrape_interval: 10s
      evaluation_interval: 10s
  • Logs:

When the pod finally starts after a number of restarts, it logs the following, which might give an indication as to how many time ranges, metrics, etc. are processed during recovery.

time="2017-08-08T12:54:31Z" level=info msg="680000 archived time ranges checked." source="crashrecovery.go:480"
time="2017-08-08T12:54:31Z" level=info msg="Clean-up of archive indexes complete." source="crashrecovery.go:504"
time="2017-08-08T12:54:31Z" level=info msg="Rebuilding label indexes." source="crashrecovery.go:512"
time="2017-08-08T12:54:31Z" level=info msg="Indexing metrics in memory." source="crashrecovery.go:513"
time="2017-08-08T12:54:31Z" level=info msg="10000 metrics queued for indexing." source="crashrecovery.go:518"
...
time="2017-08-08T12:54:45Z" level=info msg="1090000 metrics queued for indexing." source="crashrecovery.go:518"
time="2017-08-08T12:54:45Z" level=info msg="Indexing archived metrics." source="crashrecovery.go:521"
time="2017-08-08T12:54:45Z" level=info msg="1100000 metrics queued for indexing." source="crashrecovery.go:534"
...
time="2017-08-08T12:55:02Z" level=info msg="1770000 metrics queued for indexing." source="crashrecovery.go:534"
time="2017-08-08T12:55:02Z" level=info msg="All requests for rebuilding the label indexes queued. (Actual processing may lag behind.)" source="crashrecovery.go:540"
time="2017-08-08T12:55:02Z" level=warning msg="Crash recovery complete." source="crashrecovery.go:153"
time="2017-08-08T12:55:03Z" level=info msg="1094563 series loaded." source="storage.go:439"
time="2017-08-08T12:55:03Z" level=info msg="Starting target manager..." source="targetmanager.go:63"
time="2017-08-08T12:55:03Z" level=info msg="Listening on :9090" source="web.go:259"
time="2017-08-08T12:55:03Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:104"
time="2017-08-08T12:55:03Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:104"
time="2017-08-08T12:55:03Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:104"
time="2017-08-08T12:55:03Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:104"
time="2017-08-08T12:55:03Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:104"
time="2017-08-08T12:55:03Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:104"
time="2017-08-08T12:55:03Z" level=info msg="Using pod service account via in-cluster config" source="kubernetes.go:104"
time="2017-08-08T12:55:04Z" level=warning msg="Storage has entered rushed mode." chunksToPersist=870434 memoryChunks=1648511 source="storage.go:1867" urgencyScore=1
time="2017-08-08T12:55:04Z" level=error msg="Storage needs throttling. Scrapes and rule evaluations will be skipped." chunksToPersist=870468 memoryChunks=1648547 source="storage.go:1007" urgencyScore=1
time="2017-08-08T13:00:03Z" level=info msg="Checkpointing in-memory metrics and chunks..." source="persistence.go:633"
time="2017-08-08T13:00:31Z" level=info msg="Done checkpointing in-memory metrics and chunks in 28.605482826s." source="persistence.go:665"
@beorn7 (Member) commented Aug 8, 2017

Please check out #2139. Bottom line: the sometimes excessive RAM usage during crash recovery comes from LevelDB, which blows up while re-indexing everything (as required by crash recovery). LevelDB won't be used anymore in Prometheus 2.x, so there is very little incentive to fix those issues. Sorry for that…

beorn7 closed this as completed Aug 8, 2017
@JorritSalverda (Contributor, Author)

@beorn7 Thanks for answering so quickly. It makes sense to make the best use of your time and not address this. I'll manage in the meantime. Looking forward to version 2.

FYI, I run Prometheus in an HA setup on GKE using preemptible nodes, so the pods get relocated frequently - at least every 24 hours - which makes this startup process pretty crucial.
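
For anyone curious about the HA part: it's essentially the usual pattern of running identical, independent Prometheus servers side by side, spread across nodes so a single preemption never takes both down at once. A rough sketch of that idea (illustrative only; names, API version, and the anti-affinity details are placeholders, not my actual manifests):

    # Sketch only: an identical "prometheus-b" Deployment runs alongside this one,
    # each with its own persistent disk and the flags/resources shown above.
    apiVersion: apps/v1beta1
    kind: Deployment
    metadata:
      name: prometheus-a
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: prometheus
        spec:
          affinity:
            podAntiAffinity:
              # Force the two replicas onto different nodes, so one node
              # preemption never takes down both Prometheus servers.
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchLabels:
                      app: prometheus
                  topologyKey: kubernetes.io/hostname
          containers:
            - name: prometheus
              image: prom/prometheus:v1.7.1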

@ahume commented Nov 22, 2017

@JorritSalverda I'd be really interested to know more about your configuration for HA Prometheus on GKE. Is there anything you can share? Or any advice?

@brian-brazil (Contributor)

@ahume Please don't ask unrelated questions on issues.

@JorritSalverda (Contributor, Author)

In version 2 this is no longer an issue, so I think we can close this ticket if everyone's okay with that.

@beorn7 (Member) commented Nov 23, 2017

This issue was closed long ago.

lock bot commented Mar 23, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

lock bot locked and limited the conversation to collaborators Mar 23, 2019