
Compaction memory requirements are too high #4110

Closed
SuperQ opened this Issue Apr 24, 2018 · 12 comments

@SuperQ
Member

SuperQ commented Apr 24, 2018

Proposal

Reduce the default --storage.tsdb.max-block-duration from 10% of the retention period to 2% or 1%.

Bug Report

When using a long retention time on a high-traffic Prometheus server, the default 10% maximum block duration causes a number of users to get into crash loops. This appears to be due to the amount of memory used during compaction.

For example, consider a server with a 365d retention time. It normally operates at ~20G RSS and needs an additional 6G of memory to perform a 2% compaction of ~60G of TSDB. A full 10% compaction could need far more memory, causing the server to OOM.

There's also no need to generate such large blocks: with a 2% max block duration ratio, we would have under 60 TSDB blocks over a year (see the sketch below).
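A quick back-of-envelope sketch of how the block count scales with the max block duration ratio (plain arithmetic in Go, not tsdb code):

```go
package main

import "fmt"

func main() {
	const hoursPerDay = 24
	retention := 365 * hoursPerDay // 365d retention, in hours

	for _, ratio := range []float64{0.10, 0.02, 0.01} {
		maxBlock := float64(retention) * ratio // max block duration at this ratio
		blocks := float64(retention) / maxBlock // blocks needed to cover retention at max size
		fmt.Printf("ratio %.0f%%: max block ≈ %.1fd, ≈ %.0f blocks\n",
			ratio*100, maxBlock/hoursPerDay, blocks)
	}
}
```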

@SuperQ

Member Author

SuperQ commented Apr 24, 2018

Some additional information on an existing TSDB compaction.

This is for a level 5 compaction covering 6.75 days (--storage.tsdb.retention=365d, --storage.tsdb.max-block-duration=7d).

"stats": {
  "numSamples": 57293773416,
  "numSeries": 3522226,
  "numChunks": 485794168
},

The churn on this server is somewhat low, but it has a large ingestion rate and a lot of metrics.

This compaction has a 3.6GB index and 62GB of chunks.
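Some derived ratios from those stats may be useful context (plain arithmetic, not tsdb internals):

```go
package main

import "fmt"

func main() {
	// Figures taken from the compaction stats above.
	const (
		numSamples = 57293773416
		numSeries  = 3522226
		numChunks  = 485794168
		chunkBytes = 62e9  // ~62GB of chunks
		indexBytes = 3.6e9 // ~3.6GB index
	)

	fmt.Printf("samples per series: %.0f\n", float64(numSamples)/numSeries)
	fmt.Printf("chunks per series:  %.0f\n", float64(numChunks)/numSeries)
	fmt.Printf("samples per chunk:  %.0f\n", float64(numSamples)/numChunks)
	fmt.Printf("bytes per sample:   %.2f\n", chunkBytes/numSamples)
	fmt.Printf("index bytes/series: %.0f\n", indexBytes/numSeries)
}
```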

@brian-brazil

Member

brian-brazil commented Jun 13, 2018

Did you ever get the heap profiles when the compaction was happening?
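For anyone following along, one way to snapshot a heap profile while a compaction is running is to fetch the standard pprof endpoint (assuming the default web port; stock Prometheus builds expose /debug/pprof) and inspect the result with go tool pprof. A minimal sketch:

```go
// Minimal sketch: save Prometheus' heap profile to a timestamped file.
// Assumes Prometheus listens on localhost:9090 and exposes the standard
// net/http/pprof endpoints.
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

func main() {
	out, err := os.Create(fmt.Sprintf("heap-%d.pb.gz", time.Now().Unix()))
	if err != nil {
		panic(err)
	}
	defer out.Close()

	resp, err := http.Get("http://localhost:9090/debug/pprof/heap")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	if _, err := io.Copy(out, resp.Body); err != nil {
		panic(err)
	}
	// Inspect with: go tool pprof heap-<timestamp>.pb.gz
}
```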

@SuperQ

Member Author

SuperQ commented Jun 13, 2018

No, I haven't gotten to that yet, thanks for the reminder.

brian-brazil changed the title from "tsdb max block duration is too large" to "Compaction memory requirements are too high" on Jun 13, 2018

@brian-brazil

Member

brian-brazil commented Jun 14, 2018

In addition, could I get all the Go memory metrics during the compaction spike?
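A quick way to capture just those metrics during the spike, assuming the default /metrics endpoint on :9090:

```go
// Minimal sketch: dump the go_memstats_* metrics and process RSS from a
// running Prometheus by scraping its own /metrics endpoint.
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	resp, err := http.Get("http://localhost:9090/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		line := sc.Text()
		if strings.HasPrefix(line, "go_memstats_") ||
			strings.HasPrefix(line, "process_resident_memory_bytes") {
			fmt.Println(line)
		}
	}
	if err := sc.Err(); err != nil {
		panic(err)
	}
}
```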

@tcolgate

Contributor

tcolgate commented Jun 15, 2018

Does an msync actually reclaim space allocated to dirty pages? Could periodic msyncs reduce the usage? (I tested it; it doesn't seem to. RSS only drops once you unmap.)

On a more practical note, can storage.tsdb.max-block-duration be set on a pre-existing Prometheus instance? It seems like reducing the block size could reduce the memory used here.

@krasi-georgiev

Member

krasi-georgiev commented Nov 9, 2018

Maybe some streaming during compaction could solve this.

Instead of expanding all blocks at once, do it in stages.
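A toy illustration of that streaming idea (not the actual tsdb compactor): k-way merge already-sorted per-block sample streams with a small heap, so only one sample per input block is resident at a time.

```go
// Toy illustration of streaming compaction: merge sorted sample streams
// with a heap instead of expanding every block up front.
package main

import (
	"container/heap"
	"fmt"
)

type sample struct{ t, v int64 }

type cursor struct {
	samples []sample // stands in for one block's sorted samples
	pos     int
}

type mergeHeap []*cursor

func (h mergeHeap) Len() int           { return len(h) }
func (h mergeHeap) Less(i, j int) bool { return h[i].samples[h[i].pos].t < h[j].samples[h[j].pos].t }
func (h mergeHeap) Swap(i, j int)      { h[i], h[j] = h[j], h[i] }

func (h *mergeHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
func (h *mergeHeap) Pop() interface{} {
	old := *h
	n := len(old)
	c := old[n-1]
	*h = old[:n-1]
	return c
}

func main() {
	blocks := [][]sample{
		{{1, 10}, {4, 40}, {7, 70}},
		{{2, 20}, {5, 50}},
		{{3, 30}, {6, 60}},
	}

	h := &mergeHeap{}
	for i := range blocks {
		heap.Push(h, &cursor{samples: blocks[i]})
	}

	// Emit samples in timestamp order, advancing one cursor at a time.
	for h.Len() > 0 {
		c := (*h)[0]
		s := c.samples[c.pos]
		fmt.Printf("t=%d v=%d\n", s.t, s.v)
		c.pos++
		if c.pos == len(c.samples) {
			heap.Pop(h) // this cursor is exhausted
		} else {
			heap.Fix(h, 0) // restore heap order after advancing
		}
	}
}
```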

@brian-brazil

Member

brian-brazil commented Nov 9, 2018

With the PRs I've sent out, it's already as efficient as it's going to get. It seems that @SuperQ's issue was likely cardinality, which my PRs should address.

@krasi-georgiev

Member

krasi-georgiev commented Nov 9, 2018

Looking at the bench results for #4816, the worst memory jump I found is from 30 to 40GB for a week-long run with the default settings, which doesn't seem so bad.

http://prombench.prometheus.io/grafana/d/7gmLoNDmz/prombench?orgId=1&var-RuleGroup=All&var-pr-number=4816&from=1541709439128&to=1541714031412

@SuperQ do you see any bigger jumps?

@SuperQ

Member Author

SuperQ commented Nov 9, 2018

@krasi-georgiev The issues we saw were with the default 10% max block size when compacting long retention times (365d). We tune our max block duration to 7d, which helps. Once we have Brian's memory use optimizations in a release, I can do some re-testing.

@spjspjspj


spjspjspj commented Dec 5, 2018

Why would the entire block need to be loaded into memory? We also observe high memory use (130GB) upon a (probably unclean) restart.

Edit: this is actually unrelated to this issue. What we experienced is, I believe, #4842.

On our dashboard, compaction seemingly happens once every hour and doesn't take much time or memory:

[dashboard screenshot]

retention 180d
min-block 1h
max-block 1d

bandesz pushed a commit to alphagov/paas-cf that referenced this issue Dec 10, 2018

Change Prometheus max-block-duration to 7d
Some Prometheus users reported high CPU and memory usage if high retention
periods are used (e.g. >1 year), and the max block duration is not set (which
defaults to 10% of the retention period).

According to [1] we should have a max block duration around 1-2% of the
retention period, therefore we set it to 7 days.

[1] prometheus/prometheus#4110

alext added a commit to alphagov/paas-cf that referenced this issue Dec 11, 2018

@SuperQ

Member Author

SuperQ commented Jan 10, 2019

Here's a comparison of 2.5.0 to 2.6.0:

[graphs: compaction memory for 2.5.0 and 2.6.0]

Looking at the data, it seems we still see about a 20-25% spike in RSS size when two compactions are run closely together.

Oddly, go_memstats_alloc_bytes doesn't seem to reflect this usage.

Perhaps we need to add some sleep time to the scheduling of compactions to avoid this?
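A purely hypothetical sketch of what spacing out compaction runs could look like; the runOneCompaction stub and the pause duration are invented for illustration and are not how Prometheus schedules compactions:

```go
// Hypothetical: pause between consecutive compaction runs so their memory
// spikes don't overlap. Not Prometheus code.
package main

import (
	"log"
	"time"
)

// runOneCompaction stands in for planning and executing a single
// compaction; it returns true while more compaction work is queued.
func runOneCompaction() bool {
	return false // stub
}

func main() {
	const pauseBetweenRuns = 5 * time.Minute

	for runOneCompaction() {
		log.Printf("compaction finished, sleeping %s before the next run", pauseBetweenRuns)
		time.Sleep(pauseBetweenRuns)
	}
}
```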

@SuperQ

Member Author

SuperQ commented Jan 10, 2019

So, after some discussion with @brian-brazil, I think this is solved. Figuring out the RSS issue when compactions run is a separate problem.

@SuperQ SuperQ closed this Jan 10, 2019
