
possible Issues(?) bumping etcd 8 GB upper limit (or not!) #15354

Closed
dims opened this issue Feb 24, 2023 · 7 comments

dims commented Feb 24, 2023

What would you like to be added?

Last night I took a dive into the 8/10 GB limits in etcd to see why we recommend what we recommend.

First order of business was current state:

Configurable from command line:

Then saw the following being logged:

Issues:

Also a couple of discussions:

Interesting tidbit from a Google Groups discussion:

we can recover 2GB of data within 20 seconds on good hardware. We cannot do that for 1TB data due to today's hardware limitation

Here's where we added 10% headroom:

Interesting tidbit from the above is the "Mechanism for permitting mmaps > 10GB."

There are a few references to things > 10 GB, for example:

Our own @ahrtr says, "If your VM has big memory (e.g. 64GB), it's OK to set a value > 8GB," in https://etcd.io/blog/2023/how_to_debug_large_db_size_issue/

The TL;DR is that back in December 2016, we took the MTTR as a guide, given the then-current state of hardware, to come up with self-imposed limits for the different values, which we then cargo-culted into hard limits. Also, the lower limit of 2GB was to support 32-bit architectures on Windows and ARM.
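
As a back-of-envelope illustration of the MTTR reasoning, here is a tiny Go sketch. The throughput figure is an assumption chosen to match the "2GB in 20 seconds on good hardware" quote above, not a measured etcd number:

```go
package main

import "fmt"

// recoverySeconds estimates time to recover a snapshot of the given size
// at a given sustained throughput. Purely illustrative; the 100 MiB/s
// figure below is an assumption, not etcd benchmark data.
func recoverySeconds(sizeGiB, throughputMiBPerSec float64) float64 {
	return sizeGiB * 1024 / throughputMiBPerSec
}

func main() {
	// ~2 GiB at ~100 MiB/s: roughly the "2GB in 20 seconds" figure.
	fmt.Printf("2 GiB:  %.1fs\n", recoverySeconds(2, 100))
	// Scaling the same assumption to 1 TiB shows why it was ruled out.
	fmt.Printf("1 TiB:  %.0fs\n", recoverySeconds(1024, 100))
}
```

Under those assumptions, 2 GiB recovers in about 20 seconds while 1 TiB would take close to three hours, which matches the rationale quoted from the Google Groups discussion.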

Is this right? Anything else I am missing?


dims commented Feb 24, 2023

I think I am trying to convince myself that it's OK to bump quota-backend-bytes :)


ahrtr commented Feb 24, 2023

Thanks @dims for raising this topic.

We need performance/benchmark data to support the decision.

I may spend some time optimizing boltDB over the next couple of weeks/months; see etcd-io/bbolt#401. For now, I am struggling to download the huge boltDB file (about 100GiB) from Google Drive.


ahrtr commented Mar 2, 2023

I did some benchmark tests on 16GB boltDB files, and the performance looks pretty good, so bbolt isn't a blocker for bumping quota-backend-bytes to a value <= 16GB.
See etcd-io/bbolt#401 (comment).

The only concern is snapshot transport and restoring from the snapshot on a slow follower. We don't have such data yet.


ahrtr commented Aug 8, 2023

Note that the 8GB MaxQuotaBytes isn't a hard limit. Users can still set --quota-backend-bytes to a value bigger than 8GB; etcd just prints a warning in this case.
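
A minimal sketch of that soft-limit behavior in Go (this mirrors the behavior described above, not etcd's actual implementation; the function name is hypothetical):

```go
package main

import (
	"fmt"
	"log"
)

const maxQuotaBytes = int64(8 * 1024 * 1024 * 1024) // 8 GiB soft ceiling

// applyQuota accepts any quota value; values beyond the 8 GiB ceiling are
// still used, but flagged with a warning. A sketch, not etcd's real code.
func applyQuota(quotaBytes int64) (accepted int64, warned bool) {
	if quotaBytes > maxQuotaBytes {
		log.Printf("quota-backend-bytes %d exceeds recommended maximum %d",
			quotaBytes, maxQuotaBytes)
		return quotaBytes, true
	}
	return quotaBytes, false
}

func main() {
	// 10 GiB is accepted as-is, with a warning.
	q, warn := applyQuota(10 * 1024 * 1024 * 1024)
	fmt.Println(q, warn)
}
```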


dims commented Sep 7, 2023

/close

ahrtr closed this as completed Sep 7, 2023

ahrtr commented Sep 7, 2023

Thanks @dims for the confirmation.


rchincha commented Jan 25, 2024

Is anyone aware of recent benchmarks comparing spinning disks vs. SSDs?

Also, dgraph-io/badger#1668 is an option for write-heavy workloads. However, if the bottleneck is snapshot/restore, I'm not sure it would help.
