Fork leveldb to apply lower mmap limit #7514
Comments
Relates to vectordotdev/vector#7514 Signed-off-by: Brian L. Troutwine <brian@troutwine.us>
Hmm, well. I have forked leveldb and can confirm that only the specified number of files are being mapped into the process, which is good. What's bad is that vector's resident memory is still roughly equivalent to the disk buffer size. Looking into this further.
Interesting, so check this out: It looks like, because my system has no memory pressure, Linux is being lazy about reclaiming pages, and the unreclaimed space slowly fills up with retries. Allocation point appears to be I'm going to run vector for a long period of time now to see if these start growing as I expect. Afterward I'll run vector in a memory-constrained environment.
Here's a longer run with no memory pressure: I had expected the retry memory to continue to grow, so the linear curve here surprises me.
Alright, as desired when run like so:
vector now stays within the 300M limit, which previously was not true. Virtual memory use is still quite high, but resident size is within the boundary, so this should address #7246. I'll have a PR shortly.
Hmmm, interesting. Do you think there is an issue there? Should we be piling up retries like that?
It’s a good question. I don’t know. It’s something that users will only notice if their vector doesn’t have memory pressure, which most will have. I’d say it’s more of a curiosity until we learn otherwise, and a potential place to optimize away clones if someone has a clever notion.
Perhaps it's worth filing an issue against leveldb? Seems like maintaining a fork can lead to other problems.
This is a good question. There's been discussion like this upstream (see google/leveldb#866) and their suggestion is to fork. Realistically, leveldb is an awkward fit for what we need in vector. It's a KV store and we just need a disk-backed queue. My expectation is that we'll be reworking the buffer code entirely and removing leveldb within the year.
* Use a forked version of leveldb-sys

  The forked version of leveldb-sys patched in here limits leveldb to a maximum of 10 mmap'ed LDB files, reducing the total memory burden of vector. Upstream leveldb will map 1000 files, so there's quite a difference in memory usage. Unfortunately this 10 is not configurable.

  At the moment the fork is under my personal GitHub account. I think we ought to fork to timberio (we have an archived version) and pin the patch to a SHA. I don't have enough juice in the GitHub org to achieve this.

  Resolves #7514

  Signed-off-by: Brian L. Troutwine <brian@troutwine.us>

* switch to timberio/leveldb-sys

  Signed-off-by: Brian L. Troutwine <brian@troutwine.us>

* Revert cargo updates

  Signed-off-by: Jesse Szwedko <jesse@szwedko.me>

* Remove accidental Cargo.lock

  Signed-off-by: Jesse Szwedko <jesse@szwedko.me>

Co-authored-by: Jesse Szwedko <jesse@szwedko.me>
While investigating #7246 and other issues, we've identified that leveldb will mmap up to 1000 files at 4 MB per file. This commonly results in users running into OOM issues using disk buffers.
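To make the scale of the difference concrete, here is a rough back-of-the-envelope sketch in Rust. It only multiplies the numbers quoted in this thread (1000 mapped files upstream, 10 in the fork, 4 MB per file) and assumes, as a worst case, that every mapped file is fully resident. The function and constant names are hypothetical and do not come from vector or leveldb.

```rust
/// Bytes in one mebibyte.
const MIB: u64 = 1024 * 1024;

/// Upper bound on the memory that mmap'ed files can occupy.
///
/// Assumption: every mapped file is fully resident. In practice the kernel
/// may reclaim pages under memory pressure, so this is a ceiling, not a
/// prediction.
fn worst_case_mmap_bytes(mmap_limit: u64, file_size_bytes: u64) -> u64 {
    mmap_limit * file_size_bytes
}

/// Convenience wrapper that reports the same bound in whole mebibytes.
fn worst_case_mmap_mib(mmap_limit: u64, file_size_bytes: u64) -> u64 {
    worst_case_mmap_bytes(mmap_limit, file_size_bytes) / MIB
}
```

Under these assumptions, `worst_case_mmap_mib(1000, 4 * MIB)` gives 4000 MiB for the upstream limit, while `worst_case_mmap_mib(10, 4 * MIB)` gives 40 MiB for the forked limit.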
Unfortunately, this value is not configurable, so to modify it we have to fork leveldb-sys and change it here: https://github.com/skade/leveldb-sys/blob/c5383aaf9e264041c951e9a2aa7381613d0d0dba/deps/leveldb-1.22/util/env_posix.cc#L46