# Erlang native query server consumes too much RAM when filtering a document with lots of revisions #1209
Are you setting your […]? And just to clarify, you're talking about many updates to a doc, but not that many conflicts on the doc?
Thank you for the response, Janl! But why are you asking this? I've pointed to the source code :) I'm talking about any revision: regular, conflicted, or deleted. It doesn't matter in this particular context. Of course, with many conflicts the situation will be even worse. If the […]

I'm not claiming that a document with such an amount of revisions (especially conflicts) is in a good state. No. This happened to us by accident, I guess. I'm not responsible for the application side of things, so I don't know why it happened. I'm just trying to say that loading a document with all of its associated revisions into memory at once and then converting all this data into a plain list using […]

If we look at the changes enumerator, we'll see that the situation is even worse. Imagine that we need to filter several sequential changes with such a big, bad document through the native query server. CouchDB will be loading all the revisions and calling […]
I see this is on 1.6.0. Have you tried this against 1.7.1? Or 2.1.1?
Yes, we tried this on 1.7.1 (built for Ubuntu 16.04 by us on our own) and 2.1.1 (from the Apache repository) before actually digging into the source code deeply. The behavior is the same on all versions. Later I checked the […]
This should be fixed in 23e5f92. There is a new `query_server_config` option […]
Thanks to @rnewson! 👍
Hi all! We have an online service which uses CouchDB heavily. It has many databases with frequently updated documents, and we ran into a problem.
Documents are processed by native query servers with all of their revisions as one single entity when they are requested with the `style=all_docs` parameter. As a result, when a document with many, many revisions (including deleted ones) is requested through a native Erlang filter, CouchDB eats an inadequate amount of memory. The reason is as follows (branch `master` is used here just for simplicity; we actually use CouchDB 1.6):

- `couch_changes:filter()` reads all revisions of a document into `Docs` and calls `couch_query_servers:filter_docs()`
- `couch_query_servers:filter_docs()` calls `couch_native_process:prompt()` through `ddoc_prompt()`/`proc_prompt()`
- `couch_native_process:handle_call()` tries to convert the document with all of its revisions `to_binary` at once!

The filter processes each revision one by one, right? Why try to handle all revisions in memory? And if you can handle all revisions in memory, then why not handle the entire database in memory? :)
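This is not CouchDB's actual code, but the contrast described above can be sketched in Python: materializing every revision before prompting the filter versus handling one revision at a time. All names here are illustrative only.

```python
# Illustrative sketch (not CouchDB internals): two ways to filter the
# revisions of a document. The reported behavior resembles the first one.

def filter_all_at_once(load_rev, rev_ids, predicate):
    # Problematic pattern: load *all* revisions first, then hand the
    # whole list to the filter. Peak memory grows with the revision count.
    docs = [load_rev(rev_id) for rev_id in rev_ids]
    return [doc for doc in docs if predicate(doc)]


def filter_streaming(load_rev, rev_ids, predicate):
    # The alternative the report suggests: each revision is a document
    # in its own right, so load and filter one at a time. Peak memory
    # stays roughly constant regardless of how many revisions exist.
    for rev_id in rev_ids:
        doc = load_rev(rev_id)
        if predicate(doc):
            yield doc
```

Both variants produce the same filtered result; only the peak memory footprint differs, which is the whole point of the complaint.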
Of course, this problem can be worked around from the maintenance perspective. Old revisions can be purged from the databases, databases can be compacted, and so on. But that does not mean the current behavior is an example of good software design, I guess.
## Expected Behavior

Assume that the RAM in our galaxy is not infinite, and that a document may have many, many, many revisions. Simply put: do not load all revisions at once. After all, each revision is a document itself.
## Steps to Reproduce

```shell
for i in {1..16}; do curl -u admin:admin "http://localhost:5984/db/_changes?feed=normal&style=all_docs&filter=app%2FverySimpleErlangFilter" & done
```
## Environment