Network performance issue (30 secs for 8mb PUT) #1409
Should complete in a reasonable amount of time, i.e. < 1 second (it's only an 8 MB file).
The above command, if issued from a remote client (but over a very fast network), can take up to 30 seconds.
The same command issued from a client on the same network takes < 1 second.
The network between the servers is very fast; an scp of the same file takes < 0.1 seconds.
Tested on dedicated servers and multiple AWS servers. Always same result.
It seems like the couchdb network layer may be sending too small a packet?
I have configured a total of 6 servers, a mix of AWS and dedicated; all experience the same slowness.
Steps to Reproduce (for bugs)
Just single nodes and no special configuration.
And I don't believe this issue is to do with attachments, as the same PUT that takes 1 min 29 secs takes less than 1 second if it is done from a client on the same local network as the couchdb server.
Must be some network packet size/timing issue in the network layer of couchdb.
OK, good news: found a solution to the performance issue. Turns out the recbuf setting is the cause.
I had to comment out the recbuf param in mochiweb/mochiweb_socket_server.erl and then the time to put a 10 MB text file went from 1 min 29 seconds to < 1 second !!
Could this be the cause of all the complaints about how slow couchdb is for attachments?
I believe not setting recbuf on Linux allows the operating system to handle it more efficiently. (When set, it seems to cause TCP window size performance issues that lead to the huge delay for larger PUTs.)
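For anyone who wants to see the effect at the socket level, here is a minimal sketch in Python (not the mochiweb code itself). It reads the OS default receive buffer of a fresh TCP socket, then sets `SO_RCVBUF` explicitly and reads it back. The function name `rcvbuf_sizes` and the 8192-byte value are only illustrative assumptions, not values taken from CouchDB or mochiweb. On Linux, explicitly setting `SO_RCVBUF` turns off receive buffer autotuning for that socket, which is consistent with the behaviour described above.

```python
import socket


def rcvbuf_sizes(requested=8192):
    """Return (os_default, after_explicit_set) SO_RCVBUF sizes in bytes.

    `requested` is a hypothetical value chosen for illustration only.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        # What the OS picks when the application does not set anything.
        default = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
        # Explicitly request a receive buffer size, as a server option
        # like recbuf would do.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        # The kernel may adjust the value (Linux typically doubles it).
        pinned = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return default, pinned


if __name__ == "__main__":
    default, pinned = rcvbuf_sizes()
    print(f"OS default SO_RCVBUF: {default} bytes")
    print(f"After explicit set:   {pinned} bytes")
```

The exact numbers printed depend on the OS and its sysctl settings, so treat them as something to compare on your own machine rather than as reference values.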
I also found this PR, mochi/mochiweb#153, which fixes mochiweb to allow an "undefined" setting for recbuf, but couchdb 2.1.1 is not using this version. I tried using this version with couchdb, but couchdb did not seem to allow undefined as a param to recbuf in the ini file.
So it would seem like we need couchdb to support setting recbuf to undefined, maybe even having that as the default?
I think this one dates back to https://issues.apache.org/jira/browse/COUCHDB-1986. I wasn't too involved with that investigation but when I read through the comments it seems that we didn't have a great reason for customizing the buffer size beyond the fact that it improved things in those specific scenarios where tests were timing out. Note that under the hood mochi was using a fixed
CouchDB master has upgraded the mochiweb dependency to
@kocolosk thank you for the sleuthing and memory here! I'll get a PR up to change the default in master to
@stevedrew for reference, I help people run CouchDB on AWS all the time, and we've not seen this one crop up before - presumably because people are on low-latency links to AWS from their clients/app servers, whereas mochi/mochiweb#153 points the finger at high-latency links. So, thank you for the report!
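A rough back-of-the-envelope sketch of why latency matters here: with a fixed receive window, a sender can have at most one window of data in flight per round trip, so throughput is bounded by window / RTT. The helper name and the numbers below (64 KiB window, 100 ms round trip) are hypothetical and are not measurements from this thread.

```python
def min_transfer_seconds(payload_bytes, window_bytes, rtt_seconds):
    """Lower bound on transfer time when at most one receive window
    of data can be in flight per round trip."""
    round_trips = payload_bytes / window_bytes
    return round_trips * rtt_seconds


if __name__ == "__main__":
    payload = 8 * 1024 * 1024   # 8 MiB body, as in the report
    window = 64 * 1024          # hypothetical fixed receive window
    for rtt in (0.0005, 0.1):   # hypothetical LAN vs. high-latency link
        secs = min_transfer_seconds(payload, window, rtt)
        print(f"RTT {rtt * 1000:g} ms -> at least {secs:.2f} s")
```

Under these assumed values the same window costs a fraction of a second on a sub-millisecond link but several seconds at 100 ms, which matches the pattern of "fast on the local network, slow from a remote client".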
@stevedrew this isn't the root cause for CouchDB's slow attachment behaviour universally, no. A lot of that slowness comes from the serialisation of attachment data both into the b-tree and over the wire. Internal attachment replication between nodes in a 2.x cluster is also unoptimized, and can block other operations on very large files, leading to database-wide issues.
We still recommend keeping attachments in CouchDB below 10MB per document, which you have already met, hooray! There is an ongoing discussion about how to help guide users towards these defaults, see #1200 and #1253.