Streaming replication is pathologically slow #153
The gist has two files, one that creates a small 1e6 record leveldb instance (~24MB on disk) and another which will attempt to use
What we were seeing is that once the initial batch is sent, each buffered batch will typically be only a single record, significantly slowing down the transfer. In our case it was taking ~75 seconds.
I've turned this issue into a pull-request with a new branch that I hacked on in LAX.
I'd like to make it more adaptive, but even now if you try it out you should see huge improvements in speed. There is a test/benchmarks/stream-test.js benchmark that does basically what your gist was doing. There are also some console.log writes in write-stream.js that you can uncomment to see some interesting stuff.
Comments & improvements welcome.
What if it uses a k-armed bandit to decide between large, medium, and small batches?
Basically: normally, you do what worked best, but some of the time, you randomly try a different strategy.
But this is mad science. We should merge this PR and try crazy stuff later (we'd need benchmarks to verify that stuff too).
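For anyone who wants to play with the bandit idea later, here is a rough sketch of one common variant, epsilon-greedy, applied to batch sizes. It is not part of the pull request and touches no real write-stream code: the class name `BatchSizeBandit`, its methods `choose` and `record`, the candidate sizes, and the idea of using measured throughput as the reward are all assumptions made for illustration.

```javascript
// Epsilon-greedy k-armed bandit over candidate batch sizes.
// Hypothetical sketch: none of these names exist in the library.
class BatchSizeBandit {
  // sizes:   candidate batch sizes, e.g. [10, 100, 1000] (small/medium/large)
  // epsilon: fraction of decisions spent trying a random size
  // random:  injectable RNG returning a number in [0, 1), for testing
  constructor(sizes, epsilon = 0.1, random = Math.random) {
    this.sizes = sizes;
    this.epsilon = epsilon;
    this.random = random;
    this.counts = sizes.map(() => 0); // times each arm was measured
    this.means = sizes.map(() => 0);  // running mean reward per arm
  }

  // Returns the index of the batch size to use next.
  choose() {
    // Measure every arm at least once before comparing them.
    const untried = this.counts.indexOf(0);
    if (untried !== -1) return untried;

    // Some of the time, randomly try a different strategy.
    if (this.random() < this.epsilon) {
      return Math.floor(this.random() * this.sizes.length);
    }

    // Normally, do what has worked best so far.
    let best = 0;
    for (let i = 1; i < this.means.length; i++) {
      if (this.means[i] > this.means[best]) best = i;
    }
    return best;
  }

  // Feed back the observed reward (e.g. records per second) for an arm.
  record(arm, throughput) {
    this.counts[arm] += 1;
    this.means[arm] += (throughput - this.means[arm]) / this.counts[arm];
  }
}
```

A caller would pick `bandit.sizes[bandit.choose()]` before each flush, time the flush, and pass the measured throughput to `record`. Whether this beats a fixed batch size would still need the benchmarks mentioned above.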