Performance regression querying changes using _doc_ids filter #1737
When upgrading from v1.7.1 to v2.2.0, I noticed our replication was taking longer. I investigated further and found the problem was specifically in the initial request for changes. We use the _doc_ids filter so we can replicate only certain documents, and this has been stable and performant on the 1.x versions.
Steps to Reproduce
Use this node script to create a database and fill it with 1 million docs and then query it for specific IDs.
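The original script is not reproduced here, so the following is only a minimal sketch of what such a reproduction might look like. It assumes Node 18+ (for the global `fetch`), a server reachable without authentication at the URL given in a `COUCH_URL` environment variable, and a hypothetical database name and ID scheme (`doc-0`, `doc-1`, ...). It uses the standard `PUT /{db}`, `POST /{db}/_bulk_docs`, and `POST /{db}/_changes?filter=_doc_ids` endpoints.

```javascript
// Sketch of a reproduction script; requires Node 18+ for the global fetch().
// COUCH_URL, the database name, and the ID scheme are assumptions for illustration.
const COUCH_URL = process.env.COUCH_URL; // e.g. http://127.0.0.1:5984
const DB = 'doc_ids_perf_test';
const TOTAL_DOCS = 1000000;
const BATCH_SIZE = 10000;

// Build one batch of documents with predictable IDs: doc-<start>, doc-<start+1>, ...
function makeBatch(start, size) {
  const docs = [];
  for (let i = start; i < start + size; i++) {
    docs.push({ _id: `doc-${i}`, value: i });
  }
  return docs;
}

// Send a JSON request to the server and return the parsed response body.
async function request(method, path, body) {
  const res = await fetch(`${COUCH_URL}/${path}`, {
    method,
    headers: { 'Content-Type': 'application/json' },
    body: body === undefined ? undefined : JSON.stringify(body),
  });
  return res.json();
}

async function main() {
  // Create the database (an "already exists" error response is ignored).
  await request('PUT', DB);

  // Fill it with documents in batches.
  for (let start = 0; start < TOTAL_DOCS; start += BATCH_SIZE) {
    await request('POST', `${DB}/_bulk_docs`, { docs: makeBatch(start, BATCH_SIZE) });
  }

  // Time a single _changes request restricted to a few specific IDs.
  const ids = ['doc-1', 'doc-500000', 'doc-999999'];
  const t0 = Date.now();
  const changes = await request('POST', `${DB}/_changes?filter=_doc_ids`, { doc_ids: ids });
  console.log(`${changes.results.length} changes in ${Date.now() - t0}ms`);
}

// Only talk to a server when COUCH_URL is set.
if (COUCH_URL) {
  main().catch((err) => { console.error(err); process.exit(1); });
}
```

Run it with something like `COUCH_URL=http://127.0.0.1:5984 node repro.js` against each server version and compare the reported times.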
In my testing with this script, I got responses in 1 to 2 ms on v1.7.1 and 2500 to 2600 ms on v2.2.0.
This has affected real world performance for users trying to replicate their data.
What I've tried
OK, I think I finally get this. Before 2.0, we had an optimization for the _doc_ids and _design filters; the clustered code for _changes does not use it.
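To illustrate why skipping such an optimization could matter at this scale, here is a conceptual sketch in JavaScript. It is not CouchDB's actual (Erlang) code, and the toy indexes and function names are hypothetical. The assumption is that the optimized path fetches the requested documents directly by ID, while the unoptimized path walks the whole changes feed and tests each entry against the filter.

```javascript
// Conceptual illustration only; not CouchDB's actual code.
// A toy database with two indexes: changes ordered by seq, and changes keyed by doc ID.
function buildToyDb(count) {
  const bySeq = [];
  const byId = new Map();
  for (let i = 0; i < count; i++) {
    const change = { seq: i + 1, id: `doc-${i}` };
    bySeq.push(change);
    byId.set(change.id, change);
  }
  return { bySeq, byId };
}

// Unoptimized path: visit every change and test it against the requested IDs.
// Work grows with the total number of documents in the database.
function changesByScan(db, docIds) {
  const wanted = new Set(docIds);
  return db.bySeq.filter((change) => wanted.has(change.id));
}

// Optimized path: look up each requested ID directly, then order by seq.
// Work grows with the number of requested IDs only.
function changesByLookup(db, docIds) {
  return [...new Set(docIds)]
    .map((id) => db.byId.get(id))
    .filter((change) => change !== undefined)
    .sort((a, b) => a.seq - b.seq);
}
```

Both functions return the same changes for the same input; with a million documents and a handful of requested IDs, the scan visits every entry while the lookup touches only the requested ones.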