
Improve channel query queueing #3862

Closed
adamcfraser opened this issue Dec 13, 2018 · 2 comments

Comments

@adamcfraser
Contributor

commented Dec 13, 2018

Sync Gateway's queries for a given channel are single-threaded. The general idea is that if multiple requests are querying for the same information concurrently, we want to execute that query once and serve the remaining requests from the cache. This approach can lead to problems when the query results exceed channel_cache_max_length: the cache can't hold the full result set, so queued requests can't be served from it, and under high load latency increases over time when the request frequency outruns query latency. This can be avoided by increasing channel_cache_max_length, but it's difficult to diagnose when a given channel is hitting this scenario.

The following enhancements will reduce the impact when this scenario is hit, as well as providing additional visibility.

  1. If a client disconnects while it's waiting to run a query, that disconnect isn't identified until the query is run and it attempts to send results. This can be avoided by passing the terminator channel down to the query, and doing a check on that channel when the request gets to the front of the queue, before the query is issued.

  2. There's currently no length limit on the queue (it's a simple lock). Using a fixed-length queue and returning a 503-style error when the queue is at capacity will avoid the cumulative increase in latency over time. This will also provide better visibility up the stack about the state of the cache. (Moved to #3905.)

  3. Add a stat/expvar for the total number of requests that are queued waiting to execute a query, as part of the Iridium stats.
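A rough sketch of how items 1 and 3 might fit together, for discussion only. All names here (`channelQuerier`, `runQuery`, `ErrClientGone`, the `channel_query_queued` expvar) are hypothetical and are not Sync Gateway's actual types or stat names. It assumes the queue is a plain mutex, as described above, and leaves out the cache re-check that would normally happen once the lock is acquired.

```go
package main

import (
	"errors"
	"expvar"
	"fmt"
	"sync"
)

// ErrClientGone is a hypothetical error returned when the client has
// already disconnected by the time its request reaches the front of the queue.
var ErrClientGone = errors.New("client disconnected before query was issued")

// queuedQueries is a hypothetical expvar counting requests that are
// currently waiting on the query lock (item 3).
var queuedQueries = expvar.NewInt("channel_query_queued")

// channelQuerier stands in for the per-channel cache; only the lock
// that serializes queries is modelled here.
type channelQuerier struct {
	queryLock sync.Mutex
}

// runQuery waits its turn on the lock, then checks the terminator
// before issuing the (potentially expensive) query (item 1).
func (q *channelQuerier) runQuery(terminator <-chan struct{}, query func() ([]string, error)) ([]string, error) {
	queuedQueries.Add(1)
	q.queryLock.Lock()
	queuedQueries.Add(-1)
	defer q.queryLock.Unlock()

	// Non-blocking check: if the terminator is closed, the client is gone,
	// so skip the query instead of discovering that when sending results.
	select {
	case <-terminator:
		return nil, ErrClientGone
	default:
	}

	// A real implementation would re-check the cache here before querying.
	return query()
}

func main() {
	q := &channelQuerier{}
	query := func() ([]string, error) { return []string{"doc1"}, nil }

	gone := make(chan struct{})
	close(gone) // simulate a client that disconnected while queued
	_, err := q.runQuery(gone, query)
	fmt.Println("disconnected client:", err)

	live := make(chan struct{})
	res, err := q.runQuery(live, query)
	fmt.Println("live client:", res, err, "queued:", queuedQueries.Value())
}
```

The counter is incremented before blocking on the lock and decremented as soon as the lock is acquired, so it reflects only requests that are waiting, not the one currently running its query.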

@adamcfraser adamcfraser added this to the Iridium milestone Dec 13, 2018

@adamcfraser

Contributor Author

commented Dec 17, 2018

Could add a perf test to replicate/evaluate this:

  • write 1000 docs to channel
  • set channel cache size to 500
  • clients requesting changes since 0

@adamcfraser adamcfraser self-assigned this Jan 7, 2019

@adamcfraser adamcfraser added ready and removed backlog labels Jan 7, 2019

@adamcfraser adamcfraser added in progress review and removed ready labels Jan 11, 2019

@adamcfraser

Contributor Author

commented Jan 11, 2019

Addressing items 1 and 3 above in this ticket - moved 2 to #3905 as it may have CBL dependencies.
