Intermittent mongo connection drops #17

Closed
GUI opened this issue Dec 5, 2013 · 0 comments

GUI commented Dec 5, 2013

The gatekeeper queries Mongo to verify that an API key is valid. This query sometimes fails, which leads to users being denied access even when they supply a valid key. The problem crops up rarely, so it wasn't widely seen, which explains why it has only been discovered now. I have an ugly workaround in place that seems to address the problem, but this deserves more investigation, since the workaround is neither ideal nor performant.

What's happening: each gatekeeper proxy process holds open a persistent connection to Mongo that gets re-used across all the requests served by that process. The issue crops up when that persistent connection is unexpectedly terminated. There's then a brief period when the mongo client isn't aware that its connection has been terminated, so its subsequent queries fail until the client reconnects.

This may be related to the hosting environment and to network or firewall settings that lead to the disconnects (see https://support.mongolab.com/entries/23009358-handling-dropped-connections-on-windows-azure). However, my attempts at fixing it with keepalive settings have been unsuccessful. The network nature of the problem also probably explains why this has never cropped up in unit tests or other local environments where mongo is on the same machine as the gatekeeper.
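For anyone unfamiliar with what "keepalive settings" means at the socket level, here is a minimal Python sketch. It is not the gatekeeper's configuration (the gatekeeper sets these through its Mongo client, not raw sockets); the function name `enable_keepalive` and the timer values are assumptions chosen for illustration. The idea is that the OS sends periodic probes on an idle connection so that a firewall's idle timeout doesn't silently drop it.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Enable TCP keepalive on sock; tune the timers where the platform allows.

    The default values here are illustrative assumptions, not recommendations.
    """
    # SO_KEEPALIVE is the portable on/off switch.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The per-socket timer options are not defined on every platform,
    # so only set the ones this Python build exposes.
    tunables = (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before giving up
    )
    for name, value in tunables:
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock
```

If the idle timer is longer than the firewall's idle timeout, the probes arrive too late to help, which is one possible (unconfirmed) reason keepalive tuning might appear to have no effect.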

The workaround I have in place (see NREL/api-umbrella-gatekeeper@ff9da2a and the couple of subsequent commits) basically just retries the mongo query every 50 ms, up to 100 times. Some retry mechanism may be needed, but the number of retries currently necessary makes no sense to me. In one environment where this is a problem, I can see it make 60 or 70 retries before finally succeeding. With a 50 ms wait between retries, that adds roughly 3 to 3.5 seconds to the request if a user happens to be the super-unlucky one to hit it when the connection drops.
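To make the shape of that workaround concrete, here is a rough Python sketch of a fixed-delay retry loop using the same numbers (50 ms, up to 100 attempts). It is not the gatekeeper code from the commit referenced above; `QueryFailed`, `query_with_retry`, and `run_query` are hypothetical names standing in for the real query and its error.

```python
import time


class QueryFailed(Exception):
    """Hypothetical error for a query that hit a terminated connection."""


def query_with_retry(run_query, max_attempts=100, delay_s=0.05, sleep=time.sleep):
    """Call run_query until it succeeds or max_attempts is reached.

    Returns (result, attempts_used). Re-raises the last QueryFailed if
    every attempt fails. The sleep argument is injectable for testing.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query(), attempt
        except QueryFailed as err:
            last_error = err
            # Wait before the next attempt, but not after the final one.
            if attempt < max_attempts:
                sleep(delay_s)
    raise last_error
```

With these numbers, a request that fails 65 times before succeeding spends about 65 × 50 ms = 3.25 seconds just waiting, and a request that exhausts all 100 attempts waits about 99 × 50 ms = 4.95 seconds before giving up.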

GUI added a commit that referenced this issue Sep 27, 2015
Fixes and changes to how roles get applied
GUI added a commit that referenced this issue Sep 27, 2015
Add option to override roles in sub-settings (and some refactoring)