-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
local proxying: set a default resolver ttl of 10s #54
Conversation
this is needed because internal ("private") PaaS routes are load-balanced simply on a DNS trick and our blue-green deployment procedure may make an instance disappear at any time with little notice
To elaborate on the "DNS trick" used by cf, it should in most cases "just work" because their DNS server responds with a 0s ttl, which compliant clients should take to mean "re-lookup every time", but nginx notoriously takes to mean "whatever". So that's why we're only seeing it as a problem in the router. |
@@ -3,6 +3,8 @@ | |||
server { | |||
listen 80; | |||
server_name api.*; | |||
# valid= value must be significantly less than the amount of guard time we have in our blue-green deployment process | |||
resolver {{ resolver_ip }} valid=10s; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we already have a line like this in the top level nginx.conf.j2
? Would it be better to replace the 300s value there than do each app separately?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would also have an effect on e.g. requests forwarded to S3 - are we sure we want to do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah gotcha. OK!
Can you deploy to test without merging then restore before merging to test the deployment process? Just to check that the config isn't going to break routing? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's see what happens (have you got a revert PR at the ready?)
@lfdebrux I think that would be tricky to say the least. |
https://trello.com/c/H53K45It
This is needed because internal ("private") PaaS routes are load-balanced simply on a DNS trick and our blue-green deployment procedure may make an instance disappear at any time with little notice.
While I'd love to be able to test this without merging, that slightly defeats the point because it's a problem with deployment & releasing that this is trying to fix, and we need to go through a proper release cycle to see if it works.
Note I've added this rule for connections to our frontend apps too even though we don't yet use internal routes for them - this is because I anticipate someone trying to switch them too sooner or later and I'd rather they don't have to rediscover this problem from scratch then.