dns_route53: add change-max-poll config argument. #6071

Closed
wants to merge 1 commit into from

ezekiel
Contributor

@ezekiel ezekiel commented Jun 5, 2018

This makes the previously hardcoded limit of 120 polling rounds configurable.

To time out more quickly, adjust down:
--dns-route53-change-max-poll 100

To time out more slowly, adjust up:
--dns-route53-change-max-poll 150
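
For context, the option would bound a polling loop of roughly the following shape. This is a minimal sketch, not the plugin's actual code: `wait_for_change`, `max_poll`, `poll_interval`, and `FakeClient` are hypothetical names, the 5-second interval is inferred from 120 rounds adding up to the 10-minute wait mentioned later in this thread, and `client` is assumed to be anything exposing a boto3-style `get_change(Id=...)` call.

```python
import time


class ChangeTimeout(Exception):
    """Raised when a change does not reach INSYNC within the poll budget."""


def wait_for_change(client, change_id, max_poll=120, poll_interval=5.0,
                    sleep=time.sleep):
    """Poll a Route 53 change until it is INSYNC or max_poll rounds pass.

    Returns the number of polls that were needed.
    """
    for attempt in range(1, max_poll + 1):
        response = client.get_change(Id=change_id)
        if response["ChangeInfo"]["Status"] == "INSYNC":
            return attempt
        sleep(poll_interval)
    raise ChangeTimeout(
        "Change %s not INSYNC after %d polls (%.0f seconds)"
        % (change_id, max_poll, max_poll * poll_interval))


class FakeClient:
    """Stand-in for a Route 53 client: PENDING for a fixed number of polls."""

    def __init__(self, pending_polls):
        self.pending_polls = pending_polls
        self.calls = 0

    def get_change(self, Id):
        self.calls += 1
        status = "PENDING" if self.calls <= self.pending_polls else "INSYNC"
        return {"ChangeInfo": {"Id": Id, "Status": status}}
```

With `max_poll=150` and the assumed 5-second interval, the loop would give up after 750 seconds instead of 600.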

@ezekiel ezekiel force-pushed the dns_route53_tuneable branch 4 times, most recently from 18e6e1d to 02807c4 June 8, 2018 18:59
@jsha
Contributor

jsha commented Jun 12, 2018

Before we add a new command-line flag, I'd like to try to find the root cause of why reaching INSYNC status sometimes takes a long time. If slow syncing is caused by throttling at Route 53, then instead of allowing longer polling we should show the user a message telling them that they may be encountering throttling and should adjust their issuance intervals.

@ezekiel
Contributor Author

ezekiel commented Jun 12, 2018

Applicable limits documented for Route 53:

All requests

Five requests per second per AWS account. If you submit more than five requests per second, Amazon Route 53 returns an HTTP 400 error (Bad request). The response header also includes a Code element with a value of Throttling and a Message element with a value of Rate exceeded.

ChangeResourceRecordSets requests

If Route 53 can't process a request before the next request arrives, it will reject subsequent requests for the same hosted zone and return an HTTP 400 error (Bad request). The response header also includes a Code element with a value of PriorRequestNotComplete and a Message element with a value of The request was rejected because Route 53 was still processing a prior request.

In both cases, Amazon says it will return HTTP 400. I'll double-check whether any of my logs show such an error.
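
One way to check logs or live responses for these two conditions is to look at the error code. The sketch below assumes the error is available as a dict shaped like botocore's `ClientError.response` (`{"Error": {"Code": ..., "Message": ...}}`); `classify_route53_error`, `RATE_LIMIT_CODES`, and the descriptions returned are hypothetical names for illustration.

```python
# Error codes quoted from the Route 53 limits documentation above.
RATE_LIMIT_CODES = {
    "Throttling": "account-wide request rate exceeded",
    "PriorRequestNotComplete":
        "previous change for this hosted zone still processing",
}


def classify_route53_error(error_response):
    """Return a description if the error is a documented rate limit, else None.

    error_response is assumed to look like botocore's ClientError.response.
    """
    code = error_response.get("Error", {}).get("Code")
    return RATE_LIMIT_CODES.get(code)
```

Any other error code, or a response with no error at all, yields `None`, which is the case that matches a change simply staying in PENDING.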

@ezekiel
Contributor Author

ezekiel commented Jun 12, 2018

I reviewed the response headers logged by certbot, and neither Throttling nor PriorRequestNotComplete was ever the cause of timeouts on my systems. The change requests remained in PENDING status for as long as 25 minutes.

I think there are just some semi-rare occasions where synchronization takes much longer than expected within Amazon's infrastructure.

@arcivanov

Related to #6125

@ohemorange
Contributor

Hey @jsha, could you take a look at this?

@bmw bmw assigned jsha Sep 12, 2018
@jsha
Contributor

jsha commented Sep 12, 2018

We wound up working around this by reducing the number of times we update DNS. Closing since fewer options > more options. We might want to consider adding a message when polling for a change times out saying something like "If you've updated a lot of times recently, you might be getting rate limited by Amazon." But I'd want to see some more reports from other users first.

@jsha jsha closed this Sep 12, 2018
@jsha
Contributor

jsha commented Sep 12, 2018

A little more detail here: @ezekiel reminds me that we never actually found evidence that we were being rate limited, so that might not be the root cause.

According to Amazon docs,

Amazon Route 53 is designed to propagate updates you make to your DNS records to its world-wide network of authoritative DNS servers within 60 seconds under normal conditions.

It's not clear how long they are outside "normal conditions," but the current setting will wait up to 10 minutes, which seems sufficiently generous. If anyone else has this issue, however, please let us know and provide details of your setup. We'd love to know why some updates take a long time to sync.
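
As a rough check of these numbers: 120 rounds adding up to 10 minutes implies a 5-second pause between polls (an inference, not something stated in this thread). Under that assumption, the sketch below converts between rounds and wait time; the function names are made up for illustration.

```python
import math

POLL_INTERVAL_SECONDS = 5  # assumption: 600 seconds / 120 rounds


def max_wait_minutes(rounds, interval=POLL_INTERVAL_SECONDS):
    """Longest wait, in minutes, before polling gives up."""
    return rounds * interval / 60


def rounds_needed(minutes, interval=POLL_INTERVAL_SECONDS):
    """Smallest number of rounds that covers the given wait."""
    return math.ceil(minutes * 60 / interval)


print(max_wait_minutes(120))  # 10.0
print(max_wait_minutes(150))  # 12.5
print(rounds_needed(25))      # 300
```

Under the same assumption, the 25-minute PENDING period reported earlier in the thread would have needed about 300 rounds, well beyond the 150 used as an example in the PR description.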
