ec2_elb: Failing with boto exception ( 400 - throttling ) #30229

ansibot · 2017-09-12T19:16:29Z

ISSUE TYPE

bug report

COMPONENT NAME

ec2_elb module

ANSIBLE VERSION

Ansible 1.7.2
Boto Version 2.32.1

OS / ENVIRONMENT

Red Hat Enterprise Linux 6.4 (Ansible Tower v2.0.0)

SUMMARY

I have a "lights on" process that turns on EC2 instances and then adds those instances to their respective load balancers (defined by elb_shortname). The results are very intermittent: sometimes the playbook completes successfully, even several times in a row. No matter what I do (changing logic, adding pauses, setting and retrieving facts), I cannot get around this AWS throttling message.

STEPS TO REPRODUCE

Invoke a playbook that turns on EC2 instances, waits, and then places each machine that has an elb_shortname variable defined into its respective group's load balancer.

- name: starting instance(s)
  when: instance_id is defined and lights_on|default ("false") == "true"
  local_action: ec2
  args:
    region: 'us-west-2'
    instance_ids: "{{ instance_id }}"
    state: 'running'
    wait: 'yes'
    wait_timeout: '300'
  register: ec2

- name: Pausing, trying to avoid AWS throttling
  pause: minutes=10

- name: registering instance to its respective groups ELB
  # instances that do not require ELBs do not need to run this part of the playbook
  when: elb_shortname is defined and lights_on|default ("false") == "true"
  local_action: ec2_elb
  args:
    region: 'us-west-2'
    state: 'present'
    wait: 'yes'
    wait_timeout: '300'

EXPECTED RESULTS

TASK: [roles/lights_on | registering instance to its respective groups ELB] *** 
skipping: [tstmaoradbc01 -> 127.0.0.1] 
skipping: [tstmaoradbt01 -> 127.0.0.1] 
skipping: [tstmaoradbt02 -> 127.0.0.1] 
skipping: [tstmaoramqa01 -> 127.0.0.1] 
skipping: [tstmaoramqe01 -> 127.0.0.1] 
skipping: [tstmaoramem01 -> 127.0.0.1] 
skipping: [tstmaoradbu01 -> 127.0.0.1] 
skipping: [tstmaoramem02 -> 127.0.0.1] 
skipping: [tstmaorabix01 -> 127.0.0.1] 
skipping: [tstmaorarel01 -> 127.0.0.1] 
skipping: [tstmaorarex01 -> 127.0.0.1] 
skipping: [tstmaorarex02 -> 127.0.0.1] 
skipping: [tstmaorarex03 -> 127.0.0.1] 
skipping: [tstmaorarex04 -> 127.0.0.1] 
skipping: [tstmaoraetl01 -> 127.0.0.1] 
changed: [tstmaorawsa01 -> 127.0.0.1] 
changed: [tstmaoramax01 -> 127.0.0.1] 
changed: [tstmaorauia02 -> 127.0.0.1] 
changed: [tstmaorauiw01 -> 127.0.0.1] 
changed: [tstmaoramwx01 -> 127.0.0.1] 
changed: [tstmaorawss01 -> 127.0.0.1] 
changed: [tstmaorawsl01 -> 127.0.0.1] 
changed: [tstmaorawss02 -> 127.0.0.1] 
changed: [tstmaorawsa02 -> 127.0.0.1] 
changed: [tstmaorauia01 -> 127.0.0.1] 
changed: [tstmaoramwx02 -> 127.0.0.1]

ACTUAL RESULTS

TASK: [roles/lights_on | registering instance to its respective groups ELB] ***
failed: [tstmaorawss01 -> 127.0.0.1] => {"failed": true, "parsed": false}
invalid output was: Traceback (most recent call last):
  File "/Users/ndobbs/.ansible/tmp/ansible-tmp-1412091621.06-77576803493759/ec2_elb", line 1874, in <module>
    main()
  File "/Users/ndobbs/.ansible/tmp/ansible-tmp-1412091621.06-77576803493759/ec2_elb", line 326, in main
    elb_man.register(wait, enable_availability_zone, timeout)
  File "/Users/ndobbs/.ansible/tmp/ansible-tmp-1412091621.06-77576803493759/ec2_elb", line 159, in register
    self._await_elb_instance_state(lb, 'InService', initial_state, timeout)
  File "/Users/ndobbs/.ansible/tmp/ansible-tmp-1412091621.06-77576803493759/ec2_elb", line 196, in _await_elb_instance_state
    instance_state = self._get_instance_health(lb)
  File "/Users/ndobbs/.ansible/tmp/ansible-tmp-1412091621.06-77576803493759/ec2_elb", line 244, in _get_instance_health
    status = lb.get_instance_health([self.instance_id])[0]
  File "/Library/Python/2.7/site-packages/boto/ec2/elb/loadbalancer.py", line 324, in get_instance_health
    return self.connection.describe_instance_health(self.name, instances)
  File "/Library/Python/2.7/site-packages/boto/ec2/elb/__init__.py", line 547, in describe_instance_health
    [('member', InstanceState)])
  File "/Library/Python/2.7/site-packages/boto/connection.py", line 1166, in get_list
    raise self.ResponseError(response.status, response.reason, body)
boto.exception.BotoServerError: BotoServerError: 400 Bad Request
<ErrorResponse xmlns="http://elasticloadbalancing.amazonaws.com/doc/2012-06-01/">
  <Error>
    <Type>Sender</Type>
    <Code>Throttling</Code>
    <Message>Rate exceeded</Message>
  </Error>
  <RequestId>1a08966d-48b8-11e4-8ddc-e3515a48666b</RequestId>
</ErrorResponse>

Copied from original issue: ansible/ansible-modules-core#143

ansibot · 2017-09-12T19:16:29Z

From @ansibot on 2014-10-06T14:46:34Z

Can You Help Us Out?

Thanks for filing a ticket! I am the friendly GitHub Ansibot.

It looks like you might not have filled out the issue description based on our standard issue template. You might not have known about that, and that's ok too, we'll tell you how to do it.

We have a standard template because Ansible is a really busy project and it helps to have some standard information in each ticket, and GitHub doesn't yet provide a standard facility to do this like some other bug trackers. We hope you understand, as this is really valuable to us!

Solving this is simple: please copy the contents of this template and paste it into the description of your ticket. That's it!

If You Had A Question To Ask Instead

If you happened to have a "how do I do this in Ansible" type of question, that's probably more of a user-list question than a bug report, and you should probably ask this question on the project mailing list instead.

However, if you think you have a bug, the report is the way to go! We definitely want all the bugs filed :) Just trying to help!

About Priority Tags

Since you're here, we'll also share some useful information at this time.

In general tickets will be assigned a priority between P1 (highest) and P5, and then worked in priority order. We may also have some follow up questions along the way, so keeping up with follow up comments via GitHub notifications is a good idea.

Due to large interest in Ansible, humans may not comment on your ticket immediately.

Mailing Lists

If you have concerns or questions, you're welcome to stop by the ansible-project or ansible-development mailing lists, as appropriate. Here are the links:

https://groups.google.com/forum/#!forum/ansible-project - for discussion of bugs and how-to type questions
https://groups.google.com/forum/#!forum/ansible-devel - for discussion on how to implement a code change, or feature brainstorming among developers

Thanks again for the interest in Ansible!

ansibot · 2017-09-12T19:16:30Z

From @smiller171 on 2014-10-06T14:46:34Z

This problem is caused by AWS itself, because each account can only make API calls so fast before being throttled. Amazon suggests using an exponential cooldown timer. It would make sense to build such a cooldown timer into all modules that make AWS API calls, so that we don't have to build that logic into our plays.
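
For illustration, here is a minimal Python sketch of the kind of exponential backoff described above. The names `call_with_backoff` and `ThrottlingError` are hypothetical and are not part of boto or Ansible. The delay values are arbitrary assumptions, and the sleep function is injectable so the sketch can be exercised without actually waiting.

```python
import random
import time


class ThrottlingError(Exception):
    """Hypothetical stand-in for an AWS 'Rate exceeded' error."""


def call_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep, rng=random.random):
    """Call func(), retrying on ThrottlingError with exponential backoff.

    The wait before retry n (0-based) is a random value between 0 and
    min(max_delay, base_delay * 2**n), so the upper bound doubles each time.
    """
    for attempt in range(max_attempts):
        try:
            return func()
        except ThrottlingError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(rng() * cap)
```

The random factor spreads retries out so that many hosts throttled at the same moment do not all retry at the same moment.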

ansibot · 2017-09-12T19:16:30Z

From @ndobbs on 2014-10-06T14:46:34Z

smiller171, I completely agree. However, from viewing the boto source of the ec2_elb_lb module, it seems as if this backoff is already implemented.

I do agree with you that this logic should be implemented inside the Ansible ec2* modules; however, that's just my personal opinion.

ansibot · 2017-09-12T19:16:31Z

From @smiller171 on 2014-10-06T14:46:34Z

It's worth noting that I had the same problem with the ec2_metric_alarm module. I have avoided the issue so far by deploying in batches with serial: 10

ansibot · 2017-09-12T19:16:31Z

From @ndobbs on 2014-10-06T14:46:34Z

smiller171, thank you for the suggestion; I hadn't even considered batching the machines with serial. I have implemented the batching in my plays - hopefully we'll see at least a higher success rate with our 'light_switch' process.

Thanks again for your help.

ansibot · 2017-09-12T19:16:31Z

From @acaire on 2014-10-06T14:46:34Z

I ran into this today and resolved it by using the retrying library: acaire/ansible-modules-core@421f7efcc56fde85d0f54743b7ad2436735dab9e

I'm assuming it'd be a stretch to add the required pip package though, or is it worth the PR?

ansibot · 2017-09-12T19:16:32Z

From @ndobbs on 2014-10-06T14:46:34Z

I was able to solve this issue by implementing 'until' logic - thanks to a recommendation by @tgerla in a Tower Support ticket I created that allowed me to get the right syntax down.

- name: starting instance(s)
  when: instance_id is defined and lights_on|default ("false") == "true"
  local_action:
    module: 'ec2'
    region: 'us-west-2'
    instance_ids: "{{ instance_id }}"
    state: 'running'
    wait: 'yes'
    wait_timeout: '120'
  register: ec2_result
  # If instance does not start, try to start it again
  until: ec2_result|success
  retries: 10
  delay: 30

- name: registering instance to its respective groups ELB
  # instances that do not require ELBs do not need to run this part of the playbook
  when: elb_shortname is defined and lights_on|default ("false") == "true"
  local_action: ec2_elb
  args:
    region: 'us-west-2'
    enable_availability_zone: 'no'
    instance_id: "{{ instance_id }}"
    ec2_elbs: "{{ env }}-{{ elb_shortname }}"
    state: 'present'
    wait: 'yes'
    wait_timeout: '120'
  register: ec2_elb_result
  until: ec2_elb_result|success
  retries: 10
  delay: 30

ansibot · 2017-09-12T19:16:32Z

From @ndobbs on 2014-10-06T14:46:34Z

This issue was fixed by implementing controls such as until and serializing machines in 'batches' in order to avoid the AWS throttling limit.

ansibot · 2017-09-12T19:16:33Z

From @smiller171 on 2014-10-06T14:46:34Z

@ndobbs I would call that a workaround, not a solution. That said, many changes have been made and this was likely solved by now.

ansibot · 2017-09-12T19:16:33Z

From @cooniur on 2014-10-06T14:46:34Z

I agree with @smiller171, the "until" solution is indeed a workaround. Ansible should provide a way of setting the polling rate on cloud services (not only AWS, but also others).

I got the same throttling error while using the rds module to restore databases.

Please consider reopening this ticket.

ansibot · 2017-09-12T19:16:33Z

From @ndobbs on 2014-10-06T14:46:34Z

I reopened this issue because it's not resolved and we still have people reporting in the thread.

ansibot · 2017-09-12T19:16:34Z

From @cooniur on 2014-10-06T14:46:34Z

Update: this issue happens in Ansible 2.x too.

ansibot · 2017-09-12T19:24:39Z

@ansibot Greetings! Thanks for taking the time to open this issue. In order for the community to handle your issue effectively, we need a bit more information.

Here are the items we could not find in your description:

component name

Please set the description of this issue with this template:
https://raw.githubusercontent.com/ansible/ansible/devel/.github/ISSUE_TEMPLATE.md

smiller171 · 2017-09-12T20:10:19Z

Is this still a problem?

sbussetti · 2017-09-13T21:15:44Z

@smiller171 yes:

An exception occurred during task execution. To see the full traceback, use -vvv. The error was: </ErrorResponse>
fatal: [localhost]: FAILED! => {"changed": false, "failed": true, "module_stderr": "Traceback (most recent call last):
  File \"/var/folders/v7/mmsm_9x941j0xjnm33hq5_th0000gp/T/ansible_cxkDNn/ansible_module_ec2_elb_lb.py\", line 1359, in <module>
    main()
  File \"/var/folders/v7/mmsm_9x941j0xjnm33hq5_th0000gp/T/ansible_cxkDNn/ansible_module_ec2_elb_lb.py\", line 1352, in main
    elb=elb_man.get_info(),
  File \"/var/folders/v7/mmsm_9x941j0xjnm33hq5_th0000gp/T/ansible_cxkDNn/ansible_module_ec2_elb_lb.py\", line 614, in get_info
    info['connection_draining_timeout'] = int(self.elb_conn.get_lb_attribute(self.name, 'ConnectionDraining').timeout)
  File \"/Users/sbussetti/.virtualenvs/devops/lib/python2.7/site-packages/boto/ec2/elb/__init__.py\", line 481, in get_lb_attribute
    attributes = self.get_all_lb_attributes(load_balancer_name)
  File \"/Users/sbussetti/.virtualenvs/devops/lib/python2.7/site-packages/boto/ec2/elb/__init__.py\", line 459, in get_all_lb_attributes
    params, LbAttributes)
  File \"/Users/sbussetti/.virtualenvs/devops/lib/python2.7/site-packages/boto/connection.py\", line 1208, in get_object
    raise self.ResponseError(response.status, response.reason, body)
boto.exception.BotoServerError: BotoServerError: 400 Bad Request
<ErrorResponse xmlns=\"http://elasticloadbalancing.amazonaws.com/doc/2012-06-01/\">
  <Error>
    <Type>Sender</Type>
    <Code>Throttling</Code>
    <Message>Rate exceeded</Message>
  </Error>
  <RequestId>052ac731-98c8-11e7-9ec3-7d07fe631ddd</RequestId>
</ErrorResponse>

", "module_stdout": "", "msg": "MODULE FAILURE", "rc": 0}

smiller171 · 2017-09-13T21:34:46Z

@bcoca This has been an issue since 2014 and is still present. Are you able to comment on this bug? Has anyone on the team taken a look at this? I don't think it would be terribly difficult to implement retries in the case of throttling since Boto is explicit that it's a throttling error.
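
As a rough sketch of how a throttling failure could be told apart from other errors, the Python snippet below parses an error body shaped like the one in the traceback above and checks its `<Code>` element. The helper names `error_code` and `is_throttling_error` are hypothetical. A real module would more likely inspect the exception that boto raises, which this sketch does not attempt.

```python
import xml.etree.ElementTree as ET

# Error body copied from the traceback in this thread.
SAMPLE_BODY = """<ErrorResponse xmlns="http://elasticloadbalancing.amazonaws.com/doc/2012-06-01/">
  <Error>
    <Type>Sender</Type>
    <Code>Throttling</Code>
    <Message>Rate exceeded</Message>
  </Error>
  <RequestId>052ac731-98c8-11e7-9ec3-7d07fe631ddd</RequestId>
</ErrorResponse>"""


def error_code(body):
    """Return the text of the first <Code> element, or None if there is none."""
    root = ET.fromstring(body)
    for element in root.iter():
        # Namespaced tags look like "{namespace}Code"; compare the local name only.
        if element.tag.rsplit("}", 1)[-1] == "Code":
            return (element.text or "").strip()
    return None


def is_throttling_error(body):
    """True when the error body reports the 'Throttling' code."""
    return error_code(body) == "Throttling"
```

A retry wrapper could then retry only when `is_throttling_error` returns true and let every other error fail immediately.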

ansibot · 2017-11-19T13:53:18Z

cc @willthames

jamiecwilliams · 2018-10-10T16:41:33Z

@s-hertel, PR #31892 that you referenced was closed. We are still experiencing this throttling issue in Ansible 2.6.4. Is there a plan to address this?

biohazd · 2018-11-08T15:05:07Z

+1 this is a very important issue to fix.

jaksah · 2019-01-29T10:10:09Z

According to the boto documentation for get_all_load_balancers, there is an optional parameter to specify load balancer names. Even if the ec2_elbs argument is defined, this parameter is not used, resulting in multiple calls to AWS to traverse the pagination. I've created a PR (#51424) that passes ec2_elbs to get_all_load_balancers. Hopefully this can reduce the request rate, especially for accounts with a large number of load balancers.
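
To illustrate why filtering by name can lower the request rate, here is a toy model in Python. `FakeElbApi`, its page size, and the load balancer names are all invented for the illustration and do not reflect boto's actual paging behaviour or any AWS limit. The only point is to count simulated requests with and without a name filter.

```python
class FakeElbApi:
    """Toy in-memory stand-in for a paginated 'describe load balancers' call."""

    def __init__(self, names, page_size=2):
        self.names = list(names)
        self.page_size = page_size
        self.requests = 0  # number of simulated API requests made

    def describe(self, names=None, marker=0):
        """Return (page, next_marker); next_marker is None on the last page."""
        self.requests += 1
        pool = [n for n in self.names if names is None or n in names]
        page = pool[marker:marker + self.page_size]
        next_marker = marker + self.page_size
        return page, (next_marker if next_marker < len(pool) else None)


def get_all(api, names=None):
    """Follow markers until every matching load balancer has been fetched."""
    found, marker = [], 0
    while marker is not None:
        page, marker = api.describe(names=names, marker=marker)
        found.extend(page)
    return found
```

In this model, listing ten load balancers two at a time costs five requests, while asking for a single name costs one.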

gibsonje · 2019-12-06T19:25:00Z

As of December 2019, I'm repeatedly seeing this issue. All other AWS modules are not giving me rate limit errors. I upgraded to Ansible 2.9.0, and with the throttle keyword no other parts of my playbook give rate limit errors. However, this ec2_elb module very frequently does. I've had to try a lot of adjustments and retries to get this to work reliably.

I can only assume my other modules are masking the problem by doing exponential backoff retries behind the scenes and this module is the victim: rate limited by previous module executions and having no recourse but to fail hard.

It would be great if a boto3 module existed for this with the retry logic. Other modules are working great, especially in combination with throttle: 1 to avoid rate limiting.

ansibot · 2020-01-31T21:53:41Z

cc @jillr @tremble

ansibot · 2020-08-16T23:07:42Z

Thank you very much for your interest in Ansible. Ansible has migrated much of the content into separate repositories to allow for more rapid, independent development. We are closing this issue/PR because this content has been moved to one or more collection repositories.

lib/ansible/modules/cloud/amazon/ec2_elb.py -> https://galaxy.ansible.com/community/aws

For further information, please see:
https://github.com/ansible/ansibullbot/blob/master/docs/collection_migration.md

ansibot added the affects_1.7 This issue/PR affects Ansible v1.7 label Sep 12, 2017

ansibot mentioned this issue Sep 12, 2017

ec2_elb: Failing with boto exception ( 400 - throttling ) ansible/ansible-modules-core#143

Closed

s-hertel mentioned this issue Oct 18, 2017

Port ec2_elb to boto3 - fixes #30229 #31892

Closed

ansibot added bug This issue/PR relates to a bug. and removed bug_report labels Mar 1, 2018

ansibot added the traceback This issue/PR includes a traceback. label May 28, 2018

ansibot added support:core This issue/PR relates to code supported by the Ansible Engineering Team. and removed support:certified This issue/PR relates to certified code. labels Sep 17, 2018

ansibot added support:community This issue/PR relates to code supported by the Ansible community. and removed support:certified This issue/PR relates to certified code. labels Oct 11, 2018

ansibot removed the needs_maintainer Ansibot is unable to identify maintainers for this PR. (Check `author` in docs or BOTMETA.yml) label Nov 16, 2018

jaksah mentioned this issue Jan 29, 2019

Pass optional elb names to boto call for getting elbs #51424

Closed

ansibot removed the deprecated This issue/PR relates to a deprecated module. label Feb 6, 2019

ansibot added the has_pr This issue has an associated PR. label Jul 24, 2019

ansibot added collection Related to Ansible Collections work collection:community.aws needs_collection_redirect https://github.com/ansible/ansibullbot/blob/master/docs/collection_migration.md labels Apr 29, 2020

ansibot added the needs_triage Needs a first human triage before being processed. label May 16, 2020

ansibot added the bot_closed label Aug 16, 2020

ansibot closed this as completed Aug 16, 2020

sivel removed the needs_triage Needs a first human triage before being processed. label Aug 17, 2020

ansible locked and limited conversation to collaborators Sep 13, 2020
