`exact_count` instance parallel provisioning sometimes hangs #16

Closed
sivakumart opened this issue Dec 16, 2018 · 2 comments

Comments

@sivakumart
Member

commented Dec 16, 2018

Issue Report

When enable_parallel_requests is set to True and an exact_count is specified in oci_instance, an error during instance provisioning causes the play execution to hang.

Expected behavior

A clear and concise description of what you expected to happen.

Environment

  • OS version: Linux

  • Ansible version:
ansible 2.7.0.dev0
  config file = /home/siva/.ansible.cfg
  configured module search path = ['/home/siva/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /space/projects/ansible/lib/ansible
  executable location = /space/projects/ansible/bin/ansible
  python version = 3.5.5 |Anaconda, Inc.| (default, May 13 2018, 21:12:35) [GCC 7.2.0]

  • OCI Python SDK version: 2.1.2

  • OCI Ansible Modules version: 1.3.0

Ansible playbook to reproduce the issue

	- name: Attempt to create 3 webserver instances using exact-count and count-tag, with an error
	  oci_instance:
	    name: "{{exact_count_instance_name}}"
	    shape: "{{test_instance_shape}}"
	    compartment_id: "{{test_compartment_ocid}}"
	    # Use the Second AD so that this test doesn't interfere with other compute instance test runs that use the first AD
	    # and cause a quota-issue
	    availability_domain: "{{test_availability_domain_3}}"
	    source_details:
	        source_type: "image"
	        image_id: "{{image_ocid}}"
	    vnic:
	        subnet_id: "{{test_subnet_ocid_ad3}}"
	    # use an invalid tag namespace to simulate an error
	    defined_tags: "{ '{{ test_tag_namespace_name }}invalid':{'{{ test_tag_name }}':'{{test_instance_name}}-tag-value'} }"
	    exact_count: 3
	    count_tag: "{ '{{ test_tag_namespace_name }}':{'{{ test_tag_name }}':'{{test_instance_name}}-tag-value'} }"
	  register: result
@sivakumart

Member Author

commented Dec 16, 2018

This issue occurs because the exception handling code calls module.fail_json (which in turn calls sys.exit), and this causes the parallel provisioning logic to wait forever for the thread to complete.
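The mechanism can be illustrated with a minimal Python sketch. It is not the actual oci_instance code: `fail_json` here is a stand-in for `AnsibleModule.fail_json`, and `provision`, `run`, and the queue-based wait are hypothetical names and an assumed design. In Python, `sys.exit` called in a worker thread ends only that thread, so a caller waiting for a result from it never receives one. The timeout is there only so the example terminates.

```python
import queue
import sys
import threading


def fail_json(msg):
    # Stand-in for AnsibleModule.fail_json: the real method prints the
    # failure as JSON and then calls sys.exit(1).
    sys.exit(1)


def provision(index, results):
    # Hypothetical worker: instance 1 hits an error and "fails the play"
    # from inside the thread, so it never reports a result.
    if index == 1:
        fail_json("simulated provisioning error")
    results.put(index)


def run(count, timeout):
    results = queue.Queue()
    threads = [
        threading.Thread(target=provision, args=(i, results))
        for i in range(count)
    ]
    for thread in threads:
        thread.start()

    collected = []
    for _ in range(count):
        try:
            # Without a timeout, this call would block forever once a
            # worker has exited through sys.exit without posting a result.
            collected.append(results.get(timeout=timeout))
        except queue.Empty:
            break

    for thread in threads:
        thread.join()
    return sorted(collected)


if __name__ == "__main__":
    # Only two of the three workers report back; the process keeps running.
    print(run(3, timeout=0.5))
```

The SystemExit raised in the worker is silently discarded by the threading machinery, which is why nothing is printed for the failed instance and the main thread has no way to know it should stop waiting.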

A fix would require changing the exception handling strategy used in all our modules and the oci_utils classes. Currently, many methods in oci_utils (and their corresponding callers) arbitrarily catch exceptions and invoke module.fail_json to fail the play. This is a work in progress.
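One possible shape for such a strategy, sketched in Python under stated assumptions: helpers raise an exception instead of calling module.fail_json, and only the main thread decides whether to fail the play. `ProvisioningError`, `launch_instance`, and `launch_all` are hypothetical names for illustration and do not come from oci_utils.

```python
from concurrent.futures import ThreadPoolExecutor


class ProvisioningError(Exception):
    """Hypothetical error raised by helpers instead of calling fail_json."""


def launch_instance(index):
    # Hypothetical helper: reports failure by raising, never by exiting.
    if index == 1:
        raise ProvisioningError("instance %d: simulated launch failure" % index)
    return "instance-%d" % index


def launch_all(count):
    created, errors = [], []
    with ThreadPoolExecutor(max_workers=count) as pool:
        futures = [pool.submit(launch_instance, i) for i in range(count)]
        for future in futures:
            try:
                # result() re-raises the worker's exception in the caller,
                # so a failed worker can never leave the caller waiting.
                created.append(future.result())
            except ProvisioningError as exc:
                errors.append(str(exc))
    return created, errors


if __name__ == "__main__":
    created, errors = launch_all(3)
    print(created, errors)
```

With this shape, the module's main thread could call module.fail_json once, after all workers have finished, if `errors` is non-empty.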

Until this is fixed, a workaround is to launch each instance as an asynchronous task and then wait for the jobs to finish:

- name: Launch compute instances
  oci_instance:
    availability_domain: "{{ instance_ad }}"
    compartment_id: "{{ instance_compartment }}"
    name: "instance{{ item }}"
    image_id: "{{ instance_image }}"
    shape: "{{ instance_shape }}"
    vnic:
        assign_public_ip: True
        hostname_label: "{{ instance_hostname }}{{ item }}"
        subnet_id: "{{ instance_subnet_id }}"
    metadata:
        ssh_authorized_keys: "{{ lookup('file',  my_test_public_key ) }}"
    defined_tags:
      TagNamespace1: "{{ tag_namespace }}"
  with_sequence: start=1 end={{ exact_count }}
  register: oci_instance
  async: 600  # Maximum runtime in seconds. Adjust as needed.
  poll: 0  # Fire and continue (never poll)
- name: Wait for creation to finish
  async_status:
    jid: "{{ item.ansible_job_id }}"
  register: oci_jobs
  until: oci_jobs.finished
  delay: 5  # Check every 5 seconds. Adjust as you like.
  retries: 50  # Retry up to 50 times. Adjust as needed.
  with_items:
    - "{{ oci_instance.results }}"
@nalsaber

Member

commented May 1, 2019

The parallel instance provisioning feature using exact_count is deprecated and has been replaced by the OCI instance pools feature.
It was deprecated in release v1.8.0.

@nalsaber nalsaber closed this May 1, 2019