salt-cloud - Only 2 out of many ec2 providers are queried with -Q #55311

Open
nnsense opened this issue Nov 14, 2019 · 13 comments
Labels
Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged Salt-Cloud

@nnsense

nnsense commented Nov 14, 2019

Description of Issue

I've just started with salt-cloud. I've created a single file with a number of providers (we use subaccounts: every provider is an account, and every account has a role granting admin privileges to the instance). salt-cloud --list-providers correctly reports all the providers. When I run salt-cloud -Q, only instances from 2 providers, apparently picked at random, are shown. For example, if each of the 5 providers has 5 instances, the query shows 10 instances instead of 25.
With -l debug, only 2 (sometimes only 1) endpoints are shown, as if the query never tries to list the others. This is the end of the debug output:

[...]
[DEBUG   ] LazyLoaded proxmox.avail_sizes
[DEBUG   ] Using AWS endpoint: ec2.eu-west-1.amazonaws.com
[INFO    ] Assuming the role: arn:aws:iam::XXXXXXXXXXX:role/OrganizationAccountAccessRole
[DEBUG   ] Using cached minion ID from /etc/salt/minion_id: saltsrv.wallawalla.net
[INFO    ] Assuming the role: arn:aws:iam::YYYYYYYYYYYY:role/OrganizationAccountAccessRole
[DEBUG   ] Using cached minion ID from /etc/salt/minion_id: saltsrv.wallawalla.net
[DEBUG   ] AWS Request: https://ec2.eu-west-1.amazonaws.com/?Action=DescribeInstances&Version=2014-10-01
[DEBUG   ] AWS Request: https://ec2.us-east-1.amazonaws.com/?Action=DescribeInstances&Version=2014-10-01
[DEBUG   ] AWS Response Status Code: 200
[DEBUG   ] LazyLoaded cloud.cache_node_list
[DEBUG   ] AWS Response Status Code: 200
[DEBUG   ] LazyLoaded cloud.cache_node_list
[DEBUG   ] LazyLoaded nested.output
[...]

Interestingly, if I remove one of the successfully queried providers from my list in /etc/salt/cloud.providers.d, another provider takes its place, and still only 2 (or 1) providers are queried.
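
One way to quantify the symptom from a saved debug log is to count the distinct roles assumed and the DescribeInstances requests issued. This is a minimal sketch, assuming the log lines have the format quoted above; the `summarize` helper and the inline `LOG` excerpt are illustrative and not part of Salt.

```python
import re

# Excerpt of the debug output quoted above (account IDs are redacted in the original).
LOG = """\
[DEBUG   ] Using AWS endpoint: ec2.eu-west-1.amazonaws.com
[INFO    ] Assuming the role: arn:aws:iam::XXXXXXXXXXX:role/OrganizationAccountAccessRole
[INFO    ] Assuming the role: arn:aws:iam::YYYYYYYYYYYY:role/OrganizationAccountAccessRole
[DEBUG   ] AWS Request: https://ec2.eu-west-1.amazonaws.com/?Action=DescribeInstances&Version=2014-10-01
[DEBUG   ] AWS Request: https://ec2.us-east-1.amazonaws.com/?Action=DescribeInstances&Version=2014-10-01
"""

ROLE_RE = re.compile(r"Assuming the role: (\S+)")
REQUEST_RE = re.compile(r"AWS Request: https://([^/]+)/\?Action=DescribeInstances")


def summarize(log_text):
    """Return the distinct roles assumed and the endpoints queried (hypothetical helper)."""
    roles = sorted(set(ROLE_RE.findall(log_text)))
    endpoints = REQUEST_RE.findall(log_text)
    return {"roles": roles, "endpoints": endpoints}


summary = summarize(LOG)
print(len(summary["roles"]), "roles assumed;", len(summary["endpoints"]), "DescribeInstances calls")
```

If the number of roles assumed is lower than the number of configured providers, the missing providers were never queried at all, rather than queried and returning nothing.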

I don't know if it's related, but when I try terminating one of the instances from a working provider, Salt successfully finds the host but then seems to lose it again:

[DEBUG   ] Using AWS endpoint: ec2.eu-west-1.amazonaws.com
[INFO    ] Assuming the role: arn:aws:iam::XXXXXXXXXXX:role/OrganizationAccountAccessRole
[DEBUG   ] Using cached minion ID from /etc/salt/minion_id: saltsrv.wallawalla.net
[DEBUG   ] AWS Request: https://ec2.eu-west-1.amazonaws.com/?Action=DescribeInstances&Version=2014-10-01
[DEBUG   ] AWS Response Status Code: 200
The following virtual machines are set to be destroyed:
  ds:
    ec2:
      DS-BOX12

Proceed? [N/y] y
... proceeding
[INFO    ] Destroying in non-parallel mode.
[DEBUG   ] Using AWS endpoint: ec2.eu-west-1.amazonaws.com
[INFO    ] Assuming the role: arn:aws:iam::XXXXXXXXXXX:role/OrganizationAccountAccessRole
[DEBUG   ] Using cached minion ID from /etc/salt/minion_id: saltsrv.wallawalla.net
[DEBUG   ] AWS Request: https://ec2.eu-west-1.amazonaws.com/?Action=DescribeInstances&Filter.1.Name=tag%3AName&Filter.1.Value.1=DS-BOX12&Version=2014-10-01
[DEBUG   ] AWS Response Status Code: 200
[DEBUG   ] Using AWS endpoint: ec2.eu-west-1.amazonaws.com
[DEBUG   ] AWS Request: https://ec2.eu-west-1.amazonaws.com/?Action=DescribeInstanceAttribute&Attribute=disableApiTermination&InstanceId=i-0e1aa5c99658ae027&Version=2014-10-01
[DEBUG   ] AWS Response Status Code: 400
[ERROR   ] AWS Response Status Code and Error: [400 400 Client Error: Bad Request] {'Errors': {'Error': {'Message': "The instance ID 'i-0e1aa5c99658ae027' does not exist", 'Code': 'InvalidInstanceID.NotFound'}}, 'RequestID': 'd73588d9-d598-4184-ad97-48b3dd013b9a'}
[DEBUG   ] Termination Protection is disabled for DS-BOX12
[DEBUG   ] LazyLoaded cloud.fire_event
[DEBUG   ] MasterEvent PUB socket URI: /var/run/salt/master/master_event_pub.ipc
[DEBUG   ] MasterEvent PULL socket URI: /var/run/salt/master/master_event_pull.ipc
[DEBUG   ] Sending event: tag = salt/cloud/DS-BOX12/destroying; data = {u'instance_id': 'i-0e1aa5c99658ae027', u'_stamp': '2019-11-14T22:32:12.766919', u'name': 'DS-BOX12', u'event': u'destroying instance'}
[DEBUG   ] Closing IPCMessageClient instance
[INFO    ] Renaming DS-BOX12 to DS-BOX12-DELd34461b06183420b91a24744125840fe
[DEBUG   ] Using AWS endpoint: ec2.eu-west-1.amazonaws.com
[DEBUG   ] AWS Request: https://ec2.eu-west-1.amazonaws.com/?Action=DescribeInstances&Filter.1.Name=tag%3AName&Filter.1.Value.1=DS-BOX12&Version=2014-10-01
[DEBUG   ] AWS Response Status Code: 200
[ERROR   ] There was an error destroying machines:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/salt/cloud/cli.py", line 210, in run
    ret = mapper.destroy(names, cached=True)
  File "/usr/lib/python2.7/site-packages/salt/cloud/__init__.py", line 1015, in destroy
    ret = self.clouds[fun](name)
  File "/usr/lib/python2.7/site-packages/salt/cloud/clouds/ec2.py", line 3368, in destroy
    rename(name, kwargs={'newname': newname}, call='action')
  File "/usr/lib/python2.7/site-packages/salt/cloud/clouds/ec2.py", line 3307, in rename
    set_tags(name, {'Name': kwargs['newname']}, call='action')
  File "/usr/lib/python2.7/site-packages/salt/cloud/clouds/ec2.py", line 3124, in set_tags
    instance_id = _get_node(name=name, instance_id=None, location=location)['instanceId']
  File "/usr/lib/python2.7/site-packages/salt/cloud/clouds/ec2.py", line 3532, in _get_node
    return next(iter(instance_info))
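
The traceback above is cut off before the exception name. One plausible reading, and it is only an assumption, is that the last DescribeInstances call (filtered by tag:Name) matched nothing, leaving `instance_info` empty; in that case `next(iter(...))` raises StopIteration. The sketch below reproduces that behaviour in isolation; `first_or_none` is a hypothetical helper, not Salt code.

```python
def first_or_none(instance_info):
    """Return the first item, or None when the lookup matched nothing."""
    # Passing a default to next() avoids StopIteration on an empty result.
    return next(iter(instance_info), None)


# An empty result, as would happen if the request went to the wrong account.
empty_result = []

try:
    next(iter(empty_result))
    raised = False
except StopIteration:
    raised = True

print("bare next() raised StopIteration:", raised)
print("guarded lookup returned:", first_or_none(empty_result))
```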

I've tried:

  • Splitting the single provider's file into many single files,
  • Removing one provider at a time to check if one was causing the issue
  • Removing all the profiles (2 very basic profiles, no map files yet)
  • Using proper credentials instead of 'use-instance-role-credentials'

I found this out while trying to delete an instance created with salt-cloud: Salt couldn't find it, but it did once I left only that provider in /etc/salt/cloud.providers.d/aws.conf.

Setup

OS: CentOS Linux release 7.7.1908 (Core)
salt-cloud version: salt-cloud 2019.2.2 (Fluorine)

Providers in /etc/salt/cloud.providers.d/aws.conf:

pr1:
  driver: ec2
  id: 'use-instance-role-credentials'
  key: 'use-instance-role-credentials'
  role_arn: arn:aws:iam::3262362363262:role/OrganizationAccountAccessRole
  private_key: /root/.ssh/id_rsa
  keyname: administrator
  ssh_username: root
  location: eu-west-1
  minion:
    master: saltsrv.wallawalla.net

pr2:
  driver: ec2
  id: 'use-instance-role-credentials'
  key: 'use-instance-role-credentials'
  role_arn: arn:aws:iam::552352352353:role/OrganizationAccountAccessRole
  private_key: /root/.ssh/id_rsa
  keyname: administrator
  location: us-east-1
  minion:
    master: saltsrv.wallawalla.net

pr3:
  driver: ec2
  id: 'use-instance-role-credentials'
  key: 'use-instance-role-credentials'
  role_arn: arn:aws:iam::124241241242:role/OrganizationAccountAccessRole
  private_key: /root/.ssh/id_rsa
  keyname: administrator
  ssh_username: root
  location: eu-west-1
  minion:
    master: saltsrv.wallawalla.net

pr4:
  driver: ec2
  id: 'use-instance-role-credentials'
  key: 'use-instance-role-credentials'
  role_arn: arn:aws:iam::1241244124212:role/OrganizationAccountAccessRole
  private_key: /root/.ssh/id_rsa
  keyname: administrator
  ssh_username: root
  location: eu-west-1
  minion:
    master: saltsrv.wallawalla.net

pr5:
  driver: ec2
  id: 'use-instance-role-credentials'
  key: 'use-instance-role-credentials'
  role_arn: arn:aws:iam::31233123442:role/OrganizationAccountAccessRole
  private_key: /root/.ssh/id_rsa
  keyname: administrator
  ssh_username: root
  location: eu-west-1
  minion:
    master: saltsrv.wallawalla.net
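
To compare the configuration with what the debug log shows, the role ARN of each provider can be pulled out of the file. This is a rough sketch, assuming the flat layout shown above (top-level provider names, indented keys); `providers_by_role` is a hypothetical helper, and a real YAML parser would be more robust.

```python
import re

# Shortened copy of the first two providers from the configuration above.
CONFIG = """\
pr1:
  driver: ec2
  role_arn: arn:aws:iam::3262362363262:role/OrganizationAccountAccessRole
  location: eu-west-1

pr2:
  driver: ec2
  role_arn: arn:aws:iam::552352352353:role/OrganizationAccountAccessRole
  location: us-east-1
"""

TOP_LEVEL_RE = re.compile(r"^([A-Za-z0-9_-]+):\s*$")
ROLE_ARN_RE = re.compile(r"^\s+role_arn:\s*(\S+)")


def providers_by_role(text):
    """Map each top-level provider name to its role_arn value."""
    result = {}
    current = None
    for line in text.splitlines():
        top = TOP_LEVEL_RE.match(line)
        if top:
            current = top.group(1)
            continue
        arn = ROLE_ARN_RE.match(line)
        if arn and current is not None:
            result[current] = arn.group(1)
    return result


for name, arn in providers_by_role(CONFIG).items():
    print(name, arn)
```

Every ARN listed here would be expected to appear in an "Assuming the role" line of the -l debug output for salt-cloud -Q.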

Steps to Reproduce Issue

Run salt-cloud --list-providers to check the providers, then
salt-cloud -Q to query all the instances. The expected behaviour is to see all instances from all providers listed. Instead, I get instances from only 2 providers.

Versions Report

Salt Version:
           Salt: 2019.2.2
 
Dependency Versions:
           cffi: 1.6.0
       cherrypy: unknown
       dateutil: 2.8.0
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.6
   mysql-python: Not Installed
      pycparser: 2.14
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.5 (default, Aug  7 2019, 00:51:29)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4
 
System Versions:
           dist: centos 7.7.1908 Core
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-862.3.2.el7.x86_64
         system: Linux
        version: CentOS Linux 7.7.1908 Core
@Ch3LL
Contributor

Ch3LL commented Dec 20, 2019

ping @saltstack/team-cloud any ideas here?

@Ch3LL Ch3LL added Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged and removed needs-triage labels Dec 20, 2019
@Ch3LL Ch3LL added this to the Blocked milestone Dec 20, 2019
@stale

stale bot commented Jan 19, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

@stale

stale bot commented Jan 22, 2020

Thank you for updating this issue. It is no longer marked as stale.

@stale stale bot removed the stale label Jan 22, 2020
@Akm0d
Contributor

Akm0d commented Jan 22, 2020

Does salt-cloud --list-providers list all the configured providers properly?

@nnsense
Author

nnsense commented Jan 22, 2020

salt-cloud --list-providers correctly reports all the providers. When I run salt-cloud -Q, only instances from 2 providers, apparently picked at random, are shown.

I've just tested it, and it's still the case after the latest update:

salt-master-2019.2.3-1.el7.noarch
salt-cloud-2019.2.3-1.el7.noarch
salt-2019.2.3-1.el7.noarch
salt-api-2019.2.3-1.el7.noarch
salt-syndic-2019.2.3-1.el7.noarch
salt-minion-2019.2.3-1.el7.noarch
salt-repo-latest-2.el7.noarch
salt-ssh-2019.2.3-1.el7.noarch

Hey, thanks for looking into this ;)

@stale

stale bot commented Feb 21, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

@stale stale bot added the stale label Feb 21, 2020
@sagetherage
Contributor

@Akm0d can you follow up here?

@stale

stale bot commented Feb 21, 2020

Thank you for updating this issue. It is no longer marked as stale.

@stale stale bot removed the stale label Feb 21, 2020
@stale

stale bot commented Mar 22, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.

@stale stale bot added the stale label Mar 22, 2020
@nnsense
Author

nnsense commented Mar 29, 2020

Up

@stale

stale bot commented Mar 29, 2020

Thank you for updating this issue. It is no longer marked as stale.

@dhruv-malik-ptc

dhruv-malik-ptc commented May 27, 2021

Same issue here while deploying cross-account EC2 instances. It successfully assumes the cross-account role and is able to describe the subnet, but on the second call, when creating the network interface, it fails to find the subnet in that account, even though the subnet exists.

[DEBUG   ] https://sts.amazonaws.com:443 "GET /?Action=AssumeRole&DurationSeconds=3600&Policy=%7B%22Version%22%3A%222012-10-17%22%2C%22Statement%22%3A%5B%7B%22Sid%22%3A%22Stmt1%22%2C%20%22Effect%22%3A%22Allow%22%2C%22Action%22%3A%22%2A%22%2C%22Resource%22%3A%22%2A%22%7D%5D%7D&RoleArn=arn%3Aaws%3Aiam%3A%3A047306228716%3Arole%2Fcross-account-role&RoleSessionName=salt-master-dev&Version=2011-06-15 HTTP/1.1" 200 901
[DEBUG   ] AWS Request: https://ec2.ap-southeast-2.amazonaws.com/?Action=DescribeSubnets&Version=2014-10-01
[DEBUG   ] Starting new HTTPS connection (1): ec2.ap-southeast-2.amazonaws.com:443
[DEBUG   ] https://ec2.ap-southeast-2.amazonaws.com:443 "GET /?Action=DescribeSubnets&Version=2014-10-01 HTTP/1.1" 200 None
[DEBUG   ] AWS Response Status Code: 200
[DEBUG   ] Using AWS endpoint: ec2.ap-southeast-2.amazonaws.com
[DEBUG   ] AWS Request: https://ec2.ap-southeast-2.amazonaws.com/?Action=CreateNetworkInterface&PrivateIpAddresses.0.Primary=true&SecurityGroupId.0=sg-80947382094chaskasj&SubnetId=subnet-asdsadasdasdasd231231232131&Version=2014-10-01
[DEBUG   ] Starting new HTTPS connection (1): ec2.ap-southeast-2.amazonaws.com:443
[DEBUG   ] https://ec2.ap-southeast-2.amazonaws.com:443 "GET /?Action=CreateNetworkInterface&PrivateIpAddresses.0.Primary=true&SecurityGroupId.0=sg-80947382094chaskasj&SubnetId=subnet-asdsadasdasdasd231231232131&Version=2014-10-01 HTTP/1.1" 400 None
[DEBUG   ] AWS Response Status Code: 400
[ERROR   ] AWS Response Status Code and Error: [400 400 Client Error: Bad Request for url: https://ec2.ap-southeast-2.amazonaws.com/?Action=CreateNetworkInterface&PrivateIpAddresses.0.Primary=true&SecurityGroupId.0=sg-80947382094chaskasj&SubnetId=subnet-asdsadasdasdasd231231232131&Version=2014-10-01] {'Errors': {'Error': {'Code': 'InvalidSubnetID.NotFound', 'Message': "The subnet ID 'subnet-asdsadasdasdasd231231232131' does not exist"}}, 'RequestID': 'e98ec8c0-e2ed-4544-9f16-20c3a1aa505c'}
[ERROR   ] Failed to create VM salt-deploy-test. Configuration value 1 needs to be set
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/salt/cloud/__init__.py", line 1228, in create
    output = self.clouds[func](vm_)
  File "/usr/lib/python3.6/site-packages/salt/loader.py", line 1235, in __call__
    return self.loader.run(run_func, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/salt/loader.py", line 2268, in run
    return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
    return callable(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/salt/loader.py", line 2283, in _run_as
    return _func_or_method(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/salt/cloud/clouds/ec2.py", line 2658, in create
    data, vm_ = request_instance(vm_, location)
  File "/usr/lib/python3.6/site-packages/salt/cloud/clouds/ec2.py", line 1927, in request_instance
    _new_eni = _create_eni_if_necessary(interface, vm_)
  File "/usr/lib/python3.6/site-packages/salt/cloud/clouds/ec2.py", line 1473, in _create_eni_if_necessary
    eni_desc = result[1]
KeyError: 1
Error: There was a profile error: Failed to deploy VM
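
The `KeyError: 1` at `eni_desc = result[1]` is consistent with the query returning an error payload rather than a positional result after the 400 response. The sketch below reproduces that failure in isolation, assuming (this is not confirmed by the log) that the error comes back as a dict keyed by "error"; `eni_description` is a hypothetical guard, not Salt code.

```python
# Assumed shape of the result after the 400 response; the exact structure is a guess.
error_result = {
    "error": {"Errors": {"Error": {"Code": "InvalidSubnetID.NotFound"}}}
}

try:
    eni_desc = error_result[1]  # a dict has no key 1, so this raises KeyError
    message = "no error"
except KeyError as exc:
    message = "KeyError: {}".format(exc)

print(message)


def eni_description(result):
    """Return the ENI description, or None if the query reported an error."""
    if isinstance(result, dict) and "error" in result:
        return None
    return result[1]


print(eni_description(error_result))
```

This would also explain the misleading "Configuration value 1 needs to be set" message: the missing key is reported as if it were a missing configuration value, while the underlying failure is the InvalidSubnetID.NotFound response.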

Versions report

Salt Version:
            Salt: 3003

Dependency Versions:
 Apache Libcloud: 2.2.1
            cffi: 1.14.5
        cherrypy: 5.6.0
        dateutil: Not Installed
       docker-py: Not Installed
           gitdb: Not Installed
       gitpython: Not Installed
          Jinja2: 2.10.3
         libgit2: 1.1.0
        M2Crypto: 0.35.2
            Mako: Not Installed
         msgpack: 0.6.2
    msgpack-pure: Not Installed
    mysql-python: Not Installed
       pycparser: 2.20
        pycrypto: Not Installed
    pycryptodome: 3.10.1
          pygit2: 1.5.0
          Python: 3.6.8 (default, Nov 16 2020, 16:55:22)
    python-gnupg: Not Installed
          PyYAML: 3.13
           PyZMQ: 17.0.0
           smmap: Not Installed
         timelib: Not Installed
         Tornado: 4.5.3
             ZMQ: 4.1.4

System Versions:
            dist: centos 7 Core
          locale: UTF-8
         machine: x86_64
         release: 4.14.231-173.361.amzn2.x86_64
          system: Linux
         version: CentOS Linux 7 Core

@lallish

lallish commented Jan 31, 2024

Did you find a solution?
Same issue here on Salt 3006.1: retrieving information from the assumed role's account works fine, but it fails as soon as it attempts to create the network interface:

return:
                 Exception occurred in runner cloud.create: Traceback (most recent call last):
                   File "/usr/local/lib/python3.6/site-packages/salt/client/mixins.py", line 388, in low
                     data["return"] = func(*args, **kwargs)
                   File "/usr/local/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
                     return self.loader.run(run_func, *args, **kwargs)
                   File "/usr/local/lib/python3.6/site-packages/salt/loader/lazy.py", line 1232, in run
                     return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                   File "/usr/local/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
                     return callable(*args, **kwargs)
                   File "/usr/local/lib/python3.6/site-packages/salt/loader/lazy.py", line 1247, in _run_as
                     return _func_or_method(*args, **kwargs)
                   File "/usr/local/lib/python3.6/site-packages/salt/runners/cloud.py", line 187, in create
                     info = client.create(provider, instances, **salt.utils.args.clean_kwargs(**kwargs))
                   File "/usr/local/lib/python3.6/site-packages/salt/cloud/__init__.py", line 414, in create
                     ret[name] = salt.utils.data.simple_types_filter(mapper.create(vm_))
                   File "/usr/local/lib/python3.6/site-packages/salt/cloud/__init__.py", line 1226, in create
                     output = self.clouds[func](vm_)
                   File "/usr/local/lib/python3.6/site-packages/salt/loader/lazy.py", line 149, in __call__
                     return self.loader.run(run_func, *args, **kwargs)
                   File "/usr/local/lib/python3.6/site-packages/salt/loader/lazy.py", line 1232, in run
                     return self._last_context.run(self._run_as, _func_or_method, *args, **kwargs)
                   File "/usr/local/lib/python3.6/site-packages/contextvars/__init__.py", line 38, in run
                     return callable(*args, **kwargs)
                   File "/usr/local/lib/python3.6/site-packages/salt/loader/lazy.py", line 1247, in _run_as
                     return _func_or_method(*args, **kwargs)
                   File "/usr/local/lib/python3.6/site-packages/salt/cloud/clouds/ec2.py", line 2669, in create
                     data, vm_ = request_instance(vm_, location)
                   File "/usr/local/lib/python3.6/site-packages/salt/cloud/clouds/ec2.py", line 1935, in request_instance
                     _new_eni = _create_eni_if_necessary(interface, vm_)
                   File "/usr/local/lib/python3.6/site-packages/salt/cloud/clouds/ec2.py", line 1446, in _create_eni_if_necessary
                     "No such subnet <{}>".format(interface.get("SubnetId"))
                 salt.exceptions.SaltCloudConfigError: No such subnet <subnet-0e9ffa3f1235123baa0>
