
Using cloud_provider 'OpenStack' fails to deploy cluster - fails at populate inventory into hosts file task #923

Closed
sc68cal opened this issue Jan 19, 2017 · 10 comments

Comments

@sc68cal (Contributor) commented Jan 19, 2017

Environment:

  • Cloud provider or hardware configuration: OpenStack

  • OS: OpenStack instances are Ubuntu 16.04, my laptop is OS X 10.10

  • Version of Ansible (ansible --version): ansible 2.1.0.0

Kargo version (commit) (git rev-parse --short HEAD): 5420fa9

Network plugin used: flannel

Copy of your inventory file:

# ## Configure 'ip' variable to bind kubernetes services on a
# ## different ip than the default iface
 kargo-node-1 ansible_ssh_host=172.18.237.212 ansible_user=ubuntu
 kargo-node-2 ansible_ssh_host=172.18.237.220 ansible_user=ubuntu
 kargo-node-3 ansible_ssh_host=172.18.237.214 ansible_user=ubuntu
 kargo-node-4 ansible_ssh_host=172.18.237.22 ansible_user=ubuntu
 kargo-node-5 ansible_ssh_host=172.18.237.211 ansible_user=ubuntu

# ## configure a bastion host if your nodes are not directly reachable
# bastion ansible_ssh_host=x.x.x.x

[kube-master]
 kargo-node-1
 kargo-node-2

[etcd]
 kargo-node-1
 kargo-node-2
 kargo-node-3

[kube-node]
 kargo-node-2
 kargo-node-3
 kargo-node-4
 kargo-node-5

[k8s-cluster:children]
 kube-node
 kube-master

Command used to invoke ansible:

ansible-playbook -b -v -K -i inventory/inventory.yml cluster.yml

Output of ansible run:

https://gist.github.com/sc68cal/610576854cddf8cf135cc813e7784138

tl;dr:

TASK [kubernetes/preinstall : Hosts | populate inventory into hosts file] ******
fatal: [kargo-node-2]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute 'ansible_default_ipv4'"}
fatal: [kargo-node-3]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute 'ansible_default_ipv4'"}
fatal: [kargo-node-5]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute 'ansible_default_ipv4'"}
fatal: [kargo-node-1]: FAILED! => {"failed": true, "msg": "'dict object' has no attribute 'ansible_default_ipv4'"}
to retry, use: --limit @cluster.retry
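
This error means hostvars for at least one host is missing the ansible_default_ipv4 fact, either because facts were never gathered for it (an unreachable node, as turns out to be the case later in this thread) or because the instance has no default IPv4 route. A minimal check, not part of the original report, is to query the setup module against the same inventory:

# hypothetical troubleshooting command, reusing the inventory path from this issue
ansible -i inventory/inventory.yml -m setup all -a 'filter=ansible_default_ipv4'
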
@sc68cal (Contributor, Author) commented Jan 19, 2017

This is while following the OpenStack guide.

@bogdando (Contributor) commented Jan 20, 2017

The workaround is to define ip= for the nodes, so that services bind to that address instead of ansible_default_ipv4.
Also try #769 (comment) and check #769 (comment)

@bogdando (Contributor)

May be a dup of #212

@sc68cal (Contributor, Author) commented Jan 20, 2017

I did change my inventory file to add ip, and it did not change the outcome. I will review the other suggestions you made and report back. Thanks for the reply.

# ## Configure 'ip' variable to bind kubernetes services on a
# ## different ip than the default iface
 kargo-node-1 ansible_ssh_host=172.18.237.212 ansible_user=ubuntu ip=10.0.0.5
 kargo-node-2 ansible_ssh_host=172.18.237.220 ansible_user=ubuntu ip=10.0.0.7
 kargo-node-3 ansible_ssh_host=172.18.237.214 ansible_user=ubuntu ip=10.0.0.6
 kargo-node-4 ansible_ssh_host=172.18.237.22 ansible_user=ubuntu ip=10.0.0.9
 kargo-node-5 ansible_ssh_host=172.18.237.211 ansible_user=ubuntu ip=10.0.0.8

# ## configure a bastion host if your nodes are not directly reachable
# bastion ansible_ssh_host=x.x.x.x

[kube-master]
 kargo-node-1
 kargo-node-2

[etcd]
 kargo-node-1
 kargo-node-2
 kargo-node-3

[kube-node]
 kargo-node-2
 kargo-node-3
 kargo-node-4
 kargo-node-5

[k8s-cluster:children]
 kube-node
 kube-master

@sc68cal (Contributor, Author) commented Jan 20, 2017

kargo-node-4 has been unreachable for some reason; let me also try removing that node from the inventory and retrying.
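
As a quick sanity check before editing the inventory (a sketch, not from the original thread), an ad-hoc ping shows whether the node answers at all; an unreachable host never gets facts gathered, which matches the ansible_default_ipv4 error above:

# hypothetical reachability check
ansible -i inventory/inventory.yml -m ping kargo-node-4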

@sc68cal (Contributor, Author) commented Jan 20, 2017

OK, so commenting out kargo-node-4 let me continue onward, but then the following task timed out:


TASK [network_plugin/flannel : Flannel | Create flannel pod manifest] **********
changed: [kargo-node-5] => {"changed": true, "checksum": "c09162109e175060ed6ba48154955f9fdc41c757", "dest": "/etc/kubernetes/manifests/flannel-pod.manifest", "gid": 0, "group": "root", "md5sum": "96a7eec2682eba943686e71bd41de044", "mode": "0644", "owner": "root", "size": 1314, "src": "/home/ubuntu/.ansible/tmp/ansible-tmp-1484931505.62-125587897227718/source", "state": "file", "uid": 0}
changed: [kargo-node-1] => {"changed": true, "checksum": "47a9f63e600194803b1b835f437da29acb2d650f", "dest": "/etc/kubernetes/manifests/flannel-pod.manifest", "gid": 0, "group": "root", "md5sum": "622d02ed7b2e93c2dded61eb283dfe5c", "mode": "0644", "owner": "root", "size": 1314, "src": "/home/ubuntu/.ansible/tmp/ansible-tmp-1484931505.62-241361757177572/source", "state": "file", "uid": 0}
changed: [kargo-node-2] => {"changed": true, "checksum": "3ab1f23fb3ef9c867dffc74fbaa39f87bf8b9581", "dest": "/etc/kubernetes/manifests/flannel-pod.manifest", "gid": 0, "group": "root", "md5sum": "875f1093562db65aff347969685489e0", "mode": "0644", "owner": "root", "size": 1314, "src": "/home/ubuntu/.ansible/tmp/ansible-tmp-1484931505.6-125678261606678/source", "state": "file", "uid": 0}
changed: [kargo-node-3] => {"changed": true, "checksum": "bb43d07d1b1fa932a40a1d2535f7b5d961b95a23", "dest": "/etc/kubernetes/manifests/flannel-pod.manifest", "gid": 0, "group": "root", "md5sum": "d1f02c8db88d322313f4d5ec9b33b9f3", "mode": "0644", "owner": "root", "size": 1314, "src": "/home/ubuntu/.ansible/tmp/ansible-tmp-1484931505.61-268195622979648/source", "state": "file", "uid": 0}

TASK [network_plugin/flannel : Flannel | Wait for flannel subnet.env file presence] ***
fatal: [kargo-node-2]: FAILED! => {"changed": false, "elapsed": 600, "failed": true, "msg": "Timeout when waiting for file /run/flannel/subnet.env"}
fatal: [kargo-node-3]: FAILED! => {"changed": false, "elapsed": 600, "failed": true, "msg": "Timeout when waiting for file /run/flannel/subnet.env"}
fatal: [kargo-node-5]: FAILED! => {"changed": false, "elapsed": 600, "failed": true, "msg": "Timeout when waiting for file /run/flannel/subnet.env"}
fatal: [kargo-node-1]: FAILED! => {"changed": false, "elapsed": 600, "failed": true, "msg": "Timeout when waiting for file /run/flannel/subnet.env"}
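
/run/flannel/subnet.env is written by flanneld only after it has read its network configuration from etcd, so a timeout here usually means the flannel container itself is failing to start or to reach etcd. A rough way to see why (a sketch, assuming flannel runs as a Docker container launched from the static pod manifest created above):

# hypothetical commands to run on one of the failing nodes
docker ps -a | grep flannel
docker logs $(docker ps -aq --filter name=flannel | head -n 1)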

@sc68cal (Contributor, Author) commented Jan 20, 2017

Creating security groups with very permissive rules to allow node-to-node communication, and specifically following the recommendation to open the etcd ports, did not change the outcome.
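
For reference, a fully permissive node-to-node rule set can be expressed with the openstack CLI roughly as follows (a sketch; kargo-sg and 10.0.0.0/24 are placeholder values, not taken from this issue):

# hypothetical security group rules allowing all TCP/UDP between nodes
openstack security group rule create --protocol tcp --remote-ip 10.0.0.0/24 kargo-sg
openstack security group rule create --protocol udp --remote-ip 10.0.0.0/24 kargo-sg
# etcd client and peer ports specifically
openstack security group rule create --protocol tcp --dst-port 2379:2380 --remote-ip 10.0.0.0/24 kargo-sg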

@bogdando (Contributor)

Note: the given error message only points to etcd cluster issues.
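
A basic etcd health check on one of the etcd nodes would look roughly like this (a sketch, assuming etcd listens on the default client port 2379 without TLS; adjust the endpoint and etcdctl flags if the deployment uses certificates):

# hypothetical health checks run on an etcd node
etcdctl cluster-health
curl -s http://127.0.0.1:2379/health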

@nniehoff (Contributor) commented Mar 2, 2017

I have the same issue deploying to AWS (GovCloud, which has its own issues anyway). I am trying to deploy 3 etcd nodes, and all 3 fail with this same error.

@ant31 (Contributor) commented Aug 15, 2018

Stale. Join the #kubespray channel if you require additional help.

@ant31 closed this as completed Aug 15, 2018