Deployments now fails due to failure in DNS resolution #3

Closed
mglantz opened this Issue Jan 4, 2017 · 5 comments

mglantz commented Jan 4, 2017

I'm not sure why, but deployments of non-HA clusters now fail for me due to a DNS resolution issue (I have only tried non-HA, though I guess this issue affects all sorts of deployments). When I add the infra and node server names to /etc/hosts, deployOpenShift triggers a successful Ansible run and the cluster is installed properly. Is this something you see as well?

mglantz commented Jan 4, 2017

Looking at the master's DNS config, it seems bound to fail, but this did not happen when I tried earlier, so I'm not sure what has changed. Perhaps it is because it's now running on Red Hat Enterprise Linux 7.3.


[root@ocpnnmaster ~]# grep ocpnnnode -r /etc
/etc/ansible/hosts:ocpnnnode-0 openshift_node_labels="{'region': 'nodes', 'zone': 'default'}"

[root@ocpnnmaster ~]# cat /etc/resolv.conf 
# Generated by NetworkManager
search mkszgepxlf5ehigwksgcr2qcha.ax.internal.cloudapp.net
nameserver 168.63.129.16

[root@ocpnnmaster ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

[root@ocpnnmaster ~]# ping ocpnnnode-0.mkszgepxlf5ehigwksgcr2qcha.ax.internal.cloudapp.net
ping: ocpnnnode-0.mkszgepxlf5ehigwksgcr2qcha.ax.internal.cloudapp.net: Name or service not known
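
The same check can be repeated for several names at once. This is a minimal sketch, assuming `getent` is available on the host; the function name `check_resolves` is made up for illustration and is not part of the template.

```shell
# Report which hostnames fail to resolve through the system resolver
# (/etc/hosts plus the nameservers listed in /etc/resolv.conf).
# Prints each unresolved name and returns non-zero if any lookup failed.
check_resolves() {
    local failed=0 name
    for name in "$@"; do
        if ! getent hosts "$name" > /dev/null; then
            echo "unresolved: $name"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `check_resolves ocpnnnode-0.mkszgepxlf5ehigwksgcr2qcha.ax.internal.cloudapp.net` should print the name as unresolved while the problem persists.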

mglantz commented Jan 4, 2017

It seems the IPs can be predicted: 192.168.2.4 and up will be the nodes, with the infra node at the end. But this must have worked before, without hosts-file hacks.
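
As a stopgap, the hosts-file entries could be generated rather than typed by hand. This is only a sketch under the assumptions above, none of them verified: nodes get consecutive addresses starting at 192.168.2.4 and the infra node gets the next one. The function name `gen_hosts_entries` and the default infra hostname `ocpnninfra-0` are hypothetical placeholders; only the `ocpnnnode` prefix comes from the output shown earlier.

```shell
# Print /etc/hosts lines for N nodes followed by one infra node.
# Assumption: nodes are numbered from 0 and take 192.168.2.4, .5, ...;
# the infra node takes the address right after the last node.
# Arguments: node count, DNS search domain, optional node prefix,
# optional infra hostname (the default is a hypothetical placeholder).
gen_hosts_entries() {
    local node_count="$1" domain="$2"
    local node_prefix="${3:-ocpnnnode}" infra_name="${4:-ocpnninfra-0}"
    local i=0
    while [ "$i" -lt "$node_count" ]; do
        printf '192.168.2.%d %s-%d.%s %s-%d\n' \
            "$((4 + i))" "$node_prefix" "$i" "$domain" "$node_prefix" "$i"
        i=$((i + 1))
    done
    printf '192.168.2.%d %s.%s %s\n' \
        "$((4 + node_count))" "$infra_name" "$domain" "$infra_name"
}
```

The output is meant to be reviewed against the addresses actually assigned before anything is appended to /etc/hosts.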

Owner

haroldwongms commented Jan 4, 2017

I have now received numerous emails that the template is failing. I will look into this and get back to you as this was working fine last month.

Owner

haroldwongms commented Jan 5, 2017

It seems there is a known issue with DNS resolution on newly deployed RHEL 7.2 / 7.3 images. The Azure engineering team is working on a resolution for this.

Until that is corrected, I will not be testing the templates any further at this time.

I am closing this issue for now.

mglantz commented Jan 5, 2017

Thanks for the feedback, sounds serious. Hoping for a quick resolution.
