Builder fails to resolve host #63
Comments
Hmmm, this is obviously a networking or DNS problem. Does it happen intermittently or every time?
It is happening every time. I deleted the builder pod and the same issue occurs in the new pod as well.
Is Kubernetes responsible for DNS resolution, or the builder itself?
It looks like there is outside internet connectivity from inside the builder pod, but you may be dealing with a lack of connectivity from the dockerbuilder pod. Can you add a curl command to the Dockerfile to curl some website? Try:
Something may be wrong with the way kube-dns or the CNI is configured.
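As a rough companion to the curl suggestion above, the same "can this environment resolve a name?" question can be asked from Python. This is only a minimal sketch: `resolve_host` is a hypothetical helper (not part of Hephy or the builder), and the injectable `resolver` argument exists purely so the function can be exercised without network access.

```python
import socket


def resolve_host(hostname, resolver=socket.getaddrinfo):
    """Return the sorted unique IP addresses for hostname, or [] if resolution fails.

    `resolver` defaults to socket.getaddrinfo; it can be replaced with a stub
    for testing (an assumption made for this sketch, not a Hephy convention).
    """
    try:
        results = resolver(hostname, None)
    except socket.gaierror:
        # Raised by getaddrinfo when the name cannot be resolved.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({entry[4][0] for entry in results})
```

Running `resolve_host("deb.debian.org")` once inside the builder pod and once from a `RUN python3 -c ...` step in the Dockerfile would show whether only the build container fails to resolve the name.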
This also seems to be failing. Any thoughts on how to troubleshoot it?
This seems to happen only with Docker builds. I am able to run this app on the cluster: https://github.com/teamhephy/example-python-django. It uses only a Procfile, without any Dockerfile. However, this app, which has a Dockerfile, fails to build: https://github.com/teamhephy/helloworld
This is likely an issue with the CNI: the docker interface is not able to reach the outside interface for whatever reason. It could be iptables or a firewall. How are you running Kubernetes? What is the infrastructure underneath Kubernetes, which cloud provider, and which networking CNI are you using?
AWS EKS - 1 master + 1 worker node (m5.large), managed using eksctl.
It is using the default amazon-k8s-cni.
So this all looks good. I think your problem is the security group or iptables rules on the nodes not allowing requests out to 0.0.0.0/0. Can you verify that the security group of the EKS nodes has an Outbound rule to ALLOW ALL to 0.0.0.0/0?
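The outbound-rule check described above can be sketched as a small pure function. It assumes the security group is a dict in the shape returned by the EC2 `DescribeSecurityGroups` API (for example, one element of boto3's `describe_security_groups()["SecurityGroups"]`); `allows_all_outbound` is a hypothetical name introduced here for illustration, and the sketch only looks at IPv4 ranges.

```python
def allows_all_outbound(security_group):
    """Return True if the group has an egress rule for all protocols to 0.0.0.0/0.

    `security_group` is assumed to follow the EC2 DescribeSecurityGroups
    response shape: egress rules live under "IpPermissionsEgress", and an
    IpProtocol of "-1" means "all protocols".
    """
    for rule in security_group.get("IpPermissionsEgress", []):
        if rule.get("IpProtocol") != "-1":
            continue  # rule is limited to one protocol, so it is not ALLOW ALL
        ranges = rule.get("IpRanges", [])
        if any(r.get("CidrIp") == "0.0.0.0/0" for r in ranges):
            return True
    return False
```

A `True` result only confirms that the security group is not the blocker; node-level iptables rules would still have to be checked separately.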
Thanks, @Cryptophobia. There is some issue with the quay.io images and the hephy installation is failing. Once it is resolved, I will check this.
Outbound rules on all nodes seem to be set correctly. Also, if the outbound rule were the problem, shouldn't pip install fail when setting up https://github.com/teamhephy/example-python-django? Is there any debug flag that can be set to see more verbose output?
Yes @ChillarAnand, this is correct. If The problem here is that the builder, when doing Heroku buildpacks, runs networking inside its own container, while when building Dockerfiles with
Can you enable logging on the builder by setting the DEBUG env variable on the builder?
After setting It printed out pod spec and failed at
@ChillarAnand just out of curiosity, if you're using CNI did you also enable this value at Workflow install time:
I don't really understand how CNI affects the topology of the cluster, but this has resolved networking issues on some cluster providers for me before. It's one of the highlights on https://web.teamhephy.com/ (see instructions for DigitalOcean at the bottom)
Totally forgot about this. Yes, thank you @kingdonb! Might want to try that global flag when installing/upgrading Hephy Workflow as well. Now that I think about it, the More info about this flag: https://docs.teamhephy.com/managing-workflow/production-deployments/#using-on-cluster-registry-with-cni
Thanks, @kingdonb. In a new cluster, I have installed hephy with the following command and the build is still failing with the same error.
Not sure what is causing the issue.
Hi @ChillarAnand, I believe you figured this out in our Slack community, so I am going to close it for now.
After git push, the docker build fails to resolve deb.debian.org.
There are no errors in the builder pod logs.
If I ssh into the pod and try to resolve it, it works.
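Since resolution works in the pod but fails during the docker build, one way to narrow this down is to compare the resolver configuration seen in each environment. The following is a minimal sketch, assuming both environments expose a standard `/etc/resolv.conf`; `parse_resolv_conf` is a hypothetical helper, and nothing in this thread confirms that a resolver mismatch is the actual cause.

```python
def parse_resolv_conf(text):
    """Parse resolv.conf-style text into nameservers, search domains and options."""
    config = {"nameserver": [], "search": [], "options": []}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines (which start with '#' or ';').
        if not line or line[0] in "#;":
            continue
        key, *values = line.split()
        if key == "nameserver" and values:
            config["nameserver"].append(values[0])
        elif key == "search":
            config["search"] = values  # the last search line wins
        elif key == "options":
            config["options"].extend(values)
    return config
```

Feeding it the contents of `/etc/resolv.conf` from the builder pod and from a `RUN cat /etc/resolv.conf` step in the Dockerfile makes any difference in nameservers easy to spot.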