Builder fails to resolve host #63

Closed
ChillarAnand opened this issue Apr 17, 2020 · 16 comments

@ChillarAnand

ChillarAnand commented Apr 17, 2020

After git push, the Docker build fails to resolve deb.debian.org:

 ---> Using cache
 ---> 5e8494e15701
Step 4/19 : RUN apt-get update &&     apt-get install -y apt-transport-https ca-certificates vim &&     curl -sS https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - &&     echo "deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list &&     rm -rf /var/lib/apt/lists/*
 ---> Running in cad63ad42fd8
Err:1 http://deb.debian.org/debian buster InRelease
  Temporary failure resolving 'deb.debian.org'
Err:2 http://security.debian.org/debian-security buster/updates InRelease
  Temporary failure resolving 'security.debian.org'
Err:3 http://deb.debian.org/debian buster-updates InRelease
  Temporary failure resolving 'deb.debian.org'
Reading package lists...
W: Failed to fetch http://deb.debian.org/debian/dists/buster/InRelease  Temporary failure resolving 'deb.debian.org'
W: Failed to fetch http://security.debian.org/debian-security/dists/buster/updates/InRelease  Temporary failure resolving 'security.debian.org'
W: Failed to fetch http://deb.debian.org/debian/dists/buster-updates/InRelease  Temporary failure resolving 'deb.debian.org'
W: Some index files failed to download. They have been ignored, or old ones used instead.
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package vim
The command '/bin/sh -c apt-get update &&     apt-get install -y apt-transport-https ca-certificates vim &&     curl -sS https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - &&     echo "deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list &&     rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100
remote: 2020/04/17 13:42:22 Error running git receive hook [Build pod exited with code 1, stopping build.]
To ssh://deis-builder.x.x.x.x.nip.io:2222/demo-server.git
 ! [remote rejected]   ENGG-3881 -> ENGG-3881 (pre-receive hook declined)
error: failed to push some refs to 'ssh://git@deis-builder.x.x.x.x.nip.io:2222/demo-server.git'

There are no errors in the builder pod logs:


deis deis-builder-57cf7db484-64x99 deis-builder Accepted connection.
deis deis-builder-57cf7db484-64x99 deis-builder Starting ssh authentication
deis deis-builder-57cf7db484-64x99 deis-builder Channel type: session
deis deis-builder-57cf7db484-64x99 deis-builder
deis deis-builder-57cf7db484-64x99 deis-builder Key='LANG', Value='C.UTF-8'
deis deis-builder-57cf7db484-64x99 deis-builder
deis deis-builder-57cf7db484-64x99 deis-builder Key='LC_ALL', Value='en_US.UTF-8'
deis deis-builder-57cf7db484-64x99 deis-builder
deis deis-builder-57cf7db484-64x99 deis-builder Key='LC_CTYPE', Value='UTF-8'
deis deis-builder-57cf7db484-64x99 deis-builder
deis deis-builder-57cf7db484-64x99 deis-builder receiving git repo name: demo-server.git, operation: git-receive-pack, fingerprint: ee:02:70:18:75:c4:23:6c:38:d6:11:13:81:4e:6a:c8, user: test
deis deis-builder-57cf7db484-64x99 deis-builder creating repo directory /home/git/demo-server.git
deis deis-builder-57cf7db484-64x99 deis-builder writing pre-receive hook under /home/git/demo-server.git
deis deis-builder-57cf7db484-64x99 deis-builder git-shell -c git-receive-pack 'demo-server.git'
deis deis-builder-57cf7db484-64x99 deis-builder Waiting for git-receive to run.
deis deis-builder-57cf7db484-64x99 deis-builder Waiting for deploy.
deis deis-builder-57cf7db484-64x99 deis-builder Deploy complete.

If I SSH into the builder pod and resolve the host there, it works:

root@deis-builder-57cf7db484-64x99:/# host deb.debian.org
deb.debian.org is an alias for debian.map.fastly.net.
debian.map.fastly.net has address 151.101.158.133
debian.map.fastly.net has IPv6 address 2a04:4e42:24::645
@Cryptophobia
Member

Hmmm, this is obviously a networking or DNS problem. Does it happen intermittently or every time?

@ChillarAnand
Author

ChillarAnand commented Apr 18, 2020

It happens every time. I deleted the builder pod, and the same issue occurs in the new pod as well.

➜  git push deis ENGG-3881
deis deis-builder-57cf7db484-64x99 deis-builder
deis deis-builder-57cf7db484-64x99 deis-builder Key='LC_ALL', Value='en_US.UTF-8'
deis deis-builder-57cf7db484-64x99 deis-builder
deis deis-builder-57cf7db484-64x99 deis-builder Key='LC_CTYPE', Value='UTF-8'
deis deis-builder-57cf7db484-64x99 deis-builder
deis deis-builder-57cf7db484-64x99 deis-builder receiving git repo name: demo-server.git, operation: git-receive-pack, fingerprint: ee:02:70:18:75:c4:23:6c:38:d6:11:13:81:4e:6a:c8, user: test
deis deis-builder-57cf7db484-64x99 deis-builder creating repo directory /home/git/demo-server.git
deis deis-builder-57cf7db484-64x99 deis-builder writing pre-receive hook under /home/git/demo-server.git
deis deis-builder-57cf7db484-64x99 deis-builder git-shell -c git-receive-pack 'demo-server.git'
deis deis-builder-57cf7db484-64x99 deis-builder Waiting for git-receive to run.
deis deis-builder-57cf7db484-64x99 deis-builder Waiting for deploy.
deis deis-builder-57cf7db484-64x99 deis-builder Deploy complete.
- deis deis-builder-57cf7db484-64x99
+ deis deis-builder-57cf7db484-hfs4v › deis-builder
deis deis-builder-57cf7db484-hfs4v deis-builder 2020/04/18 03:25:02 Starting health check server on port 8092
deis deis-builder-57cf7db484-hfs4v deis-builder 2020/04/18 03:25:02 Starting deleted app cleaner
deis deis-builder-57cf7db484-hfs4v deis-builder 2020/04/18 03:25:02 Starting SSH server on 0.0.0.0:2223
deis deis-builder-57cf7db484-hfs4v deis-builder Listening on 0.0.0.0:2223
deis deis-builder-57cf7db484-hfs4v deis-builder Accepting new connections.

deis deis-builder-57cf7db484-hfs4v deis-builder Accepted connection.
deis deis-builder-57cf7db484-hfs4v deis-builder Starting ssh authentication
deis deis-builder-57cf7db484-hfs4v deis-builder Channel type: session
deis deis-builder-57cf7db484-hfs4v deis-builder
deis deis-builder-57cf7db484-hfs4v deis-builder Key='LANG', Value='C.UTF-8'
deis deis-builder-57cf7db484-hfs4v deis-builder
deis deis-builder-57cf7db484-hfs4v deis-builder Key='LC_ALL', Value='en_US.UTF-8'
deis deis-builder-57cf7db484-hfs4v deis-builder
deis deis-builder-57cf7db484-hfs4v deis-builder Key='LC_CTYPE', Value='UTF-8'
deis deis-builder-57cf7db484-hfs4v deis-builder
deis deis-builder-57cf7db484-hfs4v deis-builder receiving git repo name: demo-server.git, operation: git-receive-pack, fingerprint: ee:02:70:18:75:c4:23:6c:38:d6:11:13:81:4e:6a:c8, user: test
deis deis-builder-57cf7db484-hfs4v deis-builder creating repo directory /home/git/demo-server.git
deis deis-builder-57cf7db484-hfs4v deis-builder writing pre-receive hook under /home/git/demo-server.git
deis deis-builder-57cf7db484-hfs4v deis-builder git-shell -c git-receive-pack 'demo-server.git'
deis deis-builder-57cf7db484-hfs4v deis-builder Waiting for git-receive to run.
deis deis-builder-57cf7db484-hfs4v deis-builder Waiting for deploy.
deis deis-builder-57cf7db484-hfs4v deis-builder Deploy complete.
---> Using cache
 ---> 5e8494e15701
Step 4/19 : RUN apt-get update &&     apt-get install -y apt-transport-https ca-certificates vim &&     curl -sS https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - &&     echo "deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list &&     rm -rf /var/lib/apt/lists/*
 ---> Running in a37b160bf44f
Err:1 http://deb.debian.org/debian buster InRelease
  Temporary failure resolving 'deb.debian.org'
Err:2 http://security.debian.org/debian-security buster/updates InRelease
  Temporary failure resolving 'security.debian.org'
Err:3 http://deb.debian.org/debian buster-updates InRelease
  Temporary failure resolving 'deb.debian.org'
Reading package lists...
W: Failed to fetch http://deb.debian.org/debian/dists/buster/InRelease  Temporary failure resolving 'deb.debian.org'
W: Failed to fetch http://security.debian.org/debian-security/dists/buster/updates/InRelease  Temporary failure resolving 'security.debian.org'
W: Failed to fetch http://deb.debian.org/debian/dists/buster-updates/InRelease  Temporary failure resolving 'deb.debian.org'
W: Some index files failed to download. They have been ignored, or old ones used instead.
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package vim
The command '/bin/sh -c apt-get update &&     apt-get install -y apt-transport-https ca-certificates vim &&     curl -sS https://www.postgresql.org/media/keys/ACCC4CF8.asc | apt-key add - &&     echo "deb http://apt.postgresql.org/pub/repos/apt/ stretch-pgdg main" | tee /etc/apt/sources.list.d/pgdg.list &&     rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100

Is Kubernetes responsible for DNS resolution, or is the builder itself?

@Cryptophobia
Member

Looks like there is outside internet connectivity from inside the builder pod, but you may be dealing with a lack of connectivity from the dockerbuilder pod. Can you add a curl command to the Dockerfile that fetches some website?

Try:

RUN curl -4 icanhazip.com

Something may be wrong with the way kube-dns or the CNI is configured.
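
A slightly fuller variant of this test could also print the resolver configuration that the build container sees, which helps tell a missing nameserver apart from a blocked one. This is only a minimal sketch: `dns_diag` is a hypothetical helper name, and it assumes `getent` is available, which is typical of Debian-based images but not guaranteed elsewhere.

```shell
# Hypothetical helper: print the resolver config, then try to resolve each host.
# Returns non-zero if any host fails to resolve (or if getent is missing).
dns_diag() {
    rc=0
    echo "--- /etc/resolv.conf ---"
    cat /etc/resolv.conf 2>/dev/null || echo "(not readable)"
    for host in "$@"; do
        if getent hosts "$host" >/dev/null 2>&1; then
            echo "OK      $host"
        else
            echo "FAILED  $host"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, `dns_diag deb.debian.org security.debian.org` could be run in the builder pod. Inside a Dockerfile the same two commands can go into a temporary step, e.g. `RUN cat /etc/resolv.conf; getent hosts deb.debian.org`, so the output shows up in the build log.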

@ChillarAnand
Author

Starting build... but first, coffee!
Step 1/11 : FROM python:2.7
 ---> 68e7be49c28c
Step 2/11 : RUN curl -4 icanhazip.com
 ---> Running in 7fdacdf60383
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
remote:   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0curl: (6) Could not resolve host: icanhazip.com
The command '/bin/sh -c curl -4 icanhazip.com' returned a non-zero code: 6
remote: 2020/04/21 16:34:30 Error running git receive hook [Build pod exited with code 1, stopping build.]

This also seems to be failing. Any thoughts on how to troubleshoot it?

@ChillarAnand
Author

ChillarAnand commented Apr 21, 2020

This seems to happen only with docker builds.

I am able to run this app on the cluster: https://github.com/teamhephy/example-python-django. It uses only a Procfile, without any Dockerfile.

However, this app, https://github.com/teamhephy/helloworld, which has a Dockerfile, fails to build.

Starting build... but first, coffee!
Step 1/10 : FROM debian:jessie
 ---> 7144b35bf6b5
Step 2/10 : RUN apt-get update && apt-get install -qy curl
 ---> Running in 34477c1baa1d
Err http://deb.debian.org jessie InRelease
  
Err http://security.debian.org jessie/updates InRelease
  
Err http://deb.debian.org jessie-updates InRelease
  mote: 
Err http://security.debian.org jessie/updates Release.gpg
  Could not resolve 'security.debian.org'
Err http://deb.debian.org jessie Release.gpg
  Could not resolve 'deb.debian.org'
Err http://deb.debian.org jessie-updates Release.gpg
  Could not resolve 'deb.debian.org'
Reading package lists...
W: Failed to fetch http://deb.debian.org/debian/dists/jessie/InRelease  

W: Failed to fetch http://security.debian.org/debian-security/dists/jessie/updates/InRelease  

W: Failed to fetch http://deb.debian.org/debian/dists/jessie-updates/InRelease  

W: Failed to fetch http://deb.debian.org/debian/dists/jessie/Release.gpg  Could not resolve 'deb.debian.org'

W: Failed to fetch http://security.debian.org/debian-security/dists/jessie/updates/Release.gpg  Could not resolve 'security.debian.org'

W: Failed to fetch http://deb.debian.org/debian/dists/jessie-updates/Release.gpg  Could not resolve 'deb.debian.org'

W: Some index files failed to download. They have been ignored, or old ones used instead.
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package curl
The command '/bin/sh -c apt-get update && apt-get install -qy curl' returned a non-zero code: 100
remote: 2020/04/21 16:41:30 Error running git receive hook [Build pod exited with code 1, stopping build.]
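
To confirm that every failure in a build log like the ones above is a name-resolution failure, the unresolved hostnames can be pulled out of a saved copy of the output. This is a rough sketch; `unresolved_hosts` is a hypothetical helper, and it only matches the two apt message formats seen in this thread.

```shell
# Hypothetical helper: read build output on stdin and print each hostname
# that apt reported as unresolvable, once, in sorted order.
unresolved_hosts() {
    grep -Eo "(Temporary failure resolving|Could not resolve) '[^']+'" \
        | sed -E "s/.*'([^']+)'/\1/" \
        | sort -u
}
```

Usage, assuming the push output was saved to a file named build.log: `unresolved_hosts < build.log`.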

@Cryptophobia
Member

This is likely a CNI issue: the Docker interface is not able to reach the outside interface for whatever reason. It could be iptables or a firewall...

How are you running kubernetes? What is the infrastructure underneath kubernetes, what cloud provider and what networking CNI are you using?

@ChillarAnand
Author

AWS EKS - 1 master + 1 worker node (m5.large) - managed using eksctl.

eksctl create cluster -n demo --version 1.15 --nodes 1 --node-type m5.large 

It is using the default amazon-k8s-cni.

➜  ~ kubectl describe daemonset aws-node --namespace kube-system | grep Image | cut -d "/" -f 2

amazon-k8s-cni:v1.5.5

@Cryptophobia
Member

So this all looks good. I think your problem is security group or iptables rules on the nodes not allowing you to send requests out to 0.0.0.0/0. Can you verify that the security group of the EKS nodes has an Outbound rule to ALLOW ALL to 0.0.0.0/0?
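
One way to check the outbound rules from the command line is the AWS CLI. The following is a sketch only: `sg_egress_rules` is a hypothetical wrapper, and the security group ID has to be looked up separately since nothing in this thread names it.

```shell
# Hypothetical helper: list the egress (outbound) rules of one security group.
# $1 is the security group ID, supplied by the caller.
sg_egress_rules() {
    aws ec2 describe-security-groups \
        --group-ids "$1" \
        --query 'SecurityGroups[].IpPermissionsEgress[]' \
        --output json
}
```

An allow-all outbound rule normally appears with IpProtocol "-1" and a CidrIp of 0.0.0.0/0 under IpRanges.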

@Cryptophobia Cryptophobia self-assigned this Apr 23, 2020
@ChillarAnand
Author

Thanks, @Cryptophobia

There is some issue with the quay.io images, and the Hephy installation is failing. Once that is resolved, I will check this.

@ChillarAnand
Author

[Screenshot, 2020-04-24 9:24:53 PM]

Outbound rules on all nodes seem to be set correctly.

Also, if the outbound rule was the problem, shouldn't pip install fail when setting up https://github.com/teamhephy/example-python-django?

Is there any debug flag that can be set to see more verbose output?

@Cryptophobia
Member

Cryptophobia commented Apr 24, 2020

Yes @ChillarAnand, this is correct. If pip install does not fail inside the builder, that means this is a problem specific to the dockerbuilder pod.

The difference is that when the builder runs Heroku buildpacks, networking happens inside its own container, whereas when building Dockerfiles on git push, the builder first spawns a separate dockerbuilder pod to build the Docker image. Something related to networking must be broken in this dockerbuilder pod when it is spawned by the builder...

Is there any debug flag that can be set to see more verbose output?

Can you enable debug logging on the builder by setting the DEBUG environment variable on it?
https://docs.teamhephy.com/managing-workflow/tuning-component-settings/#customizing-the-builder
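
Since the suspicion falls on the separately spawned dockerbuilder pod, it may also help to compare that pod's DNS settings with the builder's while a build is running. The helpers below are a hedged sketch: the function names are hypothetical, and the namespace and pod name are parameters because the thread does not state what the dockerbuilder pod is called.

```shell
# Hypothetical helpers; namespace ($1) and pod name ($2) come from the caller,
# e.g. after looking the pod up with "kubectl get pods --all-namespaces".

# Print the DNS policy Kubernetes assigned to the pod.
pod_dns_policy() {
    kubectl --namespace "$1" get pod "$2" -o jsonpath='{.spec.dnsPolicy}'
}

# Print the resolver configuration the pod's container actually sees.
# Requires "cat" to exist in the container image.
pod_resolv_conf() {
    kubectl --namespace "$1" exec "$2" -- cat /etc/resolv.conf
}
```

Keep in mind that the RUN steps of a Docker build execute in containers created by the Docker daemon the build talks to, so their resolver configuration may differ from what the pod itself reports.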

@ChillarAnand
Author

$ kubectl --namespace deis edit deployment deis-builder 

After setting the DEIS_DEBUG flag to true, I re-deployed helloworld.

It printed out the pod spec and failed at apt-get update as mentioned earlier. I couldn't find anything useful.

@kingdonb
Member

@ChillarAnand just out of curiosity, if you're using CNI, did you also enable this value at Workflow install time:

--set global.use_cni=true

I don't really understand how CNI affects the topology of the cluster but this has resolved networking issues on some cluster providers for me before. It's one of the highlights on https://web.teamhephy.com/ (see instructions for DigitalOcean at the bottom)

@Cryptophobia
Member

Cryptophobia commented Apr 27, 2020

Totally forgot about this. Yes, thank you @kingdonb! Might want to try that global flag when installing/upgrading hephy workflow as well. Now that I think about it, the global.use_cni=true flag may solve this issue.

More info about this flag: https://docs.teamhephy.com/managing-workflow/production-deployments/#using-on-cluster-registry-with-cni

@ChillarAnand
Author

Thanks, @kingdonb

In a new cluster, I installed Hephy with the following command, and the build still fails with the same error.

helm install hephy/workflow --namespace deis --generate-name \
     --set router.host_port.enabled=true --set global.use_rbac=true --set global.use_cni=true

Not sure what is causing the issue.

@Cryptophobia
Member

Hi @ChillarAnand, I believe you figured this out in our Slack community, so I am going to close it for now.
