Add troubleshooting section to kube-dns readme #31817

Closed
MrHohn wants to merge 1 commit from the dns-addon-readme branch


MrHohn
Member

@MrHohn MrHohn commented Aug 31, 2016

Working on #23981.

This PR adds generic troubleshooting information for the kube-dns addon. The addition is pasted below:

@girishkalele @bprashanth


4 Troubleshooting tips.

If the nslookup command above does not work, here are some hints:

Check the local DNS configuration first

Take a look inside the resolv.conf file. (See the "Inheriting DNS from the node" and "Known issues" sections for more information.)

cat /etc/resolv.conf

You should see the search path and nameserver set up as shown below (the search path may vary between cloud providers):

search default.svc.cluster.local svc.cluster.local cluster.local google.internal c.gce_project_id.internal
nameserver 10.0.0.10
options ndots:5
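
The check above can be sketched as a small shell function. The function name `check_resolv_conf` and the default domain `cluster.local` are illustrative assumptions, not part of kube-dns; adjust the domain if your cluster uses a different one.

```shell
# check_resolv_conf [FILE] [DOMAIN]
# Reports whether FILE (default /etc/resolv.conf) has a nameserver entry
# and a search path that mentions DOMAIN (assumed default: cluster.local).
check_resolv_conf() {
  file="${1:-/etc/resolv.conf}"
  domain="${2:-cluster.local}"
  # A usable config needs at least one "nameserver <address>" line.
  if ! grep -Eq '^[[:space:]]*nameserver[[:space:]]+[^[:space:]]+' "$file"; then
    echo "no nameserver entry"
    return 1
  fi
  # The search line should contain the cluster domain.
  if ! grep -E '^[[:space:]]*search[[:space:]]' "$file" | grep -Fq "$domain"; then
    echo "search path lacks $domain"
    return 1
  fi
  echo "ok"
}
```

Run inside the pod, `check_resolv_conf` with no arguments inspects /etc/resolv.conf. It only confirms that the entries exist; it does not confirm that the nameserver address matches the kube-dns service IP.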

Quick diagnosis

If you see either of the following outputs, something is usually wrong with the kube-dns addon or the associated Services:

/ # nslookup kubernetes
Server:    10.0.0.10
Address 1: 10.0.0.10

nslookup: can't resolve 'kubernetes'

or

/ # nslookup kubernetes
Server:    10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local

nslookup: can't resolve 'kubernetes'

Let's continue to debug it.

Is the DNS pod running?

The kubectl get pods command can tell you. (The namespace may be "default" or another one if you deployed the addon manually.)

kubectl get pods --namespace=kube-system -a

You should see something like:

NAME                                                       READY     STATUS    RESTARTS   AGE
...
kube-dns-v19-ezo1y                                         3/3       Running   0           1h
...

If no pod is running, or the pod is in a failed or completed state, the DNS addon may not be deployed by default in your environment, in which case you need to deploy it manually.

Otherwise, try the next hint.
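
As a minimal sketch of reading that table mechanically, the function below parses `kubectl get pods` output from stdin. The name `dns_pods_ready` is hypothetical, and the sketch assumes the default column layout shown above (NAME, READY, STATUS) and pod names that start with `kube-dns`.

```shell
# dns_pods_ready: reads a `kubectl get pods` table on stdin.
# Succeeds only if at least one kube-dns pod is listed and every such pod
# is Running with all of its containers ready (e.g. 3/3).
dns_pods_ready() {
  awk '
    $1 ~ /^kube-dns/ {
      seen = 1
      split($2, ready, "/")          # READY column, e.g. "3/3"
      if (ready[1] != ready[2] || $3 != "Running") bad = 1
    }
    END { exit (seen && !bad) ? 0 : 1 }
  '
}
```

For example: `kubectl get pods --namespace=kube-system | dns_pods_ready && echo "kube-dns pods look healthy"`.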

Is the DNS pod working properly?

Use the kubectl logs command to investigate.

kubectl logs --namespace=kube-system kube-dns-v19-ezo1y -c kubedns
kubectl logs --namespace=kube-system kube-dns-v19-ezo1y -c dnsmasq
kubectl logs --namespace=kube-system kube-dns-v19-ezo1y -c healthz

Check for any suspicious log entries. If you find none, continue.
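
One rough way to narrow down long logs is a keyword filter. The function name `suspicious_lines` and the keyword list are assumptions chosen for illustration, not an official kube-dns pattern, so expect both false positives and misses.

```shell
# suspicious_lines: prints lines from stdin that match common failure keywords.
# The keyword list is a heuristic assumption; extend it as needed.
suspicious_lines() {
  # `|| true` keeps the exit status zero when nothing matches.
  grep -iE 'error|fail|panic|refused|timeout|timed out' || true
}
```

For example: `kubectl logs --namespace=kube-system kube-dns-v19-ezo1y -c kubedns 2>&1 | suspicious_lines`. An empty result does not prove the pod is healthy; it only means none of the keywords appeared.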

Is the DNS service up?

Verify with the kubectl get service command.

kubectl get svc --namespace=kube-system

You should see something like:

NAME                    CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
...
kube-dns                10.0.0.10      <none>        53/UDP,53/TCP        1h
...

If you created the service, or it should have been created by default, but it does not appear, see the debugging services page for more information.

Otherwise, let's go ahead.

Are the DNS endpoints exposed?

Verify with the kubectl get endpoints command.

kubectl get ep kube-dns --namespace=kube-system

You should see something like:

NAME       ENDPOINTS                       AGE
kube-dns   10.180.3.17:53,10.180.3.17:53    1h

If not, see the debugging services page again and look for the endpoints section.
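
The endpoints check can also be sketched as a filter over the command output. The name `has_endpoints` is hypothetical, and the sketch assumes the default table layout shown above, where a service without endpoints shows `<none>` in the ENDPOINTS column.

```shell
# has_endpoints [SERVICE]: reads a `kubectl get ep` table on stdin.
# Succeeds if SERVICE (default kube-dns) has a non-empty ENDPOINTS column.
has_endpoints() {
  awk -v svc="${1:-kube-dns}" '
    $1 == svc && $2 != "" && $2 != "<none>" { found = 1 }
    END { exit found ? 0 : 1 }
  '
}
```

For example: `kubectl get ep kube-dns --namespace=kube-system | has_endpoints || echo "kube-dns has no endpoints"`.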

Finally, if none of the above helps, see the "Is the kube-proxy working?" section of the debugging services page.

Hopefully you have found some meaningful clues by now. For more Kubernetes DNS examples, go here.



@k8s-bot

k8s-bot commented Aug 31, 2016

Can one of the admins verify that this patch is reasonable to test? If so, please reply "ok to test".
(Note: "add to whitelist" is no longer supported. Please update configurations in kubernetes/test-infra/jenkins/job-configs/kubernetes-jenkins-pull instead.)

This message will repeat several times in short succession due to jenkinsci/ghprb-plugin#292. Sorry.


@girishkalele

cc @timbunce for comments/feedback

I will add a label to turn off the e2e tests since these are doc-only changes

@girishkalele girishkalele added this to the next-candidate milestone Aug 31, 2016
@girishkalele girishkalele added sig/network Categorizes an issue or PR as relevant to SIG Network. kind/documentation Categorizes issue or PR as related to documentation. documentation/confusing labels Aug 31, 2016
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note-label-needed labels Aug 31, 2016
@k8s-bot

k8s-bot commented Aug 31, 2016

GCE e2e build/test passed for commit 8ad179d.

@@ -215,6 +215,107 @@ Address 1: 10.0.0.1

If you see that, DNS is working correctly.

### 4 Troubleshooting tips.
If above ```nslookup``` command does not work, here are some hints:
Member

single-ticks for inline code, triple-ticks for multi-line blocks.

#### Is dns pod running?
```kubectl get pods``` command could tell. (namespace may be "default" or others if you manually deploy it)
```
kubectl get pods --namespace=kube-system -a
Member

How about kubectl get pods --namespace=kube-system -l k8s-app=kube-dns

Member Author

This is better, I applied.

Otherwise, please try next hint.

#### Is dns pod working properly?
Use ```kubectl logs``` command for detective work.
Member

single ticks

#### Is dns pod working properly?
Use ```kubectl logs``` command for detective work.
```
kubectl logs --namespace=kube-system kube-dns-v19-ezo1y -c kubedns
Member

kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c kubedns

or

POD=$(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name)
kubectl logs --namespace=kube-system $POD -c kubedns

Member Author

I prefer the first one.

@MrHohn
Member Author

MrHohn commented Sep 26, 2016

The new commit makes some improvements based on the review comments. It also adds a DNS known-issues subsection covering the Alpine issue, #30215.

But isn't this README being moved out to http://kubernetes.io/docs/admin/dns? (#32931)

@k8s-ci-robot
Contributor

Jenkins GCI Kubemark GCE e2e failed for commit cf8899b. Full PR test history.

The magic incantation to run this job again is @k8s-bot kubemark gci e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@MrHohn
Member Author

MrHohn commented Oct 12, 2016

Sent a new PR on kubernetes/website#1449. Closing this.

@MrHohn MrHohn closed this Oct 12, 2016
@MrHohn MrHohn deleted the dns-addon-readme branch May 16, 2017 23:55