
Add troubleshooting section to kube-dns readme #1449

Merged
merged 2 commits into from
Nov 21, 2016



@MrHohn MrHohn commented Oct 12, 2016

Fix kubernetes/kubernetes#23981.

This PR adds some generic troubleshooting information for the kube-dns addon. The addition is pasted below.

@thockin @bowei @philips


Troubleshooting tips

If the nslookup command above does not work, here are some hints:

Check the local DNS configuration first

Take a look inside the resolv.conf file. (See the "Inheriting DNS from the node" and "Known issues" sections for more information.)

cat /etc/resolv.conf

You should see the search path and nameserver set up as shown below (the search path may vary between cloud providers):

search default.svc.cluster.local svc.cluster.local cluster.local google.internal c.gce_project_id.internal
nameserver 10.0.0.10
options ndots:5
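
The resolv.conf check above can also be scripted. The following is a minimal sketch, not part of the kube-dns tooling: the function name check_resolv_conf and its defaults (nameserver 10.0.0.10 and cluster domain cluster.local, both taken from the example output) are assumptions to adjust for your cluster.

```shell
# check_resolv_conf [FILE] [NAMESERVER] [CLUSTER_DOMAIN]
# Hypothetical helper: succeeds only if FILE lists NAMESERVER and its search
# path contains svc.CLUSTER_DOMAIN. Defaults mirror the example output above.
check_resolv_conf() {
    file="${1:-/etc/resolv.conf}"
    expected_ns="${2:-10.0.0.10}"
    domain="${3:-cluster.local}"

    # Look for a "nameserver" line whose address equals the expected one.
    if ! awk -v ns="$expected_ns" \
        '$1 == "nameserver" && $2 == ns { found = 1 } END { exit !found }' "$file"; then
        echo "nameserver $expected_ns not found in $file" >&2
        return 1
    fi

    # Look for "svc.<domain>" among the entries of the "search" line.
    if ! awk -v want="svc.$domain" \
        '$1 == "search" { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
         END { exit !found }' "$file"; then
        echo "search path in $file lacks svc.$domain" >&2
        return 1
    fi

    echo "resolv.conf looks sane"
}
```

Run inside the pod, `check_resolv_conf /etc/resolv.conf 10.0.0.10` would report which of the two settings is missing, if any.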

DNS known issues

If you are using Alpine (earlier than 3.2) as the base image, DNS may not work properly, because Alpine broke DNS search paths and only fixed them in a later release. See kubernetes/kubernetes#30215 (https://github.com/kubernetes/kubernetes/issues/30215) for more detail.

Quick diagnosis

Output like the following usually means that something is wrong with the kube-dns addon or its Services:

/ # nslookup kubernetes
Server:    10.0.0.10
Address 1: 10.0.0.10

nslookup: can't resolve 'kubernetes'

or

/ # nslookup kubernetes
Server:    10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local

nslookup: can't resolve 'kubernetes'
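
Both failure outputs above contain a "can't resolve" line, so a script can detect them by matching on it. The following is a minimal sketch, assuming busybox-style nslookup output as shown; the function name lookup_succeeded and the pod name busybox in the usage comment are illustrative assumptions.

```shell
# lookup_succeeded: read nslookup output on stdin and succeed only if it
# contains no "can't resolve" line (the busybox wording shown above).
lookup_succeeded() {
    ! grep -q "can't resolve"
}

# Possible usage from outside the pod (assumes a pod named "busybox" exists):
#   kubectl exec busybox -- nslookup kubernetes 2>&1 | lookup_succeeded \
#       && echo "DNS lookup OK" || echo "DNS lookup failed"
```

Note that this treats any output without that line as success, including empty output, so it is worth checking the exit status of kubectl itself as well. Other nslookup implementations may word the failure differently.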

Let's continue to debug it.

Is dns pod running?

The kubectl get pods command can tell you. (The namespace may be "default" or another one if you deployed the addon manually.)

kubectl get pods --namespace=kube-system -l k8s-app=kube-dns

You should see something like:

NAME                                                       READY     STATUS    RESTARTS   AGE
...
kube-dns-v19-ezo1y                                         3/3       Running   0           1h
...

If no pod is running, or the pod has failed or completed, the DNS addon is probably not deployed by default in your current environment and you have not deployed it manually.

Otherwise, try the next hint.
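
The pod check can be automated by parsing the table that kubectl get pods prints. The following is a minimal sketch, assuming the default column layout shown above (NAME, READY, STATUS, ...); pods_all_ready is a hypothetical helper name.

```shell
# pods_all_ready: read `kubectl get pods` table output on stdin. Succeeds only
# if at least one pod is listed and every listed pod is Running with all of
# its containers ready.
pods_all_ready() {
    awk '
        NR == 1 && $1 == "NAME" { next }   # skip the header row
        NF < 3 { next }                    # skip blank or elided lines
        {
            seen = 1
            split($2, ready, "/")          # READY column, e.g. 3/3
            if ($3 != "Running" || ready[1] != ready[2]) bad = 1
        }
        END { exit !(seen && !bad) }
    '
}

# Possible usage, reusing the command above:
#   kubectl get pods --namespace=kube-system -l k8s-app=kube-dns | pods_all_ready \
#       || echo "kube-dns pod missing or not ready"
```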

Is dns pod working properly?

Use the kubectl logs command to inspect the logs of each container in the pod.

kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c kubedns
kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c dnsmasq
kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c healthz

Look for any suspicious log entries. If there are none, continue.

Is dns service up?

Verify with the kubectl get service command.

kubectl get svc --namespace=kube-system

You should see something like:

NAME                    CLUSTER-IP     EXTERNAL-IP   PORT(S)             AGE
...
kube-dns                10.0.0.10      <none>        53/UDP,53/TCP        1h
...

If you have created the service, or it should have been created by default, but it does not appear, see the debugging services page (http://kubernetes.io/docs/user-guide/debugging-services/) for more information.

Otherwise, let's go ahead.

Are dns endpoints exposed?

Verify with the kubectl get endpoints command.

kubectl get ep kube-dns --namespace=kube-system

You should see something like:

NAME       ENDPOINTS                       AGE
kube-dns   10.180.3.17:53,10.180.3.17:53    1h

If not, see the endpoints section of the debugging services page (http://kubernetes.io/docs/user-guide/debugging-services/).

Finally, if you have reached this point without finding the problem, see the "Is the kube-proxy working?" section of the same debugging services page.

Hopefully you have found some meaningful clues by now. For more Kubernetes DNS examples, see the cluster-dns examples (https://github.com/kubernetes/kubernetes/tree/master/examples/cluster-dns).



```

#### DNS known issues
If you are using `Alpine`(lower than 3.2) as base image, dns may not work properly because `Alpine` broke dns search paths but fixed it in some later release. Please take a look at [here](https://github.com/kubernetes/kubernetes/issues/30215) for more detail.
Member

Alpine 3.4 was when it got fixed


3.14.1 base image still not working for non-fqdn resolution

@thockin thockin assigned bowei and unassigned lavalamp Oct 13, 2016

thockin commented Oct 13, 2016

Assigning to Bowei for review

@@ -182,6 +182,111 @@ Address 1: 10.0.0.1

If you see that, DNS is working correctly.

### Troubleshooting tips.
Contributor

No periods in headers. Please change to "Troubleshooting Tips"

@@ -165,7 +165,7 @@ NAME READY STATUS RESTARTS AGE
busybox 1/1 Running 0 <some-time>
```

### Validate DNS works
### Validate DNS works.
Contributor

Remove period from header. Change to "Validate that DNS is working"

@@ -182,6 +182,111 @@ Address 1: 10.0.0.1

If you see that, DNS is working correctly.

### Troubleshooting tips.
If above `nslookup` command does not work, here are some hints:
Contributor

"If the nslookup command fails, check the following:"

If above `nslookup` command does not work, here are some hints:

#### Check the local DNS configuration first
Take a look inside the resolv.conf file. (See "Inheriting DNS from the node" and "Known issues" sections for more information)
Contributor

Please provide actual links to these sections.

Member Author

These two sections are currently in kubernetes/build/kube-dns/README.md, but are being removed in kubernetes/kubernetes#32931.

Don't know why they are not included in previous PR. So I moved them here as well.

Otherwise, let's go ahead.

#### Are dns endpoints exposed?
Varify through `kubectl get endpoints` command.
Contributor

Varify -> "Verify"

Change to "You can verify that dns endpoints are exposed by using the kubectl get endpoints command."

See if you could find any suspicious log. If not, please continue.

#### Is dns service up?
Varify through `kubectl get service` command.
Contributor

Varify -> Verify

"Verify that the DNS service is up by using the kubectl get service command."

...
```

If you have created the service or in the case it should be created by default but it does not appear, goto this [debugging services page](http://kubernetes.io/docs/user-guide/debugging-services/) for more information.
Contributor

goto -> see


If you have created the service or in the case it should be created by default but it does not appear, goto this [debugging services page](http://kubernetes.io/docs/user-guide/debugging-services/) for more information.

Otherwise, let's go ahead.
Contributor

Remove.

kube-dns 10.180.3.17:53,10.180.3.17:53 1h
```

If not, again, goto this [debugging services page](http://kubernetes.io/docs/user-guide/debugging-services/) and look for the endpoints section.
Contributor

"If you do not see the endpoints, see endpoints section in the debugging services documentation."


At last, if you reach here, the final suggestion is still goto this [debugging services page](http://kubernetes.io/docs/user-guide/debugging-services/) and look for the "Is the kube-proxy working?" section.

Hopefully you would have got some meaningful clues now. For more Kubernetes DNS example go [here](https://github.com/kubernetes/kubernetes/tree/master/examples/cluster-dns).
Contributor

Remove first sentence.

Change second sentence to "For additional Kubernetes DNS examples, see the cluster-dns examples in the Kubernetes GitHub repository."


MrHohn commented Oct 18, 2016

Many thanks for the detailed comments! I changed and pushed the README according to the instructions.

PS:
Both "Inheriting DNS from the node" and "Known issues" sections are currently in kubernetes/build/kube-dns/README.md, but are being removed in kubernetes/kubernetes#32931.

Don't know why they are not included in previous PR. So I moved them here as well.

@devin-donnelly

@bowei, can you do a tech review on this, please?


philips commented Oct 28, 2016

LGTM

@bowei bowei left a comment

some comment, otherwise looks good

Errors such as the following indicate a problem with the kube-dns add-on or associated Services:

```
/ # nslookup kubernetes
Member

put kubectl exec in front of the command? may not be obvious for novices where this is run from

or

```
/ # nslookup kubernetes
Member

kubectl exec

...
```

If you see that no pod is running or that the pod has failed/completed, the dns add-on may not be deployed by default in your current environment and you have not deployed it manually.
Member

...current environment and you will have to deploy it manually.


#### Check for Errors in the DNS pod

Use `kubectl logs` command for detective work.
Member

for detective work -> to see logs for the DNS daemons

kubectl logs --namespace=kube-system $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) -c healthz
```

See if you could find any suspicious log like below:
Member

We should probably not give the example below, it is confusing -- instead you should probably say something about the log format (W, E, F letter at the beginning represent Warning, Error and Failure) and searching for entries that have these as the logging level.
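
The severity-letter search suggested in this comment could be sketched as follows. It assumes glog-style log lines, where the first character is the level (I, W, E, or F) followed by the date as mmdd; filter_problem_logs is a hypothetical helper name, and the kubectl invocation in the usage comment is the one from the README draft.

```shell
# filter_problem_logs: read container logs on stdin and print only the lines
# logged at W, E, or F level. Exits non-zero when no such line is found.
filter_problem_logs() {
    grep -E '^[WEF][0-9]{4} '
}

# Possible usage, reusing the command from the draft:
#   kubectl logs --namespace=kube-system \
#       $(kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o name) \
#       -c kubedns | filter_problem_logs
```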

F1017 17:39:44.063280 90225 server.go:56] Failed to create a kubernetes client: invalid configuration: no configuration has been provided
```

Please use [kubernetes issues](https://github.com/kubernetes/kubernetes/issues) to trace/report unintended errors.
Member

trace/report...errors. -> report unexpected errors.

kubectl get svc --namespace=kube-system
```

You should see something like:
Member

remove "something like"


MrHohn commented Oct 28, 2016

Thanks for the comments! Adjusted and pushed a new commit.


bowei commented Oct 28, 2016

LGTM

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 21, 2016
@jaredbhatti jaredbhatti merged commit fde38be into kubernetes:master Nov 21, 2016
@MrHohn MrHohn deleted the dns-troubleshooting branch October 5, 2017 00:58
Development

Successfully merging this pull request may close these issues.

Add some troubleshooting docs to the DNS addon README.md