Troubleshooting Kubernetes
Some tips for general troubleshooting within a kubernetes cluster.
You should never need to log on to any of the worker nodes. Simply log in to your master, as user ansible on your Ansible server, with
$ ssh <k8smaster ip>
Then switch to user ubuntu with
$ sudo su - ubuntu
Within the MOADSD-NG environment you can use Stern when you want to get logs from multiple Kubernetes objects such as Services, Deployments, or Jobs/CronJobs. Stern provides color-coded logs from all containers inside the pods of all related Kubernetes objects of your application or microservice. With a simple command like the following, you can tail the logs of all relevant containers:
$ stern -n <namespace> <app-name> -t --since 10m
To easily follow real-time logs from a whole namespace, e.g. smartcheck, use the following command while logged in to your master:
$ stern -n smartcheck . -t --since 10m
See: https://github.com/wercker/stern
To inspect a failing pod in detail, describe it:
$ kubectl --namespace flaskapp describe pod flaskapp-b47654947-9jrkr
Name: flaskapp-b47654947-9jrkr
Namespace: flaskapp
Priority: 0
PriorityClassName: <none>
Node: k8sworker1/192.168.1.151
Start Time: Tue, 13 Nov 2018 07:33:50 +0000
Labels: app.kubernetes.io/instance=flaskapp
app.kubernetes.io/name=flaskapp
pod-template-hash=603210503
Annotations: <none>
Status: Pending
IP: 10.244.1.78
Controlled By: ReplicaSet/flaskapp-b47654947
Containers:
flaskapp:
Container ID:
Image: markus/flaskapp:latest
Image ID:
Port: 5001/TCP
Host Port: 0/TCP
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Liveness: http-get http://:http/ delay=0s timeout=1s period=10s #success=1 #failure=3
Readiness: http-get http://:http/ delay=0s timeout=1s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-qqjzc (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-qqjzc:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-qqjzc
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m default-scheduler Successfully assigned flaskapp/flaskapp-b47654947-9jrkr to k8sworker1
Normal Pulling 2m (x4 over 3m) kubelet, k8sworker1 pulling image "markus/flaskapp:latest"
Warning Failed 2m (x4 over 3m) kubelet, k8sworker1 Failed to pull image "markus/flaskapp:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for markus/flaskapp, repository does not exist or may require 'docker login'
Warning Failed 2m (x4 over 3m) kubelet, k8sworker1 Error: ErrImagePull
Normal BackOff 1m (x6 over 3m) kubelet, k8sworker1 Back-off pulling image "markus/flaskapp:latest"
Warning Failed 1m (x7 over 3m) kubelet, k8sworker1 Error: ImagePullBackOff
$ kubectl --namespace flaskapp edit pods flaskapp-7c45b98764-rf8wb
This opens the pod configuration in vi, where you can modify it directly.
To query events from a namespace you can do the following:
$ kubectl get events --namespace=flaskapp
LAST SEEN FIRST SEEN COUNT NAME KIND SUBOBJECT TYPE REASON SOURCE MESSAGE
42m 18h 213 flaskapp-7555c4974f-lvjps.1566629fd924b547 Pod spec.containers{flaskapp} Normal Pulling kubelet, k8sworker2 pulling image "markus/flaskapp:latest"
18m 18h 4673 flaskapp-7555c4974f-lvjps.156662a05ffeac90 Pod spec.containers{flaskapp} Normal BackOff kubelet, k8sworker2 Back-off pulling image "markus/flaskapp:latest"
11m 11m 1 flaskapp-b47654947-9jrkr.15669eb6e934d05d Pod Normal Scheduled default-scheduler Successfully assigned flaskapp/flaskapp-b47654947-9jrkr to k8sworker1
11m 11m 1 flaskapp.15669eb6d86fd585 Deployment Normal ScalingReplicaSet deployment-controller Scaled up replica set flaskapp-b47654947 to 1
11m 11m 1 flaskapp-b47654947.15669eb6e1ed6b7b ReplicaSet Normal SuccessfulCreate replicaset-controller Created pod: flaskapp-b47654947-9jrkr
10m 11m 4 flaskapp-b47654947-9jrkr.15669eb76ebd723b Pod spec.containers{flaskapp} Normal Pulling kubelet, k8sworker1 pulling image "markus/flaskapp:latest"
10m 11m 4 flaskapp-b47654947-9jrkr.15669eb7c83cb3a6 Pod spec.containers{flaskapp} Warning Failed kubelet, k8sworker1 Failed to pull image "markus/flaskapp:latest": rpc error: code = Unknown desc = Error response from daemon: pull access denied for markus/flaskapp, repository does not exist or may require 'docker login'
10m 11m 4 flaskapp-b47654947-9jrkr.15669eb7c83d275d Pod spec.containers{flaskapp} Warning Failed kubelet, k8sworker1 Error: ErrImagePull
10m 11m 6 flaskapp-b47654947-9jrkr.15669eb7ea66d8f6 Pod spec.containers{flaskapp} Normal BackOff kubelet, k8sworker1 Back-off pulling image "markus/flaskapp:latest"
3m 18h 4737 flaskapp-7555c4974f-lvjps.156662a05ffede13 Pod spec.containers{flaskapp} Warning Failed kubelet, k8sworker2 Error: ImagePullBackOff
1m 11m 41 flaskapp-b47654947-9jrkr.15669eb7ea6708e1 Pod spec.containers{flaskapp} Warning Failed kubelet, k8sworker1 Error: ImagePullBackOff
That also helps to identify the problem. In this case, pulling the image from Docker Hub did not work: the repository does not exist or requires authentication.
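If the image lives in a private registry, the usual remedy is to reference a pull secret in the pod spec. A minimal sketch, assuming a secret named regcred has already been created with kubectl create secret docker-registry (the secret name is an assumption for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: flaskapp
spec:
  containers:
    - name: flaskapp
      image: markus/flaskapp:latest
  imagePullSecrets:
    - name: regcred   # hypothetical secret created via 'kubectl create secret docker-registry'
```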
The snippets below might be useful while exploring Kubernetes :-)
The following command will output false if RBAC is disabled and true otherwise:
$ kubectl get clusterroles > /dev/null 2>&1 && echo true || echo false
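The trick above generalizes: any command's exit status can be turned into a true/false probe. A small sketch of the same pattern as a reusable shell function (the name probe is made up for illustration):

```shell
# probe: print "true" if the given command succeeds, "false" otherwise.
# The command's own output is suppressed, exactly as in the RBAC check above.
probe() {
  "$@" > /dev/null 2>&1 && echo true || echo false
}

probe true                      # prints: true
probe ls /nonexistent-path-xyz  # prints: false
```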
BASH
Set up autocompletion for the current bash shell (the bash-completion package should be installed first):
$ source <(kubectl completion bash)
Add autocompletion permanently to your bash shell:
$ echo "source <(kubectl completion bash)" >> ~/.bashrc
ZSH
Set up autocompletion for the current zsh shell:
$ source <(kubectl completion zsh)
Add autocompletion permanently to your zsh shell:
$ echo "if [ $commands[kubectl] ]; then source <(kubectl completion zsh); fi" >> ~/.zshrc
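On top of completion, a short alias for kubectl saves a lot of typing. This is a common convenience, not a MOADSD-NG requirement; carrying completion over to the alias additionally requires the kubectl bash completion to be sourced first:

```shell
# Short alias for kubectl.
alias k=kubectl
# In bash, reuse kubectl's completion for the alias (no-op in other shells).
complete -o default -F __start_kubectl k 2>/dev/null || true
```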
Set which Kubernetes cluster kubectl communicates with and modify its configuration. See the Authenticating Across Clusters with kubeconfig documentation for detailed config file information.
Select a specific namespace to work with:
$ kubectl config set-context $(kubectl config current-context) --namespace=monitoring
Verify:
$ kubectl config view | grep namespace:
namespace: monitoring
Reset context to default:
$ kubectl config set-context kubernetes-admin@kubernetes --namespace=default
$ kubectl config view # Show Merged kubeconfig settings.
$
$ # use multiple kubeconfig files at the same time and view merged config
$ KUBECONFIG=~/.kube/config:~/.kube/kubconfig2 kubectl config view
$
$ # Get the password for the e2e user
$ kubectl config view -o jsonpath='{.users[?(@.name == "e2e")].user.password}'
$
$ kubectl config current-context # Display the current-context
$ kubectl config use-context my-cluster-name # set the default context to my-cluster-name
$
$ # add a new cluster to your kubeconf that supports basic auth
$ kubectl config set-credentials kubeuser/foo.kubernetes.com --username=kubeuser --password=kubepassword
$
$ # set a context utilizing a specific username and namespace.
$ kubectl config set-context gce --user=cluster-admin --namespace=foo && kubectl config use-context gce
Set the namespace back to default in the current context:
$ kubectl config set-context $(kubectl config current-context) --namespace=default
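Typing the full set-context command gets tedious, so it can be wrapped in a small shell function (the name kns is hypothetical, chosen for illustration):

```shell
# kns: switch the namespace of the current kubectl context.
# Usage: kns monitoring
kns() {
  kubectl config set-context "$(kubectl config current-context)" --namespace="$1"
}
```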
Kubernetes manifests can be defined in JSON or YAML. The file extensions .yaml, .yml, and .json can be used.
$ kubectl create -f ./my-manifest.yaml # create resource(s)
$ kubectl create -f ./my1.yaml -f ./my2.yaml # create from multiple files
$ kubectl create -f ./dir # create resource(s) in all manifest files in dir
$ kubectl create -f https://git.io/vPieo # create resource(s) from url
$ kubectl create deployment nginx --image=nginx # start a single instance of nginx
$ kubectl explain pods,svc # get the documentation for pod and svc manifests
$
$ # Create multiple YAML objects from stdin
$ cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
name: busybox-sleep
spec:
containers:
- name: busybox
image: busybox
args:
- sleep
- "1000000"
---
apiVersion: v1
kind: Pod
metadata:
name: busybox-sleep-less
spec:
containers:
- name: busybox
image: busybox
args:
- sleep
- "1000"
EOF
$
$ # Create a secret with several keys
$ cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Secret
metadata:
name: mysecret
type: Opaque
data:
password: $(echo -n "s33msi4" | base64 -w0)
username: $(echo -n "jane" | base64 -w0)
EOF
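The data fields of a Secret must be base64-encoded, which is what the $(...) substitutions in the heredoc above do. The encoding can be checked locally before creating the Secret:

```shell
# Encode the password exactly as the heredoc does, then decode it back.
encoded=$(echo -n "s33msi4" | base64)
echo "$encoded"                    # prints: czMzbXNpNA==
echo "$encoded" | base64 --decode  # prints: s33msi4
```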
$ # Get commands with basic output
$ kubectl get services # List all services in the namespace
$ kubectl get pods --all-namespaces # List all pods in all namespaces
$ kubectl get pods -o wide # List all pods in the namespace, with more details
$ kubectl get deployment my-dep # List a particular deployment
$ kubectl get pods --include-uninitialized # List all pods in the namespace, including uninitialized ones
$
$ # Describe commands with verbose output
$ kubectl describe nodes my-node
$ kubectl describe pods my-pod
$
$ kubectl get services --sort-by=.metadata.name # List Services Sorted by Name
$
$ # List pods Sorted by Restart Count
$ kubectl get pods --sort-by='.status.containerStatuses[0].restartCount'
$
$ # Get the version label of all pods with label app=cassandra
$ kubectl get pods --selector=app=cassandra -o \
jsonpath='{.items[*].metadata.labels.version}'
$
$ # Get all running pods in the namespace
$ kubectl get pods --field-selector=status.phase=Running
$
$ # Get ExternalIPs of all nodes
$ kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="ExternalIP")].address}'
$
$ # List Names of Pods that belong to Particular RC
$ # "jq" command useful for transformations that are too complex for jsonpath, it can be found at https://stedolan.github.io/jq/
$ sel=${$(kubectl get rc my-rc --output=json | jq -j '.spec.selector | to_entries | .[] | "\(.key)=\(.value),"')%?}
$ echo $(kubectl get pods --selector=$sel --output=jsonpath={.items..metadata.name})
$
$ # Check which nodes are ready
$ JSONPATH='{range .items[*]}{@.metadata.name}:{range @.status.conditions[*]}{@.type}={@.status};{end}{end}' \
&& kubectl get nodes -o jsonpath="$JSONPATH" | grep "Ready=True"
$
$ # List all Secrets currently in use by a pod
$ kubectl get pods -o json | jq '.items[].spec.containers[].env[]?.valueFrom.secretKeyRef.name' | grep -v null | sort | uniq
$
$ # List Events sorted by timestamp
$ kubectl get events --sort-by=.metadata.creationTimestamp
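To illustrate what the node-readiness check above is doing: the JSONPATH expression emits one name:condition-list line per node, and grep then keeps only the nodes reporting Ready=True. With simulated output for two nodes (node names are made up):

```shell
# Simulated output of the JSONPATH expression for a ready and a not-ready node.
out='k8sworker1:Ready=True;MemoryPressure=False;
k8sworker2:Ready=False;MemoryPressure=False;'

echo "$out" | grep "Ready=True"   # prints only the k8sworker1 line
```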
As of version 1.11, rolling-update has been deprecated (see CHANGELOG-1.11.md); use rollout instead.
$ kubectl set image deployment/frontend www=image:v2 # Rolling update "www" containers of "frontend" deployment, updating the image
$ kubectl rollout undo deployment/frontend # Rollback to the previous deployment
$ kubectl rollout status -w deployment/frontend # Watch rolling update status of "frontend" deployment until completion
$
$ # deprecated starting version 1.11
$ kubectl rolling-update frontend-v1 -f frontend-v2.json # (deprecated) Rolling update pods of frontend-v1
$ kubectl rolling-update frontend-v1 frontend-v2 --image=image:v2 # (deprecated) Change the name of the resource and update the image
$ kubectl rolling-update frontend --image=image:v2 # (deprecated) Update the pods image of frontend
$ kubectl rolling-update frontend-v1 frontend-v2 --rollback # (deprecated) Abort existing rollout in progress
$
$ cat pod.json | kubectl replace -f -                              # Replace a pod based on the JSON passed into stdin
$
$ # Force replace, delete and then re-create the resource. Will cause a service outage.
$ kubectl replace --force -f ./pod.json
$
$ # Create a service for a replicated nginx, which serves on port 80 and connects to the containers on port 8000
$ kubectl expose rc nginx --port=80 --target-port=8000
$
$ # Update a single-container pod's image version (tag) to v4
$ kubectl get pod mypod -o yaml | sed 's/\(image: myimage\):.*$/\1:v4/' | kubectl replace -f -
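The sed expression above touches only the tag after the image name; this can be verified locally without a cluster:

```shell
# Apply the same substitution to a sample manifest line.
echo "    image: myimage:v3" | sed 's/\(image: myimage\):.*$/\1:v4/'
# prints: "    image: myimage:v4"
```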
$
$ kubectl label pods my-pod new-label=awesome # Add a Label
$ kubectl annotate pods my-pod icon-url=http://goo.gl/XXBTWq # Add an annotation
$ kubectl autoscale deployment foo --min=2 --max=10 # Auto scale a deployment "foo"
$ kubectl patch node k8s-node-1 -p '{"spec":{"unschedulable":true}}' # Partially update a node
$
$ # Update a container's image; spec.containers[*].name is required because it's a merge key
$ kubectl patch pod valid-pod -p '{"spec":{"containers":[{"name":"kubernetes-serve-hostname","image":"new image"}]}}'
$
$ # Update a container's image using a json patch with positional arrays
$ kubectl patch pod valid-pod --type='json' -p='[{"op": "replace", "path": "/spec/containers/0/image", "value":"new image"}]'
$
$ # Disable a deployment livenessProbe using a json patch with positional arrays
$ kubectl patch deployment valid-deployment --type json -p='[{"op": "remove", "path": "/spec/template/spec/containers/0/livenessProbe"}]'
$
$ # Add a new element to a positional array
$ kubectl patch sa default --type='json' -p='[{"op": "add", "path": "/secrets/1", "value": {"name": "whatever" } }]'
Edit any API resource in an editor:
$ kubectl edit svc/docker-registry # Edit the service named docker-registry
$ KUBE_EDITOR="nano" kubectl edit svc/docker-registry # Use an alternative editor
$ kubectl scale --replicas=3 rs/foo # Scale a replicaset named 'foo' to 3
$ kubectl scale --replicas=3 -f foo.yaml # Scale a resource specified in "foo.yaml" to 3
$ kubectl scale --current-replicas=2 --replicas=3 deployment/mysql # If the deployment named mysql's current size is 2, scale mysql to 3
$ kubectl scale --replicas=5 rc/foo rc/bar rc/baz # Scale multiple replication controllers
$ kubectl delete -f ./pod.json # Delete a pod using the type and name specified in pod.json
$ kubectl delete pod,service baz foo # Delete pods and services with same names "baz" and "foo"
$ kubectl delete pods,services -l name=myLabel # Delete pods and services with label name=myLabel
$ kubectl delete pods,services -l name=myLabel --include-uninitialized # Delete pods and services, including uninitialized ones, with label name=myLabel
$ kubectl -n my-ns delete po,svc --all                            # Delete all pods and services, including uninitialized ones, in namespace my-ns
$ kubectl logs my-pod # dump pod logs (stdout)
$ kubectl logs my-pod --previous # dump pod logs (stdout) for a previous instantiation of a container
$ kubectl logs my-pod -c my-container # dump pod container logs (stdout, multi-container case)
$ kubectl logs my-pod -c my-container --previous # dump pod container logs (stdout, multi-container case) for a previous instantiation of a container
$ kubectl logs -f my-pod # stream pod logs (stdout)
$ kubectl logs -f my-pod -c my-container # stream pod container logs (stdout, multi-container case)
$ kubectl run -i --tty busybox --image=busybox -- sh # Run pod as interactive shell
$ kubectl attach my-pod -i # Attach to Running Container
$ kubectl port-forward my-pod 5000:6000 # Listen on port 5000 on the local machine and forward to port 6000 on my-pod
$ kubectl exec my-pod -- ls / # Run command in existing pod (1 container case)
$ kubectl exec my-pod -c my-container -- ls / # Run command in existing pod (multi-container case)
$ kubectl top pod POD_NAME --containers # Show metrics for a given pod and its containers
$ kubectl cordon my-node # Mark my-node as unschedulable
$ kubectl drain my-node # Drain my-node in preparation for maintenance
$ kubectl uncordon my-node # Mark my-node as schedulable
$ kubectl top node my-node # Show metrics for a given node
$ kubectl cluster-info # Display addresses of the master and services
$ kubectl cluster-info dump # Dump current cluster state to stdout
$ kubectl cluster-info dump --output-directory=/path/to/cluster-state # Dump current cluster state to /path/to/cluster-state
$
$ # If a taint with that key and effect already exists, its value is replaced as specified.
$ kubectl taint nodes foo dedicated=special-user:NoSchedule
List all supported resource types along with their shortnames, API group, whether they are namespaced, and Kind:
$ kubectl api-resources
Other operations for exploring API resources:
$ kubectl api-resources --namespaced=true # All namespaced resources
$ kubectl api-resources --namespaced=false # All non-namespaced resources
$ kubectl api-resources -o name # All resources with simple output (just the resource name)
$ kubectl api-resources -o wide # All resources with expanded (aka "wide") output
$ kubectl api-resources --verbs=list,get # All resources that support the "list" and "get" request verbs
$ kubectl api-resources --api-group=extensions # All resources in the "extensions" API group