# Pods: running containers in Kubernetes

https://kubernetes.io/docs/concepts/workloads/pods/

A Pod is the basic execution unit of a Kubernetes application: the smallest and simplest unit in the Kubernetes object model that you create or deploy. A Pod represents processes running on your cluster.

A Pod encapsulates an application’s container (or, in some cases, multiple containers), storage resources, a unique network IP, and options that govern how the container(s) should run. A Pod represents a unit of deployment: a single instance of an application in Kubernetes, which might consist of either a single container or a small number of containers that are tightly coupled and that share resources.

Docker was long the most common container runtime used with Kubernetes Pods, but Kubernetes supports other container runtimes as well, such as containerd and CRI-O.

Pods in a Kubernetes cluster can be used in two main ways:

- **Pods that run a single container**. The “one-container-per-Pod” model is the most common Kubernetes use case; in this case, you can think of a Pod as a wrapper around a single container, and Kubernetes manages the Pods rather than the containers directly.
- **Pods that run multiple containers that need to work together**. A Pod might encapsulate an application composed of multiple co-located containers that are tightly coupled and need to share resources. These co-located containers might form a single cohesive unit of service: one container serving files from a shared volume to the public, while a separate “sidecar” container refreshes or updates those files. The Pod wraps these containers and storage resources together as a single manageable entity.


Now, we’ll
start **reviewing all types of Kubernetes objects (or resources)** in greater detail, so
you’ll understand when, how, and why to use each of them. We’ll start with pods,
because they’re the central, most important, concept in Kubernetes. Everything
else either manages, exposes, or is used by pods.

> The central, most important concept in Kubernetes
- Everything else manages, exposes, or is used by pods


## Introducing pods

You’ve already learned that a pod is a **co-located group of containers** and represents
the basic building block in Kubernetes. **Instead of deploying containers individually,
you always deploy and operate on a pod of containers**. We’re not implying that a pod
always includes more than one container—it’s common for pods to contain only a single
container. The key thing about pods is that when a pod does contain multiple containers,
**all of them are always run on a single worker node** — it never spans multiple
worker nodes. 

> **UNDERSTANDING WHY MULTIPLE CONTAINERS ARE BETTER THAN ONE CONTAINER RUNNING MULTIPLE PROCESSES**  Imagine an app consisting of multiple processes that either communicate through
IPC (Inter-Process Communication) or through locally stored files, which requires
them to run on the same machine. Because in Kubernetes you always run processes in
containers and each container is much like an isolated machine, you may think it
makes sense to run multiple processes in a single container, but you shouldn’t do that.
**Containers are designed to run only a single process per container** (unless the
process itself spawns child processes). If you run multiple unrelated processes in a
single container, **it is your responsibility to keep all those processes running, manage
their logs, and so on**. For example, you’d have to include a mechanism for automatically
restarting individual processes if they crash. Also, all those processes would
log to the same standard output, so you’d have a hard time figuring out what process
logged what. Therefore, you need to run each process in its own container. That’s how Docker
and Kubernetes are meant to be used.


- Another indication that containers should only run a single process is the fact that the container runtime only restarts the container when the container’s root process dies. It doesn’t care about any child processes created by this root process. If it spawns child processes, it alone is responsible for keeping all these processes running.

### Understanding Pods

Because you’re not supposed to group multiple processes into a single container, it’s
obvious you need another higher-level construct that will allow you to bind containers
together and manage them as a single unit. This is the reasoning behind pods.

A pod of containers allows you to run closely related processes together and provide
them with (almost) the same environment as if they were all running in a single
container, while keeping them somewhat isolated. This way, you get the best of both
worlds. You can take advantage of all the features containers provide, while at the
same time giving the processes the illusion of running together.

Because all containers of a pod run under the same Network and UTS namespaces
(we’re talking about Linux namespaces here), they all share the same hostname and
network interfaces.

Similarly, all containers of a pod run under the same IPC namespace
and can communicate through IPC. In the latest Kubernetes and Docker versions, they
can also share the same PID namespace, but that feature isn’t enabled by default.
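
As a rough illustration of the opt-in PID namespace sharing mentioned above, the following sketch sets the `shareProcessNamespace` field in the pod spec. The pod name `shared-pid-demo`, the `debug-shell` container, and the choice of the `busybox` image are assumptions made for this example only.

```yml
apiVersion: v1
kind: Pod
metadata:
  name: shared-pid-demo          # hypothetical pod name
spec:
  shareProcessNamespace: true    # opt in: containers share one PID namespace
  containers:
  - name: kubia
    image: luksa/kubia
  - name: debug-shell            # hypothetical helper container
    image: busybox
    command: ["sleep", "3600"]   # keep the container alive for inspection
```

With a spec like this, a process listing run inside either container should also show the processes of the other container.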

But when it comes to the filesystem, things are a little different. Because most of the
**container’s filesystem comes from the container image, by default, the filesystem of
each container is fully isolated from other containers**. However, it’s possible to have
them share file directories using a Kubernetes concept called a Volume.

- One thing to stress here is that because containers in a pod run in the same Network
namespace, they share the same IP address and port space. This means processes running
in containers of the same pod need to take care not to bind to the same port
numbers or they’ll run into port conflicts. But this only concerns containers in the
same pod. Containers of different pods can never run into port conflicts, because
each pod has a separate port space. 

- All the containers in a pod also have the same
loopback network interface, so a container can communicate with other containers in
the same pod through localhost.
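
The shared network namespace can be sketched with a two-container pod in which one container calls the other through localhost. The pod name `localhost-demo` and the `client` container (a `busybox` shell loop) are hypothetical and only serve to illustrate the point; the `kubia` container is assumed to listen on port 8080, as it does elsewhere in these notes.

```yml
apiVersion: v1
kind: Pod
metadata:
  name: localhost-demo           # hypothetical pod name
spec:
  containers:
  - name: kubia
    image: luksa/kubia
    ports:
    - containerPort: 8080        # the only container bound to 8080 in this pod
      protocol: TCP
  - name: client                 # hypothetical helper container
    image: busybox
    # Reaches the kubia container through the shared loopback interface
    command: ["sh", "-c", "while true; do wget -qO- http://localhost:8080; sleep 5; done"]
```

If the second container also tried to bind port 8080, it would conflict with the first one, because both share the same port space.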

**INTRODUCING THE FLAT INTER-POD NETWORK**
 
All pods in a Kubernetes cluster reside in a single flat, shared, network-address space
(shown in figure 3.2), which means every pod can access every other pod at the other
pod’s IP address. No NAT (Network Address Translation) gateways exist between them.
When two pods send network packets between each other, they’ll each see the actual
IP address of the other as the source IP in the packet.

Consequently, communication between pods is always simple. It doesn’t matter if two
pods are scheduled onto a single or onto different worker nodes; in both cases the
containers inside those pods can communicate with each other across the flat NATless
network, much like computers on a local area network (LAN), regardless of the
actual inter-node network topology. Like a computer on a LAN, each pod gets its own
IP address and is accessible from all other pods through this network established specifically
for pods. This is usually achieved through an additional software-defined network
layered on top of the actual network.

To sum up what’s been covered in this section: pods are logical hosts and behave
much like physical hosts or VMs in the non-container world. **Processes running in the
same pod are like processes running on the same physical or virtual machine, except
that each process is encapsulated in a container.**

### Organizing containers across pods properly

You should think of pods as separate machines, but where each one hosts only a certain
app. Unlike the old days, when we used to cram all sorts of apps onto the same
host, we don’t do that with pods. Because pods are relatively lightweight, you can have
as many as you need while incurring almost no overhead. **Instead of stuffing everything
into a single pod, you should organize apps into multiple pods, where each one
contains only tightly related components or processes.**

> Although nothing is stopping you from running both the frontend server and the
database in a single pod with two containers, it isn’t the most appropriate way.

**SPLITTING MULTI-TIER APPS INTO MULTIPLE PODS**

- If both the frontend and backend are in the same pod, then both will always be
run on the same machine.
    - If you have a two-node Kubernetes cluster and only this single
pod, you’ll only be using a single worker node and not taking advantage of the
computational resources (CPU and memory) you have at your disposal on the second
node.
    - Splitting the pod into two would allow Kubernetes to schedule the frontend to
one node and the backend to the other node, thereby improving the utilization of
your infrastructure.

**SPLITTING INTO MULTIPLE PODS TO ENABLE INDIVIDUAL SCALING**:
Another reason why you shouldn’t put them both into a single pod is scaling. A pod is
also the basic unit of scaling. Kubernetes can’t horizontally scale individual containers;
instead, it scales whole pods. If your pod consists of a frontend and a backend container,
when you scale up the number of instances of the pod to, let’s say, two, you end
up with two frontend containers and two backend containers.
Usually, frontend components have completely different scaling requirements
than the backends, so we tend to scale them individually. Not to mention the fact that
backends such as databases are usually much harder to scale compared to (stateless)
frontend web servers. If you need to scale a container individually, this is a clear indication
that it needs to be deployed in a separate pod.
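
A minimal sketch of the split described above: two single-container pods defined in one file, so the scheduler can place them on different nodes and each can be scaled on its own. The pod names and the `example/frontend` and `example/backend` images are placeholders, not real images.

```yml
apiVersion: v1
kind: Pod
metadata:
  name: frontend                 # hypothetical pod name
spec:
  containers:
  - name: frontend
    image: example/frontend:1.0  # placeholder image
---
apiVersion: v1
kind: Pod
metadata:
  name: backend                  # hypothetical pod name
spec:
  containers:
  - name: backend
    image: example/backend:1.0   # placeholder image
```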

**UNDERSTANDING WHEN TO USE MULTIPLE CONTAINERS IN A POD**: The main reason to put multiple containers into a single pod is when the application
consists of one main process and one or more complementary processes, as shown in
figure 3.3.  For example, the main container in a pod could be a web server that serves files from
a certain file directory, while an additional container (a sidecar container) periodically
downloads content from an external source and stores it in the web server’s
directory. In chapter 6 you’ll see that you need to use a Kubernetes Volume that you
mount into both containers.
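
The web server plus sidecar arrangement could look roughly like the sketch below. It assumes an `emptyDir` volume shared by both containers, `nginx:alpine` as the web server, and a `busybox` loop standing in for the content downloader; the pod, container, and volume names are all hypothetical.

```yml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar         # hypothetical pod name
spec:
  volumes:
  - name: content                # shared scratch volume, lives as long as the pod
    emptyDir: {}
  containers:
  - name: web-server             # main container: serves the files
    image: nginx:alpine
    ports:
    - containerPort: 80
      protocol: TCP
    volumeMounts:
    - name: content
      mountPath: /usr/share/nginx/html
      readOnly: true
  - name: content-sync           # sidecar: refreshes the files periodically
    image: busybox
    # Placeholder for a real download step: rewrites index.html every 10 seconds
    command: ["sh", "-c", "while true; do date > /content/index.html; sleep 10; done"]
    volumeMounts:
    - name: content
      mountPath: /content
```

Each container sees the same directory under its own mount path, which is what lets the sidecar update files that the web server serves.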

Other examples of sidecar containers include log rotators and collectors, data processors,
communication adapters, and others.

**DECIDING WHEN TO USE MULTIPLE CONTAINERS IN A POD**

To recap how containers should be grouped into pods—when deciding whether to
put two containers into a single pod or into two separate pods, you always need to ask
yourself the following questions:
- Do they need to be run together or can they run on different hosts?
- Do they represent a single whole or are they independent components?
- Must they be scaled together or individually?

Basically, you should always gravitate toward running containers in separate pods,
unless a specific reason requires them to be part of the same pod. Figure 3.4 will help
you memorize this.

## Creating pods from YAML or JSON descriptors

### Creating a simple YAML descriptor for a pod

You’re going to create a file called `kubia-manual.yaml`.



```yml
apiVersion: v1
kind: Pod
metadata:
  name: kubia-manual
spec:
  containers:
  - image: luksa/kubia
    name: kubia
    ports:
    - containerPort: 8080
      protocol: TCP
```

The descriptor conforms to the v1 version of the Kubernetes API. The
type of resource you’re describing is a pod, with the name kubia-manual. The pod
consists of a single container based on the luksa/kubia image. You’ve also given a
name to the container and indicated that it’s listening on port 8080.

> **SPECIFYING CONTAINER PORTS**: Specifying ports in the pod definition is purely informational. Omitting them has no
effect on whether clients can connect to the pod through the port or not. If the container is accepting connections 
through a port bound to the 0.0.0.0 address, other
pods can always connect to it, even if the port isn’t listed in the pod spec explicitly. But
it makes sense to define the ports explicitly so that everyone using your cluster can
quickly see what ports each pod exposes. Explicitly defining ports also allows you to
assign a name to each port, which can come in handy, as you’ll see later in the book.
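
As a small sketch of the port naming mentioned in the note above, the container port can be given a `name` field. The pod name `kubia-named-port` and the port name `http` are assumptions chosen for this example.

```yml
apiVersion: v1
kind: Pod
metadata:
  name: kubia-named-port         # hypothetical pod name
spec:
  containers:
  - image: luksa/kubia
    name: kubia
    ports:
    - name: http                 # hypothetical port name, referable by other resources
      containerPort: 8080
      protocol: TCP
```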

### Using kubectl create to create the pod

To create the pod from your YAML file, use the kubectl create command:
    
    $ kubectl create -f kubia-manual.yaml
    
The kubectl create -f command is used for creating any resource (not only pods)
from a YAML or JSON file.

**RETRIEVING THE WHOLE DEFINITION OF A RUNNING POD**

After creating the pod, you can ask Kubernetes for the full YAML of the pod. You’ll
see it’s similar to the YAML you saw earlier. You’ll learn about the additional fields
appearing in the returned definition in the next sections. Go ahead and use the following
command to see the full descriptor of the pod:

    kubectl get pod kubia-manual -o yaml

If you’re more into JSON, you can also tell kubectl to return JSON instead of YAML
like this (this works even if you used YAML to create the pod):

    kubectl get po kubia-manual -o json

**SEEING YOUR NEWLY CREATED POD IN THE LIST OF PODS**

Your pod has been created, but how do you know if it’s running? Let’s list pods to see
their statuses:

    kubectl get pods

There’s your kubia-manual pod. Its status shows that it’s running. If you’re like me,
you’ll probably want to confirm that’s true by talking to the pod. You’ll do that in a
minute. First, you’ll look at the app’s log to check for any errors.

### Viewing application logs

Your little Node.js application logs to the process’s standard output. Containerized
applications usually log to the standard output and standard error stream instead of
writing their logs to files. This is to allow users to view logs of different applications in
a simple, standard way.

The container runtime (Docker in your case) redirects those streams to files and
allows you to get the container’s log by running

    docker logs <container id>

You could use ssh to log into the node where your pod is running and retrieve its logs
with docker logs, but Kubernetes provides an easier way.

**RETRIEVING A POD’S LOG WITH KUBECTL LOGS**

To see your pod’s log (more precisely, the container’s log) you run the following command
on your local machine (no need to ssh anywhere):
    
    kubectl logs kubia-manual

You haven’t sent any web requests to your Node.js app, so the log only shows a single
log statement about the server starting up. As you can see, retrieving logs of an application
running in Kubernetes is incredibly simple if the pod only contains a single
container.

> NOTE: Container logs are automatically rotated daily and every time the log file
reaches 10MB in size. The kubectl logs command only shows the log entries
from the last rotation.

**SPECIFYING THE CONTAINER NAME WHEN GETTING LOGS OF A MULTI-CONTAINER POD**

If your pod includes multiple containers, you have to explicitly specify the container
name by including the -c <container name> option when running kubectl logs. In
your kubia-manual pod, you set the container’s name to kubia, so if additional containers
exist in the pod, you’d have to get its logs like this:

    kubectl logs kubia-manual -c kubia

Note that you can only retrieve container logs of pods that are still in existence. When
a pod is deleted, its logs are also deleted. To make a pod’s logs available even after the
pod is deleted, you need to set up centralized, cluster-wide logging, which stores all
the logs into a central store.

### Sending requests to the pod

The pod is now running—at least that’s what kubectl get and your app’s log say. But
how do you see it in action? In the previous chapter, you used the kubectl expose
command to create a service to gain access to the pod externally. You’re not going to
do that now, because a whole chapter is dedicated to services, and you have other ways
of connecting to a pod for testing and debugging purposes. One of them is through
port forwarding.

**FORWARDING A LOCAL NETWORK PORT TO A PORT IN THE POD**

When you want to talk to a specific pod without going through a service (for debugging
or other reasons), Kubernetes allows you to configure port forwarding to the
pod. This is done through the kubectl port-forward command. The following
command will forward your machine’s local port 7000 to port 8080 of your kubia-manual
pod:

    kubectl port-forward kubia-manual 7000:8080

The port forwarder is running and you can now connect to your pod through the
local port.

**CONNECTING TO THE POD THROUGH THE PORT FORWARDER**

In a different terminal, you can now use curl to send an HTTP request to your pod
through the kubectl port-forward proxy running on localhost:7000:

    curl localhost:7000

Figure 3.5 shows an overly simplified view of what happens when you send the request.
In reality, a couple of additional components sit between the kubectl process and the
pod, but they aren’t relevant right now.

Using port forwarding like this is an effective way to test an individual pod.

## Organizing pods with labels

At this point, you have two pods running in your cluster. When deploying actual
applications, most users will end up running many more pods. As the number of
pods increases, the need for categorizing them into subsets becomes more and
more evident.


For example, with microservices architectures, the number of deployed microservices
can easily exceed 20. Those components will probably be replicated
(multiple copies of the same component will be deployed) and multiple versions or
releases (stable, beta, canary, and so on) will run concurrently. This can lead to hundreds
of pods in the system. Without a mechanism for organizing them, you end up
with a big, incomprehensible mess, such as the one shown in figure 3.6. The figure
shows pods of multiple microservices, with several running multiple replicas, and others
running different releases of the same microservice.


It’s evident you need a way of organizing them into smaller groups based on arbitrary
criteria, so every developer and system administrator dealing with your system can easily
see which pod is which. And you’ll want to operate on every pod belonging to a certain
group with a single action instead of having to perform the action for each pod
individually.

**Organizing pods and all other Kubernetes objects is done through labels.**

### Introducing labels

Labels are a simple, yet incredibly powerful, Kubernetes feature for organizing not
only pods, but all other Kubernetes resources. A label is an **arbitrary key-value pair** you
attach to a resource, which is then utilized when selecting resources using label selectors
(resources are filtered based on whether they include the label specified in the selector).
A resource can have more than one label, as long as the keys of those labels are
unique within that resource. You usually attach labels to resources when you create
them, but you can also add additional labels or even modify the values of existing
labels later without having to recreate the resource.

Let’s turn back to the microservices example from figure 3.6. By adding labels to
those pods, you get a much-better-organized system that everyone can easily make
sense of. Each pod is labeled with two labels:
- app, which specifies which app, component, or microservice the pod belongs to.
- rel, which shows whether the application running in the pod is a stable, beta,
or a canary release.

> DEFINITION A canary release is when you deploy a new version of an application
next to the stable version, and only let a small fraction of users hit the
new version to see how it behaves before rolling it out to all users. This prevents
bad releases from being exposed to too many users.

By adding these two labels, you’ve essentially organized your pods into two dimensions.

Every developer or ops person with access to your cluster can now easily see the system’s
structure and where each pod fits in by looking at the pod’s labels.

### Specifying labels when creating a pod

Now, you’ll see labels in action by creating a new pod with two labels. Create a new file
called kubia-manual-with-labels.yaml with the contents of the following listing.

```yml
apiVersion: v1
kind: Pod
metadata:
  name: kubia-manual-v2
  labels:
    creation_method: manual
    env: prod
spec:
  containers:
  - image: luksa/kubia
    name: kubia
    ports:
    - containerPort: 8080
      protocol: TCP
```

You’ve included the labels creation_method=manual and env=prod in the metadata.labels section.
You’ll create this pod now:

    kubectl create -f kubia-manual-with-labels.yaml

The kubectl get pods command doesn’t list any labels by default, but you can see
them by using the --show-labels switch:

    kubectl get pod --show-labels

Instead of listing all labels, if you’re only interested in certain labels, you can specify
them with the -L switch and have each displayed in its own column. List pods again
and show the columns for the two labels you’ve attached to your kubia-manual-v2 pod:

    kubectl get pod -L creation_method,env

### Modifying labels of existing pods

Labels can also be added to and modified on existing pods. Because the kubia-manual
pod was also created manually, let’s add the creation_method=manual label to it:

    kubectl label pod kubia-manual creation_method=manual

Now, let’s also change the env=prod label to env=debug on the kubia-manual-v2 pod,
to see how existing labels can be changed.

> NOTE You need to use the --overwrite option when changing existing labels.

    kubectl label po kubia-manual-v2 env=debug --overwrite

List the pods again to see the updated labels:
    
    kubectl get po -L creation_method,env

As you can see, attaching labels to resources is trivial, and so is changing them on
existing resources. It may not be evident right now, but this is an incredibly powerful
feature, as you’ll see in the next chapter. But first, let’s see what you can do with these
labels, in addition to displaying them when listing pods.

## Listing subsets of pods through label selectors

Attaching labels to resources so you can see the labels next to each resource when listing
them isn’t that interesting. But labels go hand in hand with label selectors. Label
selectors allow you to select a subset of pods tagged with certain labels and perform an
operation on those pods. A label selector is a criterion, which filters resources based
on whether they include a certain label with a certain value.
A label selector can select resources based on whether the resource
- Contains (or doesn’t contain) a label with a certain key
- Contains a label with a certain key and value
- Contains a label with a certain key, but with a value not equal to the one you
specify

### Listing pods using a label selector

Let’s use label selectors on the pods you’ve created so far. To see all pods you created
manually (you labeled them with creation_method=manual), do the following:

    kubectl get po -l creation_method=manual
    
To list all pods that include the env label, whatever its value is:

    kubectl get po -l env

And those that don’t have the env label:

    kubectl get po -l '!env'

> NOTE Make sure to use single quotes around !env, so the bash shell doesn’t
evaluate the exclamation mark.

## Using labels and selectors to constrain pod scheduling

All the pods you’ve created so far have been scheduled pretty much randomly across
your worker nodes. As I’ve mentioned in the previous chapter, this is the proper way
of working in a Kubernetes cluster. Because Kubernetes exposes all the nodes in the
cluster as a single, large deployment platform, it shouldn’t matter to you what node a
pod is scheduled to. Because each pod gets the exact amount of computational
resources it requests (CPU, memory, and so on) and its accessibility from other pods
isn’t at all affected by the node the pod is scheduled to, usually there shouldn’t be any
need for you to tell Kubernetes exactly where to schedule your pods.

Certain cases exist, however, where you’ll want to have at least a little say in where
a pod should be scheduled. A good example is when your hardware infrastructure
isn’t homogeneous. If some of your worker nodes have spinning hard drives, whereas
others have SSDs, you may want to schedule certain pods to one group of nodes and
the rest to the other. Another example is when you need to schedule pods performing
intensive GPU-based computation only to nodes that provide the required GPU
acceleration.

You never want to say specifically what node a pod should be scheduled to, because
that would couple the application to the infrastructure, whereas the whole idea of
Kubernetes is hiding the actual infrastructure from the apps that run on it. But if you
want to have a say in where a pod should be scheduled, instead of specifying an exact
node, you should describe the node requirements and then let Kubernetes select a
node that matches those requirements. This can be done through node labels and
node label selectors.
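
A minimal sketch of a node label selector in a pod descriptor, assuming at least one node in the cluster has already been labeled with the hypothetical label gpu=true. The `nodeSelector` field tells the scheduler to consider only nodes that carry every listed label.

```yml
apiVersion: v1
kind: Pod
metadata:
  name: kubia-gpu
spec:
  nodeSelector:
    gpu: "true"                  # assumed node label; quoted because label values are strings
  containers:
  - image: luksa/kubia
    name: kubia
```

If no node carries the label, the pod stays unscheduled until a matching node appears.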

### Scheduling to one specific node

Similarly, you could also schedule a pod to an exact node, because each node also has
a unique label with the key kubernetes.io/hostname and value set to the actual hostname
of the node. But setting the nodeSelector to a specific node by the hostname
label may lead to the pod being unschedulable if the node is offline. You shouldn’t
think in terms of individual nodes. Always think about logical groups of nodes that satisfy
certain criteria specified through label selectors.



## Stopping and removing pods

### Deleting a pod by name

First, delete the kubia-gpu pod by name:
    
    kubectl delete po kubia-gpu


By deleting a pod, you’re instructing Kubernetes to terminate all the containers that are
part of that pod. Kubernetes sends a SIGTERM signal to the process and waits a certain
number of seconds (30 by default) for it to shut down gracefully. If it doesn’t shut down
in time, the process is then killed through SIGKILL. To make sure your processes are
always shut down gracefully, they need to handle the SIGTERM signal properly.
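
The grace period described above can be adjusted per pod through the `terminationGracePeriodSeconds` field in the pod spec. The sketch below raises it from the default of 30 seconds to 60; the pod name `kubia-graceful` and the chosen value are assumptions for illustration.

```yml
apiVersion: v1
kind: Pod
metadata:
  name: kubia-graceful               # hypothetical pod name
spec:
  terminationGracePeriodSeconds: 60  # wait up to 60s after SIGTERM before SIGKILL
  containers:
  - image: luksa/kubia
    name: kubia
    ports:
    - containerPort: 8080
      protocol: TCP
```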

> TIP You can also delete more than one pod by specifying multiple, space-separated
names (for example, kubectl delete po pod1 pod2).

### Deleting pods using label selectors

Instead of specifying each pod to delete by name, you’ll now use what you’ve learned
about label selectors to stop both the kubia-manual and the kubia-manual-v2 pod.
Both pods include the creation_method=manual label, so you can delete them by
using a label selector:

    kubectl delete po -l creation_method=manual

### Deleting (almost) all resources in a namespace

You can delete the ReplicationController and the pods, as well as all the Services
you’ve created, by deleting all resources in the current namespace with a single
command:
    
    kubectl delete all --all

The first all in the command specifies that you’re deleting resources of all types, and
the --all option specifies that you’re deleting all resource instances instead of specifying
them by name.

> NOTE Deleting everything with the all keyword doesn’t delete absolutely
everything. Certain resources (like Secrets, which we’ll introduce in chapter 7)
are preserved and need to be deleted explicitly.

As it deletes resources, kubectl will print the name of every resource it deletes. In the
list, you should see the kubia ReplicationController and the kubia-http Service you
created in chapter 2.

> NOTE The kubectl delete all --all command also deletes the kubernetes
Service, but it should be recreated automatically in a few moments.