# Kubernetes
- Doc: https://kubernetes.io/docs/
  - [Kubernetes API](https://kubernetes.io/docs/reference/kubernetes-api/)
- Code: https://github.com/kubernetes/kubernetes

Books:
* Lukša, Marko. **Kubernetes in Action**. 2017. Manning. [Kubernetes in Action.ipynb](./Kubernetes/Kubernetes%20in%20Action.ipynb)
* 张磊. **深入剖析Kubernetes**. 2021. 人民邮电出版社.


Topics:
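- Container technology fundamentals
  - Isolation, resource limits
  - Container runtime: CRI
- Architecture
- Cluster deployment
  - kubeadm
- Container orchestration principles
  - Pod
  - Deployment
  - StatefulSet
  - DaemonSet
  - Job, CronJob
  - Declarative API
  - RBAC
  - Operator
- Storage principles
  - PV, PVC
  - CSI plugins
- Networking principles
  - Network model
  - Layer 3 networking solutions
  - CNI plugins
  - Network isolation: NetworkPolicy
  - Service
  - Ingress
- Scheduling and resource management
  - Resource model
  - Scheduler: scheduling policies, priority, preemption
  - Device Plugin
- Monitoring and logging
  - Metrics Server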

# Kubernetes objects

- Kubernetes objects

Kubernetes objects are persistent entities in the Kubernetes system.
A Kubernetes object is a "record of intent"--once you create the object, the Kubernetes system will constantly work to ensure that object exists.

- Labels and Selectors

Labels are key/value pairs that are attached to objects, such as pods. Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users, but do not directly imply semantics to the core system.

Via a label selector, the client/user can identify a set of objects. The label selector is the core grouping primitive in Kubernetes.
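To make the grouping idea concrete, here is a minimal pure-Python sketch of equality-based selector matching. The `matches` helper and the sample pods are hypothetical and only mimic what the API server does; set-based operators such as `in` and `notin` are not covered.

```python
from typing import Dict


def matches(labels: Dict[str, str], selector: Dict[str, str]) -> bool:
    """Return True if every key/value pair of the selector is present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


# Hypothetical objects; names and labels are made up for illustration.
pods = [
    {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "db-1", "labels": {"app": "db", "tier": "backend"}},
]

# Select the set of objects carrying the label app=web.
selected = [pod["name"] for pod in pods if matches(pod["labels"], {"app": "web"})]
print(selected)
```

Against a real cluster, the same selection would normally be delegated to the API server, for example through the `label_selector="app=web"` argument of the Python client's list calls.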

- Namespaces

In Kubernetes, namespaces provide a mechanism for isolating groups of resources within a single cluster.

- Annotations

You can use Kubernetes annotations to attach arbitrary non-identifying metadata to objects. 
Clients such as tools and libraries can retrieve this metadata.

- Field Selectors

Field selectors let you select Kubernetes resources based on the value of one or more resource fields.

- Finalizers

Finalizers are namespaced keys that tell Kubernetes to wait until specific conditions are met before it fully deletes resources marked for deletion.
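The deletion flow can be sketched in plain Python. This is a simplified model under stated assumptions: the `Resource` class, the in-memory `store`, and the `example.com/cleanup` finalizer key are all hypothetical, and real controllers remove their finalizers by patching the object through the API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Resource:
    name: str
    finalizers: List[str] = field(default_factory=list)
    deletion_timestamp: Optional[str] = None  # set once deletion is requested


def _collect(store: Dict[str, Resource], name: str) -> None:
    """Remove the resource only if it is marked for deletion and has no finalizers."""
    resource = store[name]
    if resource.deletion_timestamp is not None and not resource.finalizers:
        del store[name]


def request_delete(store: Dict[str, Resource], name: str, now: str) -> None:
    """Mark the resource for deletion; it stays visible while finalizers remain."""
    store[name].deletion_timestamp = now
    _collect(store, name)


def remove_finalizer(store: Dict[str, Resource], name: str, finalizer: str) -> None:
    """Called by the controller owning the finalizer once its cleanup is done."""
    store[name].finalizers.remove(finalizer)
    _collect(store, name)


store = {"volume-1": Resource("volume-1", finalizers=["example.com/cleanup"])}
request_delete(store, "volume-1", now="2024-01-01T00:00:00Z")
still_present = "volume-1" in store  # the finalizer blocks the actual removal
remove_finalizer(store, "volume-1", "example.com/cleanup")
gone = "volume-1" not in store       # last finalizer removed, object is deleted
print(still_present, gone)
```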

- Owners and Dependents

In Kubernetes, some objects are owners of other objects. For example, a ReplicaSet is the owner of a set of Pods. 
These owned objects are dependents of their owner.
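As a rough illustration of how ownership drives cascading deletion, the sketch below models objects with a list of owner UIDs, mirroring `metadata.ownerReferences`. The UIDs, names, and the `cascade_delete` helper are hypothetical; the real garbage collector runs asynchronously in the control plane.

```python
from typing import Dict, List

# Hypothetical objects keyed by UID; "owners" stands in for metadata.ownerReferences.
objects: Dict[str, dict] = {
    "uid-rs": {"kind": "ReplicaSet", "name": "web-5d78c", "owners": []},
    "uid-p1": {"kind": "Pod", "name": "web-5d78c-a", "owners": ["uid-rs"]},
    "uid-p2": {"kind": "Pod", "name": "web-5d78c-b", "owners": ["uid-rs"]},
    "uid-p3": {"kind": "Pod", "name": "standalone", "owners": []},
}


def cascade_delete(objects: Dict[str, dict], uid: str) -> List[str]:
    """Delete an object and, recursively, dependents left without any owner."""
    deleted = [objects.pop(uid)["name"]]
    dependents = [u for u, obj in objects.items() if uid in obj["owners"]]
    for dep_uid in dependents:
        if dep_uid not in objects:  # already removed by a nested call
            continue
        objects[dep_uid]["owners"].remove(uid)
        if not objects[dep_uid]["owners"]:
            deleted += cascade_delete(objects, dep_uid)
    return deleted


deleted = cascade_delete(objects, "uid-rs")
print(deleted)        # the ReplicaSet and the two Pods it owned
print(list(objects))  # the unowned Pod survives
```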

# Components

![](https://d33wubrfki0l68.cloudfront.net/2475489eaf20163ec0f54ddc1d92aa8d4c87c96b/e7c81/images/docs/components-of-kubernetes.svg)

Control Plane Components:
- kube-apiserver: The core component server that exposes the Kubernetes HTTP API
- etcd: Consistent and highly-available key value store for all API server data
- kube-scheduler: Looks for Pods not yet bound to a node, and assigns each Pod to a suitable node.
- kube-controller-manager: Runs controllers to implement Kubernetes API behavior.
- cloud-controller-manager: Integrates with underlying cloud provider(s).

Node Components:
- kubelet: Ensures that Pods are running, including their containers.
- kube-proxy: Maintains network rules on nodes to implement Services.
- Container runtime: Software responsible for running containers.

Addons:
- DNS: For cluster-wide DNS resolution
- Web UI (Dashboard): [General-purpose web UI for Kubernetes clusters](https://github.com/kubernetes/dashboard), for cluster management via a web interface
- Container Resource Monitoring: For collecting and storing container metrics
- Cluster-level Logging: For saving container logs to a central log store

## kubelet
- https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/

## kube-proxy
- https://kubernetes.io/docs/reference/command-line-tools-reference/kube-proxy/

## kube-apiserver
- https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/

## kube-controller-manager
- https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/

## kube-scheduler
- https://kubernetes.io/docs/reference/command-line-tools-reference/kube-scheduler/

## Kubernetes Metrics Server

https://github.com/kubernetes-sigs/metrics-server

> Metrics Server is a scalable, efficient source of container resource metrics for Kubernetes built-in autoscaling pipelines.
>
> Metrics Server collects resource metrics from Kubelets and exposes them in Kubernetes apiserver through [Metrics API](https://github.com/kubernetes/metrics) for use by [Horizontal Pod Autoscaler](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) and [Vertical Pod Autoscaler](https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler/). Metrics API can also be accessed by `kubectl top`, making it easier to debug autoscaling pipelines.

# Container orchestration
- containers

Autoscaling workloads:
- HorizontalPodAutoscaler(HPA)
- VerticalPodAutoscaler(VPA)
- Autoscaling based on cluster size
- Event driven Autoscaling
- Autoscaling based on schedules

## Workloads

- Pods

- Deployment

A Deployment provides *declarative updates for Pods and ReplicaSets*. You describe a desired state in a Deployment, and the Deployment Controller changes the actual state to the desired state at a controlled rate. You can define Deployments to create new ReplicaSets, or to remove existing Deployments and adopt all their resources with new Deployments.
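A desired state for a Deployment might be written as follows. This is a sketch: the `deployment_manifest` helper, the `web` name, and the `nginx:1.25` image are illustrative assumptions, while the field names follow the `apps/v1` Deployment schema. The manifest is built as a plain dict and printed as JSON, which `kubectl apply -f` accepts alongside YAML.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int) -> dict:
    """Build an apps/v1 Deployment whose selector matches its Pod template labels."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # Controlled rate of change: add at most one extra Pod at a time
            # and never drop below the desired replica count during a rollout.
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {"maxSurge": 1, "maxUnavailable": 0},
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


manifest = deployment_manifest("web", "nginx:1.25", replicas=3)
print(json.dumps(manifest, indent=2))
```

Changing the image in the Pod template and re-applying the manifest is what triggers a new ReplicaSet and a rolling update.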

- ReplicaSet (replaces ReplicationController)

A ReplicaSet's purpose is to *maintain a stable set of replica Pods running at any given time*. As such, it is often used to guarantee the availability of a specified number of identical Pods.
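The "maintain a stable set" behaviour is a reconciliation loop. The toy version below only compares a desired count with a list of running Pod names; the `reconcile` function and the sequential name suffixes are hypothetical (real Pods get random suffixes and the controller works through the API server).

```python
import itertools
from typing import List

_suffix = itertools.count()  # hypothetical name generator for new Pods


def reconcile(desired: int, running: List[str], prefix: str = "web") -> List[str]:
    """One reconciliation pass: return the Pod list after converging on `desired`."""
    pods = list(running)
    while len(pods) < desired:  # too few replicas: create replacements
        pods.append(f"{prefix}-{next(_suffix)}")
    return pods[:desired]       # too many replicas: drop the surplus


initial = reconcile(3, [])
# Simulate a Pod failure, then let the loop restore the replica count.
after_failure = reconcile(3, [p for p in initial if p != "web-1"])
# Scaling down is the same loop with a smaller desired count.
scaled_down = reconcile(2, after_failure)
print(initial, after_failure, scaled_down)
```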

- StatefulSets

StatefulSet is the workload API object used to *manage stateful applications*. It manages the deployment and scaling of a set of Pods, and provides *guarantees about the ordering and uniqueness of these Pods*. Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky **identity** for each of its Pods. These Pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling. If you want to use storage volumes to provide persistence for your workload, you can use a StatefulSet as part of the solution. Although individual Pods in a StatefulSet are susceptible to failure, the persistent Pod identifiers make it easier to match existing volumes to the new Pods that replace any that have failed.
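The sticky identity is visible in the naming scheme: Pods are named `<statefulset>-<ordinal>`, and each PersistentVolumeClaim created from a volume claim template is named `<template>-<pod>`. The sketch below just generates those names, assuming the default start ordinal of 0; the `pod_identities` helper and the `data` template name are hypothetical.

```python
from typing import Dict, List


def pod_identities(statefulset: str, replicas: int,
                   claim_template: str = "data") -> List[Dict[str, str]]:
    """List the stable Pod names and matching PVC names, in creation order."""
    return [
        {
            "pod": f"{statefulset}-{ordinal}",
            "pvc": f"{claim_template}-{statefulset}-{ordinal}",
        }
        for ordinal in range(replicas)
    ]


identities = pod_identities("web", 3)
for identity in identities:
    print(identity["pod"], "->", identity["pvc"])
```

Because a replacement Pod reuses the same ordinal, it is matched to the same claim as the Pod that failed.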


- DaemonSet

A DaemonSet ensures that *all (or some) Nodes run a copy of a Pod*. As nodes are added to the cluster, Pods are added to them. As nodes are removed from the cluster, those Pods are garbage collected. Deleting a DaemonSet will clean up the Pods it created.

Some typical uses of a DaemonSet are:

* running a cluster storage daemon on every node
* running a logs collection daemon on every node
* running a node monitoring daemon on every node

In a simple case, one DaemonSet, covering all nodes, would be used for each type of daemon. A more complex setup might use multiple DaemonSets for a single type of daemon, but with different flags and/or different memory and cpu requests for different hardware types.

- Jobs

A Job creates one or more Pods and will *continue to retry execution of the Pods until a specified number of them successfully terminate*. As Pods successfully complete, the Job tracks the successful completions. When a specified number of successful completions is reached, the task (i.e., the Job) is complete. Deleting a Job will clean up the Pods it created. Suspending a Job will delete its active Pods until the Job is resumed again.

A simple case is to create one Job object in order to reliably run one Pod to completion. The Job object will start a new Pod if the first Pod fails or is deleted (for example due to a node hardware failure or a node reboot).

You can also use a Job to run multiple Pods in parallel.

If you want to run a Job (either a single task, or several in parallel) on a schedule, see CronJob.
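The retry-until-enough-successes behaviour can be sketched as a small simulation. The `run_job` function is hypothetical and deliberately simplified: it treats Pod results as a sequence of booleans and uses `completions` and `backoff_limit` as stand-ins for the Job's `.spec.completions` and `.spec.backoffLimit`, ignoring parallelism and back-off delays.

```python
from typing import Iterable


def run_job(outcomes: Iterable[bool], completions: int, backoff_limit: int) -> str:
    """Consume Pod results until enough succeed or too many fail."""
    succeeded = failed = 0
    for ok in outcomes:
        if ok:
            succeeded += 1
        else:
            failed += 1  # a failed Pod leads to a retry with a new Pod
        if succeeded >= completions:
            return "Complete"
        if failed > backoff_limit:
            return "Failed"
    return "Running"  # still waiting for more Pods to finish


retried = run_job([False, True, True], completions=2, backoff_limit=4)
exhausted = run_job([False] * 5, completions=1, backoff_limit=4)
pending = run_job([True], completions=2, backoff_limit=4)
print(retried, exhausted, pending)
```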

- Automatic Cleanup for Finished Jobs

> FEATURE STATE: Kubernetes v1.23 [stable]

When your Job has finished, it's useful to keep that Job in the API (and not immediately delete the Job) so that you can tell whether the Job succeeded or failed.

Kubernetes' TTL-after-finished controller provides a TTL (time to live) mechanism to limit the lifetime of Job objects that have finished execution.
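The TTL is set through the Job's `.spec.ttlSecondsAfterFinished` field. As a small worked example, the hypothetical `cleanup_time` helper below computes the moment a finished Job becomes eligible for deletion; the timestamps are made up.

```python
from datetime import datetime, timedelta, timezone


def cleanup_time(finished_at: datetime, ttl_seconds_after_finished: int) -> datetime:
    """Return the time at which a finished Job becomes eligible for cleanup."""
    return finished_at + timedelta(seconds=ttl_seconds_after_finished)


finished = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
eligible_at = cleanup_time(finished, 100)

# The Job is kept while now < eligible_at and may be removed afterwards.
now = datetime(2024, 1, 1, 12, 1, 0, tzinfo=timezone.utc)
expired = now >= eligible_at
print(eligible_at.isoformat(), expired)
```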

- CronJob

> FEATURE STATE: Kubernetes v1.21 [stable]

A CronJob creates Jobs on a *repeating schedule*.

CronJob is meant for performing regular scheduled actions such as backups, report generation, and so on. One CronJob object is like one line of a crontab (cron table) file on a Unix system. It runs a job periodically on a given schedule, written in Cron format.
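A CronJob manifest for a nightly backup might look like the sketch below. The `nightly-backup` name, the `busybox:1.36` image, and the `describe_schedule` helper are illustrative assumptions; the field names follow the `batch/v1` CronJob schema. The helper only handles the five-field Cron form, not macros such as `@daily`.

```python
from typing import Dict

cronjob = {
    "apiVersion": "batch/v1",
    "kind": "CronJob",
    "metadata": {"name": "nightly-backup"},
    "spec": {
        "schedule": "0 2 * * *",        # every day at 02:00
        "concurrencyPolicy": "Forbid",  # skip a run if the previous one is still active
        "jobTemplate": {
            "spec": {
                "template": {
                    "spec": {
                        "restartPolicy": "OnFailure",
                        "containers": [{
                            "name": "backup",
                            "image": "busybox:1.36",
                            "command": ["sh", "-c", "echo backing up"],
                        }],
                    }
                }
            }
        },
    },
}


def describe_schedule(schedule: str) -> Dict[str, str]:
    """Split a five-field Cron expression into named fields."""
    names = ["minute", "hour", "day_of_month", "month", "day_of_week"]
    parts = schedule.split()
    if len(parts) != len(names):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(names, parts))


fields = describe_schedule(cronjob["spec"]["schedule"])
print(fields)
```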

- ReplicationController

> Note: A Deployment that configures a ReplicaSet is now the recommended way to set up replication.

A ReplicationController ensures that a specified number of pod replicas are running at any one time. In other words, a ReplicationController makes sure that a pod or a homogeneous set of pods is always up and available.

## CRI

The CRI is a plugin interface which enables the kubelet to use a wide variety of container runtimes, without having a need to recompile the cluster components.
    
The Kubernetes Container Runtime Interface (CRI) defines the main gRPC protocol for the communication between the cluster components kubelet and container runtime.

# Networking
- service, load balancing, networking
- Network plugins: CNI
- [Ports and Protocols](https://kubernetes.io/docs/reference/networking/ports-and-protocols/)

- Service

An abstract way to expose an application running on a set of `Pod`s as a network service.

With Kubernetes you don't need to modify your application to use an unfamiliar service discovery mechanism. Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them.
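To show how a Service ties a DNS name to a set of Pods, the sketch below builds a `v1` Service manifest and the in-cluster DNS name it would get. The helper names and the `web`/`default` values are hypothetical, and `cluster.local` is assumed as the cluster domain (the usual default, but configurable).

```python
def service_manifest(name: str, namespace: str, selector: dict,
                     port: int, target_port: int) -> dict:
    """Build a v1 Service that forwards `port` to `target_port` on matching Pods."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "selector": selector,  # Pods with these labels become the backends
            "ports": [{"protocol": "TCP", "port": port, "targetPort": target_port}],
        },
    }


def service_dns_name(name: str, namespace: str,
                     cluster_domain: str = "cluster.local") -> str:
    """Return the fully qualified in-cluster DNS name of a Service."""
    return f"{name}.{namespace}.svc.{cluster_domain}"


service = service_manifest("web", "default", {"app": "web"}, port=80, target_port=8080)
dns_name = service_dns_name("web", "default")
print(dns_name)
```

Clients inside the cluster connect to that one name, and traffic is spread across whichever Pods currently match the selector.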

- Ingress

> FEATURE STATE: Kubernetes v1.19 [stable]

An API object that manages external access to the services in a cluster, typically HTTP.

Ingress may provide load balancing, SSL termination and name-based virtual hosting.

![Ingress](https://d33wubrfki0l68.cloudfront.net/91ace4ec5dd0260386e71960638243cf902f8206/c3c52/docs/images/ingress.svg)
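Name-based virtual hosting can be illustrated with a `networking.k8s.io/v1` Ingress holding two host rules and a toy router. The hostnames, Service names, the `nginx` class, and the `route` helper are hypothetical; `route` uses a plain string-prefix check, which is cruder than the path-element matching real controllers apply for `pathType: Prefix`.

```python
from typing import Optional


def _rule(host: str, service: str) -> dict:
    """Build one host rule sending every path under / to `service` on port 80."""
    backend = {"service": {"name": service, "port": {"number": 80}}}
    path = {"path": "/", "pathType": "Prefix", "backend": backend}
    return {"host": host, "http": {"paths": [path]}}


ingress = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "Ingress",
    "metadata": {"name": "sites"},
    "spec": {
        "ingressClassName": "nginx",
        "rules": [
            _rule("shop.example.com", "shop"),
            _rule("blog.example.com", "blog"),
        ],
    },
}


def route(ingress: dict, host: str, path: str) -> Optional[str]:
    """Return the backend Service name for a request, or None if no rule matches."""
    for rule in ingress["spec"]["rules"]:
        if rule["host"] != host:
            continue
        for entry in rule["http"]["paths"]:
            if path.startswith(entry["path"]):
                return entry["backend"]["service"]["name"]
    return None


print(route(ingress, "shop.example.com", "/cart"))
print(route(ingress, "blog.example.com", "/"))
```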

- Ingress Controllers

In order for the Ingress resource to work, the cluster must have an ingress controller running.

Unlike other types of controllers which run as part of the `kube-controller-manager` binary, Ingress controllers are not started automatically with a cluster. Choose the ingress controller implementation that best fits your cluster.

Kubernetes as a project supports and maintains AWS, GCE, and nginx ingress controllers.

Additional third-party controllers are also available.


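# Storage
- CSI

# Resource management, scheduling
- scheduling, preemption, eviction

# Cluster management
- Node shutdown and autoscaling
- Certificates
- Cluster networking
- Logging
- Component metrics
- ...

## Logging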
- [System Logs](https://kubernetes.io/docs/concepts/cluster-administration/system-logs/)
- [kube-log-runner](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/component-base/logs/kube-log-runner/README.md)

# Configuration
- ConfigMap
- Secrets
- liveness, readiness, startup probe

The kubelet uses **liveness probes** to know when to restart a container. 
For example, liveness probes could catch a deadlock, where an application is running, but unable to make progress. Restarting a container in such a state can help to make the application more available despite bugs.

The kubelet uses **readiness probes** to know when a container is ready to start accepting traffic. 
A Pod is considered ready when all of its containers are ready. One use of this signal is to control which Pods are used as backends for Services. When a Pod is not ready, it is removed from Service load balancers.

The kubelet uses **startup probes** to know when a container application has started. 
If such a probe is configured, it disables liveness and readiness checks until it succeeds, making sure those probes don't interfere with the application startup. This can be used to adopt liveness checks on slow starting containers, avoiding them getting killed by the kubelet before they are up and running.
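The three probes sit side by side in a container spec. The sketch below shows one possible configuration as a plain dict; the image, paths, and port are hypothetical, while the probe field names (`httpGet`, `periodSeconds`, `failureThreshold`) follow the Pod API. The `startup_budget_seconds` helper is an illustration of how long a slow-starting container is given before the kubelet kills it, ignoring `initialDelaySeconds`.

```python
container = {
    "name": "app",
    "image": "registry.example.com/app:1.0",  # hypothetical image
    # Holds off liveness and readiness checks until the application has started.
    "startupProbe": {
        "httpGet": {"path": "/healthz", "port": 8080},
        "periodSeconds": 10,
        "failureThreshold": 30,
    },
    # Restarts the container if it stops responding (for example, on a deadlock).
    "livenessProbe": {
        "httpGet": {"path": "/healthz", "port": 8080},
        "periodSeconds": 10,
        "failureThreshold": 3,
    },
    # Removes the Pod from Service backends while it cannot accept traffic.
    "readinessProbe": {
        "httpGet": {"path": "/ready", "port": 8080},
        "periodSeconds": 5,
    },
}


def startup_budget_seconds(probe: dict) -> int:
    """Upper bound on startup time: periodSeconds * failureThreshold."""
    # Fall back to the API defaults (10 s period, 3 failures) when a field is unset.
    return probe.get("periodSeconds", 10) * probe.get("failureThreshold", 3)


budget = startup_budget_seconds(container["startupProbe"])
print(budget)  # 30 attempts, 10 seconds apart
```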

# Security, Policy

Security:
- Pod security: standards, admission, policy
- Service account
- API access
- RBAC

Policy:
- resource quota

# Extending Kubernetes

# Tools

- [Docker Desktop Kubernetes](https://docs.docker.com/desktop/features/kubernetes/): e.g. v1.27.2
- GoogleContainerTools
	- [distroless](https://github.com/GoogleContainerTools/distroless): Language focused docker images, minus the operating system.

## kubectl
- [The Kubectl book](https://kubectl.docs.kubernetes.io/)

In [9]:
!kubectl version

Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.2", GitCommit:"7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647", GitTreeState:"clean", BuildDate:"2023-05-17T14:20:07Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"windows/amd64"}




Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.2", GitCommit:"7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647", GitTreeState:"clean", BuildDate:"2023-05-17T14:13:28Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"linux/amd64"}


In [None]:
!kubectl get nodes

NAME             STATUS   ROLES           AGE   VERSION
docker-desktop   Ready    control-plane   33d   v1.27.2


In [8]:
!kubectl get ns

NAME              STATUS   AGE
default           Active   33d
kube-node-lease   Active   33d
kube-public       Active   33d
kube-system       Active   33d


## Python Client
- [client libraries](https://kubernetes.io/docs/reference/using-api/client-libraries/)

In [3]:
!pip install kubernetes



In [6]:
# Access Clusters Using the Kubernetes API
# https://kubernetes.io/docs/tasks/administer-cluster/access-cluster-api/

# kubernetes.client
# https://github.com/kubernetes-client/python/blob/master/kubernetes/README.md
from kubernetes import client, config

# Load credentials and cluster info from the local kubeconfig file.
config.load_kube_config()

v1 = client.CoreV1Api()
print("Listing pods with their IPs:")
ret = v1.list_pod_for_all_namespaces(watch=False)
for pod in ret.items:
    print(f"{pod.status.pod_ip}\t{pod.metadata.namespace}\t{pod.metadata.name}")

# nodes
nodes = v1.list_node()
for item in nodes.items:
    print(item.metadata.name)

# namespaces
namespaces = v1.list_namespace()
for item in namespaces.items:
    print(item.metadata.name)

# pods: print each pod name; for the etcd pod also show its phase and log tail
ns_name = 'kube-system'
pods = v1.list_namespaced_pod(ns_name)
for item in pods.items:
    pod_name = item.metadata.name
    print("pod:", pod_name)
    if pod_name == 'etcd-docker-desktop':
        print(v1.read_namespaced_pod_status(pod_name, ns_name).status.phase)
        # Only the last 200 characters of the log are printed.
        print(v1.read_namespaced_pod_log(pod_name, ns_name)[-200:])
    print()

# services (same namespace as above)
services = v1.list_namespaced_service(ns_name)
for item in services.items:
    srv_name = item.metadata.name
    print("service:", srv_name)

Listing pods with their IPs:
10.1.0.46	kube-system	coredns-5d78c9869d-rlgnm
10.1.0.49	kube-system	coredns-5d78c9869d-zkbjr
192.168.65.4	kube-system	etcd-docker-desktop
192.168.65.4	kube-system	kube-apiserver-docker-desktop
192.168.65.4	kube-system	kube-controller-manager-docker-desktop
192.168.65.4	kube-system	kube-proxy-plbl4
192.168.65.4	kube-system	kube-scheduler-docker-desktop
10.1.0.47	kube-system	storage-provisioner
10.1.0.48	kube-system	vpnkit-controller
docker-desktop
default
kube-node-lease
kube-public
kube-system
pod: coredns-5d78c9869d-rlgnm

pod: coredns-5d78c9869d-zkbjr

pod: etcd-docker-desktop
Running
ook":"433.362µs","hash":4221830778}
{"level":"info","ts":"2025-02-24T03:04:09.288Z","caller":"mvcc/hash.go:137","msg":"storing new hash","hash":4221830778,"revision":267628,"compact-revision":267230}


pod: kube-apiserver-docker-desktop

pod: kube-controller-manager-docker-desktop

pod: kube-proxy-plbl4

pod: kube-scheduler-docker-desktop

pod: storage-provisioner

pod: vpn

## minikube
- https://github.com/kubernetes/minikube

> minikube implements a local Kubernetes cluster on macOS, Linux, and Windows. minikube's [primary goals](https://minikube.sigs.k8s.io/docs/concepts/principles/) are to be the best tool for local Kubernetes application development and to support all Kubernetes features that fit.

In [None]:
# !docker pull gcr.io/k8s-minikube/storage-provisioner:v5
# !docker pull registry.k8s.io/kube-controller-manager:v1.32.0
# !docker pull registry.k8s.io/coredns/coredns:v1.11.3
# !docker pull registry.k8s.io/kube-apiserver:v1.32.0
# !docker pull registry.k8s.io/kube-proxy:v1.32.0
# !docker pull registry.k8s.io/pause:3.10
# !docker pull registry.k8s.io/etcd:3.5.16-0
# !docker pull registry.k8s.io/kube-scheduler:v1.32.0

!minikube delete
!minikube config set driver docker
# https://github.com/kubernetes/minikube/issues/8997
# https://storage.googleapis.com/minikube-preloaded-volume-tarballs/v18/v1.32.0/preloaded-images-k8s-v18-v1.32.0-docker-overlay2-amd64.tar.lz4?checksum=md5:4da2ed9bc13e09e8e9b7cf53d01335db

!minikube start --alsologtostderr

'minikube.exe' is not recognized as an internal or external command,
operable program or batch file.


In [3]:
!minikube status

minikube
type: Control Plane
host: Running
kubelet: Running
apiserver: Running
kubeconfig: Configured



In [4]:
# !minikube dashboard

^C


# Application
- quarkus-kubernetes in 'Reactive Systems in Java'

## Jib
- [Code](https://github.com/GoogleContainerTools/jib)

> Jib: Build container images for your Java applications.
> 
> Jib builds optimized Docker and [OCI](https://github.com/opencontainers/image-spec) images for your Java applications without a Docker daemon - and without deep mastery of Docker best-practices. It is available as plugins for [Maven](https://github.com/GoogleContainerTools/jib/blob/master/jib-maven-plugin) and [Gradle](https://github.com/GoogleContainerTools/jib/blob/master/jib-gradle-plugin) and as a Java library.
>
>- [Maven](https://maven.apache.org/): See documentation for [jib-maven-plugin](https://github.com/GoogleContainerTools/jib/blob/master/jib-maven-plugin).
>- [Gradle](https://gradle.org/): See documentation for [jib-gradle-plugin](https://github.com/GoogleContainerTools/jib/blob/master/jib-gradle-plugin).
>- [Jib Core](https://github.com/GoogleContainerTools/jib/blob/master/jib-core): A general-purpose container-building library for Java.
>- [Jib CLI](https://github.com/GoogleContainerTools/jib/blob/master/jib-cli): A command-line interface for building images that uses Jib Core.

Maven Plugin:

```xml
<properties>
    <app.main.class>xxx</app.main.class>
    <jib.image.from>
    docker://eclipse-temurin:17-jdk-alpine@sha256:ddd7a05cf8263989c29f2a9476dcfa25d0eaf8310d400f998ebd03c0d32feb72
    </jib.image.from>
    <jib.image.to>projectxxx/${project.artifactId}:${project.version}</jib.image.to>
    <harbor.username>xxx</harbor.username>
    <harbor.password>xxx</harbor.password>
</properties>

<plugin>
    <groupId>com.google.cloud.tools</groupId>
    <artifactId>jib-maven-plugin</artifactId>
    <version>3.3.2</version>
    <configuration>
        <containerizingMode>packaged</containerizingMode>
        <from>
            <image>${jib.image.from}</image>
        </from>
        <to>
            <image>${jib.image.to}</image>
            <auth>
                <username>${harbor.username}</username>
                <password>${harbor.password}</password>
            </auth>
        </to>
        <container>
            <jvmFlags>
                <jvmFlag>-Xms512m</jvmFlag>
            </jvmFlags>
            <environment>
                <TZ>Asia/Shanghai</TZ>
                <!-- profile -->
                <!-- <spring.profiles.active>prod</spring.profiles.active>-->
            </environment>
            <volumes>
                <volume>/tmp</volume>
            </volumes>
            <ports>
                <port>80</port>
            </ports>
            <!-- <entrypoint>java -cp /app/libs/* -jar /app/${project.artifactId}-${project.version}.jar-->
            <!-- </entrypoint>-->
            <mainClass>${app.main.class}</mainClass>
            <format>OCI</format>
        </container>
        <allowInsecureRegistries>true</allowInsecureRegistries>
    </configuration>
</plugin>
```

# Specification

## OCI: Open Container Initiative


- [About the Open Container Initiative](https://opencontainers.org/about/overview/)

> Open Container Initiative (OCI) 
> The Open Container Initiative (OCI) is a lightweight, open governance structure (project), formed under the auspices of the Linux Foundation, for the express purpose of creating open industry standards around container formats and runtimes. The OCI was launched on June 22nd 2015 by Docker, CoreOS and other leaders in the container industry.
> 
> The OCI currently contains three specifications: **the Runtime Specification (runtime-spec)**, **the Image Specification (image-spec)** and **the Distribution Specification (distribution-spec)**. The Runtime Specification outlines how to run a “filesystem bundle” that is unpacked on disk. At a high-level an OCI implementation would download an OCI Image then unpack that image into an OCI Runtime filesystem bundle. At this point the OCI Runtime Bundle would be run by an OCI Runtime.
> 
> This entire workflow should support the UX that users have come to expect from container engines like **Docker** and **rkt**: primarily, the ability to run an image with no additional arguments:
>
>- docker run example.com/org/app:v1.0.0
>- rkt run example.com/org/app,version=v1.0.0
>
> To support this UX the OCI Image Format contains sufficient information to launch the application on the target platform (e.g. command, arguments, environment variables, etc). This specification defines how to create an OCI Image, which will generally be done by a build system, and output an **image manifest**, a **filesystem (layer) serialization**, and an **image configuration**.
> 
> Docker is donating its container format and runtime, **runC**, to the OCI to serve as the cornerstone of this new effort. It is available now at https://github.com/opencontainers/runc.
> 
> The distribution specification reached v1.0 in May 2020 and was introduced to OCI as an effort to standardize the API to distribute container images. However, the specification is designed generically enough to be leveraged as a distribution mechanism for any type of content.


Specifications:
- [OCI Runtime Specification](https://github.com/opencontainers/runtime-spec)
- [OCI Image Format](https://github.com/opencontainers/image-spec)
- [OCI Distribution Specification](https://github.com/opencontainers/distribution-spec)


# References

Kubernetes API Reference: https://kubernetes.io/docs/reference/kubernetes-api
 - API Conventions: https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md
 - One-page API Reference for Kubernetes v1.28: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.28/

Configuration APIs: https://kubernetes.io/docs/reference/config-api/
 - kubeconfig
 - kube-apiserver admission
 - kube-apiserver configuration
 - kube-apiserver encryption
 - kube-apiserver event rate limit
 - kubelet configuration
 - kubelet credential providers
 - kube-scheduler configuration
 - kube-controller-manager configuration
 - kube-proxy configuration
 - audit.k8s.io API
 - Client authentication API
 - WebhookAdmission configuration
 - ImagePolicy API

Design Docs:
 - Kubernetes Architecture: https://git.k8s.io/design-proposals-archive/architecture/architecture.md
 - Kubernetes Design Overview: https://git.k8s.io/design-proposals-archive