# Minikube

This notebook walks through the basic installation, setup, and management of a Kubernetes application using Minikube. It draws heavily from the following sources:
- [AWS ec2 documentation](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/ec2/index.html)
- [Learn Kubernetes basics](https://kubernetes.io/docs/tutorials/kubernetes-basics/)
- [Minikube documentation](https://minikube.sigs.k8s.io/docs)
- [Kubectl documentation](https://kubernetes.io/docs/reference/kubectl/)

Import libraries

In [3]:
import pandas as pd
pd.set_option('display.max_rows', None)
pd.set_option('display.max_colwidth', None)
import json

## Install and setup (Ubuntu in AWS)

### Setup AWS virtual machine

List AMIs (can also be found in [Amazon EC2 AMI Locator](https://cloud-images.ubuntu.com/locator/ec2/))

In [82]:
%%bash
aws ec2 describe-images \
    --filters "Name=name,Values=ubuntu/images/hvm-ssd/*20.04-amd64*" "Name=architecture,Values=x86_64" \
    --region=us-east-1 \
    --query "Images[*].[Name, ImageId, Architecture, CreationDate]" \
    --output json > AMIs.json

In [83]:
with open('AMIs.json') as f:
    amis_json = json.load(f)

amis_df = pd.DataFrame.from_records(amis_json, columns=['Name', 'ImageID', 'Architecture', 'CreatedDate'])
amis_df.sort_values('CreatedDate', ascending=False).head()

Unnamed: 0,Name,ImageID,Architecture,CreatedDate
68,ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20220627.1-aced0818-eef1-427a-9e04-8ba38bada306,ami-0a20b2fbe7bee1376,x86_64,2022-06-28T01:46:23.000Z
29,ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20220627.1,ami-0439517b5e436bdab,x86_64,2022-06-28T00:51:25.000Z
77,ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20220615-aced0818-eef1-427a-9e04-8ba38bada306,ami-0c24d345ea91339ee,x86_64,2022-06-16T02:01:00.000Z
18,ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20220615,ami-031cf125b681ca3e0,x86_64,2022-06-16T00:57:43.000Z
26,ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20220610-aced0818-eef1-427a-9e04-8ba38bada306,ami-040b8002416c0ad23,x86_64,2022-06-10T12:36:00.000Z
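
The newest image can also be picked without pandas. The following is a minimal sketch, assuming rows in the same `[Name, ImageId, Architecture, CreationDate]` shape that the query above writes to `AMIs.json`; the helper names `parse_creation_date` and `latest_ami` are hypothetical, and the sample rows are copied from the table above.

```python
from datetime import datetime

# Two rows copied from the table above, in the order produced by the
# --query expression: [Name, ImageId, Architecture, CreationDate].
sample_amis = [
    ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20220615",
     "ami-031cf125b681ca3e0", "x86_64", "2022-06-16T00:57:43.000Z"],
    ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20220627.1",
     "ami-0439517b5e436bdab", "x86_64", "2022-06-28T00:51:25.000Z"],
]


def parse_creation_date(value):
    """Parse the timestamp format seen in the CreationDate column."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")


def latest_ami(rows):
    """Return the row with the most recent CreationDate."""
    return max(rows, key=lambda row: parse_creation_date(row[3]))


name, image_id, architecture, created = latest_ami(sample_amis)
print(image_id)
```

With the real file, `latest_ami(amis_json)` would work on the list loaded by `json.load` in the cell above.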


Describe instance types

In [18]:
%%bash
aws ec2 describe-instance-types \
    --filters Name=vcpu-info.default-cores,Values=2 Name=processor-info.supported-architecture,Values=x86_64 Name=hibernation-supported,Values=true \
    --query "InstanceTypes[*].[InstanceType, CurrentGeneration]" \
    > instance_types.json

In [19]:
with open('instance_types.json', 'r') as f:
    instances_json = json.load(f)

instances_df = pd.DataFrame.from_records(instances_json, columns=['InstanceType', 'CurrentGeneration'])
instances_df

Unnamed: 0,InstanceType,CurrentGeneration
0,r5a.xlarge,True
1,m5d.xlarge,True
2,t3.xlarge,True
3,m5.xlarge,True
4,c5.xlarge,True
5,t2.large,True
6,m4.xlarge,True
7,c3.xlarge,False
8,r5d.xlarge,True
9,r5ad.xlarge,True


Launch instance

In [20]:
%%bash
aws ec2 run-instances --image-id ami-052efd3df9dad4825 \
    --count 1 \
    --instance-type t2.medium \
    --key-name ec2KeyPair \
    --security-group-ids sg-07ee2eedb8836a8a1

{
    "Groups": [],
    "Instances": [
        {
            "AmiLaunchIndex": 0,
            "ImageId": "ami-052efd3df9dad4825",
            "InstanceId": "i-02ae84460e0657bd2",
            "InstanceType": "t2.medium",
            "KeyName": "ec2KeyPair",
            "LaunchTime": "2022-07-01T13:36:02+00:00",
            "Monitoring": {
                "State": "disabled"
            },
            "Placement": {
                "AvailabilityZone": "us-east-1a",
                "GroupName": "",
                "Tenancy": "default"
            },
            "PrivateDnsName": "ip-172-31-31-82.ec2.internal",
            "PrivateIpAddress": "172.31.31.82",
            "ProductCodes": [],
            "PublicDnsName": "",
            "State": {
                "Code": 0,
                "Name": "pending"
            },
            "StateTransitionReason": "",
            "SubnetId": "subnet-84fdd0ce",
            "VpcId": "vpc-9f18b6e5",
            "Architecture": "x86_64",
            "B

Check instance state and id

In [90]:
%%bash
aws ec2 describe-instances --instance-ids i-02ae84460e0657bd2 \
    --query "Reservations[0].Instances[0].[InstanceId,State,PublicDnsName]"

[
    "i-02ae84460e0657bd2",
    {
        "Code": 16,
        "Name": "running"
    },
    "ec2-54-157-20-97.compute-1.amazonaws.com"
]
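
Once the instance is running, its public DNS name is what the SSH step below needs. The following is a rough sketch that turns the output above into an SSH command; the key file name `ec2KeyPair.pem` and the login user `ubuntu` are assumptions that are not stated in this notebook, and `ssh_command` is a hypothetical helper.

```python
import json

# Output of the describe-instances query above: [InstanceId, State, PublicDnsName].
describe_output = """
[
    "i-02ae84460e0657bd2",
    {"Code": 16, "Name": "running"},
    "ec2-54-157-20-97.compute-1.amazonaws.com"
]
"""


def ssh_command(raw_json, key_path="ec2KeyPair.pem", user="ubuntu"):
    """Build an ssh command line, refusing to do so until the instance is running.

    key_path and user are assumed defaults; adjust them to your own setup.
    """
    instance_id, state, public_dns = json.loads(raw_json)
    if state["Name"] != "running" or not public_dns:
        raise RuntimeError(
            f"{instance_id} is not reachable yet (state: {state['Name']})"
        )
    return f"ssh -i {key_path} {user}@{public_dns}"


print(ssh_command(describe_output))
```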


### Install Docker and Minikube

SSH into the AWS instance and:
- [Install Docker engine on Ubuntu](https://docs.docker.com/engine/install/ubuntu/)
- [Install minikube on Ubuntu](https://minikube.sigs.k8s.io/docs/start/)
- If `minikube start` returns a "Docker not healthy" error, follow [these instructions](https://github.com/kubernetes/minikube/issues/7903#issuecomment-624074810).
- [Install Kubectl on Ubuntu](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/)

## Install and setup (WSL)

Taken from [magda.io](https://magda.io/docs/installing-minikube.html).

1. Install Docker-Desktop for Windows
2. Install Minikube for Windows
3. Install Kubectl in WSL

    >```bash
    >curl -LO https://storage.googleapis.com/kubernetes-release/release/`curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt`/bin/linux/amd64/kubectl
    >chmod +x ./kubectl
    >sudo mv ./kubectl /usr/local/bin/kubectl
    >kubectl version --client
    >```

4. Create a `minikube` file with the following script and save it in a directory on your `PATH`.

    >```bash
    >#!/bin/sh
    >/mnt/c/Program\ Files/Kubernetes/Minikube/minikube.exe "$@"
    >```

5. Start minikube

    >```bash
    >chmod +x minikube
    >bash minikube start
    >```

6. Copy the minikube context, cluster, and user configuration from Windows to WSL (replace `ramir` with your own Windows user name).

    >```bash
    >mkdir -p ~/.kube
    >cp /mnt/c/Users/ramir/.kube/config ~/.kube/config
    >```

7. In `~/.kube/config`, convert every Windows-style path to its WSL equivalent (e.g. replace `C:\\` with `/mnt/c/` and the remaining backslashes with forward slashes).

8. Create a `minikube-go` file with the following script and save it in a directory on your `PATH`.

    >```bash
    >#!/bin/sh
    >eval $(minikube docker-env --shell=bash)
    >export DOCKER_CERT_PATH=$(wslpath -u "${DOCKER_CERT_PATH}")
    >```

9. Source `minikube-go` to configure WSL Docker to talk to minikube. The script must be sourced rather than executed; otherwise the exported variables are lost when the child process exits.

    >```bash
    >chmod +x minikube-go
    >. minikube-go
    >```

10. Check that `minikube` is correctly installed.

    >```bash
    >minikube version
    >```
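
The path conversion in step 7 can be scripted rather than done by hand. The following is a minimal sketch, assuming drive-letter paths only; `windows_to_wsl` is a hypothetical helper and the certificate path in the example is purely illustrative.

```python
import re


def windows_to_wsl(path):
    """Map a drive-letter Windows path onto the /mnt/<drive>/ mount used by WSL."""
    match = re.match(r"^([A-Za-z]):[\\/]+(.*)$", path)
    if not match:
        # Not a drive-letter path (e.g. already a Linux path): leave it unchanged.
        return path
    drive, rest = match.groups()
    # Collapse any run of backslashes or slashes into a single forward slash.
    rest = re.sub(r"[\\/]+", "/", rest)
    return f"/mnt/{drive.lower()}/{rest}"


# Illustrative path; the real entries are whatever appears in your config file.
print(windows_to_wsl(r"C:\Users\ramir\.minikube\ca.crt"))
```

Doubled backslashes, as they can appear in an escaped config value, are collapsed by the same regular expression.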


In [4]:
%%bash
minikube version

minikube version: v1.26.0
commit: f4b412861bb746be73053c9f6d2895f12cf78565


Check that `kubectl` is correctly installed (if this returns an error, run `minikube start` first)

In [6]:
%%bash
kubectl version --short

Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.


Client Version: v1.24.2
Kustomize Version: v4.5.4
Server Version: v1.24.1


## Create a cluster

Start cluster

In [7]:
%%bash
minikube start

* minikube v1.26.0 on Ubuntu 22.04 (xen/amd64)
* Automatically selected the docker driver
* Using Docker driver with root privileges
* Starting control plane node minikube in cluster minikube
* Pulling base image ...
* Creating docker container (CPUs=2, Memory=2200MB) ...
* Preparing Kubernetes v1.24.1 on Docker 20.10.17 ...
  - Generating certificates and keys ...
  - Booting up control plane ...
  - Configuring RBAC rules ...
* Verifying Kubernetes components...
  - Using image gcr.io/k8s-minikube/storage-provisioner:v5
* Enabled addons: storage-provisioner, default-storageclass
* Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default


View cluster details

In [8]:
%%bash
kubectl cluster-info

Kubernetes control plane is running at https://192.168.49.2:8443
CoreDNS is running at https://192.168.49.2:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.


View cluster nodes

In [9]:
%%bash
kubectl get nodes

NAME       STATUS   ROLES           AGE   VERSION
minikube   Ready    control-plane   13s   v1.24.1
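
The `STATUS` column can also be checked programmatically. The following is a small sketch, assuming the JSON shape returned by `kubectl get nodes -o json` (a list object whose `items` carry `status.conditions`); the sample below is heavily trimmed and hand-written, and `ready_nodes` is a hypothetical helper.

```python
# Trimmed, hand-written stand-in for `kubectl get nodes -o json`.
sample_nodes = {
    "kind": "List",
    "items": [
        {
            "metadata": {"name": "minikube"},
            "status": {
                "conditions": [
                    {"type": "MemoryPressure", "status": "False"},
                    {"type": "Ready", "status": "True"},
                ]
            },
        }
    ],
}


def ready_nodes(node_list):
    """Return the names of nodes whose Ready condition has status "True"."""
    ready = []
    for node in node_list.get("items", []):
        conditions = node.get("status", {}).get("conditions", [])
        if any(c.get("type") == "Ready" and c.get("status") == "True"
               for c in conditions):
            ready.append(node["metadata"]["name"])
    return ready


print(ready_nodes(sample_nodes))
```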


## Deploy an application

Deploy a sample Node.js application, packaged in a Docker container, that answers every HTTP request with a short greeting.

In [10]:
%%bash
kubectl create deployment requester --image=gcr.io/google-samples/kubernetes-bootcamp:v1

deployment.apps/requester created


## Explore application

From [Kubernetes tutorial](https://kubernetes.io/docs/tutorials/kubernetes-basics/explore/explore-intro/):

### Pods 

A Pod is a Kubernetes abstraction that represents a group of one or more application containers (such as Docker), and some shared resources for those containers. Those resources include:

- Shared storage, as Volumes
- Networking, as a unique cluster IP address
- Information about how to run each container, such as the container image version or specific ports to use

A Pod models an application-specific "logical host" and can contain different application containers which are relatively tightly coupled. For example, a Pod might include both the container with your Node.js app as well as a different container that feeds the data to be published by the Node.js webserver. The containers in a Pod share an IP Address and port space, are always co-located and co-scheduled, and run in a shared context on the same Node.

Pods are the atomic unit on the Kubernetes platform. When we create a Deployment on Kubernetes, that Deployment creates Pods with containers inside them (as opposed to creating containers directly). Each Pod is tied to the Node where it is scheduled, and remains there until termination (according to restart policy) or deletion. In case of a Node failure, identical Pods are scheduled on other available Nodes in the cluster.

![module_03_pods.svg](https://d33wubrfki0l68.cloudfront.net/fe03f68d8ede9815184852ca2a4fd30325e5d15a/98064/docs/tutorials/kubernetes-basics/public/images/module_03_pods.svg)
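
To make the "logical host" idea concrete, here is a rough sketch of a two-container Pod manifest written as a Python dictionary (Kubernetes accepts manifests as JSON as well as YAML). The Pod name, container names, and image references are all hypothetical placeholders; `shared_volumes` is a hypothetical helper that checks which volumes every container mounts.

```python
import json

# Hypothetical Pod: a web container and a feeder container sharing one volume.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web-with-feeder", "labels": {"app": "web"}},
    "spec": {
        "volumes": [{"name": "shared-data", "emptyDir": {}}],
        "containers": [
            {
                "name": "web",
                "image": "example.com/web-app:1.0",  # placeholder image
                "ports": [{"containerPort": 8080}],
                "volumeMounts": [{"name": "shared-data", "mountPath": "/data"}],
            },
            {
                "name": "feeder",
                "image": "example.com/data-feeder:1.0",  # placeholder image
                "volumeMounts": [{"name": "shared-data", "mountPath": "/out"}],
            },
        ],
    },
}


def shared_volumes(pod_manifest):
    """Return the names of volumes mounted by every container in the Pod."""
    mounts = [
        {m["name"] for m in c.get("volumeMounts", [])}
        for c in pod_manifest["spec"]["containers"]
    ]
    return set.intersection(*mounts) if mounts else set()


print(json.dumps(pod, indent=2))
print(shared_volumes(pod))
```

Both containers would share the Pod's IP address and port space, so the feeder could reach the web container on `localhost:8080`.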

### Nodes

A Pod always runs on a Node. A Node is a worker machine in Kubernetes and may be either a virtual or a physical machine, depending on the cluster. Each Node is managed by the control plane. A Node can have multiple pods, and the Kubernetes control plane automatically handles scheduling the pods across the Nodes in the cluster. The control plane's automatic scheduling takes into account the available resources on each Node.

Every Kubernetes Node runs at least:

- Kubelet, a process responsible for communication between the Kubernetes control plane and the Node; it manages the Pods and the containers running on a machine.
- A container runtime (like Docker) responsible for pulling the container image from a registry, unpacking the container, and running the application.

![node overview](https://d33wubrfki0l68.cloudfront.net/5cb72d407cbe2755e581b6de757e0d81760d5b86/a9df9/docs/tutorials/kubernetes-basics/public/images/module_03_nodes.svg)

List deployments

In [11]:
%%bash
kubectl get deployments

NAME        READY   UP-TO-DATE   AVAILABLE   AGE
requester   1/1     1            1           107s


#### Interact with application through proxy

In a second terminal, enable a proxy that forwards communications into the cluster's private network.
- The proxy uses a Kubernetes API endpoint to communicate with the application.
- The connection will not output anything while it is active.
- The API server will automatically create an endpoint for each pod, based on the pod name, that is also accessible through the proxy.

In [7]:
%%bash
# Do this in a proper terminal; in Jupyter, each block runs in a new subprocess, so variables don't persist across blocks.
# kubectl proxy

View the app's output. Note the `:8080` suffix in `$POD_NAME:8080`: the tutorial omits the port, which threw an error here.

In [8]:
%%bash
export POD_NAME=$(kubectl get pods -o go-template --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}') \
&& echo PodName: $POD_NAME \
&& curl http://localhost:8001/api/v1/namespaces/default/pods/$POD_NAME:8080/proxy/

PodName: requester-5978658454-hdpjb


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    74    0    74    0     0   3797      0 --:--:-- --:--:-- --:--:--  3894


Hello Kubernetes bootcamp! | Running on: requester-5978658454-hdpjb | v=1
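
The URL used above follows a fixed pattern, which the sketch below spells out. `pod_proxy_url` is a hypothetical helper; the default proxy address `http://localhost:8001` is the one used in the `curl` call above.

```python
def pod_proxy_url(pod_name, port=None, namespace="default",
                  proxy="http://localhost:8001"):
    """Build the API-server proxy URL for a Pod, optionally targeting a port."""
    target = pod_name if port is None else f"{pod_name}:{port}"
    return f"{proxy}/api/v1/namespaces/{namespace}/pods/{target}/proxy/"


# With the port, as used in the cell above.
print(pod_proxy_url("requester-5978658454-hdpjb", port=8080))
# Without the port, as in the tutorial.
print(pod_proxy_url("requester-5978658454-hdpjb"))
```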


Using the connection through the proxy running in the second terminal, query the version of the Kubernetes API server.

In [9]:
%%bash
curl http://127.0.0.1:8001/version

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   263  100   263    0     0  57124      0 --:--:-- --:--:-- --:--:-- 65750


{
  "major": "1",
  "minor": "24",
  "gitVersion": "v1.24.1",
  "gitCommit": "3ddd0f45aa91e2f30c70734b175631bec5b5825a",
  "gitTreeState": "clean",
  "buildDate": "2022-05-24T12:18:48Z",
  "goVersion": "go1.18.2",
  "compiler": "gc",
  "platform": "linux/amd64"
}

Get pod name and access it through the API (i.e. the proxy connection)

In [10]:
%%bash
export POD_NAME=$(kubectl get pods -o go-template --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}') \
&& echo PodName=$POD_NAME \
&& curl http://127.0.0.1:8001/api/v1/namespaces/default/pods/$POD_NAME/

PodName=requester-5978658454-hdpjb


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed


{
  "kind": "Pod",
  "apiVersion": "v1",
  "metadata": {
    "name": "requester-5978658454-hdpjb",
    "generateName": "requester-5978658454-",
    "namespace": "default",
    "uid": "f129ee14-79d7-4844-b971-5336c9310994",
    "resourceVersion": "400",
    "creationTimestamp": "2022-07-04T16:06:42Z",
    "labels": {
      "app": "requester",
      "pod-template-hash": "5978658454"
    },
    "ownerReferences": [
      {
        "apiVersion": "apps/v1",
        "kind": "ReplicaSet",
        "name": "requester-5978658454",
        "uid": "54f5b93c-baea-435d-a238-b964bdc7727b",
        "controller": true,
        "blockOwnerDeletion": true
      }
    ],
    "managedFields": [
      {
        "manager": "kube-controller-manager",
        "operation": "Update",
        "apiVersion": "v1",
        "time": "2022-07-04T16:06:42Z",
        "fieldsType": "FieldsV1",
        "fieldsV1": {
          "f:metadata": {
            "f:generateName": {},
            "f:labels": {
              ".": {},

100  7001    0  7001    0     0  2353k      0 --:--:-- --:--:-- --:--:-- 3418k


}

#### Interact with application via `kubectl`

Get pods

In [12]:
%%bash
kubectl get pods

NAME                         READY   STATUS    RESTARTS   AGE
requester-5978658454-hdpjb   1/1     Running   0          57m


Describe pods

In [13]:
%%bash
kubectl describe pods

Name:         requester-5978658454-hdpjb
Namespace:    default
Priority:     0
Node:         minikube/192.168.49.2
Start Time:   Mon, 04 Jul 2022 16:06:42 +0000
Labels:       app=requester
              pod-template-hash=5978658454
Annotations:  <none>
Status:       Running
IP:           172.17.0.3
IPs:
  IP:           172.17.0.3
Controlled By:  ReplicaSet/requester-5978658454
Containers:
  kubernetes-bootcamp:
    Container ID:   docker://984df3174a844d626c88e1752f1c5fd2f555bd6f0635317c5b3a07dd10455e19
    Image:          gcr.io/google-samples/kubernetes-bootcamp:v1
    Image ID:       docker-pullable://gcr.io/google-samples/kubernetes-bootcamp@sha256:0d6b8ee63bb57c5f5b6156f446b3bc3b3c143d233037f3a2f00e279c8fcc64af
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Mon, 04 Jul 2022 16:06:51 +0000
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount fr

View container logs

In [14]:
%%bash
export POD_NAME=$(kubectl get pods -o go-template --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}') \
&& kubectl logs $POD_NAME

Kubernetes Bootcamp App Started At: 2022-07-04T16:06:51.306Z | Running On:  requester-5978658454-hdpjb 

Running On: requester-5978658454-hdpjb | Total Requests: 1 | App Uptime: 3334.356 seconds | Log Time: 2022-07-04T17:02:25.662Z
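
Each request line in this log is a set of `key: value` pairs separated by ` | `, so it is easy to split into fields. The following is a minimal sketch that assumes exactly the format shown above; `parse_request_log` is a hypothetical helper.

```python
def parse_request_log(line):
    """Split a bootcamp request log line into a dict of its key/value fields."""
    fields = {}
    for part in line.split(" | "):
        # Split on the first ": " only, so timestamps keep their colons.
        key, _, value = part.partition(": ")
        fields[key.strip()] = value.strip()
    return fields


log_line = (
    "Running On: requester-5978658454-hdpjb | Total Requests: 1 | "
    "App Uptime: 3334.356 seconds | Log Time: 2022-07-04T17:02:25.662Z"
)
parsed = parse_request_log(log_line)
print(parsed["Total Requests"], parsed["Log Time"])
```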


List environment variables

In [15]:
%%bash
export POD_NAME=$(kubectl get pods -o go-template --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}') \
&& kubectl exec $POD_NAME -- env

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=requester-5978658454-hdpjb
KUBERNETES_PORT_443_TCP_PROTO=tcp
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_ADDR=10.96.0.1
KUBERNETES_SERVICE_HOST=10.96.0.1
KUBERNETES_SERVICE_PORT=443
KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT=tcp://10.96.0.1:443
KUBERNETES_PORT_443_TCP=tcp://10.96.0.1:443
NPM_CONFIG_LOGLEVEL=info
NODE_VERSION=6.3.1
HOME=/root


Start a Bash session in the pod's container (run in a proper terminal)

```bash
$ kubectl exec -ti $POD_NAME -- bash
root@kubernetes-bootcamp-fb5c67579-pbnlv:/# cat server.js
  var http = require('http');
  var requests=0;
  var podname= process.env.HOSTNAME;
  var startTime;
  var host;
  var handleRequest = function(request, response) {
    response.setHeader('Content-Type', 'text/plain');
    response.writeHead(200);
    response.write("Hello Kubernetes bootcamp! | Running on: ");
    response.write(host);
    response.end(" | v=1\n");
    console.log("Running On:" ,host, "| Total Requests:", ++requests,"| App Uptime:", (new Date() - startTime)/1000 , "seconds", "| Log Time:",new Date());
  }
  var www = http.createServer(handleRequest);
  www.listen(8080,function () {
      startTime = new Date();;
      host = process.env.HOSTNAME;
      console.log ("Kubernetes Bootcamp App Started At:",startTime, "| Running On: " ,host, "\n" );
  });
```

### Expose app publicly

#### Overview of Kubernetes Services

Kubernetes Pods are mortal. Pods in fact have a lifecycle. When a worker node dies, the Pods running on the Node are also lost. A ReplicaSet might then dynamically drive the cluster back to desired state via creation of new Pods to keep your application running. As another example, consider an image-processing backend with 3 replicas. Those replicas are exchangeable; the front-end system should not care about backend replicas or even if a Pod is lost and recreated. That said, each Pod in a Kubernetes cluster has a unique IP address, even Pods on the same Node, so there needs to be a way of automatically reconciling changes among Pods so that your applications continue to function.

A Service in Kubernetes is an abstraction which defines a logical set of Pods and a policy by which to access them. Services enable a loose coupling between dependent Pods. A Service is defined using YAML (preferred) or JSON, like all Kubernetes objects. The set of Pods targeted by a Service is usually determined by a LabelSelector (see below for why you might want a Service without including selector in the spec).

Although each Pod has a unique IP address, those IPs are not exposed outside the cluster without a Service. Services allow your applications to receive traffic. Services can be exposed in different ways by specifying a type in the ServiceSpec:

- ClusterIP (default) - Exposes the Service on an internal IP in the cluster. This type makes the Service only reachable from within the cluster.
- NodePort - Exposes the Service on the same port of each selected Node in the cluster using NAT. Makes a Service accessible from outside the cluster using `<NodeIP>:<NodePort>`. Superset of ClusterIP.
- LoadBalancer - Creates an external load balancer in the current cloud (if supported) and assigns a fixed, external IP to the Service. Superset of NodePort.
- ExternalName - Maps the Service to the contents of the externalName field (e.g. foo.bar.example.com), by returning a CNAME record with its value. No proxying of any kind is set up. This type requires v1.7 or higher of kube-dns, or CoreDNS version 0.0.8 or higher.

More information about the different types of Services can be found in the Using Source IP tutorial. Also see Connecting Applications with Services.

Additionally, note that some use cases involve Services that do not define a `selector` in the spec. A Service created without a `selector` will also not create the corresponding Endpoints object, which lets users manually map the Service to specific endpoints. Another reason to omit the selector is that you are strictly using `type: ExternalName`.

A Kubernetes Service is an abstraction layer which defines a logical set of Pods and enables external traffic exposure, load balancing and service discovery for those Pods.

#### Services and Labels

A Service routes traffic across a set of Pods. Services are the abstraction that allows pods to die and replicate in Kubernetes without impacting your application. Discovery and routing among dependent Pods (such as the frontend and backend components in an application) are handled by Kubernetes Services.

Services match a set of Pods using labels and selectors, a grouping primitive that allows logical operation on objects in Kubernetes. Labels are key/value pairs attached to objects and can be used in any number of ways:

- Designate objects for development, test, and production
- Embed version tags
- Classify an object using tags

![services and labels](https://d33wubrfki0l68.cloudfront.net/7a13fe12acc9ea0728460c482c67e0eb31ff5303/2c8a7/docs/tutorials/kubernetes-basics/public/images/module_04_labels.svg)
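
The matching rule can be illustrated with a few lines of Python. This is a simplified sketch of equality-based selection only (every key in the selector must be present on the Pod with the same value), not the Kubernetes implementation. The first Pod's labels are copied from the API output earlier in this notebook; the second Pod and the `matches` helper are hypothetical.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


pods = [
    {"name": "requester-5978658454-hdpjb",
     "labels": {"app": "requester", "pod-template-hash": "5978658454"}},
    {"name": "other-pod", "labels": {"app": "other"}},  # hypothetical Pod
]

selector = {"app": "requester"}
selected = [pod["name"] for pod in pods if matches(selector, pod["labels"])]
print(selected)
```

Extra labels on a Pod (such as `pod-template-hash`) do not prevent a match; only the keys named in the selector are compared.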

#### Services practice

List the current Services. The `kubernetes` Service, of type `ClusterIP`, is created by default when minikube starts the cluster.

In [16]:
%%bash
kubectl get services

NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   72m


Create new `NodePort` Service

In [17]:
%%bash
kubectl expose deployment/requester --type="NodePort" --port 8080

service/requester exposed


Check that Service was correctly created

In [18]:
%%bash
kubectl get services

NAME         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
kubernetes   ClusterIP   10.96.0.1        <none>        443/TCP          73m
requester    NodePort    10.103.182.142   <none>        8080:30563/TCP   4s


In [19]:
%%bash
kubectl describe services/requester

Name:                     requester
Namespace:                default
Labels:                   app=requester
Annotations:              <none>
Selector:                 app=requester
Type:                     NodePort
IP Family Policy:         SingleStack
IP Families:              IPv4
IP:                       10.103.182.142
IPs:                      10.103.182.142
Port:                     <unset>  8080/TCP
TargetPort:               8080/TCP
NodePort:                 <unset>  30563/TCP
Endpoints:                172.17.0.3:8080
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
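
For comparison, the Service described above corresponds roughly to the following manifest, written here as a Python dictionary and printed as JSON. This is a sketch reconstructed from the `describe` output, not the exact object stored by the cluster; `nodePort` is left out on purpose so that Kubernetes assigns one itself, as it did with `30563` above.

```python
import json

# Approximate declarative equivalent of:
#   kubectl expose deployment/requester --type="NodePort" --port 8080
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "requester", "labels": {"app": "requester"}},
    "spec": {
        "type": "NodePort",
        # Traffic is routed to Pods carrying this label.
        "selector": {"app": "requester"},
        # port: the Service port; targetPort: the container port.
        "ports": [{"protocol": "TCP", "port": 8080, "targetPort": 8080}],
    },
}

print(json.dumps(service, indent=2))
```

Saved to a file, a manifest like this could be applied with `kubectl apply -f`, instead of running `kubectl expose`.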


Find which port is opened in the node.

In [20]:
%%bash
kubectl describe services/requester | grep "^NodePort:.*"

NodePort:                 <unset>  30563/TCP


Alternatively,

In [21]:
%%bash 
export NODE_PORT=$(kubectl get services/requester -o go-template='{{(index .spec.ports 0).nodePort}}')
echo $NODE_PORT

30563
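
The go-template above reads `.spec.ports[0].nodePort`. The same lookup in Python, assuming the JSON shape returned by `kubectl get services/requester -o json`, could look like the sketch below. The sample document is trimmed and hand-written, and `node_port` and `service_url` are hypothetical helpers.

```python
import json

# Trimmed, hand-written stand-in for `kubectl get services/requester -o json`.
sample_service = json.loads("""
{
  "spec": {
    "type": "NodePort",
    "ports": [{"port": 8080, "targetPort": 8080, "nodePort": 30563}]
  }
}
""")


def node_port(service, index=0):
    """Return the nodePort of the Service port at the given index."""
    return service["spec"]["ports"][index]["nodePort"]


def service_url(node_ip, port):
    """Build the URL at which a NodePort Service is reachable on a node."""
    return f"http://{node_ip}:{port}"


# 192.168.49.2 is the node address reported by `kubectl cluster-info` above.
print(service_url("192.168.49.2", node_port(sample_service)))
```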


Test that the app is exposed outside the cluster.

In [22]:
%%bash
minikube ip

192.168.49.2


In [23]:
%%bash 
export NODE_PORT=$(kubectl get services/requester -o go-template='{{(index .spec.ports 0).nodePort}}')
curl $(minikube ip):$NODE_PORT

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    74    0    74    0     0  26685      0 --:--:-- --:--:-- --:--:-- 37000


Hello Kubernetes bootcamp! | Running on: requester-5978658454-hdpjb | v=1
