Both Pod and Deployment are full-fledged objects in the Kubernetes API. A Deployment creates its Pods through ReplicaSets, using the Pod spec taken from the Deployment's template. You will rarely need to create Pods directly for a production use case.

A Deployment is a higher-level API object that updates its underlying ReplicaSets and their Pods in much the same way as the older `kubectl rolling-update` command. Deployments are the recommended way to get rolling updates because, unlike `kubectl rolling-update`, they are declarative, run server-side, and offer additional features.

A Deployment configures a ReplicaSet controller to create and maintain a specific version of the Pods that the Deployment specifies.

```
kubectl apply -f [DEPLOYMENT_FILE]
```

```
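# Older kubectl releases only: the --generator flag has since been removed,
# and `kubectl run` now creates a bare Pod.
kubectl run [DEPLOYMENT_NAME] \
--image [IMAGE]:[TAG] \
--replicas 3 \
--labels [KEY]=[VALUE] \
--port 8080 \
--generator deployment/apps.v1 \
--save-config

# Current kubectl equivalent (set labels in the manifest instead)
kubectl create deployment [DEPLOYMENT_NAME] \
--image [IMAGE]:[TAG] --replicas 3 --port 8080 --save-config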
```

```
kubectl get deployment [DEPLOYMENT_NAME]

kubectl get deployment [DEPLOYMENT_NAME] -o yaml > this.yaml
```

```
kubectl scale deployment [DEPLOYMENT_NAME] --replicas=5

kubectl autoscale deployment [DEPLOYMENT_NAME] --min=5 --max=15 --cpu-percent=75
```

Updating a deployment
--

```
kubectl apply -f [DEPLOYMENT_FILE]

kubectl set image deployment [DEPLOYMENT_NAME] [CONTAINER_NAME]=[IMAGE]:[TAG]
```

Rolling update
--

```
# Set under spec.strategy in the Deployment manifest
rollingUpdate:
  maxSurge: 5
  maxUnavailable: 25%
```
Setting `maxUnavailable` to 25% means at least 75% of the desired Pods must stay running throughout the update. The default `maxUnavailable` is 25%.

`maxSurge` sets how many Pods may be created above the desired count while the update is in progress.

You can also set `maxSurge` as a percentage. The Deployment controller counts the running Pods in both the old and the new ReplicaSet. For example, a Deployment with four desired Pods and a `maxSurge` of 25% allows at most five Pods at any given time, that is, 125% of the desired number. The default `maxSurge` is also 25%.

Now consider a Deployment with ten desired Pods, `maxUnavailable` set to 30%, and `maxSurge` set to 5. The old ReplicaSet starts with 10 Pods.

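Deployment  
Desired pods = 10 pods  
Max unavailable = 30% of desired pods = 3 pods, so at least 7 pods must stay available  
Max surge = 5 pods  

Total pods = 10 + 5 = 15 (max)

Old ReplicaSet

Once the 5 new pods are ready, 15 - 7 = 8 old pods can be removed.

number of pods = 10 - 8 = 2

New ReplicaSet

number of pods = 5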
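
The arithmetic in this example can be checked with a small shell sketch. The function names `resolve` and `rollout_bounds` are made up for illustration and are not part of kubectl. The sketch assumes the rounding rules described in the Kubernetes documentation: a percentage `maxSurge` rounds up, a percentage `maxUnavailable` rounds down.

```shell
# resolve VALUE DESIRED MODE
# VALUE is an absolute count ("5") or a percentage ("30%").
# MODE is "up" (maxSurge rounds up) or "down" (maxUnavailable rounds down).
resolve() {
  value=$1
  desired=$2
  mode=$3
  case $value in
    *%)
      pct=${value%\%}
      if [ "$mode" = up ]; then
        echo $(( (desired * pct + 99) / 100 ))
      else
        echo $(( desired * pct / 100 ))
      fi
      ;;
    *)
      echo "$value"
      ;;
  esac
}

# rollout_bounds DESIRED MAX_SURGE MAX_UNAVAILABLE
# Prints the pod-count ceiling and the availability floor during a rollout.
rollout_bounds() {
  surge=$(resolve "$2" "$1" up)
  unavailable=$(resolve "$3" "$1" down)
  echo "max_total=$(( $1 + surge )) min_available=$(( $1 - unavailable ))"
}

rollout_bounds 10 5 30%    # prints: max_total=15 min_available=7
rollout_bounds 4 25% 25%   # prints: max_total=5 min_available=3
```

The first call reproduces the ten-pod example: at most 15 pods exist at once, and at least 7 stay available, which is why 8 old pods can be removed after the 5 new ones are ready.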

Rolling back a deployment
--

```
kubectl rollout undo deployment [DEPLOYMENT_NAME]

kubectl rollout undo deployment [DEPLOYMENT_NAME] --to-revision=2

kubectl rollout history deployment [DEPLOYMENT_NAME] --revision=2
```

Pausing/Resuming a deployment
--

```
kubectl rollout pause deployment [DEPLOYMENT_NAME]

kubectl rollout resume deployment [DEPLOYMENT_NAME]

kubectl rollout status deployment [DEPLOYMENT_NAME]

kubectl delete deployment [DEPLOYMENT_NAME]
```

Note: Session affinity

The Service configuration used in the lab does not guarantee that all requests from a single client reach the same Pod. Each request is handled separately and can land on either the normal nginx Deployment or the nginx-canary Deployment. Switching between versions this way may cause problems if the canary release changes functionality significantly. To prevent this, set the `sessionAffinity` field to `ClientIP` in the Service spec, so that a client's first request determines which Pod serves all of its subsequent connections.
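
As a rough sketch of what that looks like, the snippet below writes a Service manifest with `sessionAffinity: ClientIP` to a file. The lab's actual Service is not shown in these notes, so the Service name, the `app: nginx` selector, the port numbers, and the file name `nginx-service.yaml` are all assumptions.

```shell
# Write a Service manifest with client-IP session affinity.
# Name, selector, and ports are assumed; adjust them to match the real Service.
cat > nginx-service.yaml <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: nginx                 # assumed Service name
spec:
  sessionAffinity: ClientIP   # pin each client IP to one Pod
  selector:
    app: nginx                # assumed label shared by nginx and nginx-canary
  ports:
  - protocol: TCP
    port: 80                  # assumed Service port
    targetPort: 80            # assumed container port
EOF

# With cluster access, apply it with: kubectl apply -f nginx-service.yaml
grep sessionAffinity nginx-service.yaml
```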

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 1
  selector:
    matchLabels:
      run: web
  template:
    metadata:
      labels:
        run: web
    spec:
      containers:
      - image: gcr.io/google-samples/hello-app:1.0
        name: web
        ports:
        - containerPort: 8080
          protocol: TCP
```

Create a service resource of type NodePort on port 8080 for the web deployment.
```
kubectl expose deployment web --target-port=8080 --type=NodePort

kubectl get service web
```

When you use `kubectl autoscale`, you specify a maximum and minimum number of replicas for your application, as well as a CPU utilization target.
```
kubectl autoscale deployment web --max 4 --min 1 --cpu-percent 1
```

The `kubectl autoscale` command you used in the previous task creates a `HorizontalPodAutoscaler` object that targets a specified resource, called the scale target, and scales it as needed.
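
For reference, the object created by the `kubectl autoscale deployment web` command should look roughly like the manifest written below. This is a sketch using the `autoscaling/v1` API; the file name `web-hpa.yaml` is hypothetical, and the object stored in the cluster will carry additional generated fields.

```shell
# Write a declarative equivalent of:
#   kubectl autoscale deployment web --max 4 --min 1 --cpu-percent 1
cat > web-hpa.yaml <<'EOF'
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web                   # kubectl autoscale reuses the Deployment name
spec:
  scaleTargetRef:             # the scale target
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 1
  maxReplicas: 4
  targetCPUUtilizationPercentage: 1
EOF

# With cluster access, apply it with: kubectl apply -f web-hpa.yaml
```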

The autoscaler periodically adjusts the number of replicas of the scale target to match the average CPU utilization that you specify when creating the autoscaler.
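
The adjustment can be approximated with the scaling rule from the Kubernetes documentation: desired replicas = ceil(current replicas × current utilization / target utilization), clamped to the configured minimum and maximum. The helper `hpa_desired` below is a hypothetical name for illustration, and the sketch ignores the tolerance band and stabilization behavior of the real controller.

```shell
# hpa_desired CURRENT_REPLICAS CURRENT_CPU_PCT TARGET_CPU_PCT MIN MAX
# Simplified replica calculation; not the controller's full algorithm.
hpa_desired() {
  # Integer ceiling of (replicas * current / target)
  want=$(( ($1 * $2 + $3 - 1) / $3 ))
  if [ "$want" -lt "$4" ]; then
    want=$4
  fi
  if [ "$want" -gt "$5" ]; then
    want=$5
  fi
  echo "$want"
}

hpa_desired 2 90 60 1 10   # prints: 3
hpa_desired 1 50 1 1 4     # prints: 4 (capped at the maximum)
hpa_desired 4 0 1 1 4      # prints: 1 (floored at the minimum)
```

With the 1% target used in these notes, almost any load pushes the result to the maximum of 4 replicas, which is what makes the scaling easy to observe.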

```
kubectl get hpa

kubectl describe horizontalpodautoscaler web
```

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: loadgen
spec:
  replicas: 4
  selector:
    matchLabels:
      app: loadgen
  template:
    metadata:
      labels:
        app: loadgen
    spec:
      containers:
      - name: loadgen
        image: k8s.gcr.io/busybox
        args:
        - /bin/sh
        - -c
        - while true; do wget -q -O- http://web:8080; done
```

To stop the load on the web application, scale the loadgen deployment to zero replicas.

```
kubectl scale deployment loadgen --replicas 0
```