# MongoDB Enterprise Kubernetes Operator

## Agenda 

- High level overview of Kubernetes
- Kubernetes Cluster vs MongoDB Cluster
- Stateful vs Stateless Replication
- Kubernetes Operators
- Ops Manager Kubernetes Operator
- Build a local cluster along the way


## `> whoami`

![norberto leite](assets/DT5hO0_u_400x400.jpg) 

```json
{
    "name": "Norberto Leite", 
    "position": "Lead Engineer", 
    "team": "Curriculum, Engineering"
}
```

### [@nleite](https://twitter.com/nleite)

### Disclaimer

> This is a buzzword-intensive presentation, but it is by no means intended to trick you into thinking I'm a very smart person! Buzzwords just sound nice when put together...


### But before we get started ....

![mflix front page ](assets/mflix_frontpage.png)


### MongoDB Developer Courses 

![m220p](assets/M220JS_hero.jpg)


https://university.mongodb.com/

## Kubernetes

![Kubernetes Logo](assets/kubernetes_logo.png)

### Kubernetes vendor ecosystem

![kubernetes vendors](assets/k8t_new.jpg)

> https://blog.spotinst.com/2018/05/20/kubernetes-ecosystem/

### Definition

Kubernetes is an open-source container-orchestration system for automating deployment, scaling and management of containerized applications. It was originally designed by Google and is now maintained by the Cloud Native Computing Foundation.

![kubernetes_definition](assets/kubernetes_definition.png)

#### Kubernetes Objects

- pods
- replicasets
- persistentvolumeclaims
- persistentvolumes
- nodes
- storageclasses
- clusters
- ...

https://kubernetes.io/docs/concepts/overview/working-with-objects/kubernetes-objects/

### Kubernetes is *for* Containers => *Virginia is for Lovers*

Kubernetes is an open-source **`container`**-orchestration system for automating deployment, scaling and management of **`containerized applications`**. It was originally designed by Google and is now maintained by the Cloud Native Computing Foundation.

![kubernetes_definition](assets/kubernetes_definition.png) | ![](assets/Welcome_to_Virginia_Sign.jpg)
----- | -----


Kubernetes uses containers. In fact, we can say that Kubernetes loves containers: it deploys and manages containers and containerized applications.

Kubernetes has standardized the container definition around the Docker format.

### Container definition
`cat mflix/Dockerfile`

```dockerfile
# base image of mflix container
FROM java:8
# port number the container exposes
EXPOSE 5000
# make the jar file available in the container image
COPY mflix-1.0-SNAPSHOT.jar ./mflix-1.0-SNAPSHOT.jar
# application run command
CMD ["java", "-jar", "./mflix-1.0-SNAPSHOT.jar"]
```

This file is an example of a Dockerfile, the build definition of a Docker image.
It sets the instructions to load, expose and execute containerized applications or instances.

Docker images are hierarchical: we can compose images on top of each other, inheriting the configuration and image setup.

In this example we are creating a container image using a Java image as the baseline.

### Image vs Container

An `image` determines what to run and how to run it: which requirements it uses or inherits, and the default configuration of a **containerized application**.


A `container` is the **runtime** execution of a built Docker image.
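One way to picture the distinction is the relationship between a class definition and its instances. The sketch below is only an analogy, not how Docker is implemented: `Image` and `Container` are hypothetical Python classes, and the layer list simply mirrors the `FROM java:8` line of the Dockerfile shown earlier.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional, Tuple


@dataclass(frozen=True)
class Image:
    """Immutable build artifact: a name, an optional base image and a default command."""
    name: str
    base: Optional["Image"] = None
    cmd: Tuple[str, ...] = ()

    def layers(self):
        """Return image names from the root base image up to this image."""
        inherited = self.base.layers() if self.base else []
        return inherited + [self.name]


_container_ids = count(1)


@dataclass
class Container:
    """One running instance created from an image."""
    image: Image
    id: int = field(default_factory=lambda: next(_container_ids))


java8 = Image("java:8")
mflix = Image("mflix", base=java8, cmd=("java", "-jar", "./mflix-1.0-SNAPSHOT.jar"))

# Two containers started from the same image: distinct runtimes, one shared definition.
first, second = Container(mflix), Container(mflix)
print(mflix.layers())
print(first.id != second.id, first.image is second.image)
```

The image is defined once and never changes, while each container is a separate, disposable execution of it.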

### Image vs Container diagram
![image vs container](assets/vGuay.png)

> https://stackoverflow.com/questions/23735149/what-is-the-difference-between-a-docker-image-and-a-container

### Kubernetes manages Containers

Kubernetes is an open-source container-**`orchestration`** system for **`automating deployment, scaling and management`** of containerized applications. It was originally designed by Google and is now maintained by the Cloud Native Computing Foundation.

![kubernetes_definition](assets/kubernetes_definition.png)

Aside from running containers, Kubernetes is also capable of defining the rules for when to start and stop containers, how containers communicate with one another, how we scale deployments, how to upgrade versions of containers, how to provide HA and fault tolerance, and where to place different containers across different nodes / machines.

## Kubernetes Architecture



![general k8s architecture](assets/general_k8s_archictecture19_10.png)

At a high level, Kubernetes can be represented by something similar to this diagram.

Each Kubernetes cluster has a master node, which holds a set of important components of the architecture:

- kube-scheduler 
- kube-controller-manager
- kube-apiserver 
- etcd 
- kubelet
- kube-proxy

I'll provide links describing the exact function of each of these components within a k8s cluster; for the most part, their names are pretty self-explanatory.
The unusual one, whose name is a bit more cryptic, is etcd: an HA key-value store that Kubernetes uses for all cluster data. You can think of etcd as the config server in a MongoDB sharded cluster. It may or may not run within the master node; it can run on its own separate node.
You will find all the relevant links at the end of this presentation.

But in essence, the master node runs a fair amount of different things. 

https://github.com/kubernetes/community/blob/master/contributors/design-proposals/architecture/architecture.md#the-kubernetes-node

### Multi-master Kubernetes with ``kubeadm``
![general k8s architecture](assets/general_k8s_archictecture_multimaster.png)

Given the previous diagram, you might be thinking:

> This Kubernetes cluster thing does not seem to be too scalable. How, in this day and age, does a cluster have only one master?

Well, fear not: Kubernetes does have a way to avoid single points of failure using ``kubeadm``.
This is out of scope for this talk, but keep in mind that this alone can be set up in several different architectures.

Bottom line is that kubernetes can be set to run in an HA mode.

### Kubernetes Node 
![formerly known as minion](assets/k8s_node_diagram.png)

Kubernetes is a cluster, therefore 
> there will be dragons! 

Not really, but there will be nodes. 
Aside from the previously mentioned master node, or several of these, we will also have worker nodes, previously known as ``minions``.

These worker nodes can have several different specs. We can compose a k8s cluster with physical, virtual or cloud server nodes. Although, like in any systems architecture, consistency tends to be beneficial in the long term, a k8s cluster can be composed of a very diverse set of server instance specs.

Each node is equipped with the services necessary to run ``pods``. It has a container runtime for that purpose, which is generally docker.


### Kubernetes Pod

![pod diagram](assets/k8s_pod_diagram.png)

https://kubernetes.io/docs/concepts/workloads/pods/pod/

A POD is the smallest deployable unit of computing in Kubernetes. 
It can be composed of one or several different containers, a group of containers, and allows the definition of shared network and storage and how to run the set of containers that compose the POD. 
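To make the POD definition more concrete, here is a minimal sketch of a single-container Pod manifest, built as a Python dictionary and printed as JSON. The field names (`apiVersion`, `kind`, `metadata`, `spec.containers`, `containerPort`) are standard Kubernetes Pod fields; the pod name, the `app` label and the port number are assumptions chosen to match the `mflix` example used in this presentation.

```python
import json


def pod_manifest(name, image, container_port):
    """Build a single-container Pod manifest as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            # A Pod may list several containers here; they share network and storage.
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "ports": [{"containerPort": container_port}],
                }
            ]
        },
    }


# Hypothetical values: an "mflix" pod running the "mflix" image on port 5000.
manifest = pod_manifest("mflix", "mflix", 5000)
print(json.dumps(manifest, indent=2))
```

Since `kubectl` accepts JSON as well as YAML, output like this could be saved to a file and applied with `kubectl apply -f`.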

### Kubernetes ReplicaSet - Across Nodes

![replica set across different nodes](assets/k8s_replicaset_multiplenodes_diagram.png)
https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/

Kubernetes allows for pods to be fault tolerant and highly available. This is managed via ReplicaSets (familiar name!).

We can define POD replica sets across nodes.

### Kubernetes ReplicaSet - Single Node

![replica set single node](assets/k8s_replicaset_singlenode_diagram.png)

Or within a single node.
This is the model that we are going to set up today.
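The idea behind a Kubernetes ReplicaSet can be sketched as a loop that keeps comparing the desired number of pods with the pods actually running. This is a toy simulation under stated assumptions, not the real controller: `reconcile` is a hypothetical function and the `mflix-N` pod names are made up.

```python
from itertools import count


def reconcile(desired_replicas, running_pods, name_source):
    """Return the pod list after one reconciliation pass."""
    pods = list(running_pods)
    # Too few pods: start new ones until the desired count is reached.
    while len(pods) < desired_replicas:
        pods.append(f"mflix-{next(name_source)}")
    # Too many pods: drop the surplus.
    del pods[desired_replicas:]
    return pods


names = count(1)
pods = reconcile(3, [], names)    # initial rollout of three replicas
pods.remove("mflix-2")            # simulate a pod crash
pods = reconcile(3, pods, names)  # the lost pod is replaced by a new one
print(pods)
```

Note that the replacement is a brand new pod with a new name: the ReplicaSet restores the count, not the identity or state of the pod that was lost.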

### Kubernetes Service 

![Kubernetes Service](assets/k8s_service_diagram.png)
https://kubernetes.io/docs/concepts/services-networking/service/

Services are a special type of Kubernetes object that other PODs rely on to operate.
By default, PODs are mortal and get resurrected dynamically, and they are subject to constant change in terms of their deployment composition, number of replicas, etc.
This can cause issues for other PODs if those rely on certain guarantees and pre-defined configuration.

A Kubernetes Service is an abstraction which defines a logical set of PODs and a policy by which to access them. You can think of a Service as a reliable and consistent access point to a set of PODs, which other PODs can depend on.
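The "logical set of PODs" is typically defined by a label selector. The following sketch illustrates that matching with plain Python data; `select_endpoints`, the pod records, their IP addresses and the `ready` flag are all hypothetical stand-ins for what Kubernetes tracks internally.

```python
def select_endpoints(selector, pods):
    """Return the IPs of ready pods whose labels match every key in the selector."""
    return sorted(
        pod["ip"]
        for pod in pods
        if pod["ready"]
        and all(pod["labels"].get(key) == value for key, value in selector.items())
    )


pods = [
    {"name": "mflix-1", "ip": "10.0.0.4", "labels": {"app": "mflix"}, "ready": True},
    {"name": "mflix-2", "ip": "10.0.0.5", "labels": {"app": "mflix"}, "ready": False},
    {"name": "mongo-0", "ip": "10.0.0.9", "labels": {"app": "mongodb"}, "ready": True},
]
selector = {"app": "mflix"}
before = select_endpoints(selector, pods)

# The failed pod is replaced by a new one with a different name and IP.
pods[1] = {"name": "mflix-3", "ip": "10.0.0.7", "labels": {"app": "mflix"}, "ready": True}
after = select_endpoints(selector, pods)
print(before, after)
```

Clients keep addressing the same Service while the set of pods behind it changes, which is the stability guarantee described above.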



## Ops Manager / Cloud Manager

![ops manager diagram](assets/ops_manager_diagram.png)

MongoDB Ops Manager is MongoDB's on-prem solution for managing MongoDB cluster deployments.
It allows for holistic management of all things related to MongoDB:

- updates
- scaling up and down
- user management and integration 
- node deployment 
- role management 

Across your datacenter.

And there are several particular aspects of a MongoDB cluster that need care and attention, something that Ops Manager takes care of in a very efficient way.

### Cloud / Ops Manager - Monitoring

![opsmanager monitoring](assets/opsmanager_monitoring.png)

### Cloud / Ops Manager - Automation

![opsmanager automation](assets/ops_manager_automation.png)

### Cloud / Ops Manager - Backup 

![opsmanager backup](assets/opsmanager_backup.png)

### Cloud / Ops Manager Agents

![agents](assets/opscloud_manager_agents.png)

## Kubernetes Cluster vs MongoDB Cluster

![kubernetes vs MongoDB Cluster](assets/k8s_mongodb_cluster.png)

There are several similar notions and definitions between a Kubernetes cluster and a MongoDB cluster. 

But the devil is in the details and in the functionality of each of these clusters. 


### Cluster Concepts 

- MongoDB Replica Set 

- Kubernetes Replica Set 

- MongoDB Node 

- Kubernetes Node

### Kubernetes Nodes vs MongoDB Nodes 

![Kubernetes nodes vs MongoDB nodes](assets/k8s_nodes_mongodb_nodes.png)


### MongoDB Nodes in a Kubernetes Node

![kubernetes MongoDB POD](assets/k8s_nodes_mongodb_pod.png)


### Kubernetes ReplicaSet vs MongoDB ReplicaSet

![replica sets](assets/k8s_replicaset_mdb_replicaset.png)

While the purpose of each of these replica set notions is to provide fault tolerance, they are pretty distinct.

In POD replication, the definition of the containers is replicated as defined, either to a different pod running in the same node or across different nodes.

In a MongoDB Replica Set, fault tolerance and HA are also tied to dynamic intra-replica-set rules and options. All nodes of a MongoDB Replica Set share the exact same data, follow a replication protocol and respond to workloads as a single shared state.
This is generally not the case in a Kubernetes ReplicaSet.

A nice way to distinguish the two is to think of a Kubernetes ReplicaSet as redundancy of application instances/containers, while a MongoDB Replica Set assures redundancy and HA of data, regardless of the specification of the instance that supports that service, although all nodes only run a mongodb binary.
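The contrast can be illustrated with a deliberately simplified model. This is not MongoDB's replication protocol (there are no elections, write concerns or failure handling here); `StatelessReplica`, `ReplicatedMember` and the shared `oplog` list are hypothetical names used only to show independent copies versus members that converge on the same data.

```python
class StatelessReplica:
    """Kubernetes-style replica: an independent copy that shares nothing with its peers."""

    def __init__(self):
        self.requests_served = 0

    def handle(self):
        self.requests_served += 1


class ReplicatedMember:
    """MongoDB-style member: replays a shared operation log to reach the same data."""

    def __init__(self):
        self.data = {}
        self.applied = 0

    def apply(self, oplog):
        # Apply only the entries this member has not seen yet.
        for key, value in oplog[self.applied:]:
            self.data[key] = value
        self.applied = len(oplog)


# Stateless replicas diverge freely: each one only knows about its own requests.
web_a, web_b = StatelessReplica(), StatelessReplica()
web_a.handle()

# Replicated members converge: a write on the primary is replayed on the secondary.
oplog = []
primary, secondary = ReplicatedMember(), ReplicatedMember()
oplog.append(("title", "Blade Runner"))
primary.apply(oplog)
secondary.apply(oplog)
print(web_a.requests_served, web_b.requests_served, primary.data == secondary.data)
```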

## Stateless vs Stateful

![stateless vs stateful](assets/stateless_vs_statefull.png)


One important aspect to keep in mind around cluster management, in particular the scalability of clusters, concerns state and state management.

In general, container technology is extremely efficient at scaling out stateless applications and systems.
This has to do with the fact that state, i.e. data, adds density to the scalability. It tends to be more complicated to manage data than instances.

And this is where Kubernetes, via persistent volumes, allows container scalability to be better aligned, although not perfectly, with the notion of scaling systems that rely on and manage state.

Getting a system that excels at data management, like MongoDB, combined with the scalability offered by Kubernetes is a very appealing solution for ops professionals.
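Persistent volumes are requested by pods through a PersistentVolumeClaim. As a minimal sketch, the manifest for such a claim can be built like this; the field names are standard Kubernetes fields, while the claim name `mongodb-data`, the `1Gi` size and the `standard` storage class are assumptions for illustration.

```python
import json


def pvc_manifest(name, size, storage_class=None):
    """Build a PersistentVolumeClaim manifest as a plain dictionary."""
    spec = {
        # ReadWriteOnce: the volume can be mounted read-write by a single node.
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": size}},
    }
    if storage_class is not None:
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": spec,
    }


# Hypothetical claim for the data directory of one database pod.
claim = pvc_manifest("mongodb-data", "1Gi", storage_class="standard")
print(json.dumps(claim, indent=2))
```

The claim outlives any individual pod, which is what lets a rescheduled database container find its data again.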

## Kubernetes Operator

> An Operator is a method of packaging, deploying and managing a Kubernetes application. A Kubernetes application is an application that is both deployed on Kubernetes and managed using the Kubernetes APIs and kubectl tooling.

_https://coreos.com/operators/_

## MongoDB Enterprise Kubernetes Operator (beta)

> The Operator enables easy deploys of MongoDB into Kubernetes clusters, using our management, monitoring and backup platforms, Ops Manager and Cloud Manager. By installing this integration, you will be able to deploy MongoDB instances with a single simple command.

_https://github.com/mongodb/mongodb-enterprise-kubernetes_

### MongoDB Enterprise Kubernetes - Main Benefits

* Quick, declarative definition of what MongoDB services you want
* Auto-healing, using Kubernetes reliability features
* Easy to scale up / scale down 
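The "declarative definition" benefit can be sketched as follows: the user states what they want, and the operator works out the steps to get there. This is a toy planner, not the operator's actual logic; the `plan` function, the `members` and `version` keys and the version strings are hypothetical.

```python
def plan(desired, observed):
    """Compare a desired spec with the observed state and list the steps needed."""
    if observed is None:
        return [
            f"create replica set with {desired['members']} members "
            f"at version {desired['version']}"
        ]
    actions = []
    if desired["version"] != observed["version"]:
        actions.append(f"rolling upgrade {observed['version']} -> {desired['version']}")
    difference = desired["members"] - observed["members"]
    if difference > 0:
        actions.append(f"add {difference} member(s)")
    elif difference < 0:
        actions.append(f"remove {-difference} member(s)")
    return actions


desired = {"members": 5, "version": "4.0.4"}
print(plan(desired, None))                                  # nothing deployed yet
print(plan(desired, {"members": 3, "version": "4.0.0"}))    # upgrade and scale up
print(plan(desired, desired))                               # already converged
```

Unlike a plain ReplicaSet, an operator can encode domain knowledge, such as upgrading database members one at a time.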


## All Together Now!

![All Together Now](https://upload.wikimedia.org/wikipedia/en/c/cd/All_Together_Now_cover.jpg)


## Kubernetes + Cloud/Ops Manager 

![kubernetes ops manager deployment](assets/k8s_opsmanager_diagram.png)


#### Step 1 - Create Kubernetes Cluster and Cloud/Ops Manager Instance
![step 1](assets/k8s_opsmanager_step1.png)

#### Step 2 - Install Enterprise Kubernetes Operator
![step 2](assets/k8s_opsmanager_step2.png)

#### Step 3  - Apply deployment
![step 3](assets/k8s_opsmanager_step3.png)

#### Step 4 - Setup Deployment PODs and Agents
![step 4](assets/k8s_opsmanager_step4.png)

#### Step 5 - Cluster Up and Running Managed by Cloud/Ops Manager
![step 5](assets/k8s_opsmanager_diagram.png)

In [None]:
from IPython.display import YouTubeVideo 
video_id = "e9ENv0l6_bc"
YouTubeVideo(video_id, 400, 300, start=30)


## Let's do it!

### Methods and Materials

- ``minikube``
- ``kubectl``
- ``docker``
- ``mongodb-enterprise-kubernetes``
- ``ops manager | cloud manager``


And to keep my promise, these are some of the tools that I'll be exploring in this session.

In this presentation I'll be assembling a Kubernetes cluster and configuring the necessary operator to deploy and interact with a MongoDB cluster in k8s, using Ops Manager.
For that I'll be making use of a few pre-installed tools.

* ``minikube`` - localhost Kubernetes test cluster
* ``kubectl`` - command-line tool for managing and interacting with the k8s cluster
* ``ops manager`` - vagrant box that deploys Ops Manager

For presentation purposes, and given that you might not have a local installation of Ops Manager, I'm going to use Cloud Manager instead.
You will need to adjust the configuration options for your own Cloud Manager instance.

* ``mongodb-enterprise-kubernetes`` - k8s mongodb official operator for ops manager

The Jupyter notebook version of this presentation contains all the setup and installation instructions, so that you do not have to manually install every single one of these tools if you wish to reproduce the same steps later on.

#### Presentation Source Notebook

https://github.com/nleite/opsmgrk8s

which you can download / clone / reproduce by looking into this link. 


### ``minikube``

```sh
minikube [command]
```

Minikube allows us to create a local cluster using a virtual machine to support it.

Since k8s manages clusters, we might need one of those.

``minikube`` takes a command argument for execution. 

In [None]:
%%bash

brew cask install minikube

First we have to install ``minikube``. The cell above does so on macOS using Homebrew; for other systems, see the releases page:

* https://github.com/kubernetes/minikube/releases

In [None]:
%%bash

minikube --help

There are several different commands for minikube. I invite you all to explore the full set of functionality that minikube supports, if that's your cup of tea, but for the purposes of this talk we are just going to go ahead and start a minikube cluster.

#### Start minikube

```bash
minikube start 
```

In [None]:
%%bash

minikube status

Once we have our local k8s cluster up and running, we can move along to the next step:
locally installing the MongoDB Enterprise Kubernetes Operator.

### ``kubectl``

```sh
kubectl [command] [TYPE] [NAME] [flags]
```

``kubectl`` takes a command, a type, a name and a set of optional flags.

``kubectl`` supports several commands of different types.
You can also think of commands as verbs or actions.
Things like

- create
- expose
- get
- describe
- ...

are all basic commands.

Each verb / command can be applied to a resource type, and there are several different available resource types:
- https://kubernetes.io/docs/reference/kubectl/overview/#resource-types
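The `kubectl [command] [TYPE] [NAME] [flags]` shape can be captured in a small helper that assembles the argument list, which is handy when driving `kubectl` from a notebook. This is only a sketch: the `kubectl` Python function below is a hypothetical helper and does not run anything by itself.

```python
import shlex


def kubectl(command, resource_type=None, name=None, *flags):
    """Assemble a kubectl invocation as an argument list: command, TYPE, NAME, flags."""
    parts = ["kubectl", command]
    if resource_type:
        parts.append(resource_type)
    if name:
        parts.append(name)
    parts.extend(flags)
    return parts


# verb only + type, verb + type + name, verb + type + flags
print(shlex.join(kubectl("get", "pods")))
print(shlex.join(kubectl("describe", "pod", "mflix")))
print(shlex.join(kubectl("get", "nodes", None, "-o", "json")))
```

The returned list can be handed to `subprocess.run` as is, which avoids shell quoting issues.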

In [None]:
%%bash
brew install kubectl

With this instruction we can install ``kubectl`` on macOS.

For other systems follow the installation guide: 

- https://kubernetes.io/docs/tasks/tools/install-kubectl/ 

In [None]:
%%bash

kubectl get nodes

In [None]:
%%bash

kubectl get pods

Using kubectl I can easily get information about the nodes that are currently running in my local cluster. 

    while delivering this talk you can show the dynamic nature of this presentation format by changing the command from this current node to the following instruction 
    
    kubectl get nodes -o json 
    
    This will generate a quite large output that you can skip iterating on
    
In my case, I have only one member in the cluster, minikube, which is readily available and has the master role. 
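The JSON produced by `kubectl get nodes -o json` is large, but the interesting parts are easy to pull out. The sketch below works on a trimmed, hand-written sample rather than real output: `SAMPLE_NODES` is an assumption that keeps only the fields used here, and `summarize_nodes` is a hypothetical helper.

```python
import json

# Hand-written, heavily trimmed sample in the shape of `kubectl get nodes -o json`.
SAMPLE_NODES = """
{
  "items": [
    {
      "metadata": {
        "name": "minikube",
        "labels": {"node-role.kubernetes.io/master": ""}
      },
      "status": {
        "conditions": [
          {"type": "MemoryPressure", "status": "False"},
          {"type": "Ready", "status": "True"}
        ]
      }
    }
  ]
}
"""

ROLE_PREFIX = "node-role.kubernetes.io/"


def summarize_nodes(nodes_json):
    """Return (name, roles, ready) for every node in the JSON document."""
    summary = []
    for item in json.loads(nodes_json)["items"]:
        labels = item["metadata"].get("labels", {})
        roles = sorted(key[len(ROLE_PREFIX):] for key in labels if key.startswith(ROLE_PREFIX))
        ready = any(
            condition["type"] == "Ready" and condition["status"] == "True"
            for condition in item["status"]["conditions"]
        )
        summary.append((item["metadata"]["name"], roles, ready))
    return summary


print(summarize_nodes(SAMPLE_NODES))
```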

### `docker`

```bash
docker COMMAND
```

https://docs.docker.com/docker-for-mac/

#### Use `minikube` container runtime (docker daemon)

```bash
# `minikube docker-env` only prints the settings; eval applies them to the current shell
eval $(minikube docker-env)
```

But before building the container image of our application, we are going to redirect our docker client to connect to minikube's container runtime, since minikube has its own docker daemon.
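In a notebook, each `%%bash` cell runs in its own shell, so an `eval` in one cell does not carry over to the next. One possible workaround, sketched here, is to parse the `export` lines that `minikube docker-env` prints and keep them in a dictionary. The sample text is made up (real hosts and paths differ per machine), and `parse_docker_env` is a hypothetical helper.

```python
import re

# Made-up sample of what `minikube docker-env` prints; real values differ per machine.
SAMPLE_OUTPUT = '''export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.99.100:2376"
export DOCKER_CERT_PATH="/Users/me/.minikube/certs"
# Run this command to configure your shell:
# eval $(minikube docker-env)
'''

EXPORT_LINE = re.compile(r'^export (\w+)="(.*)"$')


def parse_docker_env(output):
    """Collect NAME="value" pairs from export lines, ignoring comments and blanks."""
    env = {}
    for line in output.splitlines():
        match = EXPORT_LINE.match(line.strip())
        if match:
            env[match.group(1)] = match.group(2)
    return env


docker_env = parse_docker_env(SAMPLE_OUTPUT)
print(docker_env["DOCKER_HOST"])
```

Calling `os.environ.update(docker_env)` in the notebook should then make later `%%bash` cells inherit these variables, since child processes inherit the kernel's environment.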

#### Build an Image

```bash
# builds the mflix image
docker build -t mflix mflix

# list the existing images
docker images
```

Builds the `mflix` image as defined in `mflix/Dockerfile`.

In [None]:
%%bash
minikube docker-env 

# eval $(minikube docker-env )

In [None]:
%%bash

docker build -t mflix mflix

docker images

#### Run a Container

```bash
# run a container
docker run -d -p 8500:5000 mflix

# list running containers
docker ps
```

This command starts a container from the mflix container image in detached mode, mapping the container's exposed port 5000 to host port 8500.

In [None]:
%%bash

docker run -d -p 8500:5000 mflix
  
docker ps --format "table {{.ID}}\t{{.Image}}"

In [None]:
%%bash
# execute this line to stop the previously launched container
docker ps --quiet | xargs docker container stop 

In [None]:
%%bash

cat mflix-deployment.yaml
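The content of `mflix-deployment.yaml` is not reproduced here. As a rough idea of what a Deployment for this application might contain, the sketch below builds one as a Python dictionary. It is not the actual file: the name, label, replica count and port are assumptions, while the field names are standard `apps/v1` Deployment fields.

```python
import json


def deployment_manifest(name, image, replicas, container_port):
    """Build a Deployment manifest: a pod template plus how many copies to keep running."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels of the pod template below.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Assumes the image was built inside minikube's docker
                            # daemon, so it must not be pulled from a registry.
                            "imagePullPolicy": "Never",
                            "ports": [{"containerPort": container_port}],
                        }
                    ]
                },
            },
        },
    }


deployment = deployment_manifest("mflix", "mflix", replicas=2, container_port=5000)
print(json.dumps(deployment, indent=2))
```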

### Install MongoDB Enterprise Operator 
```bash
git clone https://github.com/mongodb/mongodb-enterprise-kubernetes.git
```

https://docs.opsmanager.mongodb.com/current/tutorial/install-k8s-operator/

To install ``mongodb-enterprise-kubernetes``, we recommend you clone the repository locally.

We are going to look into what a Kubernetes operator is and how the ``mongodb-enterprise-kubernetes`` operator works.
For now we can simply clone the repository and follow the installation instructions:

* https://docs.opsmanager.mongodb.com/current/tutorial/install-k8s-operator/


#### Cloud Manager Organization / Project
https://www.mongodb.com/cloud/cloud-manager
![Cloud Manager Project Dashboard](assets/cloud_manager_project.png)

You can think of Cloud Manager as the hosted version of Ops Manager, which is very handy for the purposes of this presentation, given that it removes the need for a local deployment of Ops Manager. 

I'm not expecting you to do it right now, but feel free to try it out: you can start a 30-day free trial of Cloud Manager if you want to reproduce all of the configuration setup that I'm about to show you.

## Recap

- Basic overview of Kubernetes components and architecture 
- How to locally install a Kubernetes cluster 
- How to deploy containerized applications in Kubernetes
- How to deploy and manage a MongoDB Cluster in Kubernetes 
- How to integrate Ops Manager | Cloud Manager with Kubernetes

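### References and Glossary

* [kubectl documentation](https://kubernetes.io/docs/reference/kubectl/kubectl/)
* [kubernetes node](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/architecture/architecture.md#the-kubernetes-node)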
* [kubeadm documentation](https://kubernetes.io/docs/setup/independent/high-availability/#external-etcd)
* MongoDB Enterprise Kubernetes Operator 
    * [Installation tutorial](https://docs.opsmanager.mongodb.com/current/tutorial/install-k8s-operator/)
    * [Source code repository](https://github.com/mongodb/mongodb-enterprise-kubernetes)  

## QA ? / Notes / Thank You!