# An architecture for achieving greatness
1. [Containers](#containers)
    - [VM vs. Container](#VM-vs-Container)
    - [Software packaging](#Software-packaging)
    - [Building and deploying](#Building-and-deploying)
    - [Sharing is caring - or not](#Sharing-is-caring---or-not)
    - [How does it scale](#How-does-it-scale)
2. [DC/OS](#dc/os)
    - [One OS to rule them all](#One-OS-to-rule-them-all)
        - [Mesos](#Mesos)
            - [The minimal cluster resource manager](#The-minimal-cluster-resource-manager)
            - [User space and system space - just like Linux](#User-space-and-system-space---just-like-linux)
        - [Marathon - the world's oldest scheduler](#Marathon---the-worlds-oldest-scheduler)
        - [And a bunch of other things](#And-a-bunch-of-other-things)
    - [DC/OS vs. Kubernetes vs. Swarm](#DC/OS-vs.-Kubernetes-vs.-Swarm)
    - [Creating apps](#creating-apps)
        - [An example app - Memcached](#An-example-app---Memcached)
3. [Azure](#Azure)
    - [Deploying DC/OS](#deploying-dc/os)
    - [Deployment and monitoring](#deployment-and-monitoring)

---

## Containers

### VM vs. Container
* What is AUFS?
* What happens in the kernel?
* OS-level virtualization vs. hardware virtualization

<tr>
    <td> <img src="https://www.docker.com/sites/default/files/Container%402x.png" alt="Drawing" style="width: 400px;"/> </td>
    <td> <img src="https://www.docker.com/sites/default/files/VM%402x.png" alt="Drawing" style="width: 400px;"/> </td>
    </tr>

<tr>
    <td> <img src="http://images.nvidia.com/content/products/hpc/gpu-enabled-docker-containers.png" alt="Drawing" style="width: 500px;"/> </td>
</tr>



---

### Software packaging

##### A minimal Julia script

```julia
# example.jl
while true
    println("I'm Alive")
    sleep(20)
end
```

---

##### A basic Dockerfile for the script

```dockerfile
FROM centos:7

# INSTALL JULIA  
ENV JULIA_TAG="0.6"
ENV JULIA_BIN="https://julialang-s3.julialang.org/bin/linux/x64/${JULIA_TAG}/julia-${JULIA_TAG}-latest-linux-x86_64.tar.gz"
RUN yum --enablerepo=extras install -y epel-release && \
    yum update -y && \
    curl ${JULIA_BIN} | tar zxf - -C /tmp && \
    cp -r /tmp/julia*/* /usr/local && \
    rm -rf /tmp/*

# Update packages and install Requests
RUN julia -e 'Pkg.update()' && \
    yum install -y http-parser unzip && \
    julia -e 'Pkg.add("Requests")' && \
    julia -e 'using Requests'

# Add example julia script to be executed on start
COPY example.jl /
ENTRYPOINT ["julia", "/example.jl"]
```

---

#### Creating a GPU enabled container


```dockerfile
FROM nvidia/cuda:8.0-cudnn6-devel-centos7
LABEL maintainer="LakeTide <info@laketide.com>"
WORKDIR /tmp
ENV JULIA_TAG="v0.6.0"
ENV JULIA_GIT="https://github.com/JuliaLang/julia.git"
ENV MXNET_HOME="/usr/local"

RUN yum --enablerepo=extras install -y epel-release && \
    yum update -y && \
    yum install -y git m4 gcc-gfortran make which bzip2 patch openssl && \
    git clone ${JULIA_GIT} && \
    cd julia && \
    git checkout ${JULIA_TAG} && \
    sh contrib/download_cmake.sh && \
    make all -j$(nproc) MARCH=x86-64 JULIA_CPU_TARGET=x86-64 BUILD_LLVM_CLANG=1 && \
    cp -r usr/* /usr/local/ && \
    rm -f /usr/local/lib/libssh* /usr/local/lib/libcurl* && \
    ln -s /usr/local/tools/* /usr/local/bin/ && \
    rm -f /usr/local/bin/curl* && \
    yum autoremove -y && rm -rf /var/cache/yum* && rm -rf /tmp && mkdir /tmp
    
RUN yum install -y opencv lapack blas opencv-devel wget openblas-devel lapack-devel && \
    cd /tmp && \
    git clone --depth 1 --recursive --single-branch https://github.com/dmlc/mxnet && cd mxnet && \
    make -j$(nproc) ADD_CFLAGS=-I/usr/include/openblas USE_DIST_KVSTORE=1 USE_BLAS=openblas \
    USE_LAPACK=1 USE_OPENCV=1 USE_CUDA=1 USE_CUDNN=1 USE_CUDA_PATH=/usr/local/cuda && \
    cp lib/* /usr/local/lib && \
    echo "/usr/local/lib" | tee -a /etc/ld.so.conf && \
    echo "/usr/local/cuda/lib64/stubs" | tee -a /etc/ld.so.conf && ldconfig && \
    julia -e 'Pkg.update()' && \
    julia -e 'Pkg.add("MXNet"); using MXNet' && \
    yum autoremove -y && rm -rf /var/cache/yum/* && rm -rf /tmp && mkdir /tmp

RUN julia -e 'Pkg.add("DataFrames"); using DataFrames' && \
    julia -e 'Pkg.add("OhMyREPL"); using OhMyREPL' && \
    julia -e 'Pkg.add("CUDAnative")' && \
    julia -e 'Pkg.add("CuArrays")' && \
    yum autoremove -y && rm -rf /var/cache/yum/*

RUN yum update -y && \
    yum install -y python-devel python-pip czmq && \
    pip install --upgrade pip && \
    pip install jupyter && \
    pip install jupyterthemes && \
    jt -f ubuntu -t grade3 && \
    julia -e 'Pkg.update()' && julia -e 'Pkg.add("IJulia"); using IJulia' && \
    yum install -y fftw-libs && \
    julia -e 'Pkg.add("Gadfly")' && julia -e 'using Gadfly' && \
    julia -e 'Pkg.add("Plots")' && julia -e 'using Plots' && \
    yum autoremove -y && \
    rm -rf /var/cache/yum/*

COPY extras/juliarc.jl /root/.juliarc.jl
COPY extras/bashrc /root/.bashrc
COPY extras/bash_profile /root/.bash_profile

EXPOSE 8888

ENTRYPOINT julia -e 'Pkg.build("CUDAdrv"); Pkg.build("CuArrays")' && export JULIA_NUM_THREADS=$((`nproc`/2)) && /usr/bin/jupyter notebook --allow-root --ip='*'
```


---

### Building and deploying

```bash
# We can build an image from the Dockerfile and push it to our repository
docker build -t laketide/nordicdata .
docker login
docker push laketide/nordicdata
```

```bash
# We can run a docker image locally as a single instance
docker run -d -p 1337:8888 -v ~/containervolume:/tmp laketide/nordicdata

```

```bash
# We can map the NVIDIA drivers into our container to use the GPUs on the host
nvidia-docker run -d --restart always -p 6666:22 -p 1337:8888 -v ~/containervolume:/tmp laketide/julia-src
NV_GPU='0,1' nvidia-docker run -d -p 1337:8888 -v ~/containervolume:/tmp laketide/nordicdata
NV_GPU='GPU-836c0c09,GPU-b78a60a' nvidia-docker run ...

```

---

### Sharing is caring - but who cares

### How does it scale? - it doesn't

* Docker Swarm
* Kubernetes
* DC/OS
* RancherOS

---


## DC/OS

<tr>
    <td> <img src="https://mesosphere.com/wp-content/uploads/2016/04/logo-horizontal-styled.png" alt="Drawing" style="width: 400px;"/> </td>
</tr>


### One OS to rule them all
* User space and system space - just like Linux
* Mesos + systemd components


<tr>
    <td> <img src="https://dcos.io/docs/1.10/img/dcos-components-1.9.png" alt="Drawing" style="width: 500px;"/> </td>
</tr>

---

* Master/slave architecture

<tr>
    <td> <img src="https://dcos.io/docs/1.9/img/dcos-node-types.png" alt="Drawing" style="width: 500px;"/> </td>
</tr>

---

* DNS, virtual networks, layer 4 load balancer, VIPs for service discovery

<tr>
    <td> <img src="https://mesosphere.com/wp-content/uploads/2015/12/lb1.jpg" alt="Drawing" style="width: 400px;"/> </td>
</tr>

---

* DMZ for security

<tr>
    <td> <img src="https://dcos.io/docs/1.9/img/dcos-node-types.png" alt="Drawing" style="width: 500px;"/> </td>
</tr>

* Built-in container scheduler for jobs and stateful services

---

### Mesos

* Minimal cluster resource manager
* Fault tolerance via Zookeeper
* Provides custom isolation for CPU, memory, disk, ports, and GPU
* Native support for Docker and the NVIDIA plugin
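
Mesos advertises the free resources of each node to frameworks as offers, and a scheduler such as Marathon picks an offer for each task. The toy sketch below illustrates only that matching step, using a simple first-fit rule. `Offer`, `Task` and `first_fit` are hypothetical names for illustration, not the Mesos API, and the real Mesos allocator is considerably more involved.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Offer:
    """Free resources on one node (hypothetical type, not the Mesos API)."""
    agent: str
    cpus: float
    mem: int  # MiB
    gpus: int = 0


@dataclass(frozen=True)
class Task:
    """Resources a task asks for (hypothetical type)."""
    name: str
    cpus: float
    mem: int  # MiB
    gpus: int = 0


def first_fit(task: Task, offers: Sequence[Offer]) -> Optional[Offer]:
    """Return the first offer that covers every requested resource, else None."""
    for offer in offers:
        if (offer.cpus >= task.cpus
                and offer.mem >= task.mem
                and offer.gpus >= task.gpus):
            return offer
    return None


# Made-up cluster state: one CPU-only node and one GPU node.
offers = [
    Offer(agent="agent-1", cpus=2.0, mem=4096),
    Offer(agent="agent-2", cpus=8.0, mem=32768, gpus=2),
]
gpu_task = Task(name="docker-gpu-test", cpus=1.0, mem=128, gpus=1)

chosen = first_fit(gpu_task, offers)
print(chosen.agent if chosen else "no suitable offer")
```

The GPU task skips the first node because it offers no GPUs, which is the same kind of decision a scheduler makes when it declines an offer.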

### Marathon
* Like Internet Explorer - but nice.
* Built-in container scheduler (comparable to Kubernetes, Swarm, etc.)
* Blue/Green deployment
* Persistent volumes
* Custom health check definitions

## ----> Let's deploy DC/OS on Azure!

---

### Creating apps

##### Deploying the example Docker image on DC/OS with Marathon

```json
{
  "id": "nordicdatasummit",
  "cpus": 0.5,
  "mem": 1000,
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "laketide/nordicdata",
      "network": "BRIDGE",
      "forcePullImage": true
    }
  }
}
```
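
The same definition can be submitted programmatically through Marathon's REST API (`POST /v2/apps`). The sketch below only prepares the request and does not send it. The cluster address `dcos.example.com` is a placeholder, and the `/service/marathon` prefix assumes the request goes through the DC/OS Admin Router; both are assumptions to adapt to your own cluster.

```python
import json
import urllib.request

# Placeholder address (assumption): replace with your own DC/OS master URL.
DCOS_URL = "https://dcos.example.com"

# Same app definition as the JSON above, as a Python dict.
app = {
    "id": "nordicdatasummit",
    "cpus": 0.5,
    "mem": 1000,
    "container": {
        "type": "DOCKER",
        "docker": {
            "image": "laketide/nordicdata",
            "network": "BRIDGE",
            "forcePullImage": True,
        },
    },
}


def build_request(base_url: str, app_definition: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST of the app definition to Marathon."""
    return urllib.request.Request(
        url=f"{base_url}/service/marathon/v2/apps",
        data=json.dumps(app_definition).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_request(DCOS_URL, app)
print(request.get_method(), request.full_url)

# Sending is left out on purpose: urllib.request.urlopen(request) would submit
# the app, and a secured cluster would also need an Authorization header.
```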

#### DC/OS supports GPU computing natively

```json
{
    "id": "docker-gpu-test",
    "acceptedResourceRoles":["slave_public", "*"],
    "cmd": "while [ true ] ; do nvidia-smi; sleep 5; done",
    "cpus": 1,
    "mem": 128,
    "disk": 0,
    "gpus": 1,
    "instances": 1,
    "container": {
      "type": "MESOS",
      "docker": {
        "image": "nvidia/cuda"
      }
    }
}
```
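
Note that the GPU definition uses `"type": "MESOS"` even though it pulls a Docker image. The sketch below is a small pre-flight check built on the assumption that Marathon schedules `gpus` only through the Mesos containerizer and not through the `DOCKER` type. `gpu_config_problems` is a hypothetical helper, not part of any DC/OS tool.

```python
import json

# Trimmed copy of the GPU app definition above.
GPU_APP_JSON = """
{
  "id": "docker-gpu-test",
  "cpus": 1,
  "mem": 128,
  "gpus": 1,
  "container": {"type": "MESOS", "docker": {"image": "nvidia/cuda"}}
}
"""


def gpu_config_problems(app: dict) -> list:
    """Return human-readable problems with an app's GPU settings (empty if none)."""
    problems = []
    gpus = app.get("gpus", 0)
    container_type = app.get("container", {}).get("type")
    if gpus < 0:
        problems.append("gpus must not be negative")
    # Assumption: GPU requests combined with the DOCKER type are not accepted.
    if gpus > 0 and container_type == "DOCKER":
        problems.append(f"gpus={gpus} requires container type MESOS, got DOCKER")
    return problems


gpu_app = json.loads(GPU_APP_JSON)
print(gpu_config_problems(gpu_app))

# The same app with the container type switched to DOCKER is flagged.
docker_variant = dict(
    gpu_app, container={"type": "DOCKER", "docker": {"image": "nvidia/cuda"}}
)
print(gpu_config_problems(docker_variant))
```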

#### Deploying a Universe app to Marathon

##### An example Memcached JSON definition

```json
{
  "id": "/memcached",
  "cmd": "memcached -m 1024 -vv",
  "instances": 1,
  "cpus": 1,
  "mem": 2000,
  "disk": 0,
  "gpus": 0,
  "constraints": [],
  "maxLaunchDelaySeconds": 3600,
  "container": {
    "type": "DOCKER",
    "volumes": [],
    "docker": {
      "image": "memcached",
      "network": "BRIDGE",
      "portMappings": [
        {
          "containerPort": 11211,
          "hostPort": 11211,
          "servicePort": 0,
          "protocol": "tcp",
          "name": "connect",
          "labels": {
            "VIP_0": "/memcached:11211"
          }
        }
      ],
      "privileged": true,
      "forcePullImage": false
    }
  },
  "healthChecks": [
    {
      "gracePeriodSeconds": 600,
      "intervalSeconds": 10,
      "timeoutSeconds": 10,
      "maxConsecutiveFailures": 3,
      "portIndex": 0,
      "protocol": "TCP"
    }
  ],
  "upgradeStrategy": {
    "minimumHealthCapacity": 0,
    "maximumOverCapacity": 0
  },
  "unreachableStrategy": {
    "inactiveAfterSeconds": 300,
    "expungeAfterSeconds": 600
  },
  "killSelection": "YOUNGEST_FIRST",
  "requirePorts": true
}
```
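
The `healthChecks` block decides when Marathon gives up on a task. The sketch below is a simplified model of that rule as I understand it from the Marathon documentation: failures are ignored during the grace period until the task has been healthy once, and the task is killed after `maxConsecutiveFailures` failures in a row. `kill_time` is a hypothetical helper and not Marathon's implementation.

```python
from typing import Iterable, Optional, Tuple


def kill_time(
    results: Iterable[Tuple[int, bool]],
    grace_period: int,
    max_failures: int,
) -> Optional[int]:
    """Return the time (seconds since task start) of the kill, or None.

    results: (seconds_since_start, check_passed) pairs in time order.
    """
    if max_failures <= 0:
        return None  # assumption: 0 means failing tasks are never killed
    consecutive = 0
    seen_healthy = False
    for seconds, passed in results:
        if passed:
            seen_healthy = True
            consecutive = 0
            continue
        if not seen_healthy and seconds < grace_period:
            continue  # failure ignored while still inside the grace period
        consecutive += 1
        if consecutive >= max_failures:
            return seconds
    return None


# Values from the Memcached definition above.
GRACE, INTERVAL, MAX_FAILURES = 600, 10, 3

# Case 1: the task never becomes healthy.
never_healthy = [(t, False) for t in range(INTERVAL, 700, INTERVAL)]
print(kill_time(never_healthy, GRACE, MAX_FAILURES))

# Case 2: healthy twice, then failing on every check.
flaky = [(10, True), (20, True)] + [(t, False) for t in range(30, 100, INTERVAL)]
print(kill_time(flaky, GRACE, MAX_FAILURES))
```

Under this model a task that never starts properly lingers for the whole grace period, while one that was healthy before is replaced after a few failed checks.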

## Azure

### Deploying DC/OS
* With Container Service
* With ARM and acs-engine

### GUI
* Deploy apps
* Monitor cluster health
* Check Mesos, Exhibitor, Marathon

### CLI
* Deploy/remove custom apps, jobs etc.
* Micro-manage services, deployments

## Building your own GPU workstation

<tr>
    <td> <img src="https://i2.wp.com/www.laketide.com/wp-content/uploads/2016/07/DIGITSv2.jpg?w=1080" alt="Drawing" style="width: 700px;"/> </td>
</tr>

Hardware limitations:
* Max power available
* Cooling
* Sound?
* Money

Performance:
* GFLOPS/watt
* GeForce SLI?
* PCIe vs CPU
* RAM vs CPU
* SSD vs HD vs cost
* Power supply rating = sum of all component power draws + safety margin
* Proper cooling is key (spacing, positive airflow, ...)
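
The power supply rule and the GFLOPS/watt metric from the list above can be worked through as plain arithmetic. Every number in the sketch below (component wattages, the 20% margin, the GFLOPS figure) is a made-up illustration value, not a measurement or a vendor specification.

```python
# All figures below are hypothetical illustration values.
component_watts = {
    "cpu": 140,
    "gpu_0": 250,
    "gpu_1": 250,
    "motherboard_and_ram": 80,
    "drives_and_fans": 40,
}
SAFETY_MARGIN = 0.20  # fraction added on top of the summed draw (assumption)


def required_psu_watts(components: dict, margin: float) -> float:
    """Summed component draw plus a fractional safety margin."""
    return sum(components.values()) * (1.0 + margin)


def gflops_per_watt(gflops: float, watts: float) -> float:
    """Efficiency metric for comparing cards with different power draw."""
    return gflops / watts


total_draw = sum(component_watts.values())
psu_watts = required_psu_watts(component_watts, SAFETY_MARGIN)
print(f"summed draw: {total_draw} W, PSU target: {psu_watts:.0f} W")

# Hypothetical card: 10000 GFLOPS at 250 W.
efficiency = gflops_per_watt(10000.0, component_watts["gpu_0"])
print(f"efficiency: {efficiency:.1f} GFLOPS/W")
```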

<tr>
    <td> <img src="http://cdn.wccftech.com/wp-content/uploads/2014/08/ASIS-X99-E-WS-Motherboard.jpg" alt="Drawing" style="width: 700px;"/> </td>
</tr>
