# Part I - docker compose

Part I uses Docker Compose to start multiple Docker containers. With Docker Compose, the commands you would normally type to run a container are written to a single file, which can be committed to a version control system.

In [7]:
import os
print(os.listdir())
print(os.listdir('docker'))

['.ipynb_checkpoints', 'deploy-dask-cluster-docker-compose.ipynb', 'docker', 'docker-compose.yml']
['Dockerfile', 'requirements.txt']


In [4]:
!tree

Folder PATH listing for volume Acer
Volume serial number is A2AC-A7B2
C:.
+---.ipynb_checkpoints
+---docker


#### Already done

In notebook xyz, a Dockerfile and a requirements file were created. They are repeated here so that this directory is self-contained.

In [9]:
%%writefile ./docker/Dockerfile
FROM python:3.10.12-slim

WORKDIR app
COPY ./requirements.txt .

RUN pip3 install --upgrade pip

RUN pip3 install -r requirements.txt

EXPOSE 8786 8787 30000-65535

CMD ["dask", "scheduler", "--host", "0.0.0.0"]
#CMD ["dask", "scheduler"]


Overwriting ./docker/Dockerfile


In [15]:
%%writefile ./docker/requirements.txt
pandas==2.2.3
numpy==2.1.0
dask[complete]==2024.9.1
dask-cloudprovider==2024.9.0
s3fs==2024.9.0
dask-expr==1.1.15
awscli
jupyter


Overwriting ./docker/requirements.txt
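The notebook later connects a local `Client` to the containers built from these pins. Dask generally expects the client environment and the cluster to run matching package versions and warns when they differ, so a quick comparison can save debugging time. The following is a minimal sketch using only the standard library; `parse_pins` and `find_mismatches` are hypothetical helper names, not part of Dask.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Return {package: version} for every `name==version` line."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # unpinned entries such as `awscli` are skipped
        name, version = line.split("==", 1)
        name = name.split("[", 1)[0].strip()  # `dask[complete]` -> `dask`
        pins[name.lower()] = version.strip()
    return pins


def find_mismatches(pins):
    """Compare pins with the locally installed versions.

    Returns {package: (pinned, installed_or_None)} for every difference.
    """
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed locally
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Calling `find_mismatches(parse_pins(open('docker/requirements.txt').read()))` in the notebook returns an empty dict when the local environment matches every pinned package.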


In [16]:
os.listdir('docker')

['Dockerfile', 'requirements.txt']

In [54]:
%%writefile docker-compose.yml
services:
  dask-scheduler:
    restart: always
    #context: ./
    build: ./docker/
    container_name: dask-scheduler-container
    ports:
      - 8786-8787:8786-8787
  dask-worker-1:
    restart: always
    #context: ./
    build: ./docker
    container_name: dask-worker-1
    command: "dask worker tcp://dask-scheduler:8786"


Overwriting docker-compose.yml
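The file above defines one scheduler and a single worker, `dask-worker-1`. If more workers are wanted, one option is to generate the same service block repeatedly. This is only a sketch under that assumption: `compose_yaml` is a hypothetical helper, and it reproduces the settings from the file above without adding new ones.

```python
def compose_yaml(n_workers, build_dir="./docker"):
    """Build docker-compose YAML text with one scheduler and n_workers workers."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    lines = [
        "services:",
        "  dask-scheduler:",
        "    restart: always",
        f"    build: {build_dir}",
        "    container_name: dask-scheduler-container",
        "    ports:",
        "      - 8786-8787:8786-8787",
    ]
    # One service per worker, each pointing at the scheduler's service name.
    for i in range(1, n_workers + 1):
        lines += [
            f"  dask-worker-{i}:",
            "    restart: always",
            f"    build: {build_dir}",
            f"    container_name: dask-worker-{i}",
            '    command: "dask worker tcp://dask-scheduler:8786"',
        ]
    return "\n".join(lines) + "\n"
```

Writing `compose_yaml(3)` to `docker-compose.yml` would describe three workers that all register with `tcp://dask-scheduler:8786`, the service name Compose resolves on its internal network.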


In [55]:
!docker compose build

#0 building with "default" instance using docker driver

2024/10/24 19:11:11 http2: server: error reading preface from client //./pipe/docker_engine: file has already been closed




#1 [dask-worker-1 internal] load build definition from Dockerfile
#1 transferring dockerfile: 291B done
#1 DONE 0.0s

#2 [dask-scheduler internal] load build definition from Dockerfile
#2 transferring dockerfile: 291B done
#2 DONE 0.0s

#3 [dask-scheduler internal] load metadata for docker.io/library/python:3.10.12-slim
#3 DONE 0.7s

#4 [dask-scheduler internal] load .dockerignore
#4 transferring context: 2B done
#4 DONE 0.0s

#5 [dask-worker-1 internal] load .dockerignore
#5 transferring context: 2B done
#5 DONE 0.0s

#6 [dask-scheduler internal] load build context
#6 transferring context: 38B done
#6 DONE 0.0s

#7 [dask-scheduler 1/5] FROM docker.io/library/python:3.10.12-slim@sha256:4d440b214e447deddc0a94de23a3d97d28dfafdf125a8b4bb8073381510c9ee2
#7 DONE 0.0s

#8 [dask-worker-1 internal] load build context
#8 transferring context: 38B done
#8 DONE 0.0s

#9 [dask-worker-1 2/5] WORKDIR app
#9 CACHED

#10 [dask-worker-1 3/5] COPY ./requirements.txt .
#10 CACHED

#11 [dask-worker-1 4/

In [60]:
!docker compose up -d

 Network deploy-dask-cluster-docker-compose_default  Creating
 Network deploy-dask-cluster-docker-compose_default  Created
 Container dask-scheduler-container  Creating
 Container dask-worker-1  Creating
 Container dask-worker-1  Created
 Container dask-scheduler-container  Created
 Container dask-scheduler-container  Starting
 Container dask-worker-1  Starting
 Container dask-worker-1  Started
 Container dask-scheduler-container  Started
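`up -d` returns once the containers have been started, which is not necessarily the moment the scheduler accepts connections. A small polling helper can bridge that gap before a client connects. This is a sketch with a hypothetical function name, `wait_for_port`. A successful TCP connect only shows that the published port accepts connections, so the container logs remain the authoritative check.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Opening and immediately closing a connection is enough as a probe.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the setup above, `wait_for_port("localhost", 8786)` probes the scheduler port published in `docker-compose.yml`.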


In [61]:
!docker ps

CONTAINER ID   IMAGE                                               COMMAND                  CREATED         STATUS         PORTS                                               NAMES
24fafab0f794   deploy-dask-cluster-docker-compose-dask-scheduler   "dask scheduler --ho…"   7 seconds ago   Up 3 seconds   0.0.0.0:8786-8787->8786-8787/tcp, 30000-65535/tcp   dask-scheduler-container
1f4f2554de66   deploy-dask-cluster-docker-compose-dask-worker-1    "dask worker tcp://d…"   7 seconds ago   Up 4 seconds   8786-8787/tcp, 30000-65535/tcp                      dask-worker-1


In [63]:
!docker logs dask-scheduler-container

2024-10-24 17:12:27,457 - distributed.scheduler - INFO - -----------------------------------------------
2024-10-24 17:12:28,120 - distributed.http.proxy - INFO - To route to workers diagnostics web server please install jupyter-server-proxy: python -m pip install jupyter-server-proxy
2024-10-24 17:12:28,192 - distributed.scheduler - INFO - State start
2024-10-24 17:12:28,198 - distributed.scheduler - INFO - -----------------------------------------------
2024-10-24 17:12:28,200 - distributed.scheduler - INFO -   Scheduler at:     tcp://172.21.0.3:8786
2024-10-24 17:12:28,201 - distributed.scheduler - INFO -   dashboard at:  http://172.21.0.3:8787/status
2024-10-24 17:12:28,201 - distributed.scheduler - INFO - Registering Worker plugin shuffle
2024-10-24 17:12:31,767 - distributed.scheduler - INFO - Register worker <WorkerState 'tcp://172.21.0.2:35795', status: init, memory: 0, processing: 0>
2024-10-24 17:12:32,130 - distributed.scheduler - INFO - Starting worker compute stream, tcp:/

In [62]:
!docker logs dask-worker-1

2024-10-24 17:12:30,011 - distributed.nanny - INFO -         Start Nanny at: 'tcp://172.21.0.2:34765'
2024-10-24 17:12:31,275 - distributed.worker - INFO -       Start worker at:     tcp://172.21.0.2:35795
2024-10-24 17:12:31,275 - distributed.worker - INFO -          Listening to:     tcp://172.21.0.2:35795
2024-10-24 17:12:31,275 - distributed.worker - INFO -          dashboard at:           172.21.0.2:44081
2024-10-24 17:12:31,275 - distributed.worker - INFO - Waiting to connect to:  tcp://dask-scheduler:8786
2024-10-24 17:12:31,276 - distributed.worker - INFO - -------------------------------------------------
2024-10-24 17:12:31,276 - distributed.worker - INFO -               Threads:                         12
2024-10-24 17:12:31,276 - distributed.worker - INFO -                Memory:                  15.54 GiB
2024-10-24 17:12:31,277 - distributed.worker - INFO -       Local Directory: /tmp/dask-scratch-space/worker-c9m9p3pw
2024-10-24 17:12:31,277 - distributed.worker - INFO -

In [64]:
from dask.distributed import Client

In [65]:
client = Client(address="tcp://localhost:8786")

In [66]:
client

Connection method: Direct
Dashboard: http://localhost:8787/status

Comm: tcp://172.21.0.3:8786
Workers: 1
Dashboard: http://172.21.0.3:8787/status
Total threads: 12
Started: Just now
Total memory: 15.54 GiB

Comm: tcp://172.21.0.2:35795
Total threads: 12
Dashboard: http://172.21.0.2:44081/status
Memory: 15.54 GiB
Nanny: tcp://172.21.0.2:34765
Local directory: /tmp/dask-scratch-space/worker-c9m9p3pw
Tasks executing:
Tasks in memory:
Tasks ready:
Tasks in flight:
CPU usage: 6.0%
Last seen: Just now
Memory usage: 154.03 MiB
Spilled bytes: 0 B
Read bytes: 573.0414202634641 B
Write bytes: 1.57 kiB
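The client summary shows one registered worker, but it does not prove that tasks actually run on it. A tiny round trip does. The sketch below uses `Client.map` and `Client.gather` from `dask.distributed`; `square` and `smoke_test` are hypothetical names introduced only for illustration.

```python
def square(x):
    """Toy task; runs on a worker when submitted through the client."""
    return x * x


def smoke_test(client, n=10):
    """Map `square` over range(n) on the cluster and return the sum of the results."""
    futures = client.map(square, range(n))  # one future per input value
    results = client.gather(futures)        # blocks until every future has finished
    return sum(results)
```

With the `client` created above, `smoke_test(client)` should return 285, the sum of the squares of 0 through 9, assuming the worker can deserialize the function sent from the notebook.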


Docker Compose created an internal bridge network for the two services. The scheduler's internal IP address shown above (172.21.0.3) belongs to that network, which appears in the list of Docker networks.

In [67]:
!docker network ls

NETWORK ID     NAME                                         DRIVER    SCOPE
a99534b2cf90   bridge                                       bridge    local
5c44ba80445b   deploy-dask-cluster-docker-compose_default   bridge    local
4efe82ca718c   host                                         host      local
bd68aa16dd4b   none                                         null      local
e80b6b301979   setuppostgresdatabase_default                bridge    local


Inspect the network to see the containers' internal IP addresses and compare them with the addresses above.

In [69]:
#!docker network inspect deploy-dask-cluster-docker-compose_default
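`docker network inspect` prints JSON, so the comparison can also be done programmatically rather than by eye. The sketch below assumes the usual output shape: a JSON list whose entries carry a `Containers` mapping with `Name` and `IPv4Address` fields. `container_ips` and `inspect_network` are hypothetical helper names.

```python
import json
import subprocess


def container_ips(inspect_json):
    """Map container name -> IPv4 address from `docker network inspect` JSON."""
    networks = json.loads(inspect_json)  # the CLI prints a JSON list
    ips = {}
    for network in networks:
        for info in (network.get("Containers") or {}).values():
            # IPv4Address carries a prefix length, e.g. "172.21.0.3/16"
            ips[info["Name"]] = info["IPv4Address"].split("/", 1)[0]
    return ips


def inspect_network(name):
    """Run `docker network inspect <name>` and return its raw JSON output."""
    result = subprocess.run(
        ["docker", "network", "inspect", name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

While the containers are running, `container_ips(inspect_network("deploy-dask-cluster-docker-compose_default"))` should list `dask-scheduler-container` and `dask-worker-1` with the same addresses that appear in their logs.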

In [72]:
client.close()

In [73]:
!docker compose down

 Container dask-scheduler-container  Stopping
 Container dask-worker-1  Stopping
 Container dask-scheduler-container  Stopped
 Container dask-scheduler-container  Removing
 Container dask-scheduler-container  Removed
 Container dask-worker-1  Stopped
 Container dask-worker-1  Removing
 Container dask-worker-1  Removed
 Network deploy-dask-cluster-docker-compose_default  Removing
 Network deploy-dask-cluster-docker-compose_default  Removed


# Kubernetes

https://www.youtube.com/watch?v=p6xDCz00TxU

https://www.dabbleofdevops.com/blog/deploy-and-scale-your-dask-cluster-with-kubernetes