Bioconductor/k8sredis

 
 

18 May, 2019

Work in progress: current state

Recent updates

  • Use dockerhub images; separate 'manager' and 'worker'.
  • Document use on gcloud
  • Re-organized files for easier build & deploy

Start minikube or gcloud

Use minikube for (local) development, or gcloud for scalable deployment.

Start the minikube VM with

minikube start

For gcloud, see below.

Create application in kubernetes

In kubernetes, create a redis service and a running redis application, an RStudio service, an RStudio 'manager', and five R worker 'jobs' with

kubectl apply -f k8s/

The two services, redis and manager pods, and worker pods should all be visible and healthy with

kubectl get all
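
Rather than re-running kubectl get all by hand, one could block until every pod reports ready. This is a minimal sketch, assuming a kubectl recent enough to provide the wait subcommand and that the worker pods keep running rather than completing; wait_for_pods is a hypothetical helper name and the 120s default timeout is arbitrary.

```shell
# Hypothetical helper: block until all pods in the current namespace are Ready.
# Assumes 'kubectl wait' is available and that worker pods stay running.
wait_for_pods() {
    timeout="${1:-120s}"   # arbitrary default; pass e.g. 300s on a slow cluster
    kubectl wait --for=condition=Ready pod --all --timeout="$timeout"
}

# Usage, after 'kubectl apply -f k8s/':
#   wait_for_pods && kubectl get all
```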

Log in to R

Connect via your browser to port 30001 at the IP address returned by minikube or gcloud

## For minikube, use...
minikube ip

## For gcloud, use any 'EXTERNAL-IP' from
kubectl get nodes --output wide

e.g.,

http://192.168.99.101:30001

This provides access to RStudio, with user rstudio and password bioc. Alternatively, open a shell on the manager (from which R can be started) at the command line with

kubectl exec -it manager -- /bin/bash
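
The browser URL described in this section is just the node IP address joined to the node port. A small sketch of assembling it; rstudio_url is a hypothetical helper name, and the default port 30001 is the value used in this README.

```shell
# Hypothetical helper: print the RStudio URL for a node IP address.
# The second argument overrides the default node port (30001).
rstudio_url() {
    ip="$1"
    port="${2:-30001}"
    printf 'http://%s:%s\n' "$ip" "$port"
}

# Usage with minikube (requires a running minikube VM):
#   rstudio_url "$(minikube ip)"
```

For gcloud, pass one of the EXTERNAL-IP values reported by kubectl get nodes --output wide instead.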

Use

Define a simple function

fun = function(i) {
    Sys.sleep(1)
    Sys.info()[["nodename"]]
}

Create a RedisParam to connect to the job queue and communicate with the workers, and use BiocParallel::register() to make this the default back-end

library(RedisParam)

p <- RedisParam(workers = 5, jobname = "demo", is.worker = FALSE)
register(bpstart(p))

Use bplapply() for parallel evaluation

system.time(res <- bplapply(1:13, fun))
table(unlist(res))
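
As a rough sanity check on the system.time() result: each call to fun sleeps for one second, so 13 tasks shared among 5 workers cannot finish in less than ceiling(13 / 5) = 3 seconds, compared with about 13 seconds when run serially. The arithmetic can be sketched in shell as follows; min_elapsed is a hypothetical helper, and it ignores scheduling and communication overhead, so the observed time will be somewhat larger.

```shell
# Hypothetical helper: lower bound on elapsed seconds when TASKS tasks, each
# taking SECS seconds (default 1), are shared among WORKERS workers.
min_elapsed() {
    tasks="$1"
    workers="$2"
    secs="${3:-1}"
    # Integer ceiling of tasks / workers, times the per-task duration
    echo $(( (tasks + workers - 1) / workers * secs ))
}

min_elapsed 13 5   # prints 3
```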

Clean up

Quit and exit the R manager (or simply leave your RStudio session in the browser)

> q()     # R
# exit    # manager

Clean up kubernetes

$ kubectl delete -f k8s/

Stop minikube or gcloud

## minikube...
minikube stop

## ...or gcloud
gcloud container clusters delete [CLUSTER_NAME]

Google cloud [WORK IN PROGRESS]

Here one uses Google's kubernetes service rather than minikube. Make sure that minikube is not running

minikube stop

Enable kubernetes service

Make sure the Kubernetes Engine API is enabled by visiting https://console.cloud.google.com.

Make sure the appropriate project is selected (dropdown in the blue menu bar).

Choose APIs & Services from the hamburger (top left) dropdown, then + ENABLE APIS & SERVICES (center top).

Configure gcloud

At the command line, make sure the correct account is activated and the correct project associated with the account

gcloud auth list
gcloud config list

Use gcloud config help / gcloud config set help and eventually gcloud config set core/project VALUE to update the project and perhaps other information, e.g., compute/zone and compute/region.
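
The configuration step above might look like the following sketch. configure_gcloud is a hypothetical helper name, and the project and zone passed to it are placeholders to be replaced with your own values.

```shell
# Hypothetical helper: set the active project and compute zone, then show the
# resulting configuration. Stops at the first command that fails.
configure_gcloud() {
    project="$1"   # your project ID (placeholder)
    zone="$2"      # your preferred compute zone (placeholder)
    gcloud config set core/project "$project" &&
    gcloud config set compute/zone "$zone" &&
    gcloud config list
}

# Usage (values are illustrative only):
#   configure_gcloud my-project-id us-east1-b
```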

Start and authenticate the gcloud kubernetes engine

A guide to exposing applications is available; we'll most closely follow the section Creating a Service of type NodePort.

Create a cluster (replace [CLUSTER_NAME] with an appropriate identifier)

gcloud container clusters create [CLUSTER_NAME]

Authenticate with the cluster

gcloud container clusters get-credentials [CLUSTER_NAME]

Create a hole in the firewall that surrounds our cloud (30001 is from k8s/rstudio-service.yaml)

gcloud compute firewall-rules create test-node-port --allow tcp:30001

At this stage, we can use kubectl apply ... etc., as above.
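
The three gcloud steps in this section can be collected into a single function, so that a failure in one step stops the rest. This is only a sketch: gke_up is a hypothetical helper name, and it assumes the project and compute zone have already been configured as described earlier.

```shell
# Hypothetical helper collecting the cluster set-up steps from this section.
# Each step runs only if the previous one succeeded.
gke_up() {
    cluster="$1"
    if [ -z "$cluster" ]; then
        echo "usage: gke_up CLUSTER_NAME" >&2
        return 1
    fi
    gcloud container clusters create "$cluster" &&
    gcloud container clusters get-credentials "$cluster" &&
    gcloud compute firewall-rules create test-node-port --allow tcp:30001
}

# Usage:
#   gke_up [CLUSTER_NAME] && kubectl apply -f k8s/
```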

Docker images

Docker images for the manager and worker are available at dockerhub as mtmorgan/bioc-redis-manager and mtmorgan/bioc-redis-worker. They were built as

docker build -t bioc-redis-worker -f docker/Dockerfile.worker docker
docker build -t bioc-redis-manager -f docker/Dockerfile.manager docker

The R manager docker file is built from rocker/rstudio:3.6.0, providing R and RStudio server, with additional infrastructure to support RedisParam. The R worker docker file is built from rocker/r-base:latest, providing R, with additional infrastructure to support RedisParam.

If one were implementing a particular workflow, likely the worker (and perhaps manager) images would be built from a more complete image like Bioconductor/AnVIL_Docker customized with required packages.

For use of local images, one needs to build these in the minikube environment

eval $(minikube docker-env)
docker build ...
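
A sketch of the local build as one function. It assumes that the abbreviated docker build step refers to the two build commands shown earlier in this section and that it is run from the repository root; build_local_images is a hypothetical helper name.

```shell
# Hypothetical helper: build both images inside minikube's docker daemon.
# Assumes the worker and manager build commands shown earlier in this section.
build_local_images() {
    # Capture the environment settings first so a minikube failure is detected
    env_cmds="$(minikube docker-env)" || return 1
    # Point this shell's docker client at the minikube docker daemon
    eval "$env_cmds"
    docker build -t bioc-redis-worker -f docker/Dockerfile.worker docker &&
    docker build -t bioc-redis-manager -f docker/Dockerfile.manager docker
}
```

Because the function is evaluated in the current shell, later docker commands in the same session also talk to the minikube daemon.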

TODO

A little further work will remove the need to create the RedisParam() in the R session.

The create / delete steps can be coordinated by a helm chart, so that a one-liner will give a URL to a running RStudio backed by an arbitrary number of workers.

About

Use a kubernetes cluster of R workers using redis & RedisParam
