EmilHvitfeldt/working-with-docker

working-with-docker

Table of Contents

Motivation

reproducibility etc etc

Several publications now come with Dockerfiles to replicate their research. The SWAG data set is one example.

Before We Start

We will be learning how to use Docker with R to promote reproducibility. First, you need to install Docker and R.
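A quick way to confirm both installs is to check that the tools are on your PATH. The snippet below is a minimal sketch; `check_tool` is a hypothetical helper name introduced here, and it assumes R was installed together with its `Rscript` front end.

```shell
#!/bin/sh
# check_tool is a hypothetical helper (not part of Docker or R).
# It reports whether a command-line tool can be found on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

check_tool docker
check_tool Rscript
```

If either line reports `missing`, install that tool before continuing.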

In the next sections we will look at some of the use cases.

Examples

Creating a Docker Image

Start by creating a folder for the analysis.

library(fs)
getwd()
dir_create("01-example")
file_create("01-example/Dockerfile")

Inside this folder we will put the script we will run. Here it is called script.R and contains the following code:

# Transform data to the right shape/format (drop the Species column)
iris_matrix <- as.matrix(iris[-5])

# Run model
kmeans_iris <- kmeans(iris_matrix, centers = 3)

# Extract information
cluster_centers <- as.data.frame(kmeans_iris$centers)
cluster_class <- kmeans_iris$cluster

# Save results
library(readr)
write_csv(cluster_centers, "iris-centers.csv")

The script takes some data, works on it, and writes the output to a file.

Check which version of R we are currently running.

R.Version()$version.string

With this information we can get a Docker image from rocker with the same version of R. Populate your Dockerfile with

FROM rocker/r-ver:3.5.1

which will download an image with R version 3.5.1 already installed. We also need to install the various R packages used in our script. To find the versions installed on your machine, run the following line of code in R.

sessioninfo::package_info(pkgs = c("readr", "tibble"), dependencies = FALSE)

Notice how we can pass multiple package names in the pkgs argument. We will install readr as version 1.3.0 and tibble as version 1.4.2 (as 2.0.0 is not quite available yet). In your Dockerfile add the following code

RUN R -e "install.packages('remotes'); \
  remotes::install_version('readr', '1.3.0'); \
  remotes::install_version('tibble', '1.4.2');"

Next we will create a folder inside the Docker container where the analysis will take place. This is done by including

RUN mkdir /home/analysis

To move the script into the docker image we include

COPY script.R /home/analysis/script.R

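Lastly we will ask Docker to run the script and move the result to a results folder. The /home/results folder is not created in the image; it is supplied as a mounted volume when the container is run.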

CMD cd /home/analysis \
  && R -e "source('script.R')" \
  && mv /home/analysis/iris-centers.csv /home/results/iris-centers.csv
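Putting the fragments above together, the whole Dockerfile can be written in one go from the terminal. This is a sketch, assuming you run it from the repository root so that the file lands in `01-example/`; tibble is pinned to 1.4.2, the version the text describes.

```shell
#!/bin/sh
set -e
mkdir -p 01-example

# The quoted 'EOF' stops the shell from expanding anything inside the
# heredoc, so backslashes and quotes reach the Dockerfile unchanged.
cat > 01-example/Dockerfile <<'EOF'
FROM rocker/r-ver:3.5.1

RUN R -e "install.packages('remotes'); \
  remotes::install_version('readr', '1.3.0'); \
  remotes::install_version('tibble', '1.4.2');"

RUN mkdir /home/analysis

COPY script.R /home/analysis/script.R

CMD cd /home/analysis \
  && R -e "source('script.R')" \
  && mv /home/analysis/iris-centers.csv /home/results/iris-centers.csv
EOF
```

Writing the file this way overwrites any existing `01-example/Dockerfile`, so skip it if you have already filled the file in by hand.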

Now the Dockerfile is complete and we can build the Docker image. This is done by typing the following commands into the terminal. We have named the image analysis.

# setting the right working directory
cd ~/Github/working-with-docker/01-example
# Build docker image
docker build -t analysis .

Access information from image

# Access results from image
docker run -v ~/Github/working-with-docker/01-example:/home/results analysis
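Because the host folder is mounted at `/home/results`, the file that the container moves there should appear in `01-example` on your machine once the run finishes. A small check might look like the sketch below; `show_results` is a hypothetical helper introduced here for illustration.

```shell
#!/bin/sh
# show_results is a hypothetical helper: it prints the first lines of the
# result file if the container produced it, and fails otherwise.
show_results() {
  results_dir="$1"
  if [ -f "$results_dir/iris-centers.csv" ]; then
    head -n 4 "$results_dir/iris-centers.csv"
  else
    echo "no iris-centers.csv in $results_dir" >&2
    return 1
  fi
}

# Example call, using the folder mounted in the docker run command:
# show_results ~/Github/working-with-docker/01-example
```

A non-zero exit status here usually means the container stopped before the `mv` step, in which case the output of `docker run` is the first place to look.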

Docker Commands

docker pull [user/repo] # Get an image from a repository
docker build -t [tag] [directory] # Build an image from a Dockerfile and give it a tag/name
docker run -d -p [ports] -v [volumes] [image] [command] # Start a detached container from an image, publishing a set of ports, mounting a set of volumes, and running a non-default command (options must come before the image name)
docker stop [container] # Stop a running container
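As a rough illustration of how these commands fit together for a long-running container, the sketch below wraps them in two small functions. The function names and the container name `demo` are placeholders invented for this example; `--name` and `docker rm` are standard Docker CLI features not covered in the list above.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Start an image detached under a fixed name so it can be referred to later.
start_detached() {
  docker run -d --name "$1" "$2"
}

# Stop the named container, then remove it so the name can be reused.
stop_and_remove() {
  docker stop "$1" && docker rm "$1"
}

# Example calls (require Docker and an existing image):
# start_detached demo analysis
# stop_and_remove demo
```

Naming the container avoids having to look up its generated ID with `docker ps` before stopping it.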

Resources

An Introduction to Docker for R Users https://colinfay.me/docker-r-reproducibility/

A collection of Docker images for R https://hub.docker.com/u/rocker

Rocker project https://www.rocker-project.org/

Introduction to Rocker https://journal.r-project.org/archive/2017/RJ-2017-065/index.html

ROpenSci Docker Tutorial https://ropenscilabs.github.io/r-docker-tutorial/

How To Remove Docker Containers, Images, Volumes, and Networks https://linuxize.com/post/how-to-remove-docker-images-containers-volumes-and-networks/

Docker for the UseR https://github.com/noamross/nyhackr-docker-talk

Reproducibility of computational workflows is automated using continuous analysis https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6103790/
