Author Carpentry: Docker for reproducible research
Reproducibility of computational results is crucial in modern algorithm-based research. In this lesson, we introduce Docker as a tool to (a) document your computational environment, and (b) make that environment transferable across machines and thus archivable. The course aims to showcase Docker as a useful tool for scientists, including those who do not regularly use the command line, on which the course is entirely based.
Content Contributors: Daniel Nüst
Lesson Maintainers: Daniel Nüst
Lesson status: In Development
- Docker basics (images, Dockerfiles, containers)
- RStudio in a container
- Jupyter Notebook in a container
- Archival of software and runtime environments with Docker
- Creating reproducible runtime environments from manifests
- Getting started with Docker
- RStudio in a Docker container
- Jupyter in a Docker container
- Transfer and archive of containers
- Create an image from a Dockerfile
Scope of this lesson
This lesson takes a rather "raw and manual" approach to creating reproducible packages of data, code, and the required runtime environment. Making this potentially tedious process more comfortable, and ideally automatic, for users is an active field of research; see, for example, the Executable Research Compendium by the o2r project and tools such as ReproZip. Understanding in depth how reproducibility can be achieved is a clear advantage over simply using such a tool as a ready-made box (the existing tools are all open, so they are white boxes rather than black boxes). The concepts of containerization/virtualization and the leading open source tool covered in this lesson are therefore worth knowing, even when using supporting tools and services. This also makes the topic suitable for a general audience interested in Author Carpentry.
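To make the "raw and manual" approach more concrete, here is a minimal sketch of what packaging data, code, and runtime environment by hand might look like. It is an illustration under stated assumptions, not the lesson's actual material: the file names `analysis.R` and `data.csv`, the image name `my-analysis`, the choice of `ggplot2`, and the pinned R version are all hypothetical placeholders.

```dockerfile
# Base image from the Rocker project with a fixed R version.
# Assumption: 4.3.1 is an arbitrary example; pin whichever version your analysis used.
FROM rocker/r-ver:4.3.1

# Install the R packages the analysis depends on (hypothetical selection).
# install2.r ships with the rocker/r-ver images; --error makes the build fail if installation fails.
RUN install2.r --error ggplot2

# Copy the (hypothetical) script and data into the image so they travel with the environment.
WORKDIR /analysis
COPY analysis.R data.csv ./

# Run the analysis when a container is started from this image.
CMD ["Rscript", "analysis.R"]

# Typical commands, run from the directory containing this Dockerfile:
#   docker build -t my-analysis .               # create the image from the manifest
#   docker run --rm my-analysis                 # execute the analysis in a container
#   docker save -o my-analysis.tar my-analysis  # write the image to a file for transfer or archival
#   docker load -i my-analysis.tar              # restore the image on another machine
```

The Dockerfile doubles as documentation of the environment, while the saved image file is the transferable, archivable artifact.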
Author Carpentry's teaching is hands-on, so participants are encouraged to use their own computers to ensure the proper setup of tools for an efficient workflow. These lessons assume no prior knowledge of the skills or tools, but working through this lesson requires working installations of the software described below. To use these materials most effectively, please make sure to install everything before working through this lesson.
- editor with highlighting for Dockerfiles, e.g. VS Code (use this if you have no preference), Atom or Sublime, or even Vim
In addition to the software, please bring a piece of your own research in one of the following formats (preferred formats first) if you have one at hand. This could be your digital notebook, for example, or a section of an analysis script you used for a published paper. Make sure you can share these files with other course participants: bring the required data, remove any privacy-sensitive information, and, if you take an excerpt from a longer script, check that it still runs.
- RMarkdown (
- R script (
- Python script (