Reproducible Research through Containerisation: Docker and Singularity
A workshop for the Eurotech Summer School: Open Science In Practice
11:00-13:00, 5 September 2019 | EPFL, Lausanne, Switzerland
A container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings. Containers can be used to package entire scientific workflows, software and libraries, and even data. This means that you don’t have to ask your cluster admin to install anything for you: you can put it in a container, run it, and easily share your analysis with collaborators. Containers are a particularly useful way of reproducing research that relies on software being configured in a certain way, and/or that uses libraries that vary between (or don’t exist on) different systems. There are a number of different tools available for creating and working with containers; this workshop focuses on Docker and Singularity.
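To make the ingredients listed above concrete, here is a minimal sketch of a Dockerfile, the recipe Docker uses to build an image. It is an illustration only and not taken from the workshop materials: the file names analysis.py and requirements.txt, the choice of base image and the installed system package are all assumptions.

```dockerfile
# Hypothetical example: analysis.py and requirements.txt are assumed to
# exist next to this Dockerfile; they are not part of the workshop materials.

# Runtime and base system: an official Python image (version is an assumption)
FROM python:3.11-slim

# System tools and libraries, installed with the image's package manager
RUN apt-get update \
    && apt-get install -y --no-install-recommends git \
    && rm -rf /var/lib/apt/lists/*

# Settings: environment variables and the working directory
ENV LC_ALL=C.UTF-8
WORKDIR /analysis

# Language-level libraries, with versions pinned in requirements.txt
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Code: the analysis script itself
COPY analysis.py .

# Default command executed when a container is started from the image
CMD ["python", "analysis.py"]
```

Assuming Docker is installed, an image could be built from this file with `docker build -t my-analysis .` and run with `docker run --rm my-analysis`, where my-analysis is a placeholder image name.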
After the workshop, participants should know:
- What containerisation is and why they might use it in their research.
- Some common containerisation tools and when to use them.
- How to find and run containers built by other people.
- How to build their own container.
- How to distribute their container online.
Prior to this workshop, participants will need:
- Their own laptop computers
- Some familiarity with the command line
- To install Docker: https://docs.docker.com/install/
- To install Singularity [optional]: https://sylabs.io/guides/3.0/user-guide/installation.html
- Learning objectives
- Feedback practice: Software Carpentry style traffic lights
- Introduction to Reproducibility through Containerisation
- Comfort break
- Build your own Docker container
- Summary and closing
You can find the slides for this workshop here.
- Part 1: Introduction to Reproducibility through Containerisation
- Part 2: Docker
- Part 3: Making your own Docker Image
- Part 4: Singularity
- Part 5: Summary and Closing
Attribution: The tutorial and slides have been adapted from Melbourne Bioinformatics' Containerized Bioinformatics tutorial.
Dr. Rachael Ainsworth is the Research Software Community Manager at the Software Sustainability Institute based at the University of Manchester. She is passionate about openness, transparency, reproducibility and inclusion in research. She is also a Software Sustainability Institute Fellow, FOSTER certified Open Science Trainer, Mozilla Open Leader, member of The Turing Way community, organiser for the women in data meetup group HER+Data MCR and member of the first cohort of Tech Future Female Leaders.