Docker is a Linux container platform that is popular for distributing and running server and command-line applications on cloud instances in a reproducible manner, and for building distributed microservice architectures.
A Linux container is built on special kernel features: similar to a chroot jail, it behaves like a separate machine, but unlike a virtual machine it does not have the overhead of virtualising hardware.
Docker is popular in the DevOps movement as it provides an easy way to install dependencies for software development and deployment, e.g. to run servers for MySQL, Apache Solr or Node.js.
In brief, a Docker Image contains a virtual Linux file system (e.g. a miniature Debian installation). A Docker Container is a particular execution of a Docker Image; it typically runs a single process installed within the image, and may have network ports exposed to the world or parts of the host computer's file system mounted inside the container.
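To make the distinction between image and container concrete, here is a minimal Dockerfile sketch. The base image tag, the port 8000, the /data path and the choice of Python's built-in web server are all assumptions picked for illustration, not something this guide prescribes.

```dockerfile
# Illustrative sketch only: image tag, port and paths are assumptions.

# The image's file system starts from a miniature Debian installation.
FROM debian:stable-slim

# Install the one program this container will run (Python's built-in web server).
RUN apt-get update \
 && apt-get install -y --no-install-recommends python3 \
 && rm -rf /var/lib/apt/lists/*

# Declare /data as a mount point; a host directory can be mounted here at run time.
VOLUME /data
WORKDIR /data

# Document the port the process listens on. It is only reachable from
# outside the container once it is published at run time.
EXPOSE 8000

# The single process that every container started from this image runs.
CMD ["python3", "-m", "http.server", "8000"]
```

Building this file produces one image; each subsequent docker run starts a separate container from it. Publishing the port (the -p option of docker run) and mounting a host directory (the -v option) are decided per container at run time, not in the image.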
One great advantage of Docker is that it simplifies tool installation, as each Docker image is a self-contained Linux distribution which doesn't have to be compatible with the host computer (beyond the kernel), and it's easy to try out a different tool or tool version without causing irreversible changes.
Docker runs natively on Linux; for Windows and OS X users, Docker automatically manages a virtual machine running the Linux containers. Docker containers can also be deployed on the cloud or a local cluster, e.g. using Docker Machine.
Docker images can be created from a Dockerfile, which lists the commands to run to prepare the image. Docker images can be chained together using base images: for instance, to build on an image with MySQL, the Dockerfile starts with FROM mysql, followed by additional instructions such as ADD or COPY (to include new files) or RUN (to run a command within the container).
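As a rough sketch of such a chained image, the Dockerfile below builds on the mysql base image. The file name schema.sql is hypothetical, and the example assumes the official mysql image's documented behaviour of running any .sql files found in /docker-entrypoint-initdb.d when a container first initialises its database.

```dockerfile
# Illustrative sketch only: schema.sql is a hypothetical file in the build directory.

# Build on top of the mysql base image from Docker Hub.
# (Pinning a specific tag, rather than the default latest, helps reproducibility.)
FROM mysql

# Include a new file in the image. The official mysql image executes
# *.sql files in this directory the first time the database is initialised.
COPY schema.sql /docker-entrypoint-initdb.d/

# Run a command inside the image at build time, here making the file world-readable.
RUN chmod a+r /docker-entrypoint-initdb.d/schema.sql
```

Everything the base image already provides (the MySQL server, its start-up script and default command) is inherited, so the derived Dockerfile only needs to describe what is added on top.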
Thus Docker is also an important tool for reproducibility, as these images can be automatically kept up to date and are distributed through Docker Hub. In bioinformatics, this has led to Bioboxes, a standard for creating interchangeable bioinformatics software containers.
Docker is not compatible with all Grid/HPC architectures, as it requires certain Linux kernel features and the nodes often run older distributions. Another potential blocker for HPC users is that the central Docker base images assume an amd64 processor architecture; using Docker on other CPUs would require building all Docker base images yourself, which would negate some of the advantages.