dockergate2

Docker has become a popular technology for cloud services because it makes deployment easy and isolates application processes. By default, a Docker container has access to the majority of the system calls exposed by the host kernel. Because all containers share that kernel, this largely unrestricted access enlarges the attack surface. Docker version 17.03 onwards includes support for a Seccomp profile that can allow or deny the system calls that applications inside a Docker container can make. However, it is impractical to manually define a Seccomp profile for a particular Docker image without detailed knowledge of all the executable code inside it. In this work, we propose DockerGate, a platform that statically analyzes a Docker image to identify the system calls in its executable binaries. Our key insight is that by analyzing the reachable executable code, tighter Seccomp profiles can be generated, which reduces the attack surface of the Docker container. The static analysis framework was developed as a pipeline that first generates a graph of the filesystem and then traverses the graph, analyzing each node for system calls. We test DockerGate by generating Seccomp policies for 40 Docker images and running each image in a container with the generated policy applied. The generated policies allow 230 system calls on average, as opposed to the 300 system calls allowed by the default policy. We also achieve basic functionality for 39 out of 42 containers.

Over the course of the past semester, we have developed a platform that statically analyzes a Docker image and generates a custom Seccomp policy that is tighter than the default one. This document describes the architecture of the platform and how it can be extended to run other experiments with different types of files.

Overview

We developed DockerGate as a static analysis platform that goes through all the executable code in a Docker image and extracts the system calls each executable requires. The process involves mounting the Docker image in a container, traversing the filesystem to generate a “call/linked” graph, and then individually analyzing each executable and its associated libraries. The analysis can be divided into three phases:

Initial Pass - Call graph generation for Docker Image

Second Pass - Traversing the Call graph and analyzing each file

Policy Generation - Seccomp policy is written

The rest of this document describes how each phase is executed, which tools it uses, and how the overall solution is implemented. Below is the code structure.

Initial Pass over Container Filesystem

The initial pass mounts the Docker image and performs a depth-first traversal of its filesystem. Whenever an executable file is encountered, we add it to the traversal graph as a “blue” node. We then check whether the executable is dynamically linked and, if so, which libraries it links to. These libraries are added to the graph as “white” nodes and are recursively checked for further linked libraries. The result is a graph with multiple components, as can be seen below.

This is done by sharing a folder containing the file traversal and graph generation code between the host and the container. The folder contains a statically linked version of Python with the Graphviz module installed. Because all of the required tools are included, the code depends on nothing from the container itself. This makes DockerGate platform-independent. The final output graph is stored as a DOT file and copied back to the host.

Upon visualization of the graph, it can be seen that there are no directed paths from one ELF binary to another and that the graph is largely disconnected. We believe that once we include analysis of bash scripts and Python files, connectivity will improve and we will be able to produce tighter Seccomp policies.

As of now, we analyze all executables, on the assumption that all of the code is reachable. However, in Docker images that are web servers built on a base OS such as Ubuntu or CentOS, many executables are effectively dead code and are almost never executed. We therefore believe that by following the execution path, or call graph, from the entrypoint of the Docker image, we can produce Seccomp policies that allow only the system calls present on that path. Unused code such as bash (in Tomcat) would then not be analyzed at all. The code is structured so that once we find a suitable method for analyzing bash scripts and Python files, it can be added to DockerGate as a separate module.
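The traversal and graph construction described in this section can be sketched roughly as follows. This is an illustration only, not DockerGate's actual code: the function names are hypothetical, the dependency lookup is passed in as a caller-supplied `resolve_deps` function (how DockerGate resolves linked libraries is not shown in this document), and the DOT text is written by hand instead of through the Graphviz module so that the sketch has no third-party dependencies.

```python
import os
import stat

ELF_MAGIC = b"\x7fELF"


def is_elf_executable(path):
    """True for a regular file with an execute bit set and an ELF header."""
    try:
        mode = os.lstat(path).st_mode
        if not stat.S_ISREG(mode) or not (mode & 0o111):
            return False
        with open(path, "rb") as f:
            return f.read(4) == ELF_MAGIC
    except OSError:
        return False


def build_graph(root, resolve_deps):
    """Walk `root` and return (nodes, edges).

    nodes maps a path to "blue" (executable) or "white" (linked library).
    edges is a set of (user, dependency) pairs.
    resolve_deps(path) is a hypothetical callback that returns the
    libraries `path` is dynamically linked against.
    """
    nodes, edges = {}, set()

    def add_deps(path):
        for dep in resolve_deps(path):
            edges.add((path, dep))
            if dep not in nodes:
                nodes[dep] = "white"
                add_deps(dep)  # libraries may link to further libraries

    # os.walk visits each directory before descending into its children.
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            if is_elf_executable(path):
                nodes[path] = "blue"
                add_deps(path)
    return nodes, edges


def to_dot(nodes, edges):
    """Serialize the graph as DOT text (paths are assumed to contain no quotes)."""
    lines = ["digraph image {"]
    for path, color in sorted(nodes.items()):
        lines.append(f'  "{path}" [style=filled, fillcolor={color}];')
    for src, dst in sorted(edges):
        lines.append(f'  "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)
```

Marking each library as visited before recursing into it keeps the walk finite even when libraries depend on each other.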

Second Pass over Container Filesystem

Once the DOT graph of the file traversal has been generated, the rest of the analysis happens on the host. We use Radare2 as our primary analysis tool. The initial version of DockerGate used text-based analysis of the object files of the executables. During evaluation, we found that this approach missed several system calls and was not as sophisticated as we would like, so in this iteration we switched to Radare2. Radare2 provides Python APIs that can dissect an executable file or linked library and provide the assembly code for every function. Using a combination of these APIs and text analysis, we search for all the system calls made by each binary or library function.
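As a rough illustration of the text-analysis step, the sketch below scans the disassembly of one function for `syscall` instructions and resolves the system call number from the last immediate loaded into `eax`/`rax`. It is a simplification under stated assumptions, not DockerGate's implementation: it assumes x86-64 Linux, Intel syntax, and one instruction per input line (for example, as obtained from Radare2), and the name table holds only a handful of entries.

```python
import re

# A few x86-64 Linux system call numbers; a real table has hundreds of entries.
SYSCALL_NAMES = {0: "read", 1: "write", 2: "open", 3: "close",
                 59: "execve", 60: "exit"}

MOV_IMM = re.compile(r"^mov\s+(?:eax|rax),\s*(0x[0-9a-f]+|\d+)$")
XOR_SELF = re.compile(r"^xor\s+(eax|rax),\s*\1$")
# Any call, or any other instruction whose first operand is eax/rax,
# is treated as overwriting the register with an unknown value.
CLOBBERS_RAX = re.compile(r"^(?:call\b|\w+\s+(?:eax|rax)\b)")


def parse_immediate(token):
    """Parse a decimal or 0x-prefixed hexadecimal immediate."""
    return int(token, 16) if token.startswith("0x") else int(token)


def syscalls_in_function(disasm_lines):
    """Return the names of the system calls found in one function."""
    found = set()
    last_rax = None  # last known constant in eax/rax, or None if unknown
    for line in disasm_lines:
        text = " ".join(line.lower().split())
        if text == "syscall":
            if last_rax is not None:
                found.add(SYSCALL_NAMES.get(last_rax, f"unknown_{last_rax}"))
            last_rax = None  # rax now holds the return value
            continue
        match = MOV_IMM.match(text)
        if match:
            last_rax = parse_immediate(match.group(1))
        elif XOR_SELF.match(text):
            last_rax = 0
        elif CLOBBERS_RAX.match(text):
            last_rax = None
    return found
```

A scan like this does not see system call numbers that are computed at run time or loaded from memory, so its result is a lower bound on what the function can invoke.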

To avoid repeating the analysis of the same libraries across different images, we also maintain an SQLite3 database that records which system calls each library function makes and which system calls each binary makes. We use the SHA256 hash of the library or binary file as a unique identifier. For example, if a Docker image uses libc-2.23 and this library has been seen before, the hash of the file is compared with the saved hash. If they match, the analysis is skipped, because the system calls each function makes are already known. The same applies to binaries: if bash-2.23 is being used and has already been analyzed, there is no need to analyze it again. This considerably speeds up the analysis of a Docker image, especially after several images have been analyzed.
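The caching idea can be sketched with Python's standard `hashlib` and `sqlite3` modules. The table layout and function names below are assumptions made for illustration. In particular, this sketch stores one set of system calls per file, whereas DockerGate also records them per library function.

```python
import hashlib
import json
import sqlite3


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large libraries are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_cache(db_path=":memory:"):
    """Open (or create) the cache database; the schema is hypothetical."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS analysis ("
        "sha256 TEXT PRIMARY KEY, syscalls TEXT NOT NULL)"
    )
    return conn


def syscalls_for(conn, path, analyze):
    """Return the system calls for `path`, running `analyze` only on a cache miss."""
    digest = sha256_of(path)
    row = conn.execute(
        "SELECT syscalls FROM analysis WHERE sha256 = ?", (digest,)
    ).fetchone()
    if row is not None:
        return set(json.loads(row[0]))  # seen before: skip the analysis
    result = set(analyze(path))
    conn.execute(
        "INSERT INTO analysis (sha256, syscalls) VALUES (?, ?)",
        (digest, json.dumps(sorted(result))),
    )
    conn.commit()
    return result
```

Keying on the content hash instead of the file name means that identical copies of a library found in different images, or under different paths, share one cache entry.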

Please check out dockergate-automated-seccomp.pdf for the complete design.
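The policy generation phase is not described in detail here. As a minimal sketch of what its output might look like, the code below turns a set of collected system call names into a Docker Seccomp profile that denies everything by default and allows only the listed calls. The function names are hypothetical, and the choice of `SCMP_ACT_ERRNO` as the default action is an assumption, not necessarily what DockerGate emits.

```python
import json


def build_seccomp_profile(allowed_syscalls):
    """Build a deny-by-default Docker Seccomp profile as a dictionary."""
    return {
        "defaultAction": "SCMP_ACT_ERRNO",  # calls not listed fail with an error
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


def write_profile(allowed_syscalls, out_path):
    """Write the profile as JSON so it can be passed to Docker."""
    with open(out_path, "w") as f:
        json.dump(build_seccomp_profile(allowed_syscalls), f, indent=2)
```

A profile written this way can be applied when starting a container with `docker run --security-opt seccomp=<profile file> <image>`.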

How to run dockergate

Execute all the commands in install.txt to set up the environment.
Run dockergate_start.sh image

