Description
I ran into an issue in a project where my builds, run through docker-compose, seemed to be taking an awfully long time (around 60 seconds) during the context build/upload stage. strace showed a ton of time was being spent stat()ing files that were matched by my .dockerignore rules, which I found curious.
Oddly, when I simply used docker build to build the container, I didn't have this issue, and context build/upload took about 3-5 seconds. I couldn't figure out what was going wrong, so I investigated docker-py, and found that almost all of my execution time was spent in this get_paths call.
It appears that the difference in execution time is because docker-py's implementation of dockerignore/tar exclusion is far slower than Docker's:
Docker's implementation of the dockerignore exclusion algorithm (seen here) walks through each folder, but does not descend into a directory that matches an exclusion pattern. Meanwhile, docker-py first builds an array of every single file in the context folder, and then applies a filter to that array. This seems to be what causes the massive difference in execution time when I build my project: docker-py iterates over thousands of files that Docker correctly skips.
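To make the difference concrete, here is a minimal sketch of the two traversal shapes. It is not docker-py's or Docker's actual code: the function names (`is_excluded`, `walk_then_filter`, `walk_with_pruning`) are hypothetical, and pattern matching is simplified to `fnmatch` against a path and its parent directories, which is looser than Docker's real rules (for example, `*` here can cross `/`). Exception rules (`!foo`) are ignored entirely.

```python
import fnmatch
import os


def is_excluded(rel_path, patterns):
    """True if rel_path, or any of its parent directories, matches a pattern.

    Simplified matching for illustration only; not Docker's exact semantics.
    """
    parts = rel_path.split(os.sep)
    prefixes = ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]
    return any(fnmatch.fnmatch(p, pat) for p in prefixes for pat in patterns)


def _join(rel_dir, name):
    # os.path.relpath returns "." for the root itself; avoid "./name" paths.
    return name if rel_dir == "." else os.path.join(rel_dir, name)


def walk_then_filter(root, patterns):
    """Collect every file in the tree first, then filter the full list."""
    all_paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        all_paths.extend(_join(rel_dir, f) for f in filenames)
    return sorted(p for p in all_paths if not is_excluded(p, patterns))


def walk_with_pruning(root, patterns):
    """Skip excluded directories during the walk so they are never entered."""
    kept = []
    for dirpath, dirnames, filenames in os.walk(root, topdown=True):
        rel_dir = os.path.relpath(dirpath, root)
        # Assigning to dirnames[:] in place tells os.walk not to descend
        # into the directories that were removed from the list.
        dirnames[:] = [
            d for d in dirnames if not is_excluded(_join(rel_dir, d), patterns)
        ]
        kept.extend(
            _join(rel_dir, f)
            for f in filenames
            if not is_excluded(_join(rel_dir, f), patterns)
        )
    return sorted(kept)
```

Both functions return the same list for plain exclusion patterns, but the first one still lists and stats everything under an ignored directory such as node_modules before discarding it, while the second never enters it.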
I started on a fix, using what I believe are the same rules as Docker's algorithm: thomasboyt@9f302f6
This runs just as fast as Docker's implementation, but doesn't fully implement exception rules (e.g. !foo), leading it to fail a few tests. Before I go through and add this feature, I wanted to confirm that I'm on the right path (and that no one else has a better solution/algorithm to apply).
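For what it's worth, the reason exception rules complicate pruning is that an excluded directory can only be skipped safely if no `!` pattern could re-include something beneath it. One possible conservative check is sketched below. `can_prune` is a hypothetical helper, not something from docker-py or Docker, and it assumes patterns use `/` separators, that the leading `!` has already been stripped, and that any pattern containing `**` is treated as "might match anywhere".

```python
import fnmatch


def can_prune(rel_dir, exceptions):
    """Return True only if no exception pattern could match a path under rel_dir.

    rel_dir is an already-excluded directory, relative to the context root.
    exceptions are the '!' patterns with the leading '!' removed.
    """
    dir_parts = rel_dir.split("/")
    for exc in exceptions:
        exc_parts = exc.split("/")
        # '**' can span any number of directories, so assume it may apply here.
        if "**" in exc_parts:
            return False
        # A pattern with no more segments than the directory itself cannot
        # name anything that lives inside the directory.
        if len(exc_parts) <= len(dir_parts):
            continue
        # Compare segment by segment; the pattern reaches into rel_dir only
        # if each leading segment matches the corresponding directory name.
        if all(fnmatch.fnmatch(d, e) for d, e in zip(dir_parts, exc_parts)):
            return False
    return True
```

With this check, a walker would descend into an excluded directory only when `can_prune` returns False, and would otherwise skip it as before. It errs on the side of walking too much rather than dropping a re-included file, and it does not model the order in which rules are applied.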