Use shallow git clone in Dockerfiles #4117

Closed
cancan101 opened this issue Aug 31, 2016 · 10 comments
@cancan101
Contributor

cancan101 commented Aug 31, 2016

git clone --depth 1 --shallow-submodules ...

EDIT: It looks like --shallow-submodules has to be omitted for the version of git in the image. Omitting it does not seem to affect the size or the speed.

@vrv

vrv commented Aug 31, 2016

Will that work if we're operating at a specific commit that's not HEAD for our git repos?

@cancan101
Contributor Author

The Dockerfile specifies a branch at clone time.
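
To illustrate the point about cloning a named branch, here is a minimal sketch that shallow-clones the tip of a branch from a throwaway local repository. The branch name `release`, the file names, and the `demo` identity are hypothetical placeholders, not taken from the TensorFlow Dockerfiles. It uses only standard git options (`--depth`, `--branch`) and deliberately leaves out `--shallow-submodules`, which the thread found unsupported by the git version in the image.

```shell
#!/bin/sh
# Sketch: shallow-clone the tip of a named branch from a throwaway local repo.
set -eu

work=$(mktemp -d)
origin="$work/origin"

# Build a small origin repository: two commits on the default branch,
# then one more on a hypothetical branch called "release".
git init -q "$origin"
cd "$origin"
commit() {
    echo "$1" > file.txt
    git add file.txt
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "$1"
}
commit "first"
commit "second"
git checkout -q -b release
commit "release work"
release_tip=$(git rev-parse HEAD)

# --depth is ignored for plain local paths, so use a file:// URL to
# go through the same transport code path as a network clone.
cd "$work"
git clone -q --depth 1 --branch release "file://$origin" shallow

shallow_branch=$(git -C shallow rev-parse --abbrev-ref HEAD)
shallow_tip=$(git -C shallow rev-parse HEAD)
shallow_commits=$(git -C shallow rev-list --count HEAD)

echo "branch=$shallow_branch commits=$shallow_commits"

# Clean up the scratch directory.
cd /
rm -rf "$work"
```

The clone ends up on the requested branch with a single commit of history, which is the tip of that branch at clone time. Pinning to an arbitrary older commit is a different case and is not covered by this sketch.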

@vrv

vrv commented Aug 31, 2016

Okay, can you give it a try with and without the shallow-submodules option and show us the time difference?

vrv added the "stat:awaiting response" (Status - Awaiting response from author) label Aug 31, 2016

@cancan101
Contributor Author

Sorry, I should add this has two benefits:

  1. faster build
  2. more importantly: smaller Docker image. Right now it's > 3 GB.

@vrv

vrv commented Aug 31, 2016

  1. How much faster?
  2. How is this related to "Cleanup Bazel Cache in Dockerfile" (#4116)?

@cancan101
Contributor Author

cancan101 commented Aug 31, 2016

  1. On my local machine: 3.8 s vs. 12.1 s.
  2. Related: it is one more step toward cutting bloat in the image. I hope that one cuts 2+ GB; this one saves a more modest 50 MB.
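
For anyone who wants to reproduce this kind of comparison, the following is a rough sketch of how the size difference between a full and a `--depth 1` clone could be measured. It runs against a synthetic local repository filled with random data rather than the TensorFlow repository, so the numbers it prints say nothing about the 50 MB figure above. The file name `blob.bin` and the `demo` identity are made up for the example.

```shell
#!/bin/sh
# Sketch: compare the on-disk size of a full clone and a --depth 1 clone.
set -eu

work=$(mktemp -d)
origin="$work/origin"

git init -q "$origin"
cd "$origin"
# Five commits, each replacing a 200 KiB file of random (incompressible)
# bytes, so that history dominates the repository size.
i=1
while [ "$i" -le 5 ]; do
    dd if=/dev/urandom of=blob.bin bs=1024 count=200 2>/dev/null
    git add blob.bin
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "revision $i"
    i=$((i + 1))
done

# file:// URLs are needed because --depth is ignored for plain local paths.
cd "$work"
git clone -q "file://$origin" full
git clone -q --depth 1 "file://$origin" shallow

# du -sk reports the kibibytes used by each clone's .git directory.
full_kb=$(du -sk full/.git | cut -f1)
shallow_kb=$(du -sk shallow/.git | cut -f1)

echo "full clone .git:    ${full_kb} KiB"
echo "shallow clone .git: ${shallow_kb} KiB"

# Clean up the scratch directory.
cd /
rm -rf "$work"
```

To compare wall-clock time as well, each `git clone` line can be prefixed with the shell's `time` where it is available. Against a real remote, network conditions will affect the result, so a few repeated runs are advisable.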

aselle removed the "stat:awaiting response" (Status - Awaiting response from author) label Sep 9, 2016

@aselle
Contributor

aselle commented Sep 9, 2016

@caisq, could you take a look at integrating this? The devel Docker image is pretty insanely huge right now. Somebody nearby filled their disk and wedged their machine installing that Docker image :).

@cancan101
Contributor Author

Any thoughts on this?

aselle added the "stat:awaiting tensorflower" (Status - Awaiting response from tensorflower) label Sep 22, 2016

@caisq
Contributor

caisq commented Sep 26, 2016

On my todo list. Thanks.

aselle removed the "stat:awaiting tensorflower" (Status - Awaiting response from tensorflower) label Sep 26, 2016

drpngx pushed a commit to drpngx/tensorflow that referenced this issue Oct 17, 2016
1) Clean up large Bazel build cache. Total filesystem size reduction as seen by du -sh /:
  devel image: 1.5 GB (Before: 2.9 GB; After: 1.4 GB)
  devel-gpu image: 2.3 GB (Before: 4.7 GB; After: 2.4 GB)
2) Using nvidia-docker for GPU docker build.
3) Upgrade Bazel version from 0.3.1 to 0.3.2.
4) Add missing libcurl3-dev build dependency to devel images.
5) Add scipy and sklearn to Dockerfile.devel-gpu to enhance consistency with other image types (e.g., Dockerfile.devel).
6) Remove the obsolete and unnecessary --recurse-submodules flag for git clone.

Related to GH issues: tensorflow#4116 and tensorflow#4117

However, not using the "git clone --depth 1" suggested by issue tensorflow#4117, because the size of the git repo is only reduced by about 50 MB by the "--depth 1" flag. This space saving is small compared to the space saving due to bazel cache removal. The complete history of the git repo can be useful for certain development purposes.
Change: 136302103
@caisq
Contributor

caisq commented Oct 19, 2016

We decided not to do shallow git clone, because

  1. the space saving is not significant compared to the saving from removing the Bazel build cache, as implemented in 8a72421 for issue "Cleanup Bazel Cache in Dockerfile" (#4116)
  2. the git history may be useful for certain development purposes.

Closing this issue.
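
As a side note on the history argument: a shallow clone can be converted to a full one later with `git fetch --unshallow`, at the cost of a second download. The sketch below demonstrates this on a throwaway local repository. The directory names and the `demo` identity are hypothetical, and it is only an illustration of the git behavior, not something the maintainers adopted.

```shell
#!/bin/sh
# Sketch: start from a --depth 1 clone, then restore the full history.
set -eu

work=$(mktemp -d)
origin="$work/origin"

# Origin repository with three commits.
git init -q "$origin"
cd "$origin"
for msg in one two three; do
    echo "$msg" > file.txt
    git add file.txt
    git -c user.name=demo -c user.email=demo@example.com commit -q -m "$msg"
done

# file:// URLs are needed because --depth is ignored for plain local paths.
cd "$work"
git clone -q --depth 1 "file://$origin" clone
before=$(git -C clone rev-list --count HEAD)

# Fetch the remaining history; git removes .git/shallow once it is complete.
git -C clone fetch -q --unshallow
after=$(git -C clone rev-list --count HEAD)
if [ -f clone/.git/shallow ]; then
    still_shallow=yes
else
    still_shallow=no
fi

echo "commits before=$before after=$after still_shallow=$still_shallow"

# Clean up the scratch directory.
cd /
rm -rf "$work"
```

This only helps when the container has network access to the remote at the time the history is needed, which may be why shipping the full history in the image was considered more convenient.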

@caisq caisq closed this as completed Oct 19, 2016
caisq added a commit to caisq/tensorflow that referenced this issue Oct 21, 2016