Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dockerfile: don't cache s4tf toolchain tar file #49

Merged
merged 1 commit into from Mar 25, 2019
Merged

Dockerfile: don't cache s4tf toolchain tar file #49

merged 1 commit into from Mar 25, 2019

Conversation

vvmnnnkv
Copy link
Contributor

Current Dockerfile steps:

  1. Install / update system packages & pip
  2. Download pre-compiled s4tf toolchain using curl & untar it
  3. Copy swift-jupyter source code
  4. Install dependencies from requirements*.txt
  5. Register kernel

This approach has downside in case someone wants to rebuild the container often in order to get the latest version of s4tf toolchain. As the curl command in step 2 is cached by docker build system, it's only possible to re-download s4tf by disabling build cache (docker build --no-cache), which requires to re-do all the steps in Dockerfile; or by supplying different swift_tf_url argument value, e.g. by adding a random query param to the url, which is not very convenient and will also force docker build re-do all subsequent steps after step 2.

Suggested change is to re-order steps and to use Dockerfile ADD command so that docker build always checks for the latest s4tf toolchain but still caches other steps as much as possible.
New order:

  1. Install / update system packages & pip
  2. Copy requirements*.txt and install dependencies from them (doing that before copying all sources allows to cache this step separately, so it's not executed even if code was updated but requirements*.txt were not updated)
  3. Copy swift-jupyter source code
  4. Download s4tf toolchain using ADD command (that checks if the file has changed)
  5. Untar s4tf
  6. Register kernel

Suggested change has own downsides that might make it not acceptable depending on the docker image use-cases:

  1. ADD command fully downloads the file to check if it has changed, unfortunately it's not smart enough yet to consider E-Tag header: Dockerfile ADD remote url does not use any HTTP header so always re-downloads moby/moby#15717. Since this download occurs every time you build the image, it might be too annoying in case one needs to build the image much more often than s4tf is updated.
  2. Docker image size is increased by the size of s4tf toolchain .tar file; despite file is removed after ADD command, it is still stored in layer created by ADD command (can be avoided by squashing the image, but --squash option is still experimental).

@marcrasi marcrasi self-requested a review March 24, 2019 17:23
Copy link
Contributor

@marcrasi marcrasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks for doing this! I'm going to run the tests (which use the Dockerfile) and then merge this.

@marcrasi
Copy link
Contributor

  • ADD command fully downloads the file to check if it has changed, unfortunately it's not smart enough yet to consider E-Tag header: moby/moby#15717. Since this download occurs every time you build the image, it might be too annoying in case one needs to build the image much more often than s4tf is updated.

Something I could do is upload another file next to the build that has a hash of the build. Then we can change the ADD command to look at the small hash file instead of the big toolchain file, so that it can quickly tell if anything has changed. I think I'll do this some time in the next few days.

@marcrasi marcrasi merged commit ea1ba94 into google:master Mar 25, 2019
realdoug pushed a commit to realdoug/swift-jupyter that referenced this pull request Mar 28, 2019
Added a proposal for a high-level design for graph program extraction.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants