Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid code duplication in Dockerfiles for Jupyter notebook images #321

Closed
jlewi opened this issue Mar 1, 2018 · 1 comment · Fixed by #366
Closed

Avoid code duplication in Dockerfiles for Jupyter notebook images #321

jlewi opened this issue Mar 1, 2018 · 1 comment · Fixed by #366

Comments

@jlewi
Copy link
Contributor

jlewi commented Mar 1, 2018

We build two versions of the Jupyter notebook images; one for CPU and one for GPU.

There's a lot of code duplication (e.g. list of python packages to install).

To avoid this duplication we should create a common shell script that can be invoked in the Dockerfile to perform the steps that should be the same.

This will help ensure there's less divergence over time.

@jlewi
Copy link
Contributor Author

jlewi commented Mar 3, 2018

/cc @ojarjur

ojarjur added a commit to ojarjur/kubeflow that referenced this issue Mar 5, 2018
This change moves almost all of the commands run during the
building of our CPU and GPU Docker images into a single, shared
script called `install.sh`.

There are currently only two differences between the two image
types (those being the base image and the version of the
tensorflow build to install). The rest of the contents of the
Dockerfiles is meant to be identical.

However, maintaining two nearly-identical sets of instructions
is not practical. It requires that each change be duplicated
in both files, and we've seen that, in practice, this doesn't
happen reliably and the files drift over time.

By placing the bulk of the common code into a shared file, we
can reduce the risk of that happening in the future, while also
reducing the amount of work developers need to do when making
a change.

However, this change does have one downside; since all of the
install steps happen in a single Docker `RUN` command, modifying
any of them will result in the entire image having to be rebuilt.

This fixes kubeflow#321
ojarjur added a commit to ojarjur/kubeflow that referenced this issue Mar 6, 2018
This change eliminates the duplication between the two Dockerfiles
(one for CPU-only instances and one for GPU-enabled instances) by
eliminating the second file.

The two files were meant to differ in two ways:

1. The base image used (ubuntu vs. nvidia/cuda)
2. The tensorflow package installed (tf-nightly vs. tf-nightly-gpu)

It turns out that we can just make both of those arguments (defined
by an `ARG` statement with a default value), and then the two
files can be merged into one.

The default values chosen for the args are the ones for the CPU
image, so running `docker build .` without any `--build-arg` flags
will produce the CPU image.

To build a GPU image, pass in the corresponding values for the
`BASE_IMAGE` and `TF_PACKAGE` arguments using `--build-arg` flags.

This fixes kubeflow#321
k8s-ci-robot pushed a commit that referenced this issue Mar 7, 2018
* Consolidate down to a single Dockerfile

This change eliminates the duplication between the two Dockerfiles
(one for CPU-only instances and one for GPU-enabled instances) by
eliminating the second file.

The two files were meant to differ in two ways:

1. The base image used (ubuntu vs. nvidia/cuda)
2. The tensorflow package installed (tf-nightly vs. tf-nightly-gpu)

It turns out that we can just make both of those arguments (defined
by an `ARG` statement with a default value), and then the two
files can be merged into one.

The default values chosen for the args are the ones for the CPU
image, so running `docker build .` without any `--build-arg` flags
will produce the CPU image.

To build a GPU image, pass in the corresponding values for the
`BASE_IMAGE` and `TF_PACKAGE` arguments using `--build-arg` flags.

This fixes #321

* Move the TF_PACKAGE arg down to right before it is used

* Update the release steps to use the consolidated Dockerfile
yanniszark pushed a commit to arrikto/kubeflow that referenced this issue Feb 15, 2021
…low#321)

* Add doc for tfjob and pytorch examples in Katib

* Add contents

* Fix README

* Fix link to examples in README

* Fix README

* Add information about Katib UI and status of StudyJob

* Add Ambassador information
elenzio9 pushed a commit to arrikto/kubeflow that referenced this issue Oct 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
1 participant