Upgrade Dockerfile assembler system #24051

Merged 4 commits into tensorflow:master on Dec 1, 2018

Conversation

angerson (Contributor)

This is a big upgrade to the Dockerfile assembler I wrote a couple of
months ago. The spec has changed, the script has been rewritten, and
there are new features throughout:

  • The assembler can build and upload images to Docker Hub.
  • The assembler can also run tests (!), although the testing system is
    extremely rudimentary. It could be expanded with parallelism later, if
    execution time becomes a problem.
  • spec.yml is totally different, and now defines both dockerfiles and
    images. It handles the combinatorial explosion of multiple optional features
    without excessive duplication, unlike the previous spec format (see the
    sketch after this list).
  • Partials are the same, but I dumped the extensive dockerfile
    documentation support because I don't think anyone would have used it.
  • Dockerfiles are handled under the same kind of system as images, which
    is neat. The new Dockerfiles aren't so duplicated.
  • I've upgraded the images with new tensorflow tutorial files (jupyter
    only) and fixed some others that didn't actually work.
  • I've improved the development documentation by suggesting aliases.
  • Added "static-dockerfiles" directory to track independent Dockerfiles.

These changes should better support changes like #23194.
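
To make the combinatorial idea concrete, here is a minimal, purely illustrative Python sketch (not the assembler's actual code, and not the real spec.yml schema; the slice names below are hypothetical) of how optional feature "slices" can be crossed into image tags without duplicating any Dockerfile content:

import itertools

# Hypothetical feature slices; the real spec.yml uses its own schema.
slices = {
    "device": ["", "gpu"],
    "python": ["", "py3"],
    "frontend": ["", "jupyter"],
}

# Each tag is one choice from every slice; an empty string means "feature off".
for combo in itertools.product(*slices.values()):
    tag = "-".join(part for part in combo if part) or "latest"
    print(tag)  # e.g. "gpu-py3-jupyter"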

@angerson (Contributor, Author)

The full suite of versioned images is available at https://hub.docker.com/r/angersson/tensorflow/tags/, e.g.:

docker run -it --rm --runtime=nvidia -u $(id -u):$(id -g) angersson/tensorflow:zeus-gpu bash

gunan (Contributor) previously approved these changes on Nov 30, 2018 and left a comment:


Mostly looks good. Some terminology and function names could be better, but apart from one issue, I have no big objections.

@angerson (Contributor, Author)

Thanks for the review. I fixed the problems you noted, plus a few more small issues.

@angerson added the "ready to pull (PR ready for merge process)" label on Nov 30, 2018
@tensorflow-copybara merged commit 9152bfc into tensorflow:master on Dec 1, 2018
tensorflow-copybara pushed a commit that referenced this pull request on Dec 1, 2018
@angerson mentioned this pull request on Dec 3, 2018
@angerson deleted the angerson-tagger branch on December 3, 2018 at 18:51
@maystroh commented on Dec 4, 2018

I created a Singularity image using this command:

singularity build angersson_gpu.simg docker://angersson/tensorflow:zeus-gpu-py3

But when I launch this script:

import tensorflow as tf

# Pin the graph to the first GPU and multiply two small matrices.
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

# Run the graph and print the 2x2 result.
with tf.Session() as sess:
    print(sess.run(c))

it gives me this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/volume/test_gpu.py", line 1, in <module>
    import tensorflow as tf
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/__init__.py", line 49, in <module>
    from tensorflow.python import pywrap_tensorflow
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
    raise ImportError(msg)
ImportError: Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/usr/lib/python3.5/imp.py", line 242, in load_module
    return load_dynamic(name, filename, file)
  File "/usr/lib/python3.5/imp.py", line 342, in load_dynamic
    return _load(spec)
ImportError: libcuda.so.1: cannot open shared object file: No such file or directory

@angerson (Contributor, Author) commented on Dec 4, 2018

@maystroh Are you using the nvidia-docker2 runtime to launch the GPU images? I don't know how singularity works, but that looks like the same kind of failure that occurs when you don't use --runtime=nvidia.
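
As a quick sanity check (a hedged suggestion, using the TF 1.x API of that era), running something like this inside the container should confirm whether the CUDA runtime is actually visible:

import tensorflow as tf

# If libcuda.so.1 isn't mounted into the container (e.g. when the NVIDIA
# runtime isn't enabled), the import itself fails with the error above;
# otherwise this prints whether a GPU device is usable.
print("GPU available:", tf.test.is_gpu_available())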

@maystroh commented on Dec 4, 2018

It works when I add --nv to my Singularity command. Thanks @angerson
