Basic Tensorflow Support

Added some initial tf tools
Implemented UI
Fixes for tensorflow 0.10
Removed tf-slim as it's not part of the 0.10 master
Added the lmdb reader with a tf.cond that needs replacement
Implemented train and val separation with templating
Fixed issue with dequeueing both runners by pulling both graphs
Implemented training and validation rhythm
Added support for both png and jpg and added 16 bit support
Implemented mean subtraction - but needs rework to load as constant
Added an optimized implementation of mean subtraction
Further optimized the mean loading by using a shared constant
Wrapped the data loader in a factory to easily support more data types
Implemented cropping
Implemented floating point support. Implemented separate LMDB database. Implemented regression support. Added some brief nosetests. Need to invoke accuracy only on classification though.
Implemented variable restoration. Needs thorough testing
Implemented inferencing, not entirely polished
Moved some code into functions, started on modularization a bit
Implemented digits custom helper functions
Implemented custom printing ops
Implemented autoencoder
Total rewrite of summaries
Implemented output to console from scalar summaries
Fixes for summary outputs: only simple scalar values are parsed to console
Implemented binary segmentation and necessary fixes
Some updates on binary seg
Implemented all possible optimizers and started work on learning rate shaper
Started work on the lr policies
Fixes for learning_rates, implemented optimizers, tested variable summary output to UI
Implemented and tested all learning rates and optimizers
Introduces new model definition and improvements in loss handling and graph layout
Major refactoring of main code. Implemented new model description. Implemented and tested inferencing. Implemented and tested weight/snapshot loading.
All-round minor updates and fixes
Fixes in summary cumulator and implemented an RNN model
Fixes for mean subtraction in tf and tf-ui, implemented data order selection in image-view extension
Implemented support for mean file of format: png, jpg, binaryproto - the latter being the default that DIGITS will provide.
Added support for runtime statistics and some all-round fixes
Added static tensorboard style network visualization for tensorflow. Added output of traces (no vis yet). Added a loader while waiting for network vis. Minor syntax cleanups.
Implemented alexnet standard network
Pulled in updates for travis build and added tensorflow install
Added two more files for Mr Travis
Implemented tensorflow configuration
Added tf config to doc
Fixes for ubuntu deployment of tf. Moved tf tools
Fixes for tf ubuntu
Fixes for tf ubuntu
Some fixes and updates for TF in Travis
Fix in network viz test
Implemented default single-gpu support and some nosetests
Fixes for inference
Added siamese network, bugfixes, minor features, some utility tf functions
Added siamese network and example png
Better error-ui format for network viz
Added an alternative simpler siamese network that doesn't need a separate db, minor error update
Preliminary version of hdf5 implemented
Implemented fine-tuning by renaming variables
Implemented visualisation of variables and the activations of the Ops they belong to.
Fix in inf vis naming
Fixes in visualisation shapes and naming
Implemented softmax upon classification
Implemented all nosetests for tf classification, and many all-round bugfixes
Implemented generic nosetests - some need work
Fix for travis to find python exe
Implemented a better file format deducer, and implemented a bare minimal TFRecord-reader
Added top_n accuracy shortcut
Implemented on-line data augmentation for TF, 5 types. Some minor bugfixes. Need to do something with image whitening though during validation and inf.
Added tensorflow data augmentation test
Minor fixes and improvements from linter
Implemented minimal and bare multigpu and fixes to get it running for greg
Preliminary version of tfrecord writer for classification
Some changes to optimize dataloading for tfr
More fixes for tfrecords
Fix generic data loading
Minor breaking changes but updates in namescoping
Implemented new model structure. Improvements to multi-gpu handling. Updates to namespaces. Implemented accounting for regularization. Many all-round updates
Implemented proper visualisation for gpu devices
Minor updates and converted alexnet and vgg16 to new format
Fix in tfrecord shape
WIP on timeline traces
Finalized support for tensorflow timeline traces
Fixed alexnet for tf
Fix merge errors
Minify tf-graph-basic.build.js
ethantang95 · Dec 1, 2016 · ea25e1a
1 parent 1bce875
commit ea25e1a
Showing 75 changed files with 16,080 additions and 180 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,6 +1,7 @@
 # Temporary files
 *.swp
 *~
+.DS_Store
 TAGS
 
 # Compiled / optimized files

diff --git a/.gjslintrc b/.gjslintrc
@@ -1,3 +1,3 @@
 --max_line_length=120
---exclude_directories=3rdparty
+--exclude_directories=3rdparty,tb
 --disable=0121,0220
diff --git a/.travis.yml b/.travis.yml
@@ -18,6 +18,7 @@ env:
         - DIGITS_TEST_FRAMEWORK=caffe CAFFE_FORK=NVIDIA
         - DIGITS_TEST_FRAMEWORK=caffe CAFFE_FORK=BVLC
         - DIGITS_TEST_FRAMEWORK=torch
+        - DIGITS_TEST_FRAMEWORK=tensorflow
         - DIGITS_TEST_FRAMEWORK=none
 
 matrix:
@@ -82,6 +83,7 @@ addons:
             - cmake
             - cython
             - git
+            - gfortran
             - graphviz
             - libboost-filesystem-dev
             - libboost-python-dev
@@ -128,6 +130,7 @@ install:
     - echo "backend:agg" > ~/.config/matplotlib/matplotlibrc
     - ./scripts/travis/install-caffe.sh $CAFFE_ROOT
     - if [ "$DIGITS_TEST_FRAMEWORK" == "torch" ]; then travis_wait ./scripts/travis/install-torch.sh $TORCH_ROOT; else unset TORCH_ROOT; fi
+    - if [ "$DIGITS_TEST_FRAMEWORK" == "tensorflow" ]; then travis_wait ./scripts/travis/install-tensorflow.sh; fi
     - pip install -r ./requirements.txt
     - pip install -r ./requirements_test.txt
     - pip install -e .

diff --git a/digits/config/__init__.py b/digits/config/__init__.py
@@ -12,6 +12,7 @@
     torch,
     server_name,
     store_option,
+    tensorflow,
 )
 
 
@@ -20,3 +21,4 @@ def config_value(option):
     Return the current configuration value for the given option
     """
     return option_list[option]
+
diff --git a/digits/config/tensorflow.py b/digits/config/tensorflow.py
@@ -0,0 +1,46 @@
+# Copyright (c) 2016, NVIDIA CORPORATION.  All rights reserved.
+from __future__ import absolute_import
+
+import os
+import platform
+from subprocess import Popen, PIPE
+
+from . import option_list
+
+VARNAME_ENV_TFPY = 'TENSORFLOW_PYTHON'
+DEFAULT_PYTHON_EXE = 'python2' # @TODO(tzaman) - use the python executable that was used to launch digits?
+
+if platform.system() == 'Darwin':
+    # DYLD_LIBRARY_PATH and LD_LIBRARY_PATH are sometimes stripped, and the CUDA libraries might need them
+    if "DYLD_LIBRARY_PATH" not in os.environ:
+        if "CUDA_HOME" in os.environ:
+            os.environ["DYLD_LIBRARY_PATH"] = os.path.join(os.environ["CUDA_HOME"], 'lib')
+
+def test_tf_import(python_exe):
+    """
+    Test whether tensorflow can be imported; return (success, error output).
+    """
+    p = Popen([python_exe, "-c", "import tensorflow"], stdout=PIPE, stderr=PIPE)
+    (out, err) = p.communicate()
+    return p.returncode == 0, str(err)
+
+if VARNAME_ENV_TFPY in os.environ:
+    tf_python_exe = os.environ[VARNAME_ENV_TFPY]
+else:
+    tf_python_exe = DEFAULT_PYTHON_EXE
+
+tf_enabled, err = test_tf_import(tf_python_exe)
+
+if not tf_enabled:
+    print('Tensorflow support disabled.')
+#    print('Failed importing Tensorflow with python executable "%s"\n%s' % (tf_python_exe, err))
+
+if tf_enabled:
+    option_list['tensorflow'] = {
+        'enabled': True,
+        'executable': tf_python_exe,
+    }
+else:
+    option_list['tensorflow'] = {
+        'enabled': False,
+    }
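
The probe in `digits/config/tensorflow.py` launches a separate interpreter to check whether a module imports, so a broken install surfaces as a return code rather than an exception in the server process. The standalone sketch below illustrates that pattern under a few assumptions: the name `can_import` and its `module` parameter are illustrative and not part of this commit, and stderr is decoded so the message stays readable on Python 3.

```python
import subprocess
import sys


def can_import(python_exe, module):
    """Return (ok, error_text) for importing `module` under `python_exe`.

    Hypothetical helper, not the commit's API: it generalizes the
    tensorflow-only probe to any module name.
    """
    proc = subprocess.Popen(
        [python_exe, "-c", "import " + module],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    # communicate() waits for the child and collects its output
    _, err = proc.communicate()
    return proc.returncode == 0, err.decode("utf-8", "replace")


if __name__ == "__main__":
    ok, err = can_import(sys.executable, "json")
    print("json importable:", ok)
```

Running the probe in a child process also means the module is never loaded into the parent, which matters when the import itself is slow or grabs GPU resources.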
diff --git a/digits/dataset/images/classification/forms.py b/digits/dataset/images/classification/forms.py
@@ -21,13 +21,16 @@ class ImageClassificationDatasetForm(ImageDatasetForm):
                                   choices=[
                                       ('lmdb', 'LMDB'),
                                       ('hdf5', 'HDF5'),
+                                      ('tfrecords', 'TFRecords'),
                                   ],
                                   default='lmdb',
                                   )
 
     def validate_backend(form, field):
         if field.data == 'lmdb':
             form.compression.data = 'none'
+        elif field.data == 'tfrecords':
+            form.compression.data = 'none'
         elif field.data == 'hdf5':
             form.encoding.data = 'none'
 

diff --git a/digits/dataset/tasks/create_db.py b/digits/dataset/tasks/create_db.py
@@ -51,8 +51,9 @@ def __init__(self, input_file, db_name, backend, image_dims, **kwargs):
         self.input_file = input_file
         self.db_name = db_name
         self.backend = backend
-        if backend == 'hdf5':
+        if backend == 'hdf5' or backend == 'tfrecords':
             # the list of hdf5 files is stored in a textfile
+            # tfrecords can be sharded as well
             self.textfile = os.path.join(self.db_name, 'list.txt')
         self.image_dims = image_dims
         if image_dims[2] == 3:

diff --git a/digits/extensions/view/imageOutput/config_template.html b/digits/extensions/view/imageOutput/config_template.html
@@ -12,6 +12,12 @@
     {{ form.channel_order(class='form-control') }}
 </div>
 
+<div class="form-group{{mark_errors([form.data_order])}}">
+    {{ form.data_order.label }}
+    {{ form.data_order.tooltip }}
+    {{ form.data_order(class='form-control') }}
+</div>
+
 <div class="form-group{{mark_errors([form.pixel_conversion])}}">
     {{ form.pixel_conversion.label }}
     {{ form.pixel_conversion.tooltip }}

diff --git a/digits/extensions/view/imageOutput/forms.py b/digits/extensions/view/imageOutput/forms.py
@@ -23,6 +23,18 @@ class ConfigForm(Form):
                 'is ignored in the case of a grayscale image)'
     )
 
+    data_order = utils.forms.SelectField(
+        'Data order',
+        choices=[
+            ('chw', 'CHW'),
+            ('hwc', 'HWC'),
+            ],
+        default='chw',
+        tooltip="Set the order of the data. For Caffe and Torch models this "
+                "is often CHW, for Tensorflow it's HWC. "
+                "W=Width, H=Height, C=Channels"
+        )
+
     pixel_conversion = utils.forms.SelectField(
         'Pixel conversion',
         choices=[

diff --git a/digits/extensions/view/imageOutput/view.py b/digits/extensions/view/imageOutput/view.py
@@ -29,6 +29,7 @@ def __init__(self, dataset, **kwargs):
 
         # view options
         self.channel_order = kwargs['channel_order'].upper()
+        self.data_order = kwargs['data_order'].upper()
         self.normalize = (kwargs['pixel_conversion'] == 'normalize')
 
     @staticmethod
@@ -76,8 +77,13 @@ def process_data(self, input_id, input_data, output_data):
         """
         Process one inference and return data to visualize
         """
-        # assume the only output is a CHW image
+
         data = output_data[output_data.keys()[0]].astype('float32')
+
+        if self.data_order == 'HWC':
+            data = data.transpose((2, 0, 1))  # HWC to CHW
+
+        # assume CHW at this point
         channels = data.shape[0]
         if channels == 3 and self.channel_order == 'BGR':
             data = data[[2, 1, 0], ...]  # BGR to RGB
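
The `process_data` change normalizes the network output to CHW before the existing channel handling runs. A small NumPy sketch of the two steps is given below; the helper names `to_chw` and `bgr_to_rgb` are illustrative and do not appear in the commit, and the explicit error for unknown layouts is an added assumption.

```python
import numpy as np


def to_chw(data, data_order):
    """Return `data` as channels-first (CHW), given its current layout."""
    order = data_order.upper()
    if order == "HWC":
        # move the channel axis (2) to the front: (H, W, C) -> (C, H, W)
        return data.transpose((2, 0, 1))
    if order == "CHW":
        return data
    raise ValueError("unsupported data order: %r" % data_order)


def bgr_to_rgb(chw):
    """Reverse the channel axis of a 3-channel CHW array."""
    return chw[[2, 1, 0], ...]


if __name__ == "__main__":
    hwc = np.zeros((4, 5, 3), dtype="float32")
    print(to_chw(hwc, "hwc").shape)
```

Note that `transpose` returns a view, so the conversion itself copies no pixel data; the fancy-indexed channel swap does make a copy.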

diff --git a/digits/frameworks/__init__.py b/digits/frameworks/__init__.py
@@ -3,6 +3,7 @@
 
 from .caffe_framework import CaffeFramework
 from .framework import Framework
+from .tensorflow_framework import TensorflowFramework
 from .torch_framework import TorchFramework
 from digits.config import config_value
 
@@ -19,6 +20,9 @@
 # torch is optional
 torch = TorchFramework() if config_value('torch')['enabled'] else None
 
+# tensorflow is optional
+tensorflow = TensorflowFramework() if config_value('tensorflow')['enabled'] else None
+
 # caffe is mandatory
 caffe = CaffeFramework()
 
@@ -35,6 +39,8 @@ def get_frameworks():
     frameworks = [caffe]
     if torch:
         frameworks.append(torch)
+    if tensorflow:
+        frameworks.append(tensorflow)
     return frameworks
 
 

diff --git a/digits/frameworks/caffe_framework.py b/digits/frameworks/caffe_framework.py
@@ -34,6 +34,8 @@ class CaffeFramework(Framework):
 
     # whether this framework can shuffle data during training
     CAN_SHUFFLE_DATA = False
+    SUPPORTS_PYTHON_LAYERS_FILE = True
+    SUPPORTS_TIMELINE_TRACING = False
 
     if config_value('caffe')['flavor'] == 'NVIDIA':
         if parse_version(config_value('caffe')['version']) > parse_version('0.14.0-alpha'):
@@ -132,10 +134,11 @@ def get_network_from_path(self, path):
         return network
 
     @override
-    def get_network_visualization(self, desc):
+    def get_network_visualization(self, **kwargs):
         """
         return visualization of network
         """
+        desc = kwargs['desc']
         net = caffe_pb2.NetParameter()
         text_format.Merge(desc, net)
         # Throws an error if name is None

diff --git a/digits/frameworks/framework.py b/digits/frameworks/framework.py
@@ -26,6 +26,18 @@ def can_shuffle_data(self):
         """
         return self.CAN_SHUFFLE_DATA
 
+    def supports_python_layers_file(self):
+        """
+        return whether framework supports a Python layers file
+        """
+        return self.SUPPORTS_PYTHON_LAYERS_FILE
+
+    def supports_timeline_traces(self):
+        """
+        return whether framework supports creating timeline traces
+        """
+        return self.SUPPORTS_TIMELINE_TRACING
+
     def supports_solver_type(self, solver_type):
         """
         return whether framework supports this solver_type
@@ -77,7 +89,7 @@ def get_network_from_path(self, path):
         """
         raise NotImplementedError('Please implement me')
 
-    def get_network_visualization(self, desc):
+    def get_network_visualization(self, **kwargs):
         """
         return visualization of network
         """