
Support and build against Keras 2.2.2 and TF 1.10.0 #151

Merged: 15 commits merged into databricks:master from tf-1.10, Aug 22, 2018

Conversation

@lu-wang-dl (Contributor) commented Aug 16, 2018

  • bump Spark version to 2.3.1
  • bump tensorframes version to 0.4.0
  • bump keras==2.2.2 and tensorflow==1.10.0 to fix Travis issues
  • add TF_C_API_GRAPH_CONSTRUCTION=0 as a temporary fix
  • drop support for Spark <2.3 and hence Scala 2.10
  • add Python 3-friendly print
  • add pooling='avg' to the ResNet50 testing model because the Keras API changed
  • test arrays almost equal with precision (decimal) 5 in NamedImageTransformerBaseTestCase, test_bare_keras_module, and keras_load_and_preproc
  • make the Keras model smaller in test_simple_keras_udf

This is a continued work from #149.
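For context, the "precision 5" comparisons mentioned in the description refer to NumPy's approximate array assertion. A minimal sketch of its semantics (the values below are illustrative, not taken from the test suite):

```python
import numpy as np

# np.testing.assert_array_almost_equal(a, b, decimal=d) passes when
# abs(b - a) < 1.5 * 10**-d elementwise, rather than requiring exact
# equality as self.assertTrue(np.all(a == b)) does.
a = np.array([0.123456, 1.0])
b = np.array([0.123460, 1.0])  # differs by 4e-6

np.testing.assert_array_almost_equal(a, b, decimal=5)  # passes: 4e-6 < 1.5e-5

try:
    np.testing.assert_array_almost_equal(a, b, decimal=7)  # too strict
except AssertionError:
    print("decimal=7 fails as expected")
```

This tolerance absorbs the small floating-point drift between TF/Keras versions that was breaking the exact-equality checks.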

@codecov-io commented Aug 16, 2018

Codecov Report

Merging #151 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #151   +/-   ##
=======================================
  Coverage   85.37%   85.37%           
=======================================
  Files          34       34           
  Lines        1922     1922           
  Branches       41       41           
=======================================
  Hits         1641     1641           
  Misses        281      281
Impacted Files Coverage Δ
python/sparkdl/transformers/keras_applications.py 85.71% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6be7772...ef39c1e. Read the comment docs.

@lu-wang-dl (Contributor, Author) commented:

retest

@yogeshg (Contributor) left a comment:

Thanks! I think the description could use a little more context on what changes were introduced here and why. Left a few comments on the code too; let me know what you think! Thanks for tackling this!

@@ -169,6 +169,6 @@ def test_pipeline(self):
         # tfx.write_visualization_html(issn.graph,
         #     NamedTemporaryFile(prefix="gdef", suffix=".html").name)

-        self.assertTrue(np.all(preds_tgt == preds_ref))
+        np.testing.assert_array_almost_equal(preds_tgt, preds_ref, decimal=5)
@yogeshg (Contributor):

can we use featurizerCompareDigitsExact setting centrally, if we want to control this?

@@ -20,3 +20,4 @@ class NamedImageTransformerResNet50Test(NamedImageTransformerBaseTestCase):

     __test__ = True
     name = "ResNet50"
+    poolingMethod = 'avg'
@yogeshg (Contributor):

can we comment why this is necessary in the description of this PR?

-    model.add(Dense(units=10))
-    model.add(Activation('softmax'))
+    # model.add(Dense(64, activation='relu'))
+    model.add(Dense(16, activation='softmax'))
@yogeshg (Contributor):

can we leave a comment here suggesting why we removed these lines?

@lu-wang-dl (Contributor, Author):

Made a comment to explain it.

@@ -21,6 +21,9 @@ env:
     - SPARK_BUILD_URL="https://dist.apache.org/repos/dist/release/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop2.7.tgz"
     - SPARK_HOME=$HOME/.cache/spark-versions/$SPARK_BUILD
     - RUN_ONLY_LIGHT_TESTS=True
+    # TODO: This is a temp fix in order to pass tests.
+    # We should update implementation to allow graph construction via C API.
+    - TF_C_API_GRAPH_CONSTRUCTION=0
@yogeshg (Contributor):

For more context: IIRC, I added this as a debug mechanism, and I do not think it should be a condition for our testing. According to the commit, tensorflow/tensorflow@0f60a7d:

One known difference is improved static shape inference, meaning some shape errors will be surfaced during graph construction instead of at runtime.

Which means we might need to update our assumption that the shape will be checked only at run time, as noted in the comments above.

Also, I have a vague suspicion that we may have been using different versions of tensorflow in tensorframes and this library. Can we confirm that both tensorframes and sparkdl use TF 1.10?

@lu-wang-dl (Contributor, Author):

We haven't released tensorframes with TF 1.10 yet. After the new tensorframes release, we can retest this.

@@ -69,8 +69,9 @@ class NamedImageTransformerBaseTestCase(SparkDLTestCase):
     name = None
     # Allow subclasses to force number of partitions - a hack to avoid OOM issues
     numPartitionsOverride = None
-    featurizerCompareDigitsExact = 6
+    featurizerCompareDigitsExact = 5
@yogeshg (Contributor):

One easy way to make this global is by adding an assertArrayAlmostEqual method to SparkDLTestCase that calls np.testing.assert_array_almost_equal with a default comparison precision, and using that instead of assertEquals. How do you feel about this?
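A minimal sketch of that suggestion (class and method names here are hypothetical, not the actual sparkdl code):

```python
import unittest
import numpy as np

class SparkDLTestCaseSketch(unittest.TestCase):
    """Hypothetical base class illustrating the suggested helper."""
    # Default comparison precision; subclasses can override it
    # (e.g. a model-specific test case could lower it to 4).
    compareDigitsExact = 5

    def assertArrayAlmostEqual(self, actual, expected, decimal=None):
        # Fall back to the class-level default when no precision is given.
        if decimal is None:
            decimal = self.compareDigitsExact
        np.testing.assert_array_almost_equal(actual, expected, decimal=decimal)

class DemoTest(SparkDLTestCaseSketch):
    def test_close_arrays(self):
        # Differs by 1e-7, well within decimal=5 tolerance.
        self.assertArrayAlmostEqual([1.0000001, 2.0], [1.0, 2.0])
```

This centralizes the precision in one place while still letting individual test classes tighten or loosen it.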

@lu-wang-dl (Contributor, Author):

Good idea. It would mean changing every usage of np.testing.assert_array_almost_equal, though, so it might be better to do that in a different PR.

     featurizerCompareDigitsCosine = 1
+    poolingMethod = None
@yogeshg (Contributor):

Is it true that pooling is None in testing and avg in prod? if so, can we clarify why in the description?

@lu-wang-dl (Contributor, Author):

For other models we can use the default poolingMethod, which is None. The only issue is ResNet50: the new Keras version changed the model, so to match what we produced before, we need to add 'avg' pooling to the ResNet50 model.
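As a rough illustration of why the pooling setting matters (pure NumPy; the shapes are assumptions based on ResNet50 with include_top=False on 224x224 inputs, not the actual sparkdl code):

```python
import numpy as np

# ResNet50 with include_top=False emits a 4-D feature map per image,
# e.g. (batch, 7, 7, 2048) for 224x224 inputs. pooling='avg' applies
# global average pooling over the two spatial axes, so downstream code
# sees a flat (batch, 2048) vector instead.
features = np.random.rand(1, 7, 7, 2048)
pooled = features.mean(axis=(1, 2))  # the effect of pooling='avg'
print(features.shape, "->", pooled.shape)  # (1, 7, 7, 2048) -> (1, 2048)
```

With no pooling the featurizer and the Keras reference model produce differently shaped outputs, which is what the shape mismatch in the failing test comes down to.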

@yogeshg (Contributor) commented Aug 21, 2018:

  1. I was looking at the test cases; can you point me to which one fails?
     If I understand correctly, we check the output of this model, as stored in kerasPredict, against tensorflow, resized images, and the deepimagepredictor model. Shouldn't all of them still be the same? If not, which one and why?
  2. Also, should we simply add pooling="avg" in
     def _testKerasModel(self, include_top, pooling=None):
         return resnet50.ResNet50(weights="imagenet", include_top=include_top, pooling=pooling)
     to achieve the same effect, without adding a testing harness that tests against all poolings?

@lu-wang-dl (Contributor, Author):

The test test_featurization_no_reshape fails because kerasReshaped and dfFeatures have different shapes.

https://github.com/databricks/spark-deep-learning/blob/master/python/tests/transformers/named_image_test.py#L186

@lu-wang-dl (Contributor, Author):

About the second point: do we want to support other pooling methods for Keras models in the future?

@jkbradley (Contributor) left a comment:

I just left a minor comment but can finish my review after @yogeshg checks the updates. Also, per his review, can you please update the PR description? Thanks!

 nose>=1.3.7 # for testing
 parameterized>=0.6.1 # for testing
 pillow>=4.1.1,<4.2
 pygments>=2.2.0
-tensorflow==1.6.0
+tensorflow==1.10.0 # NOTE: this package has only been tested with tensorflow 0.10.0
@jkbradley (Contributor):

In note: 0.10.0 -> 1.10.0

@@ -49,6 +49,8 @@


 class GraphPiecesTest(SparkDLTestCase):
+
+    featurizerCompareDigitsExact = 5
@yogeshg (Contributor):

I was thinking we could put this all the way in SparkDLTestCase and use it everywhere we do the array almost comparison. But if this is blocking and we want to put that off for later, fine by me.

@lu-wang-dl (Contributor, Author):

It might not be a good idea to put featurizerCompareDigitsExact in SparkDLTestCase. Because of the different condition numbers, we might need a different value for each model; for InceptionV3 and Xception, we set it to 4. Since np.testing.assert_array_almost_equal already has a default of 6, we may not want to define a different one, unless we define the new assertArrayAlmostEqual as you suggested.

@yogeshg (Contributor):

Aha! I did not know that. That is definitely more complicated.

@yogeshg (Contributor) commented Aug 21, 2018

Thanks for the update, Lu! As a summary of my comments:

  • can we put the testingPrecision in SparkDLTestCase?
  • which test case fails with the standard setting of ResNet50? (follow-up questions inline)
  • do you think we need to make matching changes in the model and _testKerasModel methods of ResNet50?
  • assertArrayAlmostEqual to be introduced in a follow-up PR
  • the tensorframes release to follow or precede this release

@@ -188,7 +188,8 @@ def inputShape(self):
         return (299, 299)

     def _testKerasModel(self, include_top):
-        return xception.Xception(weights="imagenet", include_top=include_top)
+        return xception.Xception(weights="imagenet",
+                                 include_top=include_top)
@yogeshg (Contributor):

Nit: we do not need the newline here, but that's cool.

@yogeshg (Contributor) commented Aug 21, 2018

LGTM pending tests. @jkbradley , LGTY?

@jkbradley (Contributor):

LGTM too, thanks!

@jkbradley (Contributor):

I'll merge this now. Thanks @ludatabricks and @yogeshg !

@jkbradley jkbradley merged commit bd8ff0d into databricks:master Aug 22, 2018
@lu-wang-dl lu-wang-dl deleted the tf-1.10 branch August 22, 2018 16:49