Support and build against Keras 2.2.2 and TF 1.10.0 #151
Conversation
According to a commit in tensorflow (tensorflow/tensorflow@0f60a7d):

> One known difference is improved static shape inference, meaning some shape errors will be surfaced during graph construction instead of at runtime.
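That commit made C-API-based graph construction (with the stricter static shape inference) the default, and the .travis.yml change in this PR opts out of it via an environment variable. As a minimal illustrative sketch (assuming TensorFlow 1.10, where the variable is read when the module is imported), the same opt-out can be expressed in Python:

import os

# Mirror the TF_C_API_GRAPH_CONSTRUCTION=0 setting this PR adds to .travis.yml.
# It must be set before tensorflow is imported; with it disabled, some shape
# errors surface at run time instead of at graph construction.
os.environ["TF_C_API_GRAPH_CONSTRUCTION"] = "0"

import tensorflow as tf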
Codecov Report
@@ Coverage Diff @@
## master #151 +/- ##
=======================================
Coverage 85.37% 85.37%
=======================================
Files 34 34
Lines 1922 1922
Branches 41 41
=======================================
Hits 1641 1641
Misses 281 281
Continue to review full report at Codecov.
retest
Thanks! I think the description could use a little more context on what changes were introduced here and why. I left a few comments on the code part too, lemme know what you think. Thanks for tackling this!
python/tests/graph/test_pieces.py
Outdated
@@ -169,6 +169,6 @@ def test_pipeline(self):
         # tfx.write_visualization_html(issn.graph,
         #     NamedTemporaryFile(prefix="gdef", suffix=".html").name)

-        self.assertTrue(np.all(preds_tgt == preds_ref))
+        np.testing.assert_array_almost_equal(preds_tgt, preds_ref, decimal=5)
Can we use the featurizerCompareDigitsExact setting centrally, if we want to control this?
@@ -20,3 +20,4 @@ class NamedImageTransformerResNet50Test(NamedImageTransformerBaseTestCase):

     __test__ = True
     name = "ResNet50"
+    poolingMethod = 'avg'
Can we explain in the description of this PR why this is necessary?
        model.add(Dense(units=10))
        model.add(Activation('softmax'))
        # model.add(Dense(64, activation='relu'))
        model.add(Dense(16, activation='softmax'))
Can we leave a comment here explaining why we removed these lines?
Made a comment to explain it.
@@ -21,6 +21,9 @@ env:
   - SPARK_BUILD_URL="https://dist.apache.org/repos/dist/release/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop2.7.tgz"
   - SPARK_HOME=$HOME/.cache/spark-versions/$SPARK_BUILD
   - RUN_ONLY_LIGHT_TESTS=True
+  # TODO: This is a temp fix in order to pass tests.
+  # We should update implementation to allow graph construction via C API.
+  - TF_C_API_GRAPH_CONSTRUCTION=0
For more context: IIRC, I added this as a debug mechanism, and I do not think it should become a condition for our testing. According to the commit tensorflow/tensorflow@0f60a7d:

> One known difference is improved static shape inference, meaning some shape errors will be surfaced during graph construction instead of at runtime.

This means we might need to update our assumption that shapes are checked only at run time, as noted in the comments above.
Also, I remember a vague suspicion that we may have been using different versions of tensorflow in tensorframes and this library. Can we confirm that both tensorframes and sparkdl use TF 1.10?
We haven't released tensorframes with TF 1.10 yet. After the new tensorframes release, we can retest this.
@@ -69,8 +69,9 @@ class NamedImageTransformerBaseTestCase(SparkDLTestCase):
     name = None
     # Allow subclasses to force number of partitions - a hack to avoid OOM issues
     numPartitionsOverride = None
-    featurizerCompareDigitsExact = 6
+    featurizerCompareDigitsExact = 5
One easy way to make this global is to add an assertArrayAlmostEqual method to SparkDLTestCase that calls np.testing.assert_array_almost_equal and takes a default comparison precision, and then use that instead of assertEquals. How do you feel about this?
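For illustration, a rough sketch of the suggested helper; the method name comes from the comment above, while the base class body, default precision, and exact placement are assumptions rather than the actual change (SparkDLTestCase is shown here as a plain unittest.TestCase subclass for brevity):

import unittest
import numpy as np

class SparkDLTestCase(unittest.TestCase):
    # Default comparison precision; subclasses could still override it.
    featurizerCompareDigitsExact = 6

    def assertArrayAlmostEqual(self, actual, expected, decimal=None):
        # Fall back to the class-level precision when no explicit value is given.
        if decimal is None:
            decimal = self.featurizerCompareDigitsExact
        np.testing.assert_array_almost_equal(actual, expected, decimal=decimal)

A test could then call self.assertArrayAlmostEqual(preds_tgt, preds_ref) instead of hard-coding a decimal value at each call site.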
Good idea. It may require changing all usages of np.testing.assert_array_almost_equal, so it might be better to do that in a different PR.
    featurizerCompareDigitsCosine = 1
    poolingMethod = None
Is it true that pooling is None in testing and avg in prod? If so, can we clarify why in the description?
For other models we can use the default poolingMethod, which is None. The only issue is ResNet50: the new Keras version changed the model, so to make it match what we did before, we need to add avg pooling to the ResNet50 model.
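As a minimal illustration of the difference (assuming Keras 2.2.2 with a TensorFlow backend; weights=None is used here only to avoid downloading the ImageNet weights):

from keras.applications import resnet50

# Without pooling, the no-top ResNet50 returns the raw 4-D feature map.
no_pool = resnet50.ResNet50(weights=None, include_top=False)
print(no_pool.output_shape)   # e.g. (None, None, None, 2048)

# With pooling='avg', global average pooling collapses it to a 2-D feature vector.
avg_pool = resnet50.ResNet50(weights=None, include_top=False, pooling='avg')
print(avg_pool.output_shape)  # (None, 2048)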
- I was looking at the test cases; can you point me to which one fails? If I understand correctly, we check the output of this model, as stored in kerasPredict, against tensorflow, resized images, and the DeepImagePredictor model. Shouldn't all of them still be the same? If not, which one and why?
- Also, should we simply add pooling="avg" in spark-deep-learning/python/sparkdl/transformers/keras_applications.py (lines 234 to 235 in 4a2ae91)?

  def _testKerasModel(self, include_top, pooling=None):
      return resnet50.ResNet50(weights="imagenet", include_top=include_top, pooling=pooling)
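A sketch of what that alternative might look like (hypothetical; this is the option being discussed, not the merged change, and the method would live on the ResNet50 model class quoted above):

from keras.applications import resnet50

def _testKerasModel(self, include_top):
    # Hard-code average pooling for the ResNet50 test model instead of
    # threading a poolingMethod attribute through the test classes.
    return resnet50.ResNet50(weights="imagenet",
                             include_top=include_top,
                             pooling="avg")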
The test test_featurization_no_reshape fails because kerasReshaped and dfFeatures have different shapes.
About the second point: my question is, do we want to support other pooling methods for Keras models in the future?
I just left a minor comment but can finish my review after @yogeshg checks the updates. Also, per his review, can you please update the PR description? Thanks!
python/requirements.txt
Outdated
 nose>=1.3.7 # for testing
 parameterized>=0.6.1 # for testing
 pillow>=4.1.1,<4.2
 pygments>=2.2.0
-tensorflow==1.6.0
+tensorflow==1.10.0 # NOTE: this package has only been tested with tensorflow 0.10.0
In note: 0.10.0 -> 1.10.0
@@ -49,6 +49,8 @@

 class GraphPiecesTest(SparkDLTestCase):

+    featurizerCompareDigitsExact = 5
I was thinking we could put this all the way up in SparkDLTestCase and use it everywhere we do the almost-equal array comparison. But if this is blocking and we want to put that off for later, fine by me.
It might not be a good idea to put featurizerCompareDigitsExact in SparkDLTestCase. Because of the different condition numbers, we may need a different value for each model; for InceptionV3 and Xception we set it to 4. Since np.testing.assert_array_almost_equal already has a default value of 6, we may not want to define a different one, unless we define the new assertArrayAlmostEqual as you suggested.
Aha! I did not know that. That is definitely more complicated.
Thanks for the update, Lu! As a summary of my comments:
@@ -188,7 +188,8 @@ def inputShape(self):
         return (299, 299)

     def _testKerasModel(self, include_top):
-        return xception.Xception(weights="imagenet", include_top=include_top)
+        return xception.Xception(weights="imagenet",
nit: we do not need the new line here, but that's cool.
LGTM pending tests. @jkbradley, LGTY?
LGTM too, thanks!
I'll merge this now. Thanks @ludatabricks and @yogeshg!
2.3.1
0.4.0
- keras==2.2.2 and tensorflow==1.10.0 to fix travis issues
- 2.3 and hence Scala 2.10
- pooling='avg' in resnet50 testing model because keras api changed
- NamedImageTransformerBaseTestCase, test_bare_keras_module, keras_load_and_preproc, test_simple_keras_udf

This is continued work from #149.