Fix Keras->ONNX shape and BatchNorm conversion #213
Conversation
Change size handling to use 'None' instead of None
Fix BatchNorm parameter conversion to correctly carry over mean and variance
keras2onnx/common/utils.py
@@ -51,7 +51,7 @@ def set_logger_level(lvl):

 @with_variable('batch_size')
 def get_default_batch_size():
-    return 'N'
+    return 'None'
This 'N' is used to specify the batch size as the symbol 'N'. The output shape then also contains this 'N', so we know it is a match. If we use 'None', we cannot match them.
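For context, a minimal sketch (my own, not from this PR) of how a symbolic dimension such as 'N' shows up in an ONNX model, assuming the standard onnx Python helper API; reusing the same symbol on the input and the output is what makes the dimensions match:

import onnx
from onnx import helper, TensorProto

# The string 'N' becomes a symbolic dim_param; using the same symbol on the
# input and the output tells consumers the two batch dimensions are equal.
inp = helper.make_tensor_value_info('X', TensorProto.FLOAT, ['N', 3])
out = helper.make_tensor_value_info('Y', TensorProto.FLOAT, ['N', 3])
node = helper.make_node('Identity', ['X'], ['Y'])
graph = helper.make_graph([node], 'batch_symbol_demo', [inp], [out])
model = helper.make_model(graph)
onnx.checker.check_model(model)

# dim_param holds the symbol; a concrete size would live in dim_value instead.
print(model.graph.input[0].type.tensor_type.shape.dim[0].dim_param)  # -> N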
I'll revert this change, thanks
 scale_tensor_name = scope.get_unique_variable_name('scale')
-container.add_initializer(scale_tensor_name, onnx_proto.TensorProto.FLOAT, params[0].shape, gamma)
+container.add_initializer(scale_tensor_name, onnx_proto.TensorProto.FLOAT, params[0].shape, params[0])
Can we have a unit test to cover this? Please check test_layers.py for examples.
Looks like it's already covered by test_batch_normalization_2() in test_layers.py. Interestingly, both the old and new code pass that test: since the unit test only runs once (i.e. against a single input), the moving mean and variance are still at their initial values of 0 and 1. In that case, the resulting values are actually the same:
# Substituting mean = 0 and variance = 1:
gamma := params[0] / sqrt(1 + epsilon) ≈ params[0]   (epsilon is tiny, so this is within test tolerance)
beta  := params[1] - params[0] * 0 / sqrt(1 + epsilon) == params[1]
Added test_batch_normalization_3
 scale_tensor_name = scope.get_unique_variable_name('scale')
-container.add_initializer(scale_tensor_name, onnx_proto.TensorProto.FLOAT, params[0].shape, gamma)
+container.add_initializer(scale_tensor_name, onnx_proto.TensorProto.FLOAT, params[0].shape, params[0])
The gamma and beta calculation was actually introduced by onnx/onnxmltools#36; I don't have much detail on why that change was made. Can you add some test cases to prove it?
I added test_batch_normalization_3, which covers non-default (non-0/1) mean and variance values, and confirmed it fails with the previous code and passes with these changes.
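For illustration, a hedged sketch (the names and values here are mine; this is not necessarily what test_batch_normalization_3 looks like) of how a Keras BatchNormalization layer can be given non-default moving statistics, so that a conversion bug in the mean/variance handling actually surfaces at inference time:

import numpy as np
from tensorflow import keras

# Build a model whose BatchNormalization layer carries moving statistics
# other than the initial mean=0 / variance=1, then use model.predict(x)
# as the reference output for the converted ONNX model.
layer = keras.layers.BatchNormalization(input_shape=(4,))
model = keras.Sequential([layer])

# Keras stores BatchNormalization weights as [gamma, beta, mean, variance].
gamma = np.full((4,), 1.5, dtype=np.float32)
beta = np.full((4,), 0.3, dtype=np.float32)
moving_mean = np.full((4,), 0.7, dtype=np.float32)       # != 0
moving_variance = np.full((4,), 2.0, dtype=np.float32)   # != 1
layer.set_weights([gamma, beta, moving_mean, moving_variance])

x = np.random.rand(2, 4).astype(np.float32)
expected = model.predict(x)  # compare the ONNX model's output against this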
I just ran test_batch_normalization_3 against the original code, and it actually passes, see here. It seems the original code is also correct (an equivalent way to do this).
Hm, interesting. Perhaps there's something in my local environment making it fail instead, but I'm surprised, as I was able to follow the numerical differences. I'll dig further when I have a chance.
Yep, I reinstalled keras2onnx on my machine (and also tried a virtualenv), and both show the original code passing. No idea how I was seeing failures before; it's very possible the two are equivalent. I'll keep looking.
Please also fix the CI build failure... thanks.
Corrected BatchNorm epsilon conversion
Added BatchNorm test cases for non-default mean/variance values
@wschin, can you still remember why gamma/beta was calculated in this way?
The CI build failure is because of something else; I just fixed it.
@@ -20,7 +20,7 @@
 def _infer_variable_type(tensor):
     tensor_shape = []
     if tensor.shape not in (tf.TensorShape(None), tf.TensorShape([])):
-        tensor_shape = [d.value for d in tensor.shape]
+        tensor_shape = [d.value if d.value is not None else 'None' for d in tensor.shape]
This will break the correct behavior as designed in ONNX. If the correct way cannot work for WinML, we need to add a program option to stay backward compatible with WinML.
@@ -47,26 +47,23 @@ def convert_keras_batch_normalization(scope, operator, container):
     if not op.center:
         params.insert(1, np.zeros(params[1].shape, dtype=float))

-    gamma = params[0] / np.sqrt(params[3] + op.epsilon)
According to the paper (https://arxiv.org/pdf/1502.03167v3.pdf, the formula on p. 3), the original calculation is correct in inference mode. Do you have any example showing the new change would be more accurate?
My understanding is as follows. The Keras op has the following parameters:

params[0] := gamma
params[1] := beta
params[2] := moving_mean
params[3] := moving_variance

These are then applied as described in the paper (under Algorithm 1):

x_norm = (x - moving_mean) / sqrt(moving_variance + epsilon)
y = gamma * x_norm + beta

The ONNX operator has the same parameters, but with slightly different naming:

scale := gamma
B := beta
mean := moving_mean
var := moving_variance

However, the previous code appears to have interpreted these values instead as:

scale = gamma / sqrt(moving_variance + epsilon)                   # Most similar to x_norm's definition?
B = beta - gamma * moving_mean / sqrt(moving_variance + epsilon)  # Unsure how to interpret this
# Perhaps trying to treat the scale and B(ias) tensors as the x_norm and y terms,
# as opposed to as gamma and beta?

My understanding is that it should just be a direct translation of gamma -> scale etc., but it's possible I missed the secret behind these formulas.
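For what it's worth, a small NumPy check (my own sketch; it assumes the original converter emitted the fused scale/B together with mean=0, var=1, and an effectively zero epsilon on the ONNX node) showing why the two parameterizations produce the same result:

import numpy as np

# Keras inference parameters (arbitrary example values).
gamma, beta = 1.5, 0.3
mean, var, eps = 0.7, 2.0, 1e-3
x = np.linspace(-1.0, 1.0, 5)

# Direct translation: the ONNX op applies the paper's formula with the
# original gamma/beta/mean/var passed through unchanged.
y_direct = gamma * (x - mean) / np.sqrt(var + eps) + beta

# Fused translation: pre-fold mean/var into scale and B, then run the op
# as if mean=0 and var=1 (with epsilon assumed folded away as well).
scale = gamma / np.sqrt(var + eps)
B = beta - gamma * mean / np.sqrt(var + eps)
y_fused = scale * (x - 0.0) / np.sqrt(1.0 + 0.0) + B

assert np.allclose(y_direct, y_fused)  # the two mappings agree algebraically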
Closing this since the original code passes the test.
Yep, I see where I was mistaken, sorry for the noise!
Problem: Keras-to-ONNX conversion mishandled unknown shape dimensions and the BatchNorm moving mean and variance.
Solution: Change size handling to use 'None' instead of None; fix BatchNorm parameter conversion to correctly carry over mean and variance.
Validation: Converted YOLOv3 Keras models to ONNX and verified much better output tensor parity between Keras and ONNX (WinML) evaluation.