TF Nightly breaks unit tests test_horovod_syncbn_gpu() and test_horovod_syncbn_cpu() #3422

Closed
chongxiaoc opened this issue Feb 25, 2022 · 0 comments · Fixed by #3431

chongxiaoc (Collaborator) commented Feb 25, 2022

Bug report:

Nightly config CI:
https://github.com/horovod/horovod/runs/5327552707?check_suite_focus=true

Failure stack:

[0]<stdout>:=================================== FAILURES ===================================
[0]<stdout>:___________________ TensorFlowTests.test_horovod_syncbn_cpu ____________________
[0]<stdout>:
[0]<stdout>:self = <test_tensorflow.TensorFlowTests testMethod=test_horovod_syncbn_cpu>
[0]<stdout>:
[0]<stdout>:    def test_horovod_syncbn_cpu(self):
[0]<stdout>:        """Test that the SyncBatchNormalization implementation is correct on CPU."""
[0]<stdout>:    
[0]<stdout>:        hvd.init()
[0]<stdout>:        with tf.device("/cpu:0"):
[0]<stdout>:            x_list = [
[0]<stdout>:                tf.convert_to_tensor(np.stack([
[0]<stdout>:                    np.array([
[0]<stdout>:                        [r, r + 1],
[0]<stdout>:                        [r * 2, r * 2 + 1],
[0]<stdout>:                        [r * 3, r * 3 + 1],
[0]<stdout>:                        [r * 4, r * 4 + 1]
[0]<stdout>:                    ], dtype=np.float32)
[0]<stdout>:                    for r in range(hvd.size())
[0]<stdout>:                ]), np.float32),
[0]<stdout>:                tf.convert_to_tensor(np.stack([
[0]<stdout>:                    np.array([
[0]<stdout>:                        [r + 1],
[0]<stdout>:                        [r * 2 + 1],
[0]<stdout>:                        [r * 3 + 1],
[0]<stdout>:                        [r * 4 + 1]
[0]<stdout>:                    ], dtype=np.float32)
[0]<stdout>:                    for r in range(hvd.size())
[0]<stdout>:                ]), np.float32),
[0]<stdout>:            ]
[0]<stdout>:    
[0]<stdout>:            for x in x_list:
[0]<stdout>:                bn = tf.keras.layers.BatchNormalization(axis=1, fused=False)
[0]<stdout>:                sync_bn = hvd.SyncBatchNormalization(axis=1)
[0]<stdout>:>               bn_func = bn.apply(x, training=True)
[0]<stdout>:E               AttributeError: 'BatchNormalization' object has no attribute 'apply'

TF nightly removed the deprecated apply() method, which was previously defined as:

@deprecation.deprecated(date=None, instructions='Please use `layer.__call__` method instead.')
@doc_controls.do_not_doc_inheritable
def apply(self, inputs, *args, **kwargs):
  """Deprecated, do NOT use!

We should switch to layer.__call__ for TF>=2.9.0, i.e. call the layer object directly (bn(x, training=True)) instead of bn.apply(x, training=True).
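
A minimal sketch of the call-style change, written so that it runs without TensorFlow installed. `StandInLayer` and `run_layer` are hypothetical names used only for illustration: the stand-in mimics a layer that is callable but no longer has `apply()`, which is the situation the failure stack reports for `BatchNormalization`.

```python
class StandInLayer:
    """Hypothetical stand-in for a layer in TF nightly: callable, but with no apply()."""

    def __call__(self, inputs, training=False):
        # A real layer would normalize the inputs; this stand-in only records the call.
        return ("called", inputs, training)


def run_layer(layer, inputs, **kwargs):
    """Invoke a layer through __call__ instead of the removed apply()."""
    return layer(inputs, **kwargs)


layer = StandInLayer()

# Old spelling: fails with AttributeError, as in the failure stack.
try:
    layer.apply([1.0, 2.0], training=True)
    old_call_error = None
except AttributeError as exc:
    old_call_error = str(exc)

# New spelling: call the layer object directly.
result = run_layer(layer, [1.0, 2.0], training=True)
```

In the two failing tests, the equivalent change would presumably be `bn_func = bn(x, training=True)` in place of `bn_func = bn.apply(x, training=True)`. The deprecation notice itself recommends `layer.__call__`, so a version guard may not be required, but that is an assumption rather than something stated in this report.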

Failed unit tests:
test_horovod_syncbn_gpu(): https://github.com/horovod/horovod/blob/master/test/parallel/test_tensorflow.py#L4054
test_horovod_syncbn_cpu(): https://github.com/horovod/horovod/blob/master/test/parallel/test_tensorflow.py#L4101
