-
Notifications
You must be signed in to change notification settings - Fork 74k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[determinism] Add GPU excepts, CPU d9m, and tests to crop_and_resize #48905
[determinism] Add GPU excepts, CPU d9m, and tests to crop_and_resize #48905
Conversation
1a56261
to
49f0685
Compare
@@ -1,4 +1,4 @@ | |||
# Copyright 2020 The TensorFlow Authors. All Rights Reserved. | |||
# Copyright 2020, 2021 The TensorFlow Authors. All Rights Reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to this doc, copyright years should never be updated. From the link:
the year the file was created in full numeric form (e.g., 2010); don’t change this if you edit, move, or copy the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR does not change the year that the file was created.
See
When making changes to code with an existing copyright notice: ... Optionally, add current copyright year.
The upcoming commit replaces "2020, 2021" with "2020-2021".
err1 = gradient_checker_v2.max_error( | ||
*gradient_checker_v2.compute_gradient( | ||
if (self.__class__.__name__ == | ||
"CropAndResizeOpDeterministicTest" and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of checking for the class name, I would check if os.environ['TF_DETERMINISTIC_OPS']
is "1".
Also, maybe just disable this test when determinism is used, since you already test the backprop error messages elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose to do it this way to document the provenance of this conditional branch (i.e. in a child class) and to encapsulate the mechanism by which op-determinism is enabled. However, I don't see any major downsides to changing this, specifically:
- if this conditional branch is removed by someone who doesn't understand why it's there or why it's needed then tests will fail and the person who made the change will be alerted to the error, and
- when
TF_DETERMINISTIC_OPS
is replaced withtf.config.*deterministic*
, andTF_DETERMINISTIC_OPS
is no longer set to '1' in the child class, then this test will fail because the exceptions will no longer be thrown.
Also, maybe just disable this test when determinism is used, since you already test the backprop error messages elsewhere.
Testing that the functionality in this conditional branch is (mostly) correct (not testing the error message) will ensure that when the exceptions are removed and deterministic GPU backprop functionality is added, this test will be modified to test that functionality. If the gradient tests are just skipped when determinism is expected then there is a higher probability that the deterministic GPU backprop functionality will (accidentally) not be tested.
The upcoming commit changes the condition to os.getenv('TF_DETERMINISTIC_OPS', '0') == '1'
.
crop_size = constant_op.constant([3, 3], dtype=dtypes.int32) | ||
return image, boxes, box_indices, crop_size | ||
|
||
def _imageErrorMessage(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this a class constant:
_IMAGE_ERROR_MESSAGE = ("Deterministic GPU implementation of" +
" CropAndResizeBackpropImage not available")
And same for _boxesErrorMessage
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the simplification suggestion.
I originally put these messages in the class-level with the intention of using them in the parent class, but I ended up deciding to not test the error messages in the parent class; there it seemed sufficient to simply check that the correct exception type was being thrown. The upcoming commit simplifies this so that there is nothing at all at the class level related to the exception messages.
This test assumes that the base op test runs all the same test cases when | ||
deterministic ops are not enabled and will therefore detect erroneous | ||
exception throwing in those cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand this comment. What is the "base op test"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The upcoming commit changes "base op test" to "test_base.CropAndResizeOpTestBase" in this particular instance. It also makes various similar clarifications elsewhere in image_grad_deterministic_test.py
, xent_op_deterministic_test.py
, and sparse_xent_op_deterministic_test.py
.
This current PR adds the following functionality, plus associated tests:
When
TF_DETERMINISTIC_OPS
is set to"true"
or"1"
(when op-determinism is expected),image
oftf.image.crop_and_resize
when running on CPU will be bit-exactly reproducible (from run-to-run), andimage
orboxes
oftf.image.crop_and_resize
will cause atf.errors.UnimplementedError
to be thrown along with an understandable message.This current PR is associated with RFC: Enabling Determinism in TensorFlow.
cc @sanjoy, @reedwm, @nluehr