Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RaggedTensor support for sparse_categorical_crossentropy. #47092

Merged
merged 1 commit into from Apr 2, 2021

Conversation

pedro-r-marques
Copy link
Contributor

@pedro-r-marques pedro-r-marques commented Feb 11, 2021

Attempt to support RaggedTensors in sparse_categorical_crossentropy loss.

@google-ml-butler google-ml-butler bot added the size:M CL Change Size: Medium label Feb 11, 2021
@google-cla google-cla bot added the cla: yes label Feb 11, 2021
@gbaned gbaned self-assigned this Feb 11, 2021
@gbaned gbaned added the comp:keras Keras related issues label Feb 11, 2021
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Feb 11, 2021
@gbaned gbaned self-requested a review February 11, 2021 17:24
@qlzh727 qlzh727 requested a review from tomerk February 11, 2021 19:05
When used by SparseCategoricalCrossentropy() with the default reduction
(SUM_OVER_BATCH_SIZE), the reduction averages the loss over the
number of elements independent of the batch. E.g. if the RaggedTensor
has 2 batches with [2, 1] values respectivly the resulting loss is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a typo - respectively; and I believe there should be a comma after respectively.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@foxik
Copy link
Contributor

foxik commented Feb 12, 2021

Independently to the OP_REQUIRES issue, which I could not tackle, is the sparse_categorical_crossentropy expected to work in case there are multiple ragged dimensions? For example, the batch examples are images and we classify every pixel?

It seems that such a scenario works with binary_crossentropy from #47075, but if crashes here. Notably, the following seems to work:

sce_obj = tf.losses.SparseCategoricalCrossentropy()
y_true = tf.ragged.constant([[1, 1], [0]])
y_pred = tf.ragged.constant([[[0.1, 0.9], [0.1, 0.9]], [[0.9, 0.1]]])
print(sce_obj(y_true, y_pred))

but after adding another ragged dimension

y_true = tf.ragged.constant([[[1, 1]], [[0]]])
y_pred = tf.ragged.constant([[[[0.1, 0.9], [0.1, 0.9]]], [[[0.9, 0.1]]]]) # both without and with `inner_shape=(2,)`
print(sce_obj(y_true, y_pred))

it crashes with ValueError: TypeError: object of type 'RaggedTensor' has no len().

It seems to be connected to _convert_to_dense in _ragged_tensor_apply_loss, because if the _convert_to_dense just passes through the inputs unchanged, the example above works.

@gbaned gbaned added the awaiting review Pull request awaiting review label Feb 17, 2021
@pedro-r-marques
Copy link
Contributor Author

@foxit Added a test with the example you provided and changed the approach a bit to avoid the warning generated by the previous code.

@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Feb 27, 2021
@gbaned gbaned added the awaiting review Pull request awaiting review label Mar 1, 2021
@foxik
Copy link
Contributor

foxik commented Mar 14, 2021

A gentle bump after two weeks -- we would be grateful for a review.

Thanks & cheers.

Copy link

@tomerk tomerk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

PR Queue automation moved this from Assigned Reviewer to Approved by Reviewer Mar 23, 2021
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Mar 23, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Mar 23, 2021
@rthadur
Copy link
Contributor

rthadur commented Mar 23, 2021

@pedro-r-marques can you please check build failures ?

@pedro-r-marques
Copy link
Contributor Author

@rthadur Fixed. It was a pylint error for an unused import.

@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Mar 24, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Mar 24, 2021
@gbaned gbaned added ready to pull PR ready for merge process and removed awaiting review Pull request awaiting review ready to pull PR ready for merge process labels Mar 25, 2021
@pedro-r-marques
Copy link
Contributor Author

@rthadur Can you please check why the CL doesn't built internally to google ?

@rthadur
Copy link
Contributor

rthadur commented Mar 26, 2021

@pedro-r-marques seeing this error internally

tensorflow/python/util/dispatch.py", line 206, in wrapper
   return target(*args, **kwargs)
 File "/tensorflow/python/keras/losses.py", line 1736, in sparse_categorical_crossentropy
   y_pred = ops.convert_to_tensor_v2_with_dispatch(y_pred)
 File "/tensorflow/python/util/dispatch.py", line 206, in wrapper
   return target(*args, **kwargs)
 File "/tensorflow/python/framework/ops.py", line 1429, in convert_to_tensor_v2_with_dispatch
   value, dtype=dtype, dtype_hint=dtype_hint, name=name)
 File "/tensorflow/python/framework/ops.py", line 1439, in convert_to_tensor_v2
   as_ref=False)
 File "/tensorflow/python/profiler/trace.py", line 163, in wrapped
   return func(*args, **kwargs)
 File "/tensorflow/python/framework/ops.py", line 1564, in convert_to_tensor
   ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
 File "/tensorflow/python/framework/constant_op.py", line 340, in _constant_tensor_conversion_function
   return constant(v, dtype=dtype, name=name)
 File "/tensorflow/python/framework/constant_op.py", line 266, in constant
   allow_broadcast=True)
 File "/tensorflow/python/framework/constant_op.py", line 277, in _constant_impl
   return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
 File "/tensorflow/python/framework/constant_op.py", line 302, in _constant_eager_impl
   t = convert_to_eager_tensor(value, ctx, dtype)
 File "/tensorflow/python/framework/constant_op.py", line 99, in convert_to_eager_tensor
   return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: TypeError: object of type 'RaggedTensor' has no len()


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
 File "/py/absl/testing/parameterized.py", line 314, in bound_param_test
   return test_method(self, **testcase_params)
 File "/tensorflow/python/framework/test_combinations.py", line 367, in decorated
   execute_test_method()
 File "/tensorflow/python/framework/test_combinations.py", line 350, in execute_test_method
   test_method(**kwargs_to_pass)
 File "/tensorflow/python/keras/losses_test.py", line 1180, in test_ragged_tensors_3d
   loss = cce_obj(y_true, y_pred)
 File "/tensorflow/python/keras/losses.py", line 155, in __call__
   losses = call_fn(y_true, y_pred)
 File "/tensorflow/python/keras/losses.py", line 259, in call
   return ag_fn(y_true, y_pred, **self._fn_kwargs)
 File "/tensorflow/python/util/dispatch.py", line 210, in wrapper
   result = dispatch(wrapper, args, kwargs)
 File "/tensorflow/python/util/dispatch.py", line 122, in dispatch
   result = dispatcher.handle(args, kwargs)
 File "/tensorflow/python/util/dispatch.py", line 154, in handle
   return self._override_func(*args, **kwargs)
 File "/tensorflow/python/keras/losses.py", line 1762, in _ragged_tensor_sparse_categorical_crossentropy
   return _ragged_tensor_apply_loss(fn, y_true, y_pred, y_pred_extra_dim=True)
 File "/tensorflow/python/keras/losses.py", line 1291, in _ragged_tensor_apply_loss
   assertion_list = ragged_util.assert_splits_match(nested_splits_list)
 File "/tensorflow/python/ops/ragged/ragged_util.py", line 52, in assert_splits_match
   raise ValueError(error_msg)
ValueError: Inputs must have identical ragged splits

@pedro-r-marques
Copy link
Contributor Author

@rthadur
I've updated the patch with the following change in _ragged_tensor_apply_loss:

-  if y_pred_extra_dim and y_pred.shape[-1] is None:
+  if y_pred_extra_dim:
     nested_splits_list[1] = nested_splits_list[1][:-1]

The flag y_pred_extra_dim is only set when calculating losses for sparse categorical cross entropy since we expect the shape of y_pred to have one extra dimension. This if statement had an extra check which is both unnecessary and incorrect. I would appreciate it if you could trigger the google internal CI and check that the patch passes the tests.

@pedro-r-marques
Copy link
Contributor Author

@tomerk @rthadur Please retrigger the CI.

@rthadur rthadur added the kokoro:force-run Tests on submitted change label Mar 31, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Mar 31, 2021
@google-ml-butler google-ml-butler bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Apr 2, 2021
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Apr 2, 2021
@copybara-service copybara-service bot merged commit c8c3cdc into tensorflow:master Apr 2, 2021
PR Queue automation moved this from Approved by Reviewer to Merged Apr 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes comp:keras Keras related issues ready to pull PR ready for merge process size:M CL Change Size: Medium
Projects
PR Queue
  
Merged
Development

Successfully merging this pull request may close these issues.

None yet

7 participants