Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support to cuDNN CTC loss #32302

Merged
merged 22 commits into from Jan 18, 2020

Conversation

kaixih
Copy link
Contributor

@kaixih kaixih commented Sep 6, 2019

  • This PR supports CUDNN CTC Loss as the backend of ctc_loss_v2()

  • Users need to use the environment variable TF_CUDNN_CTC_LOSS=1

  • Why can we make it default (not using TF_CUDNN_CTC_LOSS)?

    • ctc_loss_v2 supports a variable blank index but will transpose it to the last index before calling the actual implementation. However, CUDNN implementation only supports the 0th blank index. This indicates ctc_loss_v2 has to select the correct implementation based on platform, which cannot be done inside operation definition.
  • What is the logic in the new ctc_loss_v2()?

    • _ctc_loss_impl() will call the actual implementation based on given use_cudnn parameter
      • If true, call the CUDNN implementation
      • If false, call the original implementation
    • ctc_loss_v2() will transpose the blank index to 0 if TF_CUDNN_CTC_LOSS=1 and then call _ctc_loss_impl(use_cudnn=true)
    • ctc_loss_v2() will transpose the blank index to the last index if TF_CUDNN_CTC_LOSS=0 and then call _ctc_loss_impl(use_cudnn=false)

fyi @nluehr

@tensorflow-bot tensorflow-bot bot added the size:L CL Change Size: Large label Sep 6, 2019
@kaixih kaixih requested a review from aaroey September 6, 2019 23:06
@gbaned gbaned self-assigned this Sep 9, 2019
@gbaned gbaned added this to Assigned Reviewer in PR Queue via automation Sep 9, 2019
@gbaned gbaned added the awaiting review Pull request awaiting review label Sep 9, 2019
@aaroey
Copy link
Member

aaroey commented Sep 12, 2019

@pkanwar23 would you please help to find someone to review this?

@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Sep 13, 2019
@gbaned gbaned added the awaiting review Pull request awaiting review label Sep 20, 2019
@aaroey aaroey requested a review from chsigg September 23, 2019 17:47
@aaroey
Copy link
Member

aaroey commented Sep 23, 2019

@chsigg could you help to take a look at this? Thanks

@aaroey aaroey removed their request for review September 23, 2019 17:48
@aaroey aaroey requested review from aaroey and removed request for chsigg October 9, 2019 21:00
@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Oct 10, 2019
@gbaned gbaned added the awaiting review Pull request awaiting review label Oct 11, 2019
@aaroey aaroey removed their request for review October 25, 2019 16:54
@rmlarsen rmlarsen self-requested a review November 1, 2019 17:27
@rmlarsen rmlarsen self-assigned this Nov 1, 2019
@rmlarsen rmlarsen added the API review API Review label Nov 6, 2019
std::vector<int> *labels_lengths) {
const T* h_in = labels_indices->flat<T>().data();
for(int i = 0; i < num_indices; i++) {
T key = h_in[i * 2];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const ref

// takes the ownership of the underlying memory. The expectation is that the
// memory should be alive for the span of the cudnnCTCLoss itself.
template <typename T>
class CudnnCtcLossAllocatorInTemp : public ScratchAllocator {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is identical to e.g. CudnnBatchNormAllocatorInTemp in fused_batch_norm_op.cc. Can you consolidate them to a single location instead of duplicating code, please?

@@ -62,6 +62,43 @@ REGISTER_OP("CTCLoss")
return Status::OK();
});

REGISTER_OP("CTCLossV2")
.Input("inputs: float")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the CuDNN implementation support types other than float? If so, we should also support them here.

#31164 added support for double for CTCLossOp, for example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, CuDNN only support the float CTCLoss.

@sanjoy
Copy link
Contributor

sanjoy commented Nov 6, 2019

Adding Tim Shen to review the stream executor bits.

@kaixih
Copy link
Contributor Author

kaixih commented Jan 9, 2020

@sanjoy @alextp , I have replaced the previous environment variable with the implementation selector (Thx @qlzh727 for helping me out with some test cases.) Now, we don't need the env var to control if cuDNN is used or not. The runtime can automatically determine that if GPU is available or not.

I added another python function to contain this new implement (ie. ctc_loss_v3), which is only available in TF2.

Please help me find the reviewers to review this part. Thx.

Copy link
Contributor

@alextp alextp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just a couple of minor nits and we can approve.

Thanks!

@@ -42,6 +46,28 @@
from tensorflow.python.util import nest
from tensorflow.python.util.tf_export import tf_export

import os
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linter will complain; standard python imports need to be above all others

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Removed this unused import.

@@ -0,0 +1,71 @@
op {
graph_op_name: "CTCLossV2"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a line here saying "visibility: HIDDEN"; this will prevent the generation of a tf.ctc_loss_v2 API

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Done.

alextp
alextp previously approved these changes Jan 9, 2020
@@ -576,6 +576,10 @@ tf_module {
name: "cosh"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now you can revert this file to make the API tests pass again

@@ -1068,6 +1068,10 @@ tf_module {
name: "cross"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now you can revert this file to make the API tests pass again

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. PTAL.

@tensorflow-bot tensorflow-bot bot added kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Jan 9, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Jan 9, 2020
@tensorflow-bot tensorflow-bot bot added the kokoro:force-run Tests on submitted change label Jan 10, 2020
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Jan 10, 2020
@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Jan 11, 2020
@gbaned gbaned added ready to pull PR ready for merge process and removed ready to pull PR ready for merge process labels Jan 13, 2020
@kaixih
Copy link
Contributor Author

kaixih commented Jan 17, 2020

Anything else I can do? Thx.

@pkanwar23
Copy link
Contributor

Thanks for checking. We should be good. I'm looking at why it hasn't merged.

PR Queue automation moved this from Reviewer Requested Changes to Approved by Reviewer Jan 17, 2020
@tensorflow-bot tensorflow-bot bot added the kokoro:force-run Tests on submitted change label Jan 17, 2020
@sanjoy
Copy link
Contributor

sanjoy commented Jan 17, 2020

Thanks for checking. We should be good. I'm looking at why it hasn't merged.

It was waiting for an approval from me for some reason. Should be good to go now.

@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Jan 17, 2020
@kaixih
Copy link
Contributor Author

kaixih commented Jan 17, 2020

Thx for the update.

tensorflow-copybara pushed a commit that referenced this pull request Jan 18, 2020
PiperOrigin-RevId: 290387603
Change-Id: I28491f42a4559a9f79bd6a7b73d8e6b670f55368
@tensorflow-copybara tensorflow-copybara merged commit 195729d into tensorflow:master Jan 18, 2020
PR Queue automation moved this from Approved by Reviewer to Merged Jan 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
API review API Review cla: yes ready to pull PR ready for merge process size:L CL Change Size: Large
Projects
PR Queue
  
Merged
Development

Successfully merging this pull request may close these issues.

None yet