
AssertionError: assert py.is_contiguous() #14

Closed
Anwarvic opened this issue Aug 2, 2022 · 12 comments · Fixed by #16

Comments

@Anwarvic

Anwarvic commented Aug 2, 2022

I'm working on integrating FastRNNT with Speechbrain, check this Pull Request.

At the moment, I'm trying to train a transducer model on the French portion of the multilingual TEDx dataset (mTEDx). Whenever I train my model, I get the assertion error in the issue's title. However, the mutual_information.py file says:

# The following assertions are for efficiency
assert px.is_contiguous()
assert py.is_contiguous()

Once I comment out these two lines, everything works just fine. With a transducer model whose encoder is a pre-trained wav2vec2 model plus one linear layer, and whose decoder is a one-layer GRU, training runs fine and I get 14.37 WER on the French test set, which is much better than our baseline.

Now, I have these two questions:

  • How do I avoid getting this AssertionError?
  • Does commenting these two assertions hurt the performance?

Your guidance is much appreciated!

@csukuangfj

If I remember correctly, the C++ code uses a tensor accessor to access the data, which does not require a contiguous tensor.

But a contiguous tensor is more cache-friendly, so I suggest changing it to

px = px.contiguous()
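For illustration, here is a minimal sketch of what non-contiguity looks like and how the suggested fix resolves it, using NumPy as a stand-in for the analogous torch.Tensor behavior (torch's `.is_contiguous()` / `.contiguous()` behave the same way on strided views):

```python
import numpy as np

# A transposed view shares storage with the original array but has
# non-standard strides, so it is not C-contiguous.
px = np.arange(12, dtype=np.float32).reshape(3, 4).T
assert not px.flags['C_CONTIGUOUS']

# The suggested fix: copy into a contiguous layout up front.
px = np.ascontiguousarray(px)   # torch analog: px = px.contiguous()
assert px.flags['C_CONTIGUOUS']
```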

@Anwarvic
Author

Anwarvic commented Aug 2, 2022

So, theoretically, commenting out these two assertions won't affect the results... right? And making the tensors contiguous will just help a little bit with memory access?

@danpovey
Collaborator

danpovey commented Aug 2, 2022

It says right there that it's for efficiency, so yes, using non-contiguous tensors will affect the performance. Making that copy does not necessarily require more memory; it depends on whether the original (before the copy) is required for backprop. I suggest adding the .contiguous() call before the log_softmax, if possible: the log_softmax likely needs the output of its operation for backprop (but not the input), so the tensor copied by .contiguous() before the log_softmax would likely not be held for backprop.
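A sketch of the placement Dan describes, using NumPy for a self-contained example (the real code would use torch; the `log_softmax` below is a hypothetical stand-in for the normalization step in the RNN-T computation):

```python
import numpy as np

def log_softmax(x, axis=-1):
    # Numerically stable log-softmax.
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

# Some upstream op (here a transpose) produced a non-contiguous view.
logits = np.ones((4, 2), dtype=np.float32).T
assert not logits.flags['C_CONTIGUOUS']

# Make it contiguous BEFORE the normalization: log_softmax keeps its
# output (not its input) for backprop, so the pre-copy tensor need not
# be retained.
logits = np.ascontiguousarray(logits)
logprobs = log_softmax(logits, axis=-1)
assert np.allclose(np.exp(logprobs).sum(axis=-1), 1.0)
```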

@Anwarvic
Author

Anwarvic commented Aug 9, 2022

@danpovey I'm sorry I didn't get what you mean by "adding the .contiguous() statement before the log_softmax".

By ".contiguous() statement", you mean px = px.contiguous() and py = py.contiguous()... right?

Also, which log_softmax are we talking about here exactly? The one at the end of the jointer network?

@danpovey
Collaborator

danpovey commented Aug 9, 2022

At some point in the RNN-T computation there is a normalization of log-probs, probably via log_softmax(). I meant doing it just before then.
But this is probably not super critical as I think this is not going to dominate memory requirements anyway; thanks to using pruned RNN-T, we are not instantiating any really huge tensors. So you can do it to the px and py, I think, if they are not naturally contiguous.

@Anwarvic
Author

Anwarvic commented Aug 11, 2022

I have added the following two lines just before this part in the mutual_information.py script:

if not px.is_contiguous(): px = px.contiguous()
if not py.is_contiguous(): py = py.contiguous()

@danpovey If you agree with what I did, feel free to close this issue!

@csukuangfj

I think you don't need to check whether it is contiguous.

px.contiguous() is a no-op if px is already contiguous.
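This can be checked directly. In the NumPy analog, np.ascontiguousarray returns the input object unchanged when it is already C-contiguous (PyTorch's .contiguous() likewise returns self in that case, per its documentation):

```python
import numpy as np

a = np.zeros((2, 3), dtype=np.float32)   # freshly allocated: C-contiguous
assert np.ascontiguousarray(a) is a      # no copy is made

t = a.T                                  # non-contiguous view
c = np.ascontiguousarray(t)              # this call does copy
assert c is not t and c.flags['C_CONTIGUOUS']
```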

@Anwarvic
Author

Thanks for the help!

@pkufool
Contributor

pkufool commented Aug 11, 2022

Ok, I think I forgot get_rnnt_logprobs and get_rnnt_logprobs_smoothed.

@Anwarvic
Author

My issue was in the AssertionError which only exists in the mutual_information.py script... I think.

@pkufool
Contributor

pkufool commented Aug 11, 2022

> My issue was in the AssertionError which only exists in the mutual_information.py script... I think.

Yes, I meant that we won't call mutual_information_recursion directly; we call it from functions in rnnt_loss.py. Anyway, fixing it in mutual_information.py is OK. Thanks!
