
AssertionError: assert py.is_contiguous() #14

Closed
Anwarvic opened this issue Aug 2, 2022 · 12 comments · Fixed by #16

Comments

@Anwarvic

Anwarvic commented Aug 2, 2022

I'm working on integrating FastRNNT with Speechbrain, check this Pull Request.

At the moment, I'm trying to train a transducer model on the French portion of the multilingual TEDx dataset (mTEDx). Whenever I train my model, I get the assertion error in the issue's title. However, the mutual_information.py file says:

# The following assertions are for efficiency
assert px.is_contiguous()
assert py.is_contiguous()

Once I comment out these two lines, everything works just fine. With a transducer model whose encoder is a pre-trained wav2vec2 model plus one linear layer, and whose decoder is a one-layer GRU, training runs fine and I get 14.37 WER on the French test set, which is much better than our baseline.

Now, I have these two questions:

  • How do I avoid getting this AssertionError?
  • Does commenting these two assertions hurt the performance?

Your guidance is much appreciated!

@csukuangfj

If I remember correctly, the C++ code uses a tensor accessor to access the data, which does not require a contiguous tensor.

But a contiguous tensor is more cache-friendly, so I suggest changing it to

px = px.contiguous()
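For illustration, here is a minimal sketch of what non-contiguity looks like and how the suggested fix resolves it, using NumPy as a stand-in for the analogous torch.Tensor behavior (torch's `.is_contiguous()` / `.contiguous()` behave the same way on strided views):

```python
import numpy as np

# A transposed view shares storage with the original array but has
# non-standard strides, so it is not C-contiguous.
px = np.arange(12, dtype=np.float32).reshape(3, 4).T
assert not px.flags['C_CONTIGUOUS']

# The suggested fix: copy into a contiguous layout up front.
px = np.ascontiguousarray(px)   # torch analog: px = px.contiguous()
assert px.flags['C_CONTIGUOUS']
```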

@Anwarvic
Author

Anwarvic commented Aug 2, 2022

So, theoretically, commenting out these two assertions won't affect the results... right? And making the tensors contiguous will just help a little bit with memory access?

@danpovey
Collaborator

danpovey commented Aug 2, 2022

It says right there that it's for efficiency, so yes, using non-contiguous tensors will affect the performance. Making that copy does not necessarily require more memory; it depends on whether the original (before the copy) is required for backprop. I suggest adding the .contiguous() call before the log_softmax, if possible: the log_softmax likely needs the output of its operation for backprop (but not the input), so the tensor copied by .contiguous() before the log_softmax would likely not be held for backprop.
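A sketch of the placement Dan describes, using NumPy for a self-contained example (the real code would use torch; the `log_softmax` below is a hypothetical stand-in for the normalization step in the RNN-T computation):

```python
import numpy as np

def log_softmax(x, axis=-1):
    # Numerically stable log-softmax.
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

# Some upstream op (here a transpose) produced a non-contiguous view.
logits = np.ones((4, 2), dtype=np.float32).T
assert not logits.flags['C_CONTIGUOUS']

# Make it contiguous BEFORE the normalization: log_softmax keeps its
# output (not its input) for backprop, so the pre-copy tensor need not
# be retained.
logits = np.ascontiguousarray(logits)
logprobs = log_softmax(logits, axis=-1)
assert np.allclose(np.exp(logprobs).sum(axis=-1), 1.0)
```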

@Anwarvic
Author

Anwarvic commented Aug 9, 2022

@danpovey I'm sorry I didn't get what you mean by "adding the .contiguous() statement before the log_softmax".

By ".contiguous() statement", you mean px = px.contiguous() and py = py.contiguous()... right?

Also, which log_softmax are we talking about here exactly? The one at the end of the jointer network?

@danpovey
Collaborator

danpovey commented Aug 9, 2022

At some point in the RNN-T computation there is a normalization of log-probs, probably via log_softmax(). I meant doing it just before then.
But this is probably not super critical as I think this is not going to dominate memory requirements anyway; thanks to using pruned RNN-T, we are not instantiating any really huge tensors. So you can do it to the px and py, I think, if they are not naturally contiguous.

@Anwarvic
Author

Anwarvic commented Aug 11, 2022

I have added the following two lines just before this part in the mutual_information.py script:

if not px.is_contiguous(): px = px.contiguous()
if not py.is_contiguous(): py = py.contiguous()

@danpovey If you agree with what I did, feel free to close this issue!

@csukuangfj

I think you don't need to check whether it is contiguous.

px.contiguous() is a no-op if px is already contiguous.
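This can be checked directly. In the NumPy analog, np.ascontiguousarray returns the input object unchanged when it is already C-contiguous (PyTorch's .contiguous() likewise returns self in that case, per its documentation):

```python
import numpy as np

a = np.zeros((2, 3), dtype=np.float32)   # freshly allocated: C-contiguous
assert np.ascontiguousarray(a) is a      # no copy is made

t = a.T                                  # non-contiguous view
c = np.ascontiguousarray(t)              # this call does copy
assert c is not t and c.flags['C_CONTIGUOUS']
```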

@Anwarvic
Author

Thanks for the help!

@pkufool
Contributor

pkufool commented Aug 11, 2022

Ok, I think I forgot get_rnnt_logprobs and get_rnnt_logprobs_smoothed.

@Anwarvic
Author

My issue was in the AssertionError which only exists in the mutual_information.py script... I think.

@pkufool
Contributor

pkufool commented Aug 11, 2022

> My issue was in the AssertionError which only exists in the mutual_information.py script... I think.

Yes, I meant that we won't call mutual_information_recursion directly; we call it from functions in rnnt_loss.py. Anyway, fixing it in mutual_information.py is OK. Thanks!
