How does the cross attention work? #943
Hello, I want to understand how parallel/serial the whole approach is. I highly appreciate any answer!
@jongwook Are there any resources on how to retrain the decoder (instead of fine-tuning it)? Or do you have an idea of how to solve the above without major retraining?
@jongwook I hope the ping is OK.
Forget what I asked above; this is the right question:
Suppose the decoding for loop is at the 2 s position of a 5 s WAV file. It uses the audio_features of all 5 s for prediction at every step, right? So does predicting the token at the 2 s timepoint also take future audio_features into account?
What would happen if the decoder only had access to the current audio_features?
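
To make the question concrete, here is a rough NumPy sketch of the two cases being compared. It assumes the decoder uses standard unmasked scaled dot-product cross-attention over the encoder output, which is the usual encoder-decoder setup. Everything in it is made up for illustration and is not the model's actual code: the `cross_attention` helper, the 5 frames (pretending one frame per second), the feature size, and the `current_frame` cutoff.

```python
import numpy as np

def cross_attention(queries, keys, values, mask=None):
    """Scaled dot-product attention. Returns (output, attention weights)."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    if mask is not None:
        # Hidden positions get -inf, so they receive exactly zero weight.
        scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ values, weights

rng = np.random.default_rng(0)
n_frames, d_model = 5, 8  # hypothetical: 5 audio frames, one per second
audio_features = rng.normal(size=(n_frames, d_model))  # stand-in encoder output
token_query = rng.normal(size=(1, d_model))            # stand-in decoder state

# Case 1: no mask. The token being predicted can attend to every frame.
_, full_weights = cross_attention(token_query, audio_features, audio_features)

# Case 2: hypothetical restriction to frames up to the "current" position.
current_frame = 2
visible = (np.arange(n_frames) <= current_frame)[None, :]
_, masked_weights = cross_attention(
    token_query, audio_features, audio_features, mask=visible
)

print("unmasked:", full_weights.round(3))
print("masked:  ", masked_weights.round(3))
```

In the unmasked case every frame gets a nonzero weight, so a token aligned with 2 s can draw on later audio. In the masked case the weights for frames 3 and 4 are exactly zero. Note that a mask like this would only limit the cross-attention step; if the encoder itself mixes information across time, each frame's features may already carry future context.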