The forward step of Transformer is type unstable. Running the example from the docs:
using Transformers

m = Transformer(512, 8, 64, 2048) # define a Transformer block with 8 heads and 64 neurons per head
x = randn(512, 30, 3)             # fake data: feature size 512, sequence length 30, batch size 3
y = m(x)
and checking the call with @code_warntype confirms the instability. The source is probably the multihead attention, but I have not been able to narrow it down any further.
I am using the latest tagged version of Transformers, 0.1.3, on Julia 1.4.1.
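For a quick pass/fail check, Test.@inferred from the standard library can stand in for reading the @code_warntype output by hand: it throws whenever the return type of a call cannot be concretely inferred. A minimal sketch, assuming the same setup as above:

using Test
using Transformers

m = Transformer(512, 8, 64, 2048)
x = randn(512, 30, 3)

# @inferred throws an ErrorException as long as the forward pass is
# type unstable; once inference works it simply returns y.
@inferred m(x)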
I can reproduce this result on Julia 1.4.2 with the master branch. It does look like there are some problems with type inference for multihead attention. I will take some time to fix this.
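For anyone following along: a common cause of this kind of instability in Julia layer structs is an untyped or abstractly typed field, which forces every use of that field to be inferred as Any. A minimal illustration of the pattern (hypothetical LooseLayer/TightLayer structs, not the actual Transformers.jl code):

# Untyped field: `w` is stored as `Any`, so the forward pass cannot infer.
struct LooseLayer
    w
end
(l::LooseLayer)(x) = l.w * x

# Parametric field: the weight's type is part of the struct's type,
# so the same forward pass infers concretely.
struct TightLayer{W}
    w::W
end
(l::TightLayer)(x) = l.w * x

# @code_warntype LooseLayer(randn(2, 2))(randn(2))  # Body::Any
# @code_warntype TightLayer(randn(2, 2))(randn(2))  # Body::Vector{Float64}

Whether this particular pattern is what is happening inside MultiheadAttention is only a guess; running @code_warntype on the layer's inner calls would confirm it.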