I think it is worthwhile to evaluate the performance of the Transformer. I have an indication that it is currently performing slower than the LSTM.
We’re improving the state of scalable GPU computing in Python.
-- Matthew Rocklin
This post covers Python, performance, and GPUs. It lays out the current status and describes future work.
It might be worth evaluating the performance boost from these techniques.
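As a starting point, here is a minimal timing sketch (assuming PyTorch; the model and input shapes below are placeholders, not the ones from this repo) that could be used to compare the two models on identical inputs:

```python
import time
import torch

def time_forward(model, x, n_iters=20):
    """Average forward-pass latency of `model` on input `x`, in seconds."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()
    x = x.to(device)
    with torch.no_grad():
        for _ in range(3):                # warm-up: load/cache CUDA kernels
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()      # flush queued GPU work before timing
        start = time.perf_counter()
        for _ in range(n_iters):
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_iters

# Example: input of shape (T, B, N) = (sequence length, batch, features).
x = torch.randn(512, 32, 64)
lstm = torch.nn.LSTM(input_size=64, hidden_size=64)
print(f"LSTM: {time_forward(lstm, x):.4f} s/iter")
```

The same helper can then be called on the Transformer model with the same input, so the comparison is apples to apples.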
Hi, yes, the Transformer currently performs slower than the LSTM on this implementation, with this dataset.
The LSTM and the Transformer have different complexities: a recurrent layer is O(T·N²), while a self-attention layer is O(T²·N), where T is the time dimension (sequence length) and N the input vector dimension. In our dataset T > N, which explains why the LSTM is faster than the Transformer.
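For illustration (the numbers are hypothetical, not from this dataset): with T = 1024 and N = 64, a recurrent layer costs T·N² ≈ 4.2M operations, while a self-attention layer costs T²·N ≈ 67M, i.e. a factor of T/N = 16 in favor of the LSTM.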
The LSTM class in PyTorch is implemented directly in CUDA (via cuDNN) if I'm not mistaken, whereas my Transformer is written in Python, using PyTorch of course. That added layer can slow things down. Additionally, I have made little effort to optimize it. If you are looking for a faster Transformer implementation, there is now a native one in PyTorch.
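For reference, a minimal sketch of the native module (`torch.nn.TransformerEncoder`, available since PyTorch 1.2); the hyperparameters here are illustrative, not the ones used in this repo:

```python
import torch
import torch.nn as nn

# Illustrative hyperparameters.
d_model, nhead, num_layers = 64, 4, 2
layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead)
encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

# Default input layout is (T, B, N): time, batch, feature dimension.
x = torch.randn(100, 32, d_model)
out = encoder(x)        # output keeps the input shape: (100, 32, 64)
print(out.shape)
```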