Performance Evaluation #8

Closed
DanielAtKrypton opened this issue May 24, 2020 · 1 comment

Comments

DanielAtKrypton commented May 24, 2020

I think it is worthwhile to evaluate the performance of the Transformer. I have indications that it currently runs slower than the LSTM.

"We're improving the state of scalable GPU computing in Python. This post covers Python, performance, and GPUs. It lays out the current status and describes future work."

-- Matthew Rocklin

It might be worth evaluating whether these techniques provide a performance boost here.

@maxjcohen (Owner) commented

Hi, yes, the Transformer currently runs slower with this implementation and this dataset.

  • LSTM and Transformer have different complexities: the LSTM runs in O(T·N²), while the Transformer runs in O(T²·N), where T is the time dimension and N the input vector dimension. In our dataset T > N, which explains why the LSTM is faster than the Transformer.
  • The LSTM class in PyTorch is, if I'm not mistaken, implemented directly in CUDA, whereas my Transformer is written in Python (using PyTorch, of course). This added layer can slow things down, and on top of that I have made little effort to optimize it. If you are looking for faster Transformer implementations, there is now a native one in PyTorch; see the sketch below for a quick comparison.
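A minimal micro-benchmark sketch of this comparison, assuming a recent PyTorch where nn.LSTM, nn.TransformerEncoderLayer, and nn.TransformerEncoder are available; the shapes, layer counts, and repetition count are illustrative assumptions, not measurements from this repository:

```python
# Hypothetical micro-benchmark (not part of this repo): times a forward
# pass of nn.LSTM against PyTorch's native Transformer encoder in the
# T > N regime discussed above.
import time

import torch
import torch.nn as nn

T, N, BATCH = 512, 32, 16     # T (time steps) > N (input features), as in the dataset above
x = torch.randn(T, BATCH, N)  # (seq, batch, feature): default layout for both modules

lstm = nn.LSTM(input_size=N, hidden_size=N, num_layers=2)
transformer = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=N, nhead=4, dim_feedforward=4 * N),
    num_layers=2,
)

def bench(module, inputs, reps=20):
    """Average forward-pass time of `module` over `reps` runs."""
    with torch.no_grad():
        module(inputs)  # warm-up run, excluded from timing
        start = time.perf_counter()
        for _ in range(reps):
            module(inputs)
    return (time.perf_counter() - start) / reps

print(f"LSTM:        {bench(lstm, x) * 1e3:.2f} ms / forward pass")
print(f"Transformer: {bench(transformer, x) * 1e3:.2f} ms / forward pass")
```

With a long sequence and a small feature dimension like this, the quadratic-in-T attention term tends to dominate, which is consistent with the LSTM coming out ahead here.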
