Investigate triton #258

Closed
epwalsh opened this issue Sep 6, 2023 · 2 comments
Assignees
Labels: difficulty/hard (May take a week or more), project/model (Related to modeling decisions and implementations), severity/could (A nice-to-have that we might not get to)

Comments

epwalsh (Member) commented Sep 6, 2023

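I've been messing around with Triton to see if it makes sense to start replacing some of our components with Triton implementations. So far the preliminary results look good: Triton beats Torch in every configuration below except the with-affine backward pass at N = 4096.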

I have been working on a Triton version of LayerNorm, both with and without the element-wise affine transform. Below are the benchmark results on an A100 GPU for a batch of 4096 tokens with a d_model (N in the tables) of 4096 (representative of a typical microbatch with our medium model) or 8192 (our large model).
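For reference, here is a minimal pure-Python sketch of what "with and without the element-wise affine transform" means for a single token vector. It is not the Triton kernel being benchmarked; the function name `layer_norm_reference` and the default `eps` are assumptions chosen for illustration.

```python
import math
from typing import List, Optional, Sequence


def layer_norm_reference(
    x: Sequence[float],
    weight: Optional[Sequence[float]] = None,  # element-wise scale (affine only)
    bias: Optional[Sequence[float]] = None,    # element-wise shift (affine only)
    eps: float = 1e-5,                         # assumed default, not from the issue
) -> List[float]:
    """Normalize one token vector of length d_model (hypothetical helper)."""
    n = len(x)
    mean = sum(x) / n
    # LayerNorm uses the biased variance (divide by n, not n - 1).
    var = sum((v - mean) ** 2 for v in x) / n
    rstd = 1.0 / math.sqrt(var + eps)
    y = [(v - mean) * rstd for v in x]
    # The affine variant applies a learned per-element scale and shift.
    if weight is not None:
        y = [yi * w for yi, w in zip(y, weight)]
    if bias is not None:
        y = [yi + b for yi, b in zip(y, bias)]
    return y


if __name__ == "__main__":
    token = [1.0, 2.0, 3.0, 4.0]
    print(layer_norm_reference(token))                          # no affine
    print(layer_norm_reference(token, [2.0] * 4, [1.0] * 4))    # with affine
```

The no-affine variant skips the `weight` and `bias` loads entirely, which is one reason the two cases are worth benchmarking separately.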

Throughput is reported in GB/s, so larger is better.
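As a rough sketch of how a GB/s figure like this can be derived from a measured kernel time: the issue does not say how bytes were counted, so the convention below (one read plus one write of the activation tensor) and the helper name `throughput_gbps` are assumptions, as are the element size and timing in the example.

```python
def throughput_gbps(
    n_tokens: int,
    d_model: int,
    bytes_per_element: int,
    elapsed_ms: float,
    passes: int = 2,  # assumed: one read + one write of the tensor
) -> float:
    """Convert a kernel time into effective memory throughput in GB/s."""
    bytes_moved = passes * n_tokens * d_model * bytes_per_element
    # bytes -> GB (1e-9), milliseconds -> seconds (1e-3)
    return bytes_moved * 1e-9 / (elapsed_ms * 1e-3)


if __name__ == "__main__":
    # Hypothetical numbers: 4096 x 4096 tensor, 2-byte elements, 0.1 ms per call.
    print(f"{throughput_gbps(4096, 4096, 2, 0.1):.2f} GB/s")
```

A backward pass touches more tensors than a forward pass, so a different `passes` value would be needed there; the forward and backward numbers are not directly comparable unless the byte counting is consistent.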

layer-norm-with-affine-forward:
          N       Triton        Torch
1    4096.0  1129.931006   936.228546
2    8192.0  1379.705219   949.797080

layer-norm-with-affine-backward:
          N      Triton       Torch
1    4096.0  309.132070  479.531698
2    8192.0  632.180043  491.520012

layer-norm-no-affine-forward:
          N       Triton        Torch
1    4096.0  1092.266694   992.969689
2    8192.0  1409.376308   949.797080

layer-norm-no-affine-backward:
          N       Triton       Torch
1    4096.0  1156.517652  750.412251
2    8192.0  1524.093109  712.347810
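
To make the tables easier to compare at a glance, here is a small sketch that turns them into Triton-over-Torch speedup ratios. The numbers are copied from the tables above; the names `RESULTS` and `speedups` are just for illustration.

```python
from typing import Dict, Tuple

# benchmark name -> {N: (Triton GB/s, Torch GB/s)}, copied from the tables above
RESULTS: Dict[str, Dict[int, Tuple[float, float]]] = {
    "with-affine-forward": {4096: (1129.931006, 936.228546), 8192: (1379.705219, 949.797080)},
    "with-affine-backward": {4096: (309.132070, 479.531698), 8192: (632.180043, 491.520012)},
    "no-affine-forward": {4096: (1092.266694, 992.969689), 8192: (1409.376308, 949.797080)},
    "no-affine-backward": {4096: (1156.517652, 750.412251), 8192: (1524.093109, 712.347810)},
}


def speedups(results: Dict[str, Dict[int, Tuple[float, float]]]) -> Dict[Tuple[str, int], float]:
    """Return the Triton / Torch throughput ratio for each (benchmark, N) pair."""
    return {
        (name, n): triton_gbps / torch_gbps
        for name, rows in results.items()
        for n, (triton_gbps, torch_gbps) in rows.items()
    }


if __name__ == "__main__":
    for (name, n), ratio in sorted(speedups(RESULTS).items(), key=lambda kv: kv[1]):
        print(f"{name:22s} N={n:<5d} {ratio:.2f}x")
```

A ratio above 1 means Triton is faster. The only ratio below 1 is the with-affine backward pass at N = 4096.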

epwalsh added the project/model, severity/could, and difficulty/hard labels on Sep 6, 2023
epwalsh self-assigned this on Sep 6, 2023

epwalsh (Member, Author) commented Sep 26, 2023

Marking as blocked again because it doesn't appear to work properly on AMD GPUs. See #260.

dumitrac (Contributor) commented

Marking the items prior to Feb 29th as "closed".
