Added packing for gelu forwards kernel #301

Merged: 1 commit, Apr 30, 2024

ChrisDryden
Contributor

This PR adds packed loads and stores to the GELU forward kernel, following the example provided. The kernel dev file was also updated to show the impact of changing the data type used for floatX.
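
To make the idea of packing concrete, here is a minimal CPU-side sketch in Python of the access pattern, not the PR's actual CUDA code. It assumes a pack is 128 bits wide (four float32 values) and that GELU uses the tanh approximation; the names `PACK_SIZE`, `gelu_forward_scalar`, and `gelu_forward_packed` are hypothetical and chosen only for illustration. Each simulated thread handles one whole pack instead of one element, which is what lets a real kernel issue one wide memory transaction in place of several narrow ones.

```python
import math
import struct

# Assumption: one pack is 128 bits (16 bytes), as is common for vectorized GPU access.
PACK_BYTES = 16
FLOAT_BYTES = struct.calcsize("f")        # 4 bytes for float32
PACK_SIZE = PACK_BYTES // FLOAT_BYTES     # 4 floats per pack

GELU_SCALING_FACTOR = math.sqrt(2.0 / math.pi)


def gelu(x: float) -> float:
    """tanh approximation of GELU (assumed variant)."""
    cube = 0.044715 * x * x * x
    return 0.5 * x * (1.0 + math.tanh(GELU_SCALING_FACTOR * (x + cube)))


def gelu_forward_scalar(inp):
    """Baseline: one simulated thread per element, one narrow load and store each."""
    return [gelu(x) for x in inp]


def gelu_forward_packed(inp):
    """Packed: one simulated thread per pack of PACK_SIZE contiguous elements."""
    n = len(inp)
    if n % PACK_SIZE != 0:
        raise ValueError("this sketch requires len(inp) to be a multiple of PACK_SIZE")
    out = [0.0] * n
    for thread in range(n // PACK_SIZE):
        base = thread * PACK_SIZE
        packed_in = inp[base:base + PACK_SIZE]       # stands in for one 128-bit load
        packed_out = [gelu(x) for x in packed_in]    # compute on values held locally
        out[base:base + PACK_SIZE] = packed_out      # stands in for one 128-bit store
    return out
```

Both functions return identical results; only the grouping of memory accesses differs. With a narrower element type, the same 16-byte pack would hold more elements, which is one reason the choice of floatX affects the measured numbers.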

Total average iteration time before changes: 38.480425 ms
Total average iteration time after changes: 37.817789 ms (about 1.7% lower)

Kernel benchmark by block size, before:
block_size 32 | time 0.1502 ms | bandwidth 335.17 GB/s
block_size 64 | time 0.0761 ms | bandwidth 661.65 GB/s
block_size 128 | time 0.0404 ms | bandwidth 1246.72 GB/s
block_size 256 | time 0.0376 ms | bandwidth 1337.40 GB/s
block_size 512 | time 0.0389 ms | bandwidth 1294.26 GB/s
block_size 1024 | time 0.0407 ms | bandwidth 1235.78 GB/s

Kernel benchmark by block size, after:
block_size 32 | time 0.0225 ms | bandwidth 2232.26 GB/s
block_size 64 | time 0.0198 ms | bandwidth 2547.13 GB/s
block_size 128 | time 0.0198 ms | bandwidth 2544.23 GB/s
block_size 256 | time 0.0201 ms | bandwidth 2508.65 GB/s
block_size 512 | time 0.0207 ms | bandwidth 2433.87 GB/s
block_size 1024 | time 0.0218 ms | bandwidth 2308.04 GB/s
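
The figures above can be cross-checked with a little arithmetic. The sketch below copies the timings from the two benchmark listings and computes per-block-size speedups, the fastest configuration in each run, and the amount of data moved that each time and bandwidth pair implies. The helper names are hypothetical; only the numbers come from the PR.

```python
# (block_size -> time in ms), copied from the benchmark output above
before = {32: 0.1502, 64: 0.0761, 128: 0.0404, 256: 0.0376, 512: 0.0389, 1024: 0.0407}
after = {32: 0.0225, 64: 0.0198, 128: 0.0198, 256: 0.0201, 512: 0.0207, 1024: 0.0218}

# End-to-end iteration times reported in the PR description
iteration_before_ms = 38.480425
iteration_after_ms = 37.817789


def speedups(before_times, after_times):
    """Ratio of old time to new time for each block size."""
    return {bs: before_times[bs] / after_times[bs] for bs in before_times}


def best(times):
    """Smallest time and every block size that achieves it."""
    fastest = min(times.values())
    return fastest, [bs for bs, t in times.items() if t == fastest]


def implied_megabytes(time_ms, bandwidth_gb_per_s):
    """GB/s times ms equals 1e9 B/s * 1e-3 s = 1e6 B, i.e. megabytes."""
    return time_ms * bandwidth_gb_per_s


def relative_reduction(old, new):
    """Fraction by which `new` is smaller than `old`."""
    return (old - new) / old


if __name__ == "__main__":
    for bs, s in speedups(before, after).items():
        print(f"block_size {bs:4d} | speedup {s:.2f}x")
    print("best before:", best(before))
    print("best after: ", best(after))
    print("implied traffic (MB):", round(implied_megabytes(0.1502, 335.17), 2))
    print("iteration time reduction:",
          f"{100 * relative_reduction(iteration_before_ms, iteration_after_ms):.2f}%")
```

Comparing the best configuration in each run (block size 256 before, 64 or 128 after), the kernel is roughly 1.9x faster, and the gain at block size 32 is roughly 6.7x. Every row implies about 50.3 MB of traffic per launch, so the bandwidth column is the same workload divided by the measured time. That would be consistent with reading and writing about 6.3 million 4-byte values, though the PR does not state the problem size.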

karpathy merged commit 242981e into karpathy:master on Apr 30, 2024
6 checks passed