Update residual_forward to use packed input #299

Merged 7 commits on May 2, 2024

JaneIllario
Contributor

Update residual_forward to read and write 128-bit packed values of floatX instead of one element per thread.
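For readers who have not seen the packed-access pattern, here is a minimal sketch of what a 128-bit packed residual add can look like. It is not the code from this PR: `Packed128`, `load128`, `store128`, `residual_forward_packed` and the host helper `residual_forward_max_error` are illustrative names, `floatX` is assumed to be `float` (a reduced-precision build would pack more elements into each 128-bit value), and the sketch assumes N is a multiple of the pack size and that all pointers are 16-byte aligned.

```cuda
#include <cuda_runtime.h>
#include <cmath>
#include <cstring>
#include <vector>

typedef float floatX;  // assumption: could be a 16-bit type in other builds

// Hypothetical 128-bit container: as many floatX values as fit in an int4.
struct alignas(16) Packed128 {
    static constexpr int size = sizeof(int4) / sizeof(floatX);
    floatX payload[size];
};

// Read 16 bytes through a single int4 access, then reinterpret as floatX.
__device__ inline Packed128 load128(const floatX* address) {
    int4 bits = *reinterpret_cast<const int4*>(address);
    Packed128 p;
    memcpy(&p, &bits, sizeof(bits));
    return p;
}

// Write 16 bytes through a single int4 access.
__device__ inline void store128(floatX* address, const Packed128& p) {
    int4 bits;
    memcpy(&bits, &p, sizeof(bits));
    *reinterpret_cast<int4*>(address) = bits;
}

// Each thread handles Packed128::size consecutive elements.
// Assumes N % Packed128::size == 0 and 16-byte aligned pointers.
__global__ void residual_forward_packed(floatX* out, const floatX* inp1,
                                        const floatX* inp2, int N) {
    int idx = (blockIdx.x * blockDim.x + threadIdx.x) * Packed128::size;
    if (idx < N) {
        Packed128 a = load128(inp1 + idx);
        Packed128 b = load128(inp2 + idx);
        Packed128 o;
        for (int k = 0; k < Packed128::size; ++k) {
            o.payload[k] = (floatX)((float)a.payload[k] + (float)b.payload[k]);
        }
        store128(out + idx, o);
    }
}

// Host helper: runs the kernel and returns the max abs error against a CPU
// reference, or -1.0f if any CUDA call fails.
float residual_forward_max_error(int N, int block_size) {
    std::vector<floatX> a(N), b(N), out(N);
    for (int i = 0; i < N; ++i) { a[i] = 0.5f * i; b[i] = 1.0f - 0.25f * i; }

    size_t bytes = (size_t)N * sizeof(floatX);
    floatX *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    if (cudaMalloc((void**)&d_a, bytes) != cudaSuccess) return -1.0f;
    if (cudaMalloc((void**)&d_b, bytes) != cudaSuccess) return -1.0f;
    if (cudaMalloc((void**)&d_out, bytes) != cudaSuccess) return -1.0f;
    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

    // One thread per pack, so the grid shrinks by a factor of Packed128::size.
    int per_block = block_size * Packed128::size;
    int grid_size = (N + per_block - 1) / per_block;
    residual_forward_packed<<<grid_size, block_size>>>(d_out, d_a, d_b, N);
    cudaError_t err = cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_a); cudaFree(d_b); cudaFree(d_out);
    if (err != cudaSuccess) return -1.0f;

    float max_err = 0.0f;
    for (int i = 0; i < N; ++i) {
        max_err = fmaxf(max_err, fabsf((float)out[i] - (a[i] + b[i])));
    }
    return max_err;
}
```

Because each thread moves 16 bytes per access, the grid needs fewer threads for the same N, which is one plausible reason small block sizes would stop being a bottleneck.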

Previous Kernel:
block_size 32 | time 0.1498 ms | bandwidth 503.99 GB/s
block_size 64 | time 0.0760 ms | bandwidth 993.32 GB/s
block_size 128 | time 0.0490 ms | bandwidth 1540.78 GB/s
block_size 256 | time 0.0487 ms | bandwidth 1548.88 GB/s
block_size 512 | time 0.0487 ms | bandwidth 1548.88 GB/s
block_size 1024 | time 0.0497 ms | bandwidth 1518.38 GB/s

total average iteration time: 39.030942 ms

New Kernel:
block_size 32 | time 0.0219 ms | bandwidth 3440.86 GB/s
block_size 64 | time 0.0214 ms | bandwidth 3522.09 GB/s
block_size 128 | time 0.0223 ms | bandwidth 3392.29 GB/s
block_size 256 | time 0.0225 ms | bandwidth 3357.22 GB/s
block_size 512 | time 0.0226 ms | bandwidth 3333.70 GB/s
block_size 1024 | time 0.0225 ms | bandwidth 3352.64 GB/s

total average iteration time: 38.639469 ms
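A quick sanity check on the numbers above, derived only from the reported figures. The symbols t, BW and T_iter are shorthand introduced here, and the comparison picks each kernel's fastest reported block size.

```latex
\begin{align*}
% time x bandwidth gives the bytes counted per launch; both tables agree
t_{\text{prev}} \cdot BW_{\text{prev}} &= 0.0487\,\text{ms} \times 1548.88\,\text{GB/s} \approx 75.4\,\text{MB} \\
t_{\text{new}} \cdot BW_{\text{new}} &= 0.0219\,\text{ms} \times 3440.86\,\text{GB/s} \approx 75.4\,\text{MB} \\
% kernel speedup at the fastest block size of each version
\frac{t_{\text{prev}}}{t_{\text{new}}} &= \frac{0.0487}{0.0214} \approx 2.28 \\
% end-to-end change in average iteration time
\frac{T_{\text{iter,prev}} - T_{\text{iter,new}}}{T_{\text{iter,prev}}} &= \frac{39.030942 - 38.639469}{39.030942} \approx 1.0\%
\end{align*}
```

So both tables count the same amount of data per launch, the kernel itself is roughly 2.3x faster, and the overall iteration time improves by about 1%.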

karpathy merged commit 3d1761b into karpathy:master on May 2, 2024