Megakernel to match Apple Silicon Efficiency at 2x the Throughput on an RTX 3090
Updated Apr 10, 2026 - Cuda