Pull requests: karpathy/llm.c
Pull requests list
run Adam only for parameters that are available on the local device
#463
opened May 25, 2024 by
ngc92
Added new cuda kernel for encoder forwards using three dimensional kernels
#459
opened May 25, 2024 by
ChrisDryden
only save missing bits to reconstruct fp32 master weights
#432
opened May 19, 2024 by
ngc92
NCCL only multi gpu training for slurm enabled cluster
#426
opened May 17, 2024 by
chinthysl
vectorized gemm loading and use register to hold the intermediate value
#424
opened May 17, 2024 by
patricxu
Include a matmul_backward_bias kernel based on PMPP CoarsenedSumReduction kernel in 10.15
#419
opened May 16, 2024 by
lancerts
Modified version of ademeure's fused gelu_forward kernel
#363
opened May 5, 2024 by
ChrisDryden
Experimenting with global instantiation for the layouts
#347
opened May 3, 2024 by
ChrisDryden
Draft
Added FlameGraphs for nsys reports and some nsys documentation
#333
opened May 2, 2024 by
PeterZhizhin
Rewrite the encoder_forward float4 kernel with pack128
#302
opened Apr 30, 2024 by
lancerts