Supplementary Material for Lectures

The PMPP Book: Programming Massively Parallel Processors: A Hands-on Approach (Amazon link)

Lecture 1: Profiling and Integrating CUDA kernels in PyTorch

Video
Date: 2024-01-13, Speaker: Mark Saroufim
Notebook and slides in the lecture1 folder

Lecture 2: Recap Ch. 1-3 from the PMPP book

Video
Date: 2024-01-20, Speaker: Andreas Koepf
Slides: The PowerPoint file is at lecture2/cuda_mode_lecture2.pptx in this repository. Alternatively, the slides are available as a Google Docs presentation.

Lecture 3: Getting Started With CUDA

Video
Date: 2024-01-27, Speaker: Jeremy Howard
Notebook: See the lecture3 folder, or run the Colab version

Lecture 4: Intro to Compute and Memory Architecture

Video
Date: 2024-02-03, Speaker: Thomas Viehmann
Notebook and slides in the lecture4 folder.

Lecture 5: Going Further with CUDA for Python Programmers

Video
Date: 2024-02-10, Speaker: Jeremy Howard
Notebook in the lecture5 folder.

Lecture 6: Optimizing PyTorch Optimizers

Video
Date: 2024-02-17, Speaker: Jane Xu
Slides

Lecture 7: Advanced Quantization

Video
Date: 2024-02-25, Speaker: Charles Hernandez
Slides

Lecture 8: CUDA Performance Checklist

Video
Date: 2024-03-09, Speaker: Mark Saroufim
Code in the lecture8 folder
Slides

Lecture 9: Reductions

Video
Date: 2024-03-09, Speaker: Mark Saroufim
Code in the lecture9 folder
Slides

Lecture 10: Build a Prod Ready CUDA Library

Video
Date: 2024-03-16, Speaker: Oscar Amoros Huguet
Slides

Lecture 11: Sparsity

Video
Date: 2024-03-23, Speaker: Jesse Cai
Slides

Lecture 12: Flash Attention

Video
Date: 2024-03-30, Speaker: Thomas Viehmann

Lecture 13: Ring Attention

Video
Date: 2024-04-06, Speaker: Andreas Koepf
Slides

Lecture 14: Practitioner's Guide to Triton

Video
Date: 2024-04-13, Speaker: Umer Adil
Notebook: A_Practitioners_Guide_to_Triton.ipynb in the lecture 14 folder

