ml-systems-notes

a personal collection of notes on ml systems engineering covering distributed computing, parallelism, quantization, and pytorch internals.

everything here is a work in progress. i add notes as i do experiments and projects.

distributed-techniques - distributed training fundamentals: nccl collectives (gather, all-gather, reduce, all-reduce, scatter, reduce-scatter), mixture-of-experts, parallelism strategies (dp, ddp, zero, tensor/pipeline parallelism), and torch.distributed basics.
quantization - model quantization from first principles: symmetric/asymmetric quantization, llm.int8(), awq, smoothquant, gptq/obs/obq, and quip.
torch-notes - pytorch internals.
jax-scaling-book - roofline analysis exercises for matrix multiplication in a jax/tpu context.
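
as a small taste of the quantization notes, here is a minimal sketch of symmetric vs. asymmetric quantization in plain python. the function names (`quantize_symmetric`, `quantize_asymmetric`, `dequantize`) and the sample weights are made up for illustration and are not taken from the notes themselves; per-tensor scaling and round-to-nearest are assumptions, and real schemes (per-channel, per-group, calibration-based) differ.

```python
def quantize_symmetric(xs, num_bits=8):
    """symmetric: zero-point fixed at 0, scale set by the largest magnitude."""
    qmax = 2 ** (num_bits - 1) - 1              # 127 for int8
    max_abs = max(abs(x) for x in xs)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(x / scale))) for x in xs]
    return q, scale


def quantize_asymmetric(xs, num_bits=8):
    """asymmetric: map [min, max] onto the unsigned range via a zero-point."""
    qmax = 2 ** num_bits - 1                    # 255 for uint8
    # widen the range to include 0.0 so that zero is exactly representable
    lo, hi = min(min(xs), 0.0), max(max(xs), 0.0)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    q = [max(0, min(qmax, round(x / scale) + zero_point)) for x in xs]
    return q, scale, zero_point


def dequantize(q, scale, zero_point=0):
    """map integer codes back to approximate real values."""
    return [(v - zero_point) * scale for v in q]


# hypothetical weights with a skewed range (more positive than negative)
weights = [-0.4, 0.0, 0.1, 1.6]

q_sym, s_sym = quantize_symmetric(weights)
q_asym, s_asym, zp = quantize_asymmetric(weights)

print("symmetric :", q_sym, dequantize(q_sym, s_sym))
print("asymmetric:", q_asym, dequantize(q_asym, s_asym, zp))
```

for this skewed range the asymmetric step size (2.0 / 255) is smaller than the symmetric one (1.6 / 127), because the symmetric scheme spends half of its codes on negative values that the data barely uses.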
