Learning how to build my own large(-ish) language models from scratch

A record of my attempts to recreate what I learn about ML as I figure out how to build LLMs from scratch.

To start, I am working through Karpathy's YouTube lecture series, Neural Networks: Zero to Hero. We will see how far I get...

(I am also using this as an excuse to set up WSL on my desktop PC and learn how to use GPU acceleration on it.)
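
For the GPU part, a quick sanity check helps confirm that the WSL environment can actually see the graphics card. The snippet below is only a minimal sketch: it assumes PyTorch (the library used in Karpathy's lectures) is the framework in play, and `describe_device` is a hypothetical helper name of my own, not something from the course.

```python
import importlib.util


def describe_device() -> str:
    """Return a short description of the device PyTorch would use."""
    # Assumption: PyTorch is the framework. Bail out gracefully if it is missing.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"

    import torch

    if torch.cuda.is_available():
        # Name of the first visible GPU, as reported by the CUDA driver.
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "CUDA not available, falling back to CPU"


print(describe_device())
```

If this reports a CPU fallback inside WSL, the usual suspects are the Windows-side GPU driver or a CPU-only PyTorch build.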

Useful resources I found along the way

This is mainly here to force me to stop hoarding them in open tabs :)

Karpathy's course & repo (linked above)
WSL setup (linked above)
This guide from freeCodeCamp on setting up WSL for NN applications with CUDA support
Maxime Labonne's LLM Course for a meta-list of learning paths
Fast.ai's Practical Deep Learning for Coders, especially its new Part II
Hugging Face's Transformers docs
