This repository contains implementations and exercises for building a Large Language Model (LLM) from the ground up, following Sebastian Raschka's book *Build a Large Language Model (From Scratch)* and its companion repository.
Each chapter covers a critical component of the LLM pipeline, from data preparation to instruction fine-tuning; minimal code sketches illustrating each chapter's core idea follow the list below.
- ch02: Working with Text Data - Tokenization and data sampling.
- ch03: Coding Attention Mechanisms - Self-attention, causal attention, and multi-head attention.
- ch04: Implementing a GPT Model - Building the GPT architecture and various attention optimizations.
- ch05: Training on Unlabeled Data - Loss calculation, training loops, and loading pretrained weights.
- ch06: Fine-Tuning for Classification - Adapting the model for tasks like spam detection.
- ch07: Fine-Tuning to Follow Instructions - Instruction fine-tuning for conversational capabilities.
- Each chapter's folder contains a main notebook and supporting scripts that demonstrate the concepts covered in that chapter.
- For the authoritative source and additional resources, please visit the main repository.
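
To make the chapter list concrete, the sketches below illustrate one core idea per chapter. They are simplified relative to the actual notebook code and assume PyTorch and `tiktoken` are installed. For chapter 2, tokenization with the GPT-2 byte-pair encoder and sliding-window data sampling might look like this:

```python
import tiktoken

# BPE tokenization with the GPT-2 encoding used throughout the book.
tokenizer = tiktoken.get_encoding("gpt2")
text = "Building an LLM from scratch, one chapter at a time."
token_ids = tokenizer.encode(text)
assert tokenizer.decode(token_ids) == text  # encoding round-trips

# Sliding-window sampling: each input chunk is paired with a target
# chunk shifted one token ahead (the next-token-prediction objective).
context_length = 4
pairs = [
    (token_ids[i : i + context_length], token_ids[i + 1 : i + context_length + 1])
    for i in range(len(token_ids) - context_length)
]
```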
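
For chapter 3, a single-head causal self-attention module, stripped down from the multi-head version built in the notebook (dimension names are illustrative):

```python
import torch
import torch.nn as nn

class CausalSelfAttention(nn.Module):
    def __init__(self, d_in, d_out, context_length):
        super().__init__()
        self.W_query = nn.Linear(d_in, d_out, bias=False)
        self.W_key = nn.Linear(d_in, d_out, bias=False)
        self.W_value = nn.Linear(d_in, d_out, bias=False)
        # Upper-triangular mask that hides future positions.
        self.register_buffer(
            "mask",
            torch.triu(torch.ones(context_length, context_length), diagonal=1).bool(),
        )

    def forward(self, x):  # x: (batch, seq_len, d_in)
        seq_len = x.shape[1]
        queries, keys, values = self.W_query(x), self.W_key(x), self.W_value(x)
        # Scaled dot-product scores, with future positions masked out.
        scores = queries @ keys.transpose(1, 2) / keys.shape[-1] ** 0.5
        scores = scores.masked_fill(self.mask[:seq_len, :seq_len], float("-inf"))
        return torch.softmax(scores, dim=-1) @ values  # (batch, seq_len, d_out)
```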
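
For chapter 4, the GPT architecture is driven by a configuration dictionary; the values below match the GPT-2 "small" (124M-parameter) setup the book builds on:

```python
GPT_CONFIG_124M = {
    "vocab_size": 50257,     # GPT-2 BPE vocabulary size
    "context_length": 1024,  # maximum sequence length
    "emb_dim": 768,          # embedding dimension
    "n_heads": 12,           # attention heads per transformer block
    "n_layers": 12,          # number of transformer blocks
    "drop_rate": 0.1,        # dropout probability
    "qkv_bias": False,       # bias terms in query/key/value projections
}
```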
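
For chapter 5, the pretraining loss is plain cross-entropy between the model's logits and the input token IDs shifted one position ahead; a minimal sketch with random tensors standing in for a real model and batch:

```python
import torch
import torch.nn.functional as F

# Logits have shape (batch, seq_len, vocab_size); targets are the
# input token IDs shifted one position to the left.
logits = torch.randn(2, 4, 50257)           # stand-in for model(inputs)
targets = torch.randint(0, 50257, (2, 4))   # stand-in for shifted inputs

# Flatten batch and sequence dimensions so every position is scored.
loss = F.cross_entropy(logits.flatten(0, 1), targets.flatten())
```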
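
For chapter 6, classification fine-tuning amounts to freezing the pretrained weights and replacing the vocabulary-sized output head with a small task head. This sketch assumes the `GPTModel` class and `GPT_CONFIG_124M` dictionary from the chapter 4 code:

```python
import torch.nn as nn

model = GPTModel(GPT_CONFIG_124M)  # assumed from the chapter 4 code

# Freeze the pretrained weights ...
for param in model.parameters():
    param.requires_grad = False

# ... and attach a trainable two-class head (spam vs. not spam);
# its parameters are newly created, so they remain trainable.
num_classes = 2
model.out_head = nn.Linear(GPT_CONFIG_124M["emb_dim"], num_classes)
```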
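
For chapter 7, instruction examples are rendered into an Alpaca-style prompt before training; a sketch of the formatting, assuming entries with `instruction`, `input`, and `output` fields:

```python
def format_input(entry):
    # Alpaca-style prompt: a fixed preamble, the instruction, and an
    # optional input section for tasks that provide extra context.
    instruction_text = (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request."
        f"\n\n### Instruction:\n{entry['instruction']}"
    )
    input_text = f"\n\n### Input:\n{entry['input']}" if entry["input"] else ""
    return instruction_text + input_text

entry = {
    "instruction": "Convert 10 km to miles.",
    "input": "",
    "output": "10 km is about 6.2 miles.",
}
training_text = format_input(entry) + f"\n\n### Response:\n{entry['output']}"
```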