LLM Foundations Guide 🚀

A free, community-driven, visual guide to understanding Large Language Models from scratch. No paywalls. No fluff. Just clear explanations with diagrams and runnable code.

📚 What You Will Learn

This guide covers 12 chapters that take you from "what is an LLM?" all the way to fine-tuning your own generation model. Each chapter has a Jupyter Notebook with working code you can run on Google Colab (free GPU) or locally.

🗂️ Table of Contents

Chapter	Topic	Notebook
01	Introduction to Language Models	Open
02	Tokens and Embeddings	Open
03	Inside the Transformer	Open
04	Text Classification	Open
05	Text Clustering and Topic Modeling	Open
06	Prompt Engineering	Open
07	Advanced Text Generation	Open
08	Semantic Search and RAG	Open
09	Multimodal LLMs	Open
10	Creating Text Embedding Models	Open
11	Fine-Tuning BERT for Classification	Open
12	Fine-Tuning Generation Models	Open

🏁 How the LLM Journey Works

Here is the big picture before we dive into any chapter:

graph LR
    A[Raw Internet Text\nTrillions of tokens] -->|Self-supervised Pre-training| B(Base LLM\nPredicts next token)
    B -->|Supervised Fine-Tuning| C(Instruction-Following Model\nFollows commands)
    C -->|RLHF / DPO| D(Aligned Assistant\nHelpful + Harmless)
    D -->|RAG / Tool Use| E(Production App\nConnected to your data)

Plain English: The model reads vast amounts of text and learns to predict the next word. We then teach it to follow instructions, align it to be helpful and safe, and finally connect it to your own private data.

🛠️ Setup

git clone https://github.com/ashish993/LLM-Foundations-Guide.git
cd LLM-Foundations-Guide
pip install -r requirements.txt

Or open any notebook directly in Google Colab — click the badge inside each chapter notebook. No local setup required!

📦 Key Packages

Package	Purpose
`transformers`	Load and run any Hugging Face model in 2 lines
`datasets`	Access thousands of pre-built benchmark datasets
`sentence-transformers`	Build and use semantic embedding models
`torch`	PyTorch deep learning backbone
`scikit-learn`	Classical ML utilities (classifiers, metrics)
`bertopic`	State-of-the-art topic modeling

See the full requirements.txt.

📖 Chapter Highlights

Chapter 1 — Introduction to Language Models

"How does a machine learn language?"

Language models learn by reading text and predicting the next word, billions of times. We cover the history: n-gram models → RNNs → Transformers, with visual diagrams at every step.

Chapter 3 — Inside the Transformer

"What actually happens inside ChatGPT?"

We open the black box. You will understand Attention, Multi-Head Attention, positional encoding, and why Transformers are so powerful — all explained with visuals and analogies.

Chapter 6 — Prompt Engineering

"The art of talking to AI"

A prompt is not just a question — it is a precise instruction. We explore zero-shot, few-shot, chain-of-thought, and structured output prompting with practical exercises.

Chapter 8 — Semantic Search & RAG

"Giving AI access to your private data"

RAG (Retrieval Augmented Generation) lets an AI answer questions about documents it was never trained on. It is the most important pattern in enterprise AI today.

Chapter 12 — Fine-Tuning Generation Models

"Make the AI yours"

Use LoRA (Low-Rank Adaptation) to fine-tune a large model on your own dataset using a single GPU. We walk through every line of code.

🤝 Contributing

This is a community project! See CONTRIBUTING.md to add a chapter, fix a bug, or improve an explanation.

📄 License

MIT License — free to use, modify, and share.

Acknowledgements

Thanks to all contributors and the open source community for making this guide possible.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
chapters		chapters
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

LLM Foundations Guide 🚀

📚 What You Will Learn

🗂️ Table of Contents

🏁 How the LLM Journey Works

🛠️ Setup

📦 Key Packages

📖 Chapter Highlights

Chapter 1 — Introduction to Language Models

Chapter 3 — Inside the Transformer

Chapter 6 — Prompt Engineering

Chapter 8 — Semantic Search & RAG

Chapter 12 — Fine-Tuning Generation Models

🤝 Contributing

📄 License

Acknowledgements

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

LLM Foundations Guide 🚀

📚 What You Will Learn

🗂️ Table of Contents

🏁 How the LLM Journey Works

🛠️ Setup

📦 Key Packages

📖 Chapter Highlights

Chapter 1 — Introduction to Language Models

Chapter 3 — Inside the Transformer

Chapter 6 — Prompt Engineering

Chapter 8 — Semantic Search & RAG

Chapter 12 — Fine-Tuning Generation Models

🤝 Contributing

📄 License

Acknowledgements

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages