# 📘 Day-3 Notes: Generative AI – LLMs, LangChain & Fine-tuning

---

## 🔹 Large Language Models (LLMs)

**Definition:**

* LLM = Large Language Model → Trained on massive text datasets, predicts next tokens.
* Powerful for tasks like translation, summarization, Q&A, reasoning, coding, etc.
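
To make "predicts next tokens" concrete, here is a minimal sketch of the final step of next-token prediction: scores (logits) over a vocabulary are turned into probabilities and the most likely token is picked. The vocabulary and logit values are made up for illustration; a real LLM computes the logits with a neural network over a vocabulary of many thousands of tokens.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and scores for the context "The cat sat on the".
vocab = ["mat", "dog", "moon", "sofa"]
logits = [3.0, 0.5, -1.0, 2.0]

probs = softmax(logits)
# Greedy decoding: pick the token with the highest probability.
next_token = vocab[probs.index(max(probs))]
print(next_token)
```

Greedy decoding is only one option; sampling from `probs` instead is a common alternative when more varied output is wanted.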

### ✅ Key Steps in LLM Development

1. **Data Preprocessing & Curation**
   * Clean text → remove duplicates, noise, sensitive data
   * Sources: Internet, Wikipedia, books, research papers, Hugging Face datasets

2. **Model Architecture**
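   * Transformers (encoder-only, decoder-only, or encoder-decoder/seq2seq) are the basis of modern LLMs; RNNs and CNNs are earlier sequence-model architectures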
   * Residual connections, normalization, activation functions (ReLU, GELU, Swish), positional embeddings

3. **Training**
   * Requires **large GPU/TPU infra**
   * Training FLOPs ↑ roughly in proportion to parameter count × training tokens (rule of thumb: ≈ 6 × N × D)
   * Avoid **overfitting** (model too large for the data, or trained too long) and **underfitting** (model too small, or trained too little)

4. **Prompt Engineering**
   * Zero-shot, few-shot learning, chain-of-thought prompting
   * Structured output (JSON, schemas, tools)
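
For the training step above, a widely used rule of thumb estimates training compute as about 6 × N × D FLOPs, where N is the parameter count and D is the number of training tokens. The sketch below applies that approximation; the model and dataset sizes are hypothetical and the function name `training_flops` is invented for this example.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate training compute with the 6 * N * D rule of thumb."""
    return 6.0 * n_params * n_tokens

# Hypothetical sizes, chosen only for illustration.
small = training_flops(1e9, 20e9)  # 1B parameters, 20B tokens
large = training_flops(2e9, 20e9)  # 2B parameters, same tokens

print(f"{small:.2e} FLOPs vs {large:.2e} FLOPs")
print("ratio:", large / small)
```

Under this approximation, doubling the parameter count at a fixed token budget doubles the compute.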

---

## 🔹 Fine-tuning LLMs

**Definition:** Adapting a pre-trained base model to a specific task/domain.

**Steps:**
1. Select base model (GPT, LLaMA, T5, etc.)
2. Adjust hyperparameters (which layers to train or freeze, learning rate, optimizer)
3. Train on task-specific dataset
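
The three steps can be sketched on a toy scale. This is not how an LLM is fine-tuned in practice (that would use a deep-learning framework and a real checkpoint); it is a one-variable linear model in plain Python, with made-up "pre-trained" weights and a made-up task dataset, showing that fine-tuning means continuing gradient descent from existing weights rather than from scratch.

```python
def predict(w, b, x):
    return w * x + b

def mse(w, b, data):
    """Mean squared error of the model on a dataset."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in data) / len(data)

def fine_tune(w, b, data, lr=0.01, steps=200):
    """Continue gradient descent starting from the given weights."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in data) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Step 1: "base model" weights, assumed to come from earlier pre-training.
base_w, base_b = 2.0, 0.0
# Step 3: small task-specific dataset following y = 2.5x + 1 (made up).
task_data = [(0.0, 1.0), (1.0, 3.5), (2.0, 6.0), (3.0, 8.5)]

# Step 2: hyperparameters (learning rate and number of steps).
tuned_w, tuned_b = fine_tune(base_w, base_b, task_data, lr=0.05, steps=500)
print(mse(base_w, base_b, task_data), "->", mse(tuned_w, tuned_b, task_data))
```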

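**Open-weight models that can be fine-tuned locally:**
* Google T5
* Meta LLaMA (widely used)

⚠️ Proprietary models (GPT-3.5, GPT-4, Claude, PaLM) do not release their weights, so they **cannot be fine-tuned locally**. Some vendors offer fine-tuning only through a hosted API (e.g. OpenAI for GPT-3.5 Turbo).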

---

## 🔹 LangChain & Ecosystem

**LangChain = Framework for building LLM-powered apps**

### 🌐 Core Components
* **LangChain-Core** → Base abstractions for LLMs, prompts, memory
* **LangChain** → Chains, agents, retrieval pipelines
* **LangChain-Community** → 3rd party integrations
* **LangGraph** → Orchestration layer (build workflows, agents, persistence)
* **LangSmith** → Debugging, evaluation, monitoring

### 🏗️ LangChain Concepts
* **Chat Models** → process messages (input: user msg, output: AI msg)
* **Memory** → save conversation history
* **Tools** → external APIs, databases, functions
* **Vector Stores** → store embeddings for retrieval
* **Retrieval Augmented Generation (RAG)** → combine LLMs with external knowledge
* **Streaming** → surface results in real-time
* **Prompt Templates** → reusable prompt structures
* **Output Parsers** → clean structured outputs (JSON, dicts)
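
The way these pieces fit together can be sketched without the library. The code below is a plain-Python illustration of the pattern (prompt template → chat model → output parser), not LangChain's API: every function name is hypothetical and the model call is a stub that returns a canned reply.

```python
import json

def render_prompt(template: str, **variables) -> str:
    """Prompt template: fill named placeholders in a reusable string."""
    return template.format(**variables)

def fake_chat_model(prompt: str) -> str:
    """Stand-in for a chat model call; returns a canned JSON reply."""
    return '{"sentiment": "positive", "confidence": 0.9}'

def parse_json_output(raw: str) -> dict:
    """Output parser: turn the model's text reply into a Python dict."""
    return json.loads(raw)

TEMPLATE = 'Classify the sentiment of this review and answer in JSON: "{review}"'

# Chain the three steps: template -> model -> parser.
prompt = render_prompt(TEMPLATE, review="Great battery life!")
result = parse_json_output(fake_chat_model(prompt))
print(result["sentiment"])
```

In a real app the stub would be replaced by an actual chat model client, and the parser would also need to handle replies that are not valid JSON.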

---

## 🔹 LangSmith

* Playground → test prompts
* Debugging poor LLM runs
* Annotation, evaluation, monitoring
* Prompt optimization

---

## 🔹 Training LLMs – Techniques

* **Mixed Precision Training** → do most math in 16-bit floats (FP16/BF16) while keeping a 32-bit master copy of the weights; cuts memory use and compute cost
* **Parallelism**:
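  * Data Parallelism → replicate the model on every GPU; each GPU processes a different shard of the batch
  * Model (Tensor) Parallelism → split the weights within each layer across GPUs
  * Pipeline Parallelism → assign different layers (stages) to different GPUs
  * **3D Parallelism** = data + tensor + pipeline parallelism combined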

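* **ZeRO Optimizer** → partitions optimizer states, gradients, and (at its highest stage) parameters across data-parallel workers instead of replicating them, which reduces memory redundancy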
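
One reason mixed precision training keeps some values in 32-bit is that very small gradients underflow to zero in 16-bit floats. The sketch below shows this with Python's `struct` module (format `"e"` is IEEE 754 half precision) and shows loss scaling, a common remedy. The gradient value and scale factor are hypothetical, and Python's 64-bit float stands in for FP32 here.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

grad = 1e-8          # hypothetical tiny gradient
loss_scale = 1024.0  # hypothetical loss-scale factor

# Stored directly in FP16, the gradient is too small and rounds to 0.0.
underflowed = to_fp16(grad)

# Scaling before the FP16 cast keeps it representable.
scaled = to_fp16(grad * loss_scale)

# Unscale in higher precision before the weight update.
recovered = scaled / loss_scale

print(underflowed, recovered)
```

The recovered value is close to the original gradient but not identical, since FP16 still rounds the scaled value.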

---

## 🔹 Key Interview Questions (with short answers)

**Q1. What is fine-tuning in LLMs?**  
➡ Adjusting parameters of a pre-trained model for a specific task using smaller task-specific datasets.

**Q2. Which models can be fine-tuned?**  
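➡ Open-source / open-weight: LLaMA, T5, BERT, Dolly, CTRL. Proprietary (GPT-3.5, GPT-4, Claude, PaLM) → weights not released, so ❌ no local fine-tuning; only via a vendor's hosted fine-tuning API where offered.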

**Q3. Why do we need data curation before training LLMs?**  
➡ To remove duplicates, noise, and sensitive data → ensures model quality and fairness.
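
A minimal sketch of such a curation pass, assuming only exact (case-insensitive) duplicates and e-mail addresses need handling. Real pipelines also use near-duplicate detection and broader sensitive-data filters; the regex and function names here are illustrative.

```python
import re

# Simple e-mail pattern, good enough for this illustration.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def clean(text: str) -> str:
    """Mask e-mail addresses and collapse runs of whitespace."""
    text = EMAIL.sub("[EMAIL]", text)
    return re.sub(r"\s+", " ", text).strip()

def curate(docs):
    """Clean each document, then drop empties and case-insensitive duplicates."""
    seen, kept = set(), []
    for doc in docs:
        cleaned = clean(doc)
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return kept

raw_docs = [
    "LLMs predict   the next token.",
    "llms predict the next token.",
    "Contact me at jane@example.com  for data.",
    "   ",
]
curated = curate(raw_docs)
print(curated)
```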

**Q4. Difference between Zero-shot and Few-shot prompting?**  
➡ Zero-shot: No examples, only instruction.  
➡ Few-shot: Instruction + a few examples → improves accuracy.
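
The difference is easiest to see in the prompt text itself. The helper below is a hypothetical sketch (the `Input:`/`Output:` layout is one common convention, not a required format): with no examples it yields a zero-shot prompt, and with examples a few-shot prompt.

```python
def build_prompt(instruction, query, examples=()):
    """Return a zero-shot prompt, or a few-shot prompt if examples are given."""
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Input: {text}", f"Output: {label}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)

instruction = "Classify the sentiment as positive or negative."

# Zero-shot: instruction and query only.
zero_shot = build_prompt(instruction, "The screen cracked in a week.")

# Few-shot: the same instruction plus two worked examples.
few_shot = build_prompt(
    instruction,
    "The screen cracked in a week.",
    examples=[("Love it!", "positive"), ("Waste of money.", "negative")],
)
print(few_shot)
```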

**Q5. What is Retrieval Augmented Generation (RAG)?**  
➡ Technique combining LLMs with external knowledge bases to improve factual correctness.
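
A toy sketch of the RAG flow: retrieve the most relevant document, then put it into the prompt as context. Retrieval here is plain word overlap, chosen to keep the example self-contained; real systems usually retrieve with embeddings and a vector store. All function names are hypothetical.

```python
def tokenize(text):
    """Lowercase, strip basic punctuation, and split into a set of words."""
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(question, documents, k=1):
    """Rank documents by how many words they share with the question."""
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_rag_prompt(question, documents):
    """Augment the question with the retrieved context."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

knowledge_base = [
    "LangSmith is used for debugging and evaluating LLM runs.",
    "ZeRO reduces memory redundancy during training.",
]
rag_prompt = build_rag_prompt("What is LangSmith used for?", knowledge_base)
print(rag_prompt)
```

The resulting prompt would then be sent to the LLM, which answers from the supplied context rather than from memory alone.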

**Q6. What is LangChain used for?**  
➡ Building LLM-powered apps (chatbots, RAG apps, agents) with modular components (memory, tools, prompts).

**Q7. How do you handle hallucinations in LLMs?**  
➡ Use RAG, structured prompting, fine-tuning with domain-specific data, evaluation via LangSmith.

**Q8. Why are LLMs called "large"?**  
➡ Because of huge **parameters (billions+), dataset size, and compute resources** required.

**Q9. What’s the role of embeddings in LLM apps?**  
➡ Convert text/images into vectors → used for similarity search, retrieval, clustering.
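
Similarity search over embeddings usually comes down to comparing vectors, often with cosine similarity. In this sketch the 3-dimensional vectors are invented by hand; real embeddings come from an embedding model and have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings acting as a tiny in-memory vector store.
store = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.3],
}
query = [1.0, 0.0, 0.0]  # hypothetical embedding of the search query

# Nearest-neighbour search: the stored item most similar to the query.
best = max(store, key=lambda word: cosine_similarity(query, store[word]))
print(best)
```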

**Q10. What’s the difference between Model Parallelism & Data Parallelism?**  
➡ Model parallelism splits the model itself across devices; data parallelism keeps a full model copy on each device and splits the training data.
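
The data-parallel half can be simulated in plain Python. In this sketch two "workers" each compute a gradient on their own equal-sized shard, and averaging those gradients gives the same result as one pass over the full batch. The model, data, and worker setup are all made up; real training would do the averaging with an all-reduce across GPUs.

```python
def gradient(w, batch):
    """Gradient of mean squared error for the model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

w = 0.5
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

# Each simulated worker holds the same w and an equal-sized shard of the data.
shards = [data[:2], data[2:]]
worker_grads = [gradient(w, shard) for shard in shards]

# Averaging the per-worker gradients plays the role of the all-reduce step.
averaged = sum(worker_grads) / len(worker_grads)
full_batch = gradient(w, data)
print(averaged, full_batch)
```

The equality relies on the shards being the same size; with unequal shards the average would need to be weighted by shard size.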

---

✅ With this structure, you can revise **in 1–2 hours before an interview** and still confidently explain everything.
