# 📘 Day-3 Notes: Generative AI – LLMs, LangChain & Fine-tuning

---

## 🔹 Large Language Models (LLMs)

**Definition:**

* LLM = Large Language Model → a neural network trained on massive text datasets to predict the next token.
* Powerful for tasks like translation, summarization, Q&A, reasoning, coding, etc.
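
To make "predicts next tokens" concrete, here is a minimal sketch. It assumes a hypothetical four-word vocabulary and made-up scores (logits); a real LLM scores tens of thousands of tokens with a neural network. The scores are turned into probabilities with softmax and the most likely token is picked greedily.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and logits for the context "The cat sat on the"
vocab = ["mat", "dog", "moon", "the"]
logits = [3.2, 0.5, 1.1, -1.0]  # made-up numbers for illustration only

probs = softmax(logits)
next_token = vocab[probs.index(max(probs))]  # greedy decoding: take the argmax
print(next_token)
```

Greedy decoding is only one option; sampling from `probs` instead gives more varied text.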

### ✅ Key Steps in LLM Development

1. **Data Preprocessing & Curation**
   * Clean text → remove duplicates, noise, sensitive data
   * Sources: Internet, Wikipedia, books, research papers, Hugging Face datasets

2. **Model Architecture**
   * Transformers (encoder-only, decoder-only, or encoder-decoder/seq2seq) underpin modern LLMs; RNNs and CNNs are earlier sequence-modeling architectures
   * Residual connections, normalization, activation functions (ReLU, GELU, Swish), positional embeddings

3. **Training**
   * Requires **large GPU/TPU infra**
   * Training FLOPs grow roughly in proportion to parameter count × training tokens (not exponentially)
   * Avoid **overfitting** (model too large for the data, or trained too long) and **underfitting** (model too small, or trained too little)

4. **Prompt Engineering**
   * Zero-shot, few-shot learning, chain-of-thought prompting
   * Structured output (JSON, schemas, tools)
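
As a rough illustration of the training-cost point in step 3, the sketch below uses the widely quoted rule of thumb that training a dense Transformer costs about 6 FLOPs per parameter per token. This is an approximation, not an exact figure, and the model sizes and token count are hypothetical.

```python
def training_flops(num_params: float, num_tokens: float) -> float:
    """Approximate training compute for a dense Transformer.

    Rule of thumb: about 6 FLOPs per parameter per token
    (roughly 2 for the forward pass and 4 for the backward pass).
    """
    return 6.0 * num_params * num_tokens

# Hypothetical model sizes, all trained on the same 1 trillion tokens
tokens = 1e12
for params in (1e9, 7e9, 70e9):
    print(f"{params:.0e} params -> {training_flops(params, tokens):.1e} FLOPs")
```

Under this approximation, doubling the parameter count at a fixed token budget doubles the compute.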

---

## 🔹 Fine-tuning LLMs

**Definition:** Adapting a pre-trained base model to a specific task/domain.

**Steps:**
1. Select base model (GPT, LLaMA, T5, etc.)
2. Choose the training setup (which layers to update or freeze, learning rate, optimizer)
3. Train on task-specific dataset
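
The three steps can be sketched with a toy stand-in. A one-feature linear model plays the role of the pre-trained base model, and all of its values are hypothetical. A real fine-tune would use a deep-learning framework, but the pattern is the same: start from existing weights, pick a small learning rate, and continue gradient descent on task data.

```python
# Step 1: "pre-trained" parameters of a toy model y = w*x + b (hypothetical values)
w, b = 2.0, 0.0

# Small task-specific dataset, generated from y = 2.5*x + 1 for illustration
task_data = [(x, 2.5 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0)]

def mse(w, b, data):
    """Mean squared error of the model on the dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)

loss_before = mse(w, b, task_data)

# Step 2: training setup; a small learning rate only nudges the existing weights
learning_rate = 0.05

# Step 3: train on the task data with plain gradient descent
for epoch in range(200):
    grad_w = sum(2 * (w * x + b - y) * x for x, y in task_data) / len(task_data)
    grad_b = sum(2 * (w * x + b - y) for x, y in task_data) / len(task_data)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

loss_after = mse(w, b, task_data)
print(f"loss before: {loss_before:.3f}, after: {loss_after:.5f}")
```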

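**Open-source / open-weight models that can be fine-tuned:**
* Google T5 (note: PaLM is proprietary; its weights are not public)
* Meta LLaMA (most used)

⚠️ Proprietary models (GPT-3.5, GPT-4, Claude) do not publish their weights, so they **cannot be fine-tuned locally**; fine-tuning is possible only through the vendor's hosted service where one is offered (for example, OpenAI offers fine-tuning for GPT-3.5 Turbo).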

---

## 🔹 LangChain & Ecosystem

**LangChain = Framework for building LLM-powered apps**

### 🌐 Core Components
* **LangChain-Core** → Base abstractions for LLMs, prompts, memory
* **LangChain** → Chains, agents, retrieval pipelines
* **LangChain-Community** → 3rd party integrations
* **LangGraph** → Orchestration layer (build workflows, agents, persistence)
* **LangSmith** → Debugging, evaluation, monitoring

### 🏗️ LangChain Concepts
* **Chat Models** → process messages (input: user msg, output: AI msg)
* **Memory** → save conversation history
* **Tools** → external APIs, databases, functions
* **Vector Stores** → store embeddings for retrieval
* **Retrieval Augmented Generation (RAG)** → combine LLMs with external knowledge
* **Streaming** → surface results in real-time
* **Prompt Templates** → reusable prompt structures
* **Output Parsers** → clean structured outputs (JSON, dicts)
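
To show how a prompt template, a chat model, and an output parser fit together, here is a plain-Python sketch. It deliberately does not use LangChain's own classes, and `fake_chat_model` is a hypothetical stand-in that returns a canned reply instead of calling a real model.

```python
import json

# Prompt template: a reusable string with a named placeholder
PROMPT_TEMPLATE = (
    "Extract the product and sentiment from the review below.\n"
    'Reply with JSON only, using the keys "product" and "sentiment".\n\n'
    "Review: {review}"
)

def fake_chat_model(prompt: str) -> str:
    """Hypothetical stand-in for a chat model call; returns a canned reply."""
    return 'Sure! {"product": "headphones", "sentiment": "positive"} Hope that helps.'

def parse_json_output(text: str) -> dict:
    """Output parser: extract the JSON object embedded in the reply text."""
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("no JSON object found in model output")
    return json.loads(text[start : end + 1])

prompt = PROMPT_TEMPLATE.format(review="These headphones sound great.")
result = parse_json_output(fake_chat_model(prompt))
print(result["sentiment"])
```

In LangChain the same three roles are filled by its prompt template, chat model, and output parser components, chained together.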

---

## 🔹 LangSmith

* Playground → test prompts
* Debugging poor LLM runs
* Annotation, evaluation, monitoring
* Prompt optimization

---

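## 🔹 Training LLMs – Techniques

* **Mixed Precision Training** → reduce memory use and compute cost (16-bit + 32-bit floats)
* **Parallelism**:
  * Data Parallelism → replicate the model on every GPU and split the data batches across them
  * Model (Tensor) Parallelism → split the weights within each layer across GPUs
  * Pipeline Parallelism → assign consecutive groups of layers to different GPUs
  * **3D Parallelism** = combination of all three

* **ZeRO Optimizer** → reduces memory redundancy by partitioning optimizer states, gradients, and parameters across data-parallel workers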
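
Data parallelism can be sketched on a single machine. This toy example assumes a one-weight model `y = w * x` and two hypothetical workers: each worker computes a gradient on its own shard, and averaging those gradients (the job of an all-reduce on a real cluster) reproduces the full-batch gradient.

```python
def gradient(w, batch):
    """Gradient of mean squared error for the model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

w = 0.5  # current weight, replicated on every worker
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

# Data parallelism: each of 2 hypothetical workers gets an equal-sized shard
shards = [data[0:2], data[2:4]]
local_grads = [gradient(w, shard) for shard in shards]

# "All-reduce" step: average the local gradients so every replica
# applies the same update
avg_grad = sum(local_grads) / len(local_grads)

full_grad = gradient(w, data)  # single-device reference
print(avg_grad, full_grad)
```

The equality holds because the shards are the same size; with uneven shards the average would need to be weighted by shard size.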

---

## 🔹 Key Interview Questions (with short answers)

**Q1. What is fine-tuning in LLMs?**  
➡ Adjusting parameters of a pre-trained model for a specific task using smaller task-specific datasets.

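**Q2. Which models can be fine-tuned?**  
➡ Open-source / open-weight: LLaMA, T5, BERT, Dolly, CTRL (PaLM is proprietary). Proprietary models (GPT-3.5, GPT-4, Claude) → ❌ no local fine-tuning because the weights are not public; only through the vendor's hosted service where one is offered.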

**Q3. Why do we need data curation before training LLMs?**  
➡ To remove duplicates, noise, and sensitive data → ensures model quality and fairness.
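
A minimal curation sketch, assuming that "noise" means very short fragments and "sensitive data" means email addresses. Real pipelines use far more elaborate filters and near-duplicate detection; the function names here are illustrative.

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def clean(doc: str) -> str:
    """Mask email addresses and normalize whitespace."""
    doc = EMAIL_RE.sub("<EMAIL>", doc)
    return re.sub(r"\s+", " ", doc).strip()

def curate(docs, min_words=3):
    """Clean each document, drop very short ones, and remove duplicates
    (compared case-insensitively after cleaning)."""
    seen, kept = set(), []
    for doc in docs:
        text = clean(doc)
        key = text.lower()
        if len(text.split()) < min_words or key in seen:
            continue
        seen.add(key)
        kept.append(text)
    return kept

raw = [
    "LLMs predict   the next token.",
    "llms predict the next token.",
    "Contact me at jane@example.com for the dataset.",
    "ok",
]
print(curate(raw))
```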

**Q4. Difference between Zero-shot and Few-shot prompting?**  
➡ Zero-shot: No examples, only instruction.  
➡ Few-shot: Instruction + a few examples → improves accuracy.
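
The difference is easiest to see in the prompt text itself. The helper below is a hypothetical sketch (the `Text:` / `Label:` layout is just one common convention): with no examples it produces a zero-shot prompt, and with examples it produces a few-shot prompt.

```python
def build_prompt(instruction, query, examples=()):
    """Zero-shot when `examples` is empty; few-shot otherwise."""
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Text: {text}", f"Label: {label}", ""]
    lines += [f"Text: {query}", "Label:"]
    return "\n".join(lines)

instruction = "Classify the sentiment of the text as positive or negative."
examples = [("I loved this film.", "positive"), ("Terrible service.", "negative")]

zero_shot = build_prompt(instruction, "The battery died in a day.")
few_shot = build_prompt(instruction, "The battery died in a day.", examples)
print(few_shot)
```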

**Q5. What is Retrieval Augmented Generation (RAG)?**  
➡ Technique combining LLMs with external knowledge bases to improve factual correctness.
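
A toy sketch of the retrieve-then-generate flow. Word overlap stands in for embedding similarity, the knowledge base is three sentences taken from these notes, and the final call to an LLM is left out, so this only shows how retrieved context ends up in the prompt.

```python
def tokenize(text):
    """Lowercase, strip basic punctuation, and split into a set of words."""
    return set(text.lower().replace("?", "").replace(".", "").split())

def retrieve(query, documents, k=1):
    """Rank documents by word overlap with the query
    (toy stand-in for embedding-based search)."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]

def build_rag_prompt(query, documents, k=1):
    """Put the top-k retrieved documents into the prompt as context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

knowledge_base = [
    "LangSmith is used for debugging and evaluating LLM runs.",
    "ZeRO reduces memory redundancy during training.",
    "LangGraph is an orchestration layer for agent workflows.",
]
prompt = build_rag_prompt("What does ZeRO reduce during training?", knowledge_base)
print(prompt)
```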

**Q6. What is LangChain used for?**  
➡ Building LLM-powered apps (chatbots, RAG apps, agents) with modular components (memory, tools, prompts).

**Q7. How do you handle hallucinations in LLMs?**  
➡ Use RAG, structured prompting, fine-tuning with domain-specific data, evaluation via LangSmith.

**Q8. Why are LLMs called "large"?**  
➡ Because of huge **parameters (billions+), dataset size, and compute resources** required.

**Q9. What’s the role of embeddings in LLM apps?**  
➡ Convert text/images into vectors → used for similarity search, retrieval, clustering.
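
Similarity search over embeddings usually comes down to comparing vectors with cosine similarity. The sketch below uses hand-made 3-dimensional vectors as hypothetical embeddings; a real embedding model would produce hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, invented for illustration
embeddings = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}

query = embeddings["cat"]
scores = {
    word: cosine_similarity(query, vec)
    for word, vec in embeddings.items()
    if word != "cat"
}
best = max(scores, key=scores.get)  # nearest neighbor of "cat"
print(best)
```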

**Q10. What’s the difference between Model Parallelism & Data Parallelism?**  
➡ Model parallelism splits the model itself across devices; data parallelism replicates the model and splits the training data.

---

✅ With this structure, you can revise **in 1–2 hours before an interview** and still confidently explain everything.
