# Generative AI Notes Handbook
---
This notebook contains structured notes from **Day 1 and Day 2** of the Generative AI class.

# 📘 Day 1 – Introduction to Generative AI

## 1. What is Generative AI?
- A branch of AI focused on **creating new content** (text, images, audio, video, code).
- Works by learning the patterns in its training data, then generating new outputs that follow those patterns.

Examples:
- ChatGPT (text)
- DALL·E, Midjourney (images)
- Synthesia (video)
- Jukebox (music)

---

## 2. Types of AI
- **Narrow AI:** Specialized for a single task (e.g., Alexa, Siri).
- **General AI:** Human-like intelligence across tasks (still at the research stage).
- **Super AI:** Intelligence beyond human level (theoretical).

---

## 3. Machine Learning Basics
- **Supervised Learning:** Uses labeled data (input → output).
- **Unsupervised Learning:** Finds patterns without labels.
- **Reinforcement Learning:** Learns by trial and error, guided by rewards.
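
To make the first two categories concrete, here is a minimal sketch in plain Python. The function names, the toy data, and the choice of a nearest-neighbour rule and a two-cluster split are illustrative assumptions, not methods named in the class notes. The supervised function needs `(feature, label)` pairs; the unsupervised one receives only raw numbers and finds the groups itself.

```python
def nearest_neighbor_predict(labeled_points, x):
    """Supervised: return the label of the training point closest to x."""
    _, label = min(labeled_points, key=lambda pair: abs(pair[0] - x))
    return label


def two_means(points, iterations=10):
    """Unsupervised: split 1-D points into two groups without any labels.

    Assumes the points contain at least two distinct values.
    """
    low_center, high_center = min(points), max(points)
    for _ in range(iterations):
        # Assign every point to the closer of the two centers.
        low_group = [p for p in points if abs(p - low_center) <= abs(p - high_center)]
        high_group = [p for p in points if abs(p - low_center) > abs(p - high_center)]
        # Move each center to the mean of its group.
        low_center = sum(low_group) / len(low_group)
        high_center = sum(high_group) / len(high_group)
    return sorted(low_group), sorted(high_group)


# Labeled data: each input comes with the desired output.
labeled = [(1.0, "small"), (1.2, "small"), (8.0, "large"), (9.1, "large")]
print(nearest_neighbor_predict(labeled, 7.5))  # large

# Unlabeled data: only inputs, the structure has to be discovered.
unlabeled = [1.0, 8.0, 1.2, 9.1, 0.8, 8.7]
print(two_means(unlabeled))  # ([0.8, 1.0, 1.2], [8.0, 8.7, 9.1])
```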

---

## 4. Deep Learning
- A subset of ML that uses **neural networks** (ANN, CNN, RNN, Transformers).
- The core technology behind **Generative AI**.

---

## 5. Difference: Traditional AI vs Generative AI
| Traditional AI | Generative AI |
|----------------|---------------|
| Rule-based or trained to classify/predict | Trained to model the patterns in data |
| Predicts a label or value for an input | Creates new data |
| Example: Spam filter | Example: ChatGPT |

---

## 6. Applications of Generative AI
- Text: ChatGPT, Gemini (formerly Bard), Claude
- Image: Midjourney, Stable Diffusion
- Video: Synthesia
- Audio: Jukebox, ElevenLabs
- Code: GitHub Copilot

---

## 7. Why Generative AI is Important
- Enhances productivity
- Saves cost & time
- Automates creativity
- Enables personalized experiences

---


# 📘 Day 2 – Large Language Models (LLMs)

## 1. What is a Large Language Model (LLM)?
- An AI model trained on massive text datasets.
- Well-known examples: GPT-3.5, GPT-4.
- "Large" refers to both the model size (billions of parameters) and the training data (billions of tokens).

---

## 2. How LLMs Work
### 🔹 Tokenization
- Text → tokens (words, subwords, or characters)
- Methods: BPE (Byte Pair Encoding), WordPiece, SentencePiece
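
As an illustration of how BPE builds tokens, the sketch below learns merges on a toy corpus: it starts from single characters and repeatedly fuses the most frequent adjacent pair. This is a simplified teaching version. The function names and corpus are assumptions made for this example, and production tokenizers add details (byte-level handling, word-boundary markers, special tokens) that are omitted here.

```python
from collections import Counter


def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return max(pairs, key=pairs.get) if pairs else None


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(corpus, num_merges):
    """Return the learned merges and the corpus vocabulary after merging."""
    # Start with each word split into single characters.
    words = dict(Counter(tuple(word) for word in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pair = most_frequent_pair(words)
        if pair is None:
            break
        words = merge_pair(words, pair)
        merges.append(pair)
    return merges, words


merges, vocab = learn_bpe("low low low lower lowest", num_merges=2)
print(merges)  # the two merges learned, in order
print(vocab)   # "low" is now a single token shared by all three words
```

After two merges the frequent stem `low` has become one token, while the rarer endings `er` and `est` are still split into characters. This is why common words usually cost one token and rare words cost several.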

### 🔹 Embeddings
- Tokens → vectors of numbers that capture meaning
- Used for similarity search in vector databases and in RAG (Retrieval-Augmented Generation)
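
The retrieval step behind a vector database can be sketched as a cosine-similarity search. The three-dimensional vectors below are hand-written stand-ins chosen for this example; real embeddings come from a trained model and typically have hundreds of dimensions or more. The names `documents` and `retrieve` are likewise hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written "embeddings" standing in for model output.
documents = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "password reset": [0.0, 0.2, 0.9],
}


def retrieve(query_vector, store, k=1):
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(
        store,
        key=lambda name: cosine_similarity(query_vector, store[name]),
        reverse=True,
    )
    return ranked[:k]


# A query vector that points in roughly the same direction as "refund policy".
query = [0.8, 0.2, 0.1]
print(retrieve(query, documents, k=1))  # ['refund policy']
```

In a RAG pipeline, the retrieved text would then be placed into the prompt so the model can answer using it.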

### 🔹 Transformers
- Based on the **attention mechanism**
- Encoder-only (BERT), Decoder-only (GPT), Encoder-decoder (T5)
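
The core of the attention mechanism is scaled dot-product attention: each query scores every key, the scores are turned into weights with softmax, and the output is the weighted average of the values. The sketch below shows that computation in plain Python. It leaves out the learned projection matrices, multiple heads, and masking that real Transformers add, and the toy vectors are assumptions chosen to make the effect visible.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)  # subtracted for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        output = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(output)
        all_weights.append(weights)
    return outputs, all_weights


keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
queries = [[5.0, 0.0]]  # aligned with the first key

outputs, weights = attention(queries, keys, values)
print(weights[0])  # most of the weight goes to the first key
print(outputs[0])  # so the output is close to the first value
```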

### 🔹 Training Process
- Pre-training: Learns general language patterns from broad data (very costly)
- Fine-tuning: Adapts the pre-trained model to a specific task (cheaper)

---

## 3. Key Concepts
- Prompt Design
- Few-shot & Zero-shot learning
- Transfer Learning
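
The difference between zero-shot and few-shot prompting is easiest to see in the prompt text itself. The helper below is a hypothetical sketch: the `Input:`/`Output:` layout and the function name `build_prompt` are one common convention picked for this example, not a format required by any provider.

```python
def build_prompt(task, query, examples=None):
    """Build a prompt; with examples it is few-shot, without it is zero-shot."""
    blocks = [task]
    for text, label in examples or []:
        blocks.append(f"Input: {text}\nOutput: {label}")
    # The final block is left open for the model to complete.
    blocks.append(f"Input: {query}\nOutput:")
    return "\n\n".join(blocks)


task = "Classify the sentiment of the input as positive or negative."
query = "The battery died after a day."

zero_shot = build_prompt(task, query)
few_shot = build_prompt(
    task,
    query,
    examples=[
        ("I love this phone.", "positive"),
        ("The screen cracked in a week.", "negative"),
    ],
)

print(zero_shot)
print("-----")
print(few_shot)
```

Both prompts ask the same question. The few-shot version simply shows the model two worked examples first, which often helps it match the expected output format.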

---

## 4. Challenges (Without API)
- Huge infra cost
- Slow inference
- Limited scope for proofs of concept (POCs)
- Usually better to use hosted APIs (OpenAI, Google, Meta, Anthropic)
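
A back-of-envelope calculation shows where the infrastructure cost comes from. The sketch below estimates only the memory needed to hold the model weights, assuming 2 bytes per parameter (16-bit precision). The model sizes are hypothetical round numbers, and real deployments need additional memory for activations and caching.

```python
def weight_memory_gb(num_parameters, bytes_per_parameter=2):
    """Memory needed just to store the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Hypothetical model sizes, in parameters.
for name, params in [("7B model", 7e9), ("70B model", 70e9)]:
    print(f"{name}: about {weight_memory_gb(params):.0f} GB of weights")
```

Under these assumptions a 7-billion-parameter model needs about 14 GB for its weights alone, and a 70-billion-parameter model about 140 GB, which is more than a single typical GPU holds.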

---

## 5. Ecosystem
- Frameworks and vector stores: LangChain, Pinecone, FAISS, Weaviate
- Models: LLaMA 2, Claude, GPT-4, Gemini

---

## 6. Applications
- Chatbots
- Summarization
- Code generation (GitHub Copilot)
- Education, Legal, Healthcare

---

# 🎯 Interview Prep – Q&A

**Q1. What is an LLM?** → An AI model trained on massive text datasets that generates human-like text.  
**Q2. Why "Large"?** → Both the parameter count and the number of training tokens are huge (billions).  
**Q3. Tokenization?** → Splits text into tokens (BPE, WordPiece, SentencePiece).  
**Q4. Embeddings?** → Vector representations of text.  
**Q5. Attention?** → Lets the model weigh the most relevant words in the context when processing each token.  
**Q6. Pre-training vs Fine-tuning?** → Pre-training = general data, Fine-tuning = task-specific data.  
**Q7. Prompt Engineering?** → Designing inputs to get the best output.  
**Q8. Few-shot vs Zero-shot?** → Few-shot = a few examples in the prompt, Zero-shot = no examples.  
**Q9. Why use APIs?** → Training and hosting your own model costs too much.  
**Q10. Real-world example?** → Customer-support chatbots, GitHub Copilot.  
