# LLaMA (Large Language Model Meta AI)

## 1. What is LLaMA?
- **LLaMA** = Large Language Model Meta AI
- Goal: provide **smaller, efficient, openly available models** that perform comparably to much larger models such as GPT-3

---

## 2. Versions
### Pre-trained base models (→ notable fine-tuned derivatives):
- **LLaMA 1** (2023.02): 7B, 13B, 33B, 65B\
  → **Alpaca** 7B (Stanford; instruction-tuned on 52K instruction-following examples generated with OpenAI's text-davinci-003)\
  → **Vicuna** 13B, 33B (LMSYS; fine-tuned for dialogue on user-shared ChatGPT conversations)
- **LLaMA 2** (2023.07): 7B, 13B, 70B\
  → **LLaMA-2-Chat** (instruction-tuned for dialogue)
- **LLaMA 3** (2024.04): 8B, 70B
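- **LLaMA 4** (2025.04): mixture-of-experts models, listed as active + total parameters: Scout (17B + 109B), Maverick (17B + 400B), Behemoth (288B + ≈2T, announced as a preview)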



---

## 3. Architecture
- **Decoder-only Transformer** (like GPT)
- Training objective: **Autoregressive LM** (next-token prediction)
- Key improvements:
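  - **SwiGLU** activation in the feed-forward layers (a gated unit; tends to give better quality than ReLU/GELU at similar compute)
  - **RMSNorm** instead of LayerNorm, applied as pre-normalization (cheaper to compute, more stable training)
  - **RoPE (Rotary Position Embedding)**: rotates queries and keys so that attention scores depend on relative position
  - **Efficient tokenizer**: SentencePiece BPE with a 32k vocabulary in LLaMA 1/2; LLaMA 3 moved to a ≈128k vocabulary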
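
The three layer-level components listed above can be illustrated with a minimal NumPy sketch. It is not Meta's implementation: the function names, the `eps` and `base` defaults, and the interleaved-pair layout used for RoPE are assumptions chosen for readability.

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: divide by the root-mean-square of the features, then scale.
    Unlike LayerNorm, there is no mean subtraction and no bias term."""
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def silu(x):
    """SiLU (Swish): x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward block: (SiLU(x @ W_gate) * (x @ W_up)) @ W_down."""
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

def rope(x, base=10000.0):
    """Apply rotary position embedding to x of shape (seq_len, dim), dim even.
    Each feature pair (2i, 2i+1) at position p is rotated by p * base**(-2i/dim)."""
    seq_len, dim = x.shape
    inv_freq = base ** (-np.arange(0, dim, 2) / dim)           # (dim/2,)
    angles = np.arange(seq_len)[:, None] * inv_freq[None, :]   # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

# Toy usage: 4 tokens, model dim 8, hidden dim 16 (all sizes are illustrative).
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
h = rms_norm(x, weight=np.ones(8))
y = swiglu_ffn(
    h,
    rng.standard_normal((8, 16)),   # W_gate
    rng.standard_normal((8, 16)),   # W_up
    rng.standard_normal((16, 8)),   # W_down
)
print(y.shape)  # (4, 8)
```

Because `rope` applies a pure rotation per position, the dot product between a rotated query at position m and a rotated key at position n depends only on the offset between m and n, which is what "relative position encoding" refers to.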

---

## 4. Training
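- Dataset (LLaMA 1): ≈ **1.4T tokens** from publicly available sources (CommonCrawl, C4, GitHub, Wikipedia, books, arXiv, StackExchange)
- Emphasis on **training smaller models on more, well-filtered data**: GPT-3’s 45TB figure refers to raw CommonCrawl text before filtering, and GPT-3 itself was trained on roughly 300B tokens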
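- Techniques: distributed training (model/sequence parallelism, ZeRO-style sharding), activation checkpointing, memory-efficient attention, mixed precision (16-bit compute with FP32 master weights)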
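
To put the token budget in perspective, here is a rough back-of-the-envelope sketch. It assumes the common rule of thumb of about 6 FLOPs per parameter per training token, and it takes GPT-3's 300B training tokens from the GPT-3 paper; neither figure comes from these notes, and the helper names are hypothetical.

```python
def training_flops(n_params, n_tokens):
    """Rule-of-thumb training compute: about 6 FLOPs per parameter per token."""
    return 6.0 * n_params * n_tokens

def tokens_per_param(n_params, n_tokens):
    """Number of training tokens per model parameter."""
    return n_tokens / n_params

# (parameters, training tokens); the GPT-3 token count is an outside assumption.
runs = {
    "LLaMA-65B, 1.4T tokens": (65e9, 1.4e12),
    "GPT-3 175B, 300B tokens": (175e9, 300e9),
}

for name, (n_params, n_tokens) in runs.items():
    flops = training_flops(n_params, n_tokens)
    ratio = tokens_per_param(n_params, n_tokens)
    print(f"{name}: ~{flops:.2e} FLOPs, {ratio:.1f} tokens per parameter")
```

Under these assumptions the 65B model sees roughly 20 tokens per parameter while GPT-3 sees fewer than 2, which is one way to read the "smaller model, more data" strategy.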

---

## 5. Why LLaMA Matters
- **Efficiency**: LLaMA-13B matches or beats GPT-3 (175B) on many benchmarks
- **Accessibility**: weights are downloadable under Meta's licenses (LLaMA 1: non-commercial research only; LLaMA 2 and later: community license that permits most commercial use)
- **Ecosystem**: inspired many open-source derivatives (Alpaca, Vicuna, WizardLM, etc.)
- **Runs locally**: 7B fits on a single 24 GB GPU (e.g., RTX 3090/4090) in FP16; 13B fits on the same card with 8-bit or 4-bit quantization
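
The single-GPU point can be sanity-checked with simple arithmetic. The sketch below estimates memory for the weights only, using nominal parameter counts (7e9 and 13e9, which are approximations); activations and the KV cache need additional memory, so real usage will be higher. `weight_memory_gib` is a hypothetical helper.

```python
GIB = 2 ** 30  # bytes per GiB

def weight_memory_gib(n_params, bits_per_param):
    """Memory needed to hold the weights alone, in GiB."""
    return n_params * bits_per_param / 8 / GIB

for label, n_params in [("7B", 7e9), ("13B", 13e9)]:
    for bits in (16, 8, 4):
        mem = weight_memory_gib(n_params, bits)
        fits = "fits" if mem < 24 else "does not fit"
        print(f"{label} @ {bits}-bit: {mem:.1f} GiB ({fits} in 24 GiB, weights only)")
```

By this estimate the 13B weights at 16-bit precision already exceed 24 GiB, so running 13B on a 24 GB card would rely on 8-bit or 4-bit quantization.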

---

## 6. Pros and Cons
✅ Pros
- Smaller but powerful
- Open access → strong open-source ecosystem
- Easy to fine-tune (LoRA, QLoRA)
- Efficient inference
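
The LoRA idea mentioned above can be sketched in a few lines of NumPy: the pre-trained weight stays frozen and only a low-rank update is trained. This is an illustrative sketch, not the API of any fine-tuning library; the names `lora_forward` and `lora_param_count` and the toy sizes are assumptions.

```python
import numpy as np

def lora_forward(x, w_frozen, lora_a, lora_b, alpha):
    """y = x W^T + (alpha / r) * x A^T B^T, with W frozen and only A, B trainable.
    Shapes: W (d_out, d_in), A (r, d_in), B (d_out, r)."""
    r = lora_a.shape[0]
    return x @ w_frozen.T + (alpha / r) * (x @ lora_a.T) @ lora_b.T

def lora_param_count(d_in, d_out, r):
    """Trainable parameters added by one rank-r adapter."""
    return r * (d_in + d_out)

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 4
w = rng.standard_normal((d_out, d_in))       # frozen pre-trained weight
a = rng.standard_normal((r, d_in)) * 0.01    # A: small random init
b = np.zeros((d_out, r))                     # B: zero init, so the update starts at 0
x = rng.standard_normal((2, d_in))

y = lora_forward(x, w, a, b, alpha=8)
print(np.allclose(y, x @ w.T))  # True: with B = 0 the adapter changes nothing yet

# For a 4096 x 4096 projection with r = 8, the adapter is a small fraction of the matrix.
print(lora_param_count(4096, 4096, 8), 4096 * 4096)  # 65536 16777216
```

QLoRA applies the same kind of adapter on top of a base model whose frozen weights are stored in 4-bit precision, which lowers the memory needed for fine-tuning further.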

❌ Cons
- Weaker performance in Chinese & low-resource languages
- Still behind GPT-4 in reasoning and multi-modal tasks
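- All versions ship under Meta's own licenses (LLaMA 1 required an access request; later versions require accepting a community license), so they are open-weight rather than open-source in the strict sense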

---

## 7. Key Difference from GPT
- **Same basic architecture** (Decoder-only Transformer, autoregressive)
- **Different philosophy**:
  - GPT: closed, very large-scale, commercial API
  - LLaMA: smaller, efficient, open for research/innovation
- GPT = “product-first”; LLaMA = “research foundation”

---

## 8. Summary
- LLaMA = Meta’s efficient, open alternative to GPT
- Architecture: Decoder-only Transformer
- Impact: boosted open-source NLP, made LLMs more accessible
- → **“Small but mighty open LLM from Meta”**