```{contents}
```
## Chunking and Chunk Overlap

**Chunking** is the process of splitting large documents into smaller, manageable pieces (**chunks**) before embedding, indexing, and retrieval.
**Chunk overlap** ensures continuity of meaning across chunk boundaries by sharing content between adjacent chunks.

Both are critical design choices in **RAG systems, semantic search, and document QA pipelines**.

---

### Core Intuition

LLMs and embedding models have context limits and perform best on focused inputs.

> **Large document → meaningful pieces → better retrieval → better generation**

Chunk overlap prevents important information from being cut in half.

---

### Why Chunking Is Necessary

| Problem            | Without Chunking | With Chunking |
| ------------------ | ---------------- | ------------- |
| Context limits     | Exceeded         | Respected     |
| Retrieval accuracy | Low              | High          |
| Embedding quality  | Diluted          | Focused       |
| Generation quality | Inconsistent     | Stable        |

---

### Basic Workflow

```
Raw Document
   ↓
Text Cleaning
   ↓
Chunking + Overlap
   ↓
Embedding
   ↓
Vector Database
   ↓
Retrieval → LLM
```
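
As a rough illustration of the workflow above, the following toy pipeline runs every stage in plain Python. It is a sketch under loud assumptions: the helper names (`clean`, `chunk`, `embed`, `cosine`) are hypothetical, the "embedding" is only a bag-of-words count rather than a real model, and a Python list stands in for the vector database.

```python
import math
import re
from collections import Counter

def clean(text):
    """Collapse runs of whitespace (stand-in for real text cleaning)."""
    return re.sub(r"\s+", " ", text).strip()

def chunk(words, size, overlap):
    """Fixed-size word windows that share `overlap` words with their neighbour."""
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]

def embed(text):
    """Toy bag-of-words vector; a real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

document = """Chunking splits documents   into pieces.
Overlap repeats text at chunk boundaries.
Embeddings map each chunk to a vector for retrieval."""

words = clean(document).split()
chunks = chunk(words, size=8, overlap=2)
index = [(c, embed(c)) for c in chunks]  # stand-in for a vector database

query = embed("what does overlap repeat?")
best_chunk, _ = max(index, key=lambda item: cosine(query, item[1]))
print(best_chunk)  # in a real system this text is passed to the LLM as context
```

Swapping `embed` for a real embedding model and `index` for a vector store would leave the overall shape of the pipeline unchanged.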

---

### Chunking Strategies

#### 1. Fixed-Size Chunking

Split the text every N tokens or characters, regardless of content.

```
Chunk size: 500 tokens
Overlap: 100 tokens
```
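
A minimal sketch of fixed-size chunking with the settings above. The function name `fixed_size_chunks` is hypothetical, and the placeholder token list stands in for the output of a real tokenizer.

```python
def fixed_size_chunks(tokens, chunk_size=500, overlap=100):
    """Split a token list into windows of `chunk_size` that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks

# Placeholder tokens t0..t1199; a real pipeline would tokenize actual text.
tokens = [f"t{i}" for i in range(1200)]
chunks = fixed_size_chunks(tokens, chunk_size=500, overlap=100)
print([len(c) for c in chunks])  # [500, 500, 400]
```

Each window starts `chunk_size - overlap` tokens after the previous one, so the last 100 tokens of a chunk reappear as the first 100 of the next.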

#### 2. Semantic Chunking

Split on logical boundaries:

* Paragraphs
* Headings
* Sentences

Often combined with a maximum token limit, so that an unusually long paragraph or section is still split.
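
One way to sketch this is to split on blank lines and pack whole paragraphs into a chunk until a size limit is reached. The function name `paragraph_chunks` and the `max_words` parameter are hypothetical, and word counts stand in for token counts.

```python
import re

def paragraph_chunks(text, max_words=200):
    """Pack whole paragraphs into chunks of at most `max_words` words.

    A single paragraph longer than `max_words` is kept whole here; a fuller
    implementation would fall back to sentence or fixed-size splitting.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current, current_len = [], [], 0
    for para in paragraphs:
        n = len(para.split())
        if current and current_len + n > max_words:
            # Adding this paragraph would exceed the limit: close the chunk.
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(para)
        current_len += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks

sample = "Intro paragraph one.\n\nSecond paragraph here.\n\nThird and final paragraph."
for piece in paragraph_chunks(sample, max_words=6):
    print(repr(piece))
```

With `max_words=6`, the first two paragraphs (three words each) share a chunk and the third starts a new one, so no paragraph is cut in the middle.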

---

### Chunk Overlap Explained

Overlap copies a portion of text from the end of one chunk to the start of the next.

```
Chunk 1: [ A B C D E F ]
Chunk 2:           [ D E F G H I ]
```

Here, **D E F** is the overlap.

**Purpose:** preserve context continuity.
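
The diagram above can be reproduced with two list slices. This is only an illustration: single letters stand in for tokens, and the variable names are arbitrary.

```python
tokens = list("ABCDEFGHI")
chunk_size, overlap = 6, 3
step = chunk_size - overlap  # each new chunk starts 3 tokens later

chunk_1 = tokens[0:chunk_size]
chunk_2 = tokens[step:step + chunk_size]
shared = chunk_1[-overlap:]  # tail of chunk 1, repeated at the head of chunk 2

print(chunk_1)  # ['A', 'B', 'C', 'D', 'E', 'F']
print(chunk_2)  # ['D', 'E', 'F', 'G', 'H', 'I']
print(shared)   # ['D', 'E', 'F']
```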

---

### Choosing Chunk Size & Overlap

| Parameter  | Typical Range  |
| ---------- | -------------- |
| Chunk size | 300–800 tokens |
| Overlap    | 50–200 tokens  |

The right values depend on:

* Model context window
* Document structure
* Retrieval precision requirements
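
To compare candidate settings, it can help to estimate how many chunks a document produces and how much text the overlap duplicates. The sketch below assumes a simple sliding window that advances by `chunk_size - overlap` tokens; the helper `chunk_stats` is hypothetical.

```python
import math

def chunk_stats(n_tokens, chunk_size, overlap):
    """Return (chunk count, total tokens stored) for a sliding window."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if n_tokens <= 0:
        return 0, 0
    if n_tokens <= chunk_size:
        return 1, n_tokens
    step = chunk_size - overlap
    count = math.ceil((n_tokens - chunk_size) / step) + 1
    stored = n_tokens + (count - 1) * overlap  # every boundary repeats `overlap` tokens
    return count, stored

n = 10_000  # hypothetical document length in tokens
for size, overlap in [(300, 50), (500, 100), (800, 200)]:
    count, stored = chunk_stats(n, size, overlap)
    print(f"size={size} overlap={overlap}: {count} chunks, {stored / n:.2f}x tokens stored")
```

Smaller chunks raise the vector count, while larger overlap raises the amount of duplicated text that must be embedded and stored.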

---

### Demonstration (Python Example)

```python
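# Import path for older LangChain releases; newer releases ship the same class
# in the separate langchain_text_splitters package.
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Read the source document; the context manager closes the file afterwards.
with open("document.txt", encoding="utf-8") as f:
    text = f.read()

# chunk_size and chunk_overlap are measured in characters by default
# (length_function=len), not in tokens.
splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=100,
)

chunks = splitter.split_text(text)
print(len(chunks))  # number of chunks produced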
```

---

### Impact on RAG Performance

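| Design         | Effect                                                                             |
| -------------- | ---------------------------------------------------------------------------------- |
| Small chunks   | More precise retrieval, but more vectors to store and less context per chunk       |
| Large chunks   | Fewer vectors, but each embedding covers more topics, so matches are less precise  |
| Proper overlap | Fewer facts split across chunk boundaries, which helps keep answers grounded       |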

---

### Summary

| Concept        | Description                         |
| -------------- | ----------------------------------- |
| Chunking       | Break document into retrieval units |
| Overlap        | Preserve semantic continuity        |
| Primary use    | RAG & semantic search               |
| Quality impact | Shapes retrieval and answer quality |

