## What Are N-Grams?

**N-Grams** are contiguous sequences of *n* items (usually words or characters) extracted from a text. In NLP, they are used to preserve local word order and capture the context immediately around each word.

- **Unigram (n = 1)** → Single words: `["NLP", "is", "fun"]`
- **Bigram (n = 2)** → Pairs of consecutive words: `[("NLP", "is"), ("is", "fun")]`
- **Trigram (n = 3)** → Triplets: `[("NLP", "is", "fun")]`
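
The three cases above can be sketched with one small helper. This is a minimal illustration, assuming simple whitespace tokenization; the function name `ngrams` is hypothetical and not taken from any particular library. Note that this sketch returns unigrams as 1-tuples rather than bare strings.

```python
def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple, in order."""
    # A sequence of length L contains L - n + 1 n-grams (none if n > L).
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Assumption: splitting on whitespace is a good enough tokenizer here.
tokens = "NLP is fun".split()

unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)

print(unigrams)
print(bigrams)
print(trigrams)
```

A sentence with fewer than *n* tokens simply yields an empty list.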

N-Grams are useful in many NLP tasks such as language modeling, text classification, and sentiment analysis, where the order of words can significantly affect meaning.

---

### 🔍 Why Not Just Use Bag of Words?

While **Bag of Words (BoW)** captures word frequency, it completely ignores the **order** of words. This leads to major issues in understanding meaning.

#### Consider:

- **Sentence 1**: `"Food is good"`  
- **Sentence 2**: `"Food is not good"`

Despite their opposite meanings, BoW treats the two sentences as very similar because they share every word except `not`.

#### Their BoW vectors might look like this:

| Sentence   | food | is | good | not |
|------------|------|----|------|-----|
| Sentence 1 |  1   | 1  |  1   |  0  |
| Sentence 2 |  1   | 1  |  1   |  1  |

As you can see, the two vectors are almost identical: they differ only in the `not` column, yet the sentiment is completely opposite.
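
To put a number on "almost identical", the sketch below rebuilds the two vectors and compares them. It assumes lowercasing plus whitespace tokenization and the fixed vocabulary order from the table. Cosine similarity is used here only as one common way to compare vectors; it is an illustrative choice, and the helper names `bow_vector` and `cosine_similarity` are hypothetical.

```python
import math
from collections import Counter

def bow_vector(sentence, vocabulary):
    """Count how often each vocabulary word occurs in the sentence."""
    counts = Counter(sentence.lower().split())
    return [counts[word] for word in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Same column order as the table above.
vocabulary = ["food", "is", "good", "not"]

v1 = bow_vector("Food is good", vocabulary)
v2 = bow_vector("Food is not good", vocabulary)
similarity = cosine_similarity(v1, v2)

print(v1)                    # [1, 1, 1, 0]
print(v2)                    # [1, 1, 1, 1]
print(round(similarity, 3))  # 0.866
```

A score this close to 1 means that a model relying on these features alone has very little signal to separate the two sentences.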

---

## 🧠 N-Grams in Action: "Food is good" vs "Food is not good"

Let’s see how N-Grams help capture **meaning through word combinations**, which Bag of Words misses.

---

### ✅ Input Sentences:

1. `"Food is good"`  
2. `"Food is not good"`

### 💡 Why N-Grams Help

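Splitting each sentence into bigrams gives:
- Sentence 1: `("food", "is")`, `("is", "good")`
- Sentence 2: `("food", "is")`, `("is", "not")`, `("not", "good")`

The bigram `("is", "good")` appears only in Sentence 1, while `("is", "not")` and `("not", "good")` appear only in Sentence 2. Because N-Grams represent these phrases explicitly, the model can distinguish between **"good"** and **"not good"**, even though the single word "good" appears in both sentences.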
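
The comparison can be checked with a few lines of Python. This is a sketch under the same assumptions as before (lowercasing and whitespace tokenization); the helper name `bigram_set` is hypothetical.

```python
def bigram_set(sentence):
    """Return the set of consecutive word pairs in the sentence."""
    tokens = sentence.lower().split()
    # Pair each token with its successor: (t0, t1), (t1, t2), ...
    return set(zip(tokens, tokens[1:]))

b1 = bigram_set("Food is good")
b2 = bigram_set("Food is not good")

shared = b1 & b2      # bigrams present in both sentences
only_in_1 = b1 - b2   # bigrams unique to Sentence 1
only_in_2 = b2 - b1   # bigrams unique to Sentence 2

print(sorted(shared))     # [('food', 'is')]
print(sorted(only_in_1))  # [('is', 'good')]
print(sorted(only_in_2))  # [('is', 'not'), ('not', 'good')]
```

Only one bigram is shared, so the bigram features separate the two sentences far more clearly than the single-word counts do.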

This makes N-Grams especially powerful in tasks like **sentiment analysis**, **intent detection**, and **contextual understanding**.