# Word2Vec and GloVe

**Objective:** Explore two popular word embedding models — Word2Vec and GloVe — and learn how to use them for semantic similarity and analogy tasks.

---
## 🧠 1️⃣ Introduction

Both **Word2Vec** (Google, 2013) and **GloVe** (Stanford, 2014) are methods to learn **dense vector representations** of words.

- **Word2Vec:** Learns word meaning through *prediction* (neural network model).
- **GloVe:** Learns word meaning through *count-based statistics* (co-occurrence matrix factorization).

---
## ⚙️ 2️⃣ Training a Custom Word2Vec Model
We’ll train a simple Word2Vec model using a small text corpus to understand how embeddings are generated.

In [None]:
from gensim.models import Word2Vec

sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "empire"],
    ["a", "man", "is", "strong"],
    ["a", "woman", "is", "wise"],
    ["the", "prince", "is", "the", "son", "of", "the", "king"],
    ["the", "princess", "is", "the", "daughter", "of", "the", "queen"]
]

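# sg=1 selects skip-gram (sg=0 would select CBOW); min_count=1 keeps every word of this tiny corpus.
# seed=42 with workers=1 makes runs more repeatable.
model = Word2Vec(sentences, vector_size=100, window=3, min_count=1, sg=1, seed=42, workers=1)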

print("✅ Word2Vec model trained successfully!")
print("\nVector for 'king':\n", model.wv['king'][:10], "...")
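
To make the `window` and `sg=1` settings above more concrete, here is a minimal sketch of how skip-gram turns a sentence into (center, context) training pairs. The helper `skipgram_pairs` is a hypothetical name used only for illustration; it is not part of gensim, and it leaves out details of real implementations such as negative sampling and subsampling of frequent words.

```python
def skipgram_pairs(sentence, window):
    """Return (center, context) pairs for one tokenized sentence."""
    pairs = []
    for i, center in enumerate(sentence):
        # The context spans up to `window` tokens on each side of the center word.
        start = max(0, i - window)
        end = min(len(sentence), i + window + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((center, sentence[j]))
    return pairs


sentence = ["a", "man", "is", "strong"]
pairs = skipgram_pairs(sentence, window=1)
for center, context in pairs:
    print(center, "->", context)
```

During training, the model adjusts the vector of each center word so that it becomes better at predicting its context words. Words that share many contexts therefore end up with similar vectors.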

---
## 🔍 3️⃣ Exploring Word Similarity and Analogies
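Word2Vec places words that occur in similar contexts close together in vector space, so the cosine similarity between two vectors approximates how **semantically related** the words are. Let’s explore a few examples.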

In [None]:
# Most similar words to 'king'
print(model.wv.most_similar('king'))

# Analogy example: king - man + woman ≈ ?
print("\nAnalogy result (king - man + woman):")
print(model.wv.most_similar(positive=['king', 'woman'], negative=['man'], topn=3))

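✅ With a large corpus, these relationships emerge naturally through context prediction during training. The six-sentence corpus above is far too small for that, so expect essentially arbitrary neighbors and analogy results here; the cell demonstrates the API rather than meaningful semantics.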
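
The arithmetic behind an analogy query can be sketched with plain NumPy. The vectors below are hypothetical, hand-picked 3-dimensional values (not learned embeddings), and `cosine`, `unit`, and `analogy` are illustrative helpers that only roughly mirror what `most_similar` does: combine unit-length vectors, exclude the query words, and rank the rest by cosine similarity.

```python
import numpy as np

# Hypothetical vectors chosen by hand for illustration, not learned from data.
# The dimensions can loosely be read as (royalty, maleness, femaleness).
toy_vectors = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
    "apple": np.array([0.0, 0.2, 0.2]),
}


def unit(v):
    """Scale a vector to length 1."""
    return v / np.linalg.norm(v)


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(unit(u), unit(v)))


def analogy(positive, negative, vectors):
    """Return the word closest to sum(positive) - sum(negative), excluding the inputs."""
    query = sum(unit(vectors[w]) for w in positive) - sum(unit(vectors[w]) for w in negative)
    candidates = [w for w in vectors if w not in positive and w not in negative]
    return max(candidates, key=lambda w: cosine(query, vectors[w]))


print(analogy(["king", "woman"], ["man"], toy_vectors))
```

With these hand-built vectors the "maleness" component cancels out and the "femaleness" component is added, which is the intuition behind `king - man + woman`.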

---
## 📊 4️⃣ Visualizing Word2Vec Embeddings
We can use PCA to project high-dimensional embeddings into 2D for visualization.

In [None]:
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

words = list(model.wv.key_to_index.keys())
X = model.wv[words]

pca = PCA(n_components=2)
result = pca.fit_transform(X)

plt.figure(figsize=(7,5))
plt.scatter(result[:,0], result[:,1])
for i, word in enumerate(words):
    plt.annotate(word, xy=(result[i,0], result[i,1]))
plt.title('Word2Vec Embeddings (PCA Visualization)')
plt.show()
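
For intuition about what the PCA projection does, the sketch below reproduces the core steps with NumPy alone: center the data, take a singular value decomposition, and project onto the first two principal directions. `X_demo` is random stand-in data, assumed here in place of the real `model.wv[words]` matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for an embedding matrix: 20 "words" with 100 dimensions each.
X_demo = rng.normal(size=(20, 100))

# 1. Center every dimension at zero.
X_centered = X_demo - X_demo.mean(axis=0)

# 2. SVD: the rows of Vt are the principal directions, sorted by decreasing variance.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# 3. Project onto the first two directions to get 2D coordinates.
projected = X_centered @ Vt[:2].T

# Fraction of the total variance captured by each component.
explained = (S ** 2) / np.sum(S ** 2)

print(projected.shape)
print("Variance kept by 2 components:", explained[:2].sum())
```

When the two components keep only a small share of the variance, distances in the 2D plot can be misleading. With scikit-learn, the same figure is available as `pca.explained_variance_ratio_` after fitting.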

---
## 💡 5️⃣ Understanding GloVe

**GloVe (Global Vectors for Word Representation)** uses **word co-occurrence statistics** to learn embeddings.

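- It counts how often each pair of words co-occurs within a context window across the whole corpus.
- It then fits dense vectors whose dot products approximate the logarithm of those counts, a weighted least-squares form of matrix factorization.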
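
The counting step can be sketched in a few lines of Python. This is a simplified illustration, assuming raw symmetric counts; `cooccurrence_counts` and `toy_sentences` are hypothetical names, and the reference GloVe implementation additionally down-weights pairs that are farther apart.

```python
from collections import Counter


def cooccurrence_counts(sentences, window):
    """Count how often two words appear within `window` tokens of each other."""
    counts = Counter()
    for sentence in sentences:
        for i, word in enumerate(sentence):
            # Look only to the right, then record the pair in both directions
            # so that the resulting counts are symmetric.
            for j in range(i + 1, min(len(sentence), i + window + 1)):
                counts[(word, sentence[j])] += 1
                counts[(sentence[j], word)] += 1
    return counts


toy_sentences = [
    ["the", "king", "rules", "the", "kingdom"],
    ["the", "queen", "rules", "the", "empire"],
]
counts = cooccurrence_counts(toy_sentences, window=2)

print(counts[("the", "rules")])
print(counts[("king", "queen")])
```

Note that "king" and "queen" never co-occur directly, yet both co-occur with "the" and "rules". Shared neighbors of this kind are what allow the fitted vectors of related words to end up close together.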

Pretrained GloVe models are widely available — e.g., `glove.6B.100d.txt` from Stanford NLP.

Let’s see how to use one (if downloaded).

In [None]:
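import os
from gensim.models import KeyedVectors

glove_path = 'glove.6B.100d.txt'  # pretrained file, downloaded separately
if os.path.exists(glove_path):
    # GloVe text files lack the word2vec header line, hence no_header=True (gensim 4.0+)
    glove_model = KeyedVectors.load_word2vec_format(glove_path, binary=False, no_header=True)
    print(glove_model.most_similar('king'))
else:
    print(f"{glove_path} not found; download it first to run this cell.")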

✅ **Note:** GloVe isn’t trained from scratch here because doing so requires a large corpus and considerable compute; loading pretrained embeddings lets you explore word relationships directly.
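
If gensim is not available, the GloVe text format is simple enough to parse by hand: each line holds a word followed by its vector components, separated by spaces, with no header line. The sketch below uses made-up 3-dimensional values in an in-memory file, and `load_glove_text` is a hypothetical helper, not a library function.

```python
import io

import numpy as np

# Made-up values in the GloVe text layout; real files have 50 to 300 numbers per line.
sample_file = io.StringIO(
    "king 0.1 0.2 0.3\n"
    "queen 0.2 0.1 0.4\n"
)


def load_glove_text(handle):
    """Parse lines of the form '<word> <v1> <v2> ...' into a dict of NumPy arrays."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        vectors[word] = np.array([float(x) for x in values], dtype=np.float32)
    return vectors


vectors = load_glove_text(sample_file)
print(sorted(vectors))
print(vectors["king"].shape)
```

For a real file, replace the `io.StringIO` object with `open('glove.6B.100d.txt', encoding='utf-8')`. Loading through gensim remains the more convenient route because it also provides similarity queries.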

---
## 🔢 6️⃣ Comparing Word2Vec vs GloVe

| Feature | Word2Vec | GloVe |
|----------|-----------|--------|
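| Training Type | Predictive (shallow neural network) | Count-based (weighted least-squares factorization) |
| Data Used | Local context windows | Global co-occurrence matrix |
| Computation | Online: streams over the corpus with stochastic gradient descent | Two stages: build the co-occurrence matrix, then optimize over its nonzero entries |
| Interpretability | Harder (objective is predicting context words) | Easier (objective ties directly to co-occurrence statistics) |
| Speed | No co-occurrence matrix to build or store | Needs an up-front counting pass; training cost then depends on the number of nonzero co-occurrences |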

---
## 🧩 7️⃣ Applications of Word Embeddings
- Sentiment analysis
- Named Entity Recognition (NER)
- Document similarity
- Chatbots and question-answering
- Machine translation

---
## ✅ Summary
- **Word2Vec** learns word meaning using context prediction.
- **GloVe** learns embeddings from global word co-occurrences.
- Both produce **dense, semantic vectors** used across modern NLP tasks.

---
📘 **Next:** `07-Contextual_Embeddings_BERT.ipynb` — Learn how contextual embeddings like **BERT** go beyond Word2Vec by understanding words in context.