# GloVe

GloVe (Global Vectors for Word Representation) is an unsupervised learning algorithm that generates dense vector representations of words. Unlike Word2Vec, which learns embeddings by predicting words within local context windows, GloVe is trained on global co-occurrence statistics aggregated over the entire corpus.

The idea is simple but brilliant:
“Words that appear in similar contexts tend to have similar meanings.”



### ⚙️ How GloVe Works

- **Build a Co-occurrence Matrix**
Count how often word i appears in the context of word j across the entire corpus.

- **Compute Ratios**
GloVe focuses on the ratios of co-occurrence probabilities between word pairs. For example:
    - ice and steam both co-occur with solid, gas, water, but in different proportions.
    - These ratios help distinguish semantic relationships.

- **Train Word Vectors**
GloVe minimizes a weighted least-squares cost function so that the dot product of two word vectors approximates the logarithm of their co-occurrence count.
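
For reference, the cost function from the original GloVe paper can be written as below. The symbols are notation introduced for this note: $X_{ij}$ is the co-occurrence count of words $i$ and $j$, $w_i$ and $\tilde{w}_j$ are the word and context vectors, $b_i$ and $\tilde{b}_j$ are bias terms, and $V$ is the vocabulary size.

$$J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^\top \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2$$

$$f(x) = \begin{cases} (x / x_{\max})^{\alpha} & \text{if } x < x_{\max} \\ 1 & \text{otherwise} \end{cases}$$

The weighting function $f$ keeps very frequent pairs from dominating the sum and gives zero weight to pairs that never co-occur, since $f(0) = 0$. The paper reports using $x_{\max} = 100$ and $\alpha = 3/4$.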


Let’s say we have a tiny corpus:

    "I like deep learning"
    "I like NLP"
    "I enjoy flying"

From this, we build a co-occurrence matrix (window size = 1):


||I|like|deep|learning|NLP|enjoy|flying|
|--------|------|-------|------|-----|-------|-------|-------|
|I|0|2|0|0|0|1|0|
|like|2|0|1|0|1|0|0|
|deep|0|1|0|1|0|0|0|
|learning|0|0|1|0|0|0|0|
|NLP|0|1|0|0|0|0|0|
|enjoy|1|0|0|0|0|0|1|
|flying|0|0|0|0|0|1|0|
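
The counting step can be sketched in a few lines of Python. This is a minimal illustration, assuming whitespace tokenization and a symmetric window; the function name `build_cooccurrence` is made up for this example and is not part of any GloVe tooling.

```python
from collections import defaultdict

def build_cooccurrence(sentences, window_size=1):
    """Count symmetric co-occurrences within `window_size` tokens on each side."""
    counts = defaultdict(int)
    for sentence in sentences:
        tokens = sentence.split()  # assumption: plain whitespace tokenization
        for i, word in enumerate(tokens):
            # Look only to the right and record each pair in both directions.
            for j in range(i + 1, min(i + window_size + 1, len(tokens))):
                counts[(word, tokens[j])] += 1
                counts[(tokens[j], word)] += 1
    return counts

corpus = ["I like deep learning", "I like NLP", "I enjoy flying"]
cooc = build_cooccurrence(corpus, window_size=1)

# Print one matrix row per vocabulary word.
vocab = ["I", "like", "deep", "learning", "NLP", "enjoy", "flying"]
for row_word in vocab:
    row = [cooc[(row_word, col_word)] for col_word in vocab]
    print(f"{row_word:>8}", row)
```

With a window size of 1 only adjacent words are counted, so "I" and "like" co-occur twice while "I" and "NLP" do not co-occur at all.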

Now, GloVe tries to learn word vectors such that:

$$\text{dot}(\text{wordvector}_i, \text{wordvector}_j) \approx \log(\text{cooccurrence}(i, j))$$

So if “deep” and “learning” co-occur often, their vectors will have a high dot product. Pairs that never co-occur, such as “deep” and “flying”, have no defined log count, so they are left out of the objective and nothing pulls their vectors together.
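
To make the training idea concrete, here is a rough NumPy sketch that fits vectors to the tiny corpus above. It is not the reference implementation: it assumes plain full-batch gradient descent (the released GloVe code uses AdaGrad), it adds the bias terms and the count weighting from the GloVe paper, and it sets `x_max=2.0` only because the toy counts are so small. The function name `train_glove` and all hyperparameters are choices made for this example.

```python
import numpy as np

def train_glove(X, dim=5, epochs=500, lr=0.05, x_max=2.0, alpha=0.75, seed=0):
    """Fit W[i] @ C[j] + b[i] + c[j] ~= log X[i, j] for every pair with X[i, j] > 0."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(n, dim))  # word vectors
    C = rng.normal(scale=0.1, size=(n, dim))  # context vectors
    b = np.zeros(n)                           # word biases
    c = np.zeros(n)                           # context biases

    nonzero = X > 0
    log_X = np.log(np.where(nonzero, X, 1.0))  # placeholder 0 where X == 0
    # Weight is 0 for pairs that never co-occur, so they do not affect the loss.
    weight = np.where(nonzero, np.minimum(X / x_max, 1.0) ** alpha, 0.0)

    losses = []
    for _ in range(epochs):
        err = W @ C.T + b[:, None] + c[None, :] - log_X
        losses.append(float(np.sum(weight * err ** 2)))
        grad = 2.0 * weight * err              # d(loss) / d(prediction)
        grad_W = grad @ C
        grad_C = grad.T @ W
        W -= lr * grad_W
        C -= lr * grad_C
        b -= lr * grad.sum(axis=1)
        c -= lr * grad.sum(axis=0)
    return W + C, losses                       # sum of both vector sets

# Co-occurrence counts for the tiny corpus (window size = 1), in the order
# I, like, deep, learning, NLP, enjoy, flying.
X = np.array([
    [0, 2, 0, 0, 0, 1, 0],
    [2, 0, 1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 1, 0],
])
vectors, losses = train_glove(X)
print("first loss:", losses[0], "last loss:", losses[-1])
```

On a corpus this small the learned vectors carry little meaning; the point is only to show the shape of the optimization.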

### ⚙️ What Makes GloVe Special?

Instead of just using raw counts, GloVe focuses on ratios of co-occurrence probabilities. For example:
- “ice” co-occurs more with “cold” than “steam” does.
- “steam” co-occurs more with “hot” than “ice” does.

These ratios help GloVe learn that “ice” and “steam” are related but differ along the temperature dimension.
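
A small numeric sketch shows why ratios are informative. The counts below are invented purely for illustration and do not come from any real corpus; the helper name `context_probabilities` is also made up for this example.

```python
# Hypothetical co-occurrence counts, invented for illustration only.
counts = {
    "ice":   {"cold": 80, "hot": 5,  "water": 60},
    "steam": {"cold": 4,  "hot": 70, "water": 55},
}

def context_probabilities(word_counts):
    """Turn raw counts into P(context | word)."""
    total = sum(word_counts.values())
    return {ctx: n / total for ctx, n in word_counts.items()}

p_ice = context_probabilities(counts["ice"])
p_steam = context_probabilities(counts["steam"])

# Ratio P(context | ice) / P(context | steam) for each context word.
ratios = {ctx: p_ice[ctx] / p_steam[ctx] for ctx in p_ice}
for ctx, r in ratios.items():
    print(f"P({ctx}|ice) / P({ctx}|steam) = {r:.2f}")
```

With these made-up numbers the ratio is far above 1 for “cold”, far below 1 for “hot”, and close to 1 for “water”, which relates to both words about equally. A ratio near 1 therefore marks a context that does not help tell the two words apart.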


#### Example with a pretrained GloVe model

    import numpy as np

    # Load GloVe embeddings from a plain-text file (e.g., glove.6B.100d.txt).
    # Each line holds a word followed by its space-separated vector components.
    def load_glove_embeddings(file_path):
        embeddings = {}
        with open(file_path, 'r', encoding='utf8') as f:
            for line in f:
                parts = line.strip().split()
                word = parts[0]
                vector = np.array(parts[1:], dtype='float32')
                embeddings[word] = vector
        return embeddings

    # Load embeddings
    glove_path = './glove.6B/glove.6B.100d.txt'  # Download from https://nlp.stanford.edu/projects/glove/
    glove = load_glove_embeddings(glove_path)

    # Example: vector for "king"
    print("Vector for 'king':\n", glove['king'])

    # Analogy: king - man + woman ≈ queen
    # Ranks every vocabulary word by its raw dot product with the combined
    # vector and returns the top 10. Vectors are not length-normalized and the
    # query words themselves are not excluded from the results.
    def analogy(w1, w2, w3):
        vec = glove[w1] - glove[w2] + glove[w3]
        return sorted(glove.keys(), key=lambda word: np.dot(glove[word], vec), reverse=True)[:10]

    print("\nAnalogy result for 'king - man + woman':", analogy('king', 'man', 'woman'))

    Vector for 'king':
     [-0.32307  -0.87616   0.21977   0.25268   0.22976   0.7388   -0.37954
     -0.35307  -0.84369  -1.1113   -0.30266   0.33178  -0.25113   0.30448
     -0.077491 -0.89815   0.092496 -1.1407   -0.58324   0.66869  -0.23122
     -0.95855   0.28262  -0.078848  0.75315   0.26584   0.3422   -0.33949
      0.95608   0.065641  0.45747   0.39835   0.57965   0.39267  -0.21851
      0.58795  -0.55999   0.63368  -0.043983 -0.68731  -0.37841   0.38026
      0.61641  -0.88269  -0.12346  -0.37928  -0.38318   0.23868   0.6685
     -0.43321  -0.11065   0.081723  1.1569    0.78958  -0.21223  -2.3211
     -0.67806   0.44561   0.65707   0.1045    0.46217   0.19912   0.25802
      0.057194  0.53443  -0.43133  -0.34311   0.59789  -0.58417   0.068995
      0.23944  -0.85181   0.30379  -0.34177  -0.25746  -0.031101 -0.16285
      0.45169  -0.91627   0.64521   0.73281  -0.22752   0.30226   0.044801
     -0.83741   0.55006  -0.52506  -1.7357    0.4751   -0.70487   0.056939
     -0.7132    0.089623  0.41394  -1.3363   -0.61915  -0.33089  -0.52881

Ref: https://nlp.stanford.edu/projects/glove/
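
The `analogy` function above ranks candidates by raw dot product, which favors long vectors, and it does not exclude the query words. A common alternative is to rank by cosine similarity and skip the three input words. The sketch below is one way to do that; `analogy_cosine` is a name invented here, it assumes no embedding is an all-zero vector, and the toy 2-dimensional vectors are made up so that the example runs without downloading anything.

```python
import numpy as np

def analogy_cosine(embeddings, a, b, c, top_n=5):
    """Rank words by cosine similarity to (a - b + c), skipping a, b and c."""
    words = [w for w in embeddings if w not in (a, b, c)]
    matrix = np.stack([embeddings[w] for w in words])
    # Normalize each row to unit length (assumes no all-zero vectors).
    matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    target = embeddings[a] - embeddings[b] + embeddings[c]
    target = target / np.linalg.norm(target)
    scores = matrix @ target                  # cosine similarities
    best = np.argsort(-scores)[:top_n]        # indices of the highest scores
    return [(words[i], float(scores[i])) for i in best]

# Toy vectors invented for illustration: axis 0 ~ royalty, axis 1 ~ gender.
toy = {
    "king":  np.array([1.0, 1.0]),
    "queen": np.array([1.0, -1.0]),
    "man":   np.array([0.0, 1.0]),
    "woman": np.array([0.0, -1.0]),
    "apple": np.array([-1.0, 0.1]),
}
result = analogy_cosine(toy, "king", "man", "woman", top_n=1)
print(result)
```

With real embeddings loaded by `load_glove_embeddings`, the call would be `analogy_cosine(glove, 'king', 'man', 'woman')`. Building the matrix once and using a single matrix-vector product is also much faster than calling `np.dot` once per vocabulary word.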

#### Same example with a higher-dimensional pretrained GloVe model

    import numpy as np

    # Load GloVe embeddings from a plain-text file (here glove.6B.300d.txt).
    # Each line holds a word followed by its space-separated vector components.
    def load_glove_embeddings(file_path):
        embeddings = {}
        with open(file_path, 'r', encoding='utf8') as f:
            for line in f:
                parts = line.strip().split()
                word = parts[0]
                vector = np.array(parts[1:], dtype='float32')
                embeddings[word] = vector
        return embeddings

    # Load the 300-dimensional embeddings
    glove_path = './glove.6B/glove.6B.300d.txt'  # Download from https://nlp.stanford.edu/projects/glove/
    glove = load_glove_embeddings(glove_path)

    # Example: vector for "king"
    print("Vector for 'king':\n", glove['king'])

    # Analogy: king - man + woman ≈ queen
    # Same ranking as before: raw dot product, top 10, query words not excluded.
    def analogy(w1, w2, w3):
        vec = glove[w1] - glove[w2] + glove[w3]
        return sorted(glove.keys(), key=lambda word: np.dot(glove[word], vec), reverse=True)[:10]

    print("\nAnalogy result for 'king - man + woman':", analogy('king', 'man', 'woman'))

    Vector for 'king':
     [ 0.0033901 -0.34614    0.28144    0.48382    0.59469    0.012965
      0.53982    0.48233    0.21463   -1.0249    -0.34788   -0.79001
     -0.15084    0.61374    0.042811   0.19323    0.25462    0.32528
      0.05698    0.063253  -0.49439    0.47337   -0.16761    0.045594
      0.30451   -0.35416   -0.34583   -0.20118    0.25511    0.091111
      0.014651  -0.017541  -0.23854    0.48215   -0.9145    -0.36235
      0.34736    0.028639  -0.027065  -0.036481  -0.067391  -0.23452
     -0.13772    0.33951    0.13415   -0.1342     0.47856   -0.1842
      0.10705   -0.45834   -0.36085   -0.22595    0.32881   -0.13643
      0.23128    0.34269    0.42344    0.47057    0.479      0.074639
      0.3344     0.10714   -0.13289    0.58734    0.38616   -0.52238
     -0.22028   -0.072322   0.32269    0.44226   -0.037382   0.18324
      0.058082   0.26938    0.36202    0.13983    0.016815  -0.34426
      0.4827     0.2108     0.75618   -0.13092   -0.025741   0.43391
      0.33893   -0.16438    0.26817    0.68774    0.311     -0.2509
      0

Notice that the word “monarch” newly appears in the analogy results of the higher-dimensional model.

### GloVe vs Word2Vec

|Feature|GloVe|Word2Vec|
|------|--------|---------|
|Context|Global (co-occurrence matrix)|Local (context window)|
|Training Objective|Matrix factorization|Predictive (CBOW/Skip-Gram)|
|Interpretability|More interpretable|Less interpretable|
|Performance|Strong on analogy tasks|Strong on similarity tasks|