**Word2Vec** is a popular technique for representing words as dense vectors, which enables tasks like word analogy and similarity computation. Below is a breakdown of its mathematical intuition and working principles, followed by a short gensim walkthrough on pretrained vectors:

### **Mathematical Intuition Behind Word2Vec**
Word2Vec generates vector representations of words using a **neural network-based** approach. The goal is to place words with similar meanings closer together in the vector space.

---

### **1. Word Representation as Vectors**
Each word is represented as a **K-dimensional vector**. K is a hyperparameter; a common choice is **300** (as in the pretrained Google News vectors loaded later in this notebook), so each word maps to an array of 300 floating-point numbers.

**Example:**
- \( \text{vec}(\text{king}) \) → 300-dimensional vector
- \( \text{vec}(\text{queen}) \) → 300-dimensional vector

These vectors capture semantic meaning, enabling operations like:
\[
\text{king} - \text{man} + \text{woman} \approx \text{queen}
\]
which means that the model captures **analogies** in language.

---

### **2. Cosine Similarity for Measuring Word Relationships**
To determine how similar two words are, their Word2Vec vectors are usually compared with **cosine similarity**:

\[
\text{sim}(a, b) = \frac{\vec{a} \cdot \vec{b}}{|\vec{a}| |\vec{b}|}
\]

- \( \vec{a} \cdot \vec{b} \) is the dot product of two word vectors.
- \( |\vec{a}| \) and \( |\vec{b}| \) are the magnitudes of the vectors.

Cosine similarity ranges from **-1** to **1** and depends only on the direction of the vectors, not on their length. If two words are similar, the value is close to **1**; if they are unrelated, it is close to **0**; it can also be negative.
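The formula above translates directly into a few lines of NumPy. This is a minimal sketch; the helper name `cosine_similarity` and the example vectors are made up for illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Dot product divided by the product of the vector magnitudes.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Same direction, different length: similarity is (numerically) 1.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
# Perpendicular vectors: similarity is 0.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
# Opposite directions: similarity is -1.
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])

print(same_direction, orthogonal, opposite)
```

Because the magnitudes are divided out, scaling either vector leaves the result unchanged.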

---

### **3. Training Word2Vec**
There are two main architectures used to train Word2Vec:

#### **A. Continuous Bag of Words (CBOW)**
- Predicts a target word given its surrounding context words.
- Example: Given ["the", "king", "a", "throne"], predict "rules".
- Uses a **shallow neural network** to learn word embeddings.

#### **B. Skip-Gram Model**
- Predicts context words given a single target word.
- Example: Given "king", predict ["queen", "throne", "monarch"].
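The difference between the two architectures is easiest to see in the training examples each one extracts from a sentence. The sketch below is only an illustration: the sentence order ("the king rules a throne"), the window size of 2, and the helper names are assumptions, and it shows how examples are built, not how the network is trained.

```python
def cbow_examples(tokens, window=2):
    """CBOW: (context words, target) examples, one per position."""
    examples = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, target))
    return examples

def skipgram_pairs(tokens, window=2):
    """Skip-gram: (target, context word) pairs, one per neighbour."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

sentence = ["the", "king", "rules", "a", "throne"]

cbow = cbow_examples(sentence)
skipgram = skipgram_pairs(sentence)

print(cbow[2])                                  # example centred on "rules"
print([p for p in skipgram if p[0] == "rules"]) # pairs with "rules" as target
```

CBOW produces one example per position with the whole context as input, while Skip-gram produces one pair per (target, neighbour) combination.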

Both models use a neural network with a single hidden layer and no non-linearity. After training, the input-to-hidden weight matrix, which has one row per vocabulary word, is used as the word embeddings.
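One way to see why a weight matrix can double as an embedding table: multiplying a one-hot input vector by the input-to-hidden weight matrix simply selects one row of that matrix. A minimal NumPy sketch, assuming a hypothetical 5-word vocabulary, K = 4, and random weights standing in for trained ones:

```python
import numpy as np

vocab = ["the", "king", "rules", "a", "throne"]
word_to_index = {word: i for i, word in enumerate(vocab)}

V, K = len(vocab), 4                 # vocabulary size, embedding dimension
rng = np.random.default_rng(0)
W_in = rng.normal(size=(V, K))       # stand-in for trained input-to-hidden weights

def one_hot(word):
    """Length-V vector with a 1 at the word's index and 0 elsewhere."""
    x = np.zeros(V)
    x[word_to_index[word]] = 1.0
    return x

# Hidden-layer activation for "king": one-hot input times the weight matrix.
hidden = one_hot("king") @ W_in
# The same values, read directly as a row of the matrix.
embedding = W_in[word_to_index["king"]]

print(np.allclose(hidden, embedding))
```

In practice the matrix product is never computed; the row is looked up by index, which is what `wv['king']` does in the cells later in this notebook.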

---

### **4. Word Analogies and Vector Arithmetic**
A key feature of Word2Vec is the ability to compute analogies:

\[
\hat{x} = \arg\max_{x' \in V} \text{sim}(x', \text{king} + \text{woman} - \text{man})
\]

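This equation selects the vocabulary word \( \hat{x} \) whose vector is most similar to the vector obtained by adding and subtracting the query word vectors. In practice the three query words are excluded from the candidates, because otherwise one of them is often the nearest neighbour.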

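Alternative versions of this equation score each candidate by combining three separate similarities. The additive form gives the same answer as the equation above when all vectors are normalized to unit length:

\[
\hat{x} = \arg\max_{x' \in V} \left[ \text{sim}(x', \text{king}) + \text{sim}(x', \text{woman}) - \text{sim}(x', \text{man}) \right]
\]

or with multiplication, where a small constant \( \varepsilon \) prevents division by zero:

\[
\hat{x} = \arg\max_{x' \in V} \frac{\text{sim}(x', \text{king}) \cdot \text{sim}(x', \text{woman})}{\text{sim}(x', \text{man}) + \varepsilon}
\]

These formulations allow capturing **semantic relationships** between words.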
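The three scoring rules above can be sketched in NumPy on a tiny made-up vocabulary. Everything here is an assumption for illustration: the 3-dimensional vectors, the helper names, the exclusion of the three query words from the candidates, and, in the multiplicative rule, the shift of each cosine into [0, 1] and the small `eps` in the denominator (added so the ratio stays well defined).

```python
import numpy as np

# Made-up toy embeddings with dimensions [royalty, male, female].
emb = {
    "king":   np.array([1.0, 1.0, 0.0]),
    "queen":  np.array([1.0, 0.0, 1.0]),
    "man":    np.array([0.0, 1.0, 0.0]),
    "woman":  np.array([0.0, 0.0, 1.0]),
    "child":  np.array([0.0, 0.5, 0.5]),
    "throne": np.array([1.0, 0.0, 0.0]),
}

def sim(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def vector_offset(x, pos1, pos2, neg):
    # Similarity to the combined vector pos1 + pos2 - neg.
    return sim(x, pos1 + pos2 - neg)

def additive(x, pos1, pos2, neg):
    # Sum and difference of three separate similarities.
    return sim(x, pos1) + sim(x, pos2) - sim(x, neg)

def multiplicative(x, pos1, pos2, neg, eps=1e-3):
    # Cosines shifted to [0, 1] so that products and ratios are non-negative.
    shift = lambda s: (s + 1.0) / 2.0
    return shift(sim(x, pos1)) * shift(sim(x, pos2)) / (shift(sim(x, neg)) + eps)

def best_match(score, pos1, pos2, neg):
    """Highest-scoring word, skipping the three query words."""
    candidates = [w for w in emb if w not in (pos1, pos2, neg)]
    return max(candidates, key=lambda w: score(emb[w], emb[pos1], emb[pos2], emb[neg]))

answers = {
    fn.__name__: best_match(fn, "king", "woman", "man")
    for fn in (vector_offset, additive, multiplicative)
}
print(answers)
```

On real embeddings the rules can disagree, so this toy case only shows how each score is computed, not which one works best.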

---

### **5. Strengths and Limitations**
✅ **Strengths:**
- Captures word relationships effectively.
- Efficient and scalable for large corpora.
- Enables analogy reasoning.

❌ **Limitations:**
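- Word2Vec is **static**: each word has a single vector regardless of context, so the different senses of a word such as "bank" share one representation (unlike contextual models such as BERT).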
- Requires a large dataset to generalize well.
- Can encode biases present in the training data.

---

### **Conclusion**
Word2Vec is a powerful tool for **word embeddings** that enables vector-based word similarity computations. It is widely used in NLP for **semantic analysis**, **text classification**, and **machine translation**. The mathematical foundation of Word2Vec relies on **vector spaces, cosine similarity, and neural network optimization** to generate meaningful word representations.


In [1]:
!pip install gensim



In [2]:
import gensim

In [3]:
from gensim.models import Word2Vec, KeyedVectors

In [4]:
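import gensim.downloader as api
# Download (on first use) and load the pretrained Google News vectors; the file is large (well over 1 GB).
wv = api.load('word2vec-google-news-300')
# Look up the 300-dimensional vector for "king".
vec_king = wv['king']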

In [5]:
vec_king

array([ 1.25976562e-01,  2.97851562e-02,  8.60595703e-03,  1.39648438e-01,
       -2.56347656e-02, -3.61328125e-02,  1.11816406e-01, -1.98242188e-01,
        5.12695312e-02,  3.63281250e-01, -2.42187500e-01, -3.02734375e-01,
       -1.77734375e-01, -2.49023438e-02, -1.67968750e-01, -1.69921875e-01,
        3.46679688e-02,  5.21850586e-03,  4.63867188e-02,  1.28906250e-01,
        1.36718750e-01,  1.12792969e-01,  5.95703125e-02,  1.36718750e-01,
        1.01074219e-01, -1.76757812e-01, -2.51953125e-01,  5.98144531e-02,
        3.41796875e-01, -3.11279297e-02,  1.04492188e-01,  6.17675781e-02,
        1.24511719e-01,  4.00390625e-01, -3.22265625e-01,  8.39843750e-02,
        3.90625000e-02,  5.85937500e-03,  7.03125000e-02,  1.72851562e-01,
        1.38671875e-01, -2.31445312e-01,  2.83203125e-01,  1.42578125e-01,
        3.41796875e-01, -2.39257812e-02, -1.09863281e-01,  3.32031250e-02,
       -5.46875000e-02,  1.53198242e-02, -1.62109375e-01,  1.58203125e-01,
       -2.59765625e-01,  

In [6]:
vec_king.shape

(300,)

In [7]:
wv['cricket']

array([-3.67187500e-01, -1.21582031e-01,  2.85156250e-01,  8.15429688e-02,
        3.19824219e-02, -3.19824219e-02,  1.34765625e-01, -2.73437500e-01,
        9.46044922e-03, -1.07421875e-01,  2.48046875e-01, -6.05468750e-01,
        5.02929688e-02,  2.98828125e-01,  9.57031250e-02,  1.39648438e-01,
       -5.41992188e-02,  2.91015625e-01,  2.85156250e-01,  1.51367188e-01,
       -2.89062500e-01, -3.46679688e-02,  1.81884766e-02, -3.92578125e-01,
        2.46093750e-01,  2.51953125e-01, -9.86328125e-02,  3.22265625e-01,
        4.49218750e-01, -1.36718750e-01, -2.34375000e-01,  4.12597656e-02,
       -2.15820312e-01,  1.69921875e-01,  2.56347656e-02,  1.50146484e-02,
       -3.75976562e-02,  6.95800781e-03,  4.00390625e-01,  2.09960938e-01,
        1.17675781e-01, -4.19921875e-02,  2.34375000e-01,  2.03125000e-01,
       -1.86523438e-01, -2.46093750e-01,  3.12500000e-01, -2.59765625e-01,
       -1.06933594e-01,  1.04003906e-01, -1.79687500e-01,  5.71289062e-02,
       -7.41577148e-03, -

In [8]:
wv.most_similar(['cricket'])

[('cricketing', 0.8372227549552917),
 ('cricketers', 0.8165745735168457),
 ('Test_cricket', 0.8094819188117981),
 ('Twenty##_cricket', 0.8068487048149109),
 ('Twenty##', 0.762426495552063),
 ('Cricket', 0.7541398406028748),
 ('cricketer', 0.7372578382492065),
 ('twenty##', 0.7316358685493469),
 ('T##_cricket', 0.7304614186286926),
 ('West_Indies_cricket', 0.698798656463623)]

In [9]:
wv.most_similar(['wife'])

[('husband', 0.8294166326522827),
 ('daughter', 0.7662221193313599),
 ('fiancée', 0.7583051919937134),
 ('mother', 0.7550681233406067),
 ('fiancee', 0.7449482083320618),
 ('daughters', 0.7342471480369568),
 ('girlfriend', 0.7102156281471252),
 ('niece', 0.7085863351821899),
 ('estranged_wife', 0.7017824053764343),
 ('eldest_daughter', 0.6898223161697388)]

In [14]:
wv.similarity('cricket','cricketers')

0.8165746

In [15]:
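# Vector arithmetic on raw vectors: king + man - queen
# (note that this is not the king - man + woman analogy from section 4).
vec = wv['king'] + wv['man'] - wv['queen']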

In [16]:
vec

array([ 4.46899414e-01,  3.04199219e-01,  1.12609863e-01, -6.68945312e-02,
       -6.76269531e-02,  1.14746094e-02, -1.51367188e-02,  2.54516602e-02,
        3.92089844e-01,  3.09562683e-01, -1.37695312e-01, -1.71875000e-01,
       -3.65722656e-01, -1.75453186e-01, -4.02832031e-01, -2.41943359e-01,
       -1.35742188e-01,  1.11419678e-01,  1.02539062e-02, -6.59179688e-02,
       -1.84570312e-01, -2.26287842e-02,  3.22021484e-01, -1.61132812e-02,
        9.96093750e-02, -3.14941406e-01, -2.14843750e-02,  2.27783203e-01,
        5.28320312e-01,  9.04541016e-02,  4.22851562e-01,  3.29345703e-01,
        1.43066406e-01,  1.66870117e-01, -6.34765625e-02,  4.69726562e-01,
        1.90917969e-01, -3.84765625e-01,  2.36816406e-01,  3.09570312e-01,
        6.32324219e-02,  1.45996094e-01,  2.25585938e-01,  1.62170410e-01,
        3.88671875e-01, -4.58984375e-02, -1.81915283e-01, -1.01806641e-01,
       -2.62695312e-01, -2.14172363e-01, -2.77954102e-01,  1.84570312e-01,
       -2.56835938e-01, -

In [17]:
wv.most_similar([vec])

[('man', 0.7067604064941406),
 ('boy', 0.4814143776893616),
 ('Alexios_Marakis', 0.4358147382736206),
 ('guy', 0.4347025454044342),
 ('Man', 0.4208385944366455),
 ('king', 0.4176100790500641),
 ('dude', 0.39430728554725647),
 ('him', 0.39286789298057556),
 ('businessman', 0.39085325598716736),
 ('he', 0.38446497917175293)]
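
The query built in In [15] is king + man - queen rather than the king - man + woman analogy from section 4, and its top hit is 'man', one of the query words. This is typical when a raw vector is passed to the nearest-neighbour search, because the query words themselves are not filtered out. gensim's `most_similar` also accepts `positive` and `negative` lists of words (for example `wv.most_similar(positive=['king', 'woman'], negative=['man'])`), in which case the input words are skipped in the results. The sketch below illustrates the effect with NumPy on made-up 3-dimensional vectors; the vectors and the `nearest` helper are hypothetical and are not taken from the model.

```python
import numpy as np

# Made-up toy vectors with dimensions [royalty, gender, person-ness].
toy = {
    "king":   np.array([1.0, -0.3, 1.0]),
    "queen":  np.array([1.0,  1.5, 1.0]),
    "man":    np.array([0.0, -0.3, 1.0]),
    "woman":  np.array([0.0,  0.3, 1.0]),
    "throne": np.array([1.0,  0.0, 0.0]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest(query, vectors, exclude=()):
    """Word with the highest cosine similarity to query, skipping excluded words."""
    candidates = {w: v for w, v in vectors.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(candidates[w], query))

query = toy["king"] - toy["man"] + toy["woman"]

# Without filtering, a query word is the nearest neighbour.
raw_best = nearest(query, toy)
# With the query words excluded, the analogy answer surfaces.
filtered_best = nearest(query, toy, exclude=("king", "man", "woman"))

print(raw_best, filtered_best)
```

The gender offset in these toy vectors is small compared with the rest of the "king" vector, so the combined query stays closest to "king" until the query words are removed from the candidates.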