# Embedding Layer in TensorFlow/Keras

## Overview
This guide explains how to create an embedding layer in TensorFlow/Keras that maps words to dense numerical vectors. A one-hot vector needs one dimension per vocabulary word and is almost entirely zeros, which is inefficient; an embedding layer instead learns a short, dense vector for each word.

## Steps to Make Embedding Layers

### 1️⃣ Define Sentences
- **Action:** Create text data.
- **Purpose:** Provides input for the model.

```python
sent = [
    'the glass of milk',
    'the glass of juice',
    'the cup of tea',
    'I am a good boy',
    'I am a good developer',
    'understand the meaning of words',
    'your videos are good',
]
```

---

### 2️⃣ One-Hot Encoding
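- **Action:** Convert each word to an integer index.
- **Purpose:** Makes words machine-readable. Despite its name, `one_hot` does not build one-hot vectors: it hashes each word to an integer in the range `[1, voc_size - 1]`, so two different words can occasionally share the same index.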

```python
from tensorflow.keras.preprocessing.text import one_hot
voc_size = 10000  # Define vocabulary size
onehot_repr = [one_hot(words, voc_size) for words in sent]
```
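
To make the hashing step concrete, here is a minimal pure-Python sketch of the idea. It is not the Keras implementation: `hash_encode` is a hypothetical helper, and MD5 is an assumed stand-in for a stable hash, so the indices it produces will not match the ones Keras returns.

```python
import hashlib

def hash_encode(sentence, vocab_size):
    """Map each word to an integer in [1, vocab_size - 1] with a stable hash."""
    indices = []
    for word in sentence.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        # Index 0 is never produced, which leaves it free for padding later.
        indices.append(int(digest, 16) % (vocab_size - 1) + 1)
    return indices

a = hash_encode("the glass of milk", 10000)
b = hash_encode("the glass of juice", 10000)
print(a)
print(b)  # the first three indices match `a` because the words are the same

# With a tiny vocabulary, seven distinct words must share only two indices.
tiny = hash_encode("the glass of milk juice cup tea", 3)
print(tiny)
```

The same word always maps to the same index, which is why the shared prefix "the glass of" yields identical numbers in both sentences. A larger `voc_size` makes collisions less likely but never rules them out.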

---

### 3️⃣ Padding
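- **Action:** Make all sentences the same length.
- **Purpose:** Keras layers expect a rectangular batch, so every sequence must have the same length. With `padding='post'`, zeros are appended to the end of shorter sentences.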

```python
from tensorflow.keras.utils import pad_sequences

fixed_length = 8  # Define the fixed sentence length
padded_sentences = pad_sequences(onehot_repr, maxlen=fixed_length, padding='post')
```
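
The padding step can be sketched in plain Python to show what happens to each sequence. `pad_post` is a hypothetical helper written for illustration. It assumes that over-long sequences lose tokens from the front, which is my understanding of the `pad_sequences` default (`truncating='pre'`).

```python
def pad_post(sequences, maxlen, value=0):
    """Append `value` to short sequences; trim long ones from the front."""
    padded = []
    for seq in sequences:
        trimmed = seq[-maxlen:]  # keep at most the last `maxlen` tokens
        padded.append(trimmed + [value] * (maxlen - len(trimmed)))
    return padded

short = pad_post([[4461, 7876, 5948, 4178]], maxlen=8)
long_ = pad_post([[1, 2, 3, 4, 5]], maxlen=3)
print(short)  # [[4461, 7876, 5948, 4178, 0, 0, 0, 0]]
print(long_)  # [[3, 4, 5]]
```

Because the word indices start at 1, the value 0 is free to act as the padding marker.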

---

### 4️⃣ Embedding Layer
- **Action:** Map each integer index to a dense vector of size `output_dim`.
- **Purpose:** Gives the model trainable word representations that can capture relationships between words.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding

model = Sequential([
    Embedding(input_dim=voc_size, output_dim=10, input_length=fixed_length),
])
```
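
Conceptually, an embedding layer is a lookup table with one row per vocabulary index. The sketch below illustrates that idea in plain Python; it is not how Keras implements the layer. The names `table` and `embed` are hypothetical, and the initial range of -0.05 to 0.05 is an assumption chosen to resemble the magnitudes of an untrained layer.

```python
import random

vocab_size, embed_dim = 10000, 10
rng = random.Random(0)  # fixed seed so the sketch is reproducible

# One row of `embed_dim` small random numbers per vocabulary index.
table = [[rng.uniform(-0.05, 0.05) for _ in range(embed_dim)]
         for _ in range(vocab_size)]

def embed(token_ids):
    """Return the table row for every token ID in the sequence."""
    return [table[i] for i in token_ids]

vectors = embed([4461, 7876, 5948, 4178, 0, 0, 0, 0])
print(len(vectors), len(vectors[0]))  # 8 10
print(vectors[4] == vectors[5])       # True: both positions hold padding index 0
```

Training adjusts the rows of the table; the lookup itself stays the same. Every padded position maps to the same row because it shares index 0.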

---

### 5️⃣ Model Compilation
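- **Action:** Prepare the model for training by setting an optimizer and a loss.
- **Purpose:** Training is what lets the embedding learn word meanings. This guide never calls `fit`, so the embedding weights stay at their random initial values.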

```python
model.compile(optimizer='adam', loss='mse')
model.summary()
```
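
As a quick sanity check on the numbers `model.summary()` reports, the parameter count of an embedding layer is the vocabulary size times the vector size. The arithmetic below assumes 4-byte float32 weights.

```python
voc_size = 10000   # rows in the embedding table
output_dim = 10    # values per row

params = voc_size * output_dim
size_kb = params * 4 / 1024  # assumes 4 bytes per float32 weight

print(params)   # 100000
print(size_kb)  # 390.625
```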

---

### 6️⃣ Prediction
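- **Action:** Look up the embedding vector for every token in the padded sentences.
- **Purpose:** Outputs the vector representations. Because the model has not been trained, these are the randomly initialized vectors, not learned word meanings.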

```python
embeddings = model.predict(padded_sentences)
print(embeddings.shape)  # (7, 8, 10) -> 7 sentences, 8 token positions each (padding included), 10-d vector per position
```

---

## Why Use an Embedding Layer Instead of One-Hot Encoding?
- ✅ One-hot encoding produces **huge sparse vectors** (one dimension per vocabulary word), which wastes memory.
- ✅ The embedding layer **learns relationships** during training (e.g., "king" and "queen" can end up with similar vectors).
- ✅ Similar words get nearby vectors, so the model can generalize across words and **capture meaning**.
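
The size difference and the lack of similarity information in one-hot vectors can both be checked with a little arithmetic. This is a minimal sketch using the dimensions from this guide; `one_hot_vector` and `dot` are hypothetical helpers, and the two indices are taken from the example output for "glass" and "cup".

```python
def one_hot_vector(index, size):
    """Build a true one-hot vector: all zeros except a 1 at `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

num_sentences, seq_len, vocab_size, embed_dim = 7, 8, 10000, 10

one_hot_values = num_sentences * seq_len * vocab_size  # values stored as one-hot
embedded_values = num_sentences * seq_len * embed_dim  # values stored as embeddings
print(one_hot_values, embedded_values)  # 560000 560

# Two different one-hot vectors never overlap, so their dot product is 0.
glass = one_hot_vector(7876, vocab_size)
cup = one_hot_vector(6922, vocab_size)
print(dot(glass, cup))  # 0
```

A dot product of zero for every pair of distinct words means one-hot vectors cannot express that two words are related, whereas dense vectors can move closer together during training.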

---

## Summary
| **Step** | **Action** | **Purpose** |
|----------|-----------|-------------|
| 1️⃣ Define Sentences | Create text data | Provides input for the model |
| 2️⃣ One-Hot Encoding | Convert words to integer indices | Makes words machine-readable |
| 3️⃣ Padding | Make all sentences the same length | Gives the model a rectangular batch |
| 4️⃣ Embedding Layer | Convert indices to dense vectors | Captures word relationships |
| 5️⃣ Model Compilation | Set optimizer and loss | Prepares the model for training |
| 6️⃣ Prediction | Generate embeddings | Outputs vector representations |



In [1]:
# Sentences
sent = [
    'the glass of milk',
    'the glass of juice',
    'the cup of tea',
    'I am a good boy',
    'I am a good developer',
    'understand the meaning of words',
    'your videos are good',
]

In [2]:
# Define the vocabulary size
voc_size = 10000


In [3]:
from tensorflow.keras.preprocessing.text import one_hot
onehot_repr = [one_hot(words, voc_size) for words in sent]
onehot_repr


In [14]:
onehot_repr

[[4461, 7876, 5948, 4178],
 [4461, 7876, 5948, 8760],
 [4461, 6922, 5948, 774],
 [2667, 5741, 5824, 6347, 8747],
 [2667, 5741, 5824, 6347, 6316],
 [9838, 4461, 4740, 5948, 8085],
 [6486, 7058, 1200, 6347]]

In [15]:
from tensorflow.keras.layers import Embedding
from tensorflow.keras.utils import pad_sequences
from tensorflow.keras.models import Sequential

In [21]:
# Apply padding so that all sentences have the same length
fixed_length = 8
padded_sentences = pad_sequences(onehot_repr, maxlen=fixed_length, padding='post')
padded_sentences

array([[4461, 7876, 5948, 4178,    0,    0,    0,    0],
       [4461, 7876, 5948, 8760,    0,    0,    0,    0],
       [4461, 6922, 5948,  774,    0,    0,    0,    0],
       [2667, 5741, 5824, 6347, 8747,    0,    0,    0],
       [2667, 5741, 5824, 6347, 6316,    0,    0,    0],
       [9838, 4461, 4740, 5948, 8085,    0,    0,    0],
       [6486, 7058, 1200, 6347,    0,    0,    0,    0]], dtype=int32)

In [23]:
model = Sequential(
  [
    Embedding(input_dim=voc_size, output_dim=10, input_length=fixed_length),
  ]
)
model.compile('adam', 'mse')
model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_1 (Embedding)     (None, 8, 10)             100000    
                                                                 
Total params: 100000 (390.62 KB)
Trainable params: 100000 (390.62 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


In [24]:
model.predict(padded_sentences)



array([[[ 0.04333014,  0.02350302, -0.02824965,  0.01726067,
          0.00651807, -0.0306138 , -0.03183967,  0.01574612,
          0.03968767,  0.0188956 ],
        [-0.0246981 , -0.04529704,  0.03991035,  0.02098986,
          0.0012746 ,  0.00517123,  0.03313885, -0.00806148,
          0.0284325 , -0.04138196],
        [-0.01828529, -0.01818435,  0.02403356,  0.0450044 ,
         -0.01886544,  0.04500891, -0.04477687,  0.00639769,
          0.02108623, -0.02160677],
        [ 0.01042484, -0.04008345, -0.0101433 , -0.01027871,
         -0.02335546,  0.04838714, -0.02591652,  0.04670319,
          0.04724597,  0.01240114],
        [-0.03741231,  0.01323703, -0.04453991,  0.00029878,
         -0.03997507, -0.01352666, -0.01122178, -0.02556491,
         -0.03592603, -0.04666705],
        [-0.03741231,  0.01323703, -0.04453991,  0.00029878,
         -0.03997507, -0.01352666, -0.01122178, -0.02556491,
         -0.03592603, -0.04666705],
        [-0.03741231,  0.01323703, -0.04453991,  0.0