# 📘 Level 4: Advanced NLP Models

---

# 1. Word Embeddings (Word2Vec, GloVe)

### ➔ Definition:  
Word embeddings map words to **dense vector representations** in which words used in similar contexts lie **closer together** in the vector space.

### ➔ Why use Word Embeddings?  
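- Captures **semantic relationships** as vector arithmetic (e.g. king - man + woman ≈ queen).  
- Unlike one-hot encoding, where every pair of distinct words is equally dissimilar, embeddings give the model a usable notion of word similarity.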
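
To make the two points above concrete, here is a minimal sketch in plain Python. The 2-d "embeddings" are hand-made, hypothetical values (dimension 0 loosely stands for royalty, dimension 1 for gender); they are not trained vectors, and real embeddings only approximate this behaviour.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# One-hot vectors: every pair of distinct words is orthogonal.
one_hot = {
    "king":  [1, 0, 0, 0],
    "queen": [0, 1, 0, 0],
    "man":   [0, 0, 1, 0],
    "woman": [0, 0, 0, 1],
}

# Hand-made dense vectors (assumed values, for illustration only).
dense = {
    "king":  [1.0,  1.0],
    "queen": [1.0, -1.0],
    "man":   [0.1,  1.0],
    "woman": [0.1, -1.0],
}

# One-hot: no similarity information at all.
print(cosine(one_hot["king"], one_hot["queen"]))  # 0.0

# Dense: "king" and "man" share the gender direction, so similarity is positive.
print(cosine(dense["king"], dense["man"]))

# Analogy: king - man + woman, then find the closest word by cosine similarity.
analogy = [k - m + w for k, m, w in zip(dense["king"], dense["man"], dense["woman"])]
nearest = max(dense, key=lambda word: cosine(analogy, dense[word]))
print(nearest)  # queen
```

With trained embeddings the analogy holds only approximately, which is why the text writes ≈ rather than =.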


In [1]:
## Code for Word2Vec:

# Install gensim if not installed
# pip install gensim

from gensim.models import Word2Vec

# Sample corpus
sentences = [["i", "love", "nlp"], ["nlp", "is", "fun"], ["i", "enjoy", "learning", "nlp"]]

# Train Word2Vec model
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, workers=4)

# Get vector of a word
vector = model.wv['nlp']
print("Vector for 'nlp':", vector)

# Find similar words
similar = model.wv.most_similar('nlp')
print("Similar words to 'nlp':", similar)

Vector for 'nlp': [-5.3622725e-04  2.3643136e-04  5.1033497e-03  9.0092728e-03
 -9.3029495e-03 -7.1168090e-03  6.4588725e-03  8.9729885e-03
 ...
(output truncated: the full vector has 100 values and is followed by the list of similar words)

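✅ **Explanation:**  
- `vector_size=100`: Dimensionality of each word vector.  
- `window=5`: Maximum distance between the target word and its context words.  
- `min_count=1`: Words that occur fewer times than this are ignored.  
- `workers=4`: Number of worker threads used for training.  
- With a corpus this small, the vectors and similarity scores are close to random; meaningful neighbours need far more text.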
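
The effect of `window` and `min_count` can be sketched without gensim. The helper names below (`filter_rare`, `context_pairs`) are hypothetical and only illustrate the idea; gensim's real training loop differs in details (for example, it can randomly shrink the window during training).

```python
from collections import Counter

def filter_rare(sentences, min_count):
    """Drop words whose total frequency is below min_count."""
    counts = Counter(word for sent in sentences for word in sent)
    return [[w for w in sent if counts[w] >= min_count] for sent in sentences]

def context_pairs(sentence, window):
    """List (target, context) pairs within `window` positions of each word."""
    pairs = []
    for i, target in enumerate(sentence):
        start = max(0, i - window)
        end = min(len(sentence), i + window + 1)
        for j in range(start, end):
            if j != i:
                pairs.append((target, sentence[j]))
    return pairs

sentences = [["i", "love", "nlp"], ["nlp", "is", "fun"], ["i", "enjoy", "learning", "nlp"]]

# window=1: each word only sees its immediate neighbours.
print(context_pairs(sentences[2], window=1))
# [('i', 'enjoy'), ('enjoy', 'i'), ('enjoy', 'learning'),
#  ('learning', 'enjoy'), ('learning', 'nlp'), ('nlp', 'learning')]

# min_count=2: only "i" (2 occurrences) and "nlp" (3 occurrences) survive.
print(filter_rare(sentences, min_count=2))
# [['i', 'nlp'], ['nlp'], ['i', 'nlp']]
```

A larger `window` tends to capture broader topical similarity, while a smaller one emphasises words that appear in the same local positions.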

---

# 2. Transformer Architecture

### ➔ Definition:  
The Transformer is a model architecture built on **self-attention mechanisms**, with **no recurrent layers (RNNs)**.

### ➔ Why use Transformers?  
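- Handle **long-range dependencies** better, because any two tokens can attend to each other directly.  
- Train **faster** than RNNs on large datasets, because all tokens in a sequence are processed in parallel.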

---

## Basic Structure (No coding here, just theory):

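| Encoder layer | Decoder layer |
|:-------:|:-------:|
| Self Attention | Masked Self Attention |
| Add & Norm | Add & Norm |
| (none) | Encoder-Decoder (Cross) Attention |
| (none) | Add & Norm |
| Feed Forward | Feed Forward |
| Add & Norm | Add & Norm |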

✅ **Self-Attention** lets each word **weigh every other word** in the input when building its own representation; **masked** self-attention in the decoder restricts this to earlier positions.
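
Although this section stays at the level of theory, a tiny numerical sketch can make self-attention concrete. The version below is a simplification under stated assumptions: queries, keys, and values are all the raw input vectors (a real Transformer applies learned projection matrices first), there is a single head, and the token vectors are made-up values.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, causal=False):
    """Scaled dot-product self-attention with Q = K = V = x (assumption).

    x is a list of token vectors. With causal=True, position i may only
    attend to positions <= i, as in the decoder's masked self-attention.
    Returns (outputs, weights).
    """
    d = len(x[0])
    outputs, weights = [], []
    for i, query in enumerate(x):
        scores = []
        for j, key in enumerate(x):
            if causal and j > i:
                scores.append(float("-inf"))  # masked: future position
            else:
                dot = sum(a * b for a, b in zip(query, key))
                scores.append(dot / math.sqrt(d))
        w = softmax(scores)
        # Output is the attention-weighted average of all value vectors.
        out = [sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy token vectors

_, full = self_attention(tokens)
_, masked = self_attention(tokens, causal=True)

print(full[0])    # first token attends to all three positions
print(masked[0])  # [1.0, 0.0, 0.0]: first token can only see itself
```

Each row of weights sums to 1, so every output vector is a weighted average of the inputs, with the weights decided by how similar the tokens are.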

---

# 3. Pre-trained Models (BERT, GPT)

### ➔ Definition:  
Pre-trained models like **BERT** and **GPT** are first trained on large text corpora and then **fine-tuned** for specific tasks.

### ➔ Why use Pre-trained Models?  
- Save time and compute, because the expensive pre-training is already done.  
- Reach **strong performance** quickly, often with relatively little labelled data.

---

**Note:** If importing `transformers` raises a Keras compatibility error, downgrade TensorFlow to 2.12.0 or 2.13.0. These releases use Keras 2.x, which is compatible with `transformers`.

Run these commands in your environment:

In [None]:
pip uninstall -y keras tensorflow
pip install tensorflow==2.12.0
pip install transformers
pip install tf-keras

In [1]:
## Code for using BERT (Text Classification):

# Install transformers library if not installed
# pip install transformers
from transformers import pipeline

# Load sentiment-analysis pipeline
classifier = pipeline('sentiment-analysis')

# Sample text
result = classifier("I love working with NLP models!")
print(result)

No model was supplied, defaulted to distilbert/distilbert-base-uncased-finetuned-sst-2-english and revision 714eb0f (https://huggingface.co/distilbert/distilbert-base-uncased-finetuned-sst-2-english).
Using a pipeline without specifying a model name and revision in production is not recommended.
Device set to use cpu

[{'label': 'POSITIVE', 'score': 0.9988333582878113}]


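✅ **Explanation:**  
- `pipeline('sentiment-analysis')`: Ready-to-use classifier. With no model specified, it loads **DistilBERT** (a smaller, distilled version of BERT) fine-tuned on SST-2, as the log output shows.  
- Takes **any English text** and returns a label (**POSITIVE** or **NEGATIVE**) together with a confidence score.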
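
Since the log recommends naming the model explicitly, one possible way to do that is sketched below. `build_classifier` and `signed_score` are hypothetical helper names introduced here for illustration; the model name is the one reported in the log output. The `transformers` import sits inside the function so the rest of the snippet runs even without the library installed.

```python
MODEL_NAME = "distilbert/distilbert-base-uncased-finetuned-sst-2-english"

def build_classifier(model_name=MODEL_NAME):
    """Create a sentiment pipeline with an explicit model (downloads weights)."""
    from transformers import pipeline  # imported lazily on purpose
    return pipeline("sentiment-analysis", model=model_name)

def signed_score(prediction):
    """Map one pipeline prediction to a score in [-1, 1].

    Assumes the label is either 'POSITIVE' or 'NEGATIVE', as in the
    output shown earlier.
    """
    sign = 1.0 if prediction["label"] == "POSITIVE" else -1.0
    return sign * prediction["score"]

# Prediction copied from the output above.
sample = [{"label": "POSITIVE", "score": 0.9988333582878113}]
print([round(signed_score(p), 4) for p in sample])  # [0.9988]

# Usage with the real model (requires transformers and a network connection):
# classifier = build_classifier()
# predictions = classifier(["I love working with NLP models!", "This was a waste of time."])
# print([signed_score(p) for p in predictions])
```

Passing a list of strings to the pipeline returns one prediction per input, which is convenient for scoring a batch of reviews.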

---

# 4. Text Generation with GPT-2

In [2]:
## Code for GPT-2 Text Generation:

from transformers import pipeline

# Load text generation pipeline
generator = pipeline('text-generation', model='gpt2')

# Generate text
result = generator("Once upon a time", max_length=50, num_return_sequences=1)
print(result[0]['generated_text'])

Device set to use cpu
Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


Once upon a time the sky was a sea of a thousand miles wide. This was the night sky, and as you entered the great night, each round of it was a little room of infinite darkness with the same colors as the old night sky.


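✅ **Explanation:**  
- Start with a **prompt** and GPT-2 **continues the text** one token at a time.  
- `max_length=50`: Caps the total length at 50 **tokens**, counting the prompt.  
- `num_return_sequences=1`: Number of continuations to generate.  
- Generation involves random sampling here, so the output will usually differ from run to run.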
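
The generation loop itself can be illustrated with a toy stand-in. In the sketch below, a hypothetical lookup table (`table`) plays the role of the model and always returns one fixed next word; GPT-2 instead predicts a probability distribution over subword tokens and picks from it. What the sketch does share with the real pipeline is the loop structure and the fact that `max_length` counts the prompt tokens too.

```python
def generate(prompt_tokens, next_token_table, max_length):
    """Extend the prompt one token at a time until max_length tokens in total."""
    tokens = list(prompt_tokens)
    while len(tokens) < max_length:
        next_token = next_token_table.get(tokens[-1])
        if next_token is None:  # no known continuation: stop early
            break
        tokens.append(next_token)
    return tokens

# Toy "model": maps the last word to the next word (made-up data).
table = {
    "once": "upon",
    "upon": "a",
    "a": "time",
    "time": "there",
    "there": "was",
    "was": "a",
}

result = generate(["once", "upon"], table, max_length=6)
print(" ".join(result))  # once upon a time there was
```

Because the prompt already uses two of the six tokens, only four new tokens are generated, which mirrors how `max_length=50` behaves in the GPT-2 call.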

---

# 📚 Mini Assignments

➔ 1. Train a Word2Vec model on your own dataset.  
➔ 2. Use a Hugging Face BERT model for text classification on movie reviews.  
➔ 3. Generate your own stories using GPT-2.

---

# ✅ Done!

---

# 📜 Summary of Topics Covered so far:

| Level | Topics |
|:----|:------|
| Level 1 | Text Preprocessing (Tokenization, Stopwords, etc.) |
| Level 2 | Basic Models (BoW, TF-IDF, Naive Bayes) |
| Level 3 | Intermediate Models (NER, Language Modeling, Summarization) |
| Level 4 | Advanced Models (Word2Vec, Transformer, BERT, GPT) |

---

✅ Now your NLP foundation is **solid** from basic to advanced!  
If you want, next we can also cover:

- ✍️ How to **Fine-tune BERT or GPT models** yourself  
- 📦 How to **Deploy NLP apps** (Flask, Streamlit, Gradio)

