### Overview of Modern NLP and Transformer Models

Modern Natural Language Processing (NLP) has evolved rapidly through pre-trained language models and transformer architectures, enabling tasks such as:

1. Question Answering (QA) – automatically generating answers to user questions.
2. Text Classification – categorizing text into predefined labels (e.g., spam, sentiment, topic).
3. Text Summarization – generating concise summaries that retain key ideas.
4. Text Generation – producing coherent and human-like text based on context.
5. Code Generation – converting natural language descriptions into source code.
6. Image Generation – creating images from textual descriptions.
7. Chatbots – conversational agents that simulate human dialogue.

These applications are powered by large-scale transformer-based models such as BERT, GPT, Mistral, Gemini, and other LLMs (Large Language Models) that understand and generate natural language.

### Transformer Architecture: Overview

The original transformer architecture consists of two main components (many modern models use only one of them):

1. **Encoder:**  
   Converts input text into contextual vector representations.  
    [View PyTorch Encoder Example](https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertModel)

2. **Decoder:**  
   Generates output sequences (like translations or responses).  
    [View PyTorch Decoder Example](https://huggingface.co/docs/transformers/model_doc/gpt2#transformers.GPT2Model)



### 🔹 Key Components in Each Layer

1. **Self-Attention Mechanism**  
   Enables every token to "attend" to every other token in the input sequence.  
   This helps capture relationships between words regardless of their position.  

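2. **Feed-Forward Neural Network**  
   Applies the same nonlinear transformation to each token's representation independently, after attention has mixed information across tokens.  

3. **Positional Encoding**  
   Adds order-awareness so the model knows where each word sits in the sequence; self-attention alone ignores word order.  
   In the original transformer it is added once to the input embeddings rather than inside every layer.  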
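
To make the positional encoding step concrete, here is a minimal NumPy sketch of the sinusoidal scheme used in the original transformer. The text above does not say which scheme a given model uses (BERT and GPT-2, for instance, learn their position embeddings), so treat the function name, the sizes, and the random toy embeddings as assumptions for illustration only.

```python
import numpy as np

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) matrix of sinusoidal position encodings."""
    if d_model % 2 != 0:
        raise ValueError("this sketch assumes an even d_model")
    positions = np.arange(seq_len)[:, np.newaxis]            # shape (seq_len, 1)
    even_dims = np.arange(0, d_model, 2)[np.newaxis, :]      # shape (1, d_model / 2)
    angles = positions / np.power(10000.0, even_dims / d_model)

    encoding = np.zeros((seq_len, d_model))
    encoding[:, 0::2] = np.sin(angles)  # even dimensions use sine
    encoding[:, 1::2] = np.cos(angles)  # odd dimensions use cosine
    return encoding

# Toy embeddings for a 4-token sentence (random values, illustration only).
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(4, 8))

# Position information is simply added to the token embeddings.
position_aware = token_embeddings + sinusoidal_positional_encoding(4, 8)
print(position_aware.shape)
```

Because each position gets a distinct pattern of sines and cosines, two identical words at different positions end up with different input vectors.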



### 💬 Why Self-Attention Matters
The **self-attention mechanism** allows the transformer to understand global context,  
for example by connecting “Paris” with “France” even if they appear far apart in a sentence.

> Without self-attention, sequential models such as RNNs pass information along one step at a time and tend to lose distant context,  
> whereas transformers relate any two tokens directly and so capture long-range dependencies more easily.
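
The idea can be sketched as single-head scaled dot-product self-attention in NumPy. This is a simplified illustration, not the implementation inside any particular model: the projection matrices are random rather than learned, and the sizes (`seq_len = 5`, `d_model = 8`) are arbitrary assumptions.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a (seq_len, d_model) input."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # similarity of every token with every other token
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8
x = rng.normal(size=(seq_len, d_model))      # stand-in for token embeddings
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

output, weights = self_attention(x, w_q, w_k, w_v)
print(weights.shape)  # (5, 5): one attention weight per pair of tokens
```

Row `i` of `weights` shows how strongly token `i` attends to every position, which is why distance within the sequence does not limit what a token can see.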


### Key Models: BERT vs GPT
| Model | Type | Direction | Strengths |
|--------|------|-----------|-----------|
| BERT (Bidirectional Encoder Representations from Transformers) | Encoder-only | Bidirectional (reads left ↔ right) | Understanding text → good for classification, NER, QA |
| GPT (Generative Pre-trained Transformer) | Decoder-only | Unidirectional (reads left → right) | Generating text → good for text generation, summarization, chatbots |

BERT captures context from both sides, making it ideal for understanding language.
GPT focuses on predicting the next token, enabling fluent and context-aware text generation.
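
One way to picture the difference in direction is through the attention mask. The sketch below is illustrative and not taken from either model's code: entry `[i, j]` is 1 when token `i` is allowed to attend to token `j`, and the sequence length of 4 is an arbitrary assumption.

```python
import numpy as np

seq_len = 4

# Encoder-style (BERT-like): every token may attend to every other token.
bidirectional_mask = np.ones((seq_len, seq_len), dtype=int)

# Decoder-style (GPT-like): a token may attend only to itself and earlier tokens.
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=int))

print(causal_mask)
# [[1 0 0 0]
#  [1 1 0 0]
#  [1 1 1 0]
#  [1 1 1 1]]
```

The lower-triangular mask is what keeps a decoder from looking at future tokens while it learns to predict the next one.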

### How LLMs Work

- Input – The user provides text (question, prompt, or paragraph).
- Tokenization & Embeddings – The model splits text into tokens and converts them into dense numeric vectors.
- Output Generation – The model processes embeddings and produces results like text, code, summaries, or answers.

LLMs are “large” because they are trained on massive datasets (books, websites, code, etc.) and have millions to billions of parameters, which capture linguistic patterns and meaning.
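
The tokenization and embedding step can be sketched with a toy example. Everything here is a deliberate simplification and an assumption for illustration: real LLMs use subword tokenizers and learned embedding tables with tens of thousands of entries, whereas this sketch uses whitespace splitting, a four-entry hypothetical vocabulary, and random vectors.

```python
import numpy as np

def tokenize(text: str) -> list:
    """Toy tokenizer: lowercase and split on whitespace."""
    return text.lower().split()

# Hypothetical vocabulary; unknown words map to the "<unk>" id.
vocab = {"<unk>": 0, "transformers": 1, "understand": 2, "language": 3}

# One 4-dimensional vector per vocabulary entry (random here, learned in a real model).
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(vocab), 4))

tokens = tokenize("Transformers understand language well")
token_ids = [vocab.get(token, vocab["<unk>"]) for token in tokens]
embeddings = embedding_table[token_ids]   # look up one row per token

print(tokens)            # ['transformers', 'understand', 'language', 'well']
print(token_ids)         # [1, 2, 3, 0]
print(embeddings.shape)  # (4, 4)
```

The resulting matrix of vectors, one row per token, is what the transformer layers actually process.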


## Challenge: Sentiment Analysis Using DistilBERT

In this challenge, we'll perform **sentiment analysis** using the pre-trained **DistilBERT** model from Hugging Face Transformers.

We'll:
- Load a small customer feedback dataset.
- Apply the `distilbert-base-uncased-finetuned-sst-2-english` model.
- Predict sentiment (Positive / Negative) and confidence scores.
- Visualize results using a WordCloud, pie chart, and bar plot.
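
As a rough sketch of the aggregation behind the pie chart and bar plot, the helper below summarizes predictions in the list-of-dicts shape that a Hugging Face `sentiment-analysis` pipeline returns (`label` and `score` keys). The function name `summarize_predictions` and the sample predictions are hypothetical; the scores are made up for illustration and are not model output.

```python
from collections import Counter, defaultdict

def summarize_predictions(predictions):
    """Return (count per label, mean confidence per label) for pipeline-style predictions."""
    counts = Counter(p["label"] for p in predictions)
    score_totals = defaultdict(float)
    for p in predictions:
        score_totals[p["label"]] += p["score"]
    mean_confidence = {label: score_totals[label] / counts[label] for label in counts}
    return dict(counts), mean_confidence

# Made-up predictions, shaped like the pipeline's output for three feedback texts.
sample_predictions = [
    {"label": "POSITIVE", "score": 0.98},
    {"label": "NEGATIVE", "score": 0.90},
    {"label": "POSITIVE", "score": 0.92},
]

counts, mean_confidence = summarize_predictions(sample_predictions)
print(counts)  # {'POSITIVE': 2, 'NEGATIVE': 1}
print(mean_confidence)
```

With the real model, the predictions would come from passing a list of feedback strings to the pipeline; the label counts could then feed the pie chart and the mean confidences the bar plot.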


```python
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All"
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session
```

```python
import os
import warnings

# Silence Python warnings and TensorFlow's C++ logging to keep the notebook output clean
warnings.filterwarnings("ignore")
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "3"

# ============================================================
# Simple Transformer Example: BERT vs GPT
# ============================================================

# Install libraries (only needed once)
# !pip install transformers torch --quiet

from transformers import pipeline

# ------------------------------------------------------------
# DistilBERT (BERT family) - Text Understanding (Sentiment Analysis)
# ------------------------------------------------------------
bert_classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

text = "I really love learning NLP with transformers!"
bert_result = bert_classifier(text)  # list with one dict: {'label': ..., 'score': ...}

print("Input Text:", text)
print("DistilBERT Sentiment Result:", bert_result)
print("--------------------------------------------------")

# ------------------------------------------------------------
# GPT-2 (GPT family) - Text Generation
# ------------------------------------------------------------
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load GPT-2 model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Artificial Intelligence will transform education by"
inputs = tokenizer(prompt, return_tensors="pt")

# Generate a continuation of up to 40 new tokens.
# GPT-2 has no dedicated pad token, so reuse its end-of-sequence token to avoid a warning.
outputs = model.generate(**inputs, max_new_tokens=40, pad_token_id=tokenizer.eos_token_id)
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print("GPT-2 Prompt:", prompt)
print("GPT-2 Generated Text:")
print(generated_text)
```
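
To give a feel for what next-token generation does, here is a toy greedy decoding loop. It is a heavy simplification and not how GPT-2 is implemented: the five-word vocabulary and the logits table are hypothetical, and the toy "model" looks only at the last token, whereas GPT-2 conditions on the whole prefix. When no sampling options are passed, `generate` commonly defaults to a comparable greedy strategy, though that depends on the model's generation configuration.

```python
import numpy as np

vocab = ["<eos>", "ai", "will", "transform", "education"]

# Hypothetical next-token logits: row = current token, column = candidate next token.
logits_table = np.array([
    [5.0, 0.0, 0.0, 0.0, 0.0],  # after "<eos>"
    [0.0, 0.0, 4.0, 1.0, 0.5],  # after "ai"
    [0.0, 0.2, 0.0, 3.5, 1.0],  # after "will"
    [0.3, 0.1, 0.0, 0.0, 4.2],  # after "transform"
    [3.0, 0.5, 0.2, 0.1, 0.0],  # after "education"
])

def greedy_generate(start_id: int, max_new_tokens: int) -> list:
    """Repeatedly append the highest-scoring next token until <eos> or the length limit."""
    ids = [start_id]
    for _ in range(max_new_tokens):
        next_id = int(np.argmax(logits_table[ids[-1]]))
        if next_id == 0:  # id 0 is "<eos>": stop generating
            break
        ids.append(next_id)
    return ids

generated = greedy_generate(start_id=1, max_new_tokens=10)
print(" ".join(vocab[i] for i in generated))  # ai will transform education
```

Always taking the top-scoring token is deterministic; sampling-based strategies trade that determinism for more varied text.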


# NLP Series: BERT vs GPT
This notebook demonstrates two key Transformer models:

| Model | Architecture | Task | Library |
|--------|---------------|------|----------|
| DistilBERT | Encoder-only | Sentiment Analysis | Hugging Face Transformers |
| GPT-2 | Decoder-only | Text Generation | Hugging Face Transformers |

Built using Kaggle GPU  
Libraries: `transformers`, `torch`  
Demonstrates text understanding vs. text generation
