# Day 2 Tutorials: Natural Language Processing in Humans and Machines

## NLP Advanced: Sequential Models
In this notebook, we'll explore different NLP models, from traditional RNNs to pre-trained transformer language models like GPT-2 and BERT.

We'll cover:
- Recurrent Neural Networks (RNN)
- Long Short-Term Memory (LSTM)
- Generative Pre-trained Transformers (GPT-2)
- Bidirectional Encoder Representations from Transformers (BERT)

Let's get started!




In [None]:
# dummy input sentence
sentence = "The curious AI explored the world of words, crafting stories that amazed even the most skeptical humans."

## 1) Recurrent Neural Networks (RNN)

RNNs are a type of neural network designed for sequence data. They maintain a hidden state that is updated at every time step and captures the context of the sequence so far, making them suitable for tasks like language modeling and time-series prediction. However, RNNs struggle with long-term dependencies due to the vanishing gradient problem: gradients shrink as they are propagated back through many time steps.
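To make the vanishing gradient problem concrete, here is a minimal sketch using a scalar RNN, `h_t = tanh(w * h_{t-1} + u * x_t)`. The function name `gradient_through_time` and the values of `w`, `u`, and the inputs are illustrative assumptions, not part of the tutorial's models. The sketch multiplies the per-step derivatives to obtain the gradient of the final hidden state with respect to the initial one.

```python
import math

def gradient_through_time(w, u, inputs, h0=0.0):
    """Return d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1} + u * x_t)."""
    h = h0
    grad = 1.0
    for x in inputs:
        h = math.tanh(w * h + u * x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
    return grad

# Illustrative values (assumptions): recurrent weight below 1, constant input
inputs = [0.5] * 50
for steps in (1, 10, 50):
    print(steps, gradient_through_time(w=0.9, u=1.0, inputs=inputs[:steps]))
```

Each factor has magnitude below `w`, so with `|w| < 1` the product shrinks at least geometrically with the number of steps. This is why early tokens have little influence on the training signal in a plain RNN.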

In [None]:
import torch
import torch.nn as nn

# Define a simple RNN
class SimpleRNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleRNN, self).__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Initial hidden state: (num_layers, batch, hidden_size), read from the layer rather than a global
        h0 = torch.zeros(1, x.size(0), self.rnn.hidden_size, device=x.device)
        out, _ = self.rnn(x, h0)
        out = self.fc(out[:, -1, :])  # Take the last hidden state
        return out


input_size = len(sentence.split())  # Stand-in for the embedding dimension (here it simply reuses the word count)
hidden_size = 20
output_size = 5
rnn_model = SimpleRNN(input_size, hidden_size, output_size)

dummy_input = torch.randn(3, len(sentence.split()), input_size)  # Random stand-in for embedded text: (batch=3, seq_len, input_size)
output = rnn_model(dummy_input)
print("RNN Output Embeddings for the sentence:", output)
print("shape:", output.shape)


## 2) Long Short-Term Memory (LSTM)

LSTM networks address the limitations of RNNs by introducing gates (forget, input, and output) to control the flow of information. This allows LSTMs to capture long-term dependencies more effectively, making them suitable for tasks like machine translation and text generation.
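The gating mechanism can be sketched with a scalar LSTM cell written by hand. This is a simplified illustration, not how `nn.LSTM` is implemented internally. The helper names (`lstm_cell_step`, `run_cell`) and the parameter values are assumptions chosen so that the gates are driven purely by their biases.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_cell_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell; params maps gate name -> (w_x, w_h, bias)."""
    def pre_activation(gate):
        w_x, w_h, bias = params[gate]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(pre_activation("forget"))       # how much of the old cell state to keep
    i = sigmoid(pre_activation("input"))        # how much new information to write
    o = sigmoid(pre_activation("output"))       # how much of the cell state to expose
    g = math.tanh(pre_activation("candidate"))  # candidate values to write
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c

def run_cell(params, inputs, c0):
    h, c = 0.0, c0
    for x in inputs:
        h, c = lstm_cell_step(x, h, c, params)
    return h, c

# Illustrative parameters (assumptions): forget gate open, input gate closed
keep_params = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}
# Same cell, but with the forget gate closed
erase_params = dict(keep_params, forget=(0.0, 0.0, -10.0))

inputs = [0.3] * 100
print("cell state, forget gate open:  ", run_cell(keep_params, inputs, c0=1.0)[1])
print("cell state, forget gate closed:", run_cell(erase_params, inputs, c0=1.0)[1])
```

With the forget gate open and the input gate closed, the cell state stays close to its initial value over 100 steps. Closing the forget gate erases it almost immediately. A trained LSTM learns when to do each.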



In [None]:
class SimpleLSTM(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(SimpleLSTM, self).__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
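        # Initial hidden and cell states: (num_layers, batch, hidden_size), read from the layer rather than a global
        h0 = torch.zeros(1, x.size(0), self.lstm.hidden_size, device=x.device)
        c0 = torch.zeros(1, x.size(0), self.lstm.hidden_size, device=x.device)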
        out, _ = self.lstm(x, (h0, c0))
        out = self.fc(out[:, -1, :])
        return out

# Example usage
input_size = len(sentence.split())  # Stand-in for the embedding dimension (here it simply reuses the word count)
hidden_size = 20
output_size = 5
lstm_model = SimpleLSTM(input_size, hidden_size, output_size)

dummy_input = torch.randn(3, len(sentence.split()), input_size)  # Random stand-in for embedded text: (batch=3, seq_len, input_size)
output = lstm_model(dummy_input)
print("LSTM Output Embeddings for the sentence:", output)
print("shape:", output.shape)

## 3) Generative Pre-trained Transformer (GPT-2)

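GPT-2 is a decoder-only, transformer-based model designed for text generation. It is trained to predict the next token in a sequence given the preceding tokens, and it generates coherent text by repeating that prediction one token at a time. Hugging Face’s `transformers` library provides easy access to GPT-2 for various NLP tasks. Note that `GPT2Model`, used below, returns hidden states rather than generated text.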
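The idea of generating text by repeatedly predicting the next token can be sketched without a trained model. In the toy example below, `NEXT_TOKEN_SCORES` is a hypothetical hand-written table standing in for a language model, and `next_token_scores` and `greedy_generate` are illustrative names. Unlike GPT-2, which conditions on the whole prefix, this stand-in only looks at the last token.

```python
# Hypothetical next-token scores, standing in for a trained language model
NEXT_TOKEN_SCORES = {
    "<bos>": {"the": 0.9, "a": 0.1},
    "the": {"curious": 0.6, "world": 0.4},
    "a": {"world": 1.0},
    "curious": {"ai": 0.8, "world": 0.2},
    "ai": {"explored": 0.7, "<eos>": 0.3},
    "explored": {"the": 0.2, "<eos>": 0.8},
    "world": {"<eos>": 1.0},
}

def next_token_scores(prefix):
    """Toy 'model': a real one would use the whole prefix, this only uses the last token."""
    return NEXT_TOKEN_SCORES[prefix[-1]]

def greedy_generate(prompt, max_new_tokens=10):
    """Autoregressive greedy decoding: append the highest-scoring token until <eos>."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        best = max(scores, key=scores.get)
        if best == "<eos>":
            break
        tokens.append(best)
    return tokens

print(greedy_generate(["<bos>"]))
# ['<bos>', 'the', 'curious', 'ai', 'explored']
```

Greedy decoding is only one strategy. Sampling from the scores instead of taking the maximum gives more varied text.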



In [None]:
from transformers import GPT2Tokenizer, GPT2Model

# Load GPT-2 model and tokenizer
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2")

# Example input sentence
input_ids = tokenizer(sentence, return_tensors="pt").input_ids

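# Get hidden states from GPT-2 (inference only, so skip gradient tracking)
with torch.no_grad():
    outputs = model(input_ids)
hidden_states = outputs.last_hidden_state  # (batch, num_tokens, hidden_dim)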

print("GPT-2 Hidden States for the sentence:", hidden_states)
print("shape:", hidden_states.shape)


## 4) Bidirectional Encoder Representations from Transformers (BERT)

BERT is a transformer-based model pre-trained on masked language modeling and next sentence prediction tasks. It generates deep contextualized embeddings that can be fine-tuned for a variety of downstream tasks like text classification, question answering, and more.
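The masked language modeling objective can be illustrated by preparing one training example by hand. The sketch below follows the masking recipe described in the original BERT paper: about 15% of tokens are selected, and of those 80% become `[MASK]`, 10% become a random token, and 10% are left unchanged. The function name `mask_tokens`, the whitespace tokenization, and the tiny vocabulary are simplifying assumptions. The real model uses WordPiece tokens.

```python
import random

def mask_tokens(tokens, vocab, mask_rate=0.15, seed=0):
    """Return (masked_tokens, labels); labels are None where no prediction is required."""
    rng = random.Random(seed)
    n_select = max(1, round(mask_rate * len(tokens)))
    selected = set(rng.sample(range(len(tokens)), n_select))

    masked, labels = [], []
    for idx, tok in enumerate(tokens):
        if idx not in selected:
            masked.append(tok)
            labels.append(None)
            continue
        labels.append(tok)  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            masked.append("[MASK]")           # 80%: replace with the mask token
        elif roll < 0.9:
            masked.append(rng.choice(vocab))  # 10%: replace with a random token
        else:
            masked.append(tok)                # 10%: keep the original token
    return masked, labels

sentence = "The curious AI explored the world of words, crafting stories that amazed even the most skeptical humans."
tokens = sentence.lower().split()  # whitespace split as a stand-in for WordPiece
vocab = sorted(set(tokens))        # toy vocabulary (assumption)

masked, labels = mask_tokens(tokens, vocab)
print(masked)
print(labels)
```

Because the model sees tokens on both sides of each masked position, the representations it learns are bidirectional, unlike GPT-2's left-to-right prediction.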



In [None]:
from transformers import BertTokenizer, BertModel

# Load pre-trained BERT model and tokenizer
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

# Example input sentence
input_ids = tokenizer(sentence, return_tensors="pt").input_ids

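# Get BERT embeddings (inference only, so skip gradient tracking)
with torch.no_grad():
    outputs = model(input_ids)
last_hidden_states = outputs.last_hidden_state  # (batch, num_tokens, hidden_dim)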

print("BERT Embeddings for the sentence:", last_hidden_states)
print("shape:", last_hidden_states.shape)



## 5) Breakout Session: Exploring Hugging Face and Generating Embeddings with Pre-trained Models!


- Go explore [Hugging Face](https://huggingface.co/).
- Choose an appropriate pre-trained model for generating sentence embeddings.
- Use the model to generate embeddings for the given sentence.



In [None]:
from transformers import AutoTokenizer, AutoModel
import torch

# Replace 'your-chosen-model' with the model you picked from Hugging Face
model_name = "your-chosen-model"

# Load tokenizer and model from Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Example sentence
sentence = "The curious AI explored the world of words, crafting stories that amazed even the most skeptical humans."
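Most encoder models return one vector per token, so a sentence embedding needs a pooling step. One common choice, though not the only one, is to average the token vectors while ignoring padding positions. The sketch below shows that idea on plain Python lists; `mean_pool` is an illustrative helper, and the example vectors are made up.

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose attention-mask entry is 1 (i.e. skip padding)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            count += 1
            for j in range(dim):
                total[j] += vec[j]
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [t / count for t in total]

# Made-up example: two real tokens and one padding token
token_vectors = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
attention_mask = [1, 1, 0]
print(mean_pool(token_vectors, attention_mask))  # [2.0, 3.0]
```

With a real model, the inputs would plausibly come from `outputs.last_hidden_state[0].tolist()` and from the `attention_mask` returned by the tokenizer. Some models on the Hub document a different pooling strategy, so check the model card of the one you pick.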

