# **Task**

---
Complete a story based on the seed text "It was a bright cold day" in the voice of Sir Arthur Conan Doyle.

**Approaches:**
1. HuggingFace Transformers (LLM)
  * The seed text is extended with "and Sherlock Holmes had just received an unexpected visitor at 221B Baker Street." to steer the model toward the voice required for this task.
2. Bi-LSTM

## **LLM Approach Using GPT2 From HuggingFace**


---




In [1]:
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM

# Load the model and tokenizer
model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Initialize the pipeline
generator = pipeline('text-generation', model=model, tokenizer=tokenizer)

# Function to generate text
def generate_text(prompt, max_length=200):
    response = generator(prompt, max_length=max_length, num_return_sequences=1)
    return response[0]['generated_text']

# Example usage
seed_text = "It was a bright cold day"
prompt = seed_text + " and Sherlock Holmes had just received an unexpected visitor at 221B Baker Street."
generated_text = generate_text(prompt)
print(generated_text)


Truncation was not explicitly activated but `max_length` is provided a specific value, please use `truncation=True` to explicitly truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


It was a bright cold day and Sherlock Holmes had just received an unexpected visitor at 221B Baker Street.


They had exchanged pleasantries for more than a few minutes, each with the exact same request. The three of them were both on their first date, and Sherlock had already agreed to his request that he put a large flat in the fireplace – but they hadn't planned on putting it in the flat, he believed. The cold was making them uncomfortable.


"You like coffee?" said James. "I prefer it more. It's good but I'm not going to rush it."


"Oooh, that's nice," Holmes said shyly. "Your hair is shorter than I anticipated. Can't say it's getting too hot, but sometimes it's great to meet your friend. It does get cold when you're not looking."


"Yes, but it won't do anything to keep you safe from being followed, in fact it makes my hair


**Another Output:**
It was a bright cold day and Sherlock Holmes had just received an unexpected visitor at 221B Baker Street. The visitor, a nervous man, claimed to have witnessed something unusual, but didn't say what he saw. But the Doctor could tell that they would encounter it first hand.

"What are you doing," the Doctor asked, after consulting Sherlock who looked at Doctor Holmes with an amused expression. "I am very sorry, I was afraid that something had happened in any way to the rest of the building."

"Quite true, I could understand that," the Doctor replied, giving the man a rather stern nod of acknowledgment. "But I thought it would be helpful in the future to come as you are. I hope any visitors in future will recognise you, and may I ask what you are doing at the moment?" The man looked at him with surprise as he returned the sceptre to Sherlock Holmes. "What exactly did you see," he said with an eyebrow raised. "

## **Bi-LSTM Approach**


---



In [10]:
# Import necessary libraries
import re
import requests
import numpy as np
import torch
import torch.nn as nn
import torch.utils.data as data
import torch.optim as optim

In [11]:
# Load and clean text (pre-process data)
# URL of the text file on Project Gutenberg
url = "https://www.gutenberg.org/files/1661/1661-0.txt"

# Send HTTP GET request to the URL and stop immediately if it fails,
# so that `text` is never left undefined for the cells below
response = requests.get(url, timeout=30)
response.raise_for_status()
# Keep only the first 100000 characters of the book
text = response.text[:100000]

def clean_text(text):
    text = re.sub(r'\s+', ' ', text)
    text = re.sub(r'[^a-zA-Z0-9.,;\'"!? ]', '', text)
    return text.lower()

cleaned_text = clean_text(text)
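
To make the effect of the two substitutions in `clean_text` concrete, here is a small check on a made-up sentence. The `sample` string is an illustration only and is not taken from the downloaded text.

```python
import re

def clean_text(text):
    # Same two substitutions as in the cell above
    text = re.sub(r'\s+', ' ', text)                    # collapse whitespace runs
    text = re.sub(r'[^a-zA-Z0-9.,;\'"!? ]', '', text)   # drop everything else
    return text.lower()

# Hypothetical input with curly quotes, a line break, and a hyphen
sample = "“Holmes,”  said I,\r\n\"it is  half-past nine!\""
cleaned = clean_text(sample)
print(cleaned)
```

Typographic quotes and hyphens fall outside the allowed character class, so they are removed and "half-past" becomes "halfpast". After lowercasing, at most 44 distinct characters can survive: 26 letters, 10 digits, 7 punctuation marks, and the space.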

In [12]:
# Create character mappings
chars = sorted(list(set(cleaned_text)))
char_to_index = {char: i for i, char in enumerate(chars)}
index_to_char = {i: char for i, char in enumerate(chars)}

# Create sequences
seq_length = 100
sequences = []
next_chars = []

# Split text into sequences
for i in range(0, len(cleaned_text) - seq_length):
    sequences.append(cleaned_text[i:i + seq_length])
    next_chars.append(cleaned_text[i + seq_length])

# Vectorize sequences
X = np.zeros((len(sequences), seq_length, len(chars)), dtype=np.float32)
y = np.zeros((len(sequences), len(chars)), dtype=np.float32)

for i, sequence in enumerate(sequences):
    for t, char in enumerate(sequence):
        X[i, t, char_to_index[char]] = 1
    y[i, char_to_index[next_chars[i]]] = 1
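
The windowing and one-hot steps above are easier to follow on a tiny input. The sketch below repeats them in plain Python on the word "holmes" with a window of 3 characters; the `toy_` names and sizes are hypothetical, while the notebook itself uses `seq_length = 100` and NumPy arrays.

```python
toy_text = "holmes"
toy_seq_length = 3  # hypothetical; the notebook uses 100

toy_chars = sorted(set(toy_text))
toy_char_to_index = {c: i for i, c in enumerate(toy_chars)}

# Each window of toy_seq_length characters is paired with the character after it
pairs = [
    (toy_text[i:i + toy_seq_length], toy_text[i + toy_seq_length])
    for i in range(len(toy_text) - toy_seq_length)
]

def one_hot(char):
    # A vector of zeros with a single 1 at the character's index
    vec = [0] * len(toy_chars)
    vec[toy_char_to_index[char]] = 1
    return vec

for window, target in pairs:
    print(window, "->", target, [one_hot(c) for c in window], one_hot(target))
```

A text of `n` characters yields `n - seq_length` training pairs, and each window becomes a `seq_length` by vocabulary-size matrix, which is why `X` has three dimensions.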

In [13]:
# Convert data to PyTorch tensors
X_tensor = torch.tensor(X, dtype=torch.float32)
y_tensor = torch.tensor(y, dtype=torch.float32)

# Move tensors to GPU if available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
X_tensor = X_tensor.to(device)
y_tensor = y_tensor.to(device)

# Create a DataLoader
dataset = data.TensorDataset(X_tensor, y_tensor)
dataloader = data.DataLoader(dataset, batch_size=128, shuffle=True)

In [14]:
# Define BiLSTM Model
class BiLSTMModel(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(BiLSTMModel, self).__init__()
        self.hidden_size = hidden_size
        self.bilstm = nn.LSTM(input_size, hidden_size, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(hidden_size * 2, output_size)

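    def forward(self, x):
        output, _ = self.bilstm(x)
        # Features at the last time step: the forward direction has read the whole
        # sequence here, the backward direction only the final character
        output = self.fc(output[:, -1, :])
        return output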

# Parameters
input_size = len(chars)
hidden_size = 128
output_size = len(chars)

# Initialize model
model = BiLSTMModel(input_size, hidden_size, output_size).to(device)

# Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
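
One detail of `forward` is worth checking. With `bidirectional=True`, the slice `output[:, -1, :]` holds the forward direction's final state, but the backward direction's state at that position has only processed the last character. The sketch below uses toy sizes (not the notebook's) to confirm where each direction's final state lives, and shows one possible alternative summary built from `h_n`. It is offered as an option to consider, not as the method used in this notebook.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
hidden = 4  # toy size; the notebook uses 128
lstm = nn.LSTM(input_size=3, hidden_size=hidden, batch_first=True, bidirectional=True)
x = torch.randn(2, 5, 3)  # (batch, time, features)

output, (h_n, _) = lstm(x)

# Forward direction: its final state sits at the LAST time step of `output`
forward_matches = torch.allclose(output[:, -1, :hidden], h_n[0])
# Backward direction: its final state sits at the FIRST time step of `output`
backward_matches = torch.allclose(output[:, 0, hidden:], h_n[1])

# Alternative summary: both directions after each has read the full sequence
summary = torch.cat((h_n[0], h_n[1]), dim=1)  # shape (batch, 2 * hidden)
print(forward_matches, backward_matches, summary.shape)
```

Feeding `summary` to the linear layer instead of `output[:, -1, :]` would keep the layer's input size at `hidden_size * 2`, so the rest of the model definition would not need to change.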


In [15]:
# Training the model
epochs = 30
for epoch in range(epochs):
    model.train()
    total_loss = 0
    for batch_x, batch_y in dataloader:
        optimizer.zero_grad()
        outputs = model(batch_x)
        loss = criterion(outputs, batch_y)
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    print(f'Epoch {epoch+1}/{epochs}, Loss: {total_loss/len(dataloader)}')

Epoch 1/30, Loss: 2.4846737079804013
Epoch 2/30, Loss: 2.1923309049711266
Epoch 3/30, Loss: 2.0758511111693663
Epoch 4/30, Loss: 1.9825002627326829
Epoch 5/30, Loss: 1.9048575467879703
Epoch 6/30, Loss: 1.8356180264828457
Epoch 7/30, Loss: 1.7748525400109272
Epoch 8/30, Loss: 1.7202530156467444
Epoch 9/30, Loss: 1.6714427315385338
Epoch 10/30, Loss: 1.6273244165160796
Epoch 11/30, Loss: 1.5876106581300948
Epoch 12/30, Loss: 1.5523889182850288
Epoch 13/30, Loss: 1.518630175183859
Epoch 14/30, Loss: 1.4878998753785104
Epoch 15/30, Loss: 1.4586183736216087
Epoch 16/30, Loss: 1.4309195739203011
Epoch 17/30, Loss: 1.4049757945324401
Epoch 18/30, Loss: 1.3806240006030999
Epoch 19/30, Loss: 1.3563427402031962
Epoch 20/30, Loss: 1.3326170806871647
Epoch 21/30, Loss: 1.31001059217007
Epoch 22/30, Loss: 1.288540616242069
Epoch 23/30, Loss: 1.268143797183135
Epoch 24/30, Loss: 1.2482576282513027
Epoch 25/30, Loss: 1.2291659345489092
Epoch 26/30, Loss: 1.2110743832555402
Epoch 27/30, Loss: 1.19368
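
The logged values are mean cross-entropy losses in nats, so they can be read as a per-character perplexity by exponentiating. The sketch below does this for the epoch 26 value from the log and compares it with a uniform guess. The vocabulary size of 44 is an upper bound derived from the character class in `clean_text`, not the measured `len(chars)`.

```python
import math

epoch_26_loss = 1.2110743832555402  # copied from the training log above
max_vocab = 26 + 10 + 7 + 1         # letters, digits, . , ; ' " ! ?, space (upper bound)

# Perplexity: the effective number of equally likely choices per character
perplexity = math.exp(epoch_26_loss)
# Loss a model would get by guessing uniformly over max_vocab characters
uniform_loss = math.log(max_vocab)

print(f"per-character perplexity: {perplexity:.2f}")
print(f"uniform-guess loss over {max_vocab} characters: {uniform_loss:.2f}")
```

These figures are computed on the training data itself, since the notebook does not hold out a validation set, so they say nothing about how well the model generalises.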

In [21]:
# Generate text
def generate_text(model, seed, length):
    """Greedily extend `seed` by `length` characters, one character at a time."""
    model.eval()
    generated = seed
    # One-hot encode the seed; every seed character must be present in `chars`
    seed_tensor = torch.zeros(1, len(seed), len(chars), device=device)
    for t, char in enumerate(seed):
        seed_tensor[0, t, char_to_index[char]] = 1

    with torch.no_grad():
        for _ in range(length):
            output = model(seed_tensor)
            # Greedy decoding: always take the highest-scoring character
            predicted_index = torch.argmax(output).item()
            generated += index_to_char[predicted_index]

            # Slide the window: drop the oldest character, append the prediction
            next_input = torch.zeros(1, 1, len(chars), device=device)
            next_input[0, 0, predicted_index] = 1
            seed_tensor = torch.cat((seed_tensor[:, 1:, :], next_input), dim=1)

    return generated

seed_text = "It was a bright cold day"
print(generate_text(model, seed_text.lower(), 200))

it was a bright cold day and to be an ampoman condertions and such a fached to the chair. its an oncess to the league that i had started and compent to be any with the street. and the passion and started and stack in the con
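
The generation cell above always picks the single highest-scoring character (`torch.argmax`), which tends to pull the output into repeated phrases. A common alternative is to sample from the softmax distribution, scaled by a temperature. The sketch below shows the idea in plain Python on made-up logits; `sample_index`, `toy_logits`, and the temperature of 0.8 are hypothetical and not part of the notebook.

```python
import math
import random

def sample_index(logits, temperature=1.0, rng=random):
    """Draw an index with probability proportional to softmax(logits / temperature)."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

rng = random.Random(0)        # fixed seed so the run is repeatable
toy_logits = [2.0, 1.0, 0.1]  # hypothetical scores for three characters

draws = [sample_index(toy_logits, temperature=0.8, rng=rng) for _ in range(1000)]
counts = [draws.count(i) for i in range(len(toy_logits))]
print(counts)
```

Lower temperatures concentrate the draws on the top-scoring character and approach greedy decoding, while higher temperatures spread them out. In the notebook, the same idea would apply to the model's `output` logits in place of `torch.argmax`.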
