<a href="https://colab.research.google.com/github/hakim733/AI-project/blob/main/LLM_fine_tuning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Complementary Task: Language Models
# I- Assignment: Fine-Tuning a Language Model for Chatbot Response Generation
Introduction

In this assignment, I explored the capabilities of modern language models by fine-tuning GPT-2, a widely used model available through the Hugging Face Transformers library, for the specific task of chatbot response generation.
The aim was to teach the model how to carry out human-like conversations, using real-world dialog data as training material.

Reference:

    Hugging Face Transformers Documentation: https://huggingface.co/docs/transformers/index

Task Selection and Dataset: https://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html

I chose to focus on chatbot response generation using the Cornell Movie-Dialogs Corpus, a public dataset of more than 220,000 conversational exchanges drawn from 617 movies.
This dataset is suitable for building conversational AI because it includes natural, turn-by-turn dialog.
What I Did: Step-by-Step.

In [None]:
!pip install transformers datasets




# Loading and Unzipping the Dataset

In [None]:
!wget http://www.cs.cornell.edu/~cristian/data/cornell_movie_dialogs_corpus.zip
!unzip cornell_movie_dialogs_corpus.zip


--2025-06-30 18:30:29--  http://www.cs.cornell.edu/~cristian/data/cornell_movie_dialogs_corpus.zip
Resolving www.cs.cornell.edu (www.cs.cornell.edu)... 132.236.207.53
Connecting to www.cs.cornell.edu (www.cs.cornell.edu)|132.236.207.53|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.cs.cornell.edu/~cristian/data/cornell_movie_dialogs_corpus.zip [following]
--2025-06-30 18:30:29--  https://www.cs.cornell.edu/~cristian/data/cornell_movie_dialogs_corpus.zip
Connecting to www.cs.cornell.edu (www.cs.cornell.edu)|132.236.207.53|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9916637 (9.5M) [application/zip]
Saving to: ‘cornell_movie_dialogs_corpus.zip’


2025-06-30 18:30:29 (42.8 MB/s) - ‘cornell_movie_dialogs_corpus.zip’ saved [9916637/9916637]

Archive:  cornell_movie_dialogs_corpus.zip
   creating: cornell movie-dialogs corpus/
  inflating: cornell movie-dialogs corpus/.DS_Store  
   creating: __MACOSX/
   cre

# Step 3: Preparing the Dialog Data

In [None]:
import ast

import pandas as pd

# Paths to the movie lines and conversations
lines_path = "cornell movie-dialogs corpus/movie_lines.txt"
convs_path = "cornell movie-dialogs corpus/movie_conversations.txt"

with open(lines_path, encoding='utf-8', errors='ignore') as f:
    lines = f.readlines()

with open(convs_path, encoding='utf-8', errors='ignore') as f:
    convs = f.readlines()

# Map lineID to text
line_map = {}
for line in lines:
    parts = line.split(" +++$+++ ")
    if len(parts) == 5:
        line_map[parts[0]] = parts[4].strip()

# Build pairs (prompt, response)
pairs = []
for conv in convs:
    parts = conv.split(" +++$+++ ")
    if len(parts) == 4:
        # Parse the list of line IDs safely instead of calling eval() on file content
        line_ids = ast.literal_eval(parts[3].strip())
        for i in range(len(line_ids) - 1):
            q = line_map.get(line_ids[i], "")
            a = line_map.get(line_ids[i + 1], "")
            if q and a:
                pairs.append((q, a))

# Convert to DataFrame and keep only 1000 pairs (for quick training)
df = pd.DataFrame(pairs, columns=["input", "response"]).sample(1000, random_state=42)
print(df.head())

                                                    input  \
111288                                        Jody. Wait.   
21837                                          Frances...   
83549                 This is going to drive the ante up.   
81404   I'm your political advisor, and I'm giving you...   
147690                                               Nah.   

                                                 response  
111288                                              What?  
21837                                               What?  
83549   Frank Galvin's... who's calling please? Bishop...  
81404                     I'm gonna protect those pilots.  
147690                               What you been up to?  
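
The pairing logic can be checked on a few in-memory records before it is run on the full corpus. The sketch below is only an illustration: the sample records (IDs, speaker names, text) are made up, and only the ` +++$+++ ` separator and the field layout follow the cell above. It uses `ast.literal_eval` to parse the list of line IDs.

```python
import ast

SEP = " +++$+++ "

# Hypothetical records in the same layout as movie_lines.txt and movie_conversations.txt
sample_lines = [
    "L1 +++$+++ u0 +++$+++ m0 +++$+++ ANNA +++$+++ Hi there.",
    "L2 +++$+++ u1 +++$+++ m0 +++$+++ BEN +++$+++ Hello.",
    "L3 +++$+++ u0 +++$+++ m0 +++$+++ ANNA +++$+++ How are you?",
]
sample_convs = ["u0 +++$+++ u1 +++$+++ m0 +++$+++ ['L1', 'L2', 'L3']"]


def build_pairs(lines, convs):
    """Turn consecutive lines of each conversation into (prompt, response) pairs."""
    line_map = {}
    for line in lines:
        parts = line.split(SEP)
        if len(parts) == 5:
            line_map[parts[0]] = parts[4].strip()

    pairs = []
    for conv in convs:
        parts = conv.split(SEP)
        if len(parts) != 4:
            continue
        line_ids = ast.literal_eval(parts[3].strip())
        # Pair each line with the one that follows it
        for first, second in zip(line_ids, line_ids[1:]):
            q, a = line_map.get(first, ""), line_map.get(second, "")
            if q and a:
                pairs.append((q, a))
    return pairs


toy_pairs = build_pairs(sample_lines, sample_convs)
print(toy_pairs)
```

A conversation with three lines yields two pairs, because the middle line acts once as a response and once as a prompt.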


# Step 5: Creating a Dataset and Tokenizing

In [None]:
from transformers import AutoTokenizer

# Load a pre-trained tokenizer (DistilGPT-2 shares GPT-2's vocabulary, so it is compatible with the "gpt2" model)
tokenizer = AutoTokenizer.from_pretrained("distilbert/distilgpt2")

# Set the padding token to the end-of-sequence token
tokenizer.pad_token = tokenizer.eos_token


In [None]:
from datasets import Dataset

# Concatenate 'input' and 'response' columns into a 'text' column
df['text'] = df['input'] + " " + df['response']

# Create Hugging Face Dataset from the new 'text' column
dataset = Dataset.from_pandas(df[['text']])

# Define the tokenize function
def tokenize(example):
    # Pad or truncate every example to 128 tokens
    tokenized_inputs = tokenizer(example["text"], truncation=True, padding="max_length", max_length=128)
    # For causal language modeling, the labels are the input ids themselves
    tokenized_inputs["labels"] = tokenized_inputs["input_ids"].copy()
    return tokenized_inputs

# Apply tokenization
tokenized_dataset = dataset.map(tokenize, batched=True)

Map:   0%|          | 0/1000 [00:00<?, ? examples/s]
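
Because `pad_token` is set to the EOS token and every example is padded to 128 tokens, the `labels` copied from `input_ids` also cover the padded positions, so the loss is computed on padding as well. A common remedy is to set those label positions to `-100`, the default `ignore_index` of PyTorch's cross-entropy loss. The sketch below shows the idea on plain Python lists; the helper name `mask_padding_labels` and the toy token ids are assumptions for illustration, and it assumes the tokenizer output includes an `attention_mask`.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def mask_padding_labels(input_ids, attention_mask):
    """Copy input_ids into labels, replacing padded positions with IGNORE_INDEX."""
    return [
        [tok if keep == 1 else IGNORE_INDEX for tok, keep in zip(ids, mask)]
        for ids, mask in zip(input_ids, attention_mask)
    ]


# Toy batch with arbitrary token ids; 50256 is GPT-2's end-of-text id, used here as padding
batch = {
    "input_ids": [[101, 202, 303, 50256, 50256]],
    "attention_mask": [[1, 1, 1, 0, 0]],
}
batch["labels"] = mask_padding_labels(batch["input_ids"], batch["attention_mask"])
print(batch["labels"])
```

Inside `tokenize`, the same helper could replace the plain `.copy()` of `input_ids`; this has not been run against the training setup in this notebook.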

# Step 6: Loading GPT-2 and Fine-Tuning

In [None]:
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("gpt2")

training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    logging_steps=200,
    save_total_limit=1,
    report_to="none"
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
)

trainer.train()


| Step | Training Loss |
| ---- | ------------- |
| 200  | 0.9396        |


TrainOutput(global_step=250, training_loss=0.9109258270263672, metrics={'train_runtime': 2152.5737, 'train_samples_per_second': 0.465, 'train_steps_per_second': 0.116, 'total_flos': 65323008000000.0, 'train_loss': 0.9109258270263672, 'epoch': 1.0})
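
The `global_step=250` in the output follows from the dataset size and batch size, and the reported loss can be turned into a perplexity for easier reading. The sketch below assumes a single device and no gradient accumulation. Since the labels include padded positions, the resulting perplexity is probably more optimistic than the model's real quality on dialog text.

```python
import math

num_examples = 1000   # rows sampled in Step 3
batch_size = 4        # per_device_train_batch_size
epochs = 1            # num_train_epochs

# One optimizer step per batch (assuming one device, no gradient accumulation)
steps = math.ceil(num_examples / batch_size) * epochs

train_loss = 0.9109258270263672   # training_loss reported in TrainOutput
perplexity = math.exp(train_loss)  # perplexity = exp(mean cross-entropy)

print(f"optimizer steps: {steps}")
print(f"perplexity: {perplexity:.2f}")
```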

# Step 7: Chatting with the Model!

In [None]:
prompt = "User: What's your favorite movie?\nBot:"
# The tokenizer returns input_ids together with the matching attention_mask
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=64, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))





In [None]:
import torch

# Greedy decoding helper for the encoder-decoder Transformer defined in Part II.
# src_tokenizer is a callable mapping a sentence to token ids; trg_tokenizer is a word-to-index dict.
def greedy_decode(model, src_sentence, src_tokenizer, trg_tokenizer, max_len=20, device='cpu'):
    model.eval()
    src = torch.tensor([src_tokenizer(src_sentence)], device=device)

    # Start token for target
    trg_indices = [trg_tokenizer['<sos>']]
    for _ in range(max_len):
        trg = torch.tensor([trg_indices], device=device)
        # The model builds its own attention masks inside forward()
        with torch.no_grad():
            output = model(src, trg)
        next_token = output[0, -1].argmax(-1).item()
        # Stop before storing <eos> so it never reaches the output
        if next_token == trg_tokenizer['<eos>']:
            break
        trg_indices.append(next_token)
    # Convert indices to words, skip <sos>
    trg_inv = {i: w for w, i in trg_tokenizer.items()}
    return " ".join(trg_inv[idx] for idx in trg_indices[1:])


In [None]:
def chat_with_bot(user_message):
    prompt = f"User: {user_message}\nBot:"
    # The tokenizer returns input_ids together with the matching attention_mask
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_length=64,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True,   # sample instead of always picking the most likely token
        top_k=50,         # restrict sampling to the 50 most likely tokens
        top_p=0.95        # nucleus sampling: keep the smallest token set with cumulative probability >= 0.95
    )
    reply = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Extract only the Bot's reply
    if "Bot:" in reply:
        reply = reply.split("Bot:")[1].strip()
    return reply

# Example test
print(chat_with_bot("What's your favorite movie?"))
print(chat_with_bot("How are you?"))
print(chat_with_bot("Tell me a joke."))


The King of Thieves.
How were you able to help me?  I don't know. I got lost.
Hello, sir, you.
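
The `top_k` and `top_p` arguments in `chat_with_bot` both shrink the set of candidate tokens before one is sampled. The sketch below illustrates the idea on a made-up four-token distribution in plain Python. It is a simplified illustration, not the Transformers implementation, which works on logits and differs in detail; the function names and toy probabilities are assumptions.

```python
import random


def top_k_top_p_filter(probs, top_k=50, top_p=0.95):
    """Return a renormalized distribution restricted by top-k, then by top-p."""
    # Keep the k most probable tokens and renormalize
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total = sum(p for _, p in ranked)
    ranked = [(tok, p / total) for tok, p in ranked]

    # Keep the smallest prefix whose cumulative probability reaches top_p
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}


def sample(probs, rng=random):
    """Draw one token according to its probability."""
    tokens, weights = zip(*probs.items())
    return rng.choices(tokens, weights=weights, k=1)[0]


toy_probs = {"movie": 0.5, "film": 0.3, "dragon": 0.15, "reddit": 0.05}
filtered = top_k_top_p_filter(toy_probs, top_k=3, top_p=0.8)
print(filtered)
print(sample(filtered, random.Random(0)))
```

Here `top_k=3` drops the least likely token, and `top_p=0.8` then drops a second one, so only the two most likely tokens remain eligible for sampling.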


In [None]:
while True:
    user_input = input("You: ")
    if user_input.lower() in ["quit", "exit"]:
        break
    print("Bot:", chat_with_bot(user_input))


You: hi
Bot: ...
You: what date is today ?
Bot: 
You: suggest a good horror movie 
Bot: It's not gonna be much of an anagram, mister, but the thing's gonna be so weird that there can be nothing but blood and mister.
You: what kind of model are you ?
Bot: what kind of model is your model? What kind of model are we talking about ?
You: provide a good recepie
Bot: Yes
You: ka du svenska ?
Bot: I am not
You: recepie for lam
Bot: What is it? What is it? Who is it?
You: stue ?
Bot: uh !
You: whos is lise larsen ?
Bot: I'm not.
You: who is abdelhakim Mraihi ?
Bot: yes he's a guy...
You: no he is a girl ?
Bot: No, you have a penis too.
You: do you have a vagina ?
Bot: You have a vagina. You are a man.
You: fuck you 
Bot: https://www.reddit.com/r/Bitcoin/
You: fuck you 
Bot: how?
You: you are an idio t
Bot: I was told the problem is your problem.
You: why are you hallucinating ?
Bot: 
You: are you on drugs ?
Bot: Yes.  And here you are.
You: you like footbal ?
Bot: No you know that.
You: you sp

KeyboardInterrupt: Interrupted by user
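
The chat prompts use a `User: ...\nBot:` layout, while the fine-tuning text in Step 5 was built as `input + " " + response`, so the model never saw these markers during training. This mismatch may contribute to the erratic replies above. The sketch below shows one way to format training pairs in the same layout as the prompt, ending each example with GPT-2's end-of-text string so the model can learn where a reply stops. The helper names are assumptions, and this variant has not been trained or evaluated here.

```python
EOS = "<|endoftext|>"  # GPT-2's end-of-text token string


def format_pair(user_text, bot_text, eos=EOS):
    """Format one (input, response) pair in the same layout as the chat prompt."""
    return f"User: {user_text.strip()}\nBot: {bot_text.strip()}{eos}"


def format_prompt(user_text):
    """Prompt used at inference time; the model continues after 'Bot:'."""
    return f"User: {user_text.strip()}\nBot:"


example = format_pair("Jody. Wait.", "What?")
print(example)
```

With the DataFrame from Step 3, the `text` column could then be built as `[format_pair(q, a) for q, a in zip(df["input"], df["response"])]`.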

# II- Implementing and Testing PyTorch's Own Transformer Models for Machine Translation

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
import math

# Sample data: toy pairs for demonstration
data = [
    ("hello world", "hallo welt"),
    ("good morning", "guten morgen"),
    ("i love you", "ich liebe dich"),
    ("how are you", "wie geht es dir"),
    ("thank you", "danke"),
]

# Build simple vocabularies
SRC_WORDS = set()
TRG_WORDS = set()
for src, trg in data:
    SRC_WORDS.update(src.split())
    TRG_WORDS.update(trg.split())
SRC_WORDS = ['<pad>', '<sos>', '<eos>'] + sorted(SRC_WORDS)
TRG_WORDS = ['<pad>', '<sos>', '<eos>'] + sorted(TRG_WORDS)
SRC2IDX = {word: idx for idx, word in enumerate(SRC_WORDS)}
TRG2IDX = {word: idx for idx, word in enumerate(TRG_WORDS)}
IDX2TRG = {idx: word for word, idx in TRG2IDX.items()}

# Simple tokenization
def encode_sentence(sentence, word2idx):
    return [word2idx['<sos>']] + [word2idx[word] for word in sentence.split()] + [word2idx['<eos>']]

src_seqs = [encode_sentence(src, SRC2IDX) for src, _ in data]
trg_seqs = [encode_sentence(trg, TRG2IDX) for _, trg in data]

# Pad sequences
def pad_seq(seq, max_len, pad_idx):
    return seq + [pad_idx] * (max_len - len(seq))

src_max_len = max(len(seq) for seq in src_seqs)
trg_max_len = max(len(seq) for seq in trg_seqs)
src_tensor = torch.tensor([pad_seq(seq, src_max_len, SRC2IDX['<pad>']) for seq in src_seqs])
trg_tensor = torch.tensor([pad_seq(seq, trg_max_len, TRG2IDX['<pad>']) for seq in trg_seqs])

# Transformer Model
class Seq2SeqTransformer(nn.Module):
    def __init__(self, num_tokens_src, num_tokens_tgt, emb_size, nhead, num_encoder_layers, num_decoder_layers, dim_feedforward=512):
        super().__init__()
        self.transformer = nn.Transformer(
            d_model=emb_size, nhead=nhead,
            num_encoder_layers=num_encoder_layers,
            num_decoder_layers=num_decoder_layers,
            dim_feedforward=dim_feedforward
        )
        self.src_tok_emb = nn.Embedding(num_tokens_src, emb_size)
        self.tgt_tok_emb = nn.Embedding(num_tokens_tgt, emb_size)
        self.positional_encoding = nn.Parameter(torch.zeros(100, emb_size)) # for small dataset
        self.fc_out = nn.Linear(emb_size, num_tokens_tgt)

    def forward(self, src, tgt):
        src_emb = self.src_tok_emb(src) + self.positional_encoding[:src.size(1)]
        tgt_emb = self.tgt_tok_emb(tgt) + self.positional_encoding[:tgt.size(1)]
        # Only the decoder needs a causal mask; the encoder may attend to the whole source sentence
        tgt_mask = self.transformer.generate_square_subsequent_mask(tgt.size(1)).to(tgt.device)
        # nn.Transformer expects (seq_len, batch, emb) by default, hence the permutes
        outs = self.transformer(src_emb.permute(1, 0, 2), tgt_emb.permute(1, 0, 2), tgt_mask=tgt_mask)
        return self.fc_out(outs.permute(1, 0, 2))

# Hyperparameters
EMB_SIZE = 32
NHEAD = 2
FFN_HID_DIM = 64
BATCH_SIZE = len(data)
NUM_ENCODER_LAYERS = 2
NUM_DECODER_LAYERS = 2

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Seq2SeqTransformer(len(SRC_WORDS), len(TRG_WORDS), EMB_SIZE, NHEAD, NUM_ENCODER_LAYERS, NUM_DECODER_LAYERS, FFN_HID_DIM).to(device)

# Loss and optimizer
loss_fn = nn.CrossEntropyLoss(ignore_index=TRG2IDX['<pad>'])
optimizer = optim.Adam(model.parameters(), lr=0.01)

# Training (for demonstration, just a few epochs)
EPOCHS = 30
for epoch in range(EPOCHS):
    model.train()
    optimizer.zero_grad()
    output = model(src_tensor.to(device), trg_tensor[:,:-1].to(device))
    output = output.reshape(-1, output.shape[-1])
    trg_y = trg_tensor[:,1:].contiguous().view(-1).to(device)
    loss = loss_fn(output, trg_y)
    loss.backward()
    optimizer.step()
    if (epoch+1)%10==0:
        print(f'Epoch {epoch+1}, Loss: {loss.item():.4f}')

# Test translation
def translate(sentence):
    model.eval()
    src = torch.tensor([pad_seq(encode_sentence(sentence, SRC2IDX), src_max_len, SRC2IDX['<pad>'])], device=device)
    tgt = torch.tensor([[TRG2IDX['<sos>']]], device=device)
    words = []
    with torch.no_grad():  # inference only, no gradients needed
        for _ in range(trg_max_len):
            out = model(src, tgt)
            next_word = out[0, -1].argmax().item()
            # Stop at <eos> without adding it to the output
            if next_word == TRG2IDX['<eos>']:
                break
            words.append(IDX2TRG[next_word])
            tgt = torch.cat([tgt, torch.tensor([[next_word]], device=device)], dim=1)
    return " ".join(words)

print("\nTranslation tests:")
for src, _ in data:
    print(f"{src} -> {translate(src)}")




Epoch 10, Loss: 1.1640
Epoch 20, Loss: 0.3126
Epoch 30, Loss: 0.0509

Translation tests:
hello world -> hallo welt
good morning -> guten morgen
i love you -> ich liebe dich
how are you -> wie geht es dir
thank you -> danke
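
Two details of the training loop are easy to miss: the decoder input and the labels are the same target sequence shifted by one position (`trg_tensor[:,:-1]` versus `trg_tensor[:,1:]`), and the causal mask stops each target position from attending to later ones. The sketch below reproduces both on plain Python lists. The ids follow the `TRG2IDX` mapping built above (`<sos>`=1, `hallo`=9, `welt`=13, `<eos>`=2), and the helper names are assumptions for illustration.

```python
NEG_INF = float("-inf")


def causal_mask(size):
    """Additive mask: position i may attend to positions 0..i only."""
    return [[0.0 if j <= i else NEG_INF for j in range(size)] for i in range(size)]


def shift_for_teacher_forcing(target_ids):
    """Decoder input drops the last token; the labels drop the first."""
    return target_ids[:-1], target_ids[1:]


# "<sos> hallo welt <eos>" encoded with the target vocabulary
target = [1, 9, 13, 2]
decoder_input, labels = shift_for_teacher_forcing(target)

for row in causal_mask(len(decoder_input)):
    print(row)
print("decoder input:", decoder_input)
print("labels:       ", labels)
```

At each position the decoder sees the tokens up to and including that position and is trained to predict the next one, which is why the labels are the input shifted left by one.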


# III- Text Generation

## LSTM

In [None]:
data = [
    "Action: A brave knight fights dragons in a mystical land.",
    "Action: Space marines battle alien invaders on Mars.",
    "Puzzle: Connect colored gems to unlock magical doors.",
    "Puzzle: Rearrange tiles to reveal ancient symbols.",
    "RPG: Explore a post-apocalyptic world as a mutant survivor.",
    "RPG: Assemble a team of heroes to save the kingdom.",
    "Strategy: Build a civilization from the Stone Age to the future.",
    "Strategy: Command armies and outsmart your enemies in real time."
]
import torch
import torch.nn as nn
import random

all_text = "\n".join(data)
chars = sorted(list(set(all_text)))
char2idx = {ch: i for i, ch in enumerate(chars)}
idx2char = {i: ch for i, ch in enumerate(chars)}
VOCAB_SIZE = len(chars)

class CharLSTM(nn.Module):
    def __init__(self, vocab_size, hidden_size=128, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lstm = nn.LSTM(hidden_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, x, hidden=None):
        x = self.embed(x)
        out, hidden = self.lstm(x, hidden)
        out = self.fc(out)
        return out, hidden

seq_length = 40
step = 3
sequences = []
next_sequences = []

for line in data:
    for i in range(0, len(line) - seq_length, step):
        seq = line[i:i+seq_length]
        next_seq = line[i+1:i+seq_length+1]
        sequences.append(seq)
        next_sequences.append(next_seq)

X = torch.tensor([[char2idx[ch] for ch in seq] for seq in sequences])
y = torch.tensor([[char2idx[ch] for ch in seq] for seq in next_sequences])


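# Create the LSTM model, optimizer, and loss before training
# (Adam with lr=0.01 and cross-entropy, matching train() used for the RNN and GRU models)
model = CharLSTM(VOCAB_SIZE)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

for epoch in range(20):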
    model.train()
    optimizer.zero_grad()
    output, _ = model(X)        # output: [batch, seq_len, vocab]
    loss = criterion(output.view(-1, VOCAB_SIZE), y.view(-1))
    loss.backward()
    optimizer.step()
    if (epoch+1) % 5 == 0:
        print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")


def generate_text(model, prompt, length=100):
    model.eval()
    input_seq = torch.tensor([[char2idx.get(ch, 0) for ch in prompt]], dtype=torch.long)
    generated = list(prompt)
    hidden = None
    for _ in range(length):
        output, hidden = model(input_seq, hidden)
        prob = output[:, -1, :].squeeze().softmax(0).detach()
        idx = torch.multinomial(prob, 1).item()
        next_char = idx2char[idx]
        generated.append(next_char)
        input_seq = torch.tensor([[idx]], dtype=torch.long)
        if next_char == "\n":
            break
    return "".join(generated)

print("Action Example:\n", generate_text(model, "Action:", 120))
print("Puzzle Example:\n", generate_text(model, "Puzzle:", 120))
print("RPG Example:\n", generate_text(model, "RPG:", 120))
print("Strategy Example:\n", generate_text(model, "Strategy:", 120))
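
The training data for the character models is built with a sliding window: each input is `seq_length` characters long and its target is the same window shifted one character to the right. The sketch below repeats that logic on a short made-up string so the windows are easy to inspect. The helper name `make_windows` and the toy values are assumptions for illustration.

```python
def make_windows(line, seq_length, step):
    """Return (input, target) character windows; each target is shifted by one character."""
    inputs, targets = [], []
    for i in range(0, len(line) - seq_length, step):
        inputs.append(line[i:i + seq_length])
        targets.append(line[i + 1:i + seq_length + 1])
    return inputs, targets


toy_line = "RPG: heroes"
toy_inputs, toy_targets = make_windows(toy_line, seq_length=5, step=3)
for x, t in zip(toy_inputs, toy_targets):
    print(repr(x), "->", repr(t))
```

With `seq_length = 40`, `step = 3`, and sentences of roughly 50 to 65 characters, each line contributes only a few windows, so the models see very little data.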


## RNN + GRU Models


In [None]:
class CharRNN(nn.Module):
    def __init__(self, vocab_size, hidden_size=128, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.rnn = nn.RNN(hidden_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)
    def forward(self, x, hidden=None):
        x = self.embed(x)
        out, hidden = self.rnn(x, hidden)
        out = self.fc(out)
        return out, hidden

class CharGRU(nn.Module):
    def __init__(self, vocab_size, hidden_size=128, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)
    def forward(self, x, hidden=None):
        x = self.embed(x)
        out, hidden = self.gru(x, hidden)
        out = self.fc(out)
        return out, hidden

def train(model, X, y, epochs=20):
    optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
    criterion = nn.CrossEntropyLoss()
    for epoch in range(epochs):
        model.train()
        optimizer.zero_grad()
        output, _ = model(X)
        loss = criterion(output.reshape(-1, VOCAB_SIZE), y.reshape(-1))
        loss.backward()
        optimizer.step()
        if (epoch+1) % 5 == 0:
            print(f"Epoch {epoch+1}, Loss: {loss.item():.4f}")

def generate_text(model, prompt, length=100):
    model.eval()
    input_seq = torch.tensor([[char2idx.get(ch, 0) for ch in prompt]], dtype=torch.long)
    generated = list(prompt)
    hidden = None
    for _ in range(length):
        output, hidden = model(input_seq, hidden)
        prob = output[:, -1, :].squeeze().softmax(0).detach()
        idx = torch.multinomial(prob, 1).item()
        next_char = idx2char[idx]
        generated.append(next_char)
        input_seq = torch.tensor([[idx]], dtype=torch.long)
        if next_char == "\n":
            break
    return "".join(generated)


In [None]:
print("\n== RNN ==")
rnn_model = CharRNN(VOCAB_SIZE)
train(rnn_model, X, y, epochs=20)
print("Action Example:\n", generate_text(rnn_model, "Action:", 120))
print("Puzzle Example:\n", generate_text(rnn_model, "Puzzle:", 120))


In [None]:
print("\n== GRU ==")
gru_model = CharGRU(VOCAB_SIZE)
train(gru_model, X, y, epochs=20)
print("RPG Example:\n", generate_text(gru_model, "RPG:", 120))
print("Strategy Example:\n", generate_text(gru_model, "Strategy:", 120))


# IV- Analyzing Both Models


| Model | Handles Long Sequences | Output Quality on Small Data | Complexity |
| ----- | ---------------------- | ---------------------------- | ---------- |
| RNN   | Weak                   | OK for short ideas           | Simple     |
| GRU   | Good                   | Good for short/med ideas     | Moderate   |


In [None]:
import nbformat

# Load the notebook (no shell-style quotes are needed inside a Python string, even with spaces in the path)
with open("/content/drive/MyDrive/Colab Notebooks/Fine tuning.ipynb", "r", encoding="utf-8") as f:
    try:
        nb = nbformat.read(f, as_version=4)
    except Exception as e:
        print(f"Error reading notebook: {e}")
        raise

# Write it back (fixes formatting issues)
with open("/content/drive/MyDrive/Colab Notebooks/LLM-fine-tuning.ipynb", "w", encoding="utf-8") as f:
    nbformat.write(nb, f)

In [None]:
%cd results/

To access files in your Google Drive, you need to mount it first.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
ls /content/drive/MyDrive/'Colab Notebooks'/LLM-fine-tuning.ipynb

' AI-Project_Data cleaning and LR imlimentation.ipynb'
 AI-Project_final-version.ipynb
'Banking project.ipynb'
'Copy of AI-project.ipynb'
'Copy of  AI-project_v0_09 10.ipynb'
'Copy of PyGWalker Test'
'Copy of Scania-CNN.ipynb'
'Copy of scania-final.ipynb'
 ExpDataAnalyzes.ipynb
'Exploratory data analysis-lab.ipynb'
'Fine tuning.ipynb'
'förutsägelse av låneberättigande.ipynb'
 ica.ipynb
'K-means clustering.ipynb'
 LLM-fine-tuning.ipynb
'Neural network hypermarametres tuniing.ipynb'
 Optimizing_CNN.ipynb
'Random_ Forest.ipynb'
 Scania-CNN.ipynb
'Scania dataset-0 .ipynb'
'Scania dataset-1 .ipynb'
'Scania dataset-2.ipynb'
'scania dataset3-Multiclass.ipynb'
 scania_dataset3_Multiclass_UPDATED.ipynb
 scania-GRU.ipynb
'Simle NN-lab1.ipynb'
'special events.ipynb'
' Titanic:machine learning model '
'tryingout spark.ipynb'
 Untitled
 Untitled0.ipynb
 Untitled1.ipynb
 Untitled2.ipynb
 Untitled3.ipynb
 Untitled4.ipynb
 Untitled5.ipynb
 Untitled6.ipynb
 Untitled7.ipynb
 Vestas.ipynb
'Wind_Regre

In [None]:
!whoami