# GPT - Messing Around with Extra Neural Net Layers

Goals:
- Creating a generative pretrained transformer (GPT).
- Learning the ins and outs of GPTs.
- Adding in more layers from the base model I created earlier.
- Experimenting with different neuron types in the layers, such as:
    - Recurrent (long short-term memory, LSTM) layer, which can increase the contextual information the model retains while learning and generating outputs. This helps it remember words that are important to the context.
    - Convolutional layer, which picks up local patterns in the data that can help generate meaningful text in response to prompts.

# Setup: Packages & Important Parameters

In [1]:
import torch
import torch.nn as nn
from torch.nn import functional as F
import mmap
import random
import pickle
import argparse

device = 'cuda' if torch.cuda.is_available() else 'cpu' #be smoother, use CUDA ;)
print(device)

cuda


### Hyperparameters

In [2]:
batch_size = 32 #The number of samples processed before the model is updated. Larger batch sizes can lead to faster training, but require more memory.
block_size = 128 #The length of the sequence processed by the model. Refers to the number of tokens in each sequence.
max_iters = 200 #The maximum number of iterations or steps the model will take during training. Each iteration updates the model's parameters based on a batch of data.
learning_rate = 2e-5 #The step size used by the optimization algorithm to update the model's weights. A smaller learning rate can lead to more precise updates but slower convergence.
eval_iters = 100 #Used in two places below: the number of batches averaged for each loss estimate in estimate_loss, and the number of training steps between evaluations. Evaluating regularly helps monitor training progress and spot overfitting.
n_embd = 384 #The dimensionality of the embeddings used in the model. It defines the size of the vector representation for each token.
n_head = 4 #The number of attention heads in the multi-head attention mechanism. More heads can capture different aspects of the input, but require more computational resources.
n_layer = 4 #The number of layers (or blocks) in the model. Each layer consists of multiple sub-layers, including self-attention and feedforward layers.
dropout = 0.2 #The dropout rate. The fraction of neurons randomly set to zero during training to prevent overfitting.

# Input Data

In [3]:
# Step 1: Read the entire text from the file
chars = ""
with open('WizOfOz.txt', 'r', encoding='utf-8') as f:
    text = f.read()
chars = sorted(list(set(text)))
vocab_size = len(chars)

# Step 2: Define the training split size; adjust 0.8 to whatever ratio you want
train_split = 0.8

# Step 3: Calculate the split index based on the 80/20 ratio
split_index = int(len(text) * train_split) 

# Step 4: Split the text into training and validation sets
train_text = text[:split_index]
val_text = text[split_index:]

# Step 5: Write the training portion of the text to 'train_split.txt'
train_filename = 'train_split.txt'
with open(train_filename, 'w', encoding='utf-8') as f:
    f.write(train_text)

# Step 6: Write the remaining portion of the text to 'val_split.txt'
val_filename = 'val_split.txt'
with open(val_filename, 'w', encoding='utf-8') as f:
    f.write(val_text)

# Confirming the files are created and written
print(f"{train_filename} and {val_filename} have been created with a {train_split} training size.")

train_split.txt and val_split.txt have been created with a 0.8 training size.


# Tokenization & Encoding

In [4]:
string_to_int = { ch:i for i, ch in enumerate(chars) } #This line creates a dictionary called string_to_int that maps each character (ch) in the list chars to its corresponding index (i).
int_to_string = { i:ch for i, ch in enumerate(chars) } #This line creates another dictionary called int_to_string that does the reverse of string_to_int. It maps each index (i) to its corresponding character (ch) from the list chars.
encode = lambda s: [string_to_int[c] for c in s] #This line defines a lambda function called encode that takes a string s as input and returns a list of integers. Each character c in the string s is converted to its corresponding integer using the string_to_int dictionary.
decode = lambda l: ''.join([int_to_string[i] for i in l]) #This line defines a lambda function called decode that takes a list of integers l as input and returns a string. Each integer i in the list l is converted to its corresponding character using the int_to_string dictionary. The characters are then joined together to form the final string.
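To see the round trip concretely, here is a minimal sketch on a toy string. The names `toy_chars`, `toy_encode` and `toy_decode` are made up for this illustration so they don't overwrite the notebook's real vocabulary.

```python
# Build a character-level vocabulary from a toy string (illustrative only).
toy_text = "hello world"
toy_chars = sorted(set(toy_text))  # [' ', 'd', 'e', 'h', 'l', 'o', 'r', 'w']

toy_string_to_int = {ch: i for i, ch in enumerate(toy_chars)}
toy_int_to_string = {i: ch for i, ch in enumerate(toy_chars)}

def toy_encode(s):
    """Map each character to its index in the sorted vocabulary."""
    return [toy_string_to_int[c] for c in s]

def toy_decode(ids):
    """Map each index back to its character and join the result."""
    return "".join(toy_int_to_string[i] for i in ids)

ids = toy_encode("hello")
print(ids)              # → [3, 2, 4, 4, 5]
print(toy_decode(ids))  # → hello
```

Encoding followed by decoding returns the original string, as long as every character was present in the text the vocabulary was built from.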

# Memory Map - Batching
This is how the model draws small snippets of text from a single file of any size. It works like a sampling tool.

Memory-mapping makes better use of the hardware: the model can read a random slice of a large text file without loading the whole file into memory.

In [5]:
def get_random_chunk(split): #This line defines a function named get_random_chunk that takes a parameter split.
    filename = 'train_split.txt' if split == 'train' else 'val_split.txt' #This line sets the variable filename to 'train_split.txt' if the split parameter is 'train'; otherwise, it sets it to 'val_split.txt'.
    with open(filename, 'rb') as f: #This line opens the file specified by filename in read-binary mode ('rb') and assigns the file object to the variable f.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm: #This line uses memory-mapped file support (mmap) to map the file into memory for read-only access. f.fileno() gets the file descriptor of the open file, and mmap.ACCESS_READ specifies read-only access.
            file_size = len(mm) #This line gets the size of the memory-mapped file and assigns it to the variable file_size.
            start_pos = random.randint(0, (file_size) - block_size*batch_size) #This line picks a random starting position in the file, leaving a margin of block_size*batch_size bytes so the read below never runs past the end.
            mm.seek(start_pos) #This line moves the file pointer to the chosen starting position.
            block = mm.read(block_size*batch_size-1) #This line reads a block of data from the memory-mapped file starting at start_pos. The size of the block is block_size * batch_size - 1 bytes.
            decoded_block = block.decode('utf-8', errors='ignore').replace('\r', '') #This line decodes the read block from bytes to a string using UTF-8 encoding, ignoring any errors during decoding. It also replaces any carriage return characters ('\r') with an empty string.
            data = torch.tensor(encode(decoded_block), dtype=torch.long) #This line encodes the decoded block to a list of integers using the encode function, then converts it to a PyTorch tensor of type long.
        return data

def get_batch(split):
    data = get_random_chunk(split) #This line calls the get_random_chunk function with the given split and assigns the returned data tensor to the variable data.
    ix = torch.randint(len(data) - block_size, (batch_size,)) #This line generates batch_size random indices (ix) within the range from 0 to the length of data minus block_size.
    x = torch.stack([data[i:i+block_size] for i in ix]) #This line creates a list of slices from data of length block_size starting at each index in ix, then stacks these slices into a single tensor x.
    y = torch.stack([data[i+1:i+block_size+1] for i in ix]) #This line creates a list of slices from data of length block_size starting at each index in ix plus 1, then stacks these slices into a single tensor y. This represents the target data shifted by one position.
    x, y = x.to(device), y.to(device) #This line moves the tensors x and y to the specified device (e.g., CPU or GPU).
    return x, y
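The two ideas in this cell (reading a random slice through `mmap`, then pairing each input with the same sequence shifted by one) can be tried in isolation. This is a minimal sketch using only the standard library; `read_random_chunk`, the temporary file and the sizes are stand-ins invented for the example, not part of the notebook.

```python
import mmap
import os
import random
import tempfile

def read_random_chunk(path, n_bytes, rng):
    """Read n_bytes from a random offset without loading the whole file."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Leave a margin of n_bytes so the slice never runs past the end.
            start = rng.randint(0, len(mm) - n_bytes)
            return mm[start:start + n_bytes]

# Hypothetical stand-in for train_split.txt.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"the quick brown fox jumps over the lazy dog. " * 50)
    path = tmp.name

chunk = read_random_chunk(path, n_bytes=16, rng=random.Random(0)).decode("ascii")
os.remove(path)

# Inputs and targets: the target is the input shifted one character ahead.
block = 8
x = chunk[0:block]
y = chunk[1:block + 1]
print(repr(x))
print(repr(y))
```

Each character in `y` is the character that follows the matching position in `x`, which is exactly what the model is trained to predict.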

# Neural Nets
Here are our decoders, self-attention modules, etc.

In [6]:
class Head(nn.Module): #A single head in a self-attention mechanism
    """a head of self-attention"""
    
    def __init__(self, head_size): #This line defines the constructor (__init__) of the Head class, which initializes the object.
        super().__init__() #The super().__init__() call initializes the base class (nn.Module)
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False) #These lines define three linear layers (self.key, self.query, self.value) without biases. They transform the input embeddings (n_embd) into vectors of size head_size.
        self.value = nn.Linear(n_embd, head_size, bias=False) #These layers compute the key, query, and value matrices in the self-attention mechanism.
        self.register_buffer('tril', torch.tril(torch.ones(block_size, block_size))) #This line registers a buffer named tril that contains a lower triangular matrix of ones with dimensions block_size x block_size. This matrix is used to mask out future positions in the attention scores to prevent peeking into the future.
        self.dropout = nn.Dropout(dropout) #This line initializes a dropout layer (self.dropout) with the specified dropout probability (dropout). Dropout is used to prevent overfitting by randomly setting some elements to zero during training.
        
    def forward(self, x):
        B, T, C = x.shape #This line unpacks the shape of the input tensor x into B (batch size), T (sequence length), and C (number of channels or embedding dimension).
        k = self.key(x) #These lines compute the key (k), query (q), and value (v) matrices by passing the input tensor x through the respective linear layers.
        q = self.query(x)
        v = self.value(x)
        wei = q @ k.transpose(-2,-1) * k.shape[-1]**-0.5 #Computes the attention scores (wei) as a scaled dot product between the query and the transposed key matrix. The scores are divided by the square root of head_size (the last dimension of k) so the softmax does not saturate and gradients stay stable.
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float('-inf')) #Masks out the upper triangular part of the attention scores (wei) to ensure that each position can only attend to previous positions and itself, preventing information leakage from future positions.
        wei = F.softmax(wei, dim=-1) #This line applies the softmax function to the attention scores along the last dimension (dim=-1) to obtain the attention weights. The softmax function converts the scores to probabilities that sum to 1.
        wei = self.dropout(wei) #This line applies dropout to the attention weights to prevent overfitting.
        out = wei @ v #This line computes the output by performing a weighted sum of the value matrix (v) using the attention weights (wei).
        return out
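The masking and softmax steps in `Head.forward` are easier to follow on a 3x3 example. The sketch below redoes them in plain Python with made-up scores; `causal_attention_weights` is a name invented here, and it is meant to mirror the `masked_fill` and `F.softmax` calls rather than replace them.

```python
import math

def causal_attention_weights(scores):
    """Apply a causal mask and a row-wise softmax to a square score matrix."""
    T = len(scores)
    weights = []
    for t in range(T):
        # Positions after t get -inf, so exp() turns them into exactly 0.
        row = [scores[t][s] if s <= t else float("-inf") for s in range(T)]
        peak = max(row)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Made-up attention scores for a sequence of three tokens.
scores = [[0.0, 1.0, 2.0],
          [1.0, 1.0, 5.0],
          [0.5, 0.5, 0.5]]

weights = causal_attention_weights(scores)
for row in weights:
    print([round(w, 3) for w in row])
# → [1.0, 0.0, 0.0]
# → [0.5, 0.5, 0.0]
# → [0.333, 0.333, 0.333]
```

The large score of 5.0 in the second row has no effect because it sits in a future position. Every row still sums to 1 over the positions it is allowed to see.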

In [7]:
class MultiHeadAttention(nn.Module):
    """multiple heads of self-attention in parallel"""

    def __init__(self, num_heads, head_size):
        super().__init__()
        self.heads = nn.ModuleList([Head(head_size) for _ in range(num_heads)]) #This line creates a module list (self.heads) containing num_heads instances of the Head class, each with a specified head_size. Each head performs self-attention independently.
        self.proj = nn.Linear(head_size * num_heads, n_embd) #This line defines a linear layer (self.proj) that projects the concatenated output of all attention heads back to the original embedding dimension (n_embd).
        self.dropout = nn.Dropout(dropout) #This line initializes a dropout layer (self.dropout) with the specified dropout probability (dropout). Dropout is used to prevent overfitting by randomly setting some elements to zero during training.

    def forward(self, x):
        out = torch.cat([h(x) for h in self.heads], dim=-1) #This line concatenates the outputs of all attention heads along the last dimension (dim=-1). Each head processes the input tensor x independently, and their outputs are combined to form a single tensor.
        out = self.dropout(self.proj(out)) #This line applies the linear projection (self.proj) to the concatenated output, reducing it back to the original embedding dimension (n_embd). Dropout is then applied to the projected output to prevent overfitting.
        return out

In [8]:
class FeedForward(nn.Module):
    """a basic linear layer followed by non-linear layer"""

    def __init__(self, n_embd):
        super().__init__() #This line defines the constructor (__init__) of the FeedForward class, which initializes the object. The super().__init__() call initializes the base class (nn.Module).
        self.net = nn.Sequential( # This line initializes a sequential container (self.net) that will hold the layers of the feedforward network. The nn.Sequential container allows for the easy stacking of layers.
            nn.Linear(n_embd, 4 * n_embd), #This line adds a linear layer that transforms the input tensor of size n_embd to a tensor of size 4 * n_embd. This expansion is often used to increase the representational capacity of the network.
            nn.ReLU(),                     #This line adds a ReLU (Rectified Linear Unit) activation function, which introduces non-linearity to the model. ReLU replaces all negative values in the tensor with zero.
            nn.Linear(4 * n_embd, n_embd), #This line adds another linear layer that transforms the tensor back from size 4 * n_embd to n_embd. This ensures that the input and output dimensions of the feedforward network are the same.
            nn.Dropout(dropout), #This line adds a dropout layer with the specified dropout probability (dropout). Dropout randomly sets a percentage of the neurons to zero during training, which helps to prevent overfitting.
        )

    def forward(self, x):
        return self.net(x)

In [9]:
class Block(nn.Module):
    """transformer block: communication between nodes and computation"""

    def __init__(self, n_embd, n_head):
        super().__init__() #This line defines the constructor (__init__) of the Block class, which initializes the object. The super().__init__() call initializes the base class (nn.Module).
        head_size = n_embd // n_head #This line calculates the size of each attention head (head_size) by dividing the embedding dimension (n_embd) by the number of attention heads (n_head).
        self.sa = MultiHeadAttention(n_head, head_size) #This line initializes a multi-head attention mechanism (self.sa) with n_head attention heads, each of size head_size.
        self.ffwd = FeedForward(n_embd) #This line initializes a feedforward neural network (self.ffwd) with the specified embedding dimension (n_embd).
        self.ln1 = nn.LayerNorm(n_embd)
        self.ln2 = nn.LayerNorm(n_embd) #These lines initialize two layer normalization layers (self.ln1 and self.ln2), each normalizing the input to have zero mean and unit variance across the embedding dimension (n_embd).

    def forward(self, x): #This line defines the forward pass method (forward) of the Block class, which takes an input tensor x.
        y = self.sa(x) #This line passes the input tensor x through the multi-head attention mechanism (self.sa) and assigns the output to y.
        x = self.ln1(x + y) #This line adds the output of the self-attention mechanism (y) to the original input tensor (x) and normalizes the result using the first layer normalization layer (self.ln1). This is known as the "add and norm" step.
        y = self.ffwd(x) #This line passes the normalized tensor x through the feedforward neural network (self.ffwd) and assigns the output to y.
        x = self.ln2(x + y) #This line adds the output of the feedforward network (y) to the tensor x from the previous "add and norm" step and normalizes the result using the second layer normalization layer (self.ln2). This is another "add and norm" step.
        return x
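As a sanity check on the sizes involved, the number of trainable parameters in one `Block` can be worked out by hand from the layer shapes. This is a rough sketch assuming the bias settings used in the code above (no bias on key, query and value; bias on every other linear layer); `block_param_count` is a helper made up for the illustration.

```python
def block_param_count(n_embd, n_head):
    """Count trainable parameters in one transformer block as defined above."""
    head_size = n_embd // n_head
    # Each head has key, query and value projections with no bias.
    heads = n_head * 3 * n_embd * head_size
    # Output projection of the concatenated heads: weights plus bias.
    proj = (head_size * n_head) * n_embd + n_embd
    # Feedforward: expand to 4 * n_embd, then project back, both with bias.
    ffwd = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)
    # Two LayerNorms, each with a weight and a bias vector.
    norms = 2 * (2 * n_embd)
    return heads + proj + ffwd + norms

per_block = block_param_count(n_embd=384, n_head=4)
print(per_block)      # → 1773312
print(4 * per_block)  # → 7093248 for n_layer = 4
```

The `tril` mask is registered as a buffer, so it is saved with the model but does not count as a trainable parameter.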

In [10]:
class RecurrentLayer(nn.Module):
    """long short-term memory (LSTM) module"""

    def __init__(self, n_embd, n_head, hidden_size, n_layers):
        super().__init__()
        self.lstm = nn.LSTM(n_embd, hidden_size, n_layers, batch_first=True)

    def forward(self, x):
        x, _ = self.lstm(x)
        return x
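To get a feel for how much the LSTM adds to the model, its parameter count can be estimated by hand. The sketch assumes the standard single-layer, unidirectional layout that `nn.LSTM` uses: four gates, each with an input weight matrix, a hidden weight matrix and two bias vectors. `lstm_param_count` is a helper invented for this example.

```python
def lstm_param_count(input_size, hidden_size):
    """Parameters in one unidirectional LSTM layer (four gates, two biases each)."""
    input_weights = 4 * hidden_size * input_size
    hidden_weights = 4 * hidden_size * hidden_size
    biases = 2 * 4 * hidden_size
    return input_weights + hidden_weights + biases

lstm_params = lstm_param_count(input_size=384, hidden_size=384)
print(lstm_params)  # → 1182720
```

With `n_embd = 384` that is roughly two thirds of the size of one transformer block, so the recurrent layer is a substantial addition rather than a small tweak.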

In [24]:
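class ConvLayer(nn.Module):
    """1D convolution over the sequence dimension"""

    def __init__(self, n_embd, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.conv=nn.Conv1d(n_embd, n_embd, kernel_size, stride, padding) #With kernel_size=3, stride=1 and padding=1 the output keeps the same sequence length as the input.

    def forward(self, x):
        x = x.transpose(1, 2) #nn.Conv1d expects (B, C, T), so swap the sequence and embedding dimensions.
        x = self.conv(x)
        x = x.transpose(1, 2) #Swap back to (B, T, C) for the rest of the model.
        return x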
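Two properties of this convolution are worth checking by hand: the output length, and which input positions each output position reads. The sketch below uses the output-length formula from the `nn.Conv1d` documentation; `conv1d_out_len` and `conv_window` are helper names made up for the illustration.

```python
def conv1d_out_len(length, kernel_size=3, stride=1, padding=1, dilation=1):
    """Output length of a 1D convolution for a given input length."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

def conv_window(t, kernel_size=3, left_pad=1):
    """Input positions read to produce output position t (stride 1)."""
    return [t - left_pad + k for k in range(kernel_size)]

print(conv1d_out_len(128))         # → 128, so the sequence length is preserved
print(conv_window(5))              # → [4, 5, 6], includes the next position
print(conv_window(5, left_pad=2))  # → [3, 4, 5], left-only padding stays causal
```

With symmetric padding, position 5 reads position 6, which is a future token from the point of view of next-character prediction. If this layer were added to the forward pass, padding only on the left by `kernel_size - 1` would be one way to avoid that leak.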

# The Language Model

In [25]:
class GPTLanguageModel(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        self.token_embedding_table = nn.Embedding(vocab_size, n_embd) #This line initializes an embedding layer (self.token_embedding_table) that converts token indices into dense vectors of size n_embd. The vocabulary size is specified by vocab_size.
        self.position_embedding_table = nn.Embedding(block_size, n_embd) #This line initializes an embedding layer (self.position_embedding_table) that converts position indices into dense vectors of size n_embd. The maximum sequence length is specified by block_size.
        self.blocks = nn.Sequential(*[Block(n_embd, n_head=n_head) for _ in range(n_layer)]) #This line initializes a sequential container (self.blocks) containing n_layer instances of the Block class. Each block performs self-attention and feedforward operations.
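        self.conv_layer = ConvLayer(n_embd) #This line initializes the convolutional layer. Note that forward() below never calls it, so it adds parameters without affecting the output.
        self.recurrent_layer = RecurrentLayer(n_embd, hidden_size=n_embd, n_layers=1, n_head=n_head) #This line initializes a single-layer LSTM whose hidden size equals n_embd, so its output can feed straight into ln_f and lm_head.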
        self.ln_f = nn.LayerNorm(n_embd) #This line initializes a layer normalization layer (self.ln_f) that normalizes the final output across the embedding dimension (n_embd).
        self.lm_head = nn.Linear(n_embd, vocab_size)#This line initializes a linear layer (self.lm_head) that projects the final output back to the vocabulary size, producing logits for each token in the vocabulary.

        self.apply(self._init_weights) #This line applies the _init_weights method to initialize the weights of the model.

    def _init_weights(self, module): #This method initializes the weights of the model.
        if isinstance(module, nn.Linear): #If the module is a linear layer, it initializes the weights with a normal distribution (mean 0, standard deviation 0.02) and sets the biases to zero.
            torch.nn.init.normal_(module.weight, mean = 0.0, std = 0.02)
            if module.bias is not None:
                torch.nn.init.zeros_(module.bias)
        elif isinstance(module, nn.Embedding): #If the module is an embedding layer, it also initializes the weights with a normal distribution.
            torch.nn.init.normal_(module.weight, mean = 0.0, std = 0.02)

    def forward(self, index, targets=None): #This line defines the forward pass method (forward) of the GPTLanguageModel class, which takes input indices (index) and optional target indices (targets).
        B, T = index.shape #This line unpacks the shape of the input tensor index into B (batch size) and T (sequence length).

        tok_emb = self.token_embedding_table(index) #This line converts the input token indices (index) into dense vectors using the token embedding table (self.token_embedding_table).
        pos_emb = self.position_embedding_table(torch.arange(T, device=device)) #This line generates position indices (torch.arange(T, device=device)) and converts them into dense vectors using the position embedding table (self.position_embedding_table).
        x = tok_emb + pos_emb #This line adds the token embeddings and positional embeddings to form the input to the neural network.
        x = self.blocks(x) #This line passes the combined embeddings through the transformer blocks (self.blocks).
        x = self.recurrent_layer(x) #This line passes the output of the transformer blocks through the LSTM layer (self.recurrent_layer).
        x = self.ln_f(x) #This line normalizes the output of the recurrent layer using the final layer normalization layer (self.ln_f).
        logits = self.lm_head(x) #This line projects the normalized output to the vocabulary size using the linear layer (self.lm_head), producing logits for each token.

        if targets is None:
            loss = None
        else:
            B, T, C = logits.shape #If targets are provided, this block of code reshapes the logits and targets to have a shape of (B*T, C) and (B*T) respectively.
            logits = logits.view(B*T, C) #It then computes the cross-entropy loss between the logits and targets.
            targets = targets.view(B*T)
            loss = F.cross_entropy(logits, targets)
        return logits, loss

    def generate(self, index, max_new_tokens): #This line defines the generate method, which generates new tokens one at a time.
        for _ in range(max_new_tokens):
            index_cond = index[:, -block_size:] #Keep only the last block_size tokens, since the position embedding table has no entries beyond block_size.
            logits, loss = self.forward(index_cond) #This line passes the cropped context through the model to get the logits.
            logits = logits[:, -1, :] #This line extracts the logits corresponding to the last position.
            probs = F.softmax(logits, dim = -1) #This line applies the softmax function to the logits to obtain probabilities for each token in the vocabulary.
            index_next = torch.multinomial(probs, num_samples=1) #This line samples the next token from the probability distribution using multinomial sampling.
            index = torch.cat((index, index_next), dim=1) #This line appends the sampled token to the input index.
        return index

model = GPTLanguageModel(vocab_size) #This line initializes an instance of the GPTLanguageModel class with the specified vocabulary size (vocab_size).

# Save the model parameters
with open('model-02.pkl', 'wb') as f:
    torch.save(model.state_dict(), f)

print('model saved')

# Load the model parameters
model = GPTLanguageModel(vocab_size)
with open('model-02.pkl', 'rb') as f:
    model.load_state_dict(torch.load(f))
print('model loaded successfully')
model.to(device)

model saved
model loaded successfully




GPTLanguageModel(
  (token_embedding_table): Embedding(91, 384)
  (position_embedding_table): Embedding(128, 384)
  (blocks): Sequential(
    (0): Block(
      (sa): MultiHeadAttention(
        (heads): ModuleList(
          (0-3): 4 x Head(
            (key): Linear(in_features=384, out_features=96, bias=False)
            (query): Linear(in_features=384, out_features=96, bias=False)
            (value): Linear(in_features=384, out_features=96, bias=False)
            (dropout): Dropout(p=0.2, inplace=False)
          )
        )
        (proj): Linear(in_features=384, out_features=384, bias=True)
        (dropout): Dropout(p=0.2, inplace=False)
      )
      (ffwd): FeedForward(
        (net): Sequential(
          (0): Linear(in_features=384, out_features=1536, bias=True)
          (1): ReLU()
          (2): Linear(in_features=1536, out_features=384, bias=True)
          (3): Dropout(p=0.2, inplace=False)
        )
      )
      (ln1): LayerNorm((384,), eps=1e-05, elementwise_affine
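The sampling step inside `generate` (softmax over the last position's logits, then one draw from the resulting distribution) can be sketched without PyTorch. The logits and the helper names `softmax` and `sample_next` below are made up for the illustration; `random.Random.choices` stands in for `torch.multinomial`.

```python
import math
import random

def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next(logits, rng):
    """Draw one token index in proportion to its softmax probability."""
    probs = softmax(logits)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs

rng = random.Random(0)
toy_logits = [0.0, 0.0, 10.0]  # made-up scores for a three-token vocabulary
draws = [sample_next(toy_logits, rng)[0] for _ in range(1000)]
print(draws.count(2), "of 1000 draws picked token 2")
```

Because the draw is random, lower-probability tokens still appear now and then. That is why an undertrained model, whose probabilities are close to flat, produces text that looks like noise.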

# Training Loop

In [26]:
@torch.no_grad() #This decorator disables gradient calculation. It is used to reduce memory consumption and speed up computations during inference, since gradients are not needed.
def estimate_loss():
    out = {} #This line initializes an empty dictionary out to store the average losses for the training and validation splits.
    model.eval() #This line sets the model to evaluation mode using model.eval(). This mode is important because it disables dropout and other training-specific behaviors.
    for split in ['train', 'val']: #This line starts a loop that iterates over the two splits: 'train' and 'val'. This loop will estimate the loss for both the training and validation datasets.
        losses = torch.zeros(eval_iters) #This line initializes a tensor losses with zeros, having a length equal to eval_iters. This tensor will store the loss values for each iteration.
        for k in range(eval_iters): #This nested loop runs for eval_iters iterations. In each iteration:
            X, Y = get_batch(split) #Retrieves a batch of input data (X) and target data (Y) for the given split.
            logits, loss = model(X, Y) #Computes the logits and loss for the batch using the model.
            losses[k] = loss.item() #Stores the computed loss in the losses tensor at index k.
        out[split] = losses.mean() #This line calculates the mean of the losses tensor for the current split and stores it in the out dictionary with the split name as the key.
    model.train() #This line sets the model back to training mode using model.train(). This mode re-enables dropout and other training-specific behaviors.
    return out

In [27]:
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate) #This line initializes an AdamW optimizer with the parameters of the model (model.parameters()) and a specified learning rate (learning_rate). AdamW is a variant of the Adam optimizer with weight decay regularization.

for iter in range(max_iters): #This line starts a loop that iterates from 0 to max_iters-1, where max_iters is the total number of iterations for training.
    if iter % eval_iters == 0: #Inside the loop, this block of code checks if the current iteration (iter) is a multiple of eval_iters. 
        losses = estimate_loss() #If true, it estimates the training and validation losses using the estimate_loss function and prints the current iteration step along with the losses.
        print(f"step:{iter}, train loss:{losses['train']:.3f}, val loss:{losses['val']:.3f}")

    xb, yb = get_batch('train') #This line retrieves a batch of input data (xb) and target data (yb) for training using the get_batch function.

    logits, loss = model.forward(xb, yb) #This line performs a forward pass through the model with the training batch (xb, yb) and computes the logits and loss.
    optimizer.zero_grad(set_to_none=True) #This line clears the gradients of all optimized parameters by setting them to None. This is a more memory-efficient way to zero the gradients compared to setting them to zero.
    loss.backward() #This line computes the gradient of the loss with respect to the model parameters using backpropagation.
    optimizer.step() #This line updates the model parameters based on the computed gradients using the optimizer.
print(loss.item())

with open('model-02.pkl', 'wb') as f: #This block of code opens a file named 'model-02.pkl' in write-binary mode ('wb') and saves the model parameters (the state_dict) with torch.save, matching the loading code above.
    torch.save(model.state_dict(), f)
print('model saved')

step:0, train loss:4.580, val loss:4.570
step:100, train loss:3.003, val loss:3.114
2.608344078063965
model saved
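A quick way to read these numbers: a model that guesses uniformly over the vocabulary has a cross-entropy loss of ln(vocab_size). The sketch below uses a vocabulary size of 91, taken from the `Embedding(91, 384)` line in the model summary, and the step-100 validation loss from the log above. The variable names are made up for the example.

```python
import math

vocab_size_seen = 91  # from Embedding(91, 384) in the model summary
uniform_loss = math.log(vocab_size_seen)
print(f"uniform-guess loss: {uniform_loss:.3f}")  # → 4.511

val_loss = 3.114  # step 100 value from the training log
effective_choices = math.exp(val_loss)
print(f"effective choices per character: {effective_choices:.1f}")
```

The step-0 losses of about 4.58 sit right next to the uniform baseline, as expected for an untrained model. After 200 steps the model is still effectively choosing among roughly 22 characters at each position, which matches the garbled samples it generates.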


In [28]:
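model.eval() #Switch off dropout while generating; estimate_loss leaves the model in training mode.
prompt = " "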
context = torch.tensor(encode(prompt), dtype=torch.long, device=device)
generated_chars = decode(model.generate(context.unsqueeze(0), max_new_tokens=500)[0].tolist())
print(generated_chars)

 mo' ok traolyOkkjrochar5eJ, .


”rera doda t hosWuprhenPtuanb.
lmrei”t, fer a,tepu'

uq: ol
_,x s•ing tenfnait “b1ingned ZrBou aEanenePcese rasoum’lfe tapm tRhe _idusu utpmd t" whee ga$ ithinal- t Rin 
hidar’it
scchiYe. unore n﻿oved the we$3er_Xed,  tho anpdhurshedrGe bFta pis toneJ eXrwree maro.yUc ro fusotat !aly itged Ict o—'ll“ly sano tres st 7o uren het ce,r wome souvg,id

We th Sotche we fot95  atheed thro
 wewe":
*auno?Ber o tu4) caLcy pek yy tis‘7 _a, pthernre4.e her ged,
nDe f?o wroanMo


# A Chat Bot Loop
So I've trained a (rather underwhelming!) chatbot.

This is example code showing how to interact with it.

I'm not going to turn it into an executable .py file because it isn't worth it; the point is just to show how the pieces fit together.

Enjoy this capability in the Jupyter notebook for now. Otherwise, visit <https://chatgpt.com/> for a more robust experience.

In [None]:
while True:
    prompt = input("prompt (type 'exit' to quit):\n")
    if prompt.lower() == 'exit':
        print("Shutting down...")
        break

    context = torch.tensor(encode(prompt), dtype=torch.long, device=device).unsqueeze(0)
    generated_tokens = model.generate(context, max_new_tokens=200)[0].tolist()
    generated_chars = decode(generated_tokens)
    print(f'complete:\n{generated_chars}')

prompt (type 'exit' to quit):
 lo


complete:
lood&yprUy fomeh "er2nsp n-osragahe Woutilt indre ud)eangd 5le

" bevethe Uaut weabr"
4e"
ql Re  Giniw Van tBet ouur theut s sadtn!he t7e the1 tOid, ;soarlkeuaI ,Joy heq nn t Nm.
d
Tl-hothal eareh, anbe


prompt (type 'exit' to quit):
 hi


complete:
hiyburlac
eao theind u‘tremHof ang  foorShaogW arock the2and
"
f'ret6fj &or ad *moa. "me (l
yef a.™dQrJsble scqe‘ awr‘e ﻿rh
e to nnd the ty’ faOd™ey szalad waU re2sathe wsute horgA the;erxerundQnt Jte t


prompt (type 'exit' to quit):
 low


complete:
lowhepd i.0toub tithleH 1of"reyd the the ?ne, akt hhu anayO erd
o
t "omu thes soLhuisdi t ot]"e fre gal tPhex ravy“ w7asi 2ed™ma toD﻿”k39y nleole sewrewely4 Re ne. fu tFan1ov atlerl%u 0ted ge"(in9 do siL
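One practical caveat with the loop above: `encode` looks each character up in a dictionary, so a prompt containing a character that never appeared in the training text raises a `KeyError`. A minimal sketch of one way to guard against that is shown below; `make_safe_encode` and the toy vocabulary are invented for the example, and dropping unknown characters is just one possible policy.

```python
def make_safe_encode(vocab):
    """Build an encoder that skips characters missing from the vocabulary."""
    lookup = {ch: i for i, ch in enumerate(vocab)}

    def safe_encode(s):
        kept = [lookup[c] for c in s if c in lookup]
        dropped = [c for c in s if c not in lookup]
        return kept, dropped

    return safe_encode

# Toy vocabulary standing in for the notebook's `chars` list.
toy_vocab = sorted(set("hello world"))
safe_encode = make_safe_encode(toy_vocab)

kept, dropped = safe_encode("hey!")
print(kept)     # → [3, 2]
print(dropped)  # → ['y', '!']
```

In the chat loop, the same idea would mean encoding the prompt with the guarded function and asking again if nothing is left after filtering, since an empty context gives the model nothing to condition on.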
