# Exploring Generative AI Libraries

## Text generation before transformers

1. N-gram language models: They predict the next word in a sentence from the n-1 words that came before it, using counts of word sequences seen in training text. For example, given "The sky is", such a model might predict that the next word is "blue".
2. Recurrent neural networks (RNNs): They process a sequence one element at a time and carry a 'memory', or hidden state, from each step to the next through a recurrent loop. This lets RNNs capture the temporal dependencies in sequential data. For example, given the sentence "I love RNNs", the network reads it word by word. It ingests "I", generates an output, and updates its hidden state. It then processes "love" together with the updated hidden state, which already holds information about "I", and updates the hidden state again. This pattern continues until the last word, by which point the hidden state ideally summarises the entire sentence.
3. Long short-term memory (LSTM) networks and gated recurrent units (GRUs): These are RNN variants that add gating mechanisms to control what information is kept or discarded. The gates address the difficulty plain RNNs have with long-range dependencies (the vanishing gradient problem) and improve their ability to model sequential data.
4. Seq2seq models with attention: Often built with RNNs or LSTMs, they handle tasks such as translation, where an input sequence is transformed into an output sequence. The attention mechanism allows the model to "focus" on the relevant parts of the input sequence while generating each part of the output.
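
The first two approaches in the list above can be illustrated with a minimal sketch in plain Python. The toy corpus, the scalar "embeddings" and the weights `w_x` and `w_h` are made-up values chosen for illustration only; a real RNN uses learned weight matrices and vector-valued hidden states.

```python
import math
from collections import Counter, defaultdict


def train_bigram(corpus):
    """Count, for each word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus: "is" is followed by "blue" twice and "deep" once.
toy_corpus = [
    "the sky is blue",
    "the sky is blue today",
    "the sea is deep",
]
bigram_counts = train_bigram(toy_corpus)
print(predict_next(bigram_counts, "is"))  # blue


def rnn_step(x, h, w_x, w_h):
    """One recurrent update with scalar input and state: h' = tanh(w_x * x + w_h * h)."""
    return math.tanh(w_x * x + w_h * h)


# Hypothetical one-dimensional stand-ins for word embeddings.
toy_inputs = {"I": 0.5, "love": 1.0, "RNNs": -0.3}

hidden = 0.0  # the hidden state starts empty
hidden_states = []
for word in ["I", "love", "RNNs"]:
    # Each step mixes the current word with the state left by the previous words.
    hidden = rnn_step(toy_inputs[word], hidden, w_x=0.8, w_h=0.5)
    hidden_states.append(hidden)

print(hidden_states)
```

The bigram model only ever looks one word back, whereas the final value in `hidden_states` depends on all three words, which is the property the RNN description relies on.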

## Transformers

Key steps include:

- Tokenization: The first step is breaking down a sentence into tokens (words or subwords).
- Embedding: Each token is represented as a vector, capturing its meaning.
- Self-attention: For each token, the model computes scores that measure how relevant every other token in the sequence is to it. These scores are used to weight the input tokens and produce a new representation of the sequence. For instance, in the sentence "He gave her a gift because she'd helped him", understanding who "her" refers to requires the model to pay attention to other words in the sentence. The transformer does this for every word, considering the entire context, which is what makes it effective at capturing meaning.
- Feed-forward neural networks: After attention, each position is passed through a feed-forward network separately.
- Output sequence: The model produces an output sequence, which can be used for various tasks, like classification, translation, or text generation.
- Layering: Importantly, transformers are deep models with multiple layers of attention and feed-forward networks, allowing them to learn complex patterns.
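
The self-attention step can be sketched in plain Python as follows. This is a simplified illustration, not the full transformer layer: it uses the token embeddings directly as queries, keys and values (a real model applies learned projections and multiple heads), and the two-dimensional `toy_embeddings` are invented values.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    largest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings):
    """Scaled dot-product self-attention with queries = keys = values = embeddings."""
    dim = len(embeddings[0])
    weights = []
    outputs = []
    for query in embeddings:
        # Score every token against the current one, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in embeddings
        ]
        row = softmax(scores)
        weights.append(row)
        # New representation: weighted average of all token vectors.
        outputs.append(
            [sum(w * value[i] for w, value in zip(row, embeddings)) for i in range(dim)]
        )
    return weights, outputs


tokens = ["he", "gave", "her", "a", "gift"]
# Hypothetical 2-dimensional embeddings, one per token.
toy_embeddings = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.2, 0.2], [0.1, 0.9]]

attn_weights, attn_outputs = self_attention(toy_embeddings)
for token, row in zip(tokens, attn_weights):
    print(token, [round(w, 2) for w in row])
```

Each printed row shows how strongly one token attends to every token in the sentence, and each row sums to 1.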

## Implementation: Building a simple chatbot with transformers

### Step 1: Installing libraries

In [14]:
!pip install tf-keras

Collecting tf-keras
  Downloading tf_keras-2.17.0-py3-none-any.whl.metadata (1.6 kB)
Downloading tf_keras-2.17.0-py3-none-any.whl (1.7 MB)
Installing collected packages: tf-keras
Successfully installed tf-keras-2.17.0

### Step 2: Importing the required tools from the transformers library

In [None]:
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM
import sentencepiece


In [15]:
# Load the model and tokeniser

model_name = "facebook/blenderbot-400M-distill"

model = TFAutoModelForSeq2SeqLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

All model checkpoint layers were used when initializing TFBlenderbotForConditionalGeneration.

Some layers of TFBlenderbotForConditionalGeneration were not initialized from the model checkpoint at facebook/blenderbot-400M-distill and are newly initialized: ['final_logits_bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.



In [24]:
# Define the chat function

def chat_with_bot():
    while True:
        # Get user input
        input_text = input("You: ")

        # Exit conditions
        if input_text.lower() in ["quit", "exit", "bye"]:
            print("Chatbot: Goodbye!")
            break

        # Tokenise input and generate a response.
        # return_tensors="tf" matches the TensorFlow model loaded above;
        # "pt" would require PyTorch to be installed and raises an ImportError without it.
        inputs = tokenizer.encode(input_text, return_tensors="tf")
        outputs = model.generate(inputs, max_new_tokens=150)
        response = tokenizer.decode(outputs[0], skip_special_tokens=True).strip()

        # Display bot's response
        print("Chatbot:", response)

In [25]:
# Start chatting

chat_with_bot()

### Step 3: Trying another language model and comparing the output

In [None]:
model_name = "google/flan-t5-base"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForSeq2SeqLM.from_pretrained(model_name)
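
To compare the two models, one option is a small helper that sends the same prompt to each of them. The sketch below is an assumption about how that comparison might be organised, not part of the original notebook: `generate_reply` and `compare_models` are hypothetical names, and the helper expects each model to be kept under its own variable, since the cell above overwrites `model` and `tokenizer`.

```python
def generate_reply(model, tokenizer, text, max_new_tokens=150):
    """Generate a single reply to `text` with a TensorFlow seq2seq model."""
    # "tf" tensors match models loaded with TFAutoModelForSeq2SeqLM.
    inputs = tokenizer.encode(text, return_tensors="tf")
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True).strip()


def compare_models(candidates, prompt):
    """Return {label: reply} for each (model, tokenizer) pair in `candidates`."""
    return {
        label: generate_reply(model, tokenizer, prompt)
        for label, (model, tokenizer) in candidates.items()
    }


# Hypothetical usage, assuming both models were loaded under separate names:
# replies = compare_models(
#     {
#         "blenderbot": (blender_model, blender_tokenizer),
#         "flan-t5": (flan_model, flan_tokenizer),
#     },
#     "What is your favourite food?",
# )
# for label, reply in replies.items():
#     print(label, "->", reply)
```

Because FLAN-T5 is tuned to follow instructions while BlenderBot is tuned for open-ended conversation, the replies to the same prompt may differ noticeably in style.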