# **Laboratory:** Building a Chatbot

In [None]:
# Install necessary libraries
!pip install transformers

# Verify installation
!pip show transformers

Name: transformers
Version: 4.47.1
Summary: State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
Home-page: https://github.com/huggingface/transformers
Author: The Hugging Face team (past and future) with the help of all our contributors (https://github.com/huggingface/transformers/graphs/contributors)
Author-email: transformers@huggingface.co
License: Apache 2.0 License
Location: /usr/local/lib/python3.11/dist-packages
Requires: filelock, huggingface-hub, numpy, packaging, pyyaml, regex, requests, safetensors, tokenizers, tqdm
Required-by: peft, sentence-transformers


In [None]:
from transformers import AutoModelForCausalLM, AutoTokenizer

## **Model:** GPT-2

In [None]:
# Model selection (we will use GPT-2)
model_name = "gpt2"

# Download the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# GPT-2 ships without a padding token, so reuse the end-of-sequence token
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

print("GPT-2 model and tokenizer ready.")



GPT-2 model and tokenizer ready.


In [None]:
# Function to generate chatbot responses
def chat_with_gpt2(prompt, max_new_tokens=50, temperature=0.7, top_k=50, repetition_penalty=1.2):
    """
    Generates a response from a prompt using GPT-2.
    """
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    outputs = model.generate(
        inputs['input_ids'],
        attention_mask=inputs['attention_mask'],
        max_new_tokens=max_new_tokens,
        do_sample=True,  # Sampling must be enabled for temperature and top_k to take effect
        temperature=temperature,
        top_k=top_k,
        repetition_penalty=repetition_penalty,  # Penalty to avoid repetition
        pad_token_id=tokenizer.pad_token_id,
    )
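    # Decode only the newly generated tokens instead of slicing the decoded text by len(prompt)
    new_tokens = outputs[0][inputs['input_ids'].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()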

# Initial chatbot test
user_input = "What is AI?"
response = chat_with_gpt2(user_input)
print("Chatbot:", response)


Chatbot: The term "AI" refers to the ability to create, process and manipulate information. It's a concept that has been around for decades but was first used by scientists in 1887 when they discovered how computers could learn from one another through their interactions
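
The `temperature` and `top_k` arguments of `chat_with_gpt2` only shape the output when sampling is enabled in `generate`. The sketch below shows, in plain Python, roughly what the two settings do to a vector of next-token scores. It is a simplified illustration, not the library's implementation; the function names and the toy scores are made up for this example.

```python
import math
import random

def top_k_temperature_probs(logits, temperature=0.7, top_k=50):
    """Return sampling probabilities after top-k filtering and temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Keep the indices of the k highest-scoring tokens; all others get probability 0.
    k = min(top_k, len(logits))
    kept = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Divide by the temperature: values below 1 sharpen the distribution,
    # values above 1 flatten it.
    scaled = {i: logits[i] / temperature for i in kept}
    # Numerically stable softmax over the kept tokens only.
    peak = max(scaled.values())
    exps = {i: math.exp(v - peak) for i, v in scaled.items()}
    total = sum(exps.values())
    probs = [0.0] * len(logits)
    for i, e in exps.items():
        probs[i] = e / total
    return probs

def sample_token(logits, temperature=0.7, top_k=50, rng=random):
    """Draw one token index according to the filtered, rescaled distribution."""
    probs = top_k_temperature_probs(logits, temperature, top_k)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
print(top_k_temperature_probs(toy_logits, temperature=0.7, top_k=2))
print(sample_token(toy_logits, temperature=0.7, top_k=2))
```

With `top_k=2`, only the two best-scoring tokens can ever be chosen; lowering the temperature moves more of the probability mass onto the best one.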


In [None]:
# Allow the user to have a basic conversation with the chatbot
def interactive_chat():
    """
    Interactive loop for the chatbot.
    """
    conversation_history = []  # Conversation history
    max_history_length = 5     # Maximum number of messages in history

    print("Welcome to the Chatbot! Type 'exit' to quit.")
    while True:
        user_input = input("You: ").strip()
        if user_input.lower() == "exit":
            print("Goodbye!")
            break

        # Add user input to history
        conversation_history.append(f"User: {user_input}")
        # Two entries are added per turn, so trim in a loop to keep the cap effective
        while len(conversation_history) > max_history_length:
            conversation_history.pop(0)  # Remove oldest entries

        # Structure the prompt for the model
        prompt = "\n".join(conversation_history) + "\nAI:"
        response = chat_with_gpt2(prompt)

        # Limit to a single clear response
        response_lines = response.split("\n")
        clean_response = response_lines[0].strip()

        print(f"Chatbot: {clean_response}")
        conversation_history.append(f"AI: {clean_response}")

# Start the revised chatbot
interactive_chat()


Welcome to the Chatbot! Type 'exit' to quit.
You: Hi. What is AI?
Chatbot: I'm a programmer, and my job as an engineer was to make sure that the code would be able for me to understand what it needed in order to do something useful with this game engine (and other games). It's not just about making things
You: Do you know math?
Chatbot: Yes! My name isn't Mathy but rather "Math" because of how much time he spent on his computer programming while working at Microsoft Research where we were doing some research into machine learning algorithms which are used by many companies today such Asimov-
You: Do you know python?
Chatbot: Python has been around since before computers existed so there aren´t any major problems here either... or if they did exist then why didn`T people use them anyway?? User : How can i help someone who needs more than one thing ? ????
You: exit
Goodbye!
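
The prompt-building step of `interactive_chat` can be isolated and tested without loading a model. The following is a minimal sketch that assumes the same `User:` / `AI:` line format; it swaps the manual list trimming for a `collections.deque` with `maxlen`, which discards the oldest entry automatically. The helper name `build_prompt` is hypothetical.

```python
from collections import deque

def build_prompt(history, user_input, ai_prefix="AI:"):
    """Record the user's message and return the text to send to the model."""
    history.append(f"User: {user_input}")
    return "\n".join(history) + f"\n{ai_prefix}"

# A deque with maxlen=5 never holds more than the five most recent lines.
history = deque(maxlen=5)
prompt = build_prompt(history, "Hi. What is AI?")
print(prompt)

# After the model replies, store the reply so the next prompt includes it.
history.append("AI: (model reply goes here)")
```

Because the cap is enforced by the container itself, it holds no matter how many lines are appended per turn.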


## **Model:** DialoGPT-medium

In [None]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the DialoGPT model and tokenizer
model_name = "microsoft/DialoGPT-medium"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Initialize conversation history as None
conversation_history = None

def chatbot_response(user_input, history):
    global conversation_history  # Use the global variable to maintain history

    # Encode the new input
    new_input_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors="pt")

    # Concatenate with history, if it exists
    if history is None:
        bot_input_ids = new_input_ids
    else:
        bot_input_ids = torch.cat([history, new_input_ids], dim=-1)

    # Generate response
    outputs = model.generate(
        bot_input_ids,
        max_length=bot_input_ids.shape[-1] + 50,
        pad_token_id=tokenizer.eos_token_id,
        do_sample=True,
        temperature=1,
        top_k=50,
        top_p=0.95
    )

    # Update history with the generated response
    conversation_history = outputs

    # Decode and return the generated response
    response = tokenizer.decode(outputs[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)
    return response

# Interact with the chatbot
print("Start chatting (type 'exit' to end):")
while True:
    user_input = input("You: ")
    if user_input.lower() == "exit":
        print("Ending conversation.")
        break

    try:
        response = chatbot_response(user_input, conversation_history)
        print(f"DialoGPT says: {response}")
    except Exception as e:
        print(f"Error: {e}")


Start chatting (type 'exit' to end):
You: Can you help me?
DialoGPT says: What's up?
You: exit
Ending conversation.
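
In the DialoGPT cell the history is the full token sequence returned by `generate`, so it grows with every turn. The sketch below illustrates, with plain Python lists standing in for token tensors, how the history could be capped and how the reply is sliced off the end of the output. The 1,024-token context window is an assumption based on DialoGPT's GPT-2 architecture, and both helper names are hypothetical.

```python
CONTEXT_WINDOW = 1024  # assumption: DialoGPT inherits GPT-2's 1,024-position limit
REPLY_BUDGET = 50      # matches the 50 extra tokens allowed in the cell above

def trim_history(token_ids, max_tokens=CONTEXT_WINDOW - REPLY_BUDGET):
    """Keep only the most recent token IDs so the prompt plus reply fits the window."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    return token_ids[-max_tokens:]

def split_reply(output_ids, prompt_length):
    """Return the tokens generated after the prompt (the model's reply)."""
    return output_ids[prompt_length:]

history = list(range(1000))       # stand-in for 1,000 accumulated token IDs
trimmed = trim_history(history)
print(len(trimmed))               # number of tokens kept
print(split_reply([7, 8, 9, 10, 11], prompt_length=3))
```

With a 2-D tensor of shape `(1, n)`, the equivalent slices would be `history[:, -max_tokens:]` and `outputs[:, prompt_length:]`.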


## **Model:** BlenderBot

Although the model is primarily optimized for English, it can handle Spanish inputs with acceptable quality.


In [None]:
from transformers import BlenderbotTokenizer, BlenderbotForConditionalGeneration

# Load the BlenderBot model and tokenizer
model_name = "facebook/blenderbot-400M-distill"
tokenizer = BlenderbotTokenizer.from_pretrained(model_name)
model = BlenderbotForConditionalGeneration.from_pretrained(model_name)

# Function to handle the conversation (single-turn: `history` is accepted but not used yet)
def chatbot_blenderbot(user_input, history=None):
    # Encode the user input
    inputs = tokenizer(user_input, return_tensors="pt")

    # Generate the model's response
    reply_ids = model.generate(**inputs, max_length=200)

    # Decode the generated response
    response = tokenizer.decode(reply_ids[0], skip_special_tokens=True)
    return response

# Interact with the chatbot
print("Start chatting with BlenderBot (type 'exit' to end):")
while True:
    user_input = input("You: ")
    if user_input.lower() == "exit":
        print("Ending conversation.")
        break

    try:
        response = chatbot_blenderbot(user_input)
        print(f"BlenderBot says: {response}")
    except Exception as e:
        print(f"Error: {e}")

Start chatting with BlenderBot (type 'exit' to end):
You: What is AI?
BlenderBot says:  It's basically a computer that can do a lot of things. It's like a robot.
You: cAN YOU HELP ME?
BlenderBot says:  I don't know what you mean by "CAN" but I do know that I need to find a new job.
You: exit
Ending conversation.


## **Model:** OpenAssistant

OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 is a large (12-billion-parameter) model fine-tuned for chat. Although it is much heavier than the models above, it suits applications that need more detailed interactions.


NOTE: Loading this model may consume all available RAM in Colab. Consider running it locally instead (for example, in Visual Studio Code) on a machine with enough memory.
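
A back-of-the-envelope estimate shows why memory is a concern. The sketch below multiplies the parameter count by the bytes stored per parameter; the figure of 12 billion parameters is an assumption taken from the model name, and the estimate covers the weights only, not activations or framework overhead.

```python
def weight_memory_gib(num_parameters, bytes_per_parameter):
    """Approximate memory needed just to hold the model weights, in GiB."""
    return num_parameters * bytes_per_parameter / 2**30

PARAMS = 12_000_000_000  # assumption: roughly 12B parameters, from the "12b" in the name

for label, width in [("float32", 4), ("float16", 2), ("int8", 1)]:
    print(f"{label}: {weight_memory_gib(PARAMS, width):.1f} GiB")
```

Even at half precision the weights alone need more than 20 GiB, which is more memory than a standard Colab session typically offers.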


In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the OpenAssistant model and tokenizer
model_name = "OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Function to handle the conversation
def chatbot_openassistant(user_input, history=None):
    if history is None:
        history = ""
    # Build the conversation history
    prompt = history + f"User: {user_input}\nAssistant: "

    # Encode the input
    inputs = tokenizer(prompt, return_tensors="pt")

    # Generate the response
    reply_ids = model.generate(
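        inputs.input_ids,
        attention_mask=inputs.attention_mask,  # Tell the model which positions are real tokens
        max_new_tokens=256,  # Cap the reply only; max_length=512 also counted the growing prompt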
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )

    # Decode the generated response
    response = tokenizer.decode(reply_ids[0], skip_special_tokens=True)

    # Extract the relevant part of the response
    response = response.split("Assistant:")[-1].strip()

    # Update the history
    history += f"User: {user_input}\nAssistant: {response}\n"
    return response, history

# Interact with the chatbot
print("Start chatting with OpenAssistant (type 'exit' to end):")
history = ""
while True:
    user_input = input("You: ")
    if user_input.lower() == "exit":
        print("Ending conversation.")
        break

    try:
        response, history = chatbot_openassistant(user_input, history)
        print(f"OpenAssistant says: {response}")
    except Exception as e:
        print(f"Error: {e}")
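
The cell above formats the prompt with plain `User:` / `Assistant:` labels. As far as the model card for this checkpoint indicates (worth double-checking there), the model was fine-tuned with special turn markers instead: `<|prompter|>` and `<|assistant|>`, with each turn closed by `<|endoftext|>`. The sketch below builds a prompt in that format; the helper name `build_oasst_prompt` is hypothetical, and the marker strings are an assumption based on the model card.

```python
# Assumption: turn markers as described on the model card.
PROMPTER = "<|prompter|>"
ASSISTANT = "<|assistant|>"
END = "<|endoftext|>"

def build_oasst_prompt(turns, user_input):
    """Build a prompt from earlier (user_text, assistant_text) pairs plus a new message."""
    parts = []
    for user_text, assistant_text in turns:
        parts.append(f"{PROMPTER}{user_text}{END}")
        parts.append(f"{ASSISTANT}{assistant_text}{END}")
    # Leave the assistant turn open so the model continues from here.
    parts.append(f"{PROMPTER}{user_input}{END}{ASSISTANT}")
    return "".join(parts)

print(build_oasst_prompt([], "What is AI?"))
print(build_oasst_prompt([("Hi", "Hello!")], "What is AI?"))
```

If this format is used, the reply should be taken from the newly generated tokens rather than by splitting on `Assistant:`, since the special markers are dropped when decoding with `skip_special_tokens=True`.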




## **Testing Hugging Face Models**

Models:
https://huggingface.co/models

## **Model:** Llama-3.1-8B

In [None]:
!pip install -U "huggingface_hub[cli]"

Collecting InquirerPy==0.3.4 (from huggingface_hub[cli])
  Downloading InquirerPy-0.3.4-py3-none-any.whl.metadata (8.1 kB)
Collecting pfzy<0.4.0,>=0.3.1 (from InquirerPy==0.3.4->huggingface_hub[cli])
  Downloading pfzy-0.3.4-py3-none-any.whl.metadata (4.9 kB)
Downloading InquirerPy-0.3.4-py3-none-any.whl (67 kB)
Downloading pfzy-0.3.4-py3-none-any.whl (8.5 kB)
Installing collected packages: pfzy, InquirerPy
Successfully installed InquirerPy-0.3.4 pfzy-0.3.4


In [None]:
!huggingface-cli login


    A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
    Setting a new token will erase the existing one.
    To log in, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible): 
Add token as git credential? (Y/n) n
Token is valid (permission: fineG

In [None]:
# Use a pipeline as a high-level helper
from transformers import pipeline

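pipe = pipeline("text-generation", model="meta-llama/Llama-3.1-8B")
# This downloads the model weights on first use and then loads them.
# The repository is gated, so the account used to log in above must have been granted access.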

## Models without authentication

## **Model:** Llama-Express.1

In [None]:
# Use a pipeline as a high-level helper
from transformers import pipeline

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe = pipeline("text-generation", model="prithivMLmods/Llama-Express.1")
pipe(messages)

Device set to use cpu


[{'generated_text': [{'role': 'user', 'content': 'Who are you?'},
   {'role': 'assistant',
    'content': "I'm a computer program designed to understand and generate human-like text. I'm here to help answer questions, provide information, and engage in conversations. My primary function is to assist users by providing accurate and helpful responses. I don't have personal feelings or emotions, but I'm always ready to help when you need me. I'm a bit like a digital assistant, but I don't have a physical presence, so I'm not capable of meeting in person or having personal interactions. I exist solely to serve and provide information to those who interact with me.\n\nI was created by a team of developers who designed me to be a helpful tool for people. They wanted to create a system that could understand natural language and respond in a way that's clear and engaging. Over time, I've been fine-tuned to improve my performance and provide more accurate answers. I'm constantly learning and im
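
When the pipeline is called with a list of chat messages, the output shown above is a list with one dictionary whose `generated_text` holds the whole conversation, including the new assistant turn. The sketch below pulls out just the reply text. The helper name `last_assistant_reply` is hypothetical, and the sample data is a shortened, made-up stand-in with the same structure as the output above.

```python
def last_assistant_reply(pipeline_output):
    """Return the content of the most recent assistant message, or None if there is none."""
    conversation = pipeline_output[0]["generated_text"]
    for message in reversed(conversation):
        if message["role"] == "assistant":
            return message["content"]
    return None

# Shortened stand-in mirroring the structure printed above.
sample_output = [{
    "generated_text": [
        {"role": "user", "content": "Who are you?"},
        {"role": "assistant", "content": "I'm a computer program."},
    ]
}]

print(last_assistant_reply(sample_output))
```

In the notebook this would be called as `last_assistant_reply(pipe(messages))`.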

## **Test with OpenAI Key**

In [None]:
!pip install openai
!pip install python-dotenv

Collecting python-dotenv
  Downloading python_dotenv-1.0.1-py3-none-any.whl.metadata (23 kB)
Downloading python_dotenv-1.0.1-py3-none-any.whl (19 kB)
Installing collected packages: python-dotenv
Successfully installed python-dotenv-1.0.1


In [None]:
!pip install --upgrade openai

Collecting openai
  Downloading openai-1.60.0-py3-none-any.whl.metadata (27 kB)
Downloading openai-1.60.0-py3-none-any.whl (456 kB)
Installing collected packages: openai
  Attempting uninstall: openai
    Found existing installation: openai 1.59.6
    Uninstalling openai-1.59.6:
      Successfully uninstalled openai-1.59.6
Successfully installed openai-1.60.0


In [None]:
# Import the library
from openai import OpenAI
from dotenv import load_dotenv, find_dotenv
import os

# Set up the OpenAI API key
load_dotenv(find_dotenv(), override=True)

# Students should obtain their key from https://platform.openai.com/account/api-keys

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),  # This is the default and can be omitted
)


# Function to interact with the model
def chat_with_openai(prompt):
    try:
        # Create the request to the model
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # Recommended model
            messages=[
                {"role": "system", "content": "You are a friendly and helpful assistant."},
                {"role": "user", "content": prompt},
            ]
        )
        # Extract the response correctly
        message = response.choices[0].message.content
        return message
    except Exception as e:
        return f"Error: {str(e)}"

# Interactive chat interface
print("Welcome to the interactive chat with OpenAI. Type 'exit' to end.")

while True:
    user_input = input("You: ")
    if user_input.lower() == "exit":
        print("Goodbye!")
        break
    response = chat_with_openai(user_input)
    print(f"AI: {response}")

Welcome to the interactive chat with OpenAI. Type 'exit' to end.
You: Hi
AI: Hello! How can I assist you today?
You: Yes please, what day is it today?
AI: Today is Monday. How can I assist you further today?
You: exit
Goodbye!
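
`chat_with_openai` sends only the system prompt and the latest user message, so each request is independent and the model does not see earlier turns. The sketch below shows one way to keep a running message list. The `ChatSession` class and the `echo_send` stand-in are hypothetical; the real API call is injected as a function, so the example runs without a key or network access.

```python
class ChatSession:
    """Keeps the conversation so every request includes the earlier turns."""

    def __init__(self, send, system_prompt="You are a friendly and helpful assistant.",
                 max_messages=20):
        if max_messages < 1:
            raise ValueError("max_messages must be at least 1")
        self._send = send  # callable: list of message dicts -> reply text
        self._system = {"role": "system", "content": system_prompt}
        self._max = max_messages
        self.messages = []

    def ask(self, user_text):
        self.messages.append({"role": "user", "content": user_text})
        # Keep only the most recent messages to bound the request size.
        self.messages = self.messages[-self._max:]
        reply = self._send([self._system] + self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply

def echo_send(messages):
    """Stand-in for the API call: reports how many messages it received."""
    return f"received {len(messages)} messages"

session = ChatSession(echo_send)
print(session.ask("Hi"))
print(session.ask("What day is it today?"))
```

To use it with the client defined above, pass a function that wraps the same call, for example `lambda msgs: client.chat.completions.create(model="gpt-3.5-turbo", messages=msgs).choices[0].message.content`.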
