# Lesson 2: Natural Language Processing (NLP)

NLP is a field of linguistics and machine learning focused on everything related to human language.

The field has seen significant progress thanks to the transformer architecture, introduced in the well-known 2017 paper "Attention Is All You Need". Since then, this architecture has become the core of many state-of-the-art machine learning models.

In this lesson, we will use the `transformers` library and, in particular, its `pipeline` function.

- In the classroom, the libraries are already installed for you.
- If you would like to run this code on your own machine, you can install the following:
```
    !pip install transformers
```

### Build the `chatbot` pipeline using 🤗 Transformers Library

In [1]:
# Here is some code that suppresses warning messages.
from transformers.utils import logging
logging.set_verbosity_error()

In [2]:
# Import the pipeline function from the transformers library
from transformers import pipeline

- Define the conversation pipeline

In [4]:
# We create a text2text-generation pipeline using a model from Facebook called BlenderBot
chatbot = pipeline("text2text-generation", model="facebook/blenderbot-400M-distill")

# We use this model because it is small and still performs quite well.
# It needs only about 1.6 GB to load.

We cannot use Llama 2 or any other large model here, since we would not be able to load it.
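
As a rough sanity check on these sizes, the memory taken by a model's weights can be estimated as parameter count times bytes per parameter. The sketch below assumes nominal parameter counts (400M for the BlenderBot checkpoint, 7B for the smallest Llama 2 variant) and 32-bit floats. `estimate_weight_memory_gb` is a hypothetical helper, not part of any library, and real usage will be higher once activations and framework overhead are included.

```python
def estimate_weight_memory_gb(num_params: int, bytes_per_param: int = 4) -> float:
    """Rough size of the model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Nominal parameter counts (assumptions, not measured values)
blenderbot_gb = estimate_weight_memory_gb(400_000_000)
llama2_7b_gb = estimate_weight_memory_gb(7_000_000_000)

print(f"blenderbot-400M-distill: ~{blenderbot_gb:.1f} GB")
print(f"Llama 2 7B: ~{llama2_7b_gb:.1f} GB")
```

Under these assumptions the larger model needs more than ten times the memory, which is consistent with the decision to use the small distilled model.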

Info about ['blenderbot-400M-distill'](https://huggingface.co/facebook/blenderbot-400M-distill)

In [6]:
# Initialize chat history
chat_history = [
    {"role": "system", "content": "You are a helpful travel assistant."}
]

In [7]:
# Key in first user input
user_input = """
What are some fun activities I can do in the winter?
"""
# Append the first user input to the chat history
chat_history.append({"role": "user", "content": user_input})

In [8]:
conversation = chatbot(user_input)


In [9]:
print(conversation[0]['generated_text'])

 I like snowboarding and skiing.  What do you like to do in winter?


In [10]:
print(chatbot("What else do you recommend?"))

[{'generated_text': " Well, I'm not sure what else I can think of, but I do know that I'm going to miss her."}]


In [11]:
chat_history

[{'role': 'system', 'content': 'You are a helpful travel assistant.'},
 {'role': 'user',
  'content': '\nWhat are some fun activities I can do in the winter?\n'}]

- You can continue the conversation with the chatbot with:
```
print(chatbot("What else do you recommend?"))
```
- However, the chatbot may provide an unrelated response because it does not have memory of any prior conversations.

- To include prior conversation in the LLM's context, you can pass the previous chat history along with each new message.
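
One way to do this, sketched below, is to flatten the list of role/content dictionaries into a single prompt string before handing it to the pipeline. `format_history` is a hypothetical helper, and the `Role: content` layout is an assumption made for illustration rather than a format the model is known to have been trained on.

```python
def format_history(chat_history):
    """Flatten a list of {"role", "content"} dicts into one prompt string."""
    lines = [
        f"{msg['role'].capitalize()}: {msg['content'].strip()}"
        for msg in chat_history
    ]
    # End with "Assistant:" to cue the model to answer next
    return "\n".join(lines) + "\nAssistant:"


example_history = [
    {"role": "system", "content": "You are a helpful travel assistant."},
    {"role": "user", "content": "What are some fun activities I can do in the winter?"},
]
prompt = format_history(example_history)
print(prompt)
```

The resulting `prompt` is a plain string, which is the input type a text2text-generation pipeline accepts.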

In [32]:
# Initialize the chat history with a system message
chat_history = [
    {"role": "system", "content": "You are a helpful travel assistant."}
]

In [None]:
# Key in first user input
user_input = """
What are some fun activities I can do in the winter?
"""
# No manual append here: chat_with_bot (defined below) appends each user message itself

In [43]:
def chat_with_bot(chat_history, user_input):
    """
    chat_history: list of dict, each dict has {"role": "...", "content": "..."}
    user_input: string (the user message)
    returns: (response_text, updated_chat_history)
    """
    # Append the user message as a dict
    chat_history.append({"role": "user", "content": user_input})

    # Convert the chat history (dicts) into a single formatted string
    formatted_history = "\n".join([
        f"{msg['role'].capitalize()}: {msg['content']}"
        for msg in chat_history
    ])
    # We'll add "Assistant:" to prompt the model
    input_text = formatted_history + "\nAssistant:"

    # Generate a response
    response_text = chatbot(
        input_text, max_length=100, truncation=True
    )[0]["generated_text"]

    # Append the assistant's response
    chat_history.append({"role": "assistant", "content": response_text})

    return response_text, chat_history
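
Because the full history is resent on every turn, the prompt grows with each exchange and can eventually exceed what the model accepts as input. A minimal sketch of one mitigation, keeping the system message plus only the most recent messages, is shown below. `trim_history` and `max_messages` are hypothetical names, and the cutoff of 4 is arbitrary.

```python
def trim_history(chat_history, max_messages=4):
    """Keep all system messages plus the last `max_messages` other messages."""
    system_msgs = [m for m in chat_history if m["role"] == "system"]
    other_msgs = [m for m in chat_history if m["role"] != "system"]
    if max_messages <= 0:
        # Guard: other_msgs[-0:] would return the whole list
        return system_msgs
    return system_msgs + other_msgs[-max_messages:]


# Build a sample history with three question/answer exchanges
sample_history = [{"role": "system", "content": "You are a helpful travel assistant."}]
for i in range(1, 4):
    sample_history.append({"role": "user", "content": f"question {i}"})
    sample_history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(sample_history, max_messages=4)
print([m["content"] for m in trimmed])
```

The trimmed list could then be passed to `chat_with_bot` in place of the full history.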


In [44]:
user_message = "Hello, how are you?"
response, chat_history = chat_with_bot(chat_history, user_message)

TypeError: string indices must be integers, not 'str'

- This is the error `msg['role']` raises when an item in `chat_history` is a plain string rather than a dict, which can happen if the list still holds values from earlier runs. Re-running the cell that initializes `chat_history` before calling `chat_with_bot` is a likely fix.

In [42]:
# Example conversation
response, chat_history = chat_with_bot(chat_history, "What else do you recommend?")
print(response)

In [38]:
response, chat_history = chat_with_bot(chat_history, user_input)

In [29]:
# Example conversation
response, chat_history = chat_with_bot(chat_history, "Hi")
print(response)

In [24]:
chatbot.add_message(
    {"role": "user",
     "content": """
What else do you recommend?
"""
    })

AttributeError: 'Text2TextGenerationPipeline' object has no attribute 'add_message'

- The pipeline object has no `add_message` method. The conversation state lives in the `chat_history` list, so new messages are added with `chat_history.append(...)` instead.

In [25]:
print(conversation)

[{'generated_text': ' I like snowboarding and skiing.  What do you like to do in winter?'}]


In [26]:
conversation = chatbot(conversation)

print(conversation)

ValueError:  `args[0]`: {'generated_text': ' I like snowboarding and skiing.  What do you like to do in winter?'} have the wrong format. The should be either of type `str` or type `list`
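
The pipeline returns a list of dictionaries, while its input must be a string (or a list of strings), which is why passing `conversation` straight back in fails. A small sketch of pulling the reply text out first follows. `extract_reply` is a hypothetical helper, and `sample_output` copies the output printed above rather than calling the model.

```python
def extract_reply(pipeline_output):
    """Return the stripped text of the first generated sequence."""
    return pipeline_output[0]["generated_text"].strip()


# Copied from the printed pipeline output above (not a fresh model call)
sample_output = [
    {"generated_text": " I like snowboarding and skiing.  What do you like to do in winter?"}
]
reply = extract_reply(sample_output)
print(reply)
```

Here `reply` is a plain string, so it can be appended to `chat_history` or passed to `chatbot(...)` as input.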

- [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
- [LMSYS Chatbot Arena Leaderboard](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard)

### Try it yourself! 
- Try chatting with the model!