## Natural Language Processing (NLP)

### Definition

NLP is a field of linguistics and machine learning focused on everything related to human language.

The field has made significant progress thanks to the **transformer architecture**, introduced in the well-known 2017 paper **Attention Is All You Need**. Since then, this architecture has become the core of many state-of-the-art ML models.

State-of-the-art (SOTA) ML models are the most advanced machine learning models currently available.

In this cookbook, we will use the `transformers` library, and in particular its `pipeline` function.

In [3]:
from transformers import pipeline

In [4]:
chatbot = pipeline(task = "text2text-generation", 
                   model = "facebook/blenderbot-400M-distill")

Device set to use mps:0


In [5]:
user_message = """
What are some fun activaties I can do in the winter?
"""

In [6]:
conversation = []

In [7]:
conversation.append(f"User: {user_message}")

In [8]:
response = chatbot(user_message)[0]['generated_text']

In [9]:
response

' I love snowboarding and skiing.  What activities do you like to do?'

In [10]:
conversation.append(f"Bot: {response}")

In [11]:
conversation

['User: \nWhat are some fun activaties I can do in the winter?\n',
 'Bot:  I love snowboarding and skiing.  What activities do you like to do?']

### NLP applications

Here are some common NLP tasks:

- Text generation
- Sentence similarity
- Summarization
- Machine translation

In [12]:
user_message = "What else do you recommend?"

In [17]:
response = chatbot(user_message)[0]['generated_text']

In [18]:
conversation.append(f"User: {user_message}")

In [19]:
conversation.append(f"Bot: {response}")

As the conversation printed below shows, the assistant's answer has nothing to do with winter activities. **It doesn't remember the earlier conversation.**

The right way to handle this is to send the model the whole conversation, not just the latest message.

LLMs don't naturally keep a memory of your previous messages. If you keep a conversation object (here, a plain Python list) that stores each prompt along with the LLM's response, you can append follow-up messages to it and pass the full history to the model on every turn.

That way, the LLM converses with you as if it remembered your earlier conversation.

In [20]:
conversation

['User: \nWhat are some fun activaties I can do in the winter?\n',
 'Bot:  I love snowboarding and skiing.  What activities do you like to do?',
 'User: What else do you recommend?',
 "Bot:  Well, I'm not sure what else I can think of, but I do know that I'm going to miss her."]

In [21]:
user_message = "What can you do?"

In [22]:
conversation.append(f"User: {user_message}")

In [26]:
conversation

['User: \nWhat are some fun activaties I can do in the winter?\n',
 'Bot:  I love snowboarding and skiing.  What activities do you like to do?',
 'User: What else do you recommend?',
 "Bot:  Well, I'm not sure what else I can think of, but I do know that I'm going to miss her.",
 'User: What can you do?']

In [27]:
chatbot("\n".join(conversation))

[{'generated_text': ' I like to ski and snowboard.  I also like to snowboard and surf.'}]
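
To see exactly what the model receives in the call above, it can help to print the joined prompt. The sketch below rebuilds the same list by hand (the strings are copied from the output above) and joins it with newlines. No model is loaded, so it only illustrates the string handling.

```python
# The conversation so far, copied from the notebook output above.
conversation = [
    "User: \nWhat are some fun activaties I can do in the winter?\n",
    "Bot:  I love snowboarding and skiing.  What activities do you like to do?",
    "User: What else do you recommend?",
    "Bot:  Well, I'm not sure what else I can think of, but I do know that I'm going to miss her.",
    "User: What can you do?",
]

# Join every turn into one string, separated by newlines.
# This single string is the only "memory" the model has of earlier turns.
prompt = "\n".join(conversation)
print(prompt)
```

Because the whole history travels in one string, the prompt grows with every turn.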

In [28]:
response = chatbot("\n".join(conversation))[0]['generated_text']

In [29]:
response

' I like to ski and snowboard.  I also like to snowboard and surf.'

In [30]:
conversation.append(f"Bot: {response}")

In [31]:
conversation

['User: \nWhat are some fun activaties I can do in the winter?\n',
 'Bot:  I love snowboarding and skiing.  What activities do you like to do?',
 'User: What else do you recommend?',
 "Bot:  Well, I'm not sure what else I can think of, but I do know that I'm going to miss her.",
 'User: What can you do?',
 'Bot:  I like to ski and snowboard.  I also like to snowboard and surf.']
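
The manual steps above (append the user message, send the joined history, append the reply) can be wrapped in a small helper. This is a minimal sketch, not part of the `transformers` API: `chat_turn` and `echo_model` are hypothetical names, and `echo_model` is a stand-in that replaces the real model so the example runs without downloading anything.

```python
from typing import Callable, List


def chat_turn(
    conversation: List[str],
    user_message: str,
    generate: Callable[[str], str],
) -> str:
    """Append the user message, send the full history, and record the reply."""
    conversation.append(f"User: {user_message}")
    # The model only sees what is in this string, so include every turn.
    response = generate("\n".join(conversation))
    conversation.append(f"Bot: {response}")
    return response


def echo_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model: reports how much history it saw."""
    return f"I can see {len(prompt.splitlines())} line(s) of history."


history: List[str] = []
chat_turn(history, "Hello", echo_model)
chat_turn(history, "Still there?", echo_model)
print(history)
```

With the pipeline from this notebook, `generate` could be something like `lambda prompt: chatbot(prompt)[0]["generated_text"]`, which mirrors the call used in the cells above.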