## Problem:
### How can we create an AI-powered chatbot that answers questions about specific content (a PDF, a website, etc.)?

### ⚙️ SetUp

Uncomment and run the following cell to set up your environment. The code that follows uses openai, langchain, numpy, tiktoken and, optionally, chromadb.

In [None]:
# # Uncomment this cell to get everything installed in colab.
# # You will get a bunch of logs and errors.  Don't worry about them.  Everything will be installed properly in the end.

# %pip install openai==0.28
# %pip install langchain
# %pip install numpy
# %pip install chromadb
# %pip install tiktoken

Collecting openai==0.28
  Downloading openai-0.28.0-py3-none-any.whl (76 kB)
Installing collected packages: openai
  Attempting uninstall: openai
    Found existing installation: openai 1.3.7
    Uninstalling openai-1.3.7:
      Successfully uninstalled openai-1.3.7
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
llmx 0.0.15a0 requires cohere, which is not installed.
Successfully installed openai-0.28.0


Note that to use the openai library and call the API you will need to:
1. Create an account
2. Create an OPENAI_API_KEY
3. Export this key as an environment variable

Please follow the steps on the [OpenAI API Reference](https://platform.openai.com/docs/api-reference/introduction) page.

In [None]:
# import os

# os.environ['OPENAI_API_KEY'] = "your api key here"

### 💬 Text generation

Before diving into our problem, we will first look at how to use the openai Python library to generate a response to a question, which will help us understand the idea of a "prompt before the prompt". Suppose you have some information about the sky and you want to build an assistant that responds only to questions about the sky.

In [None]:
context =  "The sky is the appearance of the atmosphere around the surface of the planet from our \
            point of view. We see many objects that are actually in space such as the Sun, \
            the Moon, and stars because they are in the sky. On a clear day the sky appears blue. \
            At night it appears from very dark blue to black."

how_to_respond =  f"Use this information to respond to the user question: {context}. \
                    Please stick to this context when answering the question. "

In [None]:
background =  "You are an assistant."


In [None]:
question = input("Provide Question here: ")
question

Provide Question here: What color is the sky?


'What color is the sky?'

To create a response you can use the *create* function of openai's ChatCompletion object. To call it you need to provide at least two parameters:
* The *<span style="color:green">model</span>* you want to use
* Some *<span style="color:green">messages</span>* to start the conversation

In [None]:
required_parameters = {

    'model': "gpt-4",

    'messages': [
        {"role": "system", "content": background},
        {"role": "user", "content": question},
        {"role": "assistant", "content": how_to_respond}
    ]

}

Messages should be provided in a form of a conversation (question - response). Usually a conversation starts with a system message and then multiple back and forth turns. Each message is an object with the following format:
```json
{ "role": "...",  "content": "..." }
```
The content value depends on the role and we can have three different roles:

* **system**: Sets the assistant's behavior (Optional; without it, the model behaves like a generic assistant)
* **user**: Contains requests or comments for the assistant
* **assistant**: Stores previous assistant responses. Can also be written by you to provide examples of desired behavior (context)
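
As a sketch of how the three roles fit together, the helper below assembles a conversation from an optional system message, earlier question/answer turns and the new user question. `build_messages` is a hypothetical name used only for illustration; it is not part of the openai library, and it only builds the list without calling the API.

```python
from typing import Dict, List, Optional, Tuple

Message = Dict[str, str]


def build_messages(
    question: str,
    system: Optional[str] = None,
    history: Optional[List[Tuple[str, str]]] = None,
) -> List[Message]:
    """Assemble a chat `messages` list (hypothetical helper, for illustration only)."""
    messages: List[Message] = []
    if system:
        # optional system message that sets the assistant's behavior
        messages.append({"role": "system", "content": system})
    for past_question, past_answer in history or []:
        # earlier turns: a user request followed by the assistant's reply
        messages.append({"role": "user", "content": past_question})
        messages.append({"role": "assistant", "content": past_answer})
    # the new question always comes last
    messages.append({"role": "user", "content": question})
    return messages


messages = build_messages(
    question="What color is the sky at night?",
    system="You are an assistant.",
    history=[("What color is the sky?", "On a clear day the sky appears blue.")],
)
for message in messages:
    print(f"{message['role']:>9}: {message['content']}")
```

The resulting list has the same shape as the `messages` value passed to `openai.ChatCompletion.create` in this notebook.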

In [None]:
import openai

# >>>>>>> create response
response = openai.ChatCompletion.create(
    model= required_parameters['model'],
    messages= required_parameters['messages'],

    # >>>>>>> other parameters that can be used to tune the response
    max_tokens=100, # maximum number of tokens to generate
    n=1, # how many responses to generate

    # read more about the available parameters: https://platform.openai.com/docs/api-reference/chat/create
)

In [None]:
# >>>>>>> retrieve response
answer = response.choices[0]['message']['content']

## puts each sentence of a multi-sentence response on a separate line
answer = answer.replace('.', '.\n\t')

print("Question:\n\t", question)
print("Answer:\n\t", answer)

Question:
	 What color is the sky?
Answer:
	 The color of the sky can change depending on the time of day and weather conditions.
	 On a clear day, the sky generally appears to be blue.
	 At sunrise or sunset, the sky can take on hues of red, orange, pink, or purple.
	 At night, the sky appears very dark blue or black.
	


So, "prompt before the prompt" refers to providing additional context or information before asking a question or making a request. It helps set the stage for a more informed or relevant response. In our case this means describing the "system" or the way the "assistant" will respond. Notice that this way you guide the model without training or fine-tuning it, treating it as an intelligent black box.

### 💡 Brainstorming

We already know about:
 * How to <span style="color:purple">create embeddings</span>
 * How to <span style="color:purple">find similar embeddings</span>
 * How to <span style="color:purple">generate response on a question</span>

Can we combine all those to solve our problem?

## Part 1: Create Knowledge Base

<img src="https://www.donwoodlock.com/ml301-Nov2023/querytxt_demo/images/embeddings.png" width="700" height="400" alt="Embeddings Storage Pipeline">

The first thing you need to implement when making a context-based AI chat assistant is the <span style="color:purple">knowledge base</span>. This can be thought of as a store of all the information we want to use to respond to potential questions. And of course, since this information will be processed by a machine, it should be encoded and easy to search. In this section we will explore how to create a knowledge base using text embeddings.

### STEP 1: Load context

The first thing to decide is how you will load your document. This includes the parsers or file-type-specific loaders you will use, as well as which parts of the imported data you will keep as context. In this simple example we load a plain-text (.txt) file that contains four French toast recipes.

In [None]:
# read the text

## the simple way

import requests
url = "https://www.donwoodlock.com/ml301-Nov2023/querytxt_demo"

response = requests.get(f"{url}/thefrenchtoast.txt")

text = response.text if response.status_code == 200 else ""

print(text)

### the Langchain way

# from langchain.document_loaders import TextLoader, UnstructuredURLLoader

# loader = UnstructuredURLLoader(urls=["http://www.donwoodlock.com/ml301-Nov2023/querytxt_demo/thefrenchtoast.txt"])

# text = loader.load()[0].page_content

# text


RECIPE 1

The Best French Toast Recipe (+VIDEO) - The Girl Who Ate Everything
Christy Denney

This French Toast recipe is a classic recipe made with milk, eggs, vanilla, and cinnamon. This breakfast has the perfect soft center with a crisp outside.


BEST FRENCH TOAST RECIPE

I know this is a basic recipe for French toast but it is the best. I tried all different versions. Some with more milk, more eggs, less eggs, and this is the winning version. Make this your staple recipe for the weekend. 
WHAT KIND OF BREAD TO USE FOR FRENCH TOAST

The bread you use for French toast should be sturdy enough to handle a soaking in milk and eggs. Here are a few options.

    BRIOCHE - Has a firm crust but a soft center.
    SOURDOUGH - I love the contrast of this tangy bread to the sweet French toast.
    FRENCH BREAD - This is a great bread for soaking since it has a solid crust.
    WHITE BREAD - This is what most of us have on hand and it works great. It's usually not 

> **Suggestion**: Try loading a PDF of your choice using the PyPDF2 library or Langchain's PyPDF loader (see [here](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf)). Replace _text_ with the PDF data.

### STEP 2: Create embeddings

After you are done with the loading you can now start the process of creating embeddings. An embedding, as discussed earlier in the class, is a mathematical representation of some piece of text (which can span from a word to multiple paragraphs). This means that to create embeddings we first need to decide on the pieces of text to create embeddings for.

Although possible, having one embedding for the full text wouldn't be that useful. Imagine someone telling you "the information you are trying to find is somewhere within this encyclopedia" (while handing you 10 books). This is somewhat useful, but not really. It would be much better if they told you "Look at book 3, page 230". For our chatbot to reach this level of granularity, we need to create embeddings of text pieces large enough to contain useful information but small enough to remain "granular". In fact, we might want to allow some overlap between consecutive pieces so that information spanning a boundary stays linked.
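
To make the roles of chunk size and overlap concrete, here is a minimal sliding-window sketch in plain Python. It is an illustration only: `sliding_window_chunks` is a hypothetical helper, and it cuts at fixed character offsets, which is not how LangChain's separator-based splitter works.

```python
from typing import List


def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut `text` into pieces of at most `chunk_size` characters, where each piece
    starts with the last `chunk_overlap` characters of the previous one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window moves each time
    pieces = []
    for start in range(0, len(text), step):
        pieces.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return pieces


sample = "Whisk eggs and milk. Dip the bread. Fry until golden."
for piece in sliding_window_chunks(sample, chunk_size=25, chunk_overlap=10):
    print(repr(piece))
```

With an overlap of 10, the last 10 characters of each piece reappear at the start of the next one, so a sentence cut at a boundary is still partly visible in both pieces.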

For this task we will first create text chunks using the _CharacterTextSplitter_ class from the _langchain.text_splitter_ module and then generate embeddings using the _OpenAIEmbeddings_ class from the _langchain.embeddings.openai_ module.

In [None]:
from langchain.text_splitter import CharacterTextSplitter

# split into chunks
text_splitter = CharacterTextSplitter(
    separator="\n",
    chunk_size=700, # maximum number of characters per chunk (not tokens!)
    chunk_overlap=150,
    length_function=len
)

chunks = text_splitter.split_text(text)

In [None]:
# ANSI escape codes for text color
PURPLE = '\033[95m'
BOLD = '\033[1m'
RESET = '\033[0m'  # Reset text color to default


for i, chunk in enumerate(chunks):
    print(PURPLE + BOLD + f'CHUNK: {i}' + RESET)
    print(f'"{chunk}"\n')

CHUNK: 0
"RECIPE 1

The Best French Toast Recipe (+VIDEO) - The Girl Who Ate Everything
Christy Denney

This French Toast recipe is a classic recipe made with milk, eggs, vanilla, and cinnamon. This breakfast has the perfect soft center with a crisp outside.


BEST FRENCH TOAST RECIPE

I know this is a basic recipe for French toast but it is the best. I tried all different versions. Some with more milk, more eggs, less eggs, and this is the winning version. Make this your staple recipe for the weekend. 
WHAT KIND OF BREAD TO USE FOR FRENCH TOAST

The bread you use for French toast should be sturdy enough to handle a soaking in milk and eggs. Here are a few options."

CHUNK: 1
"The bread you use for French toast should be sturdy enough to handle a soaking in milk and eggs. Here are a few options.

    BRIOCHE - Has a firm crust but a soft center.
    SOURDOUGH - I love the contrast of this tangy bread to the sweet French toast.
    FRENCH B

> **Note**: Language models have a token limit. When you split your text into chunks, it is a good idea to know how many tokens a chunk contains. You can read further here: [Langchain - Split by tokens](https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/split_by_token)
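
A minimal sketch of counting the tokens in a chunk with tiktoken (installed in the setup cell) might look like the following. `count_tokens` is a hypothetical helper; the fallback of roughly 4 characters per token is only a rule-of-thumb assumption for English text, used when tiktoken is missing or cannot load its encoding.

```python
def count_tokens(text: str, model: str = "gpt-4") -> int:
    """Count tokens with tiktoken; fall back to a rough estimate if that fails."""
    try:
        import tiktoken

        encoding = tiktoken.encoding_for_model(model)
        return len(encoding.encode(text))
    except Exception:
        # Assumption: about 4 characters per token for English text.
        return (len(text) + 3) // 4


sample_chunks = [
    "The bread you use for French toast should be sturdy.",
    "Eggs, milk, vanilla, cinnamon.",
]
for i, chunk in enumerate(sample_chunks):
    print(f"chunk {i}: {len(chunk)} characters, {count_tokens(chunk)} tokens")
```

Running this over your own chunks shows how far a character-based `chunk_size` is from the token limit of the model you plan to use.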

In [None]:
print(len(chunks))

42


In [None]:
from langchain.embeddings import OpenAIEmbeddings

embeddings_model = OpenAIEmbeddings()
embeddings = embeddings_model.embed_documents(chunks)
len(embeddings), len(embeddings[0])

(42, 1536)

Now we have 42 embeddings, one for each chunk of text. Each embedding is a vector of size 1536.

> **Suggestion**: Langchain gives us access to a long list of embedding models, some of which do not require API keys. Thanks to Langchain, it is easy to swap in any other embedder you want to try. You can browse the list here: [Langchain - Text embedding models](https://python.langchain.com/docs/integrations/text_embedding)

### STEP 3: Store embeddings

Often, especially when dealing with large amounts of text, it is a good idea to have a place to store the text and the embeddings, as well as to maintain a mapping between them. For this reason there exists the concept of <span style="color:purple">Vector Stores</span>, which are databases dedicated to handling vectors (notice that embeddings are nothing but vectors). They take care not only of vector storage but also of vector space search, making the search for similar embeddings efficient.

Langchain supports the concept of Vector Stores. You can create a Vector Store of your embeddings as follows:

```Python

    # prep step: install chromadb v0.4.13
    from langchain.vectorstores import Chroma

    db = Chroma.from_texts(chunks, OpenAIEmbeddings())

```

This example is simple enough to not require a Vector store. However we encourage you to experiment with different Vector Stores when working with a larger project. You can read more about them here: [Langchain - Vector Stores](https://python.langchain.com/docs/modules/data_connection/vectorstores/)
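
To illustrate what a vector store does conceptually (keep texts and vectors side by side, then search by distance), here is a toy in-memory version. `TinyVectorStore` and the 2-dimensional vectors are made up for illustration; real vector stores such as Chroma add persistence and much faster search.

```python
from typing import List, Sequence, Tuple

import numpy as np


class TinyVectorStore:
    """Toy in-memory vector store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self.texts: List[str] = []
        self.vectors: List[np.ndarray] = []

    def add(self, text: str, vector: Sequence[float]) -> None:
        # texts[i] and vectors[i] always describe the same chunk
        self.texts.append(text)
        self.vectors.append(np.asarray(vector, dtype=float))

    def search(self, query_vector: Sequence[float], k: int = 4) -> List[Tuple[str, float]]:
        """Return the k texts whose vectors are closest (Euclidean) to the query."""
        if not self.vectors:
            return []
        matrix = np.vstack(self.vectors)
        distances = np.linalg.norm(matrix - np.asarray(query_vector, dtype=float), axis=1)
        nearest = np.argsort(distances)[:k]
        return [(self.texts[i], float(distances[i])) for i in nearest]


store = TinyVectorStore()
store.add("eggs and milk", [1.0, 0.0])
store.add("maple syrup", [0.0, 1.0])
store.add("milk substitutes", [0.9, 0.1])
print(store.search([1.0, 0.05], k=2))
```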

## Part 2: Query Knowledge Base

<img src="https://www.donwoodlock.com/ml301-Nov2023/querytxt_demo/images/response.png" width="900" height="400" alt="Query Pipeline">




In the previous section you learned how to create a knowledge base. Now we will continue with the fun part: Answering all of your questions!

#### Question: How many ingredients do I need to make french toast? 🤔

In [None]:
question = "How many ingredients do I need to make french toast?"

### STEP 1: Create question embedding

One of the main reasons to convert text into embeddings is so that we can identify text with similar content using math! Since we are looking for an answer, we care about the similarity of our question to the text in our knowledge base. To measure that, you first convert the question into an embedding.

In [None]:
import numpy as np

question_embedding = np.array(embeddings_model.embed_query(question))
question_embedding

array([ 0.01837343,  0.02399293,  0.00805375, ...,  0.01656069,
        0.0140617 , -0.03306959])

### STEP 2: Search for similar embeddings

Now you can go ahead and search for similar embeddings. For that you can use any metric that measures the distance between two points, either implementing it yourself or using a ready-made function. In this example we compute the Euclidean (L2) distance between the question embedding and every embedding in the knowledge base; a smaller distance means more similar text.

In [None]:
distances = []

for embedding in embeddings:
    distances.append(np.linalg.norm(embedding - question_embedding)) # the L2 norm of the difference is the Euclidean distance

distances = np.array(distances)
distances.shape

(42,)

Once you have the distances, you can sort them to find the top k embeddings as follows:

In [None]:
k = 4
sorted_indices = np.argsort(distances)
top_k = sorted_indices[:k]

similar_docs = [chunks[idx] for idx in top_k]
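
The distance loop and the top-k selection can also be written in vectorized form. The sketch below uses made-up unit-length vectors in place of real embeddings and compares Euclidean distance with cosine similarity. For unit-length vectors the two give the same ranking, because ‖a − b‖² = 2 − 2·cos(a, b); whether your embeddings are unit length depends on the embedding model, so treat that as an assumption to check.

```python
import numpy as np


def top_k_euclidean(vectors: np.ndarray, query: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k rows with the smallest Euclidean distance to `query`."""
    distances = np.linalg.norm(vectors - query, axis=1)
    return np.argsort(distances)[:k]


def top_k_cosine(vectors: np.ndarray, query: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k rows with the largest cosine similarity to `query`."""
    similarities = vectors @ query / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(query))
    return np.argsort(similarities)[::-1][:k]


# Made-up unit-length vectors standing in for real embeddings.
rng = np.random.default_rng(0)
toy_vectors = rng.normal(size=(10, 8))
toy_vectors /= np.linalg.norm(toy_vectors, axis=1, keepdims=True)
toy_query = rng.normal(size=8)
toy_query /= np.linalg.norm(toy_query)

print(top_k_euclidean(toy_vectors, toy_query, k=3))
print(top_k_cosine(toy_vectors, toy_query, k=3))
```

Both calls print the same indices in the same order, since the toy vectors are normalized.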

In [None]:
for doc in similar_docs:
    # use the full chunk (slice with doc[:200] to show only the first 200 characters);
    # a separate name avoids overwriting the `text` variable that holds the document
    highlighted = doc

    # highlight eggs, milk, sugar and bread
    for word in ('eggs', 'milk', 'sugar', 'bread'):
        highlighted = highlighted.replace(word, BOLD + PURPLE + word + RESET)

    print(highlighted + "\n")

Ingredients

You only need five ingredients and I bet you have all of these in your kitchen right now:

    Eggs
    Milk
    Vanilla extract
    Cinnamon
    Pinch of salt

Substituting Milk

We use whole **milk** because the higher fat content makes the French toast nice and creamy, but you can substitute heavy cream, almond **milk**, or even coconut **milk**. The flavors will change a little, but the end result will still be delicious!

    Heavy Cream-use heavy cream for an extra decadent French toast. You can also use half **milk** and half heavy cream.
    Almond Milk-unsweetened vanilla almond **milk** makes great French toast. I love the extra vanilla flavor.

A plate with two slices of french toast cut in half with powdered **sugar** and syrup drizzled on top, next to two slices of bacon.

If I offered up breakfast options for a family vote, everyone would choose this classic French Toast recipe, served wit

> Note: Why top k? Well, don't forget we pay by token! We cannot just feed the full context into the model...
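
One way to keep the context within a budget, sketched below, is to add the best-ranked chunks one by one until a token limit would be exceeded. `build_context` and `rough_token_count` are hypothetical helpers; the default counter assumes roughly 4 characters per token, and a real tokenizer such as tiktoken could be passed in instead.

```python
from typing import Callable, List


def rough_token_count(text: str) -> int:
    """Very rough estimate; assumes about 4 characters per token."""
    return (len(text) + 3) // 4


def build_context(
    ranked_docs: List[str],
    max_tokens: int,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> str:
    """Join the best-ranked chunks, stopping before the token budget is exceeded."""
    selected = []
    used = 0
    for doc in ranked_docs:
        cost = count_tokens(doc)
        if used + cost > max_tokens:
            break  # the next chunk would not fit in the budget
        selected.append(doc)
        used += cost
    return "\n\n".join(selected)


ranked = [
    "Eggs, milk, vanilla, cinnamon.",
    "Use sturdy bread such as brioche.",
    "Serve with maple syrup.",
]
print(build_context(ranked, max_tokens=18))
```

With this budget only the first two chunks fit, so the third, least relevant one is dropped.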

#### What if you have a Vector Store?

Then maybe your life is a little bit easier. Here is how you can query a Vector Store to avoid all this hassle (and by all, we mean steps 1 & 2):

```Python
    docs = db.similarity_search(question)
    print(len(docs))
    print(docs[0].page_content)
```

### STEP 3: Generate a response

You have collected the chunks that are closest to the question. You can now use them to generate a response. In this example we use ChatCompletion from openai, together with the "prompt before the prompt" idea, to set the stage for the AI to respond to the question.

In [None]:
import openai

# >>>>>>> create context: concatenate all relevant information to be used in text generation
context = "\n\n".join([doc for doc in similar_docs])

# >>>>>>> create language model
parameters = {
    'model': 'gpt-4',
    'messages': [
        {"role": "system", "content":  "You are an assistant who is helping Marta \
                                        to respond to questions about french toast recipes.  \
                                        When you answer the question, use the first person (e.g. We) to refer to Marta."},

        {"role": "user", "content": question},

        {"role": "assistant", "content": f"Use only this information from a set of french toast recipes \
                                            as context to answer the user question:\n\n {context}. \
                                            \n\nPlease stick to this context when answering the question. \
                                            Do not use any other knowledge you might have" }
    ]
}

response = openai.ChatCompletion.create(
    model=parameters['model'],
    messages=parameters['messages'],
    max_tokens=1000,
    n=1
)

# >>>>>>> generate response
answer = response.choices[0]['message']['content']

answer = answer.replace('.', '.\n\t')

print("Question:\n\t", question)
print("Answer:\n\t", answer)

Question:
	 How many ingredients do I need to make french toast?
Answer:
	 We need eight ingredients to make French toast:

1.
	 4 large eggs
2.
	 2/3 cup milk (you can substitute with heavy cream, almond milk, or coconut milk)
3.
	 1/4 cup all-purpose flour
4.
	 1/4 cup granulated sugar
5.
	 1/4 teaspoon salt
6.
	 1 teaspoon ground cinnamon
7.
	 1 teaspoon vanilla extract
8.
	 8 thick slices bread

Please note that the flavors will change a little with different types of milk, but the end result will still be delicious!


In [None]:
import re
print(re.sub(' +', ' ', parameters['messages'][2]['content']))

Use only this information from a set of french toast recipes as context to answer the user question:

 Ingredients

You only need five ingredients and I bet you have all of these in your kitchen right now:

 Eggs
 Milk
 Vanilla extract
 Cinnamon
 Pinch of salt

Substituting Milk

We use whole milk because the higher fat content makes the French toast nice and creamy, but you can substitute heavy cream, almond milk, or even coconut milk. The flavors will change a little, but the end result will still be delicious!

 Heavy Cream-use heavy cream for an extra decadent French toast. You can also use half milk and half heavy cream.
 Almond Milk-unsweetened vanilla almond milk makes great French toast. I love the extra vanilla flavor.

A plate with two slices of french toast cut in half with powdered sugar and syrup drizzled on top, next to two slices of bacon.

If I offered up breakfast options for a family vote, everyone would choose this classic French Toast recipe, served

### What about bad questions?

In [None]:
question = "I am a cat. Can I make french toast?"

In [None]:
question_embedding = np.array(embeddings_model.embed_query(question))
cosine_similarities = np.dot(embeddings, question_embedding) / (np.linalg.norm(embeddings, axis=1) * np.linalg.norm(question_embedding))
k = 4
sorted_indices = np.argsort(cosine_similarities)
top_k = sorted_indices[-k:][::-1]

similar_docs = [chunks[idx] for idx in top_k]

# >>>>>>> create context: concatenate all relevant information to be used in text generation
context = "\n\n".join([doc for doc in similar_docs])

# >>>>>>> create language model
parameters = {
    'model': 'gpt-4',
    'messages': [
        {"role": "system", "content":  "You are an assistant who is helping Marta \
                                        to respond to questions about french toast recipes.  \
                                        When you answer the question, use the first person (e.g. We) to refer to Marta."},

        {"role": "user", "content": question},

        {"role": "assistant", "content":  f"Use only this information from a set of french toast recipes \
                                            as context to answer the user question:\n\n {context}. \
                                            \n\nPlease stick to this context when answering the question. \
                                            Do not use any other knowledge you might have." }
    ]
}

response = openai.ChatCompletion.create(
    model=parameters['model'],
    messages=parameters['messages'],
    max_tokens=1000,
    n=1
)

# >>>>>>> generate response
answer = response.choices[0]['message']['content']

answer = answer.replace('.', '.\n\t')

print("Question:\n\t", question)
print("Answer:\n\t", answer)

Question:
	 I am a cat. Can I make french toast?
Answer:
	 While we appreciate your enthusiasm for French toast, since you're a cat, we cannot recommend you attempt to make it.
	 This cooking process involves using a stove, and it's crucial to prioritize safety.
	 However, if your human owner wants to learn how to make French toast, we have many recipes they could try! French toast is our go-to answer for a warm, effortless weekend breakfast.
	 All we need to do is whisk together eggs, milk, cinnamon, and vanilla, then dip bread into this mixture and pan fry in butter until golden.
	 Topping options are plentiful - our favorite is a liberal dousing of maple syrup.
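
Note that retrieval always returns k chunks, however unrelated the question is. One possible guard, which is not part of the original notebook, is to keep only chunks whose cosine similarity to the question reaches a minimum value and to skip the model call when none do. `relevant_chunks` is a hypothetical helper and the 0.75 threshold is an arbitrary placeholder that would need tuning for a real embedding model; a question that mentions french toast, like the cat one, may well still pass.

```python
import numpy as np


def relevant_chunks(chunk_vectors, question_vector, chunk_texts, k=4, min_similarity=0.75):
    """Return up to k chunks whose cosine similarity to the question is at least
    `min_similarity`; an empty list signals that the question looks off-topic."""
    chunk_vectors = np.asarray(chunk_vectors, dtype=float)
    question_vector = np.asarray(question_vector, dtype=float)
    similarities = chunk_vectors @ question_vector / (
        np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(question_vector)
    )
    best_first = np.argsort(similarities)[::-1][:k]
    return [chunk_texts[i] for i in best_first if similarities[i] >= min_similarity]


# Made-up 3-dimensional vectors standing in for real embeddings.
toy_texts = ["eggs and milk", "maple syrup"]
toy_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(relevant_chunks(toy_vectors, [0.9, 0.1, 0.0], toy_texts))  # close to the first chunk
print(relevant_chunks(toy_vectors, [0.0, 0.0, 1.0], toy_texts))  # unrelated to both chunks
```

When the returned list is empty, the application can reply with a fixed "I can only answer questions about these recipes" message instead of calling the model.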
	


### Try it yourself:

Question 1: What will happen if chunks are too small or if they do not have overlap? You can experiment with parameters _chunk_size_ and _chunk_overlap_ in the following code block.

> Remember: Since the chunks are smaller, you might need to adjust the number of chunks you use to generate the response (k)

In [None]:
import openai
import numpy as np
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
import os

# load (if the file is not available locally, the `text` loaded earlier in the notebook is reused)
if os.path.exists("thefrenchtoast.txt"):
    with open("thefrenchtoast.txt", "r", encoding="utf8") as f:
        text = f.read()

# split
text_splitter = CharacterTextSplitter(
    separator=" ",
    chunk_size= 50, # <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< CHANGE chunk_size here <<<<<<<<<<
    chunk_overlap= 0, # <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< CHANGE chunk_overlap here <<<<<<<
    length_function=len
)

chunks = text_splitter.split_text(text)

# create embeddings
embeddings_model = OpenAIEmbeddings()
embeddings = embeddings_model.embed_documents(chunks)

# create question embedding
question = "How many ingredients do I need to make french toast?"
question_embedding = np.array(embeddings_model.embed_query(question))

# compute distance
distances = []

for embedding in embeddings:
    distances.append(np.linalg.norm(embedding - question_embedding)) # the L2 norm of the difference is the Euclidean distance

distances = np.array(distances)
distances.shape

k = 4 # <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<  CHANGE k here <<<<<<<<<<<<<<<<<<<
sorted_indices = np.argsort(distances)
top_k = sorted_indices[:k]

similar_docs = [chunks[idx] for idx in top_k]

# >>>>>>> create context: concatenate all relevant information to be used in text generation
context = "\n\n".join([doc for doc in similar_docs])
print("Context used: \n\t", context.replace('\n\n', '\n\t'), '\n')

# >>>>>>> create language model
parameters = {
    'model': 'gpt-4',
    'messages': [
        {"role": "system", "content":  "You are an assistant who is helping Marta \
                                        to respond to questions about french toast recipes.  \
                                        When you answer the question, use the first person (e.g. We) to refer to Marta."},

        {"role": "user", "content": question},

        {"role": "assistant", "content":  f"Use only this information from a set of french toast recipes \
                                            as context to answer the user question:\n\n {context}. \
                                            \n\nPlease stick to this context when answering the question. \
                                            Do not use any other knowledge you might have"}
    ]
}

response = openai.ChatCompletion.create(
    model=parameters['model'],
    messages=parameters['messages'],
    max_tokens=1000,
    n=1
)

# >>>>>>> generate response
answer = response.choices[0]['message']['content']

answer = answer.replace('.', '.\n\t')

print("Question:\n\t", question)
print("Answer:\n\t", answer)

Context used: 
	 quick! This is how to make French Toast - whisk
	this weekend!

Pile of French Toast on a rustic
	RECIPE 3

recipetineats.com
French
	golden, then douse with maple syrup!

French 

Question:
	 How many ingredients do I need to make french toast?
Answer:
	 I'm sorry, but the context provided doesn't include specific information about the number of ingredients needed to make French toast.
	


Question 2: Can you make the assistant respond in a different tone, or more explicitly on behalf of Marta? What if you always want it to end the response with "Hope I made you smile"? You can experiment with the response generation in the following code block.

In [None]:
parameters = {
    'model': 'gpt-4',
    'messages': [
        # <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<  CHANGE system and/or assistant here <<<<<<<<<<<<<<<<<<<

        {"role": "system", "content":  "You are an assistant who is helping Marta \
                                        to respond to questions about french toast recipes.  \
                                        When you answer the question, use the third person \
                                        (e.g. She) to refer to Marta. Always end your response \
                                        with 'Hope I made you smile'"
        },

        {"role": "user", "content": question},

        {"role": "assistant",
         "content":  f"Use this information from a set of french toast recipes \
                    as context to answer the user question:\n\n {context}. \
                    \n\nPlease stick to this context when answering the question."
        }
    ]
}

response = openai.ChatCompletion.create(
    model=parameters['model'],
    messages=parameters['messages'],
    max_tokens=1000,
    n=1
)

# >>>>>>> generate response
answer = response.choices[0]['message']['content']

answer = answer.replace('.', '.\n\t')

print("Question:\n\t", question)
print("Answer:\n\t", answer)

Question:
	 How many ingredients do I need to make french toast?
Answer:
	 Marta typically uses seven primary ingredients for her French toast recipe.
	 These ingredients are bread, eggs, milk, granulated sugar, salt, cinnamon, and unsalted butter for cooking.
	 Additionally, she uses maple syrup for serving.
	 But remember, recipes can be quite flexible, so feel free to add some extra toppings like whipped cream, berries, or even powdered sugar, depending on your preference.
	 Hope I made you smile.
	
