# LangChain for LLM Application Development

Based on the course from DeepLearning.AI: https://learn.deeplearning.ai/langchain/

## Import Libraries & Helper functions

In [68]:
import openai
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv(filename="LangChain for LLM Application Development/secrets.env", raise_error_if_not_found=True)
) # read local .env file
openai.api_key = os.environ["OPENAI_API_KEY"]


In [27]:
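def get_completion(prompt, model="gpt-3.5-turbo"):
	"""Send a single user message and return the text of the model's reply."""
	messages = [{
		"role": "user",
		"content": prompt
		}]
	# Legacy interface of the openai package (versions before 1.0)
	response = openai.ChatCompletion.create(
		model=model,
		messages=messages,
		temperature=0,  # minimize randomness in the output
	)
	return response.choices[0].message["content"]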

## Models, Prompts and Parsers

### Models

### OpenAI API

In [28]:
customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse,\
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""

style = """American English \
in a calm and respectful tone
"""

prompt = f"""Translate the text \
that is delimited by triple backticks 
into a style that is {style}.
text: ```{customer_email}```
"""

print(f"Prompt of the API call: \n{prompt}")

response = get_completion(prompt=prompt)

print(f"Response of the LLM: \n{response}")


Prompt of the API call: 
Translate the text that is delimited by triple backticks 
into a style that is American English in a calm and respectful tone
.
text: ```
Arrr, I be fuming that me blender lid flew off and splattered me kitchen walls with smoothie! And to make matters worse,the warranty don't cover the cost of cleaning up me kitchen. I need yer help right now, matey!
```

Response of the LLM: 
I am quite frustrated that my blender lid flew off and made a mess of my kitchen walls with smoothie! To add to my frustration, the warranty does not cover the cost of cleaning up my kitchen. I kindly request your assistance at this moment, my friend.


### LangChain API

In [29]:
from langchain.chat_models import ChatOpenAI
llm_model = "gpt-3.5-turbo"
chat = ChatOpenAI(temperature=0.0, model=llm_model)
# chat

## Prompt Template

In [30]:
from langchain.prompts import ChatPromptTemplate

# Define the template with its variables
templateString = """
Translate the text \
that is delimited by triple backticks \
into a style that is {style}. \
text: ```{text}```
"""

# Create the LangChain Template
promptTemplate = ChatPromptTemplate.from_template(templateString)

# Example Style
customerStyle = """
American English \
in a calm and respectful tone
"""

# Example Email
customerEmail = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse, \
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""

# Add variables to prompt
customerMessages = promptTemplate.format_messages(
	style = customerStyle,
	text = customerEmail,
)

# Prompt the model to get a response
customerResponse = chat(customerMessages)

# Print the response
print(customerResponse.content)

I'm really frustrated that my blender lid flew off and made a mess of my kitchen walls with smoothie! And to make things even worse, the warranty doesn't cover the cost of cleaning up my kitchen. I could really use your help right now, my friend!
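
At its core, the template above is string substitution with named placeholders. The following is a minimal standard-library sketch of that idea; `MiniTemplate` is a hypothetical name used for illustration and is not part of LangChain, and the template text is shortened.

```python
from string import Formatter


class MiniTemplate:
    """Hypothetical stand-in for a prompt template (not LangChain's class)."""

    def __init__(self, template: str):
        self.template = template
        # Collect the placeholder names that appear as {name} in the template
        self.input_variables = sorted(
            {field for _, field, _, _ in Formatter().parse(template) if field}
        )

    def format(self, **values: str) -> str:
        missing = [v for v in self.input_variables if v not in values]
        if missing:
            raise KeyError(f"missing template variables: {missing}")
        return self.template.format(**values)


template = MiniTemplate(
    "Translate the text into a style that is {style}. text: {text}"
)
prompt_text = template.format(
    style="American English in a calm and respectful tone",
    text="Arrr, I be fuming!",
)
print(template.input_variables)
print(prompt_text)
```

Discovering the variables from the template itself means a missing value can be reported before any call to the model is made.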


## Output parser

### Content extraction

In [51]:
customer_review = """\
This leaf blower is pretty amazing. It has four settings: \
candle blower, gentle breeze, windy city, and tornado. \
It arrived in two days, just in time for my wife's \
anniversary present. \
I think my wife liked it so much she was speechless. \
So far I've been the only one using it, and I've been \
using it every other morning to clear the leaves on our lawn. \
It's slightly more expensive than the other leaf blowers \
out there, but I think it's worth it for the extra features.
"""

review_template = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price,\
and output them as a comma separated Python list.

Format the output as JSON with the following keys:
gift
delivery_days
price_value

text: {text}
"""

In [53]:
from langchain.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_template(template=review_template)
print(prompt_template)

messages = prompt_template.format_messages(text=customer_review)
chat = ChatOpenAI(model=llm_model, temperature=0.0)
response = chat(messages=messages)
print(response)

input_variables=['text'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['text'], template='For the following text, extract the following information:\n\ngift: Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.\n\ndelivery_days: How many days did it take for the product to arrive? If this information is not found, output -1.\n\nprice_value: Extract any sentences about the value or price,and output them as a comma separated Python list.\n\nFormat the output as JSON with the following keys:\ngift\ndelivery_days\nprice_value\n\ntext: {text}\n'))]
content='{\n  "gift": false,\n  "delivery_days": 2,\n  "price_value": ["It\'s slightly more expensive than the other leaf blowers out there, but I think it\'s worth it for the extra features."]\n}'


In [54]:
type(response.content)

str
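
Because `response.content` is a plain string, its fields cannot be accessed by key until it is parsed. As a sketch, the standard `json` module can decode the string shown in the output above (copied here as a literal; a real response is not guaranteed to be valid JSON every time).

```python
import json

# The content string printed above, copied as a literal for illustration
content = (
    '{\n  "gift": false,\n  "delivery_days": 2,\n  "price_value": '
    '["It\'s slightly more expensive than the other leaf blowers out there, '
    'but I think it\'s worth it for the extra features."]\n}'
)

parsed = json.loads(content)
print(type(parsed))
print(parsed["gift"], parsed["delivery_days"])
```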

### Parser

In [57]:
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

gift_schema = ResponseSchema(
	name="gift",
	description="Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown."
)

# Adjacent string literals are concatenated, which keeps code indentation out of the prompt
delivery_days_schema = ResponseSchema(
	name="delivery_days",
	description=(
		"How many days did it take for the product to arrive? "
		"If this information is not found, output -1."
	)
)

price_value_schema = ResponseSchema(
	name="price_value",
	description=(
		"Extract any sentences about the value or price, "
		"and output them as a comma separated Python list."
	)
)

response_schemas = [gift_schema, delivery_days_schema, price_value_schema]

output_parser = StructuredOutputParser.from_response_schemas(response_schemas=response_schemas)

format_instructions = output_parser.get_format_instructions()

print(format_instructions)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"gift": string  // Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.
	"delivery_days": string  // How many days did it take for the product to arrive? If this information is not found, output -1.
	"price_value": string  // Extract any sentences about the value or price, and output them as a comma separated Python list.
}
```


In [60]:
review_template_2 = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.

text: {text}

{format_instructions}
"""

prompt = ChatPromptTemplate.from_template(template=review_template_2)

messages = prompt.format_messages(text=customer_review, 
                                format_instructions=format_instructions)

response = chat(messages=messages)

output_dict = output_parser.parse(response.content)

print(f"output_dict = {output_dict}")
print(f"type(output_dict) = {type(output_dict)}")

output_dict = {'gift': False, 'delivery_days': '2', 'price_value': "It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."}
type(output_dict) = <class 'dict'>
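
In the output above, `delivery_days` comes back as the string `'2'` rather than a number. The sketch below shows roughly what a parser for this format has to do (find the json code fence, decode it) and how a value can be converted afterwards. It is an illustration with a hypothetical helper name and a hypothetical reply, not LangChain's implementation.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep this example readable


def parse_fenced_json(text: str) -> dict:
    """Extract and decode the first json code fence found in `text`."""
    pattern = re.escape(FENCE) + r"json\s*(.*?)" + re.escape(FENCE)
    match = re.search(pattern, text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no json code fence found in the model output")
    return json.loads(match.group(1))


# Hypothetical model reply in the requested format
reply = f'{FENCE}json\n{{"gift": false, "delivery_days": "2"}}\n{FENCE}'
result = parse_fenced_json(reply)

# Values may come back as strings, so convert explicitly where a number is needed
delivery_days = int(result["delivery_days"])
print(result, delivery_days)
```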


In [67]:
from icecream import ic

ic(output_dict)

ic| output_dict: {'delivery_days': '2',
                  'gift': False,
                  'price_value': "It's slightly more expensive than the other leaf blowers out "
                                 "there, but I think it's worth it for the extra features."}


{'gift': False,
 'delivery_days': '2',
 'price_value': "It's slightly more expensive than the other leaf blowers out there, but I think it's worth it for the extra features."}

## Memory

### ConversationBufferMemory

In [31]:
# Import functions
from langchain.chat_models import ChatOpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

# Create conversation chain instance
llm = ChatOpenAI(temperature=0.0, model = llm_model)
memory = ConversationBufferMemory()
conversation = ConversationChain(llm=llm, memory=memory, verbose=True)

In [32]:
conversation.predict(input="Hi, my name is Sylvain, how are you doing?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:

H: Hi, my name is Sylvain, how are you doing?
AI:

> Finished chain.


"Hello Sylvain! I'm an AI, so I don't have feelings, but I'm here and ready to chat with you. How can I assist you today?"

In [33]:
conversation.predict(input="Do you know the circumference of the earth?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Sylvain, how are you doing?
AI: Hello Sylvain! I'm an AI, so I don't have feelings, but I'm here and ready to chat with you. How can I assist you today?
Human: Do you know the circumference of the earth?
AI:

> Finished chain.


'Yes, I do! The circumference of the Earth is approximately 40,075 kilometers (24,901 miles) at the equator. However, it is slightly smaller around the poles, measuring around 40,008 kilometers (24,860 miles).'

In [34]:
conversation.predict(input="Thank you. Do you still remember my name?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: Hi, my name is Sylvain, how are you doing?
AI: Hello Sylvain! I'm an AI, so I don't have feelings, but I'm here and ready to chat with you. How can I assist you today?
Human: Do you know the circumference of the earth?
AI: Yes, I do! The circumference of the Earth is approximately 40,075 kilometers (24,901 miles) at the equator. However, it is slightly smaller around the poles, measuring around 40,008 kilometers (24,860 miles).
Human: Thank you. Do you still remember my name?
AI:

> Finished chain.


'Yes, your name is Sylvain.'
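
The verbose prompts above show the mechanism: the full history is replayed in front of each new input, which is why the model can still recall the name. A minimal sketch of that behaviour follows; the class and function names are hypothetical and the prompt is reduced to the history part.

```python
class BufferMemorySketch:
    """Hypothetical, simplified conversation buffer (not LangChain's class)."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) pairs, oldest first

    def save_context(self, human: str, ai: str) -> None:
        self.turns.append(("Human", human))
        self.turns.append(("AI", ai))

    def history(self) -> str:
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)


def build_prompt(memory: BufferMemorySketch, user_input: str) -> str:
    # The whole history is replayed in front of the new input on every call
    return f"Current conversation:\n{memory.history()}\nHuman: {user_input}\nAI:"


memory_sketch = BufferMemorySketch()
memory_sketch.save_context("Hi, my name is Sylvain, how are you doing?", "Hello Sylvain!")
prompt_sketch = build_prompt(memory_sketch, "Do you still remember my name?")
print(prompt_sketch)
```

Since nothing is ever dropped, the prompt grows with every turn, which motivates the bounded memory types in the following sections.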

### ConversationBufferWindowMemory

In [35]:
from langchain.memory import ConversationBufferWindowMemory

# create the memory
memory = ConversationBufferWindowMemory(k=1)

# load the memory with some context
memory.save_context({"input": "Hi"},
                    {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})

# display what is currently held in memory
memory.load_memory_variables({})

{'history': 'Human: Not much, just hanging\nAI: Cool'}

In [36]:
llm = ChatOpenAI(temperature=0.0, model=llm_model)
memory = ConversationBufferWindowMemory(k=1)
conversation = ConversationChain(
	memory=memory,
	llm=llm,
	verbose=False,
)

In [37]:
conversation.predict(input = "Hi, my name is Sylvain, who are you?")

"Hello Sylvain! I'm an AI language model developed by OpenAI. I don't have a personal name, but you can call me OpenAI if you'd like. How can I assist you today?"

In [38]:
conversation.predict(input="Do you know which planet of our solar system has the highest number of moons?")

'Yes, I do! The planet with the highest number of moons in our solar system is Jupiter. Jupiter has a whopping 79 known moons as of now. Some of its largest moons include Ganymede, Callisto, Io, and Europa.'

In [39]:
conversation.predict(input="What is my name?")

"I'm sorry, but I don't have access to personal information about individuals unless it has been shared with me in the course of our conversation. I am designed to respect user privacy and confidentiality."

In [40]:
conversation.memory.load_memory_variables({})

{'history': "Human: What is my name?\nAI: I'm sorry, but I don't have access to personal information about individuals unless it has been shared with me in the course of our conversation. I am designed to respect user privacy and confidentiality."}
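
With `k=1` only the most recent exchange survives, so the introduction containing the name is gone by the time the third question is asked. A sketch of the windowing idea using `collections.deque` follows; `WindowMemorySketch` is a hypothetical name, not the LangChain class.

```python
from collections import deque


class WindowMemorySketch:
    """Hypothetical sketch: keep only the last k exchanges."""

    def __init__(self, k: int):
        # Each entry is one (human, ai) exchange; deque drops the oldest when full
        self.exchanges = deque(maxlen=k)

    def save_context(self, human: str, ai: str) -> None:
        self.exchanges.append((human, ai))

    def history(self) -> str:
        lines = []
        for human, ai in self.exchanges:
            lines.append(f"Human: {human}")
            lines.append(f"AI: {ai}")
        return "\n".join(lines)


window = WindowMemorySketch(k=1)
window.save_context("Hi", "What's up")
window.save_context("Not much, just hanging", "Cool")
print(repr(window.history()))
```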

### ConversationTokenBufferMemory

In [42]:
from langchain.memory import ConversationTokenBufferMemory

llm = ChatOpenAI(model=llm_model, temperature=0.0)
memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=50)

memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, 
                    {"output": "Charming!"})

memory.load_memory_variables({})

{'history': 'AI: Amazing!\nHuman: Backpropagation is what?\nAI: Beautiful!\nHuman: Chatbots are what?\nAI: Charming!'}
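
The history above starts in the middle of an exchange, which suggests that messages are dropped one at a time from the oldest end until the token budget is met. The sketch below illustrates that pruning under a loud assumption: tokens are approximated by whitespace-separated words, whereas the real memory uses the model's tokenizer, so the limit of 14 is chosen only to fit this toy count.

```python
def count_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word
    return len(text.split())


def prune_to_limit(messages: list, max_tokens: int) -> list:
    """Drop the oldest messages until the total fits within max_tokens."""
    kept = list(messages)
    while kept and sum(count_tokens(m) for m in kept) > max_tokens:
        kept.pop(0)  # remove one message at a time, oldest first
    return kept


messages = [
    "Human: AI is what?!",
    "AI: Amazing!",
    "Human: Backpropagation is what?",
    "AI: Beautiful!",
    "Human: Chatbots are what?",
    "AI: Charming!",
]
kept_messages = prune_to_limit(messages, max_tokens=14)
print("\n".join(kept_messages))
```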

### ConversationSummaryBufferMemory

In [43]:
from langchain.memory import ConversationSummaryBufferMemory

# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because LangChain is such a powerful tool. \
At Noon, lunch at the Italian restaurant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."

memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"}, 
                    {"output": f"{schedule}"})

memory.load_memory_variables({})

{'history': 'System: The human and AI exchange greetings. The human asks about the schedule for the day. The AI provides a detailed schedule, including a meeting with the product team, work on the LangChain project, and a lunch meeting with a customer interested in AI. The AI emphasizes the importance of bringing a laptop to showcase the latest LLM demo during the lunch meeting.'}
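
In the output above the whole exchange has been replaced by a single `System:` summary, because the schedule alone exceeds the token limit. The sketch below illustrates the general idea of folding overflowing messages into a summary. Both helper names are hypothetical, the word-based token count is a crude assumption, and `fake_summarize` is a placeholder for the LLM call that would write the actual summary.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word
    return len(text.split())


def fold_into_summary(
    summary: str,
    messages: List[str],
    max_tokens: int,
    summarize: Callable[[str, List[str]], str],
) -> tuple:
    """Move the oldest messages into the summary until the rest fits."""
    overflow = []
    remaining = list(messages)
    while remaining and sum(count_tokens(m) for m in remaining) > max_tokens:
        overflow.append(remaining.pop(0))
    if overflow:
        summary = summarize(summary, overflow)
    return summary, remaining


def fake_summarize(summary: str, overflow: List[str]) -> str:
    # Placeholder for the LLM call: only records how many messages were folded in
    return (summary + f" [{len(overflow)} earlier messages summarized]").strip()


summary, recent = fold_into_summary(
    summary="",
    messages=["Human: Hello", "AI: What's up", "Human: What is on the schedule today?"],
    max_tokens=8,
    summarize=fake_summarize,
)
print(summary)
print(recent)
```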

In [45]:
conversation = ConversationChain(
	llm=llm,
	memory=memory,
	verbose=True
)
conversation.predict(input="What would be a good demo to show?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
System: The human and AI exchange greetings. The human asks about the schedule for the day. The AI provides a detailed schedule, including a meeting with the product team, work on the LangChain project, and a lunch meeting with a customer interested in AI. The AI emphasizes the importance of bringing a laptop to showcase the latest LLM demo during the lunch meeting.
Human: What would be a good demo to show?
AI:

> Finished chain.


'A good demo to show during the lunch meeting with the customer interested in AI would be the latest LLM (Language Model) demo. The LLM is a cutting-edge AI model that can generate human-like text based on a given prompt. It has been trained on a vast amount of data and can generate coherent and contextually relevant responses. By showcasing the LLM demo, you can demonstrate the capabilities of our AI technology and how it can be applied to various industries and use cases.'

In [46]:
memory.load_memory_variables({})

{'history': 'System: The human and AI exchange greetings and discuss the schedule for the day. The AI provides a detailed schedule, including a meeting with the product team, work on the LangChain project, and a lunch meeting with a customer interested in AI. The AI emphasizes the importance of bringing a laptop to showcase the latest LLM demo during the lunch meeting. The human asks what would be a good demo to show, and the AI suggests showcasing the latest LLM (Language Model) demo. The LLM is a cutting-edge AI model that can generate human-like text based on a given prompt. It has been trained on a vast amount of data and can generate coherent and contextually relevant responses. By showcasing the LLM demo, the AI believes they can demonstrate the capabilities of their AI technology and how it can be applied to various industries and use cases.'}

In [47]:
conversation.predict(input="What was my schedule again?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
System: The human and AI exchange greetings and discuss the schedule for the day. The AI provides a detailed schedule, including a meeting with the product team, work on the LangChain project, and a lunch meeting with a customer interested in AI. The AI emphasizes the importance of bringing a laptop to showcase the latest LLM demo during the lunch meeting. The human asks what would be a good demo to show, and the AI suggests showcasing the latest LLM (Language Model) demo. The LLM is a cutting-edge AI model that can generate human-like text based on a given prompt. It has been trained on a vast amount of data and can generate coherent and contextua

"Your schedule for today includes a meeting with the product team, work on the LangChain project, and a lunch meeting with a customer interested in AI. It's important to bring your laptop to showcase the latest LLM demo during the lunch meeting."

In [48]:
memory.load_memory_variables({})

{'history': "System: The human and AI exchange greetings and discuss the schedule for the day. The AI provides a detailed schedule, including a meeting with the product team, work on the LangChain project, and a lunch meeting with a customer interested in AI. The AI emphasizes the importance of bringing a laptop to showcase the latest LLM demo during the lunch meeting. The human asks what would be a good demo to show, and the AI suggests showcasing the latest LLM (Language Model) demo. The LLM is a cutting-edge AI model that can generate human-like text based on a given prompt. It has been trained on a vast amount of data and can generate coherent and contextually relevant responses. By showcasing the LLM demo, the AI believes they can demonstrate the capabilities of their AI technology and how it can be applied to various industries and use cases.\nHuman: What was my schedule again?\nAI: Your schedule for today includes a meeting with the product team, work on the LangChain project, a

In [49]:
conversation.predict(input="What are the best use cases of LLMs in the AEC industry?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
System: The human and AI exchange greetings and discuss the schedule for the day. The AI provides a detailed schedule, including a meeting with the product team, work on the LangChain project, and a lunch meeting with a customer interested in AI. The AI emphasizes the importance of bringing a laptop to showcase the latest LLM demo during the lunch meeting. The human asks what would be a good demo to show, and the AI suggests showcasing the latest LLM (Language Model) demo. The LLM is a cutting-edge AI model that can generate human-like text based on a given prompt. It has been trained on a vast amount of data and can generate coherent and contextua

'LLMs, or Language Models, have a wide range of applications in the Architecture, Engineering, and Construction (AEC) industry. Some of the best use cases of LLMs in the AEC industry include:\n\n1. Design Assistance: LLMs can assist architects and designers in generating design ideas and concepts based on specific requirements or constraints. They can provide suggestions for floor plans, building layouts, material choices, and more.\n\n2. Project Documentation: LLMs can help automate the process of generating project documentation, such as specifications, reports, and proposals. They can generate coherent and contextually relevant text based on input prompts, saving time and effort for project teams.\n\n3. Natural Language Interfaces: LLMs can be used to develop natural language interfaces for AEC software applications. This allows users to interact with the software using conversational language, making it more intuitive and user-friendly.\n\n4. Code Generation: LLMs can assist in gen

## Chains

### LLMChain

In [69]:
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.chains import LLMChain

In [78]:
llm = ChatOpenAI(temperature=0.9, model=llm_model)
prompt = ChatPromptTemplate.from_template(
	template="What is the best name for a company that makes {product}"
	)
chain = LLMChain(llm=llm, prompt=prompt)
product = "umbrella for blind mantis"
names = chain.run(product)
print(names)

"MantisShield"


### SimpleSequentialChain

In [81]:
from langchain.chains import SimpleSequentialChain

# initiate the LLM
llm = ChatOpenAI(temperature=0.9, model=llm_model)

# first prompt template
first_prompt = ChatPromptTemplate.from_template(
	template="What is the best name for a company that makes {product}"
)

# first chain
first_chain = LLMChain(llm=llm, prompt=first_prompt)

# second prompt
second_prompt = ChatPromptTemplate.from_template(template="Make a 20 word description of the company: {company}")

# second chain
second_chain = LLMChain(llm=llm, prompt=second_prompt)

# SimpleSequentialChain
overallSimpleChain = SimpleSequentialChain(verbose=True, chains=[first_chain, second_chain])

# Run chain
product = "teddy bears that fart from time to time when you hug them."
overallSimpleChain.run(product)



> Entering new SimpleSequentialChain chain...
Teddy Toots
Teddy Toots is a playful and whimsical toy company specializing in producing adorable and high-quality plush teddy bears.

> Finished chain.


'Teddy Toots is a playful and whimsical toy company specializing in producing adorable and high-quality plush teddy bears.'
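
As the trace above shows, the company name produced by the first step becomes the only input of the second step. Conceptually this is plain function composition, sketched below with placeholder functions standing in for the two LLM calls (all names are hypothetical).

```python
from typing import Callable, List

Step = Callable[[str], str]


def run_simple_sequence(steps: List[Step], first_input: str) -> str:
    """Feed each step's single output into the next step as its single input."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


# Placeholder steps standing in for the two LLM calls
def name_company(product: str) -> str:
    return f"Makers of {product}"


def describe_company(company: str) -> str:
    return f"{company}: a company description would go here."


final_text = run_simple_sequence([name_company, describe_company], "teddy bears")
print(final_text)
```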

### SequentialChain

In [86]:
from langchain.chains import SequentialChain

# first_chain
prompt1 = ChatPromptTemplate.from_template(
	template="Translate the following review to English:\n\n"
	"Review = {review}"
	)
chain1 = LLMChain(llm=llm, prompt=prompt1, output_key="review_translation")

# second chain
prompt2 = ChatPromptTemplate.from_template(template="What is the language of this review?\nAnswer only the language, do not build up a full sentence\n\nReview: {review}")
chain2 = LLMChain(llm=llm, prompt=prompt2, output_key="review_language")

# third chain
prompt3 = ChatPromptTemplate.from_template(
	template="Write a 1 sentence summary of this review translation:\n\nReview translation: {review_translation}"
)
chain3 = LLMChain(llm=llm, prompt=prompt3, output_key="review_summary")

# fourth chain
prompt4 = ChatPromptTemplate.from_template(
	template="Write a short answer to the following review in the specified language" 
	"\n\nlanguage: {review_language}\n\nReview summary: {review_summary}"

)
chain4 = LLMChain(llm=llm, prompt=prompt4, output_key = "answer")

# overall chain
overallChain = SequentialChain(
	verbose=True, 
	chains=[chain1, chain2, chain3, chain4], 
	input_variables=["review"],
	output_variables=["review_language", "review_translation", "review_summary", "answer"]
	)

# input
review = "Ce sac de brique répond parfaitement à mes attentes, il contient de bonnes briques rouge bien solide."

# run the overall chain
overallChain(review)



> Entering new SequentialChain chain...

> Finished chain.


{'review': 'Ce sac de brique répond parfaitement à mes attentes, il contient de bonnes briques rouge bien solide.',
 'review_language': 'French',
 'review_translation': 'Review = This brick bag perfectly meets my expectations, it contains good, solid red bricks.',
 'review_summary': 'The reviewer is satisfied with the brick bag as it contains good quality red bricks.',
 'answer': "Je suis ravi de constater que vous êtes satisfait de notre sac de briques. Nous nous efforçons toujours de fournir des briques de haute qualité à nos clients. Merci d'avoir pris le temps de partager votre avis. Nous espérons vous revoir bientôt !"}
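
Unlike the simple variant, each step here reads named inputs and writes a named output, so later steps can combine results from several earlier ones. The sketch below illustrates that bookkeeping with a shared dictionary; the runner and the placeholder lambdas are hypothetical and stand in for the four LLM calls.

```python
from typing import Callable, Dict, List, Tuple

# Each step: (input keys it reads, output key it writes, function)
Step = Tuple[List[str], str, Callable[..., str]]


def run_sequence(steps: List[Step], inputs: Dict[str, str]) -> Dict[str, str]:
    """Run steps in order, storing every output under its own key."""
    state = dict(inputs)
    for input_keys, output_key, func in steps:
        missing = [k for k in input_keys if k not in state]
        if missing:
            raise KeyError(f"step '{output_key}' is missing inputs: {missing}")
        state[output_key] = func(*(state[k] for k in input_keys))
    return state


# Placeholder functions standing in for the four LLM calls
steps: List[Step] = [
    (["review"], "review_translation", lambda r: f"translation of: {r}"),
    (["review"], "review_language", lambda r: "French"),
    (["review_translation"], "review_summary", lambda t: f"summary of: {t}"),
    (
        ["review_language", "review_summary"],
        "answer",
        lambda lang, summary: f"reply in {lang} to: {summary}",
    ),
]

state = run_sequence(steps, {"review": "Ce sac de brique répond parfaitement à mes attentes."})
print(state["answer"])
```

Checking for missing keys before each step surfaces a mistyped variable name early, instead of failing inside a later prompt.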

## Q&A over documents

### High level (lots of abstraction)

In [1]:
import os
import docarray

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv(filename="LangChain for LLM Application Development/secrets.env", raise_error_if_not_found=True)
) # read local .env file

ROOT_DIR = os.environ["ROOT_DIR"]
llm_model = "gpt-3.5-turbo"


from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import CSVLoader
from langchain.vectorstores import DocArrayInMemorySearch
from langchain.indexes import VectorstoreIndexCreator
from IPython.display import display, Markdown

file = ROOT_DIR + "LangChain for LLM Application Development/Datasets/OutdoorClothingCatalog_1000.csv"
loader = CSVLoader(file_path=file)

index = VectorstoreIndexCreator(vectorstore_cls=DocArrayInMemorySearch).from_loaders(loaders=[loader])

Query = "List all the shirts with sun protections in a table in markdown format. For each shirt, give a summary of the product, as well as its price."

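# Pass the chat model explicitly: without `llm`, index.query() fell back to the
# deprecated `text-davinci-003` completion model and raised InvalidRequestError
# (assumes this LangChain version accepts an `llm` argument here)
llm = ChatOpenAI(temperature=0.0, model=llm_model)
response = index.query(question=Query, llm=llm)

display(Markdown(response))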

### Step by step

In [17]:
import os
from IPython.display import display, Markdown

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv(filename="LangChain for LLM Application Development/secrets.env", raise_error_if_not_found=True)
) # read local .env file

ROOT_DIR = os.environ["ROOT_DIR"]
llm_model = "gpt-3.5-turbo"

csv_path = ROOT_DIR + "LangChain for LLM Application Development/Datasets/OutdoorClothingCatalog_1000.csv"



#### Load the documents

In [3]:
# Create the loader
from langchain.document_loaders import CSVLoader
loader = CSVLoader(file_path=csv_path)

# load the csv file 
docs = loader.load()
docs_type = type(docs)
docs_length = len(docs)
print(f"docs is a {docs_type} with {docs_length} elements. \nThe first element is {docs[0]}")

docs is a <class 'list'> with 1000 elements. 
The first element is page_content=": 0\nname: Women's Campside Oxfords\ndescription: This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on. \n\nSize & Fit: Order regular shoe size. For half sizes not offered, order up to next whole size. \n\nSpecs: Approx. weight: 1 lb.1 oz. per pair. \n\nConstruction: Soft canvas material for a broken-in feel and look. Comfortable EVA innersole with Cleansport NXT® antimicrobial odor control. Vintage hunt, fish and camping motif on innersole. Moderate arch contour of innersole. EVA foam midsole for cushioning and support. Chain-tread-inspired molded rubber outsole with modified chain-tread pattern. Imported. \n\nQuestions? Please contact us for any inquiries." metadata={'source': '/Users/sylvain/Data_Science/Projects/Various-ML-projects/LangChain for LLM Application Development/Datasets/Outdoo

#### Create the Embeddings

In [6]:
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import DocArrayInMemorySearch

# Define the Embedding model
embeddings = OpenAIEmbeddings(model="text-embedding-ada-002")

# Try it out
test_query = "Hi, my name is Sylvain"
test_embedding = embeddings.embed_query(test_query)
print(f"The Embedding of {test_query} has {len(test_embedding)} elements.\nThe first 5 elements are: {test_embedding[:5]}")

# Create a DataBase with the documents and their embeddings
db = DocArrayInMemorySearch.from_documents(
	docs,
	embeddings
)
print("\ndb is created")

The Embedding of Hi, my name is Sylvain has 1536 elements.
The first 5 elements are: [-0.00748860418001652, 0.0048442697221567094, -0.004598431761413947, -0.018330691408425633, -0.032324548856810194]


#### Retrieve similar elements from db

In [12]:
# Try it out manually
query = "Select the shirts with UV protection"
UV_docs = db.similarity_search(query)

print(f"There are {len(UV_docs)} matching products.\n\
The first one is: {UV_docs[0]}")

# Build the retriever
retriever = db.as_retriever()

There are 4 matching products.
The first one is: page_content=": 374\nname: Men's Plaid Tropic Shirt, Short-Sleeve\ndescription: Our Ultracomfortable sun protection is rated to UPF 50+, helping you stay cool and dry. Originally designed for fishing, this lightest hot-weather shirt offers UPF 50+ coverage and is great for extended travel. SunSmart technology blocks 98% of the sun's harmful UV rays, while the high-performance fabric is wrinkle-free and quickly evaporates perspiration. Made with 52% polyester and 48% nylon, this shirt is machine washable and dryable. Additional features include front and back cape venting, two front bellows pockets and an imported design. With UPF 50+ coverage, you can limit sun exposure and feel secure with the highest rated sun protection available." metadata={'source': '/Users/sylvain/Data_Science/Projects/Various-ML-projects/LangChain for LLM Application Development/Datasets/OutdoorClothingCatalog_1000.csv', 'row': 374}
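
The search above ranks documents by how close their embedding is to the embedding of the query. The sketch below illustrates the principle with toy vectors, assuming cosine similarity as the metric; the vectors and document names are invented for illustration, and the real store may use a different metric or index.

```python
import math
from typing import List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query: List[float], docs: List[Tuple[str, List[float]]], k: int) -> List[str]:
    """Return the names of the k documents most similar to the query vector."""
    ranked = sorted(docs, key=lambda d: cosine_similarity(query, d[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy 3-dimensional vectors, invented for illustration (the embeddings above have 1536 values)
toy_docs = [
    ("sun shirt", [0.9, 0.1, 0.0]),
    ("rain jacket", [0.1, 0.9, 0.0]),
    ("hiking boots", [0.0, 0.2, 0.9]),
]
toy_query = [1.0, 0.2, 0.0]
print(top_k(toy_query, toy_docs, k=2))
```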


#### Create the QA Chain

In [19]:
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI

# Instantiate the LLM
llm = ChatOpenAI(temperature=0.0, model=llm_model)

# Create the QA Chain
qa_chain = RetrievalQA.from_chain_type(
	llm=llm, # the LLM instantiated above
	chain_type="stuff", # the most common chain type: "stuff" simply aggregates all documents retrieved from the db into a single prompt for the LLM
	retriever=retriever, # the retriever from our in memory db instantiated above
	verbose = True
)

# Test it
query = "List all the shirts with sun protections in a table in markdown format. For each shirt, give the UV index, its composition, and a summary of the product."

output = qa_chain.run(query)

display(Markdown(output))



> Entering new RetrievalQA chain...

> Finished chain.


| UV Index | Composition | Summary |
|----------|-------------|---------|
| 255      | 78% nylon, 22% Lycra Xtra Life fiber | Sun Shield Shirt: High-performance sun shirt with UPF 50+ sun protection. Wicks moisture for quick-drying comfort. Fits comfortably over swimsuit. Abrasion resistant. |
| 618      | 100% polyester | Men's Tropical Plaid Short-Sleeve Shirt: Lightest hot-weather shirt with UPF 50+ sun protection. Relaxed fit with front and back cape venting. Wrinkle-resistant. Two front bellows pockets. |
| 374      | 52% polyester, 48% nylon | Men's Plaid Tropic Shirt, Short-Sleeve: Ultracomfortable sun protection with UPF 50+ coverage. Great for fishing and travel. SunSmart technology blocks 98% of UV rays. Wrinkle-free and quick-drying. |
| 535      | Shell: 71% Nylon, 29% Polyester. Lining: 100% Polyester knit mesh | Men's TropicVibe Shirt, Short-Sleeve: Lightweight sun-protection shirt with UPF 50+ rating. Traditional fit with front and back cape venting. Wrinkle resistant. Two front bellows pockets. |
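
To make the "stuff" chain type more concrete, here is a rough sketch of the idea: every retrieved document is pasted into one prompt, followed by the question. The function `build_stuff_prompt`, the prompt wording, and the shortened product texts are assumptions for illustration; the actual template used by `RetrievalQA` may differ.

```python
def build_stuff_prompt(question, documents):
    """Concatenate all retrieved documents into one prompt (hypothetical template)."""
    context = "\n\n".join(documents)
    return (
        "Use the following pieces of context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Shortened stand-ins for retrieved catalog rows
toy_docs_text = [
    ": 255\nname: Sun Shield Shirt\ndescription: UPF 50+ sun protection.",
    ": 374\nname: Men's Plaid Tropic Shirt, Short-Sleeve\ndescription: UPF 50+ coverage.",
]

stuffed_prompt = build_stuff_prompt("Which shirts offer sun protection?", toy_docs_text)
print(stuffed_prompt)
```

Because the whole row text ends up in the prompt, the model also sees the leading row number of each document. That may explain why the "UV Index" column in the table above appears to hold catalog row numbers (374 matches the row returned by the similarity search earlier) rather than an actual UV rating.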

## Evaluation

In [1]:
#TODO

## Agent

In [2]:
import os

from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv(filename="secrets.env", raise_error_if_not_found=True))

import warnings
warnings.filterwarnings("ignore")

llm_model = "gpt-3.5-turbo"

### Built-in tools

In [21]:
from langchain.agents.agent_toolkits import create_python_agent
from langchain.agents import load_tools, initialize_agent, AgentType
from langchain.tools.python.tool import PythonREPLTool
from langchain.python import PythonREPL
from langchain.chat_models import ChatOpenAI
import langchain

In [24]:
# instantiate the llm
llm = ChatOpenAI(temperature=0., model=llm_model)

# load the tools for the agent
tools = load_tools(
	[
		"wikipedia",
		"llm-math",
	],
	llm=llm)

# initialize the agent
agent = initialize_agent(
	tools=tools,
	llm=llm,
	agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
	handle_parsing_errors=True,  # feed parsing errors back to the LLM instead of raising
	verbose=False,
)


In [26]:
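# German for: "I want to buy a product that costs 93 €. The product is discounted by 15%. In addition, I have to pay 4.95 € for shipping, and I have a further discount of 10 €. How much do I have to pay?"
question = "ich will ein Produkt kaufen, das 93 € kostet. Das produkt ist um 15% reduziert. Zusätzlich muss ich Versandkosten in höhe von 4,95 € zahlen, und ich habe eine weitere Reduktion in Höhe von 10 €. Wie viel muss ich zahlen?"
agent(question)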

{'input': 'ich will ein Produkt kaufen, das 93 € kostet. Das produkt ist um 15% reduziert. Zusätzlich muss ich Versandkosten in höhe von 4,95 € zahlen, und ich habe eine weitere Reduktion in Höhe von 10 €. Wie viel muss ich zahlen?',
 'output': 'You have to pay 74.0 euros.'}

In [28]:
93*(1-0.15)+4.95-10

74.0

### Math example

In [6]:
agent("how much is 25% of 300€?")



> Entering new AgentExecutor chain...
To calculate 25% of 300€, we can use the Calculator tool.

Thought: I will use the Calculator tool to calculate 25% of 300€.
Action:
```
{
  "action": "Calculator",
  "action_input": "25% of 300"
}
```

Observation: Answer: 75.0
Thought: The answer is 75.0 euros.
Final Answer: 75.0 euros.

> Finished chain.


{'input': 'how much is 25% of 300€?', 'output': '75.0 euros.'}
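
The trace above shows the model emitting a JSON blob that names a tool and its input. A minimal sketch of how such a blob could be extracted and dispatched is shown below. `parse_action`, `percent_of`, and `TOOLS` are hypothetical names for illustration; LangChain's own output parser and its `llm-math` tool work differently in detail.

```python
import json
import re

FENCE = "`" * 3  # three backticks, the delimiter around the JSON blob

def parse_action(llm_text):
    """Extract the JSON blob between the first pair of triple-backtick fences."""
    pattern = re.escape(FENCE) + r"(?:json)?\s*(\{.*?\})\s*" + re.escape(FENCE)
    match = re.search(pattern, llm_text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no action blob found")
    return json.loads(match.group(1))

def percent_of(expression):
    """Toy calculator: only understands 'X% of Y'."""
    m = re.fullmatch(r"\s*([\d.]+)%\s+of\s+([\d.]+)\s*", expression)
    if m is None:
        raise ValueError(f"unsupported expression: {expression!r}")
    return float(m.group(1)) / 100 * float(m.group(2))

# Hypothetical tool table: tool name -> callable
TOOLS = {"Calculator": percent_of}

llm_text = (
    "Thought: I will use the Calculator tool.\nAction:\n"
    + FENCE
    + '\n{\n  "action": "Calculator",\n  "action_input": "25% of 300"\n}\n'
    + FENCE
)

action = parse_action(llm_text)
observation = TOOLS[action["action"]](action["action_input"])
print(f"Observation: {observation}")
```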

In [7]:
agent("I want to buy a product on a site. The product cost 100 €. There is a reduction of 10% on it. I have to pay for sending cost an additional 4,90 € fee, and I have an additional reduction of 10€. How much will I pay?")



> Entering new AgentExecutor chain...
Question: How much will I pay for the product after all the reductions and additional fees?
Thought: I need to calculate the final price by subtracting the reductions and adding the additional fees.
Action:
```
{
  "action": "Calculator",
  "action_input": "100 - (100 * 0.1) - 10 + 4.90"
}
```

Observation: Answer: 84.9
Thought: The final price after all the reductions and additional fees is 84.9 €.
Final Answer: 84.9 €

> Finished chain.


{'input': 'I want to buy a product on a site. The product cost 100 €. There is a reduction of 10% on it. I have to pay for sending cost an additional 4,90 € fee, and I have an additional reduction of 10€. How much will I pay?',
 'output': '84.9 €'}
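
Both price questions can be cross-checked without an agent. The sketch below assumes the percentage discount applies to the list price first, and that shipping and the fixed reduction are then added and subtracted, which matches the expression the agent sent to the calculator. `final_price` is a helper name introduced here, and `Decimal` is used to avoid floating-point rounding.

```python
from decimal import Decimal

def final_price(list_price, percent_off, shipping, voucher):
    """Apply a percentage discount, then add shipping and subtract a fixed voucher."""
    discounted = list_price * (Decimal(100) - percent_off) / Decimal(100)
    return discounted + shipping - voucher

# German question: 93 €, 15% off, 4.95 € shipping, 10 € reduction
first = final_price(Decimal("93"), Decimal("15"), Decimal("4.95"), Decimal("10"))

# English question: 100 €, 10% off, 4.90 € shipping, 10 € reduction
second = final_price(Decimal("100"), Decimal("10"), Decimal("4.90"), Decimal("10"))

print(first, second)
```

The two values agree with the agent's answers of 74.0 and 84.9.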

### Wikipedia example

In [8]:
question = "Tom M. Mitchell is an American computer scientist \
and the Founders University Professor at Carnegie Mellon University (CMU) \
what book did he write?"
result = agent(question)



> Entering new AgentExecutor chain...
Thought: I can use Wikipedia to find information about Tom M. Mitchell and the book he wrote.
Action:
```
{
  "action": "Wikipedia",
  "action_input": "Tom M. Mitchell"
}
```
Observation: Page: Tom M. Mitchell
Summary: Tom Michael Mitchell (born August 9, 1951) is an American computer scientist and the Founders University Professor at Carnegie Mellon University (CMU). He is a founder and former Chair of the Machine Learning Department at CMU. Mitchell is known for his contributions to the advancement of machine learning, artificial intelligence, and cognitive neuroscience and is the author of the textbook Machine Learning. He is a member of the United States National Academy of Engineering since 2010. He is also a Fellow of the American Academy of Arts and Sciences, the American Association for the Advancement of Science and a Fellow and past President of the Association for the Advancement of Artificial Inte

In [14]:
# question = "Which movie made Robert de Niro famous?"
question = "What is the population of Bavaria?"
result = agent(question)



> Entering new AgentExecutor chain...
I can use Wikipedia to find the population of Bavaria.

Thought: I will search for "population of Bavaria" on Wikipedia.
Action:
```
{
  "action": "Wikipedia",
  "action_input": "population of Bavaria"
}
```
Observation: Page: List of cities in Bavaria by population
Summary: The following list sorts all cities and municipalities in the German state of Bavaria with a population of more than 20,000. As of December 31, 2017, 74 places fulfill this criterion and are listed here. This list refers only to the population of individual municipalities within their defined limits, which does not include other municipalities or suburban areas within urban agglomerations.

Page: Bavaria
Summary: Bavaria ( bə-VAIR-ee-ə; German: Bayern [ˈbaɪɐn] ), officially the Free State of Bavaria (German: Freistaat Bayern [ˈfʁaɪʃtaːt ˈbaɪɐn] ; Bavarian: Freistoot Bayern), is a state in the south-east of Germany. With an area of 70,550.

In [16]:
# French for: "Has Robert De Niro ever won an Oscar? If so, for which film?"
question = "Robert De Niro a-t-il déjà gagné un oscar? si oui, pour quel film?"
answer = agent(question)



> Entering new AgentExecutor chain...
Question: Robert De Niro a-t-il déjà gagné un oscar? si oui, pour quel film?
Thought: I can use Wikipedia to find information about Robert De Niro's Oscar wins.
Action:
```
{
  "action": "Wikipedia",
  "action_input": "Robert De Niro"
}
```
Observation: Page: Robert De Niro
Summary: Robert Anthony De Niro ( də NEER-roh, Italian: [de ˈniːro]; born August 17, 1943) is an American actor. Known for his collaborations with Martin Scorsese, he is considered to be one of the most influential actors of his generation. De Niro is the recipient of various accolades, including two Academy Awards, a Golden Globe Award, the Cecil B. DeMille Award, and a Screen Actors Guild Life Achievement Award. In 2009, De Niro received the Kennedy Center Honor, and earned a Presidential Medal of Freedom from U.S. President Barack Obama in 2016.
De Niro studied acting at HB Studio, Stella Adler Conservatory, and Lee Strasberg's Actors S

In [19]:
print(answer["output"])

Oui, Robert De Niro a déjà gagné un Oscar. Il a remporté l'Oscar du meilleur acteur dans un second rôle pour son rôle de Vito Corleone dans Le Parrain 2 (1974) et l'Oscar du meilleur acteur pour son rôle de Jake LaMotta dans Raging Bull (1980).


### Python Agent

In [20]:
agent = create_python_agent(
	llm=llm,
	tool=PythonREPLTool(),
	verbose=True
)

customer_list = [
	["Harrison", "Chase"],
	["Lang", "Chain"],
	["Dolly", "Too"],
	["Elle", "Elem"],
	["Geoff", "Fusion"],
	["Trance", "Former"],
	["Jen", "Ayai"],
]

agent.run(f"sort these customers by last name and then first name and print the output: {customer_list} ")



> Entering new AgentExecutor chain...
I can use the `sorted()` function to sort the list of customers. I will need to provide a key function that specifies the sorting criteria.
Action: Python REPL
Action Input: sorted([['Harrison', 'Chase'], ['Lang', 'Chain'], ['Dolly', 'Too'], ['Elle', 'Elem'], ['Geoff', 'Fusion'], ['Trance', 'Former'], ['Jen', 'Ayai']], key=lambda x: (x[1], x[0]))
Observation: 
Thought: The list of customers has been sorted by last name and then first name.
Final Answer: [['Jen', 'Ayai'], ['Harrison', 'Chase'], ['Lang', 'Chain'], ['Elle', 'Elem'], ['Geoff', 'Fusion'], ['Trance', 'Former'], ['Dolly', 'Too']]

> Finished chain.


"[['Jen', 'Ayai'], ['Harrison', 'Chase'], ['Lang', 'Chain'], ['Elle', 'Elem'], ['Geoff', 'Fusion'], ['Trance', 'Former'], ['Dolly', 'Too']]"
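
In the run above the Observation is empty: the `sorted(...)` expression was sent to the REPL without `print(...)`, so the final answer did not come from executed code. Running the same sort directly gives a reference result to compare against. The variable names below are chosen for this check only.

```python
customers = [
    ["Harrison", "Chase"],
    ["Lang", "Chain"],
    ["Dolly", "Too"],
    ["Elle", "Elem"],
    ["Geoff", "Fusion"],
    ["Trance", "Former"],
    ["Jen", "Ayai"],
]

# Sort by last name (index 1), then by first name (index 0)
by_last_then_first = sorted(customers, key=lambda name: (name[1], name[0]))

for first_name, last_name in by_last_then_first:
    print(last_name, first_name)
```

Plain string comparison places "Chain" before "Chase" and "Former" before "Fusion", so the agent's final answer above differs from the true sort order in those two pairs. This is a useful reminder to check that an agent's answer is actually backed by tool output.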

In [22]:
langchain.debug = True
agent.run(f"sort these customers by last name and then first name and print the output: {customer_list} ")
langchain.debug = False


[chain/start] [1:chain:AgentExecutor] Entering Chain run with input:
{
  "input": "sort these customers by last name and then first name and print the output: [['Harrison', 'Chase'], ['Lang', 'Chain'], ['Dolly', 'Too'], ['Elle', 'Elem'], ['Geoff', 'Fusion'], ['Trance', 'Former'], ['Jen', 'Ayai']] "
}
[chain/start] [1:chain:AgentExecutor > 2:chain:LLMChain] Entering Chain run with input:
{
  "input": "sort these customers by last name and then first name and print the output: [['Harrison', 'Chase'], ['Lang', 'Chain'], ['Dolly', 'Too'], ['Elle', 'Elem'], ['Geoff', 'Fusion'], ['Trance', 'Former'], ['Jen', 'Ayai']] ",
  "agent_scratchpad": "",
  "stop": [
    "\nObservation:",
    "\n\tObservation:"
  ]
}
[llm/start] [1:chain:AgentExecutor > 2:chain:LLMChain > 3:llm:ChatOpenAI] Entering LLM run with input:
{
  "prompts": [
    "Human: You are an agent designed to write and execute python code to answer questions.\nY

In [23]:
print(    "Human: You are an agent designed to write and execute python code to answer questions.\nYou have access to a python REPL, which you can use to execute python code.\nIf you get an error, debug your code and try again.\nOnly use the output of your code to answer the question. \nYou might know the answer without running any code, but you should still run the code to get the answer.\nIf it does not seem like you can write code to answer the question, just return \"I don't know\" as the answer.\n\n\nPython REPL: A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.\n\nUse the following format:\n\nQuestion: the input question you must answer\nThought: you should always think about what to do\nAction: the action to take, should be one of [Python REPL]\nAction Input: the input to the action\nObservation: the result of the action\n... (this Thought/Action/Action Input/Observation can repeat N times)\nThought: I now know the final answer\nFinal Answer: the final answer to the original input question\n\nBegin!\n\nQuestion: sort these customers by last name and then first name and print the output: [['Harrison', 'Chase'], ['Lang', 'Chain'], ['Dolly', 'Too'], ['Elle', 'Elem'], ['Geoff', 'Fusion'], ['Trance', 'Former'], ['Jen', 'Ayai']] \nThought:"
)

Human: You are an agent designed to write and execute python code to answer questions.
You have access to a python REPL, which you can use to execute python code.
If you get an error, debug your code and try again.
Only use the output of your code to answer the question. 
You might know the answer without running any code, but you should still run the code to get the answer.
If it does not seem like you can write code to answer the question, just return "I don't know" as the answer.


Python REPL: A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [Python REPL]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat 
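
The prompt format above describes a loop: the model writes a Thought and an Action, the tool result is appended as an Observation, and the exchange repeats until a Final Answer appears. The sketch below imitates that loop with a scripted stand-in for the model. `fake_llm`, `fake_repl`, and `run_agent` are hypothetical names, and the real `AgentExecutor` handles many more cases (stop sequences, errors, multiple tools).

```python
import re

def fake_llm(prompt):
    """Stand-in for a chat model: acts once, then gives a final answer."""
    if "Observation:" not in prompt:
        return " I should run code.\nAction: Python REPL\nAction Input: print(1 + 1)"
    return " I now know the final answer\nFinal Answer: 2"

def fake_repl(code):
    """Stand-in tool that only understands print(1 + 1)."""
    return "2" if code.strip() == "print(1 + 1)" else "error"

def run_agent(question, max_steps=3):
    scratchpad = ""  # grows with each Thought/Action/Observation round
    for _ in range(max_steps):
        prompt = f"Question: {question}\nThought:{scratchpad}"
        reply = fake_llm(prompt)
        final = re.search(r"Final Answer:\s*(.*)", reply)
        if final:
            return final.group(1).strip()
        # No final answer yet: run the requested action and record the result
        action_input = re.search(r"Action Input:\s*(.*)", reply).group(1)
        observation = fake_repl(action_input)
        scratchpad += f"{reply}\nObservation: {observation}\nThought:"
    return None  # gave up after max_steps rounds

print(run_agent("What is 1 + 1?"))
```

The growing `scratchpad` string plays the role of the `agent_scratchpad` field visible in the debug output earlier.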

### Define your own tool

In [34]:
from langchain.agents import tool
from datetime import date

@tool
def time(text: str) -> str:
	'''
	Returns today's date. Use this for any \
	questions related to knowing today's date. \
	The input should always be an empty string, \
	and this function will always return today's \
	date - any date mathematics should occur \
	outside this function.
	'''
	return str(date.today())

agent = initialize_agent(
	tools=tools + [time],
	llm=llm,
	agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
	handle_parsing_errors=True,
	verbose=True,
)

# Agents are not deterministic and can return the wrong answer
try:
	agent("what is the date today?")
except Exception:
	print("Exception on external access")



> Entering new AgentExecutor chain...
Question: what is the date today?
Thought: I can use the time tool to get the current date.
Action:
```
{
  "action": "time",
  "action_input": ""
}
```
Observation: 2024-01-14
Thought: I now know the final answer
Final Answer: The date today is 2024-01-14.

> Finished chain.

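The `@tool` decorator turns a function into a tool whose description comes from the docstring, which is why the docstring above is written as instructions to the model. A rough sketch of that idea in plain Python follows. `TOOL_REGISTRY`, `simple_tool`, and `today` are hypothetical names for illustration and do not reflect LangChain's internal implementation.

```python
from datetime import date

TOOL_REGISTRY = {}  # hypothetical registry: tool name -> description and callable

def simple_tool(func):
    """Register func under its name, using its docstring as the description."""
    TOOL_REGISTRY[func.__name__] = {
        "description": (func.__doc__ or "").strip(),
        "run": func,
    }
    return func

@simple_tool
def today(text: str) -> str:
    """Returns today's date. The input should always be an empty string."""
    return str(date.today())

# The description is what a model would read when deciding whether to call the tool
print(TOOL_REGISTRY["today"]["description"])
print(TOOL_REGISTRY["today"]["run"](""))
```

A clear, specific docstring matters here because it is the only information the model has about when to call the tool and what input to pass.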