### Checking whether my MacBook M1's GPU (through PyTorch's MPS backend) is available for model inference

In [21]:
import torch
if torch.backends.mps.is_available():
    gpu = torch.device("mps")
    print("GPU available")
else:
    print("GPU unavailable")

GPU available
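
The check above only covers Apple's MPS backend. A minimal sketch of a more general device choice follows, assuming the preference order CUDA, then MPS, then CPU. The helper name `pick_device_name` is hypothetical and not part of PyTorch.

```python
def pick_device_name(has_cuda: bool, has_mps: bool) -> str:
    """Return a torch device string, preferring CUDA, then MPS, then CPU."""
    if has_cuda:
        return "cuda"
    if has_mps:
        return "mps"
    return "cpu"


try:
    import torch  # optional: only needed to query the real hardware

    device_name = pick_device_name(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
except (ImportError, AttributeError):
    # Assumption: without a recent PyTorch install we simply report the CPU.
    device_name = pick_device_name(False, False)

print(device_name)
```

The resulting string can be passed to `torch.device(...)` or to the `device` argument of a `transformers` pipeline.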


### Saving the Hugging Face API token in the environment so that all LLMs created use it by default

In [1]:
# NOTE: save your token in a file named key.txt in the same folder, then run this cell
import os

with open("key.txt") as file:
    # strip() removes the trailing newline that most editors append
    os.environ['HUGGINGFACEHUB_API_TOKEN'] = file.read().strip()
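
A token file often ends with a newline, and a token with stray whitespace may cause authentication problems. The sketch below shows one way to read the file defensively. The helper name `load_token` and the fake token value are assumptions made for illustration only.

```python
import os
import tempfile
from pathlib import Path


def load_token(path: str) -> str:
    """Read an API token from a text file and strip surrounding whitespace."""
    token = Path(path).read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError(f"{path} is empty")
    return token


# Demonstration with a throwaway file and a fake token (not a real credential).
with tempfile.TemporaryDirectory() as tmp:
    key_file = os.path.join(tmp, "key.txt")
    Path(key_file).write_text("hf_example_token\n", encoding="utf-8")
    demo_token = load_token(key_file)

print(repr(demo_token))  # 'hf_example_token'
```

In the notebook itself, the returned value would be assigned to `os.environ['HUGGINGFACEHUB_API_TOKEN']`.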

In [2]:
from transformers.utils import logging
logging.set_verbosity_error()

# Using HuggingFace models locally via LangChain

In [None]:
from langchain.llms import HuggingFacePipeline
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

# Load the tokenizer and model directly (the tokenizer does not need a device argument;
# the device is set on the pipeline below)
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-base")

pipe = pipeline(
    'text2text-generation',
    model = model,
    device = gpu,
    tokenizer = tokenizer,
    max_length = 100
)

local_llm = HuggingFacePipeline(pipeline=pipe)

print(local_llm("What is the capital of France? "))

# (and the same langchain chain is implemented)

# NOTE: this is not demonstrated here, because a bug prevents using the MacBook M1 GPU

# Using HuggingFace models through HuggingFace Hub's free serverless inference API and LangChain

In [19]:
from langchain import PromptTemplate, HuggingFaceHub, LLMChain

## Deprecated method

In [6]:
template = """Question: {question}  
Answer : Let's think step by step."""

prompt_template = PromptTemplate(template=template, input_variables =["question"])

In [7]:
llm_chain = LLMChain(prompt= prompt_template, llm = HuggingFaceHub(repo_id="google/flan-t5-base", model_kwargs={"temperature" : 1, "max_length":64}))



In [8]:
question = "How to add flavour to a homemade cake?"
print(llm_chain.run(question))



To add flavour to a homemade cake, you can add a little bit of sugar. So the answer is add a little bit of sugar.


In [28]:
question = "what is the capital of France?"
print(llm_chain.run(question))

Paris is the capital of France. Paris is the capital of the French arrondissement of Paris. The answer: Paris.


## New supported method 

### HuggingFace Endpoint

In [15]:
from langchain_huggingface import HuggingFaceEndpoint
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"
llm = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.1 , model_kwargs={'max_length':128})

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /Users/aniketkundu12072004/.cache/huggingface/token
Login successful


In [16]:
llm

HuggingFaceEndpoint(repo_id='mistralai/Mistral-7B-Instruct-v0.3', temperature=0.1, model_kwargs={'max_length': 128}, model='mistralai/Mistral-7B-Instruct-v0.3', client=<InferenceClient(model='mistralai/Mistral-7B-Instruct-v0.3', timeout=120)>, async_client=<InferenceClient(model='mistralai/Mistral-7B-Instruct-v0.3', timeout=120)>)

In [17]:
print(llm.invoke("What is Machine Learning?"))



Machine Learning (ML) is a subset of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. It focuses on the development of computer programs that can access data and use it to learn for themselves.

The process of learning begins with observations or data, such as examples, direct experience, or instruction, in order to look for patterns in data and make better decisions in the future based on the examples that we provide. The primary aim is to allow the computers to learn automatically without human intervention or assistance and adjust actions accordingly.

Machine learning is a science of creating algorithms that allow computers to learn from data and make decisions based on that learning. It is a way of teaching computers to learn and make decisions without being explicitly programmed.

Machine learning is a method of data analysis that automates analytical model building. It is a br

In [18]:
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate

template_string = """Translate the text \
that is delimited by triple backticks \
into a style that is {style}. \
text: ```{text}```
"""

# we can use our custom prompt template for Retrieval Augmented Generation (RAG)
prompt_template = PromptTemplate(template=template_string, input_variables =["style", "text"])

In [19]:
prompt_template

PromptTemplate(input_variables=['style', 'text'], template='Translate the text that is delimited by triple backticks into a style that is {style}. text: ```{text}```\n')

In [20]:
llm_chain = LLMChain(llm=llm, prompt=prompt_template)



### Running the LangChain LLM Chain using the prompt template and prompt inputs 

In [30]:
customer_style = "American English in a calm and respectful tone "
customer_email = """
Arrr, I be fuming that me blender lid flew off and splattered me kitchen walls with smoothie! And to make matters worse, \
the warranty don't cover the cost of cleaning up me kitchen. I need yer help right now, matey!
"""
print(llm_chain.invoke({'style' : customer_style, 'text' : customer_email}))

{'style': 'American English in a calm and respectful tone ', 'text': '\n"Ah, I\'m absolutely furious that my blender lid flew off and splattered my kitchen walls with smoothie! To add insult to injury, the warranty doesn\'t cover the cost of cleaning up my kitchen. I really need your assistance right now, friend!"'}


### HuggingFace Pipeline

In [31]:
from langchain_huggingface import HuggingFacePipeline 
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

In [None]:
model_id = "gpt2"
model= AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

In [72]:
# Because of compatibility issues between PyTorch and the MacBook M1 GPU, avoid "mps" here:
# use CUDA if it is available, otherwise fall back to the CPU
import os
os.environ['PYTORCH_ENABLE_MPS_FALLBACK']='1'

fallback_device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device=fallback_device, max_new_tokens=100)

# to run on the Mac M1 GPU:
# pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device="mps", max_new_tokens=100)
# to run on an NVIDIA GPU (device=0 selects the first GPU; device_map="auto" can be passed instead of device):
# pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device=0, max_new_tokens=100)


In [75]:
hf_pipeline = HuggingFacePipeline(pipeline=pipe)

In [78]:
hf_pipeline.invoke("Automation is real")

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


'Automation is real. It is a real way to make a change to a piece of data.\n\nWith our automation system, we see that our work is real. It is a real way to make a change to a piece of data. While we make calls to machines, our machines do not make call to us.\n\nA system should never be used for specific things, but is only utilized whenever necessary, such as with the following:\n\nAutomation is real. It is a real way to'

In [80]:
from langchain_core.prompts import PromptTemplate

template_string = """Translate the text \
that is delimited by triple backticks \
into a style that is {style}. \
text: ```{text}```
"""

prompt_template= PromptTemplate.from_template(template_string)

### Creating the LangChain LLM chain

In [81]:
# an alternative shortcut for creating an LLM Chain
chain = prompt_template| hf_pipeline
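
To make the `|` shortcut less mysterious, here is a toy sketch of how pipe-style composition can work in plain Python. It is not LangChain's actual implementation. The `Step` class and both stand-in steps are hypothetical.

```python
class Step:
    """Wraps a function so that steps can be composed with `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # (a | b).invoke(x) runs a first, then feeds its result to b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Stand-ins for the real objects: a template filler and a fake "model".
fill_prompt = Step(lambda inputs: "Rewrite in a {style} style: {text}".format(**inputs))
fake_llm = Step(lambda prompt: prompt.upper())

toy_chain = fill_prompt | fake_llm
result = toy_chain.invoke({"style": "polite", "text": "hello"})
print(result)  # REWRITE IN A POLITE STYLE: HELLO
```

The point is only that the left-hand step's output becomes the right-hand step's input, which is the behaviour the chain above relies on.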

In [86]:
style= "polite English tone"
text = """
What is Machine learning?
"""

chain.invoke({
    "style": style,
    "text" : text
})

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.


'Translate the text that is delimited by triple backticks into a style that is polite English tone. text: ```\nWhat is Machine learning?\n```\nMachine learning uses statistical inference because the inputs to a machine learning algorithm need to be highly predictable and specific. They can represent more interesting data sets, but because many assumptions about the data needed to do so can be used to form a machine learning model, machine learning does tend to be more efficient to perform and more cost-effective to implement. For example, machine learning can use multiple machine learning models on the same dataset to produce very high-quality results and for different reasons. This is why the term'

# Output Parsing

The LangChain library parses the LLM's output on the assumption that the output contains certain keywords.


### Chain-of-thought reasoning with actions (ReAct)

Keywords such as Thought, Action, and Observation in the prompt template give the model's output a predictable structure, which makes it easier to parse.
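
As a rough illustration of keyword-based parsing, the sketch below extracts the three labelled fields from a made-up completion with a regular expression. The sample text and the `parse_react` helper are hypothetical. LangChain's own parsers are more elaborate.

```python
import re

# Hypothetical model output that follows the Thought / Action / Observation convention.
sample_output = """Thought: I should look up the capital of France.
Action: search[capital of France]
Observation: Paris is the capital of France."""


def parse_react(text: str) -> dict:
    """Extract the labelled fields from a ReAct-style completion."""
    fields = {}
    for label in ("Thought", "Action", "Observation"):
        # Each label is expected at the start of its own line.
        match = re.search(rf"^{label}:[ \t]*(.*)$", text, flags=re.MULTILINE)
        if match is None:
            raise ValueError(f"missing '{label}:' in model output")
        fields[label] = match.group(1).strip()
    return fields


parsed = parse_react(sample_output)
print(parsed["Action"])
```

If the model drops one of the keywords, the parser fails loudly, which is why prompts that rely on this technique spell out the expected keywords.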

In [4]:
from langchain_huggingface import HuggingFaceEndpoint
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"
llm = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.1 , model_kwargs={'max_length':128})



### An Example

In [5]:
{
  "gift": False,
  "delivery_days": 5,
  "price_value": "pretty affordable!"
}

{'gift': False, 'delivery_days': 5, 'price_value': 'pretty affordable!'}

In [6]:
customer_review = """\
This leaf blower is pretty amazing.  It has four settings: \
candle blower, gentle breeze, windy city, and tornado. \
It arrived in two days, just in time for my wife's \
anniversary present. \
I think my wife liked it so much she was speechless. \
So far I've been the only one using it, and I've been \
using it every other morning to clear the leaves on our lawn. \
It's slightly more expensive than the other leaf blowers \
out there, but I think it's worth it for the extra features.
"""

review_template = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.

Format the output as JSON with the following keys:
gift
delivery_days
price_value

text: {text}
"""

In [7]:
prompt_template = PromptTemplate.from_template(review_template)
llm_chain = prompt_template | llm

In [8]:
print(llm_chain.invoke(customer_review))

{
  "gift": true,
  "delivery_days": 2,
  "price_value": ["It's slightly more expensive than the other leaf blowers out there", "I think it's worth it for the extra features."]
}


In [9]:
from langchain.output_parsers import ResponseSchema, StructuredOutputParser

In [10]:
gift_schema = ResponseSchema(name="gift", description="Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.")
delivery_days_schema = ResponseSchema(name="delivery_days", description="How many days did it take for the product to arrive? If this information is not found, output -1.")
price_value_schema = ResponseSchema(name="price_value", description="Extract any sentences about the value or price, and output them as a comma separated Python list.")

response_schemas = [gift_schema, delivery_days_schema, price_value_schema]


In [11]:
output_parser = StructuredOutputParser.from_response_schemas(response_schemas=response_schemas)

In [12]:
format_instructions = output_parser.get_format_instructions()

In [13]:
print(format_instructions)

The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":

```json
{
	"gift": string  // Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.
	"delivery_days": string  // How many days did it take for the product to arrive? If this information is not found, output -1.
	"price_value": string  // Extract any sentences about the value or price, and output them as a comma separated Python list.
}
```


### Creating a new review template based on the output parser's format instructions

In [15]:
review_template2 = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price, \
and output them as a comma separated Python list.

text: {text}

{format_instructions}
"""

In [16]:
prompt_template = PromptTemplate.from_template(review_template2)
llm_chain = prompt_template | llm

In [22]:
response = llm_chain.invoke({"text" : customer_review, "format_instructions" : format_instructions })
print(response)



```json
{
	"gift": "True",
	"delivery_days": "2",
	"price_value": ["It's slightly more expensive than the other leaf blowers out there"]
}
```


### Parsing the output

In [23]:
output_dict= output_parser.parse(response)

In [24]:
type(output_dict)

dict

In [25]:
output_dict.get("delivery_days")

'2'
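
Note that `delivery_days` comes back as the string `'2'`, not an integer. The sketch below shows, with the standard library only, roughly what parsing a json-fenced response involves and how the values could be converted afterwards. The helper `parse_fenced_json` and the conversion rules are my own assumptions, not the parser's actual code.

```python
import json
import re

FENCE = "`" * 3  # built at runtime so this example can itself sit inside a fence

# A response shaped like the one printed above.
response_text = (
    f"{FENCE}json\n"
    '{\n'
    '\t"gift": "True",\n'
    '\t"delivery_days": "2",\n'
    '\t"price_value": ["It\'s slightly more expensive than the other leaf blowers out there"]\n'
    '}\n'
    f"{FENCE}"
)


def parse_fenced_json(text: str) -> dict:
    """Pull the JSON object out of a json-tagged code fence and decode it."""
    match = re.search(rf"{FENCE}json\s*(.*?){FENCE}", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("no json code fence found")
    return json.loads(match.group(1))


raw = parse_fenced_json(response_text)

# The model returned strings, so convert them to the types we actually want.
typed = {
    "gift": raw["gift"].strip().lower() == "true",
    "delivery_days": int(raw["delivery_days"]),
    "price_value": list(raw["price_value"]),
}
print(typed["gift"], typed["delivery_days"])
```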

# LangChain Memory

In [2]:
from langchain_huggingface import HuggingFaceEndpoint
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"


llm = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.1, verbose=False, max_new_tokens=100, stop_sequences=['\n','\n\n', '.'], top_k=50, top_p= 0.95, return_full_text=False, do_sample= True)
# NOTE: This configuration of Mistral 7B Instruct v0.3 suits my system and my specific need to mimic the behaviour of OpenAI's ChatGPT 3.5



## ConversationBufferMemory

In [3]:
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

In [4]:
from langchain_core.prompts import PromptTemplate
_DEFAULT_TEMPLATE = """The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.


Relevant pieces of previous conversation:
{history}


(You do not need to use these pieces of information if not relevant)


Current conversation:
Human: {input}
AI:"""

prompt_template = PromptTemplate.from_template(_DEFAULT_TEMPLATE)

# 'memory' stores the conversation history
memory = ConversationBufferMemory()

# a conversation chain answers each new message using the earlier conversation fetched from 'memory'
conversation = ConversationChain(llm = llm, prompt=prompt_template, memory = memory, verbose = True)

In [5]:
conversation.predict(input=[{"role": "system", "content": "Reply in short sentences in formal English tone"},
    {"role": "user", "content": "I am Aniket"}])



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.


Relevant pieces of previous conversation:



(You do not need to use these pieces of information if not relevant)


Current conversation:
Human: [{'role': 'system', 'content': 'Reply in short sentences in formal English tone'}, {'role': 'user', 'content': 'I am Aniket'}]
AI:

> Finished chain.


" Hello Aniket, it's nice to meet you"

In [6]:
conversation.predict(input="What is 1+1?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.


Relevant pieces of previous conversation:
Human: [{'role': 'system', 'content': 'Reply in short sentences in formal English tone'}, {'role': 'user', 'content': 'I am Aniket'}]
AI:  Hello Aniket, it's nice to meet you


(You do not need to use these pieces of information if not relevant)


Current conversation:
Human: What is 1+1?
AI:

> Finished chain.


' The sum of 1 and 1 is 2'

In [7]:
conversation.predict(input="What is my name?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.


Relevant pieces of previous conversation:
Human: [{'role': 'system', 'content': 'Reply in short sentences in formal English tone'}, {'role': 'user', 'content': 'I am Aniket'}]
AI:  Hello Aniket, it's nice to meet you
Human: What is 1+1?
AI:  The sum of 1 and 1 is 2


(You do not need to use these pieces of information if not relevant)


Current conversation:
Human: What is my name?
AI:

> Finished chain.


' Your name is Aniket, as you introduced yourself in our previous conversation'

In [9]:
print(memory.buffer)

Human: [{'role': 'system', 'content': 'Reply in short sentences in formal English tone'}, {'role': 'user', 'content': 'I am Aniket'}]
AI:  Hello Aniket, it's nice to meet you
Human: What is 1+1?
AI:  The sum of 1 and 1 is 2
Human: What is my name?
AI:  Your name is Aniket, as you introduced yourself in our previous conversation


In [10]:
print(memory.load_memory_variables({}))

{'history': "Human: [{'role': 'system', 'content': 'Reply in short sentences in formal English tone'}, {'role': 'user', 'content': 'I am Aniket'}]\nAI:  Hello Aniket, it's nice to meet you\nHuman: What is 1+1?\nAI:  The sum of 1 and 1 is 2\nHuman: What is my name?\nAI:  Your name is Aniket, as you introduced yourself in our previous conversation"}


### So, all the chats are stored in Conversation Buffer Memory

In [11]:
memory = ConversationBufferMemory()

In [12]:
memory.save_context({"input": "Hi"}, {"output": "What's up"})

In [13]:
memory.load_memory_variables({})

{'history': "Human: Hi\nAI: What's up"}

## Large Language Models are 'stateless'
* Each transaction is independent
* Chatbots appear to have memory by providing the full conversation as 'context'


As the conversation grows, the number of tokens sent with each request grows as well, so the cost increases.
LangChain provides several kinds of 'memory' that store and accumulate the conversation for this purpose.
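
The statelessness described above can be sketched without any real model. In the toy example below, `fake_llm` is a hypothetical stand-in that only "knows" what is in the prompt it receives, and the application rebuilds the full transcript on every call.

```python
def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: it only sees the text passed in this call."""
    if "What is my name?" in prompt:
        return "Aniket" if "I am Aniket" in prompt else "I do not know"
    return "Noted"


history = []  # the application, not the model, keeps the transcript


def chat(user_message: str) -> str:
    # Rebuild the full prompt from every earlier turn plus the new message.
    prompt = "\n".join(history + [f"Human: {user_message}", "AI:"])
    reply = fake_llm(prompt)
    history.extend([f"Human: {user_message}", f"AI: {reply}"])
    return reply


print(fake_llm("Human: What is my name?\nAI:"))  # no context: I do not know
chat("I am Aniket")
print(chat("What is my name?"))                  # with context: Aniket
```

Because the whole `history` is resent each time, the prompt grows with every turn, which is the cost problem the memory classes below try to limit.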

## Conversation Buffer Window Memory

In [25]:
from langchain.memory import ConversationBufferWindowMemory

In [26]:
# 'memory' keeps only the 2 most recent exchanges (k=2)
memory = ConversationBufferWindowMemory(k=2)

In [27]:
conversation = ConversationChain(
    llm = llm,
    memory = memory,
    verbose = False
)

In [28]:
conversation.predict(input="Hi, I am Aniket")
conversation.predict(input="Alright! Tell me about yourself")
conversation.predict(input="Good!, Can you replace the expensive paid LLM inference API like the OpenAI API?")
print(memory.buffer)

Human: Alright! Tell me about yourself
AI:  I am an artificial intelligence designed to assist users with various tasks
Human: Good!, Can you replace the expensive paid LLM inference API like the OpenAI API?
AI:   While I can't directly replace the OpenAI API, I can help you understand and implement some of its functionalities using open-source libraries like Hugging Face's Transformers


In [29]:
# any exchange older than the 2 most recent ones has been forgotten
conversation.predict(input="Thanks for the information, do you remember the first message I sent you?")

'   Yes, you asked me to tell you about myself'
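
The windowing idea can be sketched with a fixed-length `deque`. This is a simplified illustration under my own naming (`WindowMemory`), not LangChain's implementation.

```python
from collections import deque


class WindowMemory:
    """Keeps only the k most recent (human, ai) exchanges."""

    def __init__(self, k: int):
        self.turns = deque(maxlen=k)  # older exchanges fall off automatically

    def save_context(self, human: str, ai: str) -> None:
        self.turns.append((human, ai))

    def buffer(self) -> str:
        return "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)


window = WindowMemory(k=2)
window.save_context("Hi, I am Aniket", "Hello Aniket")
window.save_context("Tell me about yourself", "I am an AI assistant")
window.save_context("Can you replace a paid API?", "Partly")

print(window.buffer())  # the first exchange is no longer present
```

This also explains the answer above: the true first message has already left the window, so the model treats the oldest exchange it can still see as the first one.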

## Conversation Token Buffer Memory

In [33]:
from langchain.memory import ConversationTokenBufferMemory

memory = ConversationTokenBufferMemory(llm=llm, max_token_limit=30)
memory.save_context({"input": "AI is what?!"},
                    {"output": "Amazing!"})
memory.save_context({"input": "Backpropagation is what?"},
                    {"output": "Beautiful!"})
memory.save_context({"input": "Chatbots are what?"}, 
                    {"output": "Charming!"})

# NOTE: different LLMs use different tokenizers, so the same text can yield different token counts

In [34]:
# the oldest messages are dropped once the buffer exceeds max_token_limit
print(memory.buffer)

AI: Amazing!
Human: Backpropagation is what?
AI: Beautiful!
Human: Chatbots are what?
AI: Charming!
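
A rough sketch of token-limited pruning follows. It assumes a crude whitespace word count in place of a real tokenizer, so the limit of 14 used here is not comparable to the `max_token_limit=30` above. The class name `TokenBufferMemory` is hypothetical.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


class TokenBufferMemory:
    """Drops the oldest messages until the buffer fits within max_tokens."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.messages = []

    def save_context(self, human: str, ai: str) -> None:
        self.messages += [f"Human: {human}", f"AI: {ai}"]
        # Prune one message at a time, oldest first.
        while sum(count_tokens(m) for m in self.messages) > self.max_tokens:
            self.messages.pop(0)


token_memory = TokenBufferMemory(max_tokens=14)
token_memory.save_context("AI is what?!", "Amazing!")
token_memory.save_context("Backpropagation is what?", "Beautiful!")
token_memory.save_context("Chatbots are what?", "Charming!")

print("\n".join(token_memory.messages))
```

Pruning happens per message rather than per exchange, which is why a buffer can start with an `AI:` line whose question has already been dropped.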


## Conversation Summary Buffer Memory

In [44]:
from langchain.memory import ConversationSummaryBufferMemory
# create a long string
schedule = "There is a meeting at 8am with your product team. \
You will need your powerpoint presentation prepared. \
9am-12pm have time to work on your LangChain \
project which will go quickly because Langchain is such a powerful tool. \
At Noon, lunch at the italian resturant with a customer who is driving \
from over an hour away to meet you to understand the latest in AI. \
Be sure to bring your laptop to show the latest LLM demo."


# This memory creates a summary of the conversation over time
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=100)
memory.save_context({"input": "Hello"}, {"output": "What's up"})
memory.save_context({"input": "Not much, just hanging"},
                    {"output": "Cool"})
memory.save_context({"input": "What is on the schedule today?"}, 
                    {"output": f"{schedule}"})

In [45]:
# Older messages have been pruned so that the buffer fits into 100 tokens; pruned messages are meant to be folded into a running summary
print(memory.buffer)

AI: There is a meeting at 8am with your product team. You will need your powerpoint presentation prepared. 9am-12pm have time to work on your LangChain project which will go quickly because Langchain is such a powerful tool. At Noon, lunch at the italian resturant with a customer who is driving from over an hour away to meet you to understand the latest in AI. Be sure to bring your laptop to show the latest LLM demo.
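
The prune-then-summarize idea can be sketched as below. Everything here is an assumption for illustration: the word-count tokenizer, the class name `SummaryBufferMemory`, and `fake_summarize`, which merely clips each pruned message where a real setup would call the LLM.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def fake_summarize(previous_summary: str, pruned: list) -> str:
    """Placeholder for an LLM call: keeps the first three words of each pruned message."""
    clipped = [" ".join(message.split()[:3]) + "..." for message in pruned]
    return " ".join([previous_summary] + clipped).strip()


class SummaryBufferMemory:
    """Keeps recent messages verbatim and folds older ones into a summary."""

    def __init__(self, max_tokens: int, summarize):
        self.max_tokens = max_tokens
        self.summarize = summarize
        self.summary = ""
        self.messages = []

    def save_context(self, human: str, ai: str) -> None:
        self.messages += [f"Human: {human}", f"AI: {ai}"]
        pruned = []
        while sum(count_tokens(m) for m in self.messages) > self.max_tokens:
            pruned.append(self.messages.pop(0))
        if pruned:
            self.summary = self.summarize(self.summary, pruned)


summary_memory = SummaryBufferMemory(max_tokens=8, summarize=fake_summarize)
summary_memory.save_context("Hello", "What's up")
summary_memory.save_context("Not much, just hanging", "Cool")

print("summary:", summary_memory.summary)
print("buffer:", summary_memory.messages)
```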


In [46]:
conversation = ConversationChain(
    llm = llm,
    memory = memory,
    verbose = True
)

In [47]:

conversation.predict(input= "What would be a good demo to show?")

# NOTE: because the output is restricted to one sentence (the stop sequences include '.'), this response is much shorter than what ChatGPT would produce,
# but additional information can be retrieved by further prompting



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
AI: There is a meeting at 8am with your product team. You will need your powerpoint presentation prepared. 9am-12pm have time to work on your LangChain project which will go quickly because Langchain is such a powerful tool. At Noon, lunch at the italian resturant with a customer who is driving from over an hour away to meet you to understand the latest in AI. Be sure to bring your laptop to show the latest LLM demo.
Human: What would be a good demo to show?
AI:

> Finished chain.


" A good demo for the LLM would be a conversation between a human and an AI, where the AI is able to understand and respond to the human's questions and requests in a natural and helpful manner"

In [48]:
conversation.predict(input= "What would be a good demo to show in addition to that?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
Human: What would be a good demo to show?
AI:  A good demo for the LLM would be a conversation between a human and an AI, where the AI is able to understand and respond to the human's questions and requests in a natural and helpful manner
Human: What would be a good demo to show in addition to that?
AI:

> Finished chain.


" Another good demo could be a demonstration of the AI's ability to generate creative and original content, such as a short story, poem, or piece of music"

In [49]:
conversation.predict(input= "What will be the best outro for the demonstration?")



> Entering new ConversationChain chain...
Prompt after formatting:
The following is a friendly conversation between a human and an AI. The AI is talkative and provides lots of specific details from its context. If the AI does not know the answer to a question, it truthfully says it does not know.

Current conversation:
AI:  A good demo for the LLM would be a conversation between a human and an AI, where the AI is able to understand and respond to the human's questions and requests in a natural and helpful manner
Human: What would be a good demo to show in addition to that?
AI:  Another good demo could be a demonstration of the AI's ability to generate creative and original content, such as a short story, poem, or piece of music
Human: What will be the best outro for the demonstration?
AI:

> Finished chain.


'  For the outro, the AI could summarize the key points of the demonstration and express its gratitude for the opportunity to participate'

# Chains

In [102]:
import pandas as pd

data = pd.read_csv("Data.csv")

In [4]:
data.head()

| | Product | Review |
|---|---|---|
| 0 | Queen Size Sheet Set | I ordered a king size set. My only criticism w... |
| 1 | Waterproof Phone Pouch | I loved the waterproof sac, although the openi... |
| 2 | Luxury Air Mattress | This mattress had a small hole in the top of i... |
| 3 | Pillows Insert | This is the best throw pillow fillers on Amazo... |
| 4 | Milk Frother Handheld\n | I loved this product. But they only seem to l... |


In [5]:
data.describe()

| | Product | Review |
|---|---|---|
| count | 7 | 7 |
| unique | 7 | 7 |
| top | Queen Size Sheet Set | I ordered a king size set. My only criticism w... |
| freq | 1 | 1 |


In [139]:
from langchain_huggingface import HuggingFaceEndpoint
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"


llm = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.1, verbose=False, max_new_tokens=10, truncate=True, stop_sequences= ["\n\n",'\n'], top_k=50, top_p= 0.95, return_full_text=False, do_sample= True)
# NOTE: This configuration of Mistral 7B Instruct v0.3 suits my system and my specific need to mimic the behaviour of OpenAI's ChatGPT 3.5

llm2 = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.1 , model_kwargs={'max_length':128})



In [140]:
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate

In [55]:
template= "What is the best name to describe a company that makes {product} ?"

prompt_template = PromptTemplate.from_template(template)

In [56]:
# chain = LLMChain(llm = llm, prompt = prompt_template)
chain =  prompt_template | llm

In [57]:
chain

PromptTemplate(input_variables=['product'], template='What is the best name to describe a company that makes {product} ?')
| HuggingFaceEndpoint(repo_id='mistralai/Mistral-7B-Instruct-v0.3', temperature=0.1, model_kwargs={'max_length': 128}, model='mistralai/Mistral-7B-Instruct-v0.3', client=<InferenceClient(model='mistralai/Mistral-7B-Instruct-v0.3', timeout=120)>, async_client=<InferenceClient(model='mistralai/Mistral-7B-Instruct-v0.3', timeout=120)>)

In [59]:
product = "Queen Size Sheet Set"
print(chain.invoke(product))




There are several options for naming a company that specializes in making Queen Size Sheet Sets. Here are a few suggestions:

1. QueenBedLinens Co.
2. RoyalRest Sheets
3. DreamWeaver Queen
4. Monarch Mattress Linens
5. QueenCove Bedding
6. SereneSlumber Sheets
7. QueenSleeper Linens
8. RegalRest Linens
9. QueenCrest Bedding
10. QueenPillowcase Co.

These names all play off the size of the bed (Queen) and the idea of luxury and comfort associated with the product. You could also consider adding a personal touch by incorporating your own name or initials into the name, such as "Jane's Queen Linens" or "QC Bedding Co." Ultimately, the best name for your company will depend on your brand identity and the image you want to project to your customers.


## Simple Sequential Chain (when we have only one input and need only one output)

In [74]:
product = "Queen Size Sheet Set"

In [75]:
first_prompt = PromptTemplate.from_template(
    "Answer in fewer than 3 words and don't put any '\\n'. What is the best name to describe a company that makes {product}?"
)

In [76]:
chain1 = LLMChain(llm = llm, prompt= first_prompt)

In [77]:
second_prompt = PromptTemplate.from_template(
    "Write a 20-word description for the following "
    "company, which we got as the answer: {company_name}"
)


In [78]:
chain2 = LLMChain(llm = llm2, prompt= second_prompt)

In [79]:
from langchain.chains import SimpleSequentialChain


In [80]:
overallchain = SimpleSequentialChain(chains = [chain1, chain2], verbose = True)

In [81]:
overallchain.invoke(product)



> Entering new SimpleSequentialChain chain...


Queen Bedding Manufacturer


"Queen Bedding Manufacturer: Crafting Comfortable, Durable, and Stylish Queen-Sized Mattresses and Bedding Solutions."

This description highlights the company's focus on producing high-quality queen-sized mattresses and bedding, emphasizing comfort, durability, and style. It also implies that the company offers a range of solutions for customers seeking queen-sized bedding products.

> Finished chain.


{'input': 'Queen Size Sheet Set',
 'output': '\n\n"Queen Bedding Manufacturer: Crafting Comfortable, Durable, and Stylish Queen-Sized Mattresses and Bedding Solutions."\n\nThis description highlights the company\'s focus on producing high-quality queen-sized mattresses and bedding, emphasizing comfort, durability, and style. It also implies that the company offers a range of solutions for customers seeking queen-sized bedding products.'}
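
The single-input, single-output flow above can be pictured as plain function composition. The sketch below uses two hypothetical stand-in functions where the notebook uses LLM calls, and `run_simple_sequence` is my own name for the idea.

```python
from typing import Callable, List


def run_simple_sequence(steps: List[Callable[[str], str]], first_input: str) -> str:
    """Feed one string through each step; every output becomes the next input."""
    value = first_input
    for step in steps:
        value = step(value)
    return value


# Stand-ins for the two LLM calls (no model is contacted here).
def name_company(product: str) -> str:
    return f"{product} Co."


def describe_company(company_name: str) -> str:
    return f"{company_name} makes bedding."


final_output = run_simple_sequence([name_company, describe_company], "Queen Size Sheet Set")
print(final_output)  # Queen Size Sheet Set Co. makes bedding.
```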

## Sequential Chain (when we have multiple inputs)

In [151]:
first_prompt = PromptTemplate.from_template("Translate the following review inside``` to formal English in no more than the length of the input : ```{Review}```")
chain1 = LLMChain(llm = llm2, prompt = first_prompt, output_key="Formal_English_review")

In [152]:
second_prompt = PromptTemplate.from_template("Provide a review of the input in about 50 words : \n\n {Formal_English_review}")
chain2 = LLMChain(llm = llm2, prompt = second_prompt, output_key="summary")

In [153]:
third_prompt = PromptTemplate.from_template("Answer the following question in 1 word and do not add any sentence further. What language is the following review within ```? : ```{Review}```")
chain3 = LLMChain(llm = llm2, prompt = third_prompt, output_key ="language")

In [154]:
fourth_prompt = PromptTemplate.from_template("Write a well structured response to the following summary in the specified language in a formal tone of a Customer Care Worker. Provide a formal apology if the review sounds negative : \n\n Summary: {summary} \n\n Language: {language}")
chain4 = LLMChain(llm= llm2, prompt = fourth_prompt, output_key="followup_message")

In [155]:
from langchain.chains import SequentialChain
overallchain = SequentialChain(
    chains = [chain1, chain2, chain3, chain4],
    input_variables = ["Review"],
    output_variables = ["Formal_English_review", "summary", "language", "followup_message"],
    verbose = True
)

In [156]:
review = data.Review[5]
overallchain.invoke(review)



> Entering new SequentialChain chain...

> Finished chain.


{'Review': "Je trouve le goût médiocre. La mousse ne tient pas, c'est bizarre. J'achète les mêmes dans le commerce et le goût est bien meilleur...\nVieux lot ou contrefaçon !?",
 'Formal_English_review': '\n\nI find the taste mediocre. The mousse does not hold up, which is strange. I purchase the same in the market and the taste is significantly better...Old batch or counterfeit?\n\nI find the flavor to be of mediocre quality. The mousse does not maintain its structure, which is peculiar. I purchase the same product in the market and the flavor is significantly improved...Possibly an old batch or a counterfeit.',
 'summary': '\n\nThe taste is unremarkable. The mousse lacks consistency, which is unusual. I buy the same product in the market and the taste is noticeably better...Could be an old batch or a counterfeit.',
 'language': ' French',
 'followup_message': "\n\n Réponse :\n\n Chère cliente,\n\n Nous sommes désolés de constater que votre expérience avec notre produit n'a pas été sa
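
Unlike the simple variant, a sequential chain keeps a dictionary of named values, so a later step can read the original input as well as any earlier output. The sketch below mimics that bookkeeping with plain functions; the `Step` tuple layout and `run_sequential` are assumptions made for illustration, and the lambdas are stubs rather than LLM calls.

```python
from typing import Callable, Dict, List, Tuple

# Each step: (input keys, output key, function). The functions are hypothetical stubs.
Step = Tuple[List[str], str, Callable[..., str]]


def run_sequential(steps: List[Step], inputs: Dict[str, str]) -> Dict[str, str]:
    """Run steps in order; each reads named values and adds one named output."""
    memory = dict(inputs)
    for input_keys, output_key, fn in steps:
        missing = [k for k in input_keys if k not in memory]
        if missing:
            raise KeyError(f"step '{output_key}' is missing inputs: {missing}")
        memory[output_key] = fn(*(memory[k] for k in input_keys))
    return memory


steps: List[Step] = [
    (["Review"], "Formal_English_review", lambda r: f"formal({r})"),
    (["Formal_English_review"], "summary", lambda t: f"summary({t})"),
    (["Review"], "language", lambda r: "French"),
    (["summary", "language"], "followup_message", lambda s, l: f"reply to {s} in {l}"),
]

state = run_sequential(steps, {"Review": "Je trouve le gout mediocre."})
print(state["followup_message"])
```

The key names must line up exactly: a step whose input key was never produced fails immediately, which is the same class of error a misspelled `output_key` would cause.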

# MultiPrompt Chains using a Router Chain

<img src="./multipromptchain.png" height="450px">

In [56]:
from langchain_huggingface import HuggingFaceEndpoint
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"


llm = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.1, verbose=False, max_new_tokens=10, truncate=True, stop_sequences= ["\n\n",'\n'], top_k=50, top_p= 0.95, return_full_text=False, do_sample= True)
# NOTE: This configuration of Mistral 7B Instruct v0.3 is suited to my system and to my specific need of mimicking the behaviour of OpenAI's GPT-3.5 (ChatGPT)

llm2 = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.1 , model_kwargs={'max_length':2000})

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /Users/aniketkundu12072004/.cache/huggingface/token
Login successful


### Building a template for each subchain

In [57]:
physics_template = """You are a very smart physics professor. \
You are great at answering questions about physics in a concise \
and easy-to-understand manner. \
When you don't know the answer to a question you admit \
that you don't know.

Here is a question:
{input}"""


math_template = """You are a very good mathematician. \
You are great at answering math questions. \
You are so good because you are able to break down \
hard problems into their component parts, \
answer the component parts, and then put them together \
to answer the broader question.

Here is a question:
{input}"""

history_template = """You are a very good historian. \
You have an excellent knowledge of and understanding of people, \
events and contexts from a range of historical periods. \
You have the ability to think, reflect, debate, discuss and \
evaluate the past. You have a respect for historical evidence \
and the ability to make use of it to support your explanations \
and judgements.

Here is a question:
{input}"""


computerscience_template = """You are a successful computer scientist. \
You have a passion for creativity, collaboration, \
forward-thinking, confidence, strong problem-solving capabilities, \
understanding of theories and algorithms, and excellent communication \
skills. You are great at answering coding questions. \
You are so good because you know how to solve a problem by \
describing the solution in imperative steps \
that a machine can easily interpret and you know how to \
choose a solution that has a good balance between \
time complexity and space complexity.

Here is a question:
{input}"""

### This information will be provided to the Router Chain to help it decide when to use a subchain

In [58]:
prompt_infos = [
{
    "name": "physics", 
    "description": "Good for answering questions about physics", 
    "prompt_template": physics_template
},
{
    "name": "math", 
    "description": "Good for answering math questions", 
    "prompt_template": math_template
},
{
    "name": "History", 
    "description": "Good for answering history questions", 
    "prompt_template": history_template
},
{
    "name": "computer science", 
    "description": "Good for answering computer science questions", 
    "prompt_template": computerscience_template
}
]

In [59]:
# MultiPromptChain: routes between multiple different prompt templates
from langchain.chains.router import MultiPromptChain

# LLMRouterChain: uses an LLM to route between different subchains
# RouterOutputParser: parses the LLM output into a dictionary that is used downstream to determine
# which chain to use and what the input to that chain should be
from langchain.chains.router.llm_router import LLMRouterChain, RouterOutputParser

In [60]:
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain

### Creating the subchains which shall be called by the router chain

In [71]:
destination_chains = {}
for p_info in prompt_infos:
    name = p_info["name"]
    prompt_template = p_info["prompt_template"]
    prompt = PromptTemplate.from_template(prompt_template)
    chain = LLMChain(llm=llm2, prompt=prompt)
    destination_chains[name] = chain  
    
destinations = [f"{p['name']}: {p['description']}" for p in prompt_infos]
destinations_str = "\n".join(destinations)

In [72]:
destinations

['physics: Good for answering questions about physics',
 'math: Good for answering math questions',
 'History: Good for answering history questions',
 'computer science: Good for answering computer science questions']

In [73]:
destinations_str

'physics: Good for answering questions about physics\nmath: Good for answering math questions\nHistory: Good for answering history questions\ncomputer science: Good for answering computer science questions'

### Default subchain in case the router decides to use none of the subchains

In [74]:
# the router falls back to this chain when it cannot decide which subchain to use
default_prompt = PromptTemplate.from_template("{input}")
default_chain = LLMChain(llm=llm, prompt=default_prompt)

### Router's multi-prompt template boilerplate

In [75]:
MULTI_PROMPT_ROUTER_TEMPLATE = """Given a raw text input to a \
language model select the model prompt best suited for the input. \
You will be given the names of the available prompts and a \
description of what the prompt is best suited for. \
You may also revise the original input if you think that revising \
it will ultimately lead to a better response from the language model.

<< FORMATTING >>
Return a markdown code snippet with a JSON object formatted to look like:
```json
{{{{
    "destination": string \ name of the prompt to use or "DEFAULT"
    "next_inputs": string \ a potentially modified version of the original input
}}}}
```

REMEMBER: "destination" MUST be one of the candidate prompt \
names specified below OR it can be "DEFAULT" if the input is not \
well suited for any of the candidate prompts.
REMEMBER: "next_inputs" can just be the original input \
if you don't think any modifications are needed.

<< CANDIDATE PROMPTS >>
{destinations}

<< INPUT >>
{{input}}

<< OUTPUT (remember to include the ```json)>>"""

In [76]:
# formatting the boilerplate by adding the details of the destinations for the router to choose between
router_template = MULTI_PROMPT_ROUTER_TEMPLATE.format(destinations = destinations_str)
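
The doubled and quadrupled braces in the boilerplate exist because the text is formatted twice: once here with `str.format`, and once more when the prompt template fills in `{input}`. The sketch below shows both passes on a shortened, hypothetical stand-in for the template, assuming the prompt template's default f-string style behaves like `str.format`.

```python
# A shortened, hypothetical stand-in for MULTI_PROMPT_ROUTER_TEMPLATE.
template = """Pick a prompt.
{{{{
    "destination": string
}}}}
<< CANDIDATE PROMPTS >>
{destinations}
<< INPUT >>
{{input}}"""

# First pass: str.format fills {destinations} and halves every doubled brace,
# so {{{{ becomes {{ and {{input}} becomes {input}.
router_text = template.format(destinations="physics: Good for physics")
print(router_text)

# Second pass (what happens when the prompt is finally rendered): {input} is
# filled and the remaining {{ }} collapse to literal JSON braces.
final_text = router_text.format(input="What is black body radiation?")
print(final_text)
```

If the JSON braces were written with only two braces instead of four, the first pass would already reduce them to single braces and the second pass would try to treat the JSON body as a placeholder.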

### Creating a Router Prompt Template

In [77]:
router_prompt = PromptTemplate.from_template(
    template = router_template,
    output_parser =RouterOutputParser()
)

## Creating a Router chain

In [78]:
router_chain = LLMRouterChain.from_llm(llm2, router_prompt)
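
The router only works if the LLM's reply can be turned back into a destination and an input. As a rough illustration of that parsing step, the function below extracts the fenced JSON object from a reply and maps `"DEFAULT"` to `None`. It is a simplified sketch with a hypothetical name (`parse_router_reply`), not the code of `RouterOutputParser`.

```python
import json
import re
from typing import Optional, Tuple

FENCE = "`" * 3  # three backticks, built programmatically to keep the example easy to embed
_JSON_BLOCK = re.compile(
    re.escape(FENCE) + r"json\s*(\{.*?\})\s*" + re.escape(FENCE), re.DOTALL
)


def parse_router_reply(reply: str) -> Tuple[Optional[str], str]:
    """Return (destination, next_inputs); destination is None for "DEFAULT"."""
    match = _JSON_BLOCK.search(reply)
    if match is None:
        raise ValueError("no fenced json block found in the router reply")
    payload = json.loads(match.group(1))
    destination = payload["destination"].strip()
    if destination.upper() == "DEFAULT":
        return None, payload["next_inputs"]
    return destination, payload["next_inputs"]


reply = FENCE + 'json\n{"destination": "physics", "next_inputs": "What is black body radiation?"}\n' + FENCE
print(parse_router_reply(reply))
```

A reply without the fenced block raises an error here, which is why the router prompt repeats the reminder to include the json fence.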

## Creating a MultiPrompt chain

In [79]:
chain = MultiPromptChain(router_chain=router_chain, 
                         destination_chains=destination_chains, 
                         default_chain=default_chain, 
                         verbose=True
                        )
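
Once the router has produced a destination, the multi-prompt chain is essentially a dictionary lookup with a fallback. The sketch below illustrates that dispatch with stub functions; `route`, `toy_chains` and `toy_default` are hypothetical names, and falling back silently on an unknown destination is a choice made for this sketch (a stricter version could raise instead).

```python
from typing import Callable, Dict, Optional


def route(
    destination: Optional[str],
    next_inputs: str,
    chains: Dict[str, Callable[[str], str]],
    default: Callable[[str], str],
) -> str:
    """Call the chain registered under `destination`, else the default chain."""
    chain = chains.get(destination, default)
    return chain(next_inputs)


# Stub subchains standing in for the LLM-backed destination chains.
toy_chains: Dict[str, Callable[[str], str]] = {
    "physics": lambda q: f"[physics] {q}",
    "math": lambda q: f"[math] {q}",
}


def toy_default(q: str) -> str:
    return f"[default] {q}"


print(route("physics", "What is black body radiation?", toy_chains, toy_default))
print(route(None, "Tell me a joke", toy_chains, toy_default))
```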

## Running the MultiPrompt Chain 

In [80]:
chain.invoke("What is black body radiation?")



> Entering new MultiPromptChain chain...
physics: {'input': 'What is black body radiation?'}
> Finished chain.


{'input': 'What is black body radiation?',
 'text': '\n\nBlack body radiation is a type of electromagnetic radiation that is emitted by a perfect absorber (also known as a black body) of radiation. A black body is an idealized object that absorbs all incoming electromagnetic radiation, regardless of wavelength or angle of incidence.\n\nBlack body radiation is characterized by a specific spectral distribution of energy, known as the black body spectrum. The black body spectrum is a continuous curve that peaks at a specific wavelength, known as the peak wavelength, which depends on the temperature of the black body. At lower temperatures, the peak wavelength is longer (i.e., in the infrared region), while at higher temperatures, the peak wavelength is shorter (i.e., in the visible or ultraviolet region).\n\nThe study of black body radiation is important in physics because it provides a fundamental understanding of the behavior of electromagnetic radiation in thermal systems. Black body r

# Q&A over documents

In [2]:
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import CSVLoader
from langchain_community.vectorstores import DocArrayInMemorySearch
from IPython.display import display, Markdown

In [75]:
from langchain_huggingface import HuggingFaceEndpoint
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"


llm1 = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.1, max_new_tokens=10, truncate=True, stop_sequences= ["\n\n","\n"])
# NOTE: This configuration of Mistral 7B Instruct v0.3 is suited to my system and to my specific need of mimicking the behaviour of OpenAI's GPT-3.5 (ChatGPT)

llm2 = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.1 , model_kwargs={'max_length':128})

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /Users/aniketkundu12072004/.cache/huggingface/token
Login successful


### Loading the documents

In [4]:
file = 'OutdoorClothingCatalog_1000.csv'
loader = CSVLoader(file_path=file)

In [29]:
docs = loader.load()

In [6]:
print(docs[0])

page_content=": 0\nname: Women's Campside Oxfords\ndescription: This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on. \n\nSize & Fit: Order regular shoe size. For half sizes not offered, order up to next whole size. \n\nSpecs: Approx. weight: 1 lb.1 oz. per pair. \n\nConstruction: Soft canvas material for a broken-in feel and look. Comfortable EVA innersole with Cleansport NXT® antimicrobial odor control. Vintage hunt, fish and camping motif on innersole. Moderate arch contour of innersole. EVA foam midsole for cushioning and support. Chain-tread-inspired molded rubber outsole with modified chain-tread pattern. Imported. \n\nQuestions? Please contact us for any inquiries." metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 0}


## Embeddings

In [7]:
from langchain_huggingface import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings()

In [8]:
embed = embeddings.embed_query("Hi! My name is Aniket")

In [9]:
print(len(embed))

768


In [10]:
embed[:5]

[0.054796554148197174,
 -0.011067654937505722,
 -0.026447368785738945,
 0.03328034654259682,
 0.05591234937310219]

### Storing the vector embeddings of all the documents in a vector storage

In [11]:

db = DocArrayInMemorySearch.from_documents(docs, embeddings)



In [12]:
query = "Please suggest a shirt with Sunblocking"

### Retrieving the list of documents related to the query 

In [13]:
# NOTE: this rebinds `docs` from the full list of loaded documents to the search results
docs = db.similarity_search(query)

In [14]:
list(docs)

[Document(page_content=': 255\nname: Sun Shield Shirt by\ndescription: "Block the sun, not the fun – our high-performance sun shirt is guaranteed to protect from harmful UV rays. \n\nSize & Fit: Slightly Fitted: Softly shapes the body. Falls at hip.\n\nFabric & Care: 78% nylon, 22% Lycra Xtra Life fiber. UPF 50+ rated – the highest rated sun protection possible. Handwash, line dry.\n\nAdditional Features: Wicks moisture for quick-drying comfort. Fits comfortably over your favorite swimsuit. Abrasion resistant for season after season of wear. Imported.\n\nSun Protection That Won\'t Wear Off\nOur high-performance fabric provides SPF 50+ sun protection, blocking 98% of the sun\'s harmful rays. This fabric is recommended by The Skin Cancer Foundation as an effective UV protectant.', metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 255}),
 Document(page_content=": 679\nname: Women's Tropical Tee, Sleeveless\ndescription: Our five-star sleeveless button-up shirt has a fit to fla

In [15]:
docs[0]

Document(page_content=': 255\nname: Sun Shield Shirt by\ndescription: "Block the sun, not the fun – our high-performance sun shirt is guaranteed to protect from harmful UV rays. \n\nSize & Fit: Slightly Fitted: Softly shapes the body. Falls at hip.\n\nFabric & Care: 78% nylon, 22% Lycra Xtra Life fiber. UPF 50+ rated – the highest rated sun protection possible. Handwash, line dry.\n\nAdditional Features: Wicks moisture for quick-drying comfort. Fits comfortably over your favorite swimsuit. Abrasion resistant for season after season of wear. Imported.\n\nSun Protection That Won\'t Wear Off\nOur high-performance fabric provides SPF 50+ sun protection, blocking 98% of the sun\'s harmful rays. This fabric is recommended by The Skin Cancer Foundation as an effective UV protectant.', metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 255})
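
Conceptually, a similarity search embeds the query and ranks the stored vectors by how close they are to it. The toy example below uses hand-written 3-dimensional vectors and cosine similarity; both the vectors and the choice of cosine as the metric are assumptions for illustration, not a description of how `DocArrayInMemorySearch` is implemented.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: Sequence[float], doc_vecs: List[Sequence[float]], k: int = 2) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs for the k most similar vectors."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy "embeddings"; real ones in this notebook have 768 dimensions.
doc_vecs = [[1.0, 0.0, 0.0], [0.8, 0.6, 0.0], [0.0, 0.0, 1.0]]
query_vec = [0.9, 0.1, 0.0]

ranked = top_k(query_vec, doc_vecs)
print(ranked)
```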

### Creating a retriever

> A *retriever* is a generic interface that can be backed by any method that takes in a query and returns documents. A vector store with embeddings is one such method.

In [16]:
retriever = db.as_retriever()
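
To make the "generic interface" point concrete, here is a retriever that uses no embeddings at all: it ranks texts by how many query words they share. The `SupportsRetrieve` protocol, the `retrieve` method name and the `KeywordRetriever` class are all hypothetical and only illustrate the idea; LangChain's own retriever interface has different method names.

```python
from typing import List, Protocol


class SupportsRetrieve(Protocol):
    """Anything that maps a query string to a list of texts."""

    def retrieve(self, query: str) -> List[str]: ...


class KeywordRetriever:
    """Ranks texts by the number of distinct words shared with the query."""

    def __init__(self, texts: List[str], k: int = 2) -> None:
        self.texts = texts
        self.k = k

    def retrieve(self, query: str) -> List[str]:
        query_words = set(query.lower().split())
        scored = [(len(query_words & set(t.lower().split())), t) for t in self.texts]
        scored = [pair for pair in scored if pair[0] > 0]  # drop texts with no overlap
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [text for _, text in scored[: self.k]]


texts = [
    "Sun Shield Shirt with UPF 50+ sun protection",
    "Recycled Waterhog Dog Mat",
    "Women's Tropical Tee with sun protection",
]
keyword_retriever: SupportsRetrieve = KeywordRetriever(texts)
hits = keyword_retriever.retrieve("shirt with sun protection")
print(hits)
```

Any object with the same method shape could be swapped in, which is what lets a Q&A chain stay unaware of how the documents were found.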

### Combining the documents into a single piece of text

In [17]:
qdocs = "".join(doc.page_content for doc in docs)

In [18]:
qdocs

': 255\nname: Sun Shield Shirt by\ndescription: "Block the sun, not the fun – our high-performance sun shirt is guaranteed to protect from harmful UV rays. \n\nSize & Fit: Slightly Fitted: Softly shapes the body. Falls at hip.\n\nFabric & Care: 78% nylon, 22% Lycra Xtra Life fiber. UPF 50+ rated – the highest rated sun protection possible. Handwash, line dry.\n\nAdditional Features: Wicks moisture for quick-drying comfort. Fits comfortably over your favorite swimsuit. Abrasion resistant for season after season of wear. Imported.\n\nSun Protection That Won\'t Wear Off\nOur high-performance fabric provides SPF 50+ sun protection, blocking 98% of the sun\'s harmful rays. This fabric is recommended by The Skin Cancer Foundation as an effective UV protectant.: 679\nname: Women\'s Tropical Tee, Sleeveless\ndescription: Our five-star sleeveless button-up shirt has a fit to flatter and SunSmart™ protection to block the sun’s harmful UV rays. Size & Fit: Slightly Fitted: Softly shapes the body.

In [19]:
response = llm2.invoke(f"{qdocs} Question: Please list all shirts with sun protection in a table in markdown and summarize each one.")


In [20]:
display(Markdown(response))



| Shirt Name | Description | Size & Fit | Fabric & Care | Additional Features |
| --- | --- | --- | --- | --- |
| Sun Shield Shirt by Columbia | Block the sun, not the fun – our high-performance sun shirt is guaranteed to protect from harmful UV rays. | Slightly Fitted: Softly shapes the body. Falls at hip. | 78% nylon, 22% Lycra Xtra Life fiber. UPF 50+ rated – the highest rated sun protection possible. Handwash, line dry. | Wicks moisture for quick-drying comfort. Fits comfortably over your favorite swimsuit. Abrasion resistant for season after season of wear. Imported. |
| Women's Tropical Tee, Sleeveless | Our five-star sleeveless button-up shirt has a fit to flatter and SunSmart™ protection to block the sun’s harmful UV rays. | Slightly Fitted: Softly shapes the body. Falls at hip. | Shell: 71% nylon, 29% polyester. Cape lining: 100% polyester. | Updated design with smoother buttons. Wrinkle resistant. Low-profile pockets and side shaping offer a more flattering fit. Front and back cape venting. Two front pockets, tool tabs and eyewear loop. Imported. |
| Girls' Beachside Breeze Shirt, Half-Sleeve | A great choice for sun protection on long days at the beach or as extra coverage for watersports, this girls' swim shirt coordinates easily with all of our swimwear. |  | 80% nylon, 20% Lycra Xtra Life. | Rash guard features built-in UPF 50+ – the highest rated sun protection possible. Snag- and fade-resistant fabric holds its shape season after season. Durable seawater-resistant fabric retains its color. Cover-stitched seams are super durable. Imported. |
| Sunrise Tee | Stay cool, comfortable and dry on the hottest days in our women's UV-protective button down shirt. The lightweight, high-performance fabric wicks away moisture and dries quickly. | Slightly Fitted:

## Creating a Q&A retrieval chain

> A Q&A retrieval chain combines a retriever with a Q&A chain: it fetches the relevant documents from the retriever, then uses the Q&A chain to answer the question based on those documents.

In [21]:
# chain_type="stuff" places all retrieved documents into a single prompt for the LLM
qa_stuff = RetrievalQA.from_chain_type(
    llm=llm2, 
    chain_type="stuff", 
    retriever=retriever, 
    verbose=True
)
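
The "stuff" strategy is simple enough to sketch by hand: join the retrieved documents with a separator and place them, together with the question, into one prompt. The prompt wording and the `build_stuff_prompt` helper below are hypothetical; only the idea of concatenating everything into a single context is taken from the chain type.

```python
from typing import List


def build_stuff_prompt(documents: List[str], question: str, separator: str = "\n\n") -> str:
    """Concatenate all documents into one context block followed by the question."""
    context = separator.join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_stuff_prompt(
    ["name: Sun Shield Shirt", "name: Sunrise Tee"],
    "Which shirts offer sun protection?",
    separator="<<<<>>>>>",
)
print(prompt)
```

The trade-off is that every retrieved document counts against the model's context window, so this approach suits a handful of short documents rather than a large corpus.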

In [22]:
query = "Please list all your shirts with sun protection in a table \
in markdown and summarize each one."

In [23]:
response = qa_stuff.invoke(query)['result']



> Entering new RetrievalQA chain...

> Finished chain.


In [24]:
display(Markdown(response))

| Name | Description | Size & Fit | Fabric & Care | Additional Features |
|--------|-------------|------------|--------------|-------------------|
| Women's Tropical Tee, Sleeveless | Our five-star sleeveless button-up shirt has a fit to flatter and SunSmart™ protection to block the sun’s harmful UV rays. | Slightly Fitted: Softly shapes the body. Falls at hip. | Shell: 71% nylon, 29% polyester. Cape lining: 100% polyester. | Built-in SunSmart™ UPF 50+ rated – the highest rated sun protection possible. Machine wash and dry. Additional Features: Updated design with smoother buttons. Wrinkle resistant. Low-profile pockets and side shaping offer a more flattering fit. Front and back cape venting. Two front pockets, tool tabs and eyewear loop. Imported. Sun Protection That Won't Wear Off: Our high-performance fabric provides SPF 50+ sun protection, blocking 98% of the sun's harmful rays. |
| Sunrise Tee | Stay cool, comfortable and dry on the hottest days in our women's UV-protective button down shirt. The lightweight, high-performance fabric wicks away moisture and dries quickly. | Slightly Fitted: Softly shapes the body. Falls at hip. | Lightweight performance synthetic wicks moisture, resists wrinkles and dries fast. Shell: 71% nylon, 29% polyester. Cape lining: 100% polyester. | Built-in SunSmart™ UPF 50+ rated – the highest rated sun protection possible. The high-performance fabric keeps you cool and comfortable by wicking perspiration away. Smoother buttons, low-profile pockets and side shaping for a flattering fit. Front and back cape venting. Two front pockets, tool tabs and eyewear loop. Wrinkle free. Imported. |
| Sun Shield Shirt by | Block the sun, not the fun – our high-performance sun shirt is guaranteed to protect from harmful UV rays. | Slightly Fitted: Softly shapes the body. Falls at hip. | 78%

## Alternative one-line approach

In [25]:
from langchain.indexes import VectorstoreIndexCreator

# NOTE: A vector index is an indexing technique applied to data stored in vector form.
#       It organizes the vectorized data efficiently so that items can easily be
#       retrieved based on their similarity or distance relationships.


index = VectorstoreIndexCreator(
    vectorstore_cls=DocArrayInMemorySearch,
    embedding=embeddings,
).from_loaders([loader])

In [26]:
import gc 
gc.collect()

20

# Evaluation

In [27]:
# in continuation to the previous section

""" Creating a Q&A retrieval chain """
qa = RetrievalQA.from_chain_type(
    llm=llm2, 
    chain_type="stuff", 
    retriever=index.vectorstore.as_retriever(), 
    verbose=True,
    chain_type_kwargs = {
        "document_separator": "<<<<>>>>>"
    }
)

### Coming up with test points

In [32]:
data = loader.load()  # reload the full document list; `docs` was rebound to the search results earlier
data[10]

Document(page_content=": 10\nname: Cozy Comfort Pullover Set, Stripe\ndescription: Perfect for lounging, this striped knit set lives up to its name. We used ultrasoft fabric and an easy design that's as comfortable at bedtime as it is when we have to make a quick run out.\n\nSize & Fit\n- Pants are Favorite Fit: Sits lower on the waist.\n- Relaxed Fit: Our most generous fit sits farthest from the body.\n\nFabric & Care\n- In the softest blend of 63% polyester, 35% rayon and 2% spandex.\n\nAdditional Features\n- Relaxed fit top with raglan sleeves and rounded hem.\n- Pull-on pants have a wide elastic waistband and drawstring, side pockets and a modern slim leg.\n\nImported.", metadata={'source': 'OutdoorClothingCatalog_1000.csv', 'row': 10})

In [33]:
data[1]

Document(page_content=': 1\nname: Recycled Waterhog Dog Mat, Chevron Weave\ndescription: Protect your floors from spills and splashing with our ultradurable recycled Waterhog dog mat made right here in the USA. \n\nSpecs\nSmall - Dimensions: 18" x 28". \nMedium - Dimensions: 22.5" x 34.5".\n\nWhy We Love It\nMother nature, wet shoes and muddy paws have met their match with our Recycled Waterhog mats. Ruggedly constructed from recycled plastic materials, these ultratough mats help keep dirt and water off your floors and plastic out of landfills, trails and oceans. Now, that\'s a win-win for everyone.\n\nFabric & Care\nVacuum or hose clean.\n\nConstruction\n24 oz. polyester fabric made from 94% recycled materials.\nRubber backing.\n\nAdditional Features\nFeatures an -exclusive design.\nFeatures thick and thin fibers for scraping dirt and absorbing water.\nDries quickly and resists fading, rotting, mildew and shedding.\nUse indoors or out.\nMade in the USA.\n\nHave questions? Reach out to

### Test examples - hard coded

In [34]:
examples = [
    {
        "query": "Do the Cozy Comfort Pullover Set\
        have side pockets?",
        "answer": "Yes"
    },
    {
        "query": "What collection is the Ultra-Lofty \
        850 Stretch Down Hooded Jacket from?",
        "answer": "The DownTek collection"
    }
]

### LLM-generated test examples

In [35]:
from langchain.evaluation.qa import QAGenerateChain
example_gen_chain = QAGenerateChain.from_llm(llm2)

In [36]:
new_examples = example_gen_chain.apply_and_parse(
    [{'doc' : t} for t in data[:5]]
)



In [37]:
new_examples

[{'qa_pairs': {'query': "What is the description of the Women's Campside Oxfords?",
   'answer': 'This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on.'}},
 {'qa_pairs': {'query': 'What are the dimensions of the small Recycled Waterhog Dog Mat, Chevron Weave?',
   'answer': 'The small Recycled Waterhog Dog Mat, Chevron Weave has dimensions of 18" x 28".'}},
 {'qa_pairs': {'query': 'What is the name of the product described in the document?',
   'answer': "Infant and Toddler Girls' Coastal Chill Swimsuit, Two-Piece"}},
 {'qa_pairs': {'query': 'What is the size and fit of the Refresh Swimwear, V-Neck Tankini?',
   'answer': 'The Refresh Swimwear, V-Neck Tankini is fitted and sits close to the body.'}},
 {'qa_pairs': {'query': 'What technology does the EcoFlex 3L Storm Pants use that enhances breathability?',
   'answer': 'The EcoFlex 3L Storm Pants use TEK O2 technology tha

### Combine examples and test them

In [57]:
examples.extend([ x['qa_pairs'] for x in new_examples ])

# Testing an example
qa.invoke(examples[0]["query"])["result"]




> Entering new RetrievalQA chain...

> Finished chain.


' Yes, the Cozy Comfort Pullover Set has side pockets.'

### LangChain manual evaluation

In [39]:
import langchain
langchain.debug = True  # print every intermediate chain and LLM call

In [40]:
qa.invoke(examples[0]["query"])["result"]



> Entering new RetrievalQA chain...

> Finished chain.


' Yes, the Cozy Comfort Pullover Set has side pockets.'

In [41]:
langchain.debug = False  # turn tracing back off

In [63]:
examples

[{'query': 'Do the Cozy Comfort Pullover Set        have side pockets?',
  'answer': 'Yes'},
 {'query': 'What collection is the Ultra-Lofty         850 Stretch Down Hooded Jacket from?',
  'answer': 'The DownTek collection'},
 {'query': "What is the description of the Women's Campside Oxfords?",
  'answer': 'This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on.'},
 {'query': 'What are the dimensions of the small Recycled Waterhog Dog Mat, Chevron Weave?',
  'answer': 'The small Recycled Waterhog Dog Mat, Chevron Weave has dimensions of 18" x 28".'},
 {'query': 'What is the name of the product described in the document?',
  'answer': "Infant and Toddler Girls' Coastal Chill Swimsuit, Two-Piece"},
 {'query': 'What is the size and fit of the Refresh Swimwear, V-Neck Tankini?',
  'answer': 'The Refresh Swimwear, V-Neck Tankini is fitted and sits close to the body.'},
 {'query

## LLM Assisted Evaluation 

> Evaluating the LLM's answers to the examples it generated itself from the corpus

In [84]:
""" Running the Q&A Retriever Chain model to predict the answer to the examples generated by the same LLM """

predictions = qa.apply(examples)



> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


> Entering new RetrievalQA chain...

> Finished chain.


In [85]:
predictions

[{'query': 'Do the Cozy Comfort Pullover Set        have side pockets?',
  'answer': 'Yes',
  'result': ' Yes, the Cozy Comfort Pullover Set has side pockets.'},
 {'query': 'What collection is the Ultra-Lofty         850 Stretch Down Hooded Jacket from?',
  'answer': 'The DownTek collection',
  'result': ' The Ultra-Lofty 850 Stretch Down Hooded Jacket is from the DownTek collection.'},
 {'query': "What is the description of the Women's Campside Oxfords?",
  'answer': 'This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on.',
  'result': ' This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on.'},
 {'query': 'What are the dimensions of the small Recycled Waterhog Dog Mat, Chevron Weave?',
  'answer': 'The small Recycled Waterhog Dog Mat, Chevron Weave has dimens

In [97]:
# Creating an evaluation chain that uses an LLM to grade the answers to the LLM-generated example set
from langchain.evaluation.qa import QAEvalChain

# Using the same model but limiting its response to a few tokens, since we only need it to respond with "CORRECT" or "INCORRECT".
# This prevents verbose outputs from Mistral 7B Instruct v0.3.
from langchain_huggingface import HuggingFaceEndpoint
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"
llm1 = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.1, max_new_tokens=10, stop_sequences= ["\n\n","\n"])

eval_chain = QAEvalChain.from_llm(llm1)

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /Users/aniketkundu12072004/.cache/huggingface/token
Login successful


In [98]:
graded_outputs = eval_chain.evaluate(examples=examples, predictions= predictions)

In [99]:
graded_outputs

[{'results': ' CORRECT'},
 {'results': ' CORRECT'},
 {'results': ' CORRECT'},
 {'results': ' CORRECT'},
 {'results': ' INCORRECT'},
 {'results': ' CORRECT'},
 {'results': ' CORRECT'}]
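
The grades can be reduced to a single accuracy figure. The list below is copied from the output above; `is_correct` is a small hypothetical helper. Note that the comparison must be an exact match after stripping whitespace, because the string "INCORRECT" contains "CORRECT" and a substring test would count every answer as right.

```python
# Grades copied from the graded_outputs shown above.
grades = [
    {"results": " CORRECT"},
    {"results": " CORRECT"},
    {"results": " CORRECT"},
    {"results": " CORRECT"},
    {"results": " INCORRECT"},
    {"results": " CORRECT"},
    {"results": " CORRECT"},
]


def is_correct(grade: str) -> bool:
    """Exact match after trimming; a substring check would also match 'INCORRECT'."""
    return grade.strip().upper() == "CORRECT"


num_correct = sum(is_correct(g["results"]) for g in grades)
failed = [i for i, g in enumerate(grades) if not is_correct(g["results"])]
accuracy = num_correct / len(grades)

print(f"{num_correct}/{len(grades)} correct ({accuracy:.1%}); failed examples: {failed}")
```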

In [100]:
for i, eg in enumerate(examples):
    print(f"Example {i}:")
    print("Question: " + predictions[i]['query'])
    print("Real Answer: " + predictions[i]['answer'])
    print("Predicted Answer: " + predictions[i]['result'])
    print("Predicted Grade: " + graded_outputs[i]['results'] + '\n')

Example 0:
Question: Do the Cozy Comfort Pullover Set        have side pockets?
Real Answer: Yes
Predicted Answer:  Yes, the Cozy Comfort Pullover Set has side pockets.
Predicted Grade:  CORRECT

Example 1:
Question: What collection is the Ultra-Lofty         850 Stretch Down Hooded Jacket from?
Real Answer: The DownTek collection
Predicted Answer:  The Ultra-Lofty 850 Stretch Down Hooded Jacket is from the DownTek collection.
Predicted Grade:  CORRECT

Example 2:
Question: What is the description of the Women's Campside Oxfords?
Real Answer: This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on.
Predicted Answer:  This ultracomfortable lace-to-toe Oxford boasts a super-soft canvas, thick cushioning, and quality construction for a broken-in feel from the first time you put them on.
Predicted Grade:  CORRECT

Example 3:
Question: What are the dimensions of the small Recycle

In [101]:
import gc
gc.collect()

6233

# Langchain Agents

## Built-in Langchain Tools

In [2]:
from langchain_experimental.agents.agent_toolkits import create_python_agent
from langchain.agents import load_tools, initialize_agent
from langchain.agents import AgentType
from langchain_experimental.tools.python.tool import PythonREPLTool
from langchain_experimental.utilities import PythonREPL

In [3]:
from langchain_huggingface import HuggingFaceEndpoint
repo_id = "mistralai/Mistral-7B-Instruct-v0.3"

llm = HuggingFaceEndpoint(repo_id=repo_id, temperature=0.1, model_kwargs={'max_length':128})

The token has not been saved to the git credentials helper. Pass `add_to_git_credential=True` in this function directly or `--add-to-git-credential` if using via `huggingface-cli` if you want to set the git credential as well.
Token is valid (permission: fineGrained).
Your token has been saved to /Users/aniketkundu12072004/.cache/huggingface/token
Login successful


In [4]:
tools =  load_tools(["llm-math", "wikipedia"], llm=llm)

In [5]:
agent = initialize_agent(
    tools = tools,
    llm= llm,
    agent= AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors = True,
    verbose = True
)



In [16]:
agent("what is the 25% of 300?")





> Entering new AgentExecutor chain...
Thought: I need to calculate 25% of 300.
Action:
```
{
  "action": "Calculator",
  "action_input": "300 * 0.25"
}
```

Observation: Answer: 75.0
Thought: I now know the final answer
Final Answer: 75.0 is 25% of 300.

> Finished chain.


{'input': 'what is the 25% of 300?', 'output': '75.0 is 25% of 300.'}

In [37]:
agent("What is (2x + 2)/(x**2 + 2x)  when x = 2?")



> Entering new AgentExecutor chain...
Thought: I need to solve the equation (2x + 2)/(x**2 + 2x) when x = 2. I can use the calculator tool for this.
Action:
```
{
  "action": "Calculator",
  "action_input": "(2*2 + 2)/(2**2 + 2*2)"
}
```

Observation: Answer: 0.75
Thought: I now know the final answer
Final Answer: 0.75

> Finished chain.


{'input': 'What is (2x + 2)/(x**2 + 2x)  when x = 2?', 'output': '0.75'}

In [48]:
agent("Mathematical Problem : Derivative of log(x), x = 2")



> Entering new AgentExecutor chain...
Thought: I need to use the calculator to find the derivative of log(x) when x = 2
Action:
```
{
  "action": "Calculator",
  "action_input": "d/dx log(x) when x = 2"
}
```

Observation: Answer: 0.5
Thought: I now know the final answer
Final Answer: The derivative of log(x) when x = 2 is 0.5

> Finished chain.


{'input': 'Mathematical Problem : Derivative of log(x), x = 2',
 'output': 'The derivative of log(x) when x = 2 is 0.5'}

In [70]:
# NOTE: the agent returns a WRONG answer here: d/dx log(x**2 + 2x) = (2x + 2)/(x**2 + 2x), which is 0.75 at x = 2.
# It tends to fail when variables carry coefficients or when the expression becomes more complex.
agent("Calculate the Derivative of log( x**2 + 2x ), x =2 ")



> Entering new AgentExecutor chain...
Thought: I need to use the calculator to find the derivative of log( x**2 + 2x ) when x = 2
Action:
```
{
  "action": "Calculator",
  "action_input": "d/dx log( x^2 + 2x ) | x=2"
}
```

Observation: Answer: 1.6666666666666665
Thought: I now know the final answer
Final Answer: The derivative of log( x**2 + 2x ) when x = 2 is approximately 1.6666666666666665

> Finished chain.


{'input': 'Calculate the Derivative of log( x**2 + 2x ), x =2 ',
 'output': 'The derivative of log( x**2 + 2x ) when x = 2 is approximately 1.6666666666666665'}
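
A quick way to check the agent's answer without an LLM is a central-difference approximation, compared against the analytic derivative (2x + 2)/(x**2 + 2x). This is only a sketch: the helper name `numerical_derivative` and the step size `h` are illustrative choices, and `log` is taken to be the natural logarithm.

```python
import math


def numerical_derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def f(x):
    return math.log(x**2 + 2 * x)  # natural log assumed


x0 = 2.0
approx = numerical_derivative(f, x0)
exact = (2 * x0 + 2) / (x0**2 + 2 * x0)  # chain rule: u'/u with u = x**2 + 2x

print(f"numerical: {approx:.6f}")
print(f"analytic:  {exact:.6f}")
```

Both values agree with each other and differ from the 1.67 reported by the agent, which suggests the error comes from how the LLM translated the question into a calculator expression.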

In [92]:
agent("Calculate the Derivative of sin(x) , x = 90 degree)")



> Entering new AgentExecutor chain...
Thought: I need to calculate the derivative of sin(x) with respect to x, and x is 90 degrees. I can use the calculator tool for this.
Action:
```
{
  "action": "Calculator",
  "action_input": "d/dx sin(x) with x = 90"
}
```

Observation: Answer: 0.0
Thought: I now know the final answer
Final Answer: The derivative of sin(x) at x = 90 degrees is 0.0

> Finished chain.


{'input': 'Calculate the Derivative of sin(x) , x = 90 degree)',
 'output': 'The derivative of sin(x) at x = 90 degrees is 0.0'}
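
The answer 0.0 is only right if 90 is read as degrees. Python's `math` functions work in radians, so a direct check needs an explicit conversion. The short sketch below contrasts both readings; the variable names are illustrative.

```python
import math

x_deg = 90

# d/dx sin(x) = cos(x), where x must be in radians for math.cos
with_conversion = math.cos(math.radians(x_deg))  # 90 degrees -> pi/2 radians
without_conversion = math.cos(x_deg)             # treats 90 as radians

print(f"cos(90 degrees) = {with_conversion:.6f}")
print(f"cos(90 radians) = {without_conversion:.6f}")
```

The two values differ noticeably, so the unit in the prompt matters whenever a trigonometric expression is handed to the calculator tool.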

In [105]:
agent("Calculate the Derivative of cos(x) when x = pi/2 ")



> Entering new AgentExecutor chain...
Thought: I need to use the calculator to find the derivative of cos(x) when x = pi/2
Action:
```
{
  "action": "Calculator",
  "action_input": "d/dx cos(x) when x = pi/2"
}
```

Observation: Answer: -1.0
Thought: I now know the final answer
Final Answer: The derivative of cos(x) when x = pi/2 is -1.0

> Finished chain.


{'input': 'Calculate the Derivative of cos(x) when x = pi/2 ',
 'output': 'The derivative of cos(x) when x = pi/2 is -1.0'}

In [107]:
question = "Tom M. Mitchell is an American computer scientist and the Founders University Professor at Carnegie Mellon University (CMU). Please provide the name of book he wrote"
result = agent(question) 



> Entering new AgentExecutor chain...
Thought: I need to find information about Tom M. Mitchell and his books. I will use Wikipedia for this.
Action:
```
{
  "action": "wikipedia",
  "action_input": "Tom M. Mitchell"
}
```

Observation: Page: Tom M. Mitchell
Summary: Tom Michael Mitchell (born August 9, 1951) is an American computer scientist and the Founders University Professor at Carnegie Mellon University (CMU). He is a founder and former Chair of the Machine Learning Department at CMU. Mitchell is known for his contributions to the advancement of machine learning, artificial intelligence, and cognitive neuroscience and is the author of the textbook Machine Learning. He is a member of the United States National Academy of Engineering since 2010. He is also a Fellow of the American Academy of Arts and Sciences, the American Association for the Advancement of Science and a Fellow and past President of the Association for the Advancement of Artificial Intelligence. In October 2018, Mitchell was appointed as the Interim Dean of the School of Computer Science at Carnegie Mellon.



Page: Ensemble learning
Summary: In statistics and machine learning, ensemble methods use multiple learning algo

## Python Agent

In [108]:
agent = create_python_agent(
    llm,
    tool=PythonREPLTool(), # A REPL (read-eval-print loop) lets the agent run Python code interactively
    verbose=True
)

In [109]:
customer_list = [["Harrison", "Chase"], 
                 ["Lang", "Chain"],
                 ["Dolly", "Too"],
                 ["Elle", "Elem"], 
                 ["Geoff","Fusion"], 
                 ["Trance","Former"],
                 ["Jen","Ayai"]
                ]
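
Before handing this list to the agent, it helps to compute the expected result directly so the agent's answer can be checked against it. The sketch below does this with `sorted` and `operator.itemgetter`; the name `sorted_customers` is introduced here for illustration.

```python
from operator import itemgetter

customer_list = [["Harrison", "Chase"],
                 ["Lang", "Chain"],
                 ["Dolly", "Too"],
                 ["Elle", "Elem"],
                 ["Geoff", "Fusion"],
                 ["Trance", "Former"],
                 ["Jen", "Ayai"]]

# itemgetter(1, 0) builds the key (second name, first name);
# sorted() returns a new list and leaves customer_list unchanged
sorted_customers = sorted(customer_list, key=itemgetter(1, 0))
print(sorted_customers)
```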

In [112]:
agent.run(f"""Sort these customers by second name and then first name. Do print the output in the form of a list : {customer_list}""") 



> Entering new AgentExecutor chain...
I need to sort the list of customers by second name and then by first name. I can use the sort() function in python to do this.
Action: Python_REPL
Action Input: customers = [['Harrison', 'Chase'], ['Lang', 'Chain'], ['Dolly', 'Too'], ['Elle', 'Elem'], ['Geoff', 'Fusion'], ['Trance', 'Former'], ['Jen', 'Ayai']]
customers.sort(key=lambda x: (x[1], x[0]))
print(customers)
Observation: [['Jen', 'Ayai'], ['Lang', 'Chain'], ['Harrison', 'Chase'], ['Elle', 'Elem'], ['Trance', 'Former'], ['Geoff', 'Fusion'], ['Dolly', 'Too']]

Thought: I now know the final answer
Final Answer: [['Jen', 'Ayai'], ['Lang', 'Chain'], ['Harrison', 'Chase'], ['Elle', 'Elem'], ['Trance', 'Former'], ['Geoff', 'Fusion'], ['Dolly', 'Too']]

> Finished chain.


"[['Jen', 'Ayai'], ['Lang', 'Chain'], ['Harrison', 'Chase'], ['Elle', 'Elem'], ['Trance', 'Former'], ['Geoff', 'Fusion'], ['Dolly', 'Too']]"

In [114]:
import langchain
langchain.debug = True
try:
    agent.run(f"""Sort these customers by second name and then first name. Do print the output in the form of a list : {customer_list}""")
finally:
    # Always switch debug logging back off, even if the run raises
    langchain.debug = False

[32;1m[1;3m[chain/start][0m [1m[chain:AgentExecutor] Entering Chain run with input:
[0m{
  "input": "Sort these customers by second name and then first name. Do print the output in the form of a list : [['Harrison', 'Chase'], ['Lang', 'Chain'], ['Dolly', 'Too'], ['Elle', 'Elem'], ['Geoff', 'Fusion'], ['Trance', 'Former'], ['Jen', 'Ayai']]"
}
[32;1m[1;3m[chain/start][0m [1m[chain:AgentExecutor > chain:LLMChain] Entering Chain run with input:
[0m{
  "input": "Sort these customers by second name and then first name. Do print the output in the form of a list : [['Harrison', 'Chase'], ['Lang', 'Chain'], ['Dolly', 'Too'], ['Elle', 'Elem'], ['Geoff', 'Fusion'], ['Trance', 'Former'], ['Jen', 'Ayai']]",
  "agent_scratchpad": "",
  "stop": [
    "\nObservation:",
    "\n\tObservation:"
  ]
}
[32;1m[1;3m[llm/start][0m [1m[chain:AgentExecutor > chain:LLMChain > llm:HuggingFaceEndpoint] Entering LLM run with input:
[0m{
  "prompts": [
    "You are an agent designed to write and execu

## Custom Tool!

In [7]:
from langchain.agents import tool
from datetime import date

In [8]:
# NOTE: the agent learns when and how to use the tool from its docstring
@tool
def time(text:str)->str:
    """Returns today's date. Use this for any questions related to knowing today's date.
    The input should always be an empty string, and this function will always return today's date.
    Any date mathematics should occur outside this function."""

    return str(date.today())
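
The docstring asks for date mathematics to happen outside the tool. A minimal sketch of what that could look like is shown below, assuming the tool's output is an ISO date string such as `2024-07-03`. Both `today_str` (a plain stand-in for the tool, without the `@tool` decorator) and `days_from` are hypothetical helpers introduced for illustration.

```python
from datetime import date, timedelta


def today_str() -> str:
    """Stand-in for the tool above: returns today's date as an ISO string."""
    return str(date.today())


def days_from(date_str: str, days: int) -> str:
    """Hypothetical helper: shift an ISO date string by a number of days."""
    return str(date.fromisoformat(date_str) + timedelta(days=days))


# Arithmetic happens on the string returned by the tool, not inside it
print(days_from(today_str(), 7))
print(days_from("2024-07-03", 30))
```

Keeping the tool itself trivial means its docstring stays short and unambiguous, which is what the agent relies on when deciding whether to call it.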

In [9]:
agent = initialize_agent(
    tools = tools + [time],
    llm= llm,
    agent= AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION,
    handle_parsing_errors = True,
    verbose = True
)

In [11]:
try:
    result = agent("whats the date today?")
except Exception as e:  # a bare except would also swallow KeyboardInterrupt
    print(f"exception on external access: {e}")





> Entering new AgentExecutor chain...
Thought: I need to use the time tool to get the current date
Action:
```
{
  "action": "time",
  "action_input": ""
}
```

Observation: 2024-07-03
Thought: I now know the final answer
Final Answer: Today's date is 2024-07-03

> Finished chain.
