In [1]:
from keys import Openaikey,Deeplakekey

## LLMs
The fundamental component of LangChain involves invoking an LLM with a specific input. To illustrate this, we'll explore a simple example. Let's imagine we are building a service that suggests personalized workout routines based on an individual's fitness goals and preferences.<br>
To accomplish this, we will first need to import the LLM wrapper.

In [2]:
# !pip install langchain==0.0.208 deeplake openai tiktoken

In [3]:
from langchain.llms import OpenAI

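- The `temperature` parameter controls the randomness of the output
- `temperature = 0` -> near-deterministic output that favors the most probable tokens; suited to factual tasks
- `temperature = 1` -> more varied and creative but less consistent output; not advised when reliable answers are needed
- For creative tasks such as this one, a value of 0.7 - 0.9 is preferred
- We use the GPT-3 variant Davinci (`text-davinci-003`)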
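
To make the effect of temperature concrete, here is a minimal sketch of temperature-scaled softmax over a few candidate tokens. The scores are hypothetical and the function is an illustration of the general idea, not the code OpenAI runs internally.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by a sampling temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; treat 0 as greedy argmax")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.1)   # mass concentrates on the top token
high = softmax_with_temperature(logits, 1.0)  # mass is spread across candidates
print([round(p, 3) for p in low])
print([round(p, 3) for p in high])
```

A low temperature makes the top candidate dominate, which is why outputs become repeatable; a higher temperature leaves real probability on the alternatives.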

In [4]:
llm = OpenAI(model = 'text-davinci-003',temperature = 0.9)

In [5]:
prompt = 'What is the capital of India?'
print(llm(prompt))



The capital of India is New Delhi.


In [6]:
text = 'Suggest a personalized workout routine for someone looking to improve cardiovascular endurance and prefers outdoor activities'
print(llm(text))

Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for default-text-davinci-003 in organization org-UzOocMsB0zH8LN0hOTaQ84JX on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..



-Monday: Begin with a 10-minute warm-up jog. After, run an interval route of sprinting for 1 minute and jogging for 2 minutes, for a total of 20 minutes. End with a 5 minute cool-down jog.

-Tuesday: Begin with a 10-minute warm-up jog. After, run hills for a total of 30 minutes, alternating between sprinting up the hill and jogging down. End with a 5 minute cool-down jog.

-Wednesday: Take a day off.

-Thursday: Begin with a 10-minute warm-up jog. After, do 3 sets of 10 burpees with a 30-second sprint in between each set. End with a 5-minute cool-down jog.

-Friday: Begin with a 10-minute warm-up jog. After, go on a long distance run for a total of 45 minutes. End with a 5-minute cool-down jog.

-Saturday: Take a day off. 

-Sunday: Begin with a 10-minute warm-up jog. After, go on a hike for a total of 1 hour. Make sure to find hills and up the intensity by running the hills. End with


## Chain
- A chain is a wrapper around multiple individual components that are combined to perform a common task
- The most common chain is `LLMChain`

How `LLMChain` works
- Accepts one or more input variables
- Uses `PromptTemplate` to format the input variables into a prompt
- Passes the formatted prompt to the LLM
- Uses an output parser (if one is provided) to parse the LLM output into the final result
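
The steps above can be sketched in plain Python. This is a conceptual stand-in, not LangChain's implementation: `fake_llm` and `run_chain` are hypothetical names, and the canned completion simply mirrors the kind of raw string a completion model returns.

```python
def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; returns a canned completion."""
    return "\n\nEcoQuench Bottle Co."

def run_chain(template: str, llm, parser, **variables) -> str:
    prompt = template.format(**variables)  # 1. fill in the input variables
    raw = llm(prompt)                      # 2. send the formatted prompt to the model
    return parser(raw)                     # 3. parse the raw completion

name = run_chain(
    "What is a good name for a company that makes {product}?",
    fake_llm,
    str.strip,  # the simplest possible output parser: trim whitespace
    product="eco-friendly water bottles",
)
print(name)
```

Keeping the three stages separate is what lets a chain swap the model, the template, or the parser independently.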

In the next example, we create a chain that generates a possible name for a company that produces eco-friendly water bottles. Using LangChain's `LLMChain`, `PromptTemplate`, and `OpenAI` classes, we can define our prompt, set the input variables, and generate creative outputs.

In [7]:
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

In [8]:
prompt = PromptTemplate(input_variables=['Products'],
                        template='What is a good name for a company that makes {Products}')
chain = LLMChain(llm=llm, prompt=prompt)
# run() fills the single input variable and returns the completion as a string
chain.run('Eco-friendly water bottles')

Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for default-text-davinci-003 in organization org-UzOocMsB0zH8LN0hOTaQ84JX on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..

'\n\nEcoQuench Bottle Co.'

In [9]:
# print(''.join(chain.run('Eco-friendly water bottle').strip().split()))

## Memory
- Stores the interactions between the user and the bot
- Helps maintain context and coherence throughout the conversation
- With memory of previous interactions, the model's later outputs are more relevant and accurate
- `ConversationBufferMemory` acts as a wrapper around `ChatMessageHistory`
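
As a rough illustration of the idea, the sketch below keeps every turn in a list and replays it as context for the next prompt. `BufferMemory` and `build_prompt` are hypothetical names written for this example; they are not LangChain classes, and the AI reply is invented.

```python
class BufferMemory:
    """Toy conversation buffer: stores every turn and replays it as context."""

    def __init__(self):
        self.messages = []  # list of (speaker, text) tuples

    def add(self, speaker, text):
        self.messages.append((speaker, text))

    def as_context(self):
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.messages)

def build_prompt(memory, user_input):
    """Prepend the stored history so the model sees earlier turns."""
    history = memory.as_context()
    prefix = history + "\n" if history else ""
    return f"{prefix}Human: {user_input}\nAI:"

memory = BufferMemory()
memory.add("Human", "Tell me about yourself")
memory.add("AI", "I am an assistant.")
prompt = build_prompt(memory, "What can you do?")
print(prompt)
```

Because the whole history is replayed on every call, the prompt grows with the conversation, which is the main trade-off of a plain buffer.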

In [10]:
# from langchain.memory import ConversationBufferMemory
# from langchain.chains import ConversationChain

# conversation_chain = ConversationChain(llm = llm, verbose = True,memory = ConversationBufferMemory())

# conversation_chain.predict(input = 'Tell me about yourself')
# conversation_chain.predict(input = 'What can you do?')
# conversation_chain.predict(input = 'Can you help me with my work?')


## Deep Lake vectorstore
- Deep Lake provides storage for embeddings and their corresponding metadata in the context of LLM apps. It enables hybrid searches on these embeddings and their attributes for efficient data retrieval.
- It’s multimodal, which means that it can be used to store items of diverse modalities, such as texts, images, audio, and video, along with their vector representations
- It’s serverless, which means that we can create and manage cloud datasets without creating and managing a database instance. This aspect gives a great speedup to new projects.
- Last, it’s possible to easily create a data loader out of the data loaded into a Deep Lake dataset. It is convenient for fine-tuning machine learning models using common frameworks like PyTorch and TensorFlow.
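
To show what a hybrid search over embeddings and attributes means, here is a toy in-memory version. It is a sketch under stated assumptions, not Deep Lake's API: the 3-dimensional vectors and the `topic` metadata are made up for illustration (the real `text-embedding-ada-002` vectors in this notebook have 1536 dimensions), and `search` is a hypothetical helper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Hypothetical records: text, a tiny made-up embedding, and metadata
records = [
    {"text": "Napoleon Bonaparte was born in 15 August 1769",
     "embedding": [0.9, 0.1, 0.0], "metadata": {"topic": "history"}},
    {"text": "Louis XIV was born in 5 September 1638",
     "embedding": [0.1, 0.9, 0.0], "metadata": {"topic": "history"}},
    {"text": "Lady Gaga was born in 28 March 1986",
     "embedding": [0.0, 0.2, 0.9], "metadata": {"topic": "music"}},
]

def search(query_embedding, k=1, where=None):
    """Filter by metadata first, then rank the survivors by similarity."""
    candidates = [
        r for r in records
        if where is None
        or all(r["metadata"].get(key) == value for key, value in where.items())
    ]
    ranked = sorted(candidates,
                    key=lambda r: cosine(query_embedding, r["embedding"]),
                    reverse=True)
    return [r["text"] for r in ranked[:k]]

print(search([0.8, 0.2, 0.0]))
print(search([0.8, 0.2, 0.0], where={"topic": "music"}))
```

The second call returns a less similar document because the metadata filter removes the closer matches before ranking.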

In [11]:
import os
os.environ["ACTIVELOOP_TOKEN"] = Deeplakekey

In [12]:
# !pip install deeplake

In [13]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain import vectorstores
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA

In [14]:
# temperature=0 keeps the retrieval answers as deterministic as possible
llm = OpenAI(model='text-davinci-003', temperature=0)
embeddings = OpenAIEmbeddings(model='text-embedding-ada-002')

# create our documents
texts = [
    "Napoleon Bonaparte was born in 15 August 1769",
    "Louis XIV was born in 5 September 1638"
]
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.create_documents(texts)


In [15]:
my_activeloop_org_id = "rayroyale33" 
my_activeloop_dataset_name = "langchain_course_from_zero_to_hero"
dataset_path = f"hub://{my_activeloop_org_id}/{my_activeloop_dataset_name}"
db = vectorstores.DeepLake(dataset_path=dataset_path, embedding_function=embeddings)

# add documents to our Deep Lake dataset
db.add_documents(docs)

Using embedding function is deprecated and will be removed in the future. Please use embedding instead.


Deep Lake Dataset in hub://rayroyale33/langchain_course_from_zero_to_hero already exists, loading from the storage



Dataset(path='hub://rayroyale33/langchain_course_from_zero_to_hero', tensors=['embedding', 'id', 'metadata', 'text'])

  tensor      htype      shape     dtype  compression
  -------    -------    -------   -------  ------- 
 embedding  embedding  (6, 1536)  float32   None   
    id        text      (6, 1)      str     None   
 metadata     json      (6, 1)      str     None   
   text       text      (6, 1)      str     None   


 

['e2129d0b-3b4e-11ee-b60b-a497b1b4542e',
 'e212c405-3b4e-11ee-98a7-a497b1b4542e']

You've just created your first Deep Lake dataset!

Now, let's create a RetrievalQA chain:

In [16]:
retrieval_qa = RetrievalQA.from_chain_type(llm = llm, 
                                          chain_type = 'stuff',
                                          retriever = db.as_retriever())

Next, let's create an agent that uses the RetrievalQA chain as a tool:

In [17]:
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType

tools = [
    Tool(
        name="Retrieval QA System",
        func=retrieval_qa.run,
        description="Useful for answering questions."
    ),
]

agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True
)

Finally, we can use the agent to ask a question:

In [18]:
response = agent.run('When was Napoleone born?')
print(response)



> Entering new AgentExecutor chain...
I need to find out when Napoleone was born
Action: Retrieval QA System
Action Input: When was Napoleone born?
 timetable
Observation: Napoleon Bonaparte was born on 15 August 1769.
Thought: I now know the final answer
Final Answer: Napoleon Bonaparte was born on 15 August 1769.

> Finished chain.
Napoleon Bonaparte was born on 15 August 1769.


Here, the agent used the “Retrieval QA System” tool with the query “When was Napoleone born?” which is then run on our new Deep Lake dataset, returning the most similar document (i.e., the document containing the date of birth of Napoleon). This document is eventually used to generate the final output.
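
The tool call described above can be pictured as parsing the model's text and routing the input to a named function. The sketch below is a simplified illustration, not LangChain's actual output parser: `dispatch` and `retrieval_tool` are hypothetical, and the tool returns a canned answer instead of querying Deep Lake.

```python
import re

def retrieval_tool(question: str) -> str:
    """Stand-in for the RetrievalQA chain; returns a canned answer."""
    return "Napoleon Bonaparte was born on 15 August 1769."

# Map tool names to callables, as the agent's tool list does conceptually
tools = {"Retrieval QA System": retrieval_tool}

# Text in the same shape as the agent trace printed above
llm_output = (
    "I need to find out when Napoleone was born\n"
    "Action: Retrieval QA System\n"
    "Action Input: When was Napoleone born?"
)

def dispatch(text: str, tools: dict) -> str:
    """Extract the tool name and its input, then call the matching tool."""
    action = re.search(r"^Action:\s*(.+)$", text, re.MULTILINE)
    action_input = re.search(r"^Action Input:\s*(.+)$", text, re.MULTILINE)
    if not action or not action_input:
        raise ValueError("no tool call found in model output")
    tool = tools[action.group(1).strip()]
    return tool(action_input.group(1).strip())

observation = dispatch(llm_output, tools)
print(f"Observation: {observation}")
```

The observation is then appended to the prompt so the model can decide whether it knows the final answer or needs another tool call.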

What happened
- We created a vector database
- We created an agent that uses the RetrievalQA chain as a tool to answer questions based on our documents

Next: add more data

In [19]:
# load the existing Deep Lake dataset and specify the embedding function
db = vectorstores.DeepLake(dataset_path=dataset_path, embedding_function=embeddings)

# create new documents
texts = [
    "Lady Gaga was born in 28 March 1986",
    "Michael Jeffrey Jordan was born in 17 February 1963"
]
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.create_documents(texts)

# add documents to our Deep Lake dataset
db.add_documents(docs)

Using embedding function is deprecated and will be removed in the future. Please use embedding instead.


Deep Lake Dataset in hub://rayroyale33/langchain_course_from_zero_to_hero already exists, loading from the storage


 

Dataset(path='hub://rayroyale33/langchain_course_from_zero_to_hero', tensors=['embedding', 'id', 'metadata', 'text'])

  tensor      htype      shape     dtype  compression
  -------    -------    -------   -------  ------- 
 embedding  embedding  (8, 1536)  float32   None   
    id        text      (8, 1)      str     None   
 metadata     json      (8, 1)      str     None   
   text       text      (8, 1)      str     None   


['05e4c6f3-3b4f-11ee-b8f9-a497b1b4542e',
 '05e4c6f4-3b4f-11ee-9aea-a497b1b4542e']

In [20]:
# now that we have new information, let's ask a new question
response = agent.run("When was Michael Jordan born?")
print(response)



> Entering new AgentExecutor chain...
I need to find the date of Michael Jordan's birth
Action: Retrieval QA System
Action Input: When was Michael Jordan born?
Observation: Michael Jordan was born on 17 February 1963.
Thought:

Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised RateLimitError: Rate limit reached for default-text-davinci-003 in organization org-UzOocMsB0zH8LN0hOTaQ84JX on requests per min. Limit: 3 / min. Please try again in 20s. Contact us through our help center at help.openai.com if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://platform.openai.com/account/billing to add a payment method..


[32;1m[1;3m I now know the final answer
Final Answer: Michael Jordan was born on 17 February 1963.[0m

[1m> Finished chain.[0m
Michael Jordan was born on 17 February 1963.
