### The Power and Limitations of Large Language Models

Large Language Models (LLMs) are trained to model the distribution of words in a language, which lets them **generate meaningful text** without memorizing specific data. However, their knowledge is limited to their training data. When asked about information outside it, such as events after the training period, they may fabricate plausible-sounding answers, a behavior known as **"hallucination."**

To address this issue, **retrievers** can be used alongside LLMs. A retriever fetches information from trusted sources, and the LLM is prompted to rearrange the retrieved information without adding new details. LLMs like GPT-4 and Claude support large context windows, but because the cost of a call grows with the amount of text it processes, an efficient retriever is still needed to select only the most relevant documents.
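As a minimal sketch of the prompting side of this idea, the retrieved passages can be placed in the prompt together with an instruction to stay within them. The function name `build_grounded_prompt` and the instruction wording are illustrative assumptions, not part of LangChain or any other library.

```python
def build_grounded_prompt(question, passages):
    """Build a prompt that asks the model to answer only from the given passages.

    Hypothetical helper for illustration; the wording of the instruction
    is an assumption and would normally be tuned for the model in use.
    """
    # Number the passages so the model (and the reader) can refer to them.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer the question using only the passages below. "
        "If the passages do not contain the answer, say so.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


passages = [
    "Deep Lake stores embeddings and their metadata.",
    "Retrievers fetch information from trusted sources.",
]
prompt = build_grounded_prompt("What does Deep Lake store?", passages)
print(prompt)
```

The resulting string could then be passed to any LLM. How closely the model follows the instruction depends on the model and is not guaranteed.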

Efficient retrievers are built using embedding models that map texts to vectors. These vectors are then stored in specialized databases called vector stores. This is where **Deep Lake** comes in: it provides a seamless way to store embeddings and their corresponding metadata. Deep Lake also enables hybrid searches on these embeddings and their attributes for efficient data retrieval. 
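To make the embedding-and-search idea concrete, here is a toy sketch in plain Python. It assumes a word-count "embedding" and an in-memory list standing in for the vector store. Real systems use learned embedding models and a dedicated store such as Deep Lake, and none of the names below come from the Deep Lake API.

```python
import math
from collections import Counter


def embed(text, vocabulary):
    """Toy embedding: count how often each vocabulary word occurs in the text."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


documents = [
    "deep lake stores embeddings and metadata",
    "jogging improves cardiovascular endurance",
    "temperature controls sampling randomness",
]

# The vocabulary is built from the documents themselves (an assumption of this toy setup).
vocabulary = sorted({word for doc in documents for word in doc.split()})

# In-memory stand-in for a vector store: (text, vector) pairs.
vector_store = [(doc, embed(doc, vocabulary)) for doc in documents]


def retrieve(query, k=1):
    """Return the k stored documents most similar to the query."""
    query_vector = embed(query, vocabulary)
    ranked = sorted(
        vector_store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [doc for doc, _ in ranked[:k]]


top_documents = retrieve("where are embeddings stored")
print(top_documents)
```

Here the query shares only the word "embeddings" with the first document, so that document ranks highest. A learned embedding model would also capture similarity between words such as "stored" and "stores", which this word-count version cannot.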

In [1]:
from keys import ACTIVELOOP_TOKEN, OPENAI_API_KEY

### The LLMs

In [2]:
# Import the LLM wrapper
from langchain.llms import OpenAI

The temperature parameter in OpenAI models controls the randomness of the output:
- 0: output is nearly deterministic and stable across runs;
- 1: output is more varied and can be inconsistent, which is generally not advised for most tasks;
- between 0.70 and 0.90: balances reliability and creativity, which suits creative tasks.
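The effect of temperature can be illustrated with the standard temperature-scaled softmax, in which token scores (logits) are divided by the temperature before being turned into probabilities. This is a sketch of the general technique. The logit values are made up, and how OpenAI applies temperature internally (including how it treats a value of exactly 0) is not shown here.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        # Division by zero is undefined; a temperature of 0 is usually
        # treated as "always pick the highest-scoring token" instead.
        raise ValueError("temperature must be positive in this sketch")
    scaled = [logit / temperature for logit in logits]
    # Subtract the maximum for numerical stability before exponentiating.
    highest = max(scaled)
    exps = [math.exp(value - highest) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.5]

low = softmax_with_temperature(logits, 0.1)
high = softmax_with_temperature(logits, 0.9)

print("temperature 0.1:", [round(p, 3) for p in low])
print("temperature 0.9:", [round(p, 3) for p in high])
```

At the low temperature nearly all probability mass sits on the highest-scoring token, so sampling gives almost the same result every time. At the higher temperature the alternatives keep a meaningful share, which is where the added variety comes from.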

In [4]:
# Initialize the GPT-3 model’s Davinci variant
llm = OpenAI(model="text-davinci-003", temperature=0.9, openai_api_key=OPENAI_API_KEY)

In [5]:
# Example of calling the initialized LLM
text = "Suggest a personalized workout routine for someone looking to improve cardiovascular endurance and prefers outdoor activities."

print(llm(text))

 

Day 1: 
• Begin with 10 minutes of light stretching 
• 3 sets of walking lunges with 20 reps per set 
• 2 sets of alternating leg lunges with 15 reps per set 
• 2 sets of high-knees with 30 reps per set 
• 30-minute jog at a moderate pace 
• 2 sets of burpees with 10 reps per set 

Day 2: 
• Begin with 10 minutes of light stretching 
• 3 sets of stair sprints with 10 reps per set 
• 2 sets of jump squats with 15 reps per set 
• 3 sets of mountain climbers with 20 reps per set
• 30-minute jog at a moderate pace 
• 2 sets of hill sprints with 5 reps per set 

Day 3: 
• Begin with 10 minutes of light stretching 
• 3 sets of joggers with 20 reps per set 
• 2 sets of side skaters with 15 reps per set 
• 2 sets of butt kicks with 30 reps per set 
• 30-minute jog at a moderate pace 
• 2 sets of sprints with 10 reps per set


### The Chains

A chain is a combination of multiple individual components in a specific sequence. The most commonly used type of chain is the **LLMChain**, which consists of a PromptTemplate, a model (either an LLM or a ChatModel), and an optional output parser.

The LLMChain works as follows:

1. Takes input variable(s);
2. Formats the input variables into a prompt by using the PromptTemplate;
3. Passes the formatted prompt to the model (LLM or ChatModel);
4. Parses the output of the LLM into a final format, if an OutputParser is provided.
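The four steps above can be sketched in plain Python without any library. This is not LangChain's implementation: `run_chain`, `fake_model`, and `strip_parser` are hypothetical names, and the model is a stub that returns a fixed string so the example runs without an API key.

```python
def format_prompt(template, **variables):
    """Step 2: fill the template's placeholders with the input variables."""
    return template.format(**variables)


def fake_model(prompt):
    """Step 3 stand-in: a stub that ignores the prompt and returns fixed text.

    A real chain would send the prompt to an LLM or ChatModel here.
    """
    return "\n\n  Example Name Inc.  \n"


def strip_parser(raw_output):
    """Step 4: a trivial output parser that trims surrounding whitespace."""
    return raw_output.strip()


def run_chain(template, model, output_parser=None, **variables):
    """Steps 1 to 4: take inputs, format the prompt, call the model, parse the output."""
    prompt = format_prompt(template, **variables)
    raw_output = model(prompt)
    if output_parser is None:
        return raw_output
    return output_parser(raw_output)


template = "What is a good name for a company that makes {product}?"
result = run_chain(template, fake_model, strip_parser, product="eco-friendly water bottles")
print(result)
```

Keeping the parser optional mirrors the description above: without it, the raw model text is returned unchanged.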

In [9]:
# Creating a chain that generates a possible name for a company that produces a given product

from langchain.prompts import PromptTemplate
from langchain.llms import OpenAI
from langchain.chains import LLMChain

llm = OpenAI(model="text-davinci-003", temperature=0.9, openai_api_key=OPENAI_API_KEY)

prompt = PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
)

chain = LLMChain(llm=llm, prompt=prompt)

In [10]:
# Run the chain, specifying only the input variable.
print(chain.run("eco-friendly water bottles"))



GreenDrop Water Bottles.


In [11]:
print(chain.run("personalized tea cups"))



Steeped with Love Tea Co.
