In [1]:
%pip install openai
%pip install langchain
%pip install langchain_community
%pip install langchain_openai
%pip install langchainhub
%pip install python-dotenv

In [2]:
import os
from dotenv import load_dotenv

# loading from a .env file
# load_dotenv(dotenv_path="/full/path/to/your/.env")

# or 
# if you're on google colab just uncomment below and replace with your openai api key
# os.environ["OPENAI_API_KEY"] = "<your-openai-api-key>"

# Langchain for LLM App Development 

We talked about how building an LLM app involves some prompt management: 
we can prepare the user's input with some pre-prompting, or do some 
post-prompting and cleanup after the LLM returns its output, to ensure 
that our app behaves as expected.

This kind of workflow usually involves a lot of abstractions, because prompts 
are no longer static pieces of text: they are dynamic and have to integrate 
information at run time.

![](./images/Notebook_4-dynamic_prompt.png)

Because prompts are dynamic, we need abstractions to handle and manage them effectively.

Another need in more complex LLM app development is chaining prompts together, meaning connecting the output of one prompt to the input of another. This is often necessary when a task is too large for a single call to the LLM, or when the prompt would exceed the context window (the maximum number of tokens the model can read and write per request).

![](./images/Notebook_4-prompt_chaining.png)
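
To make the chaining idea concrete, here is a minimal plain-Python sketch. It is not LangChain code: `fake_llm` is a hypothetical stand-in for a real model call (so the example runs offline), and the two instructions it understands are made up for illustration.

```python
def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; handles two toy instructions."""
    instruction, _, payload = prompt.partition(": ")
    if instruction == "Summarize":
        # Pretend that the first sentence is a good summary.
        return payload.split(".")[0] + "."
    if instruction == "Translate to uppercase":
        return payload.upper()
    return payload


def summarize(text: str) -> str:
    # First prompt: wrap the user's text in a summarization instruction.
    return fake_llm(f"Summarize: {text}")


def shout(summary: str) -> str:
    # Second prompt: its input is the output of the first prompt.
    return fake_llm(f"Translate to uppercase: {summary}")


def run_chain(text: str) -> str:
    # Chaining: the output of step one becomes the input of step two.
    return shout(summarize(text))


result = run_chain("LangChain chains prompts. It also manages templates.")
print(result)
```

Each step only sees what the previous step produced, which is what keeps every individual call small enough for the context window.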

# LangChain

[Langchain](https://python.langchain.com/docs/get_started/introduction.html) is a framework created by Harrison Chase that facilitates the creation and management of dynamic prompts and chaining between prompts.

Its main features are:
- **Components**: abstractions for working with LMs
- **Off-the-shelf chains**: assembly of components for accomplishing certain higher-level tasks

With langchain it becomes much easier to create what are called Prompt Templates, which are prompts that can take in user data and abstract away the need for typing out everything that is required for a task to get done.

Let's take a look at some simple examples to get started.

In order to create an application with LangChain, we need to understand its core components:

- Models
- Prompts
- Output Parsers

![](2023-08-17-14-48-39.png)

**Models**

Abstractions over LLM APIs such as the ChatGPT API.

In [3]:
#!pip install langchain
# !pip install langchain-openai

In [4]:
from langchain_openai import ChatOpenAI
import os

chat_model = ChatOpenAI(api_key=os.environ["OPENAI_API_KEY"])

In [5]:
chat_model

ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x11bbb7210>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x11da2e910>, root_client=<openai.OpenAI object at 0x11bd81b90>, root_async_client=<openai.AsyncOpenAI object at 0x11da04b50>, model_kwargs={}, openai_api_key=SecretStr('**********'), openai_api_base='https://api.openai.com/v1')

You can predict outputs from both LLMs and ChatModels:

In [6]:
chat_model.invoke("hi! Tell me a quick story about large language models")

AIMessage(content='Once upon a time, in a world where technology was advancing at an unprecedented rate, there existed a powerful tool known as a large language model. This model was created using vast amounts of text data, allowing it to understand and generate human-like language with remarkable accuracy.\n\nPeople from all corners of the world marveled at the capabilities of this model, using it to translate languages, write articles, and even create entire stories. Some hailed it as a revolutionary breakthrough in artificial intelligence, while others expressed concern about the ethical implications of such advanced technology.\n\nDespite the debates and controversies surrounding it, the large language model continued to grow and evolve, becoming an indispensable tool for communication and creativity. And as it continued to learn and improve, the possibilities of what it could achieve seemed endless, leaving the world in awe of its potential.', additional_kwargs={'refusal': None}, 

In [7]:
output = chat_model.invoke("hi! Tell me a joke about an instructor who is always having issues when he tries to run live demos during his live-trainings.")

In [8]:
output

AIMessage(content='Why did the instructor struggle with running live demos during his trainings? Because every time he hit "play," it was always on "pause!"', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 30, 'prompt_tokens': 35, 'total_tokens': 65, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-a51bb1fe-4b9e-4875-8db0-1a53dddb809f-0', usage_metadata={'input_tokens': 35, 'output_tokens': 30, 'total_tokens': 65, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [9]:
from IPython.display import display, Markdown

Markdown(output.content)

Why did the instructor struggle with running live demos during his trainings? Because every time he hit "play," it was always on "pause!"

**Prompts**

Prompt Templates are useful abstractions for reusing prompts. 

They provide context for the specific task that the language model needs to complete. 
A simple example is a `ChatPromptTemplate` that formats a string into a prompt:

In [10]:
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("What is a good name for a company that makes {product}?")
prompt.format(product="hair maker")

'Human: What is a good name for a company that makes hair maker?'

In [11]:
chain = prompt | chat_model

# PP
chain.invoke({"product": "hair maker"})

AIMessage(content='Locks & Luxe Co.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 8, 'prompt_tokens': 20, 'total_tokens': 28, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-74de4812-994e-4331-872c-f245b7e4f3d7-0', usage_metadata={'input_tokens': 20, 'output_tokens': 8, 'total_tokens': 28, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [12]:
output_company_name = chain.invoke({"product": "emotional support for bald instructors"})
output_company_name.content

'"Roots of Strength"'

In [13]:
# U1
chain.invoke({"product": "fresh packaged meal"})

AIMessage(content='Fresh Fare Express', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 4, 'prompt_tokens': 21, 'total_tokens': 25, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-61c2207a-83e6-4023-b8aa-26e16b7f750a-0', usage_metadata={'input_tokens': 21, 'output_tokens': 4, 'total_tokens': 25, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [14]:
# MP
chain.invoke({"product": "Beddings"})

AIMessage(content='CozyDreams Bedding Co.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 9, 'prompt_tokens': 20, 'total_tokens': 29, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-503e780b-ad46-44de-8c2e-5b7870de7dd7-0', usage_metadata={'input_tokens': 20, 'output_tokens': 9, 'total_tokens': 29, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [15]:
# KP
product = "plats that are not easy to kill"

chain.invoke({"product": product})

AIMessage(content='"Evergreen Gardens"', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 26, 'total_tokens': 32, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-01294cc3-5645-4b51-b045-484f15540a5c-0', usage_metadata={'input_tokens': 26, 'output_tokens': 6, 'total_tokens': 32, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [16]:
#SZ:  Advance Night time Nutrients
product = "Advance Night time Nutrients"

chain.invoke({"product": product})

AIMessage(content='Nighttime Renewal Labs', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 23, 'total_tokens': 29, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-bf0f3abe-8634-4841-a0b4-3a09319b11ac-0', usage_metadata={'input_tokens': 23, 'output_tokens': 6, 'total_tokens': 29, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [17]:
# RC
product = "drum set?"

chain.invoke({"product": product})

AIMessage(content='"BeatCraft Drums"', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 20, 'total_tokens': 27, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-1070242c-70a2-45ff-aae8-7e874a598a2c-0', usage_metadata={'input_tokens': 20, 'output_tokens': 7, 'total_tokens': 27, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [18]:
# JC
product = "Pancakes"
chain.invoke({"product": product})

AIMessage(content='Fluffy Stack Co.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 20, 'total_tokens': 26, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-61c1b63a-fb4b-408d-88eb-c0f195f90d35-0', usage_metadata={'input_tokens': 20, 'output_tokens': 6, 'total_tokens': 26, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [19]:
# MP
product = "pestisides"

chain.invoke({"product": product})

AIMessage(content='EcoShield Solutions', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 21, 'total_tokens': 26, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-850160e8-da0c-4d53-9a6e-6e7f416b6dbb-0', usage_metadata={'input_tokens': 21, 'output_tokens': 5, 'total_tokens': 26, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [20]:
# RM
product = "Feijoada"

chain.invoke({"product": product})

AIMessage(content='Feijoada Delights', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 21, 'total_tokens': 27, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-dfd8f4ff-abf4-4880-bf74-db31ff41547a-0', usage_metadata={'input_tokens': 21, 'output_tokens': 6, 'total_tokens': 27, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [21]:
# MP
product = "Dosa & Idly"

chain.invoke({"product": product})

AIMessage(content='"South Indian Delights"', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 7, 'prompt_tokens': 23, 'total_tokens': 30, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-ccc8dd35-60b3-4d2a-ae99-e32d7291910b-0', usage_metadata={'input_tokens': 23, 'output_tokens': 7, 'total_tokens': 30, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

Prompt templates have several advantages over raw string formatting. You can "partial" out variables, that is, format only some of the variables at a time. You can also compose them, easily combining different templates into a single prompt. The LangChain documentation on prompts explains these features in more detail.
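
To illustrate what "partialing" a template means, here is a toy sketch in plain Python. `SimpleTemplate` is a hypothetical class written for this example, not LangChain's implementation; it only mimics the idea of fixing some variables early and the rest later.

```python
from string import Formatter


class SimpleTemplate:
    """Toy prompt template (illustration only, not LangChain's class)."""

    def __init__(self, template: str, **preset: str) -> None:
        self.template = template
        self.preset = preset  # variables that were already filled in

    @property
    def input_variables(self) -> list[str]:
        # Variables that the caller still has to supply.
        names = [name for _, name, _, _ in Formatter().parse(self.template) if name]
        return [name for name in names if name not in self.preset]

    def partial(self, **values: str) -> "SimpleTemplate":
        # Return a new template with some variables fixed ahead of time.
        return SimpleTemplate(self.template, **{**self.preset, **values})

    def format(self, **values: str) -> str:
        # Merge the preset values with the ones supplied now.
        return self.template.format(**{**self.preset, **values})


base = SimpleTemplate("Translate from {source} to {target}: {text}")
to_french = base.partial(source="English", target="French")

print(to_french.input_variables)
print(to_french.format(text="I love programming."))
```

The partially filled template can be reused for many inputs without repeating the fixed values.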

PromptTemplates can also be used to produce a list of messages. In this case, the prompt contains information not only about the content but also about each message (its role, its position in the list, etc.). Most often, a ChatPromptTemplate is a list of ChatMessageTemplates. Each ChatMessageTemplate contains instructions for how to format that ChatMessage: its role and its content. Let's take a look at this below:

In [22]:
# source: https://python.langchain.com/docs/modules/model_io/quick_start
from langchain_core.prompts import ChatPromptTemplate

template = "You are a helpful assistant that translates {input_language} to {output_language}."
human_template = "{text}"

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", template),
    ("human", human_template),
])

chat_prompt.format_messages(input_language="English", output_language="French", text="I love programming.")

[SystemMessage(content='You are a helpful assistant that translates English to French.', additional_kwargs={}, response_metadata={}),
 HumanMessage(content='I love programming.', additional_kwargs={}, response_metadata={})]

**Output Parsers**

OutputParsers convert the raw output from an LLM into a format that can be used downstream. Here is an example of an OutputParser that converts a JSON string into a Python dictionary:

In [23]:
from langchain_core.output_parsers import JsonOutputParser


output_parser = JsonOutputParser()
output = output_parser.parse('{"name": "Lucas"}')
print(output)
type(output)

{'name': 'Lucas'}


dict
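
Conceptually, a JSON output parser takes the model's raw text and returns a Python object. The plain-Python sketch below shows one way this could work. The helper `parse_json_output` is hypothetical and is not LangChain's implementation; it assumes the model sometimes wraps its JSON in a Markdown code fence.

```python
import json
import re


def parse_json_output(raw: str) -> dict:
    """Extract a JSON object from raw model text (toy version)."""
    # If the JSON is wrapped in a Markdown fence, keep only what is inside.
    match = re.search(r"`{3}(?:json)?\s*(.*?)`{3}", raw, flags=re.DOTALL)
    payload = match.group(1) if match else raw
    return json.loads(payload.strip())


fence = "`" * 3  # three backticks, built programmatically for readability
wrapped = fence + 'json\n{"name": "Lucas"}\n' + fence

print(parse_json_output('{"name": "Lucas"}'))
print(parse_json_output(wrapped))
```

Both calls return the same dictionary, so downstream code does not need to care how the model decorated its answer.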

# Composing Chains with LCEL

source: https://python.langchain.com/docs/modules/model_io/quick_start#:~:text=We%20can%20now,green'%2C%20'yellow'%2C%20'orange'%5D
We can now combine all these into one chain. This chain will take input variables, pass those to a prompt template to create a prompt, pass the prompt to a language model, and then pass the output through an (optional) output parser. 

The modern version with the LCEL interface:

In [24]:
template = "Generate a list of 5 {text}.\n\n{format_instructions}"

chat_prompt = ChatPromptTemplate.from_template(template)

chat_prompt = chat_prompt.partial(format_instructions=output_parser.get_format_instructions())

chain = chat_prompt | chat_model | output_parser
chain.invoke({"text": "AI topics"})
# Returns a Python dict parsed from the model's JSON; the key name is chosen by the model

{'topics': ['1. Natural Language Processing',
  '2. Computer Vision',
  '3. Reinforcement Learning',
  '4. Ethics in AI',
  '5. Neural Networks']}

In [25]:
# KP: professions that are least threatened by AI
example = "professions that are least threatened by AI"

chain.invoke({"text": example})

{'professions': ['Social worker', 'Therapist', 'Chef', 'Teacher', 'Lawyer']}

In [26]:
# TB
example = "names for spaceships"
chain.invoke({"text": example})

{'spaceshipNames': ['Stellar Voyager',
  'Galactic Explorer',
  'Cosmic Wanderer',
  'Nebula Seeker',
  'Starlight Cruiser']}

In [27]:
# AP
example = "things to do for productive day"

chain.invoke({"text": example})

{'things_to_do': ['Create a to-do list and prioritize tasks',
  'Break tasks into smaller, manageable chunks',
  'Take breaks to recharge and avoid burnout',
  'Limit distractions and focus on one task at a time',
  'Reflect on accomplishments at the end of the day']}

In [28]:
# MP
example = "Starwars Movies"

chain.invoke({"text": example})

{'Starwars Movies': ['Star Wars: A New Hope',
  'Star Wars: The Empire Strikes Back',
  'Star Wars: Return of the Jedi',
  'Star Wars: The Force Awakens',
  'Star Wars: The Last Jedi']}

In [29]:
# SZ: Favourite UK food

example = "Favourite UK food"

chain.invoke({"text": example})

{'favourite_UK_food': ['Fish and chips',
  'Full English breakfast',
  "Shepherd's pie",
  'Bangers and mash',
  'Chicken tikka masala']}

In [30]:
output_parser.get_format_instructions()

'Return a JSON object.'

We are using the `|` syntax to join these components together. This syntax is powered by the LangChain Expression Language (LCEL) and relies on the universal Runnable interface that all of these objects implement. To learn more about LCEL, see the LangChain documentation.
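
To build some intuition for how a `|` operator can compose steps, here is a toy sketch based on Python's `__or__` operator overloading. The `Step` class and the `toy_*` names are hypothetical and written only for this illustration. LangChain's Runnable interface is much richer, so treat this as an analogy, not as the library's code.

```python
class Step:
    """Toy runnable: wraps a function and supports `|` composition."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Three stand-ins for prompt, model, and parser (the "model" is a canned reply).
toy_prompt = Step(lambda d: f"Generate a list of 3 {d['text']}.")
toy_model = Step(lambda p: "red, green, blue" if "colors" in p else "")
toy_parser = Step(lambda s: [item.strip() for item in s.split(",")])

toy_chain = toy_prompt | toy_model | toy_parser
print(toy_chain.invoke({"text": "colors"}))
```

Because every step exposes the same `invoke` method, any step can be swapped out without touching the others.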

<!-- For this part I just took some info from the langchain official docs: https://python.langchain.com/docs/modules/model_io/quick_start -->

Another LCEL chain, this time with a comma-separated list parser:

In [31]:
from langchain_core.output_parsers import CommaSeparatedListOutputParser
template = """What would be 5 good names for the animal: {animal} that is {adjective}?
The output should be just one sentence separated by commas."""

chat_prompt = ChatPromptTemplate.from_template(template)

chain = chat_prompt | ChatOpenAI() | CommaSeparatedListOutputParser()

chain.invoke({"animal":"dogs", "adjective": "sleepy"})

['1. Snoozy', '2. Dozer', '3. Napper', '4. Snuggles', '5. Dreamer.']

This chain will take input variables, pass those to a prompt template to create a prompt, pass the prompt to an LLM, and then pass the output through an output parser.
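
In the sleepy-dog output above, the items still carry their numbering and a trailing period. One possible post-processing step is sketched below in plain Python. The helper name `parse_comma_list` is hypothetical, and the raw string is reconstructed from the output shown above.

```python
import re


def parse_comma_list(text: str) -> list[str]:
    """Split on commas and strip numbering such as '1. ' plus trailing periods."""
    items = []
    for part in text.split(","):
        cleaned = re.sub(r"^\s*\d+[.)]\s*", "", part)  # drop "1. " style prefixes
        cleaned = cleaned.strip().rstrip(".")
        if cleaned:
            items.append(cleaned)
    return items


raw = "1. Snoozy, 2. Dozer, 3. Napper, 4. Snuggles, 5. Dreamer."
print(parse_comma_list(raw))
```

Another option is to tighten the prompt itself, for example by asking explicitly for names without numbering.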

Ok, so these are the basics of langchain. But how can we leverage these abstractions inside our LLM application?

One of the best applications of langchain is the "chat with your data" type of application, where the user uploads a document such as a PDF or a .txt file and can then query that document using langchain powered by an LLM like ChatGPT. 

# LangChain Lab Exercises

Let's take a look at a simple chain that uses only the modern (LCEL) interface.

In [32]:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

In [33]:
llm = ChatOpenAI(temperature=.7)
template = """You are a learning assistant. Given a technical subject, write down 5 fundamental concepts to understand it.
Subject: {subject}
Learning assistant: The 5 fundamental concepts are:"""
subject_prompt = ChatPromptTemplate.from_template(template)

In [34]:
# Second prompt: explain each of the concepts produced by the first prompt.
llm = ChatOpenAI(temperature=.7)
template = """You are an expert teacher in all technical and scientific fields. Given a list of 5 concepts, write down a simple intuitive explanation of each concept.
Concepts:
{concepts}
Intuitive explanations:"""
concepts_prompt = ChatPromptTemplate.from_template(template)

In [35]:
from IPython.display import Markdown
# This is the overall chain where we run these two chains in sequence.
learning_overall_chain = (
    {"concepts": subject_prompt | llm | StrOutputParser() }
    | concepts_prompt
    | llm
    | StrOutputParser()
    )

output = learning_overall_chain.invoke({"subject": "Quantum Mechanics"})
Markdown(output)

1. Superposition: Imagine a particle as a spinning top that can spin both clockwise and counterclockwise at the same time. In the quantum world, particles can exist in multiple spinning states simultaneously, creating a blend of possibilities until a measurement is made.

2. Wave-particle duality: Think of a particle as a tiny ball bouncing around, but also as a wave spreading out like ripples in a pond. In quantum mechanics, particles exhibit both particle-like behavior (bouncing off surfaces) and wave-like behavior (interfering with each other) depending on how they are observed.

3. Uncertainty principle: Picture trying to pinpoint the exact location and speed of a fast-moving car at the same time. The uncertainty principle states that in the quantum world, the more accurately we know one aspect of a particle (like its position), the less accurately we can know another aspect (like its momentum).

4. Quantum entanglement: Imagine two particles as dance partners who, once entangled, move in perfect synchrony no matter how far apart they are. In quantum entanglement, the state of one particle instantly affects the state of the other, even if they are separated by vast distances.

5. Quantum tunneling: Picture a tiny ant finding a way to crawl through a solid wall by magically appearing on the other side. Quantum tunneling allows particles to pass through barriers that would be impossible to cross according to classical physics, like a ghost slipping through a closed door.

Example from KP: can you write a sample LangChain chain to compute (2+3) * 6, where (2+3) is one chain and * 6 is another chain? The cell below sets up the first building block, a single math chain.

In [36]:
template = """
You are a mathematical engine. Given a math operation you should output only the result.
input: {math_input}
output:
"""

chat_model = ChatOpenAI(temperature=0)
prompt1 = ChatPromptTemplate.from_template(template)
output_parser = StrOutputParser()

chain_math1 = prompt1 | chat_model | output_parser

chain_math1.invoke({"math_input": "2+2"})

'4'
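
The two-step version of KP's question, (2+3) * 6, can be sketched without an API call. In the plain-Python example below, `fake_math_llm` is a hypothetical stand-in for the model that safely evaluates basic arithmetic, and `first_chain` / `second_chain` are made-up names. The point is only to show the output of one chain being templated into the input of the next.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def fake_math_llm(prompt: str) -> str:
    """Hypothetical model stand-in: evaluates the text between 'input:' and 'output:'."""
    expression = prompt.split("input:")[1].split("output:")[0].strip()

    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")

    return str(evaluate(ast.parse(expression, mode="eval").body))


MATH_TEMPLATE = "input: {math_input}\noutput:"


def first_chain(math_input: str) -> str:
    return fake_math_llm(MATH_TEMPLATE.format(math_input=math_input))


def second_chain(previous_result: str) -> str:
    # The first chain's output is templated into the second chain's input.
    return fake_math_llm(MATH_TEMPLATE.format(math_input=f"{previous_result} * 6"))


print(second_chain(first_chain("2+3")))
```

With a real model, the same wiring applies; only the stand-in function would be replaced by the prompt, model, and parser pipeline.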

# Simple Q&A Example

In [49]:
# !pip install docarray
from langchain_openai.chat_models import ChatOpenAI
from langchain_community.document_loaders.csv_loader import CSVLoader
from IPython.display import display, Markdown
import pandas as pd

In [50]:
df = pd.read_csv("./superheroes.csv")
df.head()

Unnamed: 0,Superhero Name,Superpower,Power Level,Catchphrase
0,Captain Thunder,Bolt Manipulation,90,Feel the power of the storm!
1,Silver Falcon,Flight and Agility,85,"Soar high, fearlessly!"
2,Mystic Shadow,Invisibility and Illusions,78,Disappear into the darkness!
3,Blaze Runner,Pyrokinesis,88,Burn bright and fierce!
4,Electra-Wave,Electric Manipulation,82,Unleash the electric waves!


In [51]:
file = 'superheroes.csv'
loader = CSVLoader(file_path=file)

In [52]:
loader

<langchain_community.document_loaders.csv_loader.CSVLoader at 0x34c585a90>

In [53]:
documents = loader.load()

documents

[Document(metadata={'source': 'superheroes.csv', 'row': 0}, page_content='Superhero Name: Captain Thunder\nSuperpower: Bolt Manipulation\nPower Level: 90\nCatchphrase: Feel the power of the storm!'),
 Document(metadata={'source': 'superheroes.csv', 'row': 1}, page_content='Superhero Name: Silver Falcon\nSuperpower: Flight and Agility\nPower Level: 85\nCatchphrase: Soar high, fearlessly!'),
 Document(metadata={'source': 'superheroes.csv', 'row': 2}, page_content='Superhero Name: Mystic Shadow\nSuperpower: Invisibility and Illusions\nPower Level: 78\nCatchphrase: Disappear into the darkness!'),
 Document(metadata={'source': 'superheroes.csv', 'row': 3}, page_content='Superhero Name: Blaze Runner\nSuperpower: Pyrokinesis\nPower Level: 88\nCatchphrase: Burn bright and fierce!'),
 Document(metadata={'source': 'superheroes.csv', 'row': 4}, page_content='Superhero Name: Electra-Wave\nSuperpower: Electric Manipulation\nPower Level: 82\nCatchphrase: Unleash the electric waves!'),
 Document(meta

Now, let's set up our vector store, a database that stores each document as an embedding vector so that we can search it by semantic similarity:

In [54]:
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

In [55]:
# !pip install langchain-chroma
from langchain_chroma import Chroma

db = Chroma.from_documents(documents, embeddings)

In [56]:
retriever = db.as_retriever()
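
To get a feel for what a vector store and its retriever do, here is a toy version in plain Python. It is only a sketch under strong assumptions: `embed` uses word counts instead of a real embedding model, and `ToyVectorStore` is a hypothetical class, not Chroma's API.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: word counts (real embeddings are dense float vectors)."""
    return Counter(text.lower().replace(":", " ").replace("!", " ").split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    def __init__(self, texts: list[str]) -> None:
        # Store every text next to its vector.
        self.entries = [(text, embed(text)) for text in texts]

    def search(self, query: str, k: int = 1) -> list[str]:
        # Rank stored texts by similarity to the query vector.
        query_vec = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(query_vec, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore([
    "Superhero Name: Captain Thunder Catchphrase: Feel the power of the storm!",
    "Superhero Name: Silver Falcon Catchphrase: Soar high, fearlessly!",
    "Superhero Name: Blaze Runner Catchphrase: Burn bright and fierce!",
])
print(store.search("catchphrase for Captain Thunder"))
```

A real embedding model captures meaning rather than exact word overlap, but the store-then-rank-by-similarity pattern is the same.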

In [57]:
from langchain_core.runnables import RunnablePassthrough


template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

In [58]:
from langchain_core.output_parsers import StrOutputParser
model = ChatOpenAI()

chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)
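
The dictionary at the start of the chain above sends the same input to every entry: the retriever turns the question into context, while `RunnablePassthrough` hands the question through unchanged. The plain-Python sketch below mimics that fan-out. `toy_retriever`, `passthrough`, and `fan_out` are hypothetical helpers, not LangChain functions.

```python
def toy_retriever(question: str) -> list[str]:
    # Hypothetical retriever: returns canned context regardless of the question.
    return ["Superhero Name: Captain Thunder\nCatchphrase: Feel the power of the storm!"]


def passthrough(value):
    # Returns its input unchanged.
    return value


def fan_out(steps: dict, value) -> dict:
    # Every entry receives the same input; results are collected under the same keys.
    return {key: step(value) for key, step in steps.items()}


question = "Tell me the catch phrase for Captain Thunder"
inputs = fan_out({"context": toy_retriever, "question": passthrough}, question)

# The resulting dict supplies the variables that the prompt template expects.
filled = "Context: {context}\n\nQuestion: {question}".format(**inputs)
print(filled)
```

The resulting dictionary has exactly the keys that the prompt template needs, which is why the two pieces fit together.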

In [59]:
query = "Tell me the catch phrase for Captain Thunder"
print(chain.invoke(query))

The catchphrase for Captain Thunder is "Feel the power of the storm!"


In [60]:
df

Unnamed: 0,Superhero Name,Superpower,Power Level,Catchphrase
0,Captain Thunder,Bolt Manipulation,90,Feel the power of the storm!
1,Silver Falcon,Flight and Agility,85,"Soar high, fearlessly!"
2,Mystic Shadow,Invisibility and Illusions,78,Disappear into the darkness!
3,Blaze Runner,Pyrokinesis,88,Burn bright and fierce!
4,Electra-Wave,Electric Manipulation,82,Unleash the electric waves!
5,Crimson Cyclone,Super Speed,91,Blazing fast and unstoppable!
6,Aqua Fury,Hydrokinesis,80,Ride the waves of power!
7,Lunar Guardian,Lunar Manipulation,77,Embrace the moon's might!
8,Steel Titan,Super Strength and Durability,95,Indestructible force of nature!
9,Nightblade,Night Vision and Stealth,84,Strike from the shadows!


In [61]:
query = "Tell me the catch phrase for the likely fastest superhero in the table"
print(chain.invoke(query))

The catchphrase for the likely fastest superhero in the table is "Blazing fast and unstoppable!"


# References
- https://python.langchain.com/docs/get_started/introduction.html
- https://medium.com/@remitoffoli/a-visual-guide-to-llm-powered-app-architecture-57e47426a92f
- [LangChain for LLM App Development short course by coursera](https://learn.deeplearning.ai/langchain/lesson/5/question-and-answer)
- [LLM Evaluation](https://learn.deeplearning.ai/langchain/lesson/6/evaluation)
- [Models, Prompts, parsers, memory and chains from this langchain for](https://learn.deeplearning.ai/langchain/lesson/7/agents)
- [Chat With Your Data - Retrieval](https://learn.deeplearning.ai/langchain-chat-with-your-data/lesson/5/retrieval)
- [Embeddings simple definition](https://learn.deeplearning.ai/langchain/lesson/5/question-and-answer)
- [Vector DBs - simple definition](https://learn.deeplearning.ai/langchain/lesson/5/question-and-answer)