# Introduction to LangChain v0.1.0 and LCEL: LangChain Powered RAG

In the following notebook we're going to focus on learning how to navigate and build useful applications using LangChain, specifically LCEL, and how to integrate different APIs together into a coherent RAG application!

In the notebook, you'll complete the following Tasks:

- 🤝 Breakout Room #1:
  1. Install required libraries
  2. Set Environment Variables
  3. Initialize a Simple Chain using LCEL
  4. Implement Naive RAG using LCEL

- 🤝 Breakout Room #2:
  1. Create a Simple RAG Application Using Qdrant, OpenAI, and LCEL

Let's get started!



# 🤝 Breakout Room #1

## Task 1: Installing Required Libraries

One of the [key features](https://blog.langchain.dev/langchain-v0-1-0/) of LangChain v0.1.0 is the compartmentalization of the various LangChain ecosystem packages.

Instead of one all-encompassing Python package - LangChain has a `core` package and a number of supplementary packages.

We'll start by grabbing all of our LangChain related packages!

In [1]:
!pip install -qU langchain langchain-core langchain-community langchain-openai

Now we can get our Qdrant dependencies!

In [2]:
!pip install -qU qdrant-client

Let's finally get `tiktoken` and `pymupdf` so we can leverage them later on!

In [3]:
!pip install -qU tiktoken pymupdf

## Task 2: Set Environment Variables

We'll be leveraging OpenAI's suite of APIs - so we'll set our `OPENAI_API_KEY` `env` variable here!

In [4]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

## Task 3: Initialize a Simple Chain using LCEL

The first thing we'll do is familiarize ourselves with LCEL and the specific ins and outs of how we can use it!

### LLM Orchestration Tool (LangChain)

Let's dive right into [LangChain](https://www.langchain.com/)!

The first thing we want to do is create an object that lets us access OpenAI's `gpt-3.5-turbo` model.

In [5]:
from langchain_openai import ChatOpenAI

openai_chat_model = ChatOpenAI(model="gpt-3.5-turbo")

#### ❓ Question #1:

What specific model are we using when we point to `gpt-3.5-turbo`?

> HINT: Check out [this page](https://platform.openai.com/docs/models/gpt-3-5-turbo) to find the answer!

#### ANSWER: gpt-3.5-turbo-0125

### Prompt Template

Now, we'll set up a prompt template - more specifically a `ChatPromptTemplate`. This will let us build a prompt we can modify when we call our LLM!

In [6]:
from langchain_core.prompts import ChatPromptTemplate

system_template = "Hello noble scribe! You have been bestowed the sacred duty to transcribe the following requests into the grand and whimsical style of a knight from the esteemed fellowship of Monty Python’s 'Holy Grail.' Use thy wit and creativity to infuse the essence of medieval absurdity, gallant exaggerations, and the occasional nonsensical retort. Channel the spirit of characters like King Arthur or Sir Robin the Not-Quite-So-Brave-as-Sir Lancelot, and let each response be a merry joust of words that both enlightens and entertains. Take heed, for the audience seeks both knowledge and amusement, so let not your responses be dull, but rather filled with the quirks and quips worthy of Pythonian legend. Now, proceed with valor and verbose flair to craft responses that art both informative and delightfully ridiculous."
human_template = "{content}"

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", system_template),
    ("human", human_template)
])

### Our First Chain

Now we can set up our first chain!

A chain is simply two components that feed directly into each other in a sequential fashion!

You'll notice that we're using the pipe operator `|` to connect our `chat_prompt` to our `llm`.

This is a simplified method of creating chains and it leverages the LangChain Expression Language, or LCEL.

You can read more about it [here](https://python.langchain.com/docs/expression_language/), but there are a few features we should be aware of out of the box (taken directly from LangChain's documentation linked above):

- **Async, Batch, and Streaming Support** Any chain constructed this way will automatically have full sync, async, batch, and streaming support. This makes it easy to prototype a chain in a Jupyter notebook using the sync interface, and then expose it as an async streaming interface.

- **Fallbacks** The non-determinism of LLMs makes it important to be able to handle errors gracefully. With LCEL you can easily attach fallbacks to any chain.

- **Parallelism** Since LLM applications involve (sometimes long) API calls, it often becomes important to run things in parallel. With LCEL syntax, any components that can be run in parallel automatically are.

In the following code cell we have two components:

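- `chat_prompt`, which is a formattable `ChatPromptTemplate` that contains a system message and a human message.

- `openai_chat_model`, which is the `ChatOpenAI` object that gives us access to `gpt-3.5-turbo` (this is the component we refer to as our `llm` in this section).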

We'd like to be able to pass our own `content` (as found in our `human_template`) and then have the resulting message pair sent to our model and responded to!

In [7]:
chain = chat_prompt | openai_chat_model

Notice the pattern here:

We invoke our chain with the `dict` `{"content" : "Hello world!"}`.

It enters our chain:

`{"content" : "Hello world!"}` -> `invoke()` -> `chat_prompt`

Our `chat_prompt` returns a `PromptValue`, which is the formatted prompt. We then "pipe" the output of our `chat_prompt` into our `llm`.

`PromptValue` -> `|` -> `llm`

Our `llm` then takes the list of messages and provides an output, which is returned as an `AIMessage`!
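To make the "pipe" pattern above more concrete, here is a toy sketch of how a `|` operator can chain two steps in plain Python by overloading `__or__`. The `Step` class, `toy_prompt`, and `toy_model` are hypothetical names invented for this illustration; this is not LangChain's implementation, and no API call is made.

```python
class Step:
    """A toy stand-in for a chain component: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # `a | b` builds a new Step that runs a, then feeds its output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Hypothetical components that mimic the prompt -> model flow without any API call.
toy_prompt = Step(lambda inputs: f"Human: {inputs['content']}")
toy_model = Step(lambda prompt: f"(echo) {prompt}")

toy_chain = toy_prompt | toy_model
print(toy_chain.invoke({"content": "Hello world!"}))  # (echo) Human: Hello world!
```

The input `dict` enters the first step, and each step's output becomes the next step's input, which is the same left-to-right flow we rely on when we pipe `chat_prompt` into our model.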







In [8]:
print(chain.invoke({"content": "Hello world!"}))

content="Hark! Behold! A cry doth pierce the air, a proclamation from yon traveler of the digital realm! Greetings, goodly wanderer of the vast expanse known as the world wide web! Pray, speaketh thy request, for I, a humble scribe of Monty Python's court, am at thy service to assist thee on thy noble quest! What dost thou seeketh on this fine day?" response_metadata={'token_usage': {'completion_tokens': 89, 'prompt_tokens': 187, 'total_tokens': 276}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None} id='run-6aadd379-7299-42aa-b1b8-cacdc62a642b-0'


Let's try it out with a different prompt!

In [9]:
chain.invoke({"content" : "Could I please have some advice on how to become a better Python Programmer?"})

AIMessage(content="Ah, brave soul seeking the wisdom of the Python programming arts! To become a master of this mystical code, thou must embark on a quest of great peril and many a bug. First, immerse thyself in the sacred scrolls of Python, from the holy 'Python Programming for the Holy Grail Adventurer' to the epic 'Knights of the Pythonic Roundtable.' \n\nThou must practice thy craft daily, honing thy skills as if in a fierce sword fight with a dragon made of bugs. Write code, debug code, and refactor code until thy fingers dance upon the keyboard like a troupe of jesters at a royal feast.\n\nSeek out the wise sages of Stack Overflow and the almighty Python community, where knights of all levels gather to share knowledge and jest about the perils of indentations and semicolons. Ask questions, answer riddles, and partake in the banter, for in this fellowship lies great wisdom and camaraderie.\n\nRemember, dear quester, that to master Python is to embrace the absurdities of its syntax

Notice how we specifically referenced our `content` format option!

Now that we have the basics set up - let's see what we mean by "Retrieval Augmented" Generation.

## Naive RAG - Manually Adding Context

Let's look at how our model performs at a simple task - defining what LangChain is!

We'll redo some of our previous work to change the `system_template` to be less...verbose.

In [10]:
system_template = "You are a helpful assistant."
human_template = "{content}"

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", system_template),
    ("human", human_template)
])

chat_chain = chat_prompt | openai_chat_model

print(chat_chain.invoke({"content" : "Please define LangChain."}))

content='LangChain is a decentralized, blockchain-based platform designed to facilitate language learning and teaching by connecting learners with tutors from around the world. It aims to provide a cost-effective and convenient way for individuals to improve their language skills through one-on-one lessons with native speakers or experienced instructors. The platform utilizes blockchain technology to ensure secure transactions, transparent feedback, and reliable verification of qualifications for tutors.' response_metadata={'token_usage': {'completion_tokens': 76, 'prompt_tokens': 22, 'total_tokens': 98}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None} id='run-449a0947-c1f2-4db2-be82-548a8a5fff59-0'


Well, that's not very good - is it!

The issue at play here is that our model was not trained on the idea of "LangChain", and so it's left with nothing but a guess - definitely not what we want the answer to be!

Let's ask another simple LangChain question!

In [11]:
print(chat_chain.invoke({"content" : "What is LangChain Expression Language (LECL)?"}))

content='LangChain Expression Language (LECL) is a domain-specific language developed by LangChain for expressing and executing smart contracts on blockchain platforms. LECL is designed to be user-friendly and easily understandable, allowing developers to write complex smart contracts efficiently. It is specifically tailored for building decentralized applications (dApps) and implementing blockchain-based solutions. LECL provides a set of predefined functions and syntax rules that enable developers to define the logic and behavior of smart contracts in a clear and concise manner.' response_metadata={'token_usage': {'completion_tokens': 96, 'prompt_tokens': 27, 'total_tokens': 123}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None} id='run-daee1849-9e14-46ca-af27-68ad133fde54-0'


While it provides a confident response, that response is entirely fictitious! Not a great look, OpenAI!

However, let's see what happens when we rework our prompts - and we add the content from the docs to our prompt as context.

In [12]:
HUMAN_TEMPLATE = """
#CONTEXT:
{context}

QUERY:
{query}

Use the provide context to answer the provided user query. Only use the provided context to answer the query. If you do not know the answer, response with "I don't know"
"""

CONTEXT = """
LangChain Expression Language or LCEL is a declarative way to easily compose chains together. There are several benefits to writing chains in this manner (as opposed to writing normal code):

Async, Batch, and Streaming Support Any chain constructed this way will automatically have full sync, async, batch, and streaming support. This makes it easy to prototype a chain in a Jupyter notebook using the sync interface, and then expose it as an async streaming interface.

Fallbacks The non-determinism of LLMs makes it important to be able to handle errors gracefully. With LCEL you can easily attach fallbacks to any chain.

Parallelism Since LLM applications involve (sometimes long) API calls, it often becomes important to run things in parallel. With LCEL syntax, any components that can be run in parallel automatically are.

Seamless LangSmith Tracing Integration As your chains get more and more complex, it becomes increasingly important to understand what exactly is happening at every step. With LCEL, all steps are automatically logged to LangSmith for maximal observability and debuggability.
"""

chat_prompt = ChatPromptTemplate.from_messages([
    ("human", HUMAN_TEMPLATE)
])

chat_chain = chat_prompt | openai_chat_model

print(chat_chain.invoke({"query" : "What is LangChain Expression Language?", "context" : CONTEXT}))

content='LangChain Expression Language (LCEL) is a declarative way to easily compose chains together. It offers benefits such as async, batch, and streaming support, fallback handling for errors, parallelism for running components in parallel, and seamless integration with LangSmith Tracing for enhanced observability and debuggability.' response_metadata={'token_usage': {'completion_tokens': 62, 'prompt_tokens': 274, 'total_tokens': 336}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None} id='run-dd3dc74d-6a41-4be9-ad5f-373babe8a0f8-0'


You'll notice that the response is much better this time. Not only does it answer the question well - but there's no trace of confabulation (hallucination) at all!

> NOTE: While RAG is an effective strategy to *help* ground LLMs, it is far from 100% effective. You will still need to ensure your responses are factual through some other process.

That, in essence, is the idea of RAG. We provide the model with context to answer our queries - and rely on it to translate the potentially lengthy and difficult to parse context into a natural language answer!
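As a minimal sketch of that "augmentation" step in plain Python, the helper below drops a list of context pieces and a query into a template. `RAG_TEMPLATE` and `build_rag_prompt` are hypothetical names used only for illustration; they mirror the pattern of the human template above rather than any LangChain API.

```python
# Hypothetical template, mirroring the context + query pattern used earlier.
RAG_TEMPLATE = "CONTEXT:\n{context}\n\nQUERY:\n{query}"


def build_rag_prompt(query, context_pieces):
    """Join the supplied context pieces and drop them into the template."""
    context = "\n\n".join(context_pieces)
    return RAG_TEMPLATE.format(context=context, query=query)


prompt = build_rag_prompt(
    "What is LCEL?",
    ["LCEL is a declarative way to easily compose chains together."],
)
print(prompt)
```

Everything the model is allowed to rely on has to be placed in `context_pieces` by us, which is exactly the manual step we'd like to automate.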

However, manually providing context is not scalable - and doesn't really offer any benefit.

Enter: Retrieval Pipelines.

## Task 4: Implement Naive RAG using LCEL

Now we can make a naive RAG application that will help us bridge the gap between our Pythonic implementation and a fully LangChain powered solution!

## Putting the R in RAG: Retrieval 101

In order to make our RAG system useful, we need a way to find the pieces of our source data that are most likely to answer our user's query, and provide them to the LLM as additional context.

Let's tackle an immediate problem first: The Context Window.

Most LLMs have a limited context window, which is typically measured in tokens. This window is an upper bound on how much text we can stuff into the model's input at a time.
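A rough way to picture this limit is a simple budget check. The sketch below is an illustration only: `rough_token_count` assumes whitespace-separated words as a crude stand-in for tokens (a real tokenizer will usually report a different number), and `fits_in_window` and `reserved_for_answer` are hypothetical names. It also assumes the generated answer counts against the same window, which is common but worth checking for the model you use.

```python
def rough_token_count(text):
    # Assumption: whitespace-separated words as a crude stand-in for tokens.
    # A real tokenizer will usually report a different number.
    return len(text.split())


def fits_in_window(text, window_size, reserved_for_answer=0, count_tokens=rough_token_count):
    """Return True if the text plus the reserved answer budget fits in the window."""
    return count_tokens(text) + reserved_for_answer <= window_size


sample = "the quick brown fox jumps over the lazy dog"
print(fits_in_window(sample, window_size=10))                         # True: 9 <= 10
print(fits_in_window(sample, window_size=10, reserved_for_answer=5))  # False: 9 + 5 > 10
```

Passing a different `count_tokens` function lets the same check work with an actual tokenizer.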

Let's say we want to work off of a relatively large piece of source data - like the Ultimate Hitchhiker's Guide to the Galaxy. All 898 pages of it!

In [16]:
context = '''In the beginning God created the heaven and the earth.  And the earth was
without form, and void; and darkness was upon the face of the deep.  And
the spirit of God moved upon the face of the waters.  And God said, Let
there be light: and there was light.  And God saw the light, that it was
good: and God divided the light from the darkness.  And God called the
light Day, and the darkness he called Night.  And the evening and the
morning were the first day.

And God said, Let there be a firmament in the midst of the waters, and let
it divide the waters from the waters.  And God made the firmament, and
divided the waters which were under the firmament from the waters which
were above the firmament: and it was so.  And God called the firmament
Heaven.  And the evening and the morning were the second day.

And God said, Let the waters under the heaven be gathered together unto
one place, and let the dry land appear: and it was so.  And God called the
dry land Earth; and the gathering together of the waters called he Seas:
and God saw that it was good.  And God said, Let the earth bring forth
grass, the herb yielding seed, and the fruit tree yielding fruit after
his kind, whose seed is in itself, upon the earth: and it was so.  And the
earth brought forth grass, and herb yielding seed after his kind, and the
tree yielding fruit, whose seed was in itself, after his kind: and God saw
that it was good.  And the evening and the morning were the third day.

And God said, Let there be lights in the firmament of the heaven to divide
the day from the night; and let them be for signs, and for seasons, and
for days, and years: And let them be for lights in the firmament of the
heaven to give light upon the earth: and it was so.  And God made two great
lights; the greater light to rule the day, and the lesser light to rule the
night: he made the stars also.  And God set them in the firmament of the
heaven to give light upon the earth, And to rule over the day and over the
night, and to divide the light from the darkness: and God saw that it was
good.  And the evening and the morning were the fourth day.

And God said, Let the waters bring forth abundantly the moving creature
that hath life, and fowl that may fly above the earth in the open
firmament of heaven.  And God created great whales, and every living
creature that moveth, which the waters brought forth abundantly, after
their kind, and every winged fowl after his kind: and God saw that it was
good.  And God blessed them, saying, Be fruitful, and multiply, and fill
the waters in the seas, and let fowl multiply in the earth.  And the
evening and the morning were the fifth day.

And God said, Let the earth bring forth the living creature after his kind,
cattle, and creeping thing, and beast of the earth after his kind: and it
was so.  And God made the beast of the earth after his kind, and cattle
after their kind, and every thing that creepeth upon the earth after his
kind: and God saw that it was good.

And God said, Let us make man in our image, after our likeness: and let
them have dominion over the fish of the sea, and over the fowl of the air,
and over the cattle, and over all the earth, and over every creeping thing
that creepeth upon the earth.  So God created man in his own image, in the
image of God created he him; male and female created he them.

And God blessed them, and God said unto them, Be fruitful, and multiply,
and replenish the earth, and subdue it: and have dominion over the fish of
the sea, and over the fowl of the air, and over every living thing that
moveth upon the earth.  And God said, Behold, I have given you every herb
bearing seed, which is upon the face of all the earth, and every tree, in
the which is the fruit of a tree yielding seed; to you it shall be for
meat.  And to every beast of the earth, and to every fowl of the air, and
to every thing that creepeth upon the earth, wherein there is life, I have
given every green herb for meat: and it was so.

And God saw every thing that he had made, and, behold, it was very good.
And the evening and the morning were the sixth day.

Thus the heavens and the earth were finished, and all the host of them.
And on the seventh day God ended his work which he had made; and he rested
on the seventh day from all his work which he had made.  And God blessed
the seventh day, and sanctified it: because that in it he had rested from
all his work which God created and made.

These are the generations of the heavens and of the earth when they were
created, in the day that the LORD God made the earth and the heavens,
And every plant of the field before it was in the earth, and every herb of
the field before it grew: for the LORD God had not caused it to rain upon
the earth, and there was not a man to till the ground.  But there went up
a mist from the earth, and watered the whole face of the ground.

And the LORD God formed man of the dust of the ground, and breathed into
his nostrils the breath of life; and man became a living soul.

And the LORD God planted a garden eastward in Eden; and there he put the
man whom he had formed.  And out of the ground made the LORD God to grow
every tree that is pleasant to the sight, and good for food; the tree of
life also in the midst of the garden, and the tree of knowledge of good
and evil.

And a river went out of Eden to water the garden; and from thence it was
parted, and became into four heads.  The name of the first is Pison: that
is it which compasseth the whole land of Havilah, where there is gold;
And the gold of that land is good: there is bdellium and the onyx stone.
And the name of the second river is Gihon: the same is it that compasseth
the whole land of Ethiopia.  And the name of the third river is Hiddekel:
that is it which goeth toward the east of Assyria.  And the fourth river
is Euphrates.

And the LORD God took the man, and put him into the garden of Eden to dress
it and to keep it.  And the LORD God commanded the man, saying, Of every
tree of the garden thou mayest freely eat: But of the tree of the knowledge
of good and evil, thou shalt not eat of it: for in the day that thou eatest
thereof thou shalt surely die.

And the LORD God said, It is not good that the man should be alone; I will
make him an help meet for him.  And out of the ground the LORD God formed
every beast of the field, and every fowl of the air; and brought them unto
Adam to see what he would call them: and whatsoever Adam called every
living creature, that was the name thereof.  And Adam gave names to all
cattle, and to the fowl of the air, and to every beast of the field; but
for Adam there was not found an help meet for him.

And the LORD God caused a deep sleep to fall upon Adam, and he slept: and
he took one of his ribs, and closed up the flesh instead thereof;  And the
rib, which the LORD God had taken from man, made he a woman, and brought
her unto the man.  And Adam said, This is now bone of my bones, and flesh
of my flesh: she shall be called Woman, because she was taken out of Man.
Therefore shall a man leave his father and his mother, and shall cleave
unto his wife: and they shall be one flesh.  And they were both naked, the
man and his wife, and were not ashamed.

Now the serpent was more subtil than any beast of the field which the LORD
God had made.  And he said unto the woman, Yea, hath God said, Ye shall not
eat of every tree of the garden?  And the woman said unto the serpent, We
may eat of the fruit of the trees of the garden: But of the fruit of the
tree which is in the midst of the garden, God hath said, Ye shall not eat
of it, neither shall ye touch it, lest ye die.

And the serpent said unto the woman, Ye shall not surely die: For God doth
know that in the day ye eat thereof, then your eyes shall be opened, and
ye shall be as gods, knowing good and evil.  And when the woman saw that
the tree was good for food, and that it was pleasant to the eyes, and a
tree to be desired to make one wise, she took of the fruit thereof, and
did eat, and gave also unto her husband with her; and he did eat.
And the eyes of them both were opened, and they knew that they were naked;
and they sewed fig leaves together, and made themselves aprons.

And they heard the voice of the LORD God walking in the garden in the cool
of the day: and Adam and his wife hid themselves from the presence of the
LORD God amongst the trees of the garden.  And the LORD God called unto
Adam, and said unto him, Where art thou?  And he said, I heard thy voice in
the garden, and I was afraid, because I was naked; and I hid myself.  And
he said, Who told thee that thou wast naked? Hast thou eaten of the tree,
whereof I commanded thee that thou shouldest not eat?  And the man said,
The woman whom thou gavest to be with me, she gave me of the tree, and I
did eat.  And the LORD God said unto the woman, What is this that thou hast
done?  And the woman said, The serpent beguiled me, and I did eat.

And the LORD God said unto the serpent, Because thou hast done this, thou
art cursed above all cattle, and above every beast of the field; upon thy
belly shalt thou go, and dust shalt thou eat all the days of thy life: And
I will put enmity between thee and the woman, and between thy seed and her
seed; it shall bruise thy head, and thou shalt bruise his heel.

Unto the woman he said, I will greatly multiply thy sorrow and thy
conception; in sorrow thou shalt bring forth children; and thy desire shall
be to thy husband, and he shall rule over thee.  And unto Adam he said,
Because thou hast hearkened unto the voice of thy wife, and hast eaten of
the tree, of which I commanded thee, saying, Thou shalt not eat of it:
cursed is the ground for thy sake; in sorrow shalt thou eat of it all the
days of thy life; Thorns also and thistles shall it bring forth to thee;
and thou shalt eat the herb of the field; In the sweat of thy face shalt
thou eat bread, till thou return unto the ground; for out of it wast thou
taken: for dust thou art, and unto dust shalt thou return.

And Adam called his wife's name Eve; because she was the mother of all
living.

Unto Adam also and to his wife did the LORD God make coats of skins, and
clothed them.  And the LORD God said, Behold, the man is become as one of
us, to know good and evil: and now, lest he put forth his hand, and take
also of the tree of life, and eat, and live for ever: Therefore the LORD
God sent him forth from the garden of Eden, to till the ground from whence
he was taken.  So he drove out the man; and he placed at the east of the
garden of Eden Cherubims, and a flaming sword which turned every way, to
keep the way of the tree of life.

And Adam knew Eve his wife; and she conceived, and bare Cain, and said, I
have gotten a man from the LORD.  And she again bare his brother Abel.  And
Abel was a keeper of sheep, but Cain was a tiller of the ground.

And in process of time it came to pass, that Cain brought of the fruit of
the ground an offering unto the LORD.  And Abel, he also brought of the
firstlings of his flock and of the fat thereof.  And the LORD had respect
unto Abel and to his offering: But unto Cain and to his offering he had not
respect. And Cain was very wroth, and his countenance fell.  And the LORD
said unto Cain, Why art thou wroth? and why is thy countenance fallen?
If thou doest well, shalt thou not be accepted? and if thou doest not well,
sin lieth at the door. And unto thee shall be his desire, and thou shalt
rule over him.

And Cain talked with Abel his brother: and it came to pass, when they were
in the field, that Cain rose up against Abel his brother, and slew him.

And the LORD said unto Cain, Where is Abel thy brother?  And he said, I
know not: Am I my brother's keeper?  And he said, What hast thou done? the
voice of thy brother's blood crieth unto me from the ground.  And now art
thou cursed from the earth, which hath opened her mouth to receive thy
brother's blood from thy hand; When thou tillest the ground, it shall not
henceforth yield unto thee her strength; a fugitive and a vagabond shalt
thou be in the earth.

And Cain said unto the LORD, My punishment is greater than I can bear.
Behold, thou hast driven me out this day from the face of the earth; and
from thy face shall I be hid; and I shall be a fugitive and a vagabond in
the earth; and it shall come to pass, that every one that findeth me shall
slay me.

And the LORD said unto him, Therefore whosoever slayeth Cain, vengeance
shall be taken on him sevenfold.  And the LORD set a mark upon Cain, lest
any finding him should kill him.  And Cain went out from the presence of
the LORD, and dwelt in the land of Nod, on the east of Eden.

And Cain knew his wife; and she conceived, and bare Enoch: and he builded
a city, and called the name of the city, after the name of his son, Enoch.
And unto Enoch was born Irad: and Irad begat Mehujael: and Mehujael begat
Methusael: and Methusael begat Lamech.  And Lamech took unto him two wives:
the name of the one was Adah, and the name of the other Zillah.

And Adah bare Jabal: he was the father of such as dwell in tents, and
of such as have cattle.  And his brother's name was Jubal: he was the
father of all such as handle the harp and organ.  And Zillah, she also bare
Tubalcain, an instructer of every artificer in brass and iron: and the
sister of Tubalcain was Naamah.  And Lamech said unto his wives, Adah and
Zillah, Hear my voice; ye wives of Lamech, hearken unto my speech: for I
have slain a man to my wounding, and a young man to my hurt.  If Cain shall
be avenged sevenfold, truly Lamech seventy and sevenfold.

And Adam knew his wife again; and she bare a son, and called his name Seth:
For God, said she, hath appointed me another seed instead of Abel, whom
Cain slew.  And to Seth, to him also there was born a son; and he called
his name Enos: then began men to call upon the name of the LORD.

This is the book of the generations of Adam. In the day that God created
man, in the likeness of God made he him; Male and female created he them;
and blessed them, and called their name Adam, in the day when they were
created.

And Adam lived an hundred and thirty years, and begat a son in his own
likeness, and after his image; and called his name Seth: And the
days of Adam after he had begotten Seth were eight hundred years: and he
begat sons and daughters: And all the days that Adam lived were nine
hundred and thirty years: and he died.

And Seth lived an hundred and five years, and begat Enos: And Seth lived
after he begat Enos eight hundred and seven years, and begat sons and
daughters: And all the days of Seth were nine hundred and twelve years:
and he died.

And Enos lived ninety years, and begat Cainan: And Enos lived after he
begat Cainan eight hundred and fifteen years, and begat sons and daughters:
And all the days of Enos were nine hundred and five years: and he died.

And Cainan lived seventy years and begat Mahalaleel: And Cainan lived after
he begat Mahalaleel eight hundred and forty years, and begat sons and
daughters: And all the days of Cainan were nine hundred and ten years: and
he died.

And Mahalaleel lived sixty and five years, and begat Jared: And Mahalaleel
lived after he begat Jared eight hundred and thirty years, and begat sons
and daughters: And all the days of Mahalaleel were eight hundred ninety
and five years: and he died.

And Jared lived an hundred sixty and two years, and he begat Enoch: And
Jared lived after he begat Enoch eight hundred years, and begat sons and
daughters: And all the days of Jared were nine hundred sixty and two years:
and he died.

And Enoch lived sixty and five years, and begat Methuselah: And Enoch
walked with God after he begat Methuselah three hundred years, and begat
sons and daughters: And all the days of Enoch were three hundred sixty and
five years: And Enoch walked with God: and he was not; for God took him.

And Methuselah lived an hundred eighty and seven years, and begat Lamech.
And Methuselah lived after he begat Lamech seven hundred eighty and two
years, and begat sons and daughters: And all the days of Methuselah were
nine hundred sixty and nine years: and he died.

And Lamech lived an hundred eighty and two years, and begat a son: And he
called his name Noah, saying, This same shall comfort us concerning our
work and toil of our hands, because of the ground which the LORD hath
cursed.  And Lamech lived after he begat Noah five hundred ninety and
five years, and begat sons and daughters: And all the days of Lamech were
seven hundred seventy and seven years: and he died.

And Noah was five hundred years old: and Noah begat Shem, Ham, and Japheth.'''

We can leverage our tokenizer to count the number of tokens for us!

In [17]:
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

In [18]:
len(enc.encode(context))

4266

The full set comes in at a whopping *636,144* tokens.

So, we have too much context. What can we do?

Well, the first thing that might enter your mind is: "Use a model with a larger context window", and we could definitely do that! However, even `gpt-4-32k` wouldn't be able to fit that whole text in the context window at once.
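A quick back-of-the-envelope calculation shows how far off we are. The 32,768-token window size below is an assumption for a "32k" model; the 636,144 figure is the token count quoted above.

```python
import math

full_text_tokens = 636_144   # token count quoted above for the full text
window_tokens = 32_768       # assumed size of a 32k context window

# How many full windows would we need just to read the text once?
windows_needed = math.ceil(full_text_tokens / window_tokens)
# How many tokens are left over after filling a single window?
overflow = full_text_tokens - window_tokens

print(windows_needed)  # 20
print(overflow)        # 603376
```

Even before leaving room for the question and the answer, the text is roughly twenty windows long.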

So, we can try splitting our document up into little pieces - that way, we can avoid providing too much context.

We have another problem now.

If we split our document up into little pieces and we can't put all of them in the prompt, how do we decide which to include in the prompt?!
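One naive way to think about this decision is to score every piece against the query and keep the best ones. The sketch below uses plain word overlap as the score, which is an assumption chosen to keep the example dependency-free; it is a stand-in for illustration, not the method a production retriever would typically use. `words`, `top_k_chunks`, and `toy_chunks` are hypothetical names.

```python
import re


def words(text):
    # Lowercased alphanumeric words; a crude stand-in for a real text representation.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_k_chunks(query, chunks, k=1):
    """Rank chunks by how many distinct query words they share, highest first."""
    query_words = words(query)
    ranked = sorted(chunks, key=lambda chunk: len(query_words & words(chunk)), reverse=True)
    return ranked[:k]


toy_chunks = [
    "Fallbacks let a chain handle errors gracefully.",
    "Parallelism runs independent components at the same time.",
    "Streaming support exposes output as it is generated.",
]
print(top_k_chunks("How do fallbacks handle errors?", toy_chunks, k=1))
```

Only the top-scoring pieces would then be placed in the prompt, which keeps us inside the context window.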

> NOTE: Content splitting/chunking strategies are an active area of research and iterative development. There is no "one size fits all" approach to chunking/splitting at this moment. Use your best judgement to determine chunking strategies!

In order to conceptualize the following processes - let's create a toy context set!

### TextSplitting aka Chunking

We'll use the `RecursiveCharacterTextSplitter` to create our toy example.

It will split based on the following rules:

- Each chunk has a maximum size of 100 tokens
- It will try and split first on the `\n\n` character, then on the `\n`, then on the `<SPACE>` character, and finally it will split on individual characters.
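To build intuition for these rules before using the real splitter, here is a simplified pure-Python sketch of the same idea. It is not LangChain's implementation: `recursive_split` is a hypothetical helper, it measures size with `len` (characters) by default rather than tokens, and it ignores overlap. A different `length` function can be passed in to measure size another way.

```python
def recursive_split(text, max_len, separators=("\n\n", "\n", " ", ""), length=len):
    """Split text into pieces no longer than max_len, trying coarse separators first."""
    if length(text) <= max_len or not separators:
        return [text] if text else []
    sep, *rest = separators
    # The empty separator is the last resort: fall back to individual characters.
    parts = list(text) if sep == "" else text.split(sep)
    chunks, current = [], ""
    for part in parts:
        candidate = part if not current else current + sep + part
        if length(candidate) <= max_len:
            current = candidate  # keep merging small parts into the current chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if length(part) <= max_len:
            current = part
        else:
            # This part is still too big: retry it with the next, finer separator.
            chunks.extend(recursive_split(part, max_len, tuple(rest), length))
    if current:
        chunks.append(current)
    return chunks


print(recursive_split("aaa bbb\n\nccc ddd eee", 7))  # ['aaa bbb', 'ccc ddd', 'eee']
```

The first paragraph fits as-is, while the second is too long and gets split again on spaces, which is the "coarse first, finer only when needed" behaviour described in the rules above.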

Let's implement it and see the results!

In [19]:
from langchain.text_splitter import RecursiveCharacterTextSplitter

def tiktoken_len(text):
    tokens = tiktoken.encoding_for_model("gpt-3.5-turbo").encode(
        text,
    )
    return len(tokens)

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 100,
    chunk_overlap = 0,
    length_function = tiktoken_len,
)

In [20]:
chunks = text_splitter.split_text(CONTEXT)

In [21]:
len(chunks)

3

In [22]:
for chunk in chunks:
  print(chunk)
  print("----")

LangChain Expression Language or LCEL is a declarative way to easily compose chains together. There are several benefits to writing chains in this manner (as opposed to writing normal code):

Async, Batch, and Streaming Support Any chain constructed this way will automatically have full sync, async, batch, and streaming support. This makes it easy to prototype a chain in a Jupyter notebook using the sync interface, and then expose it as an async streaming interface.
----
Fallbacks The non-determinism of LLMs makes it important to be able to handle errors gracefully. With LCEL you can easily attach fallbacks to any chain.

Parallelism Since LLM applications involve (sometimes long) API calls, it often becomes important to run things in parallel. With LCEL syntax, any components that can be run in parallel automatically are.
----
Seamless LangSmith Tracing Integration As your chains get more and more complex, it becomes increasingly important to understand what exactly is happening at ev

As shown in our result, we've split the text into chunks of at most 100 tokens - with the breaks falling cleanly on `\n\n` separators!
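
To make the "coarse separators first, finer ones as a fallback" idea concrete, here is a minimal pure-Python sketch. It is a simplified illustration and not LangChain's actual implementation: the name `recursive_split` is made up for this example, and it measures length in characters (`len`) rather than tokens so that it runs without any extra packages.

```python
def recursive_split(text, max_len, separators=("\n\n", "\n", " ", ""), length_fn=len):
    """Toy recursive splitter (hypothetical helper). Assumes max_len >= 1."""
    if length_fn(text) <= max_len:
        return [text] if text else []

    sep, finer = separators[0], separators[1:]
    # The empty separator means "split into individual characters".
    pieces = list(text) if sep == "" else text.split(sep)

    result, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + sep + piece
        if length_fn(candidate) <= max_len:
            # Keep merging pieces while the merged chunk still fits.
            current = candidate
            continue
        if current:
            result.append(current)
            current = ""
        if length_fn(piece) <= max_len:
            current = piece
        else:
            # This piece is too big on its own: retry with finer separators.
            result.extend(recursive_split(piece, max_len, finer, length_fn))
    if current:
        result.append(current)
    return result


sample = "aaa bbb\n\nccc ddd eee\nfff"
toy_chunks = recursive_split(sample, max_len=7)
print(toy_chunks)  # ['aaa bbb', 'ccc ddd', 'eee', 'fff']
```

Passing `length_fn=tiktoken_len` instead of `len` would make the same sketch count tokens, which is the role `length_function` plays in the cell above.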

#### 🏗️ Activity #1:

While there's nothing specifically wrong with the chunking method used above - it is a naive approach that is not sensitive to specific data formats.

Brainstorm some ideas that would split large single documents into smaller documents.

1. By chapter or page
2. Limit of the context window of your LLM
3. Semantic chunking - i.e., grouping similar sentences together

## Embeddings and Dense Vector Search

Now that we have our individual chunks, we need a system to correctly select the relevant pieces of information to answer our query.

This sounds like a perfect job for embeddings!

If you come from an NLP background, embeddings are something you might be intimately familiar with - otherwise, you might find the topic a bit...dense. (this attempt at a joke will make more sense later)

In all seriousness, embeddings are a powerful piece of the NLP puzzle, so let's dive in!

> NOTE: While this notebook is language/NLP-centric, embeddings have uses beyond just text!

### Why Do We Even Need Embeddings?

In order to fully understand what Embeddings are, we first need to understand why we have them!

Machine Learning algorithms, ranging from the very big to the very small, all have one thing in common:

They need numeric inputs.

So we need a process by which to translate the domain we live in, dominated by images, audio, language, and more, into the domain of the machine: Numbers.

Another thing we want to be able to do is capture "semantic information" about words/phrases so that we can use algorithmic approaches to determine if words are closely related or not!

So, we need to come up with a process that does these two things well:

- Convert non-numeric data into numeric-data
- Capture potential semantic relationships between individual pieces of data

### How Do Embeddings Capture Semantic Relationships?

In a simplified sense, embeddings map a word or phrase into n-dimensional space with a dense continuous vector, where each dimension in the vector represents some "latent feature" of the data.

This is best represented in a classic example:

![image](https://i.imgur.com/K5eQtmH.png)

As can be seen in the extremely simplified example: The X_1 axis represents age, and the X_2 axis represents hair.

The relationship of "puppy -> dog" reflects the same relationship as "baby -> adult", but dogs are (typically) hairier than humans. However, adults typically have more hair than babies - so they are shifted slightly closer to dogs on the X_2 axis!

Now, this is a simplified and contrived example - but it is *essentially* the mechanism by which embeddings capture semantic information.

In reality, the dimensions don't literally represent hard concepts like "age" or "hair", but it's a useful way to think about how the semantic relationships are captured.

Alright, with some history behind us - let's examine how these might help us choose relevant context.

Let's begin with a simple example - simply looking at how close two embedding vectors are for a given pair of phrases.

When we use the term "close" in this notebook - we're referring to a similarity measure called "cosine similarity".

We discussed above that if two embeddings are close, they are semantically similar. Cosine similarity gives us a quick way to measure how similar two vectors are!

Cosine similarity ranges from 1 to -1, with 1 meaning the two vectors point in the same direction (very close in meaning) and -1 meaning they point in opposite directions (close to opposite in meaning).

Let's implement it with NumPy below.

In [23]:
import numpy as np
from numpy.linalg import norm

def cosine_similarity(vec_1, vec_2):
  return np.dot(vec_1, vec_2) / (norm(vec_1) * norm(vec_2))
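
As a quick sanity check of the 1 to -1 range, here is the same formula written out with only the standard library and applied to hand-picked 2-D vectors. The function name `cosine_similarity_py` and the toy vectors are assumptions for illustration; real embedding vectors have far more dimensions.

```python
import math


def cosine_similarity_py(vec_1, vec_2):
    # dot(vec_1, vec_2) / (|vec_1| * |vec_2|), same formula as the NumPy version
    dot = sum(a * b for a, b in zip(vec_1, vec_2))
    norm_1 = math.sqrt(sum(a * a for a in vec_1))
    norm_2 = math.sqrt(sum(b * b for b in vec_2))
    return dot / (norm_1 * norm_2)


same_direction = cosine_similarity_py([1.0, 0.0], [1.0, 0.0])
orthogonal = cosine_similarity_py([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity_py([1.0, 0.0], [-1.0, 0.0])
# Scaling a vector changes its length but not its direction.
scaled = cosine_similarity_py([1.0, 2.0], [2.0, 4.0])

print(same_direction, orthogonal, opposite)  # 1.0 0.0 -1.0
```

Note that `scaled` also comes out at (approximately) 1: cosine similarity only looks at direction, not at vector length.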

We're going to be using OpenAI's `text-embedding-3-small` today.

In order to choose our embeddings model, we'll refer to the MTEB leaderboard - which can be found [here](https://huggingface.co/spaces/mteb/leaderboard)!

The basic logic is: we sort by our desired task - in this case `Retrieval Average (15 Datasets)` - and pick a model that performs well on that task. To keep cost in mind, we'll go with `text-embedding-3-small` over `text-embedding-3-large`, since there's only a separation of ~5 points between the two on this task - but the `small` version of the model costs significantly less.

In [24]:
from langchain_openai.embeddings import OpenAIEmbeddings

embedding_model = OpenAIEmbeddings(model="text-embedding-3-small")

Let's grab some vectors and see how they're related!

In [25]:
puppy_vec = embedding_model.embed_query("puppy")
dog_vec = embedding_model.embed_query("dog")

Let's do a quick check to ensure they're all the correct dimension.

#### ❓ Question #2:

What is the embedding dimension, given that we're using `text-embedding-3-small`?

> HINT: Check out the [docs](https://platform.openai.com/docs/guides/embeddings) to help you answer this question.

#### ANSWER: 1536

Now, let's see how "puppy" and "dog" are related to each other!

In [26]:
cosine_similarity(puppy_vec, dog_vec)

0.5590390640733377

We can repeat the experiment for things we might expect to be unrelated, as well:



In [27]:
puppy_vec = embedding_model.embed_query("puppy")
ice_vec = embedding_model.embed_query("ice cube")

In [28]:
cosine_similarity(puppy_vec, ice_vec)

0.20365601127332958

As expected, we get a much lower similarity score!

Great!

Now, let's extend it to our example.

What we want to do is find the phrases most related to our query - so we need to find the dense continuous vector representation of each chunk in our corpus, and then compare those against the dense continuous vector representation of our query.

In simpler terms:

Compare the embedding of our query with the embeddings of each of our chunks!

### Finding the Embeddings for Our Chunks

First, let's find all our embeddings for each chunk and store them in a convenient format for later.

In [29]:
embeddings_dict = {}

for chunk in chunks:
  embeddings_dict[chunk] = embedding_model.embed_query(chunk)

In [30]:
for k,v in embeddings_dict.items():
  print(f"Chunk - {k}")
  print("---")
  print(f"Embedding - Vector of Size: {len(v)}")
  print("\n\n")

Chunk - LangChain Expression Language or LCEL is a declarative way to easily compose chains together. There are several benefits to writing chains in this manner (as opposed to writing normal code):

Async, Batch, and Streaming Support Any chain constructed this way will automatically have full sync, async, batch, and streaming support. This makes it easy to prototype a chain in a Jupyter notebook using the sync interface, and then expose it as an async streaming interface.
---
Embedding - Vector of Size: 1536



Chunk - Fallbacks The non-determinism of LLMs makes it important to be able to handle errors gracefully. With LCEL you can easily attach fallbacks to any chain.

Parallelism Since LLM applications involve (sometimes long) API calls, it often becomes important to run things in parallel. With LCEL syntax, any components that can be run in parallel automatically are.
---
Embedding - Vector of Size: 1536



Chunk - Seamless LangSmith Tracing Integration As your chains get more and

Okay, great. Let's create a query - and then embed it!

In [31]:
query = "Can LCEL help take code from the notebook to production?"

query_vector = embedding_model.embed_query(query)
print(f"Vector of Size: {len(query_vector)}")

Vector of Size: 1536


Now, let's compare it against each existing chunk's embedding by using cosine similarity.

In [32]:
max_similarity = -float('inf')
closest_chunk = ""

for chunk, chunk_vector in embeddings_dict.items():
  cosine_similarity_score = cosine_similarity(chunk_vector, query_vector)

  if cosine_similarity_score > max_similarity:
    closest_chunk = chunk
    max_similarity = cosine_similarity_score

print(closest_chunk)
print(max_similarity)

LangChain Expression Language or LCEL is a declarative way to easily compose chains together. There are several benefits to writing chains in this manner (as opposed to writing normal code):

Async, Batch, and Streaming Support Any chain constructed this way will automatically have full sync, async, batch, and streaming support. This makes it easy to prototype a chain in a Jupyter notebook using the sync interface, and then expose it as an async streaming interface.
0.5372562043448947


And we get the expected result, which is the passage that specifically mentions prototyping in a Jupyter Notebook!

### Creating a Retriever

Now that we have an idea of how we're getting our most relevant information - let's see how we could create a pipeline that would automatically extract the closest chunk to our query and use it as context for our prompt!

First, we'll wrap the above in a helper function!

In [33]:
def retrieve_context(query, embeddings_dict, embedding_model):
  query_vector = embedding_model.embed_query(query)
  max_similarity = -float('inf')
  closest_chunk = ""

  for chunk, chunk_vector in embeddings_dict.items():
    cosine_similarity_score = cosine_similarity(chunk_vector, query_vector)

    if cosine_similarity_score > max_similarity:
      closest_chunk = chunk
      max_similarity = cosine_similarity_score

  return closest_chunk
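
If you'd like to exercise this helper without calling the OpenAI API, one option is to swap in a stand-in embedding model. The sketch below is only an illustration: `ToyEmbeddingModel` is a hypothetical class that "embeds" text by counting three hand-picked keywords, the example chunks are made up, and the cosine similarity is rewritten in plain Python so the block is self-contained.

```python
import math


def cosine_similarity(vec_1, vec_2):
    dot = sum(a * b for a, b in zip(vec_1, vec_2))
    return dot / (math.sqrt(sum(a * a for a in vec_1)) * math.sqrt(sum(b * b for b in vec_2)))


class ToyEmbeddingModel:
    """Hypothetical stand-in for OpenAIEmbeddings: one dimension per keyword."""

    vocab = ["async", "fallback", "tracing"]

    def embed_query(self, text):
        lowered = text.lower()
        # The small offset keeps the vector non-zero when no keyword matches.
        return [lowered.count(word) + 0.01 for word in self.vocab]


def retrieve_context(query, embeddings_dict, embedding_model):
    # Same logic as the helper above: keep the chunk with the highest score.
    query_vector = embedding_model.embed_query(query)
    max_similarity = -float("inf")
    closest_chunk = ""
    for chunk, chunk_vector in embeddings_dict.items():
        score = cosine_similarity(chunk_vector, query_vector)
        if score > max_similarity:
            closest_chunk = chunk
            max_similarity = score
    return closest_chunk


toy_model = ToyEmbeddingModel()
toy_chunks = [
    "Async, batch, and streaming support.",
    "Fallbacks handle errors gracefully.",
    "LangSmith tracing integration.",
]
toy_embeddings = {chunk: toy_model.embed_query(chunk) for chunk in toy_chunks}

print(retrieve_context("How do I attach a fallback?", toy_embeddings, toy_model))
# Fallbacks handle errors gracefully.
```

Because `retrieve_context` only relies on an `embed_query` method, any object that provides one can be dropped in for testing.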

Now, let's add it to our pipeline!

In [34]:
def simple_rag(query, embeddings_dict, embedding_model, chat_chain):
  context = retrieve_context(query, embeddings_dict, embedding_model)

  response = chat_chain.invoke({"query" : query, "context" : context})

  return_package = {
      "query" : query,
      "response" : response,
      "retriever_context" : context
  }

  return return_package

In [37]:
simple_rag("What does LCEL do that makes it more reliable at scale?  Explain your reasoning.  Your audience is technical.", embeddings_dict, embedding_model, chat_chain)

{'query': 'What does LCEL do that makes it more reliable at scale?  Explain your reasoning.  Your audience is technical.',
 'response': AIMessage(content='LCEL provides full sync, async, batch, and streaming support for any chain constructed using this language. This means that chains built with LCEL are inherently designed to handle various types of data processing scenarios, including those that require scalability. By automatically incorporating these features, LCEL reduces the likelihood of errors and performance issues when scaling up the chain to handle larger datasets or more complex operations. This reliability at scale is crucial for ensuring the smooth and efficient functioning of data processing pipelines in production environments.', response_metadata={'token_usage': {'completion_tokens': 98, 'prompt_tokens': 164, 'total_tokens': 262}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_c2295e73ad', 'finish_reason': 'stop', 'logprobs': None}, id='run-2cc4e459-d8d9-4cc7

#### ❓ Question #3:

What does LCEL do that makes it more reliable at scale?

> HINT: Use your newly created `simple_rag` to help you answer this question!

#### ANSWER:
LCEL provides full sync, async, batch, and streaming support for any chain constructed using this language. This means that chains built with LCEL are inherently designed to handle various types of data processing scenarios, including those that require scalability. By automatically incorporating these features, LCEL reduces the likelihood of errors and performance issues when scaling up the chain to handle larger datasets or more complex operations. This reliability at scale is crucial for ensuring the smooth and efficient functioning of data processing pipelines in production environments.

# 🤝 Breakout Room #2

## Task 1: Create a Simple RAG Application Using Qdrant, OpenAI, and LCEL

Now that we have a grasp on how LCEL works, and how we can use LangChain and OpenAI to interact with our data - let's step it up a notch and incorporate Qdrant!

## LangChain Powered RAG

First and foremost, LangChain provides a convenient way to store our chunks and their embeddings.

It's called a `VectorStore`!

We'll be using Qdrant as our `VectorStore` today. You can read more about it [here](https://qdrant.tech/documentation/).

Think of a `VectorStore` as a smart way to house your chunks and their associated embedding vectors. The implementation of the `VectorStore` also allows for smarter and more efficient search of our embedding vectors - as the method we used above would not scale well as we got into the millions of chunks.
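
To see why the earlier approach struggles at scale, here is a rough sketch of brute-force top-k search: every query has to be scored against every stored vector, so the work grows linearly with the number of chunks. The function name `top_k_chunks` and the tiny 2-D "store" are assumptions made for this illustration, not part of any library.

```python
import heapq
import math


def cosine_similarity(vec_1, vec_2):
    dot = sum(a * b for a, b in zip(vec_1, vec_2))
    return dot / (math.sqrt(sum(a * a for a in vec_1)) * math.sqrt(sum(b * b for b in vec_2)))


def top_k_chunks(query_vector, store, k=2):
    """Score the query against EVERY stored vector and keep the k best."""
    scored = (
        (cosine_similarity(vector, query_vector), chunk)
        for chunk, vector in store.items()
    )
    # Returns (score, chunk) pairs, best first.
    return heapq.nlargest(k, scored)


toy_store = {
    "chunk a": [1.0, 0.0],
    "chunk b": [0.8, 0.6],
    "chunk c": [0.0, 1.0],
}

best = top_k_chunks([1.0, 0.0], toy_store, k=2)
print([chunk for _, chunk in best])  # ['chunk a', 'chunk b']
```

With three chunks this is instant; with millions of chunks the full scan happens on every query, which is the kind of cost a dedicated vector store is designed to reduce.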

Otherwise, the process remains relatively similar under the hood!

Let's use [The Ultimate Hitchhiker's Guide](https://jaydixit.com/files/PDFs/TheultimateHitchhikersGuide.pdf) as our data today!

### Data Collection

We'll be leveraging the `PyMuPDFLoader` to load our PDF directly from the web!

In [38]:
from langchain_community.document_loaders import PyMuPDFLoader

docs = PyMuPDFLoader("https://www.deyeshigh.co.uk/downloads/literacy/world_book_day/the_hitchhiker_s_guide_to_the_galaxy.pdf").load()

### Chunking Our Documents

Let's do the same process as we did before with our `RecursiveCharacterTextSplitter` - but this time we'll use ~200 tokens as our max chunk size!

In [39]:
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 200,
    chunk_overlap = 0,
    length_function = tiktoken_len,
)

split_chunks = text_splitter.split_documents(docs)

In [40]:
len(split_chunks)

517

Alright, now we have 517 ~200 token long documents.

Let's verify the process worked as intended by checking our max document length.

In [41]:
max_chunk_length = 0

for chunk in split_chunks:
  max_chunk_length = max(max_chunk_length, tiktoken_len(chunk.page_content))

print(max_chunk_length)

189


Perfect! Now we can carry on to creating and storing our embeddings.

### Embeddings and Vector Storage

We'll use the `text-embedding-3-small` embedding model again - and `Qdrant` to store all our embedding vectors for easy retrieval later!

In [42]:
from langchain_community.vectorstores import Qdrant

qdrant_vectorstore = Qdrant.from_documents(
    split_chunks,
    embedding_model,
    location=":memory:",
    collection_name="Hitchiker's Guide",
)

Now let's set up our retriever, just as we saw before, but this time using LangChain's simple `as_retriever()` method!

In [43]:
qdrant_retriever = qdrant_vectorstore.as_retriever()

#### Back to the Flow

We're ready to move to the next step!

### Setting up our RAG

We'll use the LCEL we touched on earlier to create a RAG chain.

Let's think through each part:

1. First we need to retrieve context
2. We need to pipe that context to our model
3. We need to parse that output

Let's start by setting up our prompt again, just so it's fresh in our minds!

#### 🏗️ Activity #2:

Complete the prompt so that your RAG application answers queries based on the context provided, but *does not* answer queries if the context is unrelated to the query.

In [44]:
RAG_PROMPT = """
CONTEXT:
{context}

QUERY:
{question}

Your primary mission is to provide precise and contextually relevant answers to queries posed by users. For each query, you must diligently analyze the provided context to determine if it holds the necessary information pertinent to the query. Should the query align with the context, you are to retrieve the appropriate knowledge and generate a concise, accurate response. However, if the query does not pertain to the context given or if the context lacks sufficient information to formulate a reliable answer, you must gracefully decline to respond, indicating that the query falls outside the scope of the provided context. Your responses should uphold the principles of relevance and accuracy, ensuring each answer serves the user's need for specific and contextual information.  If you don't know the answer, respond I DO NOT KNOW THE ANSWER.
"""

rag_prompt = ChatPromptTemplate.from_template(RAG_PROMPT)

#### Our RAG Chain

Notice how we have a bit of a more complex chain this time - that's because we want to return our sources with the response.

Let's break down the chain step-by-step:

1. We invoke the chain with the `question` item. Notice how we only need to provide `question` since both the retriever and the `"question"` object depend on it.
   - We also chain our `"question"` into our `retriever`! This is what ultimately collects the context through Qdrant.
2. We assign our collected context to a `RunnablePassthrough()` from the previous object. This lets us simply pass it through to the next step, while still allowing us to run that section of the chain.
3. We finally collect our response by chaining our prompt, which expects both a `"question"` and `"context"`, into our `llm`. We also collect the `"context"` again so we can output it in the final response object.

The key thing to keep in mind here is that we need to pass our context through *after* we've retrieved it - to populate the object in a way that doesn't require us to call it or try and use it for something else.

In [46]:
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

retrieval_augmented_qa_chain = (
    # INVOKE CHAIN WITH: {"question" : "<<SOME USER QUESTION>>"}
    # "question" : populated by getting the value of the "question" key
    # "context"  : populated by getting the value of the "question" key and chaining it into the qdrant_retriever
    {"context": itemgetter("question") | qdrant_retriever, "question": itemgetter("question")}
    # "context"  : is assigned to a RunnablePassthrough object (will not be called or considered in the next step)
    #              by getting the value of the "context" key from the previous step
    | RunnablePassthrough.assign(context=itemgetter("context"))
    # "response" : the "context" and "question" values are used to format our prompt object and then piped
    #              into the LLM and stored in a key called "response"
    # "context"  : populated by getting the value of the "context" key from the previous step
    | {"response": rag_prompt | openai_chat_model, "context": itemgetter("context")}
)
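
To trace how the dictionary changes shape at each step, here is a plain-Python sketch of the same data flow. It does not use LangChain at all: `fake_retriever`, `fake_llm`, `format_prompt`, and `run_chain` are hypothetical stand-ins invented for this illustration, so treat it as a mental model rather than what LCEL executes internally.

```python
def fake_retriever(question):
    # Stand-in for qdrant_retriever: returns a list of "documents".
    return [f"doc about: {question}"]


def fake_llm(prompt_text):
    # Stand-in for openai_chat_model: echoes the prompt it received.
    return f"ANSWER based on -> {prompt_text}"


def format_prompt(inputs):
    # Stand-in for rag_prompt: fills in the two template variables.
    return f"CONTEXT:\n{inputs['context']}\n\nQUERY:\n{inputs['question']}"


def run_chain(inputs):
    # Step 1: the first dict builds both keys from the incoming "question".
    step_1 = {
        "context": fake_retriever(inputs["question"]),
        "question": inputs["question"],
    }
    # Step 2: assign keeps every existing key and (re)sets "context".
    # Here the new value equals the old one, so the dict is unchanged.
    step_2 = {**step_1, "context": step_1["context"]}
    # Step 3: the final dict produces the response and forwards the context.
    return {
        "response": fake_llm(format_prompt(step_2)),
        "context": step_2["context"],
    }


result = run_chain({"question": "Why are towels useful?"})
print(sorted(result.keys()))  # ['context', 'response']
```

The useful thing to notice is that the output keeps both keys, which is why we can read `response["response"]` and loop over `response["context"]` further down.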

Let's get a visual understanding of our chain!

In [47]:
!pip install -qU grandalf

In [48]:
print(retrieval_augmented_qa_chain.get_graph().draw_ascii())

                       +---------------------------------+                         
                       | Parallel<context,question>Input |                         
                       +---------------------------------+                         
                           *****                   ****                            
                        ***                            ****                        
                     ***                                   ****                    
+--------------------------------+                             **                  
| Lambda(itemgetter('question')) |                              *                  
+--------------------------------+                              *                  
                 *                                              *                  
                 *                                              *                  
                 *                                              *           

Let's try another visual representation:

![image](https://i.imgur.com/Ad31AhL.png)

Let's test our chain out!

In [49]:
response = retrieval_augmented_qa_chain.invoke({"question" : "What is the significance of towels in Douglas Adam's Hitchhicker's Guide?"})

In [50]:
response["response"].content

'The significance of towels in Douglas Adams\' Hitchhiker\'s Guide to the Galaxy is that a towel is described as "about the most massively useful thing" one can have. The book mentions that a towel has practical uses like being a distress signal, drying oneself off, and also holds immense psychological value. If a hitchhiker is seen with a towel, others may assume they are well-prepared and resourceful, leading to them being perceived as someone to be respected.'

In [51]:
for context in response["context"]:
  print("Context:")
  print(context)
  print("----")

Context:
page_content="28  /  D O U G L A S  A D A M S  \nthis device was in fact that most remarkable of all books ever to \ncome out of the great publishing corporations of Ursa Minor - \nThe Hitch Hiker's Guide to the Galaxy.  The reason why it was \npublished in the form of a micro sub meson electronic \ncomponent is that if it were printed in normal book form, an \ninterstellar hitch hiker would require several inconveniently \nlarge buildings to carry it around in. \nBeneath that in Ford Prefect's satchel were a few biros, a \nnotepad, and a largish bath towel from Marks and Spencer. \nThe Hitch Hiker's Guide to the Galaxy has a few things to say on \nthe subject of towels. \nA towel, it says, is about the most massively useful thing an" metadata={'source': 'https://www.deyeshigh.co.uk/downloads/literacy/world_book_day/the_hitchhiker_s_guide_to_the_galaxy.pdf', 'file_path': 'https://www.deyeshigh.co.uk/downloads/literacy/world_book_day/the_hitchhiker_s_guide_to_the_galaxy.pdf', '

Let's see if it can handle a query that is totally unrelated to the source documents.

In [52]:
response = retrieval_augmented_qa_chain.invoke({"question" : "What is the airspeed velocity of an unladen swallow?"})

In [50]:
response["response"].content

"I don't know the answer to that question based on the provided context."

In [58]:
response = retrieval_augmented_qa_chain.invoke({"question": "What is about to happen to Arthur Dent's house?"})
response["response"].content

"Arthur Dent's house is about to be knocked down by bulldozers unless he lies in front of them to prevent it from happening."

#### ❓ Question #4:

Where does Arthur Dent meet Marvin?

> HINT: Use your RAG Chain to answer this question.

#### ANSWER:
'Arthur Dent meets Marvin in a corridor while Marvin is complaining about the pain in his diodes.'
