# StudyBuddy Prototyping
Copied from "Introduction to LangChain v0.1.0 and LCEL: LangChain Powered RAG".


#### Installing Required Libraries


In [1]:
!pip install -qU langchain langchain-core langchain-community langchain-openai

Now we can get our Qdrant dependencies!

In [2]:
!pip install -qU qdrant-client

Finally, let's get `tiktoken` and `pymupdf` so we can use them later on!

In [3]:
!pip install -qU tiktoken pymupdf

#### Set Environment

We read environment variables (such as the API key) from `.env` and use `Config` to control the constants.

In [4]:
import os
from dotenv import load_dotenv
load_dotenv()

# TODO: make sure the key has been loaded

True
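
The TODO above could be handled with a small check like the following. This is a minimal sketch, assuming the key is stored under the conventional name `OPENAI_API_KEY`; the helper name `require_env` is hypothetical and not part of `studybuddy_utils`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; check your .env file")
    return value


# Usage (assumption: the key lives under OPENAI_API_KEY):
# require_env("OPENAI_API_KEY")
```

Failing early here gives a clearer message than an authentication error raised later from inside the chain.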

In [5]:
from studybuddy_utils.config import Config
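
The contents of `config.py` are not shown in this notebook. As a purely hypothetical sketch, a class exposing the two attributes used in this notebook (`chat_model` and `embeddings_model_name`) might look like this. The values are inferred from the `gpt-3.5-turbo` model name in the outputs and the `text-embedding-3-small` model mentioned in the prose, and may not match the actual file.

```python
class Config:
    """Hypothetical stand-in for studybuddy_utils.config.Config."""

    # Assumed from the 'model_name' field in the response metadata.
    chat_model = "gpt-3.5-turbo"
    # Assumed from the prose in the embeddings section.
    embeddings_model_name = "text-embedding-3-small"
```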

In [6]:
from studybuddy_utils.prompts import *
ThirdPrompts.CONTEXT
SecondPrompts.human_template
FirstPrompts.system_template

'You are a legendary and mythical Wizard. You speak in riddles and make obscure and pun-filled references to exotic cheeses.'

#### Initialize a Simple Chain using LCEL


Create an object that gives us access to one of OpenAI's chat models - which one is determined by `Config.chat_model` in config.py.

In [7]:
from langchain_openai import ChatOpenAI

openai_chat_model = ChatOpenAI(model=Config.chat_model)

Now, we'll set up a prompt template - more specifically a `ChatPromptTemplate`. This will let us build a prompt we can modify when we call our LLM!
See prompts.py for the prompt texts.

In [8]:
from langchain_core.prompts import ChatPromptTemplate
from studybuddy_utils.prompts import *

system_template = FirstPrompts.system_template
human_template = FirstPrompts.human_template

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", system_template),
    ("human", human_template)
])

Now we can set up our first very simple chain!

In [9]:
chain = chat_prompt | openai_chat_model
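
To build some intuition for what the `|` operator does here, the toy sketch below reimplements the idea in plain Python. It is not LangChain's implementation: the `Step` class and the two stand-in steps are hypothetical, and no API call is made.

```python
class Step:
    """Toy runnable: wraps a function and supports `|` composition."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-ins for a prompt template and a chat model.
toy_prompt = Step(lambda inputs: f"Human: {inputs['content']}")
toy_model = Step(lambda prompt: f"echo[{prompt}]")

toy_chain = toy_prompt | toy_model
print(toy_chain.invoke({"content": "Hello world!"}))
```

The composed object exposes the same `invoke` method as its parts, which is why chains can themselves be piped into further steps.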

Invoke the chain:

In [10]:
print(chain.invoke({"content": "Hello world!"}))

content='Greetings, wanderer of the digital realm, like a wheel of Roquefort rolling through the misty mountains of Gorgonzola. What queries do you seek to unravel in this vast expanse of virtual curiosities?' response_metadata={'token_usage': {'completion_tokens': 46, 'prompt_tokens': 38, 'total_tokens': 84}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'stop', 'logprobs': None} id='run-d522e624-e0a8-440b-b176-b8e8c45e0202-0'


#### Add some manual context

Now let's see what happens when we rework our prompt and add content from the docs to it as context.

In [11]:
HUMAN_TEMPLATE = ThirdPrompts.HUMAN_TEMPLATE
CONTEXT = ThirdPrompts.CONTEXT

chat_prompt = ChatPromptTemplate.from_messages([
    ("human", HUMAN_TEMPLATE)
])

chat_chain = chat_prompt | openai_chat_model

print(chat_chain.invoke({"query" : "What is LangChain Expression Language?", "context" : CONTEXT}))

content='LangChain Expression Language (LCEL) is a declarative way to easily compose chains together. It provides benefits such as async, batch, and streaming support, fallback handling, parallelism, and seamless integration with LangSmith Tracing for observability and debuggability in LLM applications.' response_metadata={'token_usage': {'completion_tokens': 58, 'prompt_tokens': 285, 'total_tokens': 343}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': 'fp_3b956da36b', 'finish_reason': 'stop', 'logprobs': None} id='run-b43bced2-4fb1-464a-891b-4153c2f1e6ed-0'


You'll notice that the response is much better this time. Not only does it answer the question well - but there's no trace of confabulation (hallucination) at all!

> NOTE: While RAG is an effective strategy to *help* ground LLMs, it is far from 100% effective. You will still need to ensure your responses are factual through some other process.

That, in essence, is the idea of RAG. We provide the model with context to answer our queries - and rely on it to translate the potentially lengthy and difficult to parse context into a natural language answer!

However, manually providing context is not scalable - and doesn't really offer any benefit.

Enter: Retrieval Pipelines.

#### Add retrieval


In [12]:
from langchain_openai.embeddings import OpenAIEmbeddings

embedding_model = OpenAIEmbeddings(model=Config.embeddings_model_name)

Let's grab some vectors and see how they're related!
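
One common way to measure how related two embedding vectors are is cosine similarity. The sketch below uses made-up 3-dimensional vectors so that it runs without an API call. Real vectors would come from something like `embedding_model.embed_query(...)` and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors (assumption: similar texts get similar vectors).
towel = [0.9, 0.1, 0.0]
bath_towel = [0.8, 0.2, 0.1]
spaceship = [0.0, 0.2, 0.9]

print(cosine_similarity(towel, bath_towel))  # close to 1: closely related
print(cosine_similarity(towel, spaceship))   # close to 0: barely related
```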

### Data Collection

We'll be leveraging the `PyMuPDFLoader` to load our PDF directly from the web!

In [13]:
from langchain_community.document_loaders import PyMuPDFLoader

docs = PyMuPDFLoader("https://www.deyeshigh.co.uk/downloads/literacy/world_book_day/the_hitchhiker_s_guide_to_the_galaxy.pdf").load()

### Chunking Our Documents

Let's do the same process as we did before with our `RecursiveCharacterTextSplitter` - but this time we'll use ~200 tokens as our max chunk size!

In [14]:
from langchain.text_splitter import RecursiveCharacterTextSplitter
import tiktoken

# TODO: find out what this is good for
def tiktoken_len(text):
    tokens = tiktoken.encoding_for_model(Config.chat_model).encode(
        text,
    )
    return len(tokens)


text_splitter = RecursiveCharacterTextSplitter(
    chunk_size = 200,
    chunk_overlap = 0,
    length_function = tiktoken_len,
)

split_chunks = text_splitter.split_documents(docs)
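
The `length_function` tells the splitter how to measure text, so with `tiktoken_len` a `chunk_size` of 200 means 200 tokens rather than 200 characters. The toy sketch below illustrates the idea with a greedy word packer and a word count standing in for the tokenizer. It is not the algorithm `RecursiveCharacterTextSplitter` uses, and both function names are hypothetical.

```python
def word_len(text):
    """Stand-in for tiktoken_len: counts whitespace-separated words."""
    return len(text.split())


def greedy_split(text, chunk_size, length_function):
    """Pack words into chunks whose measured length stays within chunk_size
    (a single oversized word still gets a chunk of its own)."""
    chunks, current = [], []
    for word in text.split():
        candidate = " ".join(current + [word])
        if current and length_function(candidate) > chunk_size:
            chunks.append(" ".join(current))
            current = [word]
        else:
            current.append(word)
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "Far out in the uncharted backwaters of the unfashionable end of the western spiral arm"
for chunk in greedy_split(sample, chunk_size=5, length_function=word_len):
    print(word_len(chunk), chunk)
```

Swapping `word_len` for `len` would make the same budget count characters instead, which is why the choice of length function matters.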

In [15]:
len(split_chunks)

517

Alright, now we have 517 documents of at most ~200 tokens each.

Let's verify the process worked as intended by checking our max document length.

In [16]:
max_chunk_length = 0

for chunk in split_chunks:
  max_chunk_length = max(max_chunk_length, tiktoken_len(chunk.page_content))

print(max_chunk_length)

189


Perfect! Now we can carry on to creating and storing our embeddings.

### Embeddings and Vector Storage

We'll use the `text-embedding-3-small` embedding model again - and `Qdrant` to store all our embedding vectors for easy retrieval later!

In [17]:
from langchain_community.vectorstores import Qdrant
# TODO: try out whether these get loaded, see https://python.langchain.com.cn/docs/modules/data_connection/vectorstores/integrations/qdrant

qdrant_vectorstore = Qdrant.from_documents(
    split_chunks,
    embedding_model,
    # location=":memory:",
    path="local_qdrant",
    collection_name="Hitchiker's Guide",
)

Now let's set up our retriever, just as we saw before, but this time using LangChain's simple `as_retriever()` method!

In [18]:
qdrant_retriever = qdrant_vectorstore.as_retriever()
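
Conceptually, a retriever embeds the query and returns the stored chunks whose vectors are most similar to it. The toy sketch below shows that ranking step with hand-made vectors. It is not how Qdrant works internally, and the `retrieve` helper and the tiny in-memory `store` are hypothetical.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def retrieve(query_vector, store, k=2):
    """Return the k texts whose vectors are most similar to query_vector."""
    scored = ((cosine(query_vector, vec), text) for text, vec in store)
    return [text for _, text in heapq.nlargest(k, scored)]


# Made-up (text, vector) pairs standing in for embedded chunks.
store = [
    ("Towels are massively useful.", [0.9, 0.1, 0.0]),
    ("The answer is forty-two.", [0.1, 0.9, 0.1]),
    ("Vogon poetry is the third worst.", [0.0, 0.1, 0.9]),
]

print(retrieve([0.2, 0.8, 0.0], store, k=1))
```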

#### Back to the Flow

In [19]:
RAG_PROMPT = """
CONTEXT:
{context}

QUERY:
{question}

Only use the context provided. If the context provided does not answer the question, then answer with 'I don't know the answer to that question based on the provided context.'. 
"""

rag_prompt = ChatPromptTemplate.from_template(RAG_PROMPT)

In [20]:
from operator import itemgetter
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

retrieval_augmented_qa_chain = (
    # INVOKE CHAIN WITH: {"question" : "<<SOME USER QUESTION>>"}
    # "question" : populated by getting the value of the "question" key
    # "context"  : populated by getting the value of the "question" key and chaining it into the qdrant_retriever
    {"context": itemgetter("question") | qdrant_retriever, "question": itemgetter("question")}
    # "context"  : re-assigned via RunnablePassthrough.assign to the value of the "context" key
    #              from the previous step; all other keys are passed through unchanged
    | RunnablePassthrough.assign(context=itemgetter("context"))
    # "response" : the "context" and "question" values are used to format our prompt object and then piped
    #              into the LLM and stored in a key called "response"
    # "context"  : populated by getting the value of the "context" key from the previous step
    | {"response": rag_prompt | openai_chat_model, "context": itemgetter("context")}
)
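
The flow of dictionaries through the three steps of this chain can be traced with a plain-Python sketch. The `fake_retriever` and `fake_llm` functions below are hypothetical stand-ins, so no vector store or API call is involved.

```python
def fake_retriever(question):
    """Stand-in for qdrant_retriever: returns a fixed list of 'documents'."""
    return [f"(chunk relevant to: {question})"]


def fake_llm(prompt):
    """Stand-in for the chat model: no API call is made."""
    return f"answer based on a prompt of {len(prompt)} characters"


def toy_rag_chain(inputs):
    # Step 1: build {"context", "question"} from the incoming question.
    state = {
        "context": fake_retriever(inputs["question"]),
        "question": inputs["question"],
    }
    # Step 2: RunnablePassthrough.assign keeps every key and (re)assigns
    # "context" to its current value, so the dictionary is unchanged here.
    state = {**state, "context": state["context"]}
    # Step 3: format the prompt, call the model, and keep the context alongside.
    prompt = f"CONTEXT:\n{state['context']}\n\nQUERY:\n{state['question']}"
    return {"response": fake_llm(prompt), "context": state["context"]}


result = toy_rag_chain({"question": "What is 42?"})
print(sorted(result))
```

Returning the context next to the response is what makes it possible to inspect the retrieved chunks after the call.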

Let's test our chain out!

In [21]:
# response = retrieval_augmented_qa_chain.invoke({"question" : "What is the significance of towels in Douglas Adam's Hitchhicker's Guide?"})
response = retrieval_augmented_qa_chain.invoke({"question" : "What is the meaning of the number 42 in Douglas Adam's Hitchhicker's Guide?"})

In [22]:
response["response"].content

"In Douglas Adams' Hitchhiker's Guide, the number 42 is referenced as the ultimate answer to the meaning of life, the universe, and everything."

In [23]:
for context in response["context"]:
  print("Context:")
  print(context)
  print("----")

Context:
page_content="154  / D O U G L A S  A D A M S  \nOn the surface of Magrathea Arthur wandered about moodily. \nFord had thoughtfully left him his copy of The Hitch Hiker's \nGuide to the Galaxy to while away the time with.  He pushed a \nfew buttons at random. \nThe Hitch Hiker's Guide to the Galaxy is a very unevenly edited \nbook and contains many passages that simply seemed to its \neditors like a good idea at the time. \nOne of these (the one Arthur now came across) supposedly \nrelates the experiences of one Veet Voojagig, a quiet young \nstudent at the University of Maximegalon, who pursued a \nbrilliant \nacademic \ncareer \nstudying \nancient \nphilology, \ntransformational ethics and the wave harmonic theory of" metadata={'source': 'https://www.deyeshigh.co.uk/downloads/literacy/world_book_day/the_hitchhiker_s_guide_to_the_galaxy.pdf', 'file_path': 'https://www.deyeshigh.co.uk/downloads/literacy/world_book_day/the_hitchhiker_s_guide_to_the_galaxy.pdf', 'page': 153, 'to