<a href="https://colab.research.google.com/github/shahnazumer/LCEL-DOCUMENT/blob/main/QDRANT_RAG_LCEL.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#### [Detailed LangChain question-answering guide](https://python.langchain.com/docs/use_cases/question_answering/)

# Retrieval Augmented Generation (RAG)

RAG is an AI framework for retrieving facts from an external knowledge base so that large language models (LLMs) are grounded in accurate, up-to-date information and users gain insight into the models' generative process.

In this example, we will build an AI chatbot using LangChain, OpenAI, and the Qdrant vector database. The goal is a chatbot that can draw on external knowledge through Retrieval Augmented Generation (RAG).

By the end of this example, we will have a working chatbot and RAG pipeline that can hold a conversation and answer questions from an integrated knowledge base.

# Prerequisites
Before we start building our chatbot, we need to install some Python libraries. Here's a brief overview of what each library does:

- **langchain**: A framework for building LLM applications. We'll use it to chain together the prompt, model, retriever, and output parser for our chatbot.
- **openai**: The official OpenAI Python client. We'll use it to interact with the OpenAI API and generate responses for our chatbot.
- **qdrant-client**: The official Qdrant Python client. We'll use it to connect to a Qdrant host and store our chatbot's knowledge base in a vector database.



You can install these libraries using pip like so:

In [None]:
!pip install langchain==0.0.345 openai qdrant-client langchain-community langchain-core tiktoken


# Connecting to Qdrant

First we connect our data to a vector store. Qdrant is a vector similarity search engine: it stores vector representations (embeddings) of text and efficiently retrieves the stored items most similar to a query vector.
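
To make "vector similarity search" concrete, here is a minimal pure-Python sketch of what a vector store does conceptually: rank stored vectors by cosine similarity to a query vector. The tiny 3-dimensional vectors and the `top_k` helper are made up for illustration; Qdrant works with real embeddings and an approximate nearest-neighbour index rather than this brute-force scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Return the k (text, score) pairs most similar to the query vector."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in store.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Hypothetical toy "embeddings"; real ones have hundreds or thousands of dimensions
store = {
    "Hari Seldon develops psychohistory": [0.9, 0.1, 0.0],
    "The Galactic Empire is collapsing": [0.1, 0.9, 0.1],
    "Recipe for apple pie": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]  # pretend embedding of "Who created psychohistory?"

for text, score in top_k(query, store):
    print(f"{score:.3f}  {text}")
```

The sentence about psychohistory ranks first because its vector points in nearly the same direction as the query vector.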


The code in this section is adapted from https://github.com/alejandro-ao/qdrant-cloud-app.git

In [2]:
# We rely heavily on the LangChain library to bring together the components needed for the chatbot

import qdrant_client
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI
from langchain.schema.output_parser import StrOutputParser
from langchain.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import Qdrant
from langchain.text_splitter import CharacterTextSplitter
from qdrant_client import models
from langchain.schema.runnable import RunnablePassthrough

In [3]:
import os

# Accessing the key raises a KeyError early if OPENAI_API_KEY is not set in the environment
os.environ["OPENAI_API_KEY"]


In [4]:
# Create a Qdrant client; a KeyError below means one of the variables is not set

os.environ["QDRANT_HOST"]
os.environ["QDRANT_API_KEY"]

client = qdrant_client.QdrantClient(
    os.getenv("QDRANT_HOST"),
    api_key=os.getenv("QDRANT_API_KEY")
)
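
Accessing `os.environ[...]` directly raises a bare `KeyError` when a variable is missing. A minimal sketch of a friendlier check is shown below; `missing_env_vars` is a hypothetical helper written for this example, not part of any library.

```python
import os

REQUIRED_VARS = ["OPENAI_API_KEY", "QDRANT_HOST", "QDRANT_API_KEY"]

def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]

missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Set these environment variables before continuing:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```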

In [5]:
# Read the source document into a single string

doc = "/content/foundation.txt"

with open(doc, "r") as f:
    data = f.read()

In [6]:
# Split the text into chunks
def get_chunks(text):
    """Return a list of overlapping text chunks."""
    text_splitter = CharacterTextSplitter(
        separator="\n",
        chunk_size=500,
        chunk_overlap=100,  # consecutive chunks share up to 100 characters so text at a boundary is not lost
        length_function=len,
    )

    chunks = text_splitter.split_text(text)
    return chunks
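
To see how `chunk_size` and `chunk_overlap` interact, the following simplified sketch slices a string with a fixed-width sliding window. It is not how `CharacterTextSplitter` works internally (that splitter first splits on the separator and then merges pieces up to the size limit); `sliding_chunks` is a hypothetical helper that only illustrates the overlap arithmetic.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Fixed-width chunks; each chunk starts (chunk_size - chunk_overlap) after the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks

sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in sliding_chunks(sample, chunk_size=10, chunk_overlap=4):
    print(chunk)
```

In this simplified model, the settings used in this notebook (`chunk_size=500`, `chunk_overlap=100`) would start each window 400 characters after the previous one.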

In [7]:
# get the chunks for the data
texts=get_chunks(data)

In [None]:
# Create a new collection

vectors_config = models.VectorParams(
    # The size must match the embedding model's output dimension;
    # the OpenAI embeddings used here are 1536-dimensional
    size=1536,
    distance=models.Distance.COSINE,
)

client.create_collection(
    collection_name="Isaac Asimov Foundation",
    vectors_config=vectors_config,
)

In [9]:
# Create a vector store object using LangChain

# If you switch to a different embedding model, the collection's vector size must change to match

os.environ["OPENAI_API_KEY"]

embeddings = OpenAIEmbeddings()

vector_store = Qdrant(
    client=client,
    collection_name="Isaac Asimov Foundation",
    embeddings=embeddings,
)

In [10]:
# add chunks to vector store
vector_store.add_texts(texts)

['1472d37569264ff08b0224a1864ad416',
 '67bf3e7a87f041598442125ce6579101',
 'f1ee28a8d21b41c19edd0b47bfc4f4bb',
 '41c7d39634ea455493142f7fe4662a78',
 'bcb79a4b65eb4a6199c27c106ed219a1']

## Advanced Use Case: Retrieval Augmented Generation (RAG)

We've built a fully-fledged knowledge base. Now it's time to connect that knowledge base to our chatbot.

The code below is adapted from https://python.langchain.com/docs/expression_language/cookbook/retrieval

In [11]:
# Create a retriever, a prompt, and a model; the next cells pass a question through them


retriever = vector_store.as_retriever()

template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)

model = ChatOpenAI()

In [12]:
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)
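
The `|` operator is what makes this an LCEL chain: each component's output becomes the next component's input, and a dict of components sends the same input to every value. The toy `Step` class below is a hypothetical stand-in written to illustrate that idea; it is not LangChain's `Runnable` implementation, and the retriever, prompt, model, and parser are replaced by plain functions.

```python
class Step:
    """Toy composable unit: wraps a function and supports `|` chaining."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # step | other  ->  run self, then feed the result to other
        other = coerce(other)
        return Step(lambda value: other.invoke(self.invoke(value)))

    def __ror__(self, other):
        # {"key": step} | step  ->  Python falls back to this when dict cannot handle `|`
        return coerce(other) | self


def coerce(obj):
    """Turn a dict of steps/callables into one Step that runs each on the same input."""
    if isinstance(obj, Step):
        return obj
    if isinstance(obj, dict):
        branches = {key: coerce(val) for key, val in obj.items()}
        return Step(lambda value: {k: s.invoke(value) for k, s in branches.items()})
    return Step(obj)  # plain callable


# Stand-ins for the retriever, passthrough, prompt, model, and output parser
fake_retriever = Step(lambda q: ["Hari Seldon developed psychohistory."])
passthrough = Step(lambda q: q)
fake_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
fake_model = Step(lambda prompt_text: {"content": f"(model saw {len(prompt_text)} chars)"})
parse_str = Step(lambda message: message["content"])

toy_chain = (
    {"context": fake_retriever, "question": passthrough}
    | fake_prompt
    | fake_model
    | parse_str
)
print(toy_chain.invoke("Who created psychohistory?"))
```

The dict at the head of the chain is why the question reaches both the retriever and the prompt: every value in the dict receives the same input string.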

In [13]:
chain.invoke("Who is the creator of psychohistory in the Foundation series?")

'The creator of psychohistory in the Foundation series is Hari Seldon.'

In [16]:
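# itemgetter is used below, so import it here to let the cells run top to bottom
from operator import itemgetter

template = """Answer the question based only on the following context: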
{context}

Question: {question}

"""
prompt = ChatPromptTemplate.from_template(template)

chain = (
    {
        "context": itemgetter("question") | retriever,
        "question": itemgetter("question"),
    }
    | prompt
    | model
    | StrOutputParser()
)

In [17]:
chain.invoke({"question": "What role do technology and science play in the Foundation series?"})

'Based on the context provided, the role of technology and science in the Foundation series is to explore the idea of predicting and shaping the future through scientific means. The series is centered around the concept of psychohistory, a mathematical discipline that combines history, sociology, and mathematics to predict the future of large populations. Additionally, the series is celebrated for its intellectual depth, intricate world-building, and exploration of the long-term consequences of human actions on a galactic scale.'

# LCEL RAG with memory

I'll be using the code from https://python.langchain.com/docs/expression_language/cookbook/retrieval#with-memory-and-returning-source-documents

In [15]:
from operator import itemgetter

from langchain.memory import ConversationBufferMemory
from langchain_core.runnables import RunnableLambda, RunnablePassthrough

In [18]:
from langchain.prompts.prompt import PromptTemplate

_template = """Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:"""
CONDENSE_QUESTION_PROMPT = PromptTemplate.from_template(_template)

In [19]:
template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
ANSWER_PROMPT = ChatPromptTemplate.from_template(template)

In [24]:
from langchain.schema import format_document

DEFAULT_DOCUMENT_PROMPT = PromptTemplate.from_template(template="{page_content}")


def _combine_documents(
    docs, document_prompt=DEFAULT_DOCUMENT_PROMPT, document_separator="\n\n"
):
    doc_strings = [format_document(doc, document_prompt) for doc in docs]
    return document_separator.join(doc_strings)

In [21]:
memory = ConversationBufferMemory(
    return_messages=True, output_key="answer", input_key="question"
)

In [22]:
# First we add a step to load memory
# This adds a "chat_history" key to the input object

loaded_memory = RunnablePassthrough.assign(
    chat_history=RunnableLambda(memory.load_memory_variables) | itemgetter("history"),
)
# Now we calculate the standalone question
standalone_question = {
    "standalone_question": {
        "question": lambda x: x["question"],
        "chat_history": lambda x: x["chat_history"],
    }
    | CONDENSE_QUESTION_PROMPT
    | model
    | StrOutputParser(),
}

# Now we retrieve the documents
retrieved_documents = {
    "docs": itemgetter("standalone_question") | retriever,
    "question": lambda x: x["standalone_question"],
}
# Now we construct the inputs for the final prompt
final_inputs = {
    "context": lambda x: _combine_documents(x["docs"]),
    "question": itemgetter("question"),
}
# And finally, we do the part that returns the answers
answer = {
    "answer": final_inputs | ANSWER_PROMPT | ChatOpenAI(),
    "docs": itemgetter("docs"),
}
# And now we put it all together!
final_chain = loaded_memory | standalone_question | retrieved_documents | answer
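
Because each stage passes a dict to the next, it helps to trace which keys exist at each point. The sketch below replays the same four stages with plain functions and stubbed-out model and retriever calls (`fake_condense`, `fake_retrieve`, `fake_answer`, and `run_pipeline` are hypothetical placeholders), so it shows only the shape of the data, not real model behaviour.

```python
def fake_condense(question, chat_history):
    # Stand-in for CONDENSE_QUESTION_PROMPT | model | StrOutputParser()
    if not chat_history:
        return question
    return f"{question} (given {len(chat_history)} earlier messages)"

def fake_retrieve(query):
    # Stand-in for the retriever; returns plain strings instead of Document objects
    return ["Seldon predicts a dark age lasting 30,000 years."]

def fake_answer(context, question):
    # Stand-in for ANSWER_PROMPT | ChatOpenAI()
    return f"Answer to {question!r} using {len(context)} characters of context"

def run_pipeline(inputs, history):
    # 1. loaded_memory: add chat_history next to the question
    state = {**inputs, "chat_history": history}
    # 2. standalone_question: collapse question + history into a single key
    state = {"standalone_question": fake_condense(state["question"], state["chat_history"])}
    # 3. retrieved_documents: fetch docs and carry the standalone question forward
    state = {
        "docs": fake_retrieve(state["standalone_question"]),
        "question": state["standalone_question"],
    }
    # 4. answer: build the context string, call the model, keep the docs
    context = "\n\n".join(state["docs"])
    return {"answer": fake_answer(context, state["question"]), "docs": state["docs"]}

toy_result = run_pipeline({"question": "What is the Seldon Plan?"}, history=[])
print(sorted(toy_result))  # → ['answer', 'docs']
```

Note that after stage 2 only the standalone question moves forward: the chat history itself never reaches the answer prompt, which sees only the retrieved context and the rephrased question.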

In [25]:
inputs = {"question": "What is the Seldon Plan and why is it significant??"}
result = final_chain.invoke(inputs)
result

{'answer': AIMessage(content='The significance of the Seldon Plan is that it aims to shorten a predicted dark age of 30,000 years to just 1,000 years by establishing the Foundation, a group of scientists and intellectuals. It is a plan devised by Seldon using psychohistory to guide the course of history and mitigate the chaos and collapse of the Galactic Empire.'),
 'docs': [Document(page_content='Seldon has developed psychohistory and foresees the imminent collapse of the Galactic Empire, which has governed \nthe galaxy for thousands of years. He predicts a dark age lasting 30,000 years but believes that by establishing \nthe Foundation, a group of scientists and intellectuals, \nhe can shorten this period of chaos to just 1,000 years.\nAs the Foundation evolves, it faces numerous challenges, including political turmoil, external threats, and the \nunpredictable nature of individuals.'),
  Document(page_content='Seldon has developed psychohistory and foresees the imminent collapse of 

In [26]:
# Note that the memory does not save automatically
# This will be improved in the future
# For now you need to save it yourself
memory.save_context(inputs, {"answer": result["answer"].content})
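
The manual save step is easier to reason about with a toy model of a buffer memory. `ToyBufferMemory` below is a hypothetical simplification that mimics the two calls used in this notebook; the real `ConversationBufferMemory` stores message objects rather than tuples.

```python
class ToyBufferMemory:
    """Keeps an append-only list of (role, text) pairs."""

    def __init__(self, input_key="question", output_key="answer"):
        self.input_key = input_key
        self.output_key = output_key
        self.messages = []

    def save_context(self, inputs, outputs):
        # One call stores one human turn and one AI turn
        self.messages.append(("human", inputs[self.input_key]))
        self.messages.append(("ai", outputs[self.output_key]))

    def load_memory_variables(self, _inputs):
        # Return a copy so callers cannot mutate the stored history
        return {"history": list(self.messages)}


toy_memory = ToyBufferMemory()
print(toy_memory.load_memory_variables({}))  # nothing is stored until save_context is called
toy_memory.save_context(
    {"question": "What is the Seldon Plan?"},
    {"answer": "A plan to shorten the dark age."},
)
print(len(toy_memory.load_memory_variables({})["history"]))
```

If `save_context` is skipped after a call, the next invocation of the chain sees the same history as before.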

In [27]:
memory.load_memory_variables({})

{'history': [HumanMessage(content='What is the Seldon Plan and why is it significant??'),
  AIMessage(content='The significance of the Seldon Plan is that it aims to shorten a predicted dark age of 30,000 years to just 1,000 years by establishing the Foundation, a group of scientists and intellectuals. It is a plan devised by Seldon using psychohistory to guide the course of history and mitigate the chaos and collapse of the Galactic Empire.')]}

In [29]:
inputs = {"question": "but what did i ask earlier?"}
result = final_chain.invoke(inputs)
result

{'answer': AIMessage(content='The question you asked earlier is not provided in the given context.'),
 'docs': [Document(page_content='Seldon has developed psychohistory and foresees the imminent collapse of the Galactic Empire, which has governed \nthe galaxy for thousands of years. He predicts a dark age lasting 30,000 years but believes that by establishing \nthe Foundation, a group of scientists and intellectuals, \nhe can shorten this period of chaos to just 1,000 years.\nAs the Foundation evolves, it faces numerous challenges, including political turmoil, external threats, and the \nunpredictable nature of individuals.'),
  Document(page_content='Seldon has developed psychohistory and foresees the imminent collapse of the Galactic Empire, which has governed \nthe galaxy for thousands of years. He predicts a dark age lasting 30,000 years but believes that by establishing \nthe Foundation, a group of scientists and intellectuals, \nhe can shorten this period of chaos to just 1,000 

In [34]:
# The history has to be written manually after each call, using the `memory` object defined above
memory.save_context(inputs, {"answer": result["answer"].content})