# Explore QA RAG Chains with LCEL

## Install OpenAI and LangChain dependencies

In [None]:
!pip install langchain==0.2.0
!pip install langchain-openai==0.1.7
!pip install langchain-community==0.2.0

## Install Chroma Vector DB and LangChain wrapper

In [None]:
!pip install langchain-chroma

## Enter OpenAI API Key

In [None]:
from getpass import getpass

OPENAI_KEY = getpass('Enter Open AI API Key: ')

## Setup Environment Variables

In [None]:
import os

os.environ['OPENAI_API_KEY'] = OPENAI_KEY

### OpenAI Embedding Models

LangChain gives us access to OpenAI embedding models, including the newer `text-embedding-3-small` model (smaller and highly efficient) and the `text-embedding-3-large` model (larger and more powerful).

In [None]:
from langchain_openai import OpenAIEmbeddings

# details here: https://openai.com/blog/new-embedding-models-and-api-updates
openai_embed_model = OpenAIEmbeddings(model='text-embedding-3-small')

## Vector Databases

One of the most common ways to store and search over unstructured data is to embed it and store the resulting embedding vectors. At query time, the query is embedded the same way and the stored vectors that are 'most similar' to the query embedding are retrieved. A vector database takes care of storing the embedded data and performing the vector search for you.
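The embed, store, and search loop described above can be sketched in plain Python. This is only an illustration: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `toy_index` is a plain list rather than a vector database.

```python
import math
from collections import Counter

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a bag-of-words count vector."""
    cleaned = text.lower().replace('.', '').replace('?', '')
    return Counter(cleaned.split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

toy_docs = [
    'New Delhi is the capital of India.',
    'Photosynthesis is how green plants make food.',
    'The Milky Way is one of billions of galaxies.',
]

# "Index" step: embed every document once and keep the vectors.
toy_index = [(doc, toy_embed(doc)) for doc in toy_docs]

def search(query, k=1):
    """Embed the query and return the k most similar (score, doc) pairs."""
    query_vec = toy_embed(query)
    scored = [(cosine_similarity(query_vec, vec), doc) for doc, vec in toy_index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

best_score, best_doc = search('what is the capital of India?')[0]
print(best_doc)
```

A real embedding model captures meaning rather than word overlap, and a vector database replaces the linear scan with an approximate nearest-neighbor index, but the overall flow is the same.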

### Chroma Vector DB

[Chroma](https://docs.trychroma.com/getting-started) is an AI-native open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0.

### Create a Vector DB and persist on disk

Here we create a Chroma vector DB from our documents and their embeddings. To save the collection to disk, pass the target directory via the `persist_directory` argument when creating it. The cell below does not set it, so the collection is kept in memory for this session.

In [None]:
docs = [
 'Quantum mechanics describes the behavior of very small particles.',
 'Photosynthesis is the process by which green plants make food using sunlight.',
 'Artificial Intelligence aims to create machines that can think and learn.',
 'The pyramids of Egypt are historical monuments that have stood for thousands of years.',
 'New Delhi is the capital of India and the seat of all three branches of the Government of India.',
 'Biology is the study of living organisms and their interactions with the environment.',
 'Music therapy can aid in the mental well-being of individuals.',
 'Mumbai is the financial capital and the most populous city of India. It is the financial, commercial, and entertainment capital of South Asia.',
 'The Milky Way is just one of billions of galaxies in the universe.',
 'Economic theories help understand the distribution of resources in society.',
 'Kolkata is the de facto cultural capital of India and a historically and culturally significant city. Calcutta served as the de facto capital of India until 1911.',
 'Yoga is an ancient practice that involves physical postures and meditation.'
]

In [None]:
from langchain_chroma import Chroma

# create vector DB of docs and embeddings - takes 1 min on Colab
chroma_db = Chroma.from_texts(texts=docs, collection_name='db_docs',
                              # need to set the distance function to cosine else it uses euclidean by default
                              # check https://docs.trychroma.com/guides#changing-the-distance-function
                              collection_metadata={"hnsw:space": "cosine"},
                              embedding=openai_embed_model)
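To see why the distance function matters, here is a small sketch comparing a squared Euclidean (L2) distance with cosine distance on hand-picked 2D vectors. The vectors are made up for illustration, and the squared form of L2 is an assumption about how the default metric is computed.

```python
import math

def squared_l2_distance(a, b):
    """Squared Euclidean distance: sensitive to vector magnitude."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity: depends only on the angle between vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

query_vec = [1.0, 0.0]
same_direction_long = [10.0, 0.0]  # same direction as the query, much larger magnitude
different_direction = [1.0, 1.0]   # 45 degrees away from the query

l2_same = squared_l2_distance(query_vec, same_direction_long)
l2_diff = squared_l2_distance(query_vec, different_direction)
cos_same = cosine_distance(query_vec, same_direction_long)
cos_diff = cosine_distance(query_vec, different_direction)

print('squared L2:', l2_same, l2_diff)
print('cosine    :', cos_same, round(cos_diff, 4))
```

Under squared L2 the vector pointing in the same direction looks far away because of its length, while cosine distance ranks it as the closest match. The two metrics also produce scores on different scales, which matters once a similarity threshold is applied.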

## Setup a Vector Database Retriever

Here we use the following retrieval strategy:

- Similarity with Threshold Retrieval


### Similarity with Threshold Retrieval

We use cosine similarity here and retrieve the top 3 documents most similar to the user's query. We also introduce a cutoff so that documents scoring below a certain similarity threshold are not returned.

In [None]:
similarity_threshold_retriever = chroma_db.as_retriever(search_type="similarity_score_threshold",
                                                        search_kwargs={"k": 3,
                                                                       "score_threshold": 0.3})

In [None]:
query = "what is the capital of India?"
top3_docs = similarity_threshold_retriever.invoke(query)
top3_docs

In [None]:
query = "what is the old capital of India?"
top3_docs = similarity_threshold_retriever.invoke(query)
top3_docs
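The combined effect of `k` and `score_threshold` can be sketched in plain Python. This is a simplified illustration and not LangChain's actual code path: `top_k_above_threshold` is a hypothetical helper and the scores are made up, assuming higher means more similar.

```python
def top_k_above_threshold(scored_docs, k=3, score_threshold=0.3):
    """Keep at most k documents whose score is at least score_threshold."""
    kept = [(doc, score) for doc, score in scored_docs if score >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]

# Made-up similarity scores for illustration only.
made_up_scores = [
    ('doc about yoga', 0.12),
    ('doc about Kolkata', 0.48),
    ('doc about New Delhi', 0.62),
    ('doc about galaxies', 0.05),
    ('doc about Mumbai', 0.41),
]

relevant = top_k_above_threshold(made_up_scores)
nothing = top_k_above_threshold(made_up_scores, score_threshold=0.7)

print(relevant)
print(nothing)
```

When no document clears the threshold the result is an empty list, which is what lets a downstream RAG prompt fall back to "I don't know" for out-of-scope questions.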

## Build a QA RAG Chain

To build a RAG chain we need a prompt template that instructs the LLM not to answer questions beyond the scope of the retrieved context documents. There are various such prompts available; we craft one ourselves below.

In [None]:
from langchain_core.prompts import ChatPromptTemplate

prompt = """You are an assistant for question-answering tasks.
            Use the following pieces of retrieved context to answer the question.
            If no context is present or if you don't know the answer, just say that you don't know.
            Do not make up the answer unless it is there in the provided context.
            Keep the answer concise and to the point with regard to the question.

            Question:
            {question}

            Context:
            {context}

            Answer:
         """

prompt_template = ChatPromptTemplate.from_template(prompt)
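To make the role of the `{question}` and `{context}` placeholders concrete, here is a minimal sketch that fills a shortened copy of the template with plain `str.format`. It only approximates what the prompt template does when the chain runs: `toy_prompt` and the sample context are made up for illustration.

```python
# Shortened, hypothetical copy of the prompt above, for illustration only.
toy_prompt = """Use the following pieces of retrieved context to answer the question.

Question:
{question}

Context:
{context}

Answer:
"""

# Made-up retrieval result standing in for the retriever's output.
sample_context = 'New Delhi is the capital of India.'

filled = toy_prompt.format(
    question='what is the capital of India?',
    context=sample_context,
)
print(filled)
```

The LLM only ever sees the filled-in text, so whatever the retriever returns (including nothing at all) is the entire basis for its answer.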

## Load Connection to LLM

Here we create a connection to ChatGPT to use later in our chains.

In [None]:
from langchain_openai import ChatOpenAI

chatgpt = ChatOpenAI(model_name='gpt-3.5-turbo', temperature=0)

## Legacy LangChain Syntax for QA RAG Chain

Here we show how to create a QA RAG chain with the legacy syntax. It still works well, but it is not recommended because LangChain has been migrating to LCEL chains.

In [None]:
from langchain.chains import RetrievalQA

In [None]:
qa_rag_chain = RetrievalQA.from_chain_type(llm=chatgpt,
                                           chain_type="stuff",
                                           retriever=similarity_threshold_retriever,
                                           chain_type_kwargs={"prompt": prompt_template})

In [None]:
query = "What is the capital of India?"
qa_rag_chain.invoke(query)

In [None]:
query = "Tell me about new delhi in detail"
qa_rag_chain.invoke(query)

In [None]:
query = "Tell me about the financial capital of India"
qa_rag_chain.invoke(query)

In [None]:
query = "Who was the winner of the champions league in 2020?"
qa_rag_chain.invoke(query)

## LCEL Syntax for QA RAG Chain - Recommended

Here we show how to create the same RAG chain using the LangChain Expression Language (LCEL), which is LangChain's recommended approach.

In [None]:
from langchain_core.runnables import RunnablePassthrough

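def format_docs(docs):
    # join the retrieved documents into a single context string
    return "\n\n".join(doc.page_content for doc in docs)

# the dict builds the prompt inputs: retrieved context plus the original question
qa_rag_chain = (
    {
        "context": similarity_threshold_retriever | format_docs,
        "question": RunnablePassthrough(),
    }
    | prompt_template
    | chatgpt
)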
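The `|` composition in the chain above can be mimicked with a toy implementation to show the data flow. This is a sketch, not LangChain's implementation: `Step`, `Parallel`, `fake_retriever`, and the other names are hypothetical stand-ins, and no LLM is called.

```python
class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # Piping feeds this step's output into the next step.
        return Step(lambda value: other.invoke(self.invoke(value)))


class Parallel(Step):
    """Runs every branch on the same input and collects the results in a dict."""

    def __init__(self, branches):
        super().__init__(
            lambda value: {name: step.invoke(value) for name, step in branches.items()}
        )


# Hypothetical components standing in for the retriever, format_docs and prompt.
fake_retriever = Step(lambda query: [
    'New Delhi is the capital of India.',
    'Mumbai is the financial capital of India.',
])
join_docs = Step(lambda docs: '\n\n'.join(docs))
passthrough = Step(lambda value: value)
fill_prompt = Step(lambda d: f"Question: {d['question']}\nContext: {d['context']}")

toy_chain = (
    Parallel({'context': fake_retriever | join_docs, 'question': passthrough})
    | fill_prompt
)

result = toy_chain.invoke('What is the capital of India?')
print(result)
```

The same input string reaches both branches of the dict: one branch turns it into retrieved context, the other passes it through unchanged as the question, and the combined dict is what the prompt step receives.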

In [None]:
query = "What is the capital of India?"
result = qa_rag_chain.invoke(query)
print(result.content)

In [None]:
query = "Tell me about the financial capital of India in detail"
result = qa_rag_chain.invoke(query)
print(result.content)

In [None]:
query = "What is the fastest animal?"
result = qa_rag_chain.invoke(query)
print(result.content)

In [None]:
query = "How do plants survive?"
result = qa_rag_chain.invoke(query)
print(result.content)