# Retrieval QA chain with Clarifai Vectorstore using different chain types

A retrieval QA chain is a question-answering system that uses a retrieval-based approach to find relevant information in a large collection of documents stored in your Clarifai vectorstore. It has two main components, a retriever and a language model, plus a chain type that controls how the two are combined. This example walks through how the generated response changes with the chain type.

By leveraging LangChain's retrievers and chains, we can build a simple Q/A system on top of our vectorstore that answers user queries from the relevant content chunks returned by the retriever.

*   The retriever is responsible for searching the document collection and retrieving the most relevant documents based on the input query.

*   The language model is then used to generate an answer based on the retrieved documents.

*   The [chain type](https://python.langchain.com/docs/modules/chains/document/) decides the sequence of operations performed on the retrieved chunks before the final answer is produced.
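
The three pieces above can be wired together roughly as in the following sketch. It uses neither LangChain nor Clarifai: `toy_retriever`, `echo_llm`, `concat_strategy`, and `answer` are hypothetical names, the keyword-overlap ranking only stands in for real vector search, and the fake LLM just reports how much text it received.

```python
import re
from typing import Callable, List

# Hypothetical stand-ins; none of these names come from LangChain or Clarifai.
Retriever = Callable[[str], List[str]]
LLM = Callable[[str], str]
CombineStrategy = Callable[[str, List[str], LLM], str]


def tokens(text: str) -> set:
    """Lowercase alphanumeric words, punctuation removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_retriever(corpus: List[str], k: int = 2) -> Retriever:
    """Build a retriever that ranks chunks by the number of words shared with the query."""
    def retrieve(query: str) -> List[str]:
        q = tokens(query)
        ranked = sorted(corpus, key=lambda c: len(q & tokens(c)), reverse=True)
        return ranked[:k]
    return retrieve


def echo_llm(prompt: str) -> str:
    """Fake LLM that only reports the size of the prompt it was given."""
    return f"answer built from {len(prompt)} prompt characters"


def concat_strategy(query: str, chunks: List[str], llm: LLM) -> str:
    """One possible chain type: place every chunk in a single prompt."""
    context = "\n".join(chunks)
    return llm(f"Context:\n{context}\n\nQuestion: {query}")


def answer(query: str, retriever: Retriever, llm: LLM, combine: CombineStrategy) -> str:
    """Retrieve chunks, then let the chosen strategy decide how the LLM sees them."""
    return combine(query, retriever(query), llm)


corpus = [
    "Class A covers cars with 3L aspirated engines.",
    "Tyres must be approved by the organiser.",
    "Class B covers cars with turbo engines.",
]
question = "Which engines do cars in each class use?"
retrieve = toy_retriever(corpus, k=2)
retrieved = retrieve(question)
result = answer(question, retrieve, echo_llm, concat_strategy)
print(retrieved)
print(result)
```

Swapping `concat_strategy` for another function with the same signature is the toy equivalent of changing the chain type while keeping the retriever and the model fixed.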

### Setup

In [None]:
!pip install langchain
!pip install clarifai

Initialize an LLM class from Clarifai models.

You can use several language models from the [Clarifai](https://clarifai.com/explore/models?filterData=%5B%7B%22field%22%3A%22use_cases%22%2C%22value%22%3A%5B%22llm%22%5D%7D%5D&page=1&perPage=24) platform. Sign up and get your [PAT](https://clarifai.com/settings/security) to access them.

There are two ways to call a model from the Clarifai platform: by its model URL, or by the model's user id, app id, and model id.

In [1]:
# use model URL
MODEL_URL="https://clarifai.com/mistralai/completion/models/mistral-7B-OpenOrca"

                     #or

# Use model parameters, change user_id, app_id and model_id to use different models from the clarifai platform.
USER_ID = "mistralai"
APP_ID = "completion"
MODEL_ID = "mistral-7B-OpenOrca"
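
Both forms point at the same model, because the path of the model URL carries the user id, app id, and model id. The sketch below shows that mapping with the standard library only. `split_model_url` is a hypothetical helper, not part of the Clarifai SDK, and it assumes the `/<user_id>/<app_id>/models/<model_id>` layout seen in the URL above.

```python
from urllib.parse import urlparse


def split_model_url(model_url: str) -> dict:
    """Split a model URL into its ids (hypothetical helper, assumed URL layout)."""
    parts = [p for p in urlparse(model_url).path.split("/") if p]
    # Assumed layout: <user_id>/<app_id>/models/<model_id>
    if len(parts) < 4 or parts[2] != "models":
        raise ValueError(f"unexpected model URL layout: {model_url}")
    user_id, app_id, _, model_id = parts[:4]
    return {"user_id": user_id, "app_id": app_id, "model_id": model_id}


ids = split_model_url("https://clarifai.com/mistralai/completion/models/mistral-7B-OpenOrca")
print(ids)
```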

For this example we are using the ***mistral-7B-OpenOrca*** model.

## **mistral-7B-OpenOrca**
This [mistral-7B-OpenOrca](https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca) release is a fine-tune of the Mistral-7B base model. It claims to achieve 98% of the eval performance of Llama2-70B.



Set your PAT as an environment variable.

In [10]:
import os
os.environ["CLARIFAI_PAT"]="YOUR_CLARIFAI_PAT"

Import the Clarifai LLM class from LangChain.

In [30]:
from langchain.llms import Clarifai
llm=Clarifai(model_url=MODEL_URL)

### Pre-Processing of documents

To ingest your document into the vectorstore, pre-process it with LangChain's `TextLoader` and `CharacterTextSplitter`, which load the file and split it into chunks that are then stored in the Clarifai vectorstore as documents.

In [12]:
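from langchain.text_splitter import CharacterTextSplitter
from langchain.document_loaders import TextLoader
# Note: this rebinds the name `Clarifai` from the LLM class imported earlier to the
# vectorstore class. `llm` was already created, so it is unaffected.
from langchain.vectorstores import Clarifai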

For context, we are ingesting the *Gulf historic Dubai GP revival - 2023 rule book* and want to chat with it to understand the rules and regulations.

In [None]:
loader = TextLoader("/content/rule_book.txt")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
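
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified fixed-window splitter. It is only a sketch: `CharacterTextSplitter` splits on a separator and then merges pieces up to `chunk_size`, so its real chunk boundaries will differ. `split_fixed` is a hypothetical helper.

```python
from typing import List


def split_fixed(text: str, chunk_size: int, chunk_overlap: int = 0) -> List[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


sample = "abcdefghij"
no_overlap = split_fixed(sample, chunk_size=4)
with_overlap = split_fixed(sample, chunk_size=4, chunk_overlap=2)
print(no_overlap)    # ['abcd', 'efgh', 'ij']
print(with_overlap)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

With `chunk_overlap=0`, as in the cell above, no text is repeated between chunks, so a sentence that straddles a boundary is split across two documents.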

### **Why Clarifai vectorstore?**
The Clarifai vectorstore combines embedding your text inputs and storing them in a vector database. This removes the need for a separate embedding model: you upload documents directly into the Clarifai vectorstore and run vector search Q/A over them.

In [None]:
clarifai_vector_db = Clarifai.from_documents(
    user_id="user_id",
    app_id= "app_id",
    documents=docs,
    number_of_docs=2,
)
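
Conceptually, a vector store embeds each chunk, embeds the query, and returns the closest chunks, with `number_of_docs` setting how many come back. The sketch below imitates that with word-count vectors and cosine similarity. It is an illustration only: the real embeddings are computed on the Clarifai side, and `embed`, `cosine`, and `top_k` are hypothetical names.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "Cars are grouped into classes by engine type.",
    "Drivers must wear approved helmets.",
    "Engine capacity decides the class of each car.",
]
hits = top_k("engine class", chunks, k=2)
print(hits)
```

Here the third chunk ranks first because it shares both query words, and the helmet chunk is dropped because `k=2`.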

Let's test our vectorstore with a RetrievalQA chain using different chain types.

### **Retrieval QA chain with chain_type as Stuff**

[Stuff](https://python.langchain.com/docs/modules/chains/document/stuff) is a simple method: it takes all the relevant retrieved document chunks, stuffs them as a whole into the prompt, and sends that prompt to the LLM to generate a response.

In [None]:
from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=clarifai_vector_db.as_retriever())
query = "How automobile classes has been segregated with respect to engines?"

In the above scenario we used the stuff chain type, which places all the relevant chunks retrieved from the vector store into the LLM's context as a single prompt. The LLM then generates a response to the user's query.
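
As a rough picture of what "stuffing" means, the sketch below builds one prompt from every retrieved chunk and makes a single LLM call. The prompt wording and the names `build_stuff_prompt`, `stuff_answer`, and `fake_llm` are assumptions for illustration, not LangChain's actual template or code.

```python
from typing import Callable, List


def build_stuff_prompt(question: str, chunks: List[str]) -> str:
    """Concatenate all chunks into one context block (assumed prompt wording)."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


def stuff_answer(question: str, chunks: List[str], llm: Callable[[str], str]) -> str:
    """Single LLM call that sees every chunk at once."""
    return llm(build_stuff_prompt(question, chunks))


calls: List[str] = []


def fake_llm(prompt: str) -> str:
    """Fake LLM that records each prompt so the number of calls can be inspected."""
    calls.append(prompt)
    return "stub answer"


chunks = ["Chunk one text.", "Chunk two text."]
reply = stuff_answer("What do the chunks say?", chunks, fake_llm)
print(len(calls))  # 1
```

Because everything goes into one prompt, the combined chunks have to fit within the model's context window.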

In [None]:
print(qa.run(query))

 The automobile classes have been segregated with respect to engines based on the type of engine used in the Formula One cars from 1970 to 1985. The classes are as follows:

1. F1 Grand Prix cars 3L aspirated engine built from 1970 to 1972, equipped with a Ford-Cosworth DFV engine (Class 4.4.1.1).
2. F1 Grand Prix cars 3L aspirated engine built from 1973 to 1976, equipped with a Ford-Cosworth DFV engine (Class 4.4.1.2).
3. F1 Grand Prix cars 3L aspirated engine built from 1977 to 1980, designed not to exploit the ground effect (Class 4.5.1.1).
4. F1 Grand Prix cars 3L aspirated engine built from 1977 to 1980, designed to exploit the ground effect, equipped with a Ford-Cosworth DFV engine (Class 4.5.1.2).
5. F1 Grand Prix cars 3L aspirated engine built from 1977 to 1980, designed to exploit the ground effect, equipped with other engines (Class 4.5.1.3).
6. F1 Grand Prix cars 3L aspirated engine built from 1981 to 1985, equipped with a Ford-Cosworth DFV engine (Class 4.6.1.1).
7. F1 Gran

**Use case for Stuff :**

Stuff is the most straightforward option. When the context is small and you need precise answers to your questions, go with stuff.

### **Retrieval QA chain with chain_type as Map Reduce**

The [Map Reduce](https://python.langchain.com/docs/modules/chains/document/map_reduce) chain first applies an LLM chain to each document individually (the Map step), treating the chain output as a new document. It then passes all the new documents to a separate combine documents chain to get a single output (the Reduce step). It can optionally first compress, or collapse, the mapped documents to make sure that they fit in the combine documents chain (which will often pass them to an LLM). This compression step is performed recursively if necessary.
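
The Map and Reduce steps can be sketched as follows. This is a simplified illustration that leaves out the optional collapse step; the prompt wording and the names `map_reduce_answer` and `fake_llm` are hypothetical and do not reflect LangChain's internal code.

```python
from typing import Callable, List


def map_reduce_answer(question: str, chunks: List[str], llm: Callable[[str], str]) -> str:
    """Run the LLM once per chunk (map), then once more over the partial results (reduce)."""
    # Map step: each chunk is processed on its own.
    partials = [
        llm(f"Context:\n{chunk}\n\nExtract anything relevant to: {question}")
        for chunk in chunks
    ]
    # Reduce step: the per-chunk outputs are treated as new documents and combined.
    combined = "\n".join(partials)
    return llm(f"Partial answers:\n{combined}\n\nGive a final answer to: {question}")


prompts: List[str] = []


def fake_llm(prompt: str) -> str:
    """Fake LLM that records its prompts and returns a numbered note."""
    prompts.append(prompt)
    return f"note {len(prompts)}"


final = map_reduce_answer("How are classes defined?", ["chunk A", "chunk B", "chunk C"], fake_llm)
print(len(prompts))  # 4: three map calls plus one reduce call
print(final)         # note 4
```

In this sketch, three retrieved chunks cost four LLM calls, compared with one call for stuff, which is the price of keeping each individual prompt small.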

In [23]:
from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(llm=llm, chain_type="map_reduce", retriever=clarifai_vector_db.as_retriever())
query = "How automobile classes has been segregated with respect to engines?"

In [27]:
print(qa.run(query))

Automobile classes have been segregated with respect to engines as follows:

1. F1 Grand Prix cars 3L aspirated engine built from 1970 to 1972: 
   - Equipped with a Ford-Cosworth DFV Engine
   - Equipped with other engines

2. F1 Grand Prix cars 3L aspirated engine built from 1973 to 1976:
   - Equipped with a Ford-Cosworth DFV Engine
   - Equipped with other engines

3. F1 Grand Prix cars 3L aspirated engine built from 1977 to 1980:
   - Cars designed not to exploit the ground effect
   - Cars designed to exploit the ground effect, equipped with a Ford-Cosworth DFV engine
   - Cars designed to exploit the ground effect, equipped with other engines

4. Cars equipped with a Ford-Cosworth DFV engine
   - F1 Grand Prix cars 3L aspirated built from 1981 to 1985
   - Cars equipped with other engines

5. Class Special Invitation: This class is for any Formula One cars or any other cars considered by the Organization to be of Special Historical interest to the Organization or Promoters of an

**Use case of Map reduce:**

For the same query we got a different response with the map_reduce chain type, because it summarizes each chunk before passing the results to the final stage. This chain may suit large-context scenarios where many chunks are retrieved and the answer has to be built from their summaries.

### **Retrieval QA chain with chain_type as Map Re-rank**

The [Map re-rank](https://python.langchain.com/docs/modules/chains/document/map_rerank) documents chain runs an initial prompt on each document that not only tries to complete a task but also gives a score for how certain it is in its answer. The highest-scoring response is returned.
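
The selection logic can be sketched as below. It is a simplification: in the real chain the score is part of the model's text output and has to be extracted from it, whereas the fake model here returns an `(answer, score)` tuple directly. `map_rerank_answer` and `fake_scored_llm` are hypothetical names.

```python
from typing import Callable, List, Tuple


def map_rerank_answer(
    question: str,
    chunks: List[str],
    scored_llm: Callable[[str], Tuple[str, int]],
) -> str:
    """Answer the question against each chunk separately and keep the highest-scoring answer."""
    candidates = [
        scored_llm(f"Context:\n{chunk}\n\nQuestion: {question}") for chunk in chunks
    ]
    best_answer, _best_score = max(candidates, key=lambda pair: pair[1])
    return best_answer


def fake_scored_llm(prompt: str) -> Tuple[str, int]:
    """Fake LLM that is only confident when the context mentions an engine."""
    if "engine" in prompt.lower():
        return ("Classes are split by engine type.", 90)
    return ("I am not sure.", 10)


chunks = [
    "Helmets are mandatory.",
    "Classes depend on the engine fitted.",
    "Fuel must be unleaded.",
]
best = map_rerank_answer("How are classes defined?", chunks, fake_scored_llm)
print(best)  # Classes are split by engine type.
```

Only one chunk's answer survives, so information spread across several chunks is not merged the way it is with map reduce.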

In [28]:
from langchain.chains import RetrievalQA

qa = RetrievalQA.from_chain_type(llm=llm, chain_type="map_rerank", retriever=clarifai_vector_db.as_retriever())
query = "How automobile classes has been segregated with respect to engines?"

In [29]:
print(qa.run(query))



There is also a Special Invitation Class for any Formula One cars or any other cars considered to be of Special Historical interest to the Organization or Promoters of any of the races, or particular benefit to the Organization.


**Use case for Map Re-rank:**

From the above response we can see that the answer comes from a single chunk: the LLM answers the query against each retrieved chunk, scores each answer, and the highest-scoring answer is returned.
This chain can be effective in long-context scenarios where top K is large and the documents contain several chunks on the same topic.


