# RAG Fusion

**Parallel Retrieval of Information from Two Data Sources (Original Document & Addendum):**
The same query is run against both sources. The two retrieved answers are then compared, and the final answer given to the user incorporates any relevant updates from the addendum.

## Technology Stack:
- **GPT-3.5 Turbo:** Answers the query from the chunks retrieved from each source.
- **GPT-4 (`gpt-4-turbo`):** Intended for the decision-making step.
- **Pinecone:** The vector database that stores the embedded chunks.
- **LangChain:** The framework used to compose the retrieval and LLM calls.
- **OpenAI text-embedding-3-large (3072 dimensions):** Used for embedding textual data.

## Procedure:
1. **Query the Original-Document Index:** Retrieve from the vector database containing the chunks of the original documents.
2. **Query the Addendum Index:** Retrieve from the vector database containing the chunks of the addendum documents.
3. **Inject the Retrieved Responses:** Pass both responses into a tuned prompt that makes a judgment based on the two inputs.
4. **Decision-Making Prompt Model:** This model decides whether the information retrieved from the addendum is relevant to the query:
   - If yes, it updates the primary information with the addendum's changes.
   - If no, it returns the original information unchanged.
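
The four steps above can be sketched without any API calls. This is a minimal illustration, not the notebook's implementation: `answer_with_addendum` and the three `fake_*` callables are hypothetical names, and the stub decision step simply appends any non-empty addendum answer where the real one would be an LLM call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def answer_with_addendum(
    query: str,
    ask_original: Callable[[str], str],
    ask_addendum: Callable[[str], str],
    decide: Callable[[str, str, str], str],
) -> str:
    """Run steps 1-2 in parallel, then hand both answers to the decision step."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        original_future = pool.submit(ask_original, query)  # step 1
        addendum_future = pool.submit(ask_addendum, query)  # step 2
        info = original_future.result()
        new_info = addendum_future.result()
    # Steps 3-4: the decision callable receives the query and both answers.
    return decide(query, info, new_info)


# Hypothetical stand-ins so the flow can be run without any API keys.
def fake_original(query: str) -> str:
    return "Avoid the Great Lakes."


def fake_addendum(query: str) -> str:
    return "Avoid the Gulf of Mexico."


def fake_decide(query: str, info: str, new_info: str) -> str:
    # A real decision step would be an LLM call; this stub treats any
    # non-empty addendum answer as relevant and appends it.
    return f"{info} {new_info}" if new_info else info


if __name__ == "__main__":
    print(answer_with_addendum("which place to avoid", fake_original, fake_addendum, fake_decide))
```

Keeping the three stages as separate callables mirrors the hard boundary between the original document and the addendum: each stage can be swapped or tested on its own.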

## Reasons for This Approach:
1. **Divide a Bigger Task into Smaller Ones:** Each step stays simple, which helps in managing complexity.
2. **Introduce a Hard Boundary Between Addendums and the Original Document:** Keeping the two sources apart avoids ambiguous responses.
3. **Awareness of Data Derivation:** Knowing which source each piece of information came from makes the LLM more capable of building accurate responses.
4. **Division of Vector Database:** Each query searches a smaller index, which reduces the tokens consumed and the vectors retrieved from the vector database.
5. **Less Context in Each LLM:** Giving each LLM call less context improves its accuracy.


In [24]:
# API keys (shortened here; use your own)
openai_key = "sk-yf"
pinecone_key = "babe03fbb"
# Enter the question to ask
user_query = "which place to avoid"
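
Instead of pasting keys into a cell, they can be read from environment variables. The sketch below is one way to do that; `load_key` is a hypothetical helper, and the variable names `OPENAI_API_KEY` and `PINECONE_API_KEY` are assumptions about how the environment is set up.

```python
import os
from typing import Mapping


def load_key(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Return the named key from the environment, or fail with a clear message."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running the notebook.")
    return value


# Example usage (assumed variable names):
# openai_key = load_key("OPENAI_API_KEY")
# pinecone_key = load_key("PINECONE_API_KEY")
```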

### Model for retrieval 

In [25]:
from langchain_openai import ChatOpenAI

# Retrieval-stage model: answers the query from the retrieved chunks
model = ChatOpenAI(model="gpt-3.5-turbo", api_key=openai_key)

### Original documents chunks retrieval

In [27]:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore

# Pinecone index holding the chunks of the original document
index_name = "doc1"

# Embedding model; its dimensions must match those of the Pinecone index
embeddings = OpenAIEmbeddings(model="text-embedding-3-large", dimensions=3072, api_key=openai_key)

vectorstore = PineconeVectorStore(pinecone_api_key=pinecone_key, index_name=index_name, embedding=embeddings)
retriever = vectorstore.as_retriever()

template = """Answer the question in detail:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
output_parser = StrOutputParser()

# Retrieve the context and pass the question through unchanged, side by side
setup_and_retrieval = RunnableParallel(
    {"context": retriever, "question": RunnablePassthrough()}
)
chain = setup_and_retrieval | prompt | model | output_parser

info = chain.invoke(user_query)
print(f"information from original doc: {info}")



information from original doc: The document outlines several places to avoid when chartering a vessel. These include Great Lakes, St. Lawrence Seaway, Orinoco River, Cuba (with exceptions), Turkish-occupied Cyprus, Israel, Eritrea, Iraq, Iran (under UN sanctions), Sri Lanka (except Colombo), Somalia, Ethiopia, North Korea, Russian Pacific ports (except Vladivostok and Vostochny), active war zones, and countries with UN/flag state/US sanctions. Additionally, calling at Israeli ports prohibits subsequent calls at Arabian ports (except Egypt, if trade allowed) unless the vessel is proven not to be on the Arab Boycott List. Charterers bear the cost of removing the vessel from the Arab Boycott List if blacklisted due to Israel visits.
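
In the chain above, the retriever's output (a list of document objects) goes into `{context}` as-is. A common alternative is to join only the chunk text before it reaches the prompt. The sketch below shows that idea; `Doc` is a hypothetical stand-in that mimics the `page_content` attribute of a LangChain document, and `format_docs` is an illustrative helper name.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Doc:
    """Stand-in for a retrieved document; only `page_content` is used here."""
    page_content: str


def format_docs(docs: Iterable[Doc]) -> str:
    """Join the text of each retrieved chunk, separated by blank lines."""
    return "\n\n".join(doc.page_content.strip() for doc in docs)


if __name__ == "__main__":
    print(format_docs([Doc("Avoid the Great Lakes. "), Doc("Avoid the Orinoco River.")]))
```

If this were wired into the chain, it would probably look like `{"context": retriever | format_docs, "question": RunnablePassthrough()}`, which keeps metadata and object wrappers out of the prompt.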


### Addendum chunks retrieval 
Text can be added to this vector database through this URL:
https://addendum.streamlit.app/

The text entered there is converted into embeddings and upserted into a vector database in Pinecone.

For the question "which port to avoid?", I added a paragraph naming these regions: North American coasts, Cuba, Mexico, and the Gulf of Mexico.
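
The app's code is not shown here, so the following is only a guess at the shape of the upsert step. It assumes the store exposes an `add_texts` method, as LangChain vector stores do, and that the text is chunked on blank lines. `split_paragraphs`, `upsert_addendum`, and `InMemoryStore` are hypothetical names.

```python
from typing import List, Protocol


class TextStore(Protocol):
    """Anything with an `add_texts` method, e.g. a LangChain vector store."""

    def add_texts(self, texts: List[str]) -> List[str]: ...


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and drop empty pieces (an assumed chunking rule)."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def upsert_addendum(store: TextStore, text: str) -> int:
    """Send each paragraph to the store; return how many chunks were sent."""
    chunks = split_paragraphs(text)
    if chunks:
        store.add_texts(chunks)
    return len(chunks)


class InMemoryStore:
    """Hypothetical stand-in used only to run the sketch without Pinecone."""

    def __init__(self) -> None:
        self.texts: List[str] = []

    def add_texts(self, texts: List[str]) -> List[str]:
        self.texts.extend(texts)
        return [str(i) for i in range(len(texts))]
```

With a real `PineconeVectorStore` passed in as `store`, the embedding and upsert would happen inside `add_texts`.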

In [28]:
# Pinecone index holding the addendum chunks
index_name = "doc2"

# Same embedding settings as for the original-document index
embeddings = OpenAIEmbeddings(model="text-embedding-3-large", dimensions=3072, api_key=openai_key)

vectorstore1 = PineconeVectorStore(pinecone_api_key=pinecone_key, index_name=index_name, embedding=embeddings)
retriever1 = vectorstore1.as_retriever()

template = """Answer the question in detail:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
output_parser = StrOutputParser()

setup_and_retrieval = RunnableParallel(
    {"context": retriever1, "question": RunnablePassthrough()}
)
chain = setup_and_retrieval | prompt | model | output_parser

new_info = chain.invoke(user_query)
print(new_info)



According to the information provided in the documents, it is advised to avoid northern America, especially the Caribbean Sea and regions around the Gulf of Mexico. Additionally, it is recommended to avoid the ports in these areas. This caution could be due to potential risks or issues that may arise in these specific locations. It is important to heed this advice in order to ensure the safety and security of any activities or operations that may be taking place in these regions.


## The decision-maker LLM
The responses from the two retriever LLMs above are injected into a prompt that lets the LLM frame the final response.

In [31]:
from langchain_openai import ChatOpenAI

# Decision-stage model
model_final = ChatOpenAI(model="gpt-4-turbo", api_key=openai_key)

prompt = ChatPromptTemplate.from_template("""This is the query by the user : {question}
This is the existing information about the query : {info} 
And this is a new piece of information which might or might not be relevant to the query : {new_instruction}
Carefully decide whether the new piece of information is relevant to the query or not.
If yes then update the initial information with the new piece of information and return the updated information.
If no then return the initial information.
only return the updated information.""")
output_parser = StrOutputParser()

# Note: this chain is wired to `model` (gpt-3.5-turbo), not `model_final`;
# replace `model` with `model_final` to have GPT-4 make the decision.
chain = prompt | model | output_parser

chain.invoke({"question": "which route to avoid?", "info": info, "new_instruction": new_info})

'The new piece of information about avoiding northern America, especially the Caribbean Sea and regions around the Gulf of Mexico, is relevant to the query. \n\nUpdated information:\nThe document outlines several places to avoid when chartering a vessel. These include Great Lakes, St. Lawrence Seaway, Orinoco River, Cuba (with exceptions), Turkish-occupied Cyprus, Israel, Eritrea, Iraq, Iran (under UN sanctions), Sri Lanka (except Colombo), Somalia, Ethiopia, North Korea, Russian Pacific ports (except Vladivostok and Vostochny), active war zones, countries with UN/flag state/US sanctions, northern America (especially the Caribbean Sea and regions around the Gulf of Mexico). Additionally, calling at Israeli ports prohibits subsequent calls at Arabian ports (except Egypt, if trade allowed) unless the vessel is proven not to be on the Arab Boycott List. Charterers bear the cost of removing the vessel from the Arab Boycott List if blacklisted due to Israel visits.'
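
The response above starts with a justification even though the prompt asks only for the updated information. One way to drop it afterwards is to keep only the text after the `Updated information:` marker. This is a small sketch; `extract_updated_information` is a hypothetical helper, and the marker is taken from this one sample output, so the model may not always emit it.

```python
def extract_updated_information(response: str, marker: str = "Updated information:") -> str:
    """Return the text after the marker, or the whole response if it is absent."""
    _, found, tail = response.partition(marker)
    return tail.strip() if found else response.strip()


if __name__ == "__main__":
    sample = "The new piece of information is relevant.\n\nUpdated information:\nAvoid the Great Lakes."
    print(extract_updated_information(sample))
```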

### Same code as above, but consolidated

In [17]:
vectorstore1 = PineconeVectorStore(pinecone_api_key=pinecone_key, index_name="doc2", embedding=embeddings)
retriever1 = vectorstore1.as_retriever()
model_final = ChatOpenAI(model="gpt-4-turbo", api_key=openai_key)
template = """This is the query by the user : {question}
This is the existing information about the query : {info} 
And this is a new piece of information which might or might not be relevant to the query : {new_instruction}
Carefully decide whether the new piece of information is relevant to the query or not.
If yes then update the initial information with the new piece of information and return the updated information.
If no then return the initial information.
Only return the response to the user, not the justification.
"""
prompt = ChatPromptTemplate.from_template(template)
output_parser = StrOutputParser()

# Both retrievers run in parallel on the same question; their raw chunks
# (not LLM-written answers) fill {info} and {new_instruction}.
setup_and_retrieval = RunnableParallel(
    {"new_instruction": retriever1, "question": RunnablePassthrough(), "info": retriever}
)
chain = setup_and_retrieval | prompt | model_final | output_parser

info = chain.invoke(input="which route to avoid?")
print(info)



NotFoundError: Error code: 404 - {'error': {'message': 'The model `gpt-4-turbo` does not exist or you do not have access to it.', 'type': 'invalid_request_error', 'param': None, 'code': 'model_not_found'}}
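
The error above means the API key used had no access to `gpt-4-turbo`. One way to keep the cell runnable is to try a list of model names in order and use the first that responds. The sketch below shows the idea in plain Python; `invoke_with_fallback` and `fake_call` are hypothetical, and in practice `call_model` would build and invoke the chain for the given model name.

```python
from typing import Callable, List, Sequence


def invoke_with_fallback(
    model_names: Sequence[str],
    call_model: Callable[[str], str],
) -> str:
    """Try each model name in order and return the first successful response."""
    errors: List[str] = []
    for name in model_names:
        try:
            return call_model(name)
        except Exception as exc:  # broad on purpose, e.g. a 404 model_not_found
            errors.append(f"{name}: {exc}")
    raise RuntimeError("No model succeeded: " + "; ".join(errors))


def fake_call(name: str) -> str:
    # Hypothetical stand-in that mimics an account without GPT-4 access.
    if name == "gpt-4-turbo":
        raise LookupError("model_not_found")
    return f"answered by {name}"


if __name__ == "__main__":
    print(invoke_with_fallback(["gpt-4-turbo", "gpt-3.5-turbo"], fake_call))
```

LangChain runnables also offer a `with_fallbacks` method that may serve the same purpose inside the chain itself.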