[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/10-langchain-multi-query.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/08-langchain-multi-query.ipynb)

#### [LangChain Handbook](https://pinecone.io/learn/langchain)

# LangChain Multi-Query for RAG

## Getting Data

We will download an existing dataset from Hugging Face Datasets.

In [1]:
from datasets import load_dataset

data=load_dataset('bibrahim/rag_tryout', split='train')

Found cached dataset json (/home/nurul/.cache/huggingface/datasets/bibrahim___json/bibrahim--rag_tryout-9e32671c097e1553/0.0.0/e347ab1c932092252e717ff3f949105a4dd28b27e842dd53157d2f72e276c2e4)


In [2]:
data

Dataset({
    features: ['id', 'text', 'source'],
    num_rows: 319
})

In [57]:
from langchain.docstore.document import Document

docs = []

for row in data:
    doc = Document(
        page_content=row["text"],
        metadata={
            "text": row["text"],
            "id": row["id"]  # used later to build the Pinecone vector IDs
        }
    )
    docs.append(doc)

## Embedding and Vector DB Setup

Initialize our embedding model:

In [58]:
import os
from getpass import getpass
from langchain.embeddings.openai import OpenAIEmbeddings

model_name = "text-embedding-ada-002"

# get openai api key from platform.openai.com
OPENAI_API_KEY = getpass("OpenAI API Key: ")

embed = OpenAIEmbeddings(
    model=model_name, openai_api_key=OPENAI_API_KEY, disallowed_special=()
)

Create our Pinecone index:

In [59]:
import pinecone
import time

index_name = "langchain-multi-query-demo"

# find API key in console at app.pinecone.io
YOUR_API_KEY = os.getenv('PINECONE_API_KEY') or getpass("Pinecone API Key: ")
# find ENV (cloud region) next to API key in console
YOUR_ENV = os.getenv('PINECONE_ENVIRONMENT') or input("Pinecone Env: ")

pinecone.init(
    api_key=YOUR_API_KEY,
    environment=YOUR_ENV
)

if index_name not in pinecone.list_indexes():
    # we create a new index
    pinecone.create_index(
        name=index_name,
        metric='dotproduct',
        dimension=1536  # 1536 dim of text-embedding-ada-002
    )
    # wait for index to be initialized
    while not pinecone.describe_index(index_name).status["ready"]:
        time.sleep(1)

# now connect to index
index = pinecone.Index(index_name)
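
The readiness loop above polls forever if the index never becomes ready. As a minimal sketch, a generic polling helper with a timeout could look like the following; `wait_until` and its parameters are hypothetical names introduced here, not part of the Pinecone client.

```python
import time


def wait_until(is_ready, timeout=60.0, poll_interval=1.0):
    """Poll is_ready() until it returns True or the timeout (seconds) expires."""
    deadline = time.monotonic() + timeout
    while not is_ready():
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Toy predicate that becomes true on the third call, standing in for
# a check such as: pinecone.describe_index(index_name).status["ready"]
calls = {"n": 0}


def toy_ready():
    calls["n"] += 1
    return calls["n"] >= 3


wait_until(toy_ready, timeout=5.0, poll_interval=0.01)
print(calls["n"])
```

In the notebook, the predicate would be a small lambda wrapping the `describe_index` call already used above.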

Populate our index:

In [60]:
len(docs)

319

In [61]:
# optionally keep only a subset of the docs to speed things up while following along
#docs = docs[:100]

In [62]:
print(docs[0])

page_content='\ufeffFaculty of Engineering and Applied Science\n4.1 Program of Study\n1. Courses in the Engineering Program are normally taken in Academic Terms as shown in the appropriate program table. Students must satisfy the criteria for promotion as described below under Promotion Regulations to remain in the Engineering program.\n\n2. The Engineering Program consists of eight academic terms and four to six work terms. The first-year of the Engineering Program, known as Engineering One, forms a core that is common to all majors. All students must successfully complete the requirements of Engineering One prior to being promoted to Academic Term 3 as indicated under Promotion Regulations, Promotion Status (Engineering One).\n\n3. To be eligible for registration for ENGI 001W in the Spring semester after completing Engineering One, students are expected to successfully complete the prerequisite ENGI 200W in the Fall semester of Engineering One. All other Engineering One students are

In [68]:
docs[0].metadata

{'text': '\ufeffFaculty of Engineering and Applied Science\n4.1 Program of Study\n1. Courses in the Engineering Program are normally taken in Academic Terms as shown in the appropriate program table. Students must satisfy the criteria for promotion as described below under Promotion Regulations to remain in the Engineering program.\n\n2. The Engineering Program consists of eight academic terms and four to six work terms. The first-year of the Engineering Program, known as Engineering One, forms a core that is common to all majors. All students must successfully complete the requirements of Engineering One prior to being promoted to Academic Term 3 as indicated under Promotion Regulations, Promotion Status (Engineering One).\n\n3. To be eligible for registration for ENGI 001W in the Spring semester after completing Engineering One, students are expected to successfully complete the prerequisite ENGI 200W in the Fall semester of Engineering One. All other Engineering One students are exp

In [65]:
from tqdm.auto import tqdm
from uuid import uuid4

batch_size = 100

for i in tqdm(range(0, len(docs), batch_size)):
    i_end = min(len(docs), i+batch_size)
    docs_batch = docs[i:i_end]
    # get IDs
    ids = [f"{doc.metadata['id']}" for doc in docs_batch]
    # get text and embed
    texts = [d.page_content for d in docs_batch]
    embeds = embed.embed_documents(texts=texts)
    # get metadata
    metadata = [d.metadata for d in docs_batch]
    to_upsert = zip(ids, embeds, metadata)
    index.upsert(vectors=to_upsert)

  0%|          | 0/4 [00:01<?, ?it/s]


ApiValueError: Invalid value for `id`, length must be less than or equal to `512`
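
The `ApiValueError` above indicates that at least one ID passed to `index.upsert` was longer than 512 characters. One possible workaround, sketched below, keeps short IDs unchanged and replaces overly long ones with a fixed-length SHA-256 digest. The helper name `make_vector_id` and the hashing strategy are assumptions for illustration; they are not part of the original notebook or the Pinecone client.

```python
import hashlib

MAX_ID_LENGTH = 512  # limit quoted in the error message above


def make_vector_id(raw_id, max_length=MAX_ID_LENGTH):
    """Return raw_id as a string if it is short enough, else a stable hex digest."""
    raw_id = str(raw_id)
    if len(raw_id) <= max_length:
        return raw_id
    # SHA-256 hex digests are always 64 characters and deterministic,
    # so re-running the upsert produces the same ID for the same input.
    return hashlib.sha256(raw_id.encode("utf-8")).hexdigest()


short_id = make_vector_id("doc-17")
long_id = make_vector_id("x" * 2000)
print(short_id, len(long_id))
```

In the upsert loop this could be applied as `ids = [make_vector_id(doc.metadata['id']) for doc in docs_batch]`.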

## Multi-Query with LangChain

Now we switch across to using our populated index as a vectorstore in LangChain.

In [34]:
from langchain.vectorstores import Pinecone

text_field = "text"

vectorstore = Pinecone(index, embed.embed_query, text_field)



In [35]:
from langchain.chat_models import ChatOpenAI

llm = ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)

We initialize the `MultiQueryRetriever`:

In [36]:
from langchain.retrievers.multi_query import MultiQueryRetriever

retriever = MultiQueryRetriever.from_llm(
    retriever=vectorstore.as_retriever(), llm=llm
)

We set logging so that we can see the queries as they're generated by our LLM.

In [37]:
# Set logging for the queries
import logging

logging.basicConfig()
logging.getLogger("langchain.retrievers.multi_query").setLevel(logging.INFO)

To query with our multi-query retriever we call the `get_relevant_documents` method.

In [38]:
question = "what courses should I take for term 4 electrical engineering?"

docs = retriever.get_relevant_documents(query=question)
len(docs)

INFO:langchain.retrievers.multi_query:Generated queries: ['1. Which courses are recommended for term 4 in the field of electrical engineering?', '2. What are the suggested courses for term 4 if I am studying electrical engineering?', '3. Can you provide a list of courses that would be beneficial for term 4 in electrical engineering?']


5

From this we get a variety of docs retrieved by each of our queries independently. By default the underlying vector store retriever returns up to `4` docs per query (LangChain's default `k`), so the three generated queries can yield up to `12` documents. Because the results overlap, only `5` unique docs are returned here.
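
To illustrate how overlapping results collapse into a smaller unique set, here is a minimal plain-Python sketch. It is not LangChain's actual implementation: the `Doc` class, the `unique_union` helper, and the toy result lists are stand-ins, and deduplicating on `page_content` is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Tiny stand-in for a retrieved document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def unique_union(per_query_results):
    """Merge result lists in order, keeping the first copy of each page_content."""
    seen = set()
    unique = []
    for results in per_query_results:
        for doc in results:
            if doc.page_content not in seen:
                seen.add(doc.page_content)
                unique.append(doc)
    return unique


# Three toy queries whose results overlap heavily.
toy_results = [
    [Doc("a"), Doc("b"), Doc("c")],
    [Doc("b"), Doc("c"), Doc("d")],
    [Doc("a"), Doc("d"), Doc("b")],
]
merged = unique_union(toy_results)
print([d.page_content for d in merged])
```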

In [39]:
docs

[Document(page_content="11.3 Electrical and Computer Engineering\nIn accordance with Senate's Policy Regarding Inactive Courses, the course descriptions for courses which have not been offered in the previous three academic years and which are not scheduled to be offered in the current academic year have been removed from the following listing. For information about any of these inactive courses, please contact the Head of the Department. (or the Associate Dean (Undergraduate Studies) of the Faculty in the case of ENGI courses).\nElectrical and Computer Engineering courses are identified by a four-digit numbering system, the first two digits signifying the following:\nThe first digit denotes the academic term during which the course is normally offered.\nThe second digit denotes the primary areas of study, namely:\n0: Design\n\t1: Mathematics\n\t2: Controls\n\t3: Circuits\n\t4: Software\n\t5: Digital Hardware\n\t6: Signals & Communications\n\t7: Electromagnetism\n\t8: Power & Machines\

## Adding the Generation in RAG

So far we've built a multi-query powered **R**etrieval **A**ugmentation chain. Now, we need to add **G**eneration.

In [50]:
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

QA_PROMPT = PromptTemplate(
    input_variables=["query", "contexts"],
    template="""
You are a helpful assistant who answers user queries using the contexts provided. 
## Unik-
- You are Unik, an AI Academic Advisor for the students at Memorial University of Newfoundland (MUN).
- Unik will **Focus** on providing academic advising to undergraduate students at (MUN).
- Unik will **Specialize** in providing information and knowledge about courses and programs (majors, minors, joint majors, honors, etc.) across various departments and faculties.

## Creators
- When asked, Unik must **Acknowledge** its development and creation by a few undergraduate students in Physics and Engineering at MUN.
- Unik will respond to **Direct** inquiries or feedback about Unik or its creators by redirecting the users to founders@unik.com via email.

## Communication Guidelines
- Unik will talk in the style of a real human Academic Advisor appointed at the university.
- Unik will always empathize with the user, and be kind. 
- Unik will **Avoid** engaging in queries containing vulgar, explicit, harmful, sexual, illegal, immoral, racist, or homophobic language, or references to drugs.
- Unik will **Refrain** from responding to queries outside the academic advising scope or irrelevant to its knowledge base.
- Unik will NEVER display the process of how it found an answer in any of its responses.
- Unik will NEVER display or reference its data sources to the users in any of its responses.
- If a user asks about the source of the data, Unik will tell the user that the data was collected from the publicly available University Calendar at MUN, and will attach a link(in plain text format) relevant to the user query from the knowledgebase.
- Unik will **Maintain** a friendly and human-like tone in responses.
- Unik will **Tailor** specific responses to each specific query.
- Unik will **Ensure** responses are relevant, accurate, and personalized.
- Unik should understand that Computer Science (CS) and Computer Engineering (CE) are different majors in different faculties and should look for results accordingly when a user asks.

## Accuracy and Content Retrieval
- (**IMPORTANT**) Unik will analyze the question smartly to understand what specific courses, programs, or other pieces of information the user is looking for and then only provide information regarding those specific courses, programs, or other pieces of information. For example, for a query about Electrical Engineering, the answer should never be mixed with anything from Mechanical Engineering, for Computer Science the answer should never be mixed with anything from Computer Engineering, and vice versa, and so on.
- Unik should ALWAYS understand the user query as a whole and NEVER look at the question partially to respond to it.
- Unik must **Ensure** that the information aligns with the user query.
- Unik must **Recheck** answers for accuracy against the knowledgebase before displaying responses.

## Query Interpretation
- Unik will only form a response with use of information that is most relevant to the user query.
- Unik will **Analyze and interpret** the structure and meaning of queries using the chain of thought method to find the perfect answer to anything.
- Unik will **Understand and respond** effectively to complex or unclear student inquiries.
- Unik will never display to the users the process of how it found an answer.

## Response Structure
- Unik must **Provide** concise overviews for course inquiries, emphasizing key details, and the requirements to take that course.
- Unik must **Offer** comprehensive overviews for program inquiries, including structure, requirements, and unique features.
- Unik must **Conclude** responses with an invitation for further inquiries or additional academic guidance.

## Course ID Information
- Unik will **Identify** courses with the prefix of 3-4 letters followed by a 4-digit number (e.g., BUSI 1000, PHYS 1050, ECE 5000, etc.).

## Response to inquiries about programs, majors, minors, honors, terms, etc.
- Unik must give a bulletpointed markdown language overview of the program in summarized form. 
- Unik MUST only include the names of the courses when describing the program and ONLY describe the courses or write the specifics of the courses when asked by the user. For example it should be in this form-
"
In Academic Term 4 for Mechanical Engineering at Memorial University of Newfoundland, the following courses are required:

-**ENGI 4430**
-**ME 4302** 
-**ME 4402** 
-**ME 4501** 
-**ME 4601**"

## Response to inquiries about courses
- Unik must **Include** Pre-requisites (PR), Co-Requisites (CR), and Equivalent (EQ) when discussing a course for the first time.
- Unik must **Offer** more detailed information or other inquiries at the end of its responses.

    Contexts:
    {contexts}

    Question: {query}""",
)

# Chain
qa_chain = LLMChain(llm=llm, prompt=QA_PROMPT)

In [51]:
out = qa_chain(
    inputs={
        "query": question,
        "contexts": "\n---\n".join([d.page_content for d in docs])
    }
)
out["text"]

InvalidRequestError: This model's maximum context length is 4097 tokens. However, your messages resulted in 4524 tokens. Please reduce the length of the messages.
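
The `InvalidRequestError` above shows that the long prompt template plus the joined contexts exceeded the model's 4097-token limit. One way to avoid this, sketched below, is to cap how much context text is joined into the prompt. The helper `join_contexts` and its `max_chars` budget are hypothetical, and character count is only a rough proxy for tokens; a tokenizer-based budget would be more accurate.

```python
def join_contexts(chunks, max_chars, separator="\n---\n"):
    """Join chunks in order, stopping before the result would exceed max_chars."""
    kept = []
    total = 0
    for chunk in chunks:
        # The separator only costs characters once something is already kept.
        extra = len(chunk) + (len(separator) if kept else 0)
        if total + extra > max_chars:
            break
        kept.append(chunk)
        total += extra
    return separator.join(kept)


toy_chunks = ["aaaa", "bbbb", "cccc"]
trimmed = join_contexts(toy_chunks, max_chars=13)
print(repr(trimmed))
```

In the cell above, the `"contexts"` value could then be built with `join_contexts([d.page_content for d in docs], max_chars=...)`, choosing a budget that leaves room for the prompt template and the answer.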

## Chaining Everything with a SequentialChain

We can pull the logic above together into a function or set of methods, whichever is preferred. However, if we'd like to use LangChain's approach, we must "chain" together multiple chains. The first retrieval component is (1) not a chain per se, and (2) requires processing of the output. To handle that, and to fit with LangChain's "chaining chains" approach, we set up the _retrieval_ component within a `TransformChain`:

In [42]:
from langchain.chains import TransformChain

def retrieval_transform(inputs: dict) -> dict:
    docs = retriever.get_relevant_documents(query=inputs["question"])
    docs = [d.page_content for d in docs]
    docs_dict = {
        "query": inputs["question"],
        "contexts": "\n---\n".join(docs)
    }
    return docs_dict

retrieval_chain = TransformChain(
    input_variables=["question"],
    output_variables=["query", "contexts"],
    transform=retrieval_transform
)

Now we chain this with our generation step using the `SequentialChain`:

In [43]:
from langchain.chains import SequentialChain

rag_chain = SequentialChain(
    chains=[retrieval_chain, qa_chain],
    input_variables=["question"],  # named "question" so it does not clash with the "query" output
    output_variables=["query", "contexts", "text"]
)

Then we perform the full RAG pipeline:

In [44]:
out = rag_chain({"question": question})
out["text"]

INFO:langchain.retrievers.multi_query:Generated queries: ['1. Which courses are recommended for term 4 in the field of electrical engineering?', '2. What are the suggested courses for term 4 if I am studying electrical engineering?', '3. Can you provide a list of courses that would be beneficial for term 4 in electrical engineering?']


'Based on the provided information, the required courses for term 4 of the Electrical Engineering program are CIV 4220, CIV 4310, CIV 4450, and CIV 4610. The elective course for term 4 is ENGI 4421.'
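
To make the data flow through the chained steps concrete, here is a plain-Python sketch of the same idea: each step reads a shared dictionary and contributes new keys to it. This is not how LangChain implements `SequentialChain`; `run_sequential`, `fake_retrieval`, and `fake_generation` are hypothetical stand-ins.

```python
def run_sequential(steps, inputs, output_keys):
    """Run steps in order; each step reads the shared state and returns new keys."""
    state = dict(inputs)
    for step in steps:
        state.update(step(state))
    return {key: state[key] for key in output_keys}


def fake_retrieval(state):
    # Stand-in for retrieval_transform: maps "question" to "query" and adds contexts.
    return {"query": state["question"], "contexts": "context A\n---\ncontext B"}


def fake_generation(state):
    # Stand-in for qa_chain: consumes "query" and "contexts", emits "text".
    n_chars = len(state["contexts"])
    return {"text": f"answer to {state['query']!r} using {n_chars} chars"}


result = run_sequential(
    steps=[fake_retrieval, fake_generation],
    inputs={"question": "which courses?"},
    output_keys=["query", "contexts", "text"],
)
print(result["text"])
```

The sketch also shows why the input is called `question`: the retrieval step itself produces a `query` key, so the two names must differ.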

---

## Custom Multiquery

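We'll try this with two prompts, both of which encourage more variety in the search queries. The code below uses Prompt B. Note that Prompt B's domain hint (Large Language Models, Machine Learning) does not match the university calendar data used here, so adapt that sentence to your own dataset.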

**Prompt A**
```
Your task is to generate 3 different search queries that aim to
answer the user question from multiple perspectives.
Each query MUST tackle the question from a different viewpoint,
we want to get a variety of RELEVANT search results.
Provide these alternative questions separated by newlines.
Original question: {question}
```


**Prompt B**
```
Your task is to generate 3 different search queries that aim to
answer the user question from multiple perspectives. The user questions
are focused on Large Language Models, Machine Learning, and related
disciplines.
Each query MUST tackle the question from a different viewpoint, we
want to get a variety of RELEVANT search results.
Provide these alternative questions separated by newlines.
Original question: {question}
```

In [52]:
from typing import List
from langchain.chains import LLMChain
from pydantic import BaseModel, Field
from langchain.prompts import PromptTemplate
from langchain.output_parsers import PydanticOutputParser


# Output parser will split the LLM result into a list of queries
class LineList(BaseModel):
    # "lines" is the key (attribute name) of the parsed output
    lines: List[str] = Field(description="Lines of text")


class LineListOutputParser(PydanticOutputParser):
    def __init__(self) -> None:
        super().__init__(pydantic_object=LineList)

    def parse(self, text: str) -> LineList:
        lines = text.strip().split("\n")
        return LineList(lines=lines)


output_parser = LineListOutputParser()

template = """
Your task is to generate 3 different search queries that aim to
answer the user question from multiple perspectives. The user questions
are focused on Large Language Models, Machine Learning, and related
disciplines.
Each query MUST tackle the question from a different viewpoint, we
want to get a variety of RELEVANT search results.
Provide these alternative questions separated by newlines.
Original question: {question}
"""

QUERY_PROMPT = PromptTemplate(
    input_variables=["question"],
    template=template,
)
llm = ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)

# Chain
llm_chain = LLMChain(llm=llm, prompt=QUERY_PROMPT, output_parser=output_parser)

In [53]:
# Run
retriever = MultiQueryRetriever(
    retriever=vectorstore.as_retriever(), llm_chain=llm_chain, parser_key="lines"
)  # "lines" is the key (attribute name) of the parsed output

# Results
docs = retriever.get_relevant_documents(
    query=question
)
len(docs)

INFO:langchain.retrievers.multi_query:Generated queries: ['1. What are the recommended courses for term 4 in electrical engineering curriculum?', '2. Which specific courses are typically offered in the fourth term of an electrical engineering program?', '3. What are the core subjects covered in the fourth term of an electrical engineering degree program?']


7
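
The logged queries above keep their list numbering (`1.`, `2.`, `3.`), and a plain split on newlines would also keep any empty lines. As a minimal sketch, the line-splitting step could strip both before the queries are sent to the vector store. The helper `clean_query_lines` is hypothetical and shown standalone; it would replace the `text.strip().split("\n")` step in a parser like the one above.

```python
import re

# Matches leading list markers such as "1. ", "2) ", "- " or "* ".
_LIST_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-*])\s+")


def clean_query_lines(text):
    """Split LLM output into queries, dropping list markers and blank lines."""
    lines = []
    for line in text.splitlines():
        line = _LIST_MARKER.sub("", line).strip()
        if line:
            lines.append(line)
    return lines


raw_output = (
    "1. Which courses are recommended for term 4?\n"
    "\n"
    "2. What are the suggested courses?\n"
)
print(clean_query_lines(raw_output))
```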

In [54]:
docs

[Document(page_content='Work terms shall normally be taken in the order as set out in Civil Engineering Major.\n\n\n\n**Engineering One Term** -\n   - Required Courses:\n     1. Chemistry 1050 or 1200\n     2. 3 credit hours in English at the 1000 level or any Critical Reading and Writing course\n     3. Mathematics 1000, 1001, 2050\n     4. Physics 1050, 1051\n   - Elective Courses:\n     1. ENGI 1010, 1020, 1030, 1040\n\n\n**Fall - Academic Term 3** -\n   - Required Courses:\n     1. CIV 3210, 3440, 3710, 3720\n   - Elective Courses:\n     1. ENGI 3101\n     2. Mechanical and Mechatronics Engineering 3301\n\n\n**Winter** -\n   - Required Courses:\n     1. ENGI 001W or 002W\n\n\n**Spring - Academic Term 4** -\n   - Required Courses:\n     1. CIV 4220, 4310, 4450, 4610\n   - Elective Courses:\n     1. ENGI 4421\n\n\n**Fall** -\n   - Required Courses:\n     1. ENGI 001W or 002W or 003W\n\n\n**Winter - Academic Term 5** -\n   - Required Courses:\n     1. CIV 5110, 5230, 5320, 5460, 5510\

Putting this together in another `SequentialChain`:

In [55]:
retrieval_chain = TransformChain(
    input_variables=["question"],
    output_variables=["query", "contexts"],
    transform=retrieval_transform
)

rag_chain = SequentialChain(
    chains=[retrieval_chain, qa_chain],
    input_variables=["question"],  # named "question" so it does not clash with the "query" output
    output_variables=["query", "contexts", "text"]
)

And asking again:

In [56]:
out = rag_chain({"question": question})
out["text"]

INFO:langchain.retrievers.multi_query:Generated queries: ['1. What are the recommended courses for term 4 in electrical engineering curriculum?', '2. Which specific courses are typically offered in term 4 for electrical engineering students?', '3. What are the core courses in term 4 of an electrical engineering program?']


'In Academic Term 4 for Electrical Engineering at Memorial University of Newfoundland, the following courses are required:\n\n- **ECE 3300** Circuit Analysis\n- **ECE 3400** Foundations of Programming\n\nThese are the specific courses that you should take in Term 4 for Electrical Engineering. If you have any further questions or need additional academic guidance, feel free to ask!'

After finishing, delete your Pinecone index to save resources:

In [22]:
pinecone.delete_index(index_name)

---