MultiQuery Retriever for the "comparing population" task

In [None]:
from qdrant_client import QdrantClient
from langchain.vectorstores import Qdrant
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chat_models import ChatOpenAI
from langchain.retrievers.multi_query import MultiQueryRetriever
from dotenv import load_dotenv
load_dotenv()


In [3]:
client = QdrantClient(path="./qdrant_storage/")
embeddings = OpenAIEmbeddings()
collection_name = "toronto_sf"
qdrant = Qdrant(client=client, collection_name=collection_name, embeddings=embeddings)

In [4]:
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
retriever_from_llm = MultiQueryRetriever.from_llm(retriever=qdrant.as_retriever(), llm=llm)

In [5]:
question = "compare the population of Toronto and San Francisco"

In [7]:
def pretty_print_docs(docs):
    # Print each document's text, separated by a horizontal rule.
    separator = f"\n{'-' * 100}\n"
    print(separator.join(f"Document {i+1}:\n\n" + d.page_content for i, d in enumerate(docs)))

relevant_docs = retriever_from_llm.get_relevant_documents(query=question, k=3)
len(relevant_docs)
pretty_print_docs(relevant_docs)

Document 1:

The city's foreign-born persons made up 47 per cent of the population, compared to 49.9 per cent in 2006. According to the United Nations Development Programme, Toronto has the second-highest percentage of constant foreign-born population among world cities, after Miami, Florida. While Miami's foreign-born population has traditionally consisted primarily of Cubans and other Latin Americans, no single nationality or culture dominates Toronto's immigrant population, placing it among the most diverse cities in the world. In 2010, it was estimated over 100,000 immigrants arrive in the Greater Toronto Area each year.
----------------------------------------------------------------------------------------------------
Document 2:

Toronto is the most populous city in Canada and the capital city of the Canadian province of Ontario. With a recorded population of 2,794,356 in 2021, it is the fourth-most populous city in North America. The city is the anchor of the Golden Horseshoe, 

OK, it looks like the retriever does break down the original query: the retrieved documents cover the populations of both San Francisco and Toronto.
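One way to check the breakdown directly, rather than inferring it from the returned documents, is to turn on INFO logging for the retriever's module so that the generated query variants are logged. This is a minimal sketch: the logger name is assumed to match the `langchain.retrievers.multi_query` import path used above, and the commented-out line reuses `retriever_from_llm` and `question` from the earlier cells.

```python
import logging

# Send log records to the notebook output.
logging.basicConfig()

# Assumption: the retriever logs through a logger named after its module path.
mq_logger = logging.getLogger("langchain.retrievers.multi_query")
mq_logger.setLevel(logging.INFO)

# Re-running the retrieval afterwards should log the generated queries:
# relevant_docs = retriever_from_llm.get_relevant_documents(query=question)
```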

Alright, now we use the MultiQuery retriever to solve the "comparing population" problem; the result is shown below. It takes about 10 seconds to reach the answer and costs $0.003905. The decompose solution cost $0.001197 and took 15 seconds, so MultiQuery is roughly 3.3x more expensive but finishes in about a third less time. The decompose time could probably be reduced by running its calls in multiple threads or asynchronously.
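The multi-thread idea can be sketched as follows. This is only an illustration under stated assumptions: `answer_sub_question` is a hypothetical stand-in for one retrieve-and-answer step of the decompose solution (which is not shown in this notebook), the two sub-questions are made up for the example, and `time.sleep` simulates the latency of a network call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def answer_sub_question(sub_question: str) -> str:
    """Hypothetical stand-in for one retrieve-and-answer step."""
    time.sleep(0.2)  # simulated latency of one LLM/retriever round trip
    return f"answer to: {sub_question}"


def answer_all(sub_questions, max_workers=4):
    # Threads suit this workload because each step mostly waits on I/O.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the same order as the inputs.
        return list(pool.map(answer_sub_question, sub_questions))


# Illustrative sub-questions, not taken from the decompose notebook.
sub_questions = [
    "What is the population of Toronto?",
    "What is the population of San Francisco?",
]

start = time.perf_counter()
answers = answer_all(sub_questions)
elapsed = time.perf_counter() - start

print(answers)
print(f"elapsed: {elapsed:.2f}s")
```

With the calls overlapping, the total wall time should be close to that of the slowest single call rather than the sum of all calls. Cost is unaffected, since the same requests are still made.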

In [13]:
from langchain.callbacks import get_openai_callback
from langchain.prompts import PromptTemplate
from operator import itemgetter
from langchain.schema import StrOutputParser

template = """
    Given the following context:
    {context}
    Please answer the following question:
    {question}
    The answer is:
"""
prompt_template = PromptTemplate.from_template(template=template)

chain = (
    {
        "context": itemgetter("question") | retriever_from_llm,
        "question": itemgetter("question"),
    }
    | prompt_template
    | llm
    | StrOutputParser()
)

In [14]:
with get_openai_callback() as callback:
    response = chain.invoke({"question": question})
    print(response)
    print(f"\nHere is the cost breakdown for this call:\n{callback}")

The population of Toronto is 2,794,356 in 2021, while the population of San Francisco is 873,965 according to the 2020 United States census. Therefore, Toronto has a significantly larger population than San Francisco.

Here is the cost breakdown for this call:
Tokens Used: 1270
	Prompt Tokens: 1175
	Completion Tokens: 95
Successful Requests: 2
Total Cost (USD): $0.003905
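
The comparison with the decompose solution can be checked with a few lines of arithmetic. The figures are the ones quoted in this notebook ($0.003905 and about 10 seconds for MultiQuery, $0.001197 and 15 seconds for decompose); the variable names are only illustrative.

```python
# Figures quoted in this notebook.
multiquery_cost, multiquery_seconds = 0.003905, 10
decompose_cost, decompose_seconds = 0.001197, 15

# How many times more expensive MultiQuery is than decompose.
cost_ratio = multiquery_cost / decompose_cost

# Fraction of the decompose wall time that MultiQuery saves.
time_saved = 1 - multiquery_seconds / decompose_seconds

print(f"cost ratio: {cost_ratio:.2f}x")  # cost ratio: 3.26x
print(f"time saved: {time_saved:.0%}")   # time saved: 33%
```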
