# **Building a Self-Querying RAG Application using Qdrant**

**Introduction**

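Building the foundation of a Retrieval-Augmented Generation (RAG) pipeline is often straightforward. When the documents also carry structured metadata, a self-querying mechanism lets the retriever turn a natural-language question into both a semantic search and a metadata filter. This notebook builds such a self-querying RAG application on top of Qdrant.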

❗ This notebook requires an **OpenAI API key**.

### **1. Import relevant packages**

In [2]:
import os
import json
import warnings
import openai
import pandas as pd
import qdrant_client
from tqdm import tqdm
from getpass import getpass
from dotenv import load_dotenv
from datasets import load_dataset
from typing import Optional, List, Tuple
from qdrant_client import QdrantClient, models
from qdrant_client.http.models import PointStruct
from langchain_core.language_models import BaseChatModel
warnings.filterwarnings('ignore')

In [3]:
load_dotenv()

True

### **2. Set up your OpenAI key**

In [4]:
if not (openai_api_key := os.environ.get("OPENAI_API_KEY")):
    openai_api_key = getpass("🔑 Enter your OpenAI API key: ")
openai.api_key = openai_api_key
os.environ["OPENAI_API_KEY"] = openai_api_key

### **3. Retrieve the documents / dataset to be used**

In [5]:

dataset = load_dataset("GroNLP/ik-nlp-22_winemag", split="test")


Downloading readme: 100% 33.0/33.0 [00:00<00:00, 226B/s]
Downloading data: 100% 20.5M/20.5M [00:02<00:00, 7.56MB/s]
Downloading data: 100% 1.45M/1.45M [00:00<00:00, 2.85MB/s]
Downloading data: 100% 1.46M/1.46M [00:00<00:00, 2.76MB/s]
Generating train split: 100% 70458/70458 [00:00<00:00, 280656.73 examples/s]
Generating validation split: 100% 5000/5000 [00:00<00:00, 332918.26 examples/s]
Generating test split: 100% 5000/5000 [00:00<00:00, 304062.87 examples/s]


In [6]:
len(dataset)

5000

In [7]:
df = dataset.to_pandas()
df

| | index | country | description | points | price | province | variety |
|---|---|---|---|---|---|---|---|
| 0 | 11602 | US | Pretty dark for a rosé, and heavy and rich, to... | 83 | 18.0 | California | Rosé |
| 1 | 27260 | US | Attractive roasted, smoky aromas join ripe plu... | 90 | 13.0 | California | Merlot |
| 2 | 76630 | US | Vinified in stainless and left on the lees for... | 90 | 14.0 | Oregon | Pinot Gris |
| 3 | 12014 | US | A beautiful Merlot, noble and classic. It does... | 92 | 28.0 | California | Merlot |
| 4 | 9116 | US | A palate-pleasing mix of juicy boysenberry fru... | 90 | 35.0 | Washington | Syrah |
| ... | ... | ... | ... | ... | ... | ... | ... |
| 4995 | 10215 | US | From one of California’s coolest growing regio... | 86 | 30.0 | California | Champagne Blend |
| 4996 | 88814 | US | A touch of vanilla, butter and seared lemons c... | 88 | 28.0 | California | Viognier |
| 4997 | 124091 | Italy | This playful blend of air-dried white grapes (... | 83 | 12.0 | Veneto | White Blend |
| 4998 | 1417 | US | The cool vintage was a challenge, which speaks... | 93 | 38.0 | California | Zinfandel |


In [8]:
df.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 7 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   index        5000 non-null   int64  
 1   country      5000 non-null   object 
 2   description  5000 non-null   object 
 3   points       5000 non-null   int64  
 4   price        5000 non-null   float64
 5   province     5000 non-null   object 
 6   variety      5000 non-null   object 
dtypes: float64(1), int64(2), object(4)
memory usage: 273.6+ KB


### **4. Setting up our vector store - Qdrant**

Set up the Qdrant client, then create a collection to store the document embeddings.

In [5]:

## Uncomment to initialise the Qdrant client in memory
#client = qdrant_client.QdrantClient(
#    location=":memory:",
#)

## Connect to Qdrant Cloud (reads QDRANT_URL and QDRANT_API_KEY from the environment)
client = qdrant_client.QdrantClient(
    url=os.environ.get("QDRANT_URL"),
    api_key=os.environ.get("QDRANT_API_KEY"),
)

## Uncomment below to connect to local Qdrant
#client = qdrant_client.QdrantClient("http://localhost:6333")
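
The three connection options above can also be selected at runtime rather than by commenting code in and out. The sketch below is one possible way to do that, assuming the same `QDRANT_URL` and `QDRANT_API_KEY` environment variables. The helper name `qdrant_connection_kwargs` is hypothetical and is not part of `qdrant_client`.

```python
import os
from typing import Dict, Mapping, Optional


def qdrant_connection_kwargs(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Pick QdrantClient keyword arguments from environment-style settings.

    Hypothetical helper: falls back to an in-memory instance when no
    QDRANT_URL is configured.
    """
    env = os.environ if env is None else env
    url = env.get("QDRANT_URL")
    if not url:
        # No remote endpoint configured: use the in-memory mode.
        return {"location": ":memory:"}
    kwargs = {"url": url}
    api_key = env.get("QDRANT_API_KEY")
    if api_key:
        # Qdrant Cloud expects an API key; a local server usually does not.
        kwargs["api_key"] = api_key
    return kwargs


# Usage (assumes qdrant_client is installed):
# client = qdrant_client.QdrantClient(**qdrant_connection_kwargs())
```

With this approach the same notebook can run against Qdrant Cloud, a local server, or an in-memory instance depending only on the environment.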

In [6]:
## Collection name used throughout the notebook
COLLECTION_NAME = "wine_mag_selfquery"

In [7]:
## General Collection level operations

## Get information about existing collections
client.get_collections()

## Get information about specific collection
#collection_info = client.get_collection(COLLECTION_NAME)
#print(collection_info)

## Delete the collection, if need be
#client.delete_collection(COLLECTION_NAME)

CollectionsResponse(collections=[CollectionDescription(name='wine_mag_selfquery'), CollectionDescription(name='posts'), CollectionDescription(name='users'), CollectionDescription(name='users_with_sparse'), CollectionDescription(name='qdrant-docs-rag-langchain')])

In [15]:
#!pip install sentence_transformers

In [8]:
from sentence_transformers import SentenceTransformer
encoder = SentenceTransformer("all-MiniLM-L6-v2")

In [16]:
client.create_collection(
    collection_name=COLLECTION_NAME,
    vectors_config=models.VectorParams(
        size=encoder.get_sentence_embedding_dimension(),
        distance=models.Distance.COSINE,
    ),
)

True
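
The collection is created with `Distance.COSINE`, so the `score` attached to each search hit reflects the cosine similarity between the query vector and a stored vector, where higher means more similar. As a rough illustration of that measure only (this is a plain-Python sketch, not Qdrant's implementation), it can be computed as follows:

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 0.0], [2.0, 0.0])   # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])       # unrelated vectors
```

Because only the direction of the vectors matters, scaling an embedding does not change its cosine similarity to a query.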

### **5. Processing data into vectors**

In [18]:
# Document class to structure data
class Document:
    def __init__(self, page_content, metadata):
        self.page_content = page_content
        self.metadata = metadata

# Convert DataFrame rows into Document objects
def df_to_documents(df):
    documents = []
    for _, row in df.iterrows():
        metadata = {
            "country": row["country"],
            "points": row["points"],
            "price": row["price"],
            "variety": row["variety"],
            "province": row["province"]
        }
        document = Document(page_content=row["description"], metadata=metadata)
        documents.append(document)
    return documents

docs = df_to_documents(df)

In [19]:
vector_points = [
    models.PointStruct(
        id=idx, 
        vector=encoder.encode(doc.page_content).tolist(), 
        payload={'metadata': doc.metadata, 'page_content': doc.page_content}
    )
    for idx, doc in enumerate(docs)
]
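
The list comprehension above encodes one description per call. For larger datasets it may be faster to encode in batches. The sketch below shows one way to structure that, with the encoder passed in as a callable so the example stays self-contained. `iter_point_rows` and `fake_encode` are hypothetical names used for illustration; with the notebook's model, the callable could plausibly be `lambda batch: encoder.encode(batch).tolist()`.

```python
from typing import Callable, Dict, Iterator, List, Sequence, Tuple

Vector = List[float]


def iter_point_rows(
    texts: Sequence[str],
    payloads: Sequence[Dict],
    encode_batch: Callable[[List[str]], List[Vector]],
    batch_size: int = 64,
) -> Iterator[Tuple[int, Vector, Dict]]:
    """Yield (id, vector, payload) rows, encoding the texts batch by batch."""
    if len(texts) != len(payloads):
        raise ValueError("texts and payloads must have the same length")
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        vectors = encode_batch(batch)
        for offset, vector in enumerate(vectors):
            idx = start + offset
            yield idx, list(vector), payloads[idx]


# Stand-in encoder for illustration only: maps each text to a 2-d vector.
def fake_encode(batch: List[str]) -> List[Vector]:
    return [[float(len(text)), 1.0] for text in batch]


rows = list(
    iter_point_rows(
        ["ab", "cde", "f"],
        [{"n": 0}, {"n": 1}, {"n": 2}],
        fake_encode,
        batch_size=2,
    )
)
```

Each yielded row carries the same three pieces of information that `models.PointStruct` receives in the cell above: an id, a vector, and a payload.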

### **6. Adding vector points into Qdrant Collection**

In [21]:
client.upload_points(
    collection_name=COLLECTION_NAME,
    points=vector_points,
)

In [9]:
## Ensure the collection holds the expected number of documents
client.count(collection_name=COLLECTION_NAME)

CountResult(count=5000)

### **7. Searching for the document**

In [10]:
hits = client.search(
    collection_name=COLLECTION_NAME,
    query_vector=encoder.encode("Quinta dos Avidagos 2011").tolist(),
    limit=3,
)

for hit in hits:
    print(hit.payload['metadata'], "score:", hit.score)

{'country': 'Portugal', 'points': 88, 'price': 19.0, 'variety': 'Portuguese Red', 'province': 'Douro'} score: 0.43031907
{'country': 'Portugal', 'points': 88, 'price': 20.0, 'variety': 'Portuguese Red', 'province': 'Douro'} score: 0.3356677
{'country': 'Portugal', 'points': 88, 'price': 15.0, 'variety': 'Portuguese Red', 'province': 'Douro'} score: 0.32194147


### **8. Searching with filter**

In [24]:
# query filter
hits = client.search(
    collection_name=COLLECTION_NAME,
    query_vector=encoder.encode("Night Sky").tolist(),
    query_filter=models.Filter(
        must=[
            models.FieldCondition(key="metadata.country", match=models.MatchValue(value="US")),
            models.FieldCondition(key="metadata.price", range=models.Range(gte=15.0, lte=30.0)), 
            models.FieldCondition(key="metadata.points", range=models.Range(gte=90, lte=100))
        ]
    ),
    limit=3,
)

for hit in hits:
    print(hit.payload['metadata'], "\nprice:", hit.payload['metadata']['price'], "\npoints:", hit.payload['metadata']['points'], "\n\n")


{'country': 'US', 'points': 90, 'price': 25.0, 'variety': 'Pinot Noir', 'province': 'New York'} 
price: 25.0 
points: 90 


{'country': 'US', 'points': 91, 'price': 21.0, 'variety': 'Tempranillo', 'province': 'California'} 
price: 21.0 
points: 91 


{'country': 'US', 'points': 92, 'price': 24.0, 'variety': 'Syrah', 'province': 'California'} 
price: 24.0 
points: 92 
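
The `must` clause in the filter above requires every condition to hold at once: country equal to `US`, price between 15 and 30 inclusive, and points between 90 and 100 inclusive. The plain-Python sketch below restates that logic and checks it against the three hits printed above; `matches_filter` is a hypothetical helper for illustration, not something Qdrant exposes.

```python
from typing import Dict, List


def matches_filter(metadata: Dict) -> bool:
    """Return True when every condition of the example filter holds."""
    nan = float("nan")  # comparisons with NaN are False, so missing fields fail
    return (
        metadata.get("country") == "US"
        and 15.0 <= metadata.get("price", nan) <= 30.0
        and 90 <= metadata.get("points", nan) <= 100
    )


# Metadata copied from the filtered search output above.
filtered_hits: List[Dict] = [
    {"country": "US", "points": 90, "price": 25.0, "variety": "Pinot Noir", "province": "New York"},
    {"country": "US", "points": 91, "price": 21.0, "variety": "Tempranillo", "province": "California"},
    {"country": "US", "points": 92, "price": 24.0, "variety": "Syrah", "province": "California"},
]

# A hit from the earlier unfiltered search, which the filter should reject.
unfiltered_hit = {"country": "Portugal", "points": 88, "price": 19.0, "variety": "Portuguese Red", "province": "Douro"}

all_pass = all(matches_filter(hit) for hit in filtered_hits)
```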




### **9. Self-query mechanism using LangChain**

In [18]:
#!pip install langchain_qdrant
#!pip install lark

In [10]:
from langchain.chains.query_constructor.base import AttributeInfo
from langchain.retrievers.self_query.base import SelfQueryRetriever
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain.callbacks.tracers import ConsoleCallbackHandler
from langchain_qdrant import QdrantVectorStore
from langchain_openai import OpenAI, ChatOpenAI

handler = ConsoleCallbackHandler()
#llm = ChatOpenAI(temperature=0, model="gpt-4o")
llm = OpenAI(temperature=0)

embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vector_store = QdrantVectorStore(
    client=client,
    collection_name=COLLECTION_NAME,
    embedding=embeddings,
)



In [42]:
!pip freeze | grep langchain


langchain==0.2.16
langchain-community==0.2.17
langchain-core==0.2.41
langchain-openai==0.1.25
langchain-qdrant==0.1.4
langchain-text-splitters==0.2.4


In [11]:
metadata_field_info = [
    AttributeInfo(
        name="country",
        description="The country that the wine is from",
        type="string",
    ),
    AttributeInfo(
        name="points",
        description="The number of points the wine was rated, on a scale of 1-100",
        type="integer",
    ),
    AttributeInfo(
        name="price",
        description="The cost for a bottle of the wine",
        type="float",
    ),
    AttributeInfo(
        name="variety",
        description="The grapes used to make the wine",
        type="string",
    ),
]

document_contents = "Brief description of the wine"

# Set up a retriever to query your vector store with self-querying capabilities
retriever = SelfQueryRetriever.from_llm(
    llm, 
    vector_store, 
    document_contents, 
    metadata_field_info, 
    verbose=True
    )
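
Conceptually, the self-query retriever asks the LLM to split a question into a semantic query plus structured comparisons on the metadata fields declared above, and those comparisons are then translated into a vector-store filter. The sketch below illustrates only that translation step, using plain dictionaries in the shape of Qdrant's JSON filter syntax. The `Comparison` tuple and `to_qdrant_filter` function are hypothetical names and do not reproduce LangChain's actual translator.

```python
from typing import Any, Dict, List, NamedTuple


class Comparison(NamedTuple):
    comparator: str   # one of: eq, gt, gte, lt, lte
    attribute: str    # metadata field name, e.g. "price"
    value: Any


RANGE_COMPARATORS = {"gt", "gte", "lt", "lte"}


def to_qdrant_filter(comparisons: List[Comparison], prefix: str = "metadata") -> Dict[str, Any]:
    """Combine comparisons with AND semantics into a Qdrant-style filter dict."""
    must: List[Dict[str, Any]] = []
    for comp in comparisons:
        # Payload fields in this notebook are nested under "metadata".
        key = f"{prefix}.{comp.attribute}"
        if comp.comparator == "eq":
            must.append({"key": key, "match": {"value": comp.value}})
        elif comp.comparator in RANGE_COMPARATORS:
            must.append({"key": key, "range": {comp.comparator: comp.value}})
        else:
            raise ValueError(f"unsupported comparator: {comp.comparator}")
    return {"must": must}


example_filter = to_qdrant_filter([
    Comparison("eq", "country", "US"),
    Comparison("gte", "price", 15),
    Comparison("lte", "price", 30),
    Comparison("gt", "points", 90),
])
```

The resulting dictionary has the same overall shape as the hand-written `models.Filter` used in section 8, which is why a natural-language question can end up producing an equivalent filtered search.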

### **10. Querying with the SelfQueryRetriever over Qdrant**

In [12]:
response = retriever.invoke("Which US wines are priced between 15 and 30 and have points above 90?")
response

[Document(metadata={'country': 'US', 'points': 91, 'price': 25.0, 'variety': 'Sauvignon Blanc', 'province': 'California', '_id': 1016, '_collection_name': 'wine_mag_selfquery'}, page_content='Quite light in aroma at first, this is a delicate, lithe and lovely bottling, showing lemon and yellow pear tones on the nose. That sensibility carries to the palate, where touches of yellow apple, sea salt, faint banana and a shred of bubblegum converge for a complete yet restrained white wine experience.'),
 Document(metadata={'country': 'US', 'points': 93, 'price': 30.0, 'variety': 'Cabernet Sauvignon', 'province': 'California', '_id': 3535, '_collection_name': 'wine_mag_selfquery'}, page_content='The Daou brothers have become increasingly known for their rich, unctuous and lavish high-end wines, but this bottling offers a taste of that opulence for just $30. Plump blueberry and soft, cedar-like spice scents show on the nose, while the palate bursts with black cherry, dark chocolate and caramel

In [14]:
for resp in response:
    print(resp.metadata['variety'], "\n price:", resp.metadata['price'], "points:", resp.metadata['points'], "\n\n")

Sauvignon Blanc 
 price: 25.0 points: 91 


Cabernet Sauvignon 
 price: 30.0 points: 93 


Chardonnay 
 price: 22.0 points: 91 


Barbera 
 price: 18.0 points: 90 
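
Since the filter is generated by an LLM, it can be worth checking the returned rows against the constraints stated in the question. The sketch below does this for the four rows printed above; `rows_violating` is a hypothetical helper. Note that the Barbera row has exactly 90 points, so it passes only if "above 90" is read inclusively.

```python
from typing import Dict, List

# Rows copied from the printed output above.
results: List[Dict] = [
    {"variety": "Sauvignon Blanc", "price": 25.0, "points": 91},
    {"variety": "Cabernet Sauvignon", "price": 30.0, "points": 93},
    {"variety": "Chardonnay", "price": 22.0, "points": 91},
    {"variety": "Barbera", "price": 18.0, "points": 90},
]


def rows_violating(
    rows: List[Dict],
    min_price: float,
    max_price: float,
    min_points: int,
    inclusive_points: bool,
) -> List[str]:
    """Return the varieties of rows that break the price or points constraint."""
    bad: List[str] = []
    for row in rows:
        price_ok = min_price <= row["price"] <= max_price
        if inclusive_points:
            points_ok = row["points"] >= min_points
        else:
            points_ok = row["points"] > min_points
        if not (price_ok and points_ok):
            bad.append(row["variety"])
    return bad


strict = rows_violating(results, 15, 30, 90, inclusive_points=False)
inclusive = rows_violating(results, 15, 30, 90, inclusive_points=True)
```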




In [16]:
retriever.invoke("Which US wines are priced between 15 and 30 and have points above 90?", {"callbacks":[handler]})

[chain/start] [retriever:SelfQueryRetriever > chain:query_constructor] Entering Chain run with input:
{
  "query": "Which US wines are priced between 15 and 30 and have points above 90?"
}
[chain/start] [retriever:SelfQueryRetriever > chain:query_constructor > prompt:FewShotPromptTemplate] Entering Prompt run with input:
{
  "query": "Which US wines are priced between 15 and 30 and have points above 90?"
}
[chain/end] [retriever:SelfQueryRetriever > chain:query_constructor > prompt:FewShotPromptTemplate] [0ms] Exiting Prompt run with output:
[outputs]
[llm/start] [retriever:SelfQueryRetriever > chain:query_constructor > llm:OpenAI] Entering LLM run with input:
{
  "prompts": [
    "Your goal is to structure the user's query to match the request schema provided below.\n\n<< Structured Request Schema >>\nWhen responding use a markdown code snippet with a JSON object formatted in the follow

[Document(metadata={'country': 'US', 'points': 91, 'price': 25.0, 'variety': 'Sauvignon Blanc', 'province': 'California', '_id': 1016, '_collection_name': 'wine_mag_selfquery'}, page_content='Quite light in aroma at first, this is a delicate, lithe and lovely bottling, showing lemon and yellow pear tones on the nose. That sensibility carries to the palate, where touches of yellow apple, sea salt, faint banana and a shred of bubblegum converge for a complete yet restrained white wine experience.'),
 Document(metadata={'country': 'US', 'points': 93, 'price': 30.0, 'variety': 'Cabernet Sauvignon', 'province': 'California', '_id': 3535, '_collection_name': 'wine_mag_selfquery'}, page_content='The Daou brothers have become increasingly known for their rich, unctuous and lavish high-end wines, but this bottling offers a taste of that opulence for just $30. Plump blueberry and soft, cedar-like spice scents show on the nose, while the palate bursts with black cherry, dark chocolate and caramel