## Qdrant
https://qdrant.tech/documentation/quick-start/

Qdrant stores data in named collections. Each collection holds points; a point is a vector with an ID and an optional payload.

1. Connect to Qdrant with `qdrant-client` and create a collection
2. Compute OpenAI embeddings (via LangChain) and add them to Qdrant
3. Search
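
To make the terms concrete before touching the real service, here is a toy, pure-Python stand-in for a collection of points and a brute-force cosine search. It is only a sketch: `Point`, `cosine_similarity`, `search`, and the 2-d vectors are made up for illustration and are not part of the Qdrant API, which uses approximate indexes rather than a full scan.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Point:
    """Toy stand-in for a Qdrant point: an id, a vector, and an optional payload."""
    id: int
    vector: list
    payload: dict = field(default_factory=dict)


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(collection, query_vector, limit=5):
    """Brute-force search: rank every point by similarity to the query."""
    ranked = sorted(
        collection,
        key=lambda p: cosine_similarity(p.vector, query_vector),
        reverse=True,
    )
    return ranked[:limit]


# A "collection" here is just a list of points with hand-made 2-d vectors.
toy_collection = [
    Point(id=0, vector=[1.0, 0.0], payload={"topic": "life"}),
    Point(id=1, vector=[0.0, 1.0], payload={"topic": "work"}),
    Point(id=2, vector=[0.9, 0.1], payload={"topic": "life"}),
]

toy_hits = search(toy_collection, query_vector=[1.0, 0.0], limit=2)
print([p.id for p in toy_hits])  # [0, 2]
```

The rest of the notebook follows the same shape, with OpenAI embeddings in place of the hand-made vectors and Qdrant doing the storage and ranking.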


In [None]:
!pip install qdrant-client

In [None]:
!pip install openai langchain tiktoken

In [11]:
import os
import qdrant_client
from langchain.vectorstores import Qdrant
from langchain.embeddings.openai import OpenAIEmbeddings
from google.colab import userdata

os.environ["QDRANT_API_KEY"] = userdata.get('qdrant_api_key')
os.environ["QDRANT_HOST"] = userdata.get('qdrant_host')

## Create Collection

In [30]:
db_client = qdrant_client.QdrantClient(
    api_key=os.environ["QDRANT_API_KEY"],
    location=os.environ["QDRANT_HOST"]
)

In [10]:
from qdrant_client.models import Distance, VectorParams

db_client.create_collection(
    collection_name="my_collection",
    vectors_config=VectorParams(
        size=1536,  # dimensionality of OpenAI's text-embedding-3-small vectors
        distance=Distance.COSINE,
    ),
)

True

In [34]:
# OpenAIEmbeddings reads the key from the OPENAI_API_KEY environment variable
os.environ["OPENAI_API_KEY"] = userdata.get('openai_api_key')


## Embed and upload

In [13]:
import pandas as pd
df = pd.read_csv('sample_data/quotes.csv')
embedding_vectors = OpenAIEmbeddings(model="text-embedding-3-small").embed_documents(df['quote'].tolist())
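
Before uploading, it can help to confirm that every vector has the size the collection was created with (1536 here). The helper below is a minimal sketch; `check_dimensions` is a hypothetical name, and the sample vectors are stand-ins rather than real embeddings.

```python
def check_dimensions(vectors, expected_size):
    """Return the indices of vectors whose length differs from expected_size."""
    return [i for i, v in enumerate(vectors) if len(v) != expected_size]


# Stand-in data: two vectors of the right size and one that is too short.
sample_vectors = [[0.0] * 1536, [0.0] * 1536, [0.0] * 512]
mismatched = check_dimensions(sample_vectors, expected_size=1536)
print(mismatched)  # [2]
```

In this notebook the call would be `check_dimensions(embedding_vectors, 1536)`, and an empty list means all vectors match the collection.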

In [54]:
from qdrant_client.models import PointStruct

db_client.upsert(
    collection_name="my_collection",
    points=[
        PointStruct(
            id=idx,  # row position in df, used later to look the quote up again
            vector=vector,
            payload={"person": person, "topic": topic},
        )
        for idx, (vector, person, topic) in enumerate(
            zip(embedding_vectors, df['person'], df['topic'])
        )
    ]
)

UpdateResult(operation_id=0, status=<UpdateStatus.COMPLETED: 'completed'>)
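
A single `upsert` call is fine for a small CSV. For larger datasets, one option is to split the points into batches and upsert each batch separately. The sketch below shows only the batching part; `batched` is a hypothetical helper and the batch size of 3 is an arbitrary value for the demonstration.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Demonstration on plain integers instead of PointStruct objects.
batches = list(batched(range(7), batch_size=3))
print(batches)  # [[0, 1, 2], [3, 4, 5], [6]]
```

With the real data, each batch of `PointStruct` objects would be passed to `db_client.upsert(collection_name="my_collection", points=batch)`.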

## Search

In [71]:
query_text = 'what is life'
# embed_query returns a single vector for one query string
query_vector = OpenAIEmbeddings(model="text-embedding-3-small").embed_query(query_text)
hits = db_client.search(
    collection_name="my_collection",
    query_vector=query_vector,
    limit=5,  # return the 5 closest points
)

In [72]:
for hit in hits:
    # Point ids were assigned from the row positions in df, so they index the quotes directly
    print(df['quote'][hit.id])


Where there is love there is life.
Life is what happens when you're busy making other plans.
Life is 10% what happens to us and 90% how we react to it.
The purpose of our lives is to be happy.
Make your life a masterpiece; imagine no limitations on what you can be, have or do.
