### Setting up Qdrant

1. Pull the image from a terminal: `docker pull qdrant/qdrant`
2. Start the container from a terminal:

> docker run -p 6333:6333 -p 6334:6334 \
> -v "$(pwd)/qdrant_storage:/qdrant/storage:z" \
> qdrant/qdrant

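In the above command:

The first line publishes the container's REST API port and gRPC port on the host \
-p 6333:6333  => the REST API port \
-p 6334:6334  => the gRPC port

The second line mounts a local directory called qdrant_storage into the container at /qdrant/storage, so the stored data survives container restarts. The `:z` suffix is an SELinux relabeling option for the mounted directory.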

Once the container is running, open the following link in your browser to see the Qdrant dashboard:
> http://localhost:6333/dashboard#/welcome
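
If the dashboard does not load, it can help to confirm that the two published ports are reachable at all. The following is a minimal sketch using only the standard library; the helper names `is_port_open` and `check_qdrant_ports` are illustrative and not part of Qdrant, and the check only tests that something accepts TCP connections on those ports, not that it is Qdrant.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


def check_qdrant_ports(host: str = "localhost") -> dict:
    """Check the two ports published by the docker run command above."""
    ports = {"REST (6333)": 6333, "gRPC (6334)": 6334}
    return {name: is_port_open(host, port) for name, port in ports.items()}
```

Calling `check_qdrant_ports()` returns a dictionary with one boolean per port; both should be `True` while the container is running.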


In [13]:
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct

import openai
import pandas as pd

In [4]:
qdrant_client = QdrantClient(url="http://localhost:6333")

qdrant_client.create_collection(
    collection_name="Amazon-items-collection-00",
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE), 
)
# The vector size is set to 1536, which matches the embedding size of the OpenAI text-embedding-3-small model used below.
# The distance metric is set to COSINE, which is commonly used for text embeddings: it scores similarity by the cosine of the angle between two vectors.
# These parameters are fixed when the collection is created, so choose them carefully based on the use case and the type of data being stored.

True
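
To make the COSINE metric mentioned in the comments concrete, here is a small worked sketch in plain Python. The vectors are made-up toy values, not real embeddings, and `cosine_similarity` is an illustrative helper rather than anything Qdrant exposes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (both non-zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Same direction, different length: similarity is approximately 1
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])

# Perpendicular vectors: similarity is 0
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])

# Opposite direction: similarity is -1
opposite = cosine_similarity([1.0, 0.0], [-1.0, 0.0])
```

Only the direction of the vectors matters, not their length, which is why a higher score in the search results further down means a closer match.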

> After running the above, go back to the Qdrant UI in your browser and open Collections; you should see the collection that was just created.

> Let's now get some data and store it in this collection.

In [6]:
# Read the sampled dataset with Amazon inventory metadata

df_items = pd.read_json("../data/meta_Electronics_2022_2023_with_category_ratings_100_sample_1000.jsonl", lines=True)

In [7]:
# Concatenate title and features

def preprocess_data(row):
    return f"{row['title']} {' '.join(row['features'])}"

df_items["preprocessed_data"] = df_items.apply(preprocess_data, axis=1)
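
To see what the concatenation produces, the same function can be run on a plain dictionary standing in for a DataFrame row. The row contents below are hypothetical and chosen only for illustration.

```python
def preprocess_data(row):
    # Same logic as the cell above: title, then the space-joined feature list
    return f"{row['title']} {' '.join(row['features'])}"


# Hypothetical row; a dict supports the same row['column'] access
example_row = {
    "title": "USB C Cable",
    "features": ["Fast charging", "Braided nylon"],
}
example_text = preprocess_data(example_row)
print(example_text)  # USB C Cable Fast charging Braided nylon

# An item with no features keeps the separator, leaving a trailing space
empty_features_text = preprocess_data({"title": "Laptop Bag", "features": []})
print(repr(empty_features_text))  # 'Laptop Bag '
```

The trailing space in the second case is harmless for embedding, and it explains why some stored payload texts end with a space.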

In [12]:
df_items.head()

Unnamed: 0,main_category,title,average_rating,rating_number,features,description,price,images,videos,store,categories,details,parent_asin,bought_together,subtitle,author,preprocessed_data
0,Industrial & Scientific,"RAVODOI USB C Cable, [2Pack/3.3ft+6.6ft] USB T...",4.4,119,[【Fast Charging Cord】These USB C cables provid...,[],,[{'thumb': 'https://m.media-amazon.com/images/...,"[{'title': 'Type-C Charger Cable ', 'url': 'ht...",RAVODOI,"[Electronics, Computers & Accessories, Compute...","{'Brand': 'RAVODOI', 'Connector Type': 'USB Ty...",B09R4Y2HKY,,,,"RAVODOI USB C Cable, [2Pack/3.3ft+6.6ft] USB T..."
1,All Electronics,"SNESH-2 Pack USB-C Female to USB Male Adapter,...",4.5,352,[🔹(Light & compact) Easy to carry and light we...,[],4.99,[{'thumb': 'https://m.media-amazon.com/images/...,"[{'title': 'USB Male & Female Adapter', 'url':...",SNESH,"[Electronics, Computers & Accessories, Compute...",{'Package Dimensions': '3.54 x 2.4 x 0.35 inch...,B09JV5FM2S,,,,"SNESH-2 Pack USB-C Female to USB Male Adapter,..."
2,All Electronics,USB C Docking Station Dual Monitor for MacBook...,3.9,1193,[【18-in-1Docking Station】With USB C Docking St...,[],,[{'thumb': 'https://m.media-amazon.com/images/...,[],ZMUIPNG,"[Electronics, Computers & Accessories, Laptop ...","{'Product Dimensions': '3.94""L x 1.18""W x 3.94...",B09SFN9NRX,,,,USB C Docking Station Dual Monitor for MacBook...
3,Camera & Photo,[2023 Upgraded] Telescopes for Adults Astronom...,4.1,219,[🎁【2023 All New Experience】The newly upgraded ...,[],169.99,[{'thumb': 'https://m.media-amazon.com/images/...,"[{'title': 'Good picture quality', 'url': 'htt...",HUTACT,"[Electronics, Camera & Photo, Binoculars & Sco...","{'Product Dimensions': '32.5""D x 5.5""W x 9.7""H...",B09TP3SZ7C,,,,[2023 Upgraded] Telescopes for Adults Astronom...
4,AMAZON FASHION,"Laptop Bag 15.6 Inch, Laptop Briefcase Messeng...",4.5,222,"[Leather,Mesh, Imported, Multi-pockets and Lar...",[],24.95,[{'thumb': 'https://m.media-amazon.com/images/...,[],KPIQIU,"[Electronics, Computers & Accessories, Laptop ...",{'Product Dimensions': '16 x 2 x 12 inches; 1....,B0B5H7T7XZ,,,,"Laptop Bag 15.6 Inch, Laptop Briefcase Messeng..."


In [8]:
# Sample 50 items from the dataset

df_sample = df_items.sample(n=50, random_state=42)

In [9]:
# Define the embedding function
def get_embedding(text, model="text-embedding-3-small"):
    response = openai.embeddings.create(
        input=[text],
        model=model,
    )
    return response.data[0].embedding

In [14]:
# Embed the sample data
data_to_embed = df_sample["preprocessed_data"].tolist()
pointstructs = []
for i, data in enumerate(data_to_embed):
    embedding = get_embedding(data)
    pointstructs.append(
        PointStruct(
            id=i,
            vector=embedding,
            payload={"text": data},
        )
    )

In [16]:
len(pointstructs)

50

In [17]:
# Embed the full dataset
data_to_embed = df_items["preprocessed_data"].tolist()
pointstructs_list = []
for i, data in enumerate(data_to_embed):
    embedding = get_embedding(data)
    pointstructs_list.append(
        PointStruct(
            id=i,
            vector=embedding,
            payload={"text": data},
        )
    )
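
The loop above sends one request per item. Since the embeddings endpoint accepts a list of inputs, a batched variant can reduce the number of requests. The sketch below is one possible way to do that, not the notebook's method: `get_embeddings_batched` is a hypothetical helper, `batch_size=100` is an arbitrary illustrative value, and the `client` argument is assumed to be anything exposing `embeddings.create(input=..., model=...)`, such as the `openai` module imported earlier.

```python
def get_embeddings_batched(texts, client, model="text-embedding-3-small", batch_size=100):
    """Embed texts with one API request per batch instead of one per text.

    client is assumed to expose embeddings.create(input=..., model=...),
    for example the openai module used in this notebook.
    """
    embeddings = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start : start + batch_size]
        response = client.embeddings.create(input=batch, model=model)
        # Sort by index so each embedding lines up with its input text
        ordered = sorted(response.data, key=lambda item: item.index)
        embeddings.extend(item.embedding for item in ordered)
    return embeddings
```

With this helper, `get_embeddings_batched(data_to_embed, openai)` would return one embedding per text in the original order, ready to be wrapped in `PointStruct` objects as before. Request size limits apply to the API, so very long texts may call for a smaller batch size.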

In [19]:
def chunked(lst, n):
    for i in range(0, len(lst), n):
        yield lst[i : i + n]

# Write embedded data to Qdrant collection
for batch in chunked(pointstructs_list, 50):
    qdrant_client.upsert(
        collection_name="Amazon-items-collection-00",
        wait=True,
        points=batch,
    )
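
The `chunked` helper is easiest to understand on a small list. The sketch below repeats the function and shows how it splits its input; the 1000-item figure is an assumption based on the `sample_1000` suffix in the dataset filename.

```python
def chunked(lst, n):
    """Yield successive slices of lst with at most n items each."""
    for i in range(0, len(lst), n):
        yield lst[i : i + n]


# Seven items in chunks of three: the last chunk holds the remainder
batches = list(chunked(list(range(7)), 3))
print(batches)  # [[0, 1, 2], [3, 4, 5], [6]]

# Assuming 1000 points, a chunk size of 50 gives 20 upsert calls
batch_sizes = [len(batch) for batch in chunked(list(range(1000)), 50)]
print(len(batch_sizes))  # 20
```

Writing in batches keeps each upsert request small, and the final batch is simply shorter when the list length is not a multiple of the chunk size.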

In [20]:
# Define a function for data retrieval
def retrieve_data(query):
    query_embedding = get_embedding(query)
    results = qdrant_client.query_points(
        collection_name="Amazon-items-collection-00",
        query=query_embedding,
        limit=10,
    )
    return results
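
The raw result objects are verbose because each payload carries the full product text. A small formatting helper can make them easier to scan. This is a sketch under the assumption that each returned point exposes `id`, `score`, and a `payload` dictionary with a `"text"` key, as in the output of the test cell; `summarize_points` is a hypothetical name.

```python
def summarize_points(points, max_chars=60):
    """Return (id, rounded score, shortened text) for each scored point."""
    rows = []
    for point in points:
        text = point.payload["text"]
        if len(text) > max_chars:
            # Keep only the start of long product descriptions
            text = text[:max_chars] + "..."
        rows.append((point.id, round(point.score, 3), text))
    return rows
```

For example, `summarize_points(retrieve_data("wireless headphones with noise cancellation").points)` would yield one short tuple per match instead of the full payload.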

In [22]:
# Test the retrieval function
retrieve_data("wireless headphones with noise cancellation").points

[ScoredPoint(id=662, version=13, score=0.5792889, payload={'text': '2 Pack Earbuds Headphones, Wired Earphones Stereo Noise Reduction Canceling with Microphone and Volume Control '}, vector=None, shard_key=None, order_value=None),
 ScoredPoint(id=615, version=12, score=0.56868047, payload={'text': 'Wireless Earbud Bluetooth 5.1 Headphones Deep Bass Bluetooth Earbud with 4 Noise Cancelling Mics, 2022 WirelessHeadphones in Ear with IP7 Waterproof, 35Hrs Ear Buds, LED Display Bluetooth 5.3 and Auto Pairing: Adopting advanced Bluetooth 5.3 chip, wireless earbud support HSP HFP A2DP AVRCP, which offer you an unparalleled audio experience with faster transmission speed, stronger connection stability, and longer range of bluetooth. Taken from the charging case, wireless headphones will be paired with each other automatically, and just hit on bluetooth list MD016 earbud on your device to connect, avoiding complex operations. Stereo Deep Bass and HD Calls: These bluetooth headphones with 13mm g