# Searching for Vectors with Qdrant

## ↗️🔍 Introduction to Vector Search

- **Vector search** is a technique for finding similar points in a dataset by comparing vector representations.

- It focuses on semantic similarity rather than exact keyword matching.

- Think of it as searching a sea of vectors for the ones closest to a "query vector", much like finding a needle in a haystack.

## 🧠 Transforming Text into Vectors

- In a previous video, we learned to transform text into vectors using an embedding model.

- These vectors represent points in a multi-dimensional space.

## 📍 Similarity Search: Finding the Closest Neighbors

- **Similarity** refers to the closeness of vectors in this space, indicating similar characteristics or meanings.

### Key Similarity Metrics

1. ### 📐 **Cosine Similarity**

<img src ="https://cdn.hashnode.com/res/hashnode/image/upload/v1679636266737/91f1390a-5f10-4e95-ad66-a7a890eee644.jpeg">

[Image Source: Ankit Dash on HashNode](https://ankitdash.hashnode.dev/hierarchical-navigable-small-worlds-algorithm-hnsw)

   - Ideal when only the direction of vectors matters, not the magnitude.

   - Commonly used in text similarity, like document clustering.

   - Measures the cosine of the angle between two vectors.

2. ### 🔴 **Dot Product**

   - Considers both the direction and magnitude of vectors.

   - Useful in recommendation systems where magnitude signifies preference strength.

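   - If your vectors are normalized (i.e., unit vectors), cosine similarity and dot product yield identical results, because cosine similarity is just the dot product divided by the two vector lengths, which are both 1.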

3. ### ፨ **Euclidean Distance**


<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1679665547797/1093c3b2-d811-4e00-afff-c6ede1a30a8d.png">

[Image Source: Ankit Dash on HashNode](https://ankitdash.hashnode.dev/hierarchical-navigable-small-worlds-algorithm-hnsw)

   - Measures the straight-line distance between vectors.

   - Suitable for image similarity where differences in feature values are crucial.

4. ### 🗽 **Manhattan Distance**

<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1679636425137/599ed271-96e9-44da-880e-13c71b4d616d.jpeg">

[Image Source: Ankit Dash on HashNode](https://ankitdash.hashnode.dev/hierarchical-navigable-small-worlds-algorithm-hnsw)

   - Calculates the sum of absolute differences between vectors.

   - Often used for comparing binary or categorical features.
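To make the four metrics above concrete, here is a minimal sketch in plain Python. The function names and the two example vectors are my own, chosen only for illustration; in practice a library such as NumPy, or the vector database itself, computes these for you.

```python
import math

def dot(a, b):
    """Dot product: sensitive to both direction and magnitude."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))

def cosine_similarity(a, b):
    """Cosine of the angle between a and b: direction only."""
    return dot(a, b) / (norm(a) * norm(b))

def euclidean_distance(a, b):
    """Straight-line distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def normalize(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]

a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the length

print(cosine_similarity(a, b))   # 1.0: identical direction
print(dot(a, b))                 # 50.0: grows with magnitude
print(euclidean_distance(a, b))  # 5.0
print(manhattan_distance(a, b))  # 7.0

# After normalization, dot product and cosine similarity agree.
print(math.isclose(dot(normalize(a), normalize(b)), cosine_similarity(a, b)))  # True
```

Note how `a` and `b` are "identical" under cosine similarity but five units apart under Euclidean distance. That difference is why the choice of metric matters.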

## 🌌 Navigating the Vector Space: HNSW and ANN

- Efficient search through massive collections requires tools like [**Hierarchical Navigable Small World (HNSW)**](https://ankitdash.hashnode.dev/hierarchical-navigable-small-worlds-algorithm-hnsw).

- HNSW is an **Approximate Nearest Neighbor (ANN)** algorithm that helps find similar vectors efficiently.

### How HNSW Works

1. **Continuous to Discrete Transformation**

   - Utilizes the K-nearest neighbour algorithm to place each vector in the graph as a node linked to its closest 'K' neighbours.

   - The value of 'K' caps how many links each node keeps; together, these links between nearest neighbours form a graph.


2. **Layered Graph Structure**

   - Each node is randomly assigned a top layer, with higher layers exponentially less likely, creating a hierarchical structure.

   - Top layers have fewer nodes, acting as entry points and speeding up navigation through vector space.

3. **Knowledge Compression**

   - Long-range links in the upper layers help the search avoid local minima, enhancing search efficiency.

   - Discretisation acts as a form of data compression, which is crucial for both understanding and efficient searching.

<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1679426886847/cbfe7dce-09db-4a64-92ce-f8db2ecffa80.jpeg?auto=compress,format&format=webp">



- Enables sublinear search times by avoiding full database scans.

- Provides "good enough" results quickly by prioritizing speed over absolute precision.
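The core idea of navigating a neighbour graph instead of scanning everything can be sketched in a few lines. This is a toy, single-layer illustration, not HNSW itself: it builds the graph by brute force, has no layers, and keeps no candidate list. The function names and the grid dataset are assumptions made for the example.

```python
import math

def build_knn_graph(points, k):
    """Link every point to its k nearest neighbours (brute force; fine for a toy)."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        graph[i] = others[:k]
    return graph

def greedy_search(points, graph, query, entry):
    """Walk the graph, always moving to the neighbour closest to the query.

    Returns the index where the walk stops and how many distances were computed.
    """
    current = entry
    current_d = math.dist(points[current], query)
    computed = 1
    while True:
        best, best_d = current, current_d
        for n in graph[current]:
            d = math.dist(points[n], query)
            computed += 1
            if d < best_d:
                best, best_d = n, d
        if best == current:
            # No neighbour is closer: we are at a (possibly local) minimum.
            return current, computed
        current, current_d = best, best_d

# A 20 x 20 grid of points, so the structure is easy to reason about.
points = [(float(x), float(y)) for x in range(20) for y in range(20)]
graph = build_knn_graph(points, k=4)

query = (12.2, 7.9)
found, computed = greedy_search(points, graph, query, entry=0)
exact = min(range(len(points)), key=lambda i: math.dist(points[i], query))

print(points[found], found == exact)
print(f"distances computed: {computed} of {len(points)} points")
```

On this well-behaved grid the greedy walk lands on the true nearest neighbour while touching only a fraction of the points. On real, messy data a single-layer walk can get stuck in a local minimum, which is the problem the layered structure is meant to reduce.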

In [1]:
import os
from dotenv import load_dotenv

from qdrant_client import QdrantClient

load_dotenv(".env")

q_client = QdrantClient(
    url=os.getenv('QDRANT_URL'),
    api_key=os.getenv('QDRANT_API_KEY')
)

### 🔍 Searching vectors:

- 📝 Obtain user input text

- 🔢 Transform input into vector embedding

- 🎯 Utilize Qdrant to find closest vectors

- 🤝 Retrieve list of best-matching vectors representing most similar items
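Before wiring this up with OpenAI and Qdrant, the four steps above can be sketched end to end in plain Python. Everything here is a stand-in: `toy_embed` is a hypothetical word-count "embedding" over a tiny made-up vocabulary, and `brute_force_search` compares the query against every document rather than using an index.

```python
import math
from collections import Counter

# Tiny made-up vocabulary; a real embedding model learns its own dimensions.
VOCAB = ["audio", "diffusion", "language", "model", "prompting", "sound"]

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: word counts over VOCAB."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]

def cosine(a, b):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    denom = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / denom if denom else 0.0

def brute_force_search(query, docs, limit=2):
    """Embed the query, score every document, and return the top matches."""
    query_vector = toy_embed(query)                                    # steps 1 and 2
    scored = [(cosine(query_vector, toy_embed(d)), d) for d in docs]   # step 3
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]                                              # step 4

docs = [
    "diffusion model for sound generation",
    "few-shot prompting of a language model",
    "audio diffusion",
]

results = brute_force_search("sound diffusion", docs)
for score, doc in results:
    print(round(score, 3), doc)
```

The cells that follow do the same thing, with the OpenAI embedding model in place of `toy_embed` and Qdrant in place of the brute-force loop.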

In [2]:
from openai import OpenAI, OpenAIError

# Reads the API key from the OPENAI_API_KEY environment variable.
openai_client = OpenAI()

def get_text_embedding(
    text: str,
    openai_client: OpenAI = openai_client,
    model: str = "text-embedding-3-large") -> list[float]:
    """
    Get the vector representation of the input text using the specified OpenAI embedding model.

    Args:
        text (str): The input text to be embedded.
        openai_client (OpenAI, optional): An instance of the OpenAI client.
        model (str, optional): The name of the OpenAI embedding model to use. Defaults to "text-embedding-3-large".

    Returns:
        list[float]: The vector representation of the input text as a list of floats.

    Raises:
        OpenAIError: If an error occurs during the API call.
    """
    try:
        embedding = openai_client.embeddings.create(
            input=text,
            model=model
        ).data[0].embedding
        return embedding
    except OpenAIError:
        # OpenAIError lives in the openai package, not on the client instance.
        # Re-raise unchanged so the caller sees the original API error.
        raise


Before using the search function defined later in this notebook, it helps to get a sense of what a search returns. In the cell below, I pass a list of payload keys to `with_payload`, so only those fields come back. In the search function itself, I set `with_payload=True`, which returns the entire payload.

In [3]:
q_client.search(
    collection_name="arxiv_chunks",
    query_vector=("summary", get_text_embedding("machine learning in sound and diffusion")),
    with_payload=["summary", "title", "authors"],
    limit=2
)

[ScoredPoint(id='128135e3-6497-44ff-8e86-6c6ec8377f81', version=0, score=0.41315556, payload={'authors': ['Dongchao Yang', 'Jianwei Yu', 'Helin Wang', 'Wen Wang', 'Chao Weng', 'Yuexian Zou', 'Dong Yu'], 'summary': 'Generating sound effects that humans want is an important topic. However,\nthere are few studies in this area for sound generation. In this study, we\ninvestigate generating sound conditioned on a text prompt and propose a novel\ntext-to-sound generation framework that consists of a text encoder, a Vector\nQuantized Variational Autoencoder (VQ-VAE), a decoder, and a vocoder. The\nframework first uses the decoder to transfer the text features extracted from\nthe text encoder to a mel-spectrogram with the help of VQ-VAE, and then the\nvocoder is used to transform the generated mel-spectrogram into a waveform. We\nfound that the decoder significantly influences the generation performance.\nThus, we focus on designing a good decoder in this study. We begin with the\ntraditional 

If you have a whole bunch of keys in your payload, but there are only a couple that you want to exclude, you can use `PayloadSelectorExclude`.

In [4]:
from qdrant_client import QdrantClient, models

# Return every payload field except the ones listed here.
exclusioner = models.PayloadSelectorExclude(exclude=["chunk", "text_id"])

q_client.search(
    collection_name="arxiv_chunks",
    query_vector=("summary", get_text_embedding("machine learning in sound and diffusion")),
    with_payload=exclusioner,
    limit=2
)

[ScoredPoint(id='128135e3-6497-44ff-8e86-6c6ec8377f81', version=0, score=0.41315556, payload={'authors': ['Dongchao Yang', 'Jianwei Yu', 'Helin Wang', 'Wen Wang', 'Chao Weng', 'Yuexian Zou', 'Dong Yu'], 'source': 'http://arxiv.org/pdf/2207.09983', 'summary': 'Generating sound effects that humans want is an important topic. However,\nthere are few studies in this area for sound generation. In this study, we\ninvestigate generating sound conditioned on a text prompt and propose a novel\ntext-to-sound generation framework that consists of a text encoder, a Vector\nQuantized Variational Autoencoder (VQ-VAE), a decoder, and a vocoder. The\nframework first uses the decoder to transfer the text features extracted from\nthe text encoder to a mel-spectrogram with the help of VQ-VAE, and then the\nvocoder is used to transform the generated mel-spectrogram into a waveform. We\nfound that the decoder significantly influences the generation performance.\nThus, we focus on designing a good decoder i
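The two ways of trimming the payload shown so far, an include list and an exclude selector, boil down to simple key filtering. The sketch below mimics that behaviour on an ordinary dictionary; `select_payload` is a hypothetical helper of my own, not part of the Qdrant client, and the payload values are placeholders.

```python
def select_payload(payload, include=None, exclude=None):
    """Mimic payload selection on a plain dict (illustrative only).

    include: keep only these keys.
    exclude: keep every key except these.
    With neither argument, the whole payload is returned.
    """
    if include is not None:
        return {k: v for k, v in payload.items() if k in include}
    if exclude is not None:
        return {k: v for k, v in payload.items() if k not in exclude}
    return dict(payload)

# Placeholder values; only the key names mirror the collection used here.
payload = {
    "authors": ["<author>"],
    "summary": "<summary>",
    "title": "<title>",
    "source": "<source>",
    "chunk": "<chunk>",
    "text_id": "<text_id>",
}

print(sorted(select_payload(payload, include=["summary", "title", "authors"])))
print(sorted(select_payload(payload, exclude=["chunk", "text_id"])))
```

An include list is convenient when you want a few fields; an exclude selector is shorter when you want almost everything.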

You can also create more interesting and complex [filters](https://qdrant.tech/documentation/concepts/filtering/). This is useful when it's impossible to express all the features of the object in the embedding. I recommend checking out the filtering documentation linked above to get a sense of the options available to you. I'm sure we'll make use of filtering as this series progresses.

Below, I've created a filter on the `authors` field. It says that the client *should* return points where Dong Yu is one of the authors of the paper.

There are other filtering clauses like `must` and `must_not`, in addition to filtering conditions like `Match`, `Match Except`, and `Nested key`. These can be combined to form complex conditions. Again, I recommend checking out the documentation and hacking around on your own.

In [5]:
author_filter = models.Filter(
    should=[
        models.FieldCondition(
            key="authors",
            match=models.MatchValue(value="Dong Yu")
        )
    ]
)

q_client.search(
    collection_name="arxiv_chunks",
    query_vector=("summary", get_text_embedding("machine learning in sound and diffusion")),
    query_filter=author_filter,
    limit=5
)

[ScoredPoint(id='128135e3-6497-44ff-8e86-6c6ec8377f81', version=0, score=0.41323012, payload={'authors': ['Dongchao Yang', 'Jianwei Yu', 'Helin Wang', 'Wen Wang', 'Chao Weng', 'Yuexian Zou', 'Dong Yu'], 'chunk': 'that it can effectively alleviate the unidirectional bias and the\naccumulated prediction error problems. We adopt the idea\nfrom diffusion models, which use a forward process to corrupt\nthe original mel-spectrogram tokens in Tsteps, and then let the\nmodel learn to recover the original tokens in a reverse process.\nSpeciﬁcally, in the forward process, we deﬁne a transition\nmatrix that denotes probability of each token transfer to a\nrandom token or a pre-deﬁned MASK token. By using the\ntransition matrix, the original tokens x0\x18q(x0)transfer\ninto a stationary distribution p(xT). In the reverse process,\nwe let the network learn to recover the original tokens from\nxT\x18p(xT)conditioned on the text features. Figure 1\n(c) shows an example of non-autoregressive mel-spect
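To make the `should`, `must`, and `must_not` clauses mentioned earlier more concrete, here is a simplified model of how they behave, written against plain dictionaries. It is a sketch under assumptions: `matches` and `passes_filter` are hypothetical helpers, the payloads are made up, and only exact-value matching is covered. Qdrant's real filtering has many more condition types.

```python
def matches(payload, key, value):
    """Exact-value match; for list-valued fields, any element may match."""
    stored = payload.get(key)
    if isinstance(stored, list):
        return value in stored
    return stored == value

def passes_filter(payload, must=(), should=(), must_not=()):
    """Simplified clause semantics, each condition given as a (key, value) pair.

    must:     every condition has to match (AND)
    should:   at least one condition has to match (OR), if any are given
    must_not: no condition may match
    The three clauses are combined with AND.
    """
    ok_must = all(matches(payload, k, v) for k, v in must)
    ok_should = any(matches(payload, k, v) for k, v in should) if should else True
    ok_must_not = not any(matches(payload, k, v) for k, v in must_not)
    return ok_must and ok_should and ok_must_not

# Made-up payloads for illustration.
payloads = [
    {"authors": ["Dongchao Yang", "Dong Yu"], "source": "paper-a"},
    {"authors": ["Hypothetical Author"], "source": "paper-b"},
]

condition = ("authors", "Dong Yu")
print([p["source"] for p in payloads if passes_filter(p, should=[condition])])    # ['paper-a']
print([p["source"] for p in payloads if passes_filter(p, must_not=[condition])])  # ['paper-b']
```

With a single condition, `should` and `must` select the same points; the difference only shows up once you list several conditions in one clause.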

Now, define a search function. Its `named_vector_to_search` argument selects which named vector you want to query against. Any other search options, such as a payload filter, can be passed as keyword arguments (`**kwargs`).

In [6]:
def search(
    named_vector_to_search: str,
    input_query: str, 
    limit: int = 5, 
    client: QdrantClient = q_client, 
    collection_name: str = "arxiv_chunks", 
    **kwargs):
    """
    Perform a vector search in the Qdrant database based on the input query.

    This function takes an input query string, converts it into a vector embedding using the
    "text-embedding-3-large" model, and searches for the closest matching vectors in the
    Qdrant database. The search results are returned as a list of dictionaries containing
    the similarity score and selected payload fields.

    Args:
        named_vector_to_search (str): The name of the vector to search against.
        input_query (str): The input query string to search for.
        limit (int, optional): The maximum number of search results to return. Default is 5.
        client (QdrantClient, optional): The Qdrant client to use. Defaults to `q_client`.
        collection_name (str, optional): The collection to search. Defaults to "arxiv_chunks".
        kwargs: Additional keyword arguments to pass to the Qdrant search method.

    Returns:
        list: A list of dictionaries representing the search results. Each dictionary contains
              the following keys:
              - "similarity_score": The similarity score between the input query and the matching item.
              - "summary", "title", "source", "authors": Metadata taken from the payload.

    """

    input_vector = get_text_embedding(input_query)

    search_result = client.search(
        collection_name=collection_name,
        query_vector=(named_vector_to_search, input_vector),
        limit=limit,
        with_payload=True,
        **kwargs
    )

    result = []
    for item in search_result:
        similarity_score = item.score
        payload = item.payload
        data = {
            "similarity_score": similarity_score, 
            "summary": payload.get("summary"),
            "title": payload.get("title"), 
            "source": payload.get("source"),
            "authors": payload.get("authors")
            }
        result.append(data)

    return result

In [7]:
QUERY_STRING = "agents, reasoning, chain-of-thought, few-shot prompting"

search(
    named_vector_to_search="summary",
    input_query=QUERY_STRING
)

[{'similarity_score': 0.58826965,
  'summary': 'The past decade has witnessed dramatic gains in natural language processing\nand an unprecedented scaling of large language models. These developments have\nbeen accelerated by the advent of few-shot techniques such as chain of thought\n(CoT) prompting. Specifically, CoT pushes the performance of large language\nmodels in a few-shot setup by augmenting the prompts with intermediate steps.\nDespite impressive results across various tasks, the reasons behind their\nsuccess have not been explored. This work uses counterfactual prompting to\ndevelop a deeper understanding of CoT-based few-shot prompting mechanisms in\nlarge language models. We first systematically identify and define the key\ncomponents of a prompt: symbols, patterns, and text. Then, we devise and\nconduct an exhaustive set of experiments across four different tasks, by\nquerying the model with counterfactual prompts where only one of these\ncomponents is altered. Our experim

You can set the threshold for similarity as well via the `score_threshold` argument.

In [8]:
search(
    named_vector_to_search="summary",
    input_query=QUERY_STRING,
    score_threshold=0.51
)

[{'similarity_score': 0.58826125,
  'summary': 'The past decade has witnessed dramatic gains in natural language processing\nand an unprecedented scaling of large language models. These developments have\nbeen accelerated by the advent of few-shot techniques such as chain of thought\n(CoT) prompting. Specifically, CoT pushes the performance of large language\nmodels in a few-shot setup by augmenting the prompts with intermediate steps.\nDespite impressive results across various tasks, the reasons behind their\nsuccess have not been explored. This work uses counterfactual prompting to\ndevelop a deeper understanding of CoT-based few-shot prompting mechanisms in\nlarge language models. We first systematically identify and define the key\ncomponents of a prompt: symbols, patterns, and text. Then, we devise and\nconduct an exhaustive set of experiments across four different tasks, by\nquerying the model with counterfactual prompts where only one of these\ncomponents is altered. Our experim
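Conceptually, a score threshold is a cut-off applied to the ranked results. The sketch below shows the idea on a list shaped like the output of the search function. The helper name and the scores are made up, and it assumes a similarity-style metric where higher is better and a score exactly at the threshold is kept; with a distance metric the comparison would point the other way.

```python
def apply_score_threshold(results, score_threshold):
    """Keep only results scoring at or above the threshold (higher = more similar)."""
    return [r for r in results if r["similarity_score"] >= score_threshold]

# Made-up scores, shaped like the dictionaries returned by the search function.
results = [
    {"similarity_score": 0.59, "title": "<paper 1>"},
    {"similarity_score": 0.53, "title": "<paper 2>"},
    {"similarity_score": 0.48, "title": "<paper 3>"},
]

kept = apply_score_threshold(results, score_threshold=0.51)
print([r["title"] for r in kept])  # ['<paper 1>', '<paper 2>']
```

In the notebook, Qdrant applies the threshold on the server, so `limit` may return fewer results than requested when few points clear the bar.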

# That's it for this one!

There is a lot more ground to cover, and things are only going to get more interesting from here on out. I hope you're as excited to learn about it as I am to teach it!