# Usage With Qdrant

This notebook demonstrates how to use FastEmbed and Qdrant to perform vector search and retrieval. Qdrant is an open-source vector similarity search engine that is used to store, organize, and query collections of high-dimensional vectors. 

We will use Qdrant to store a collection of documents and then query the collection to retrieve the most relevant ones.

It consists of the following sections:

1. Setup: Installing the necessary packages, including the Qdrant client and FastEmbed, and importing the libraries
2. Data Preparation: Defining the example documents
3. Adding Documents: Adding the documents to a Qdrant collection
4. Running Queries: Adding documents with metadata and IDs, then running an example query

## Setup

First, we need to install the dependencies: `fastembed` to create embeddings and perform retrieval, and `qdrant-client` to interact with the Qdrant database.

In [1]:
!pip install 'qdrant-client[fastembed]' --quiet --upgrade

Importing the necessary libraries:

In [2]:
from typing import List
import numpy as np
from fastembed.embedding import FlagEmbedding as Embedding
from qdrant_client import QdrantClient

## Data Preparation
We start with a small list of example documents. When they are added to Qdrant in the next section, FastEmbed generates their embeddings automatically.

### 💡 Tip: Prefer using `query_embed` for queries and `passage_embed` for documents.

In [3]:
# Example list of documents
documents: List[str] = [
    "Maharana Pratap was a Rajput warrior king from Mewar",
    "He fought against the Mughal Empire led by Akbar",
    "The Battle of Haldighati in 1576 was his most famous battle",
    "He refused to submit to Akbar and continued guerrilla warfare",
    "His capital was Chittorgarh, which he lost to the Mughals",
    "He died in 1597 at the age of 57",
    "Maharana Pratap is considered a symbol of Rajput resistance against foreign rule",
    "His legacy is celebrated in Rajasthan through festivals and monuments",
    "He had 11 wives and 17 sons, including Amar Singh I who succeeded him as ruler of Mewar",
    "His life has been depicted in various films, TV shows, and books",
]
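
The tip above can be illustrated without Qdrant. The following is a minimal sketch, not part of FastEmbed itself: `rank_documents` and `cosine_similarity` are hypothetical helper names, and `model` is assumed to be any embedding model that exposes `passage_embed` and `query_embed`, such as the `Embedding` class imported earlier. It embeds the documents as passages, embeds the query as a query, and ranks the documents by cosine similarity.

```python
import math
from typing import Any, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(model: Any, documents: List[str], query: str) -> List[Tuple[str, float]]:
    """Rank documents against a query, best match first.

    Assumption: `model` provides `passage_embed(texts)` and `query_embed(query)`,
    each returning an iterable of vectors.
    """
    # Documents go through the passage encoder, the query through the query encoder.
    passage_vectors = list(model.passage_embed(documents))
    query_vector = next(iter(model.query_embed(query)))

    scored = [
        (doc, cosine_similarity(query_vector, vec))
        for doc, vec in zip(documents, passage_vectors)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With a real model, a call such as `rank_documents(model, documents, "Who was Maharana Pratap?")` would return the documents paired with their similarity scores. The query string here is only an example.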

Next, we use `QdrantClient` to add these documents to a collection and query the collection for relevant documents.

## ➕ Adding Documents

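The `add` method creates the collection if it does not already exist and embeds the documents with FastEmbed before storing them. Here we use an in-memory Qdrant instance (`":memory:"`), so no server is required: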

In [4]:
client = QdrantClient(":memory:")
client.add(collection_name="test_collection", documents=documents)

['77e1e4724dd243b08608f57d5692f6aa',
 '74841e5dc3594646bda2c6a6d2795dbd',
 '6ef39a9445604d0da84d04f760cd7cf7',
 'e659503d3b3748ef90f23c778274835b',
 'b999675068cd413f93faa0cc890c3819',
 '8e452f2935cf4e4b80d8eea68c2aad58',
 '28ed4fd4592c48c9a0519618d51bb86e',
 '59378c784c5f49109bef65fdc4061334',
 'a78c9b598f7942749156334283a6f24f',
 'f72bb24701c64fabb0182c9e757b581b']

These are the IDs of the documents we just added. We don't need them in this tutorial, but they can be used to update or delete documents.
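
As a rough sketch of how such IDs might be used, the hypothetical helper below removes documents by ID. It assumes `client` is a `QdrantClient` and relies on its `delete` method accepting a plain list of point IDs as `points_selector`. The name `delete_documents` is introduced only for illustration.

```python
from typing import Any, List, Union

PointId = Union[int, str]


def delete_documents(client: Any, collection_name: str, ids: List[PointId]) -> Any:
    """Delete the points with the given IDs from a collection.

    Assumption: `client` is a QdrantClient (or any object with a compatible
    `delete(collection_name=..., points_selector=...)` method).
    """
    if not ids:
        # Nothing to delete, so skip the call entirely.
        return None
    return client.delete(collection_name=collection_name, points_selector=ids)
```

For example, `delete_documents(client, "test_collection", returned_ids[:2])` would remove the first two documents, where `returned_ids` is a hypothetical variable holding the list returned by `client.add`.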

## 📝 Running Queries
We'll add documents to a second collection, this time with explicit metadata and IDs, and then run a sample query against it.

In [5]:
# Prepare your documents, metadata, and IDs
docs = ["Qdrant has Langchain integrations", "Qdrant also has Llama Index integrations"]
metadata = [
    {"source": "Langchain-docs"},
    {"source": "Linkedin-docs"},
]
ids = [42, 2]

# Use the new add method
client.add(
    collection_name="demo_collection",
    documents=docs,
    metadata=metadata,
    ids=ids
)

[42, 2]

In [6]:
search_result = client.query(
    collection_name="demo_collection",
    query_text="This is a query document"
)
print(search_result)

[QueryResponse(id='42', embedding=None, metadata={'document': 'Qdrant has Langchain integrations', 'source': 'Langchain-docs'}, document='Qdrant has Langchain integrations', score=0.8496814051311954), QueryResponse(id='2', embedding=None, metadata={'document': 'Qdrant also has Llama Index integrations', 'source': 'Linkedin-docs'}, document='Qdrant also has Llama Index integrations', score=0.8478494193031256)]
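
The raw list above is hard to read, so the query can be wrapped in a small helper that prints the top k matches. This is a sketch under stated assumptions: `format_top_k` and `print_top_k` are hypothetical names, `client` is assumed to be a `QdrantClient`, and each result is assumed to carry the `document` and `score` fields shown in the output above.

```python
from typing import Any, List


def format_top_k(client: Any, collection_name: str, query_text: str, k: int = 3) -> List[str]:
    """Return one formatted line per result for the k best matches.

    Assumption: `client.query` accepts `collection_name`, `query_text`, and `limit`,
    and returns objects with `document` and `score` attributes.
    """
    results = client.query(
        collection_name=collection_name,
        query_text=query_text,
        limit=k,
    )
    return [
        f"{rank}. {result.document} (score: {result.score:.3f})"
        for rank, result in enumerate(results, start=1)
    ]


def print_top_k(client: Any, collection_name: str, query_text: str, k: int = 3) -> None:
    """Print the k best matches for a query, one per line."""
    for line in format_top_k(client, collection_name, query_text, k):
        print(line)
```

A call such as `print_top_k(client, "demo_collection", "This is a query document", k=2)` would then print one ranked line per document.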


## 🎬 Conclusion

This tutorial demonstrates the basics of working with the QdrantClient to add and query documents. By following this guide, you can easily integrate Qdrant into your projects for vector similarity search and retrieval.

Remember to close the client connection when you are done, and adjust the query parameters to suit your needs.
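
One way to make sure the client is always closed is to wrap it in `contextlib.closing`. The sketch below is illustrative only: `query_and_close` and `client_factory` are hypothetical names, and the client is assumed to provide `query` and `close` methods, as `QdrantClient` does.

```python
from contextlib import closing
from typing import Any, Callable, List


def query_and_close(
    client_factory: Callable[[], Any],
    collection_name: str,
    query_text: str,
    limit: int = 5,
) -> List[Any]:
    """Create a client, run one query, and close the client even if the query fails.

    Assumption: the object returned by `client_factory` has `query(...)` and `close()`.
    """
    # `closing` calls client.close() when the block exits, including on exceptions.
    with closing(client_factory()) as client:
        return client.query(
            collection_name=collection_name,
            query_text=query_text,
            limit=limit,
        )
```

This pattern suits a persistent or remote Qdrant instance. A fresh `":memory:"` client starts empty, so it would have nothing to query.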

For more details on customization and advanced features, see the [official Qdrant Python client documentation](https://github.com/qdrant/qdrant-client).