# WTF is a Vector Database?

You're probably familiar with traditional databases, like relational databases or NoSQL databases. 

A relational database stores data in tables, with each row representing a record and each column representing a particular attribute, like name, age, or address. Querying it is straightforward: you use SQL to retrieve records that match specific criteria. For example:

```sql
-- Find all users with the name 'Harpreet'
SELECT * FROM users WHERE name = 'Harpreet';

-- Get all products with a price greater than $100
SELECT * FROM products WHERE price > 100;
```

But what if you need to store and query data that can't be easily represented as rows and columns? 

What if your data is more complex, like images, audio files, or abstract concepts like user preferences or semantic meanings? Imagine trying to find all images that look like a specific pair of shoes in a traditional database. You'd have to manually tag each image with relevant keywords and then search for those tags. Unstructured data like images, sounds, complex text documents, or even molecular data can't easily be parsed into discrete rows and columns.

But what if there was a way to represent complex, unstructured data in a format that captures its inherent relationships and allows for efficient similarity-based searching? This is where vectors come to the rescue.

<img src="https://y.yarn.co/c38f4af9-d116-43b1-a7d2-fcfe7e13c0d0_text.gif" style="display: block; margin: auto;">

If the term sounds mathematical, that's because it is. A vector is a mathematical object with both magnitude (length) and direction. In our context, a **vector** is an ordered list of numbers that represents a piece of data as a point in a high-dimensional space. Each element in the vector is that point's coordinate along one dimension of the space.

For example:

 - Text documents can be represented as vectors where each dimension corresponds to a word, and the value indicates the importance of that word in the document

 - Images can be converted into vectors where each pixel is a dimension

 - Audio clips can be transformed into vectors based on various audio features

This becomes especially handy when you want to numerically represent and compare complex, unstructured data. Vectors representing similar objects or concepts will be close to each other in multidimensional space.
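To make "close to each other" concrete, here's a minimal sketch that turns three short texts into word-count vectors (one dimension per word) and compares them with cosine similarity. The documents and the helper names `bag_of_words` and `cosine_similarity` are made up for illustration, and raw word counts are a much cruder representation than the learned embeddings you'd use in practice.

```python
import math
from collections import Counter

def bag_of_words(text, vocabulary):
    """Turn text into a vector with one dimension per vocabulary word (raw counts)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction, 0.0 = nothing shared."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical documents, purely for illustration.
docs = {
    "running_shoes": "lightweight running shoes for road running",
    "trail_shoes": "durable trail running shoes",
    "pasta_recipe": "quick pasta recipe with tomato sauce",
}

# The vocabulary defines the dimensions of the vector space.
vocabulary = sorted({word for text in docs.values() for word in text.split()})
vectors = {name: bag_of_words(text, vocabulary) for name, text in docs.items()}

shoes_vs_shoes = cosine_similarity(vectors["running_shoes"], vectors["trail_shoes"])
shoes_vs_pasta = cosine_similarity(vectors["running_shoes"], vectors["pasta_recipe"])
print(f"running vs trail shoes: {shoes_vs_shoes:.2f}")
print(f"running shoes vs pasta: {shoes_vs_pasta:.2f}")
```

The two shoe descriptions share words, so their vectors point in a similar direction and score higher than the shoe/pasta pair, which shares nothing.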

Imagine you're building a recommendation system for a music streaming service. You could represent each song as a vector, where the dimensions of the vector correspond to different features like genre, tempo, mood, and lyrics. 

<img src="https://cdn.vectorstock.com/i/1000v/93/91/audio-waveform-signals-wave-song-equalizer-vector-21029391.avif" >

When you represent a song as a vector, each dimension captures some aspect or feature of the song. 

However, the individual values in the vector don't have an explicit, human-interpretable meaning on their own. For example, suppose a song vector has values [0.2, -0.5, 0.8, 0.1]. You can't point to the 0.8 and definitively say, "This means the song is very danceable." Instead, that 0.8 contributes to the song's "danceability" only in combination with the other values in the vector, and it takes on meaning only when compared against other songs' vectors.

The expressiveness of vectors lets you capture complex relationships between songs and users in a way that isn't possible with traditional database rows.

### So, why can't you just use a regular database to store and query these vectors? 

Traditional databases are designed for exact matches and range filters over discrete, well-defined fields, not for comparing points in a continuous, high-dimensional space. They're great for storing and querying structured data, but they're not optimized for searching, filtering, or ranking data based on complex, high-dimensional relationships.

For example, suppose you wanted to find all songs in your music database that have a similar vibe to ["Particles" by Lucy in Disguise](https://youtu.be/Cd7FMSwBm9k) (just a random song that I happen to be listening to while writing this). With a traditional database, you'd have to search through discrete fields like genre, artist, etc. However, songs with similar vibes may span multiple genres and artists. On top of that, you'd be assuming that a track's vibe can be captured by its discrete attributes at all.

Instead, you can represent each song as a high-dimensional vector capturing attributes like tempo, mood, and lyrics. Then, to find similar songs, you just find the vectors closest to your target vector in that space.

The catch is scale: you might need to search through billions of high-dimensional vectors to find songs with a similar vibe. That's computationally expensive, and traditional databases, built to search discrete, well-defined fields, simply aren't designed for the task.
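Here's a rough sketch of what "find the closest vectors" means in code. The song vectors are invented (apart from reusing the example values from earlier for "Particles"), the other track titles are placeholders, and `nearest_songs` is a hypothetical helper doing an exact, brute-force search.

```python
import math

# Made-up 4-dimensional song vectors; real embeddings come from a model
# and typically have hundreds of dimensions.
songs = {
    "Particles": [0.2, -0.5, 0.8, 0.1],
    "Hypothetical Track A": [0.25, -0.4, 0.75, 0.0],
    "Hypothetical Track B": [-0.7, 0.6, 0.1, 0.9],
    "Hypothetical Track C": [0.1, -0.6, 0.9, 0.2],
}

def euclidean_distance(a, b):
    """Straight-line distance between two points in the vector space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_songs(target_title, catalog, k=2):
    """Exact (brute-force) k-nearest-neighbor search: compare the target to every song."""
    target = catalog[target_title]
    scored = [
        (euclidean_distance(target, vector), title)
        for title, vector in catalog.items()
        if title != target_title
    ]
    # Smallest distance first, then keep the k closest titles.
    return [title for _, title in sorted(scored)[:k]]

print(nearest_songs("Particles", songs))
```

Notice that every query touches every vector in the catalog. That's fine for four songs, but the cost grows with both the number of songs and the number of dimensions, which is exactly the problem at billion-vector scale.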

This is where vector databases come in—they're databases designed specifically to store, search, and efficiently query this type of data. 

They're optimized for high-performance similarity searches, clustering, and other critical operations in applications such as recommendation systems, computer vision, and natural language processing.

### So, what is a vector database, exactly? 

A vector database is a database that's specifically designed to store, manage, query, and perform operations on large collections of vectors. 

Vector databases use specialized indexing and search algorithms, like approximate nearest neighbor (ANN) search, to quickly search, filter, and rank vectors based on their similarity, distance, or other relationships. "Approximate" is the key word: these algorithms trade a small amount of accuracy for a large gain in speed.
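To give a feel for how an approximate search avoids scanning everything, here's a toy sketch of one classic technique, random-hyperplane locality-sensitive hashing (LSH). This is my own illustration, not how any particular vector database works; production systems typically rely on more sophisticated indexes such as HNSW graphs. The class name `RandomHyperplaneIndex` and the random data are hypothetical.

```python
import math
import random
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class RandomHyperplaneIndex:
    """Toy locality-sensitive hashing index (illustration only)."""

    def __init__(self, dim, num_planes=4, seed=0):
        rng = random.Random(seed)
        # Each random hyperplane splits the space in two; the side a vector
        # falls on contributes one bit to its bucket key.
        self.planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_planes)]
        self.buckets = defaultdict(list)

    def _key(self, vector):
        return tuple(sum(p * v for p, v in zip(plane, vector)) >= 0 for plane in self.planes)

    def add(self, item_id, vector):
        self.buckets[self._key(vector)].append((item_id, vector))

    def query(self, vector, k=3):
        # Only vectors in the query's bucket are scored. A true neighbor that
        # landed in another bucket is missed: that's the "approximate" part.
        candidates = self.buckets.get(self._key(vector), [])
        ranked = sorted(candidates, key=lambda item: cosine(item[1], vector), reverse=True)
        return [item_id for item_id, _ in ranked[:k]]

# Index 1,000 random 8-dimensional vectors.
rng = random.Random(42)
data = {i: [rng.gauss(0, 1) for _ in range(8)] for i in range(1000)}
index = RandomHyperplaneIndex(dim=8)
for item_id, vector in data.items():
    index.add(item_id, vector)

results = index.query(data[0], k=3)
scanned = len(index.buckets[index._key(data[0])])
print(f"top matches: {results}, scanned {scanned} of {len(data)} vectors")
```

Vectors pointing in similar directions tend to fall on the same side of each hyperplane, so they tend to share a bucket. The query then scores only that bucket instead of the full collection.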

Unlike traditional databases that match exact queries, vector databases help you search through data in a way that mimics human-like perception, finding items that are conceptually "similar," even if not identical. This is particularly important for applications like recommendation systems, semantic search, and other machine learning use cases.

Some key advantages of vector databases include:

 - Optimized for storing and querying high-dimensional vector data

 - Support fast approximate similarity search

 - Enable searching and recommendations based on semantic meaning and relevance

 - Scale to massive datasets

 - Integrate well with ML/AI workflows and libraries

While traditional databases are great for storing and querying structured data, vector databases are purpose-built for the unique challenges of managing, searching and analyzing vast amounts of unstructured, high-dimensional data. As the volume of unstructured data continues to explode and AI/ML becomes increasingly critical, vector databases will play a key role in powering this and the next generation of context-aware applications. 

## The Role of Vector Databases in RAG

In a typical retrieval-augmented generation (RAG) workflow, the external data used to augment the LLM's knowledge is first converted into vector embeddings.

These numerical representations capture the data's semantic meaning and context, allowing similar items to be grouped closer together in vector space. The vector embeddings are then stored in a vector database optimized for performing fast and accurate similarity searches. 

When a user submits a query, the RAG system follows these steps:

1. The query is converted into a vector embedding.

2. The vector database is searched to find the data points most similar to the query vector.

3. The retrieved data is used to augment the original query prompt.

4. The augmented prompt is fed to the LLM to generate a response.
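The four steps above can be sketched end to end in a few lines. Everything here is a stand-in: `embed` uses plain word counts instead of a real embedding model, `store` is a Python list instead of a vector database, and `fake_llm` is a placeholder that doesn't call any model. Treat it as a picture of the data flow, not an implementation.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())

def similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def fake_llm(prompt):
    """Placeholder for a real LLM call; it just reports what it received."""
    return f"[LLM would answer using a prompt of {len(prompt)} characters]"

documents = [
    "Qdrant is a vector database written in Rust.",
    "Sourdough bread needs long fermentation.",
    "Vector databases store embeddings for similarity search.",
]
# Indexing: embed every document once, ahead of time.
store = [(doc, embed(doc)) for doc in documents]

def answer(query, top_k=2):
    # Step 1: convert the query into a vector.
    query_vector = embed(query)
    # Step 2: find the stored items most similar to the query vector.
    ranked = sorted(store, key=lambda item: similarity(query_vector, item[1]), reverse=True)
    context = [doc for doc, _ in ranked[:top_k]]
    # Step 3: augment the original query with the retrieved context.
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {query}"
    # Step 4: hand the augmented prompt to the LLM.
    return context, fake_llm(prompt)

context, response = answer("what is a vector database")
print(context)
print(response)
```

Because the stand-in embedding only counts shared words, it can't tell that two differently worded texts mean the same thing. Closing that gap is the job of a real embedding model.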

In traditional database systems, the search mechanism is designed to match exact queries to entries in the database. A RAG system, however, often needs to retrieve conceptually similar information that doesn't match the query verbatim. For instance, a query about "tips for writing a novel" might benefit from retrieving information on "creative writing techniques," even though that exact phrase wasn't used in the query. This level of semantic understanding and contextual retrieval is beyond the capabilities of traditional databases.

This is where vector databases come in.

Vector databases address these challenges by using vectors to represent and understand the semantic similarities between different pieces of data. Using a vector database, the RAG system sifts through massive amounts of data to find the most relevant context for each query. 

We'll discuss how to do this in-depth as we progress. But for now, I want to discuss choosing a vector database.

# How to Choose the Right Vector Database

With the growing popularity of Generative AI, the market for vector databases has exploded.

There are now numerous options, each with its own strengths and weaknesses. So, how do you pick the right one for your project?

<img src="https://media.giphy.com/media/3o7btPCcdNniyf0ArS/giphy.gif" style="display: block; margin: auto;">

## Performance and Scalability

One primary reason to use a vector database is to achieve fast and efficient similarity search over large datasets. Therefore, performance is a critical consideration. 

Look for databases that offer:

- Fast indexing and querying speeds

- Ability to handle high-dimensional vectors (e.g., 512 dimensions or more)

- Scalability to billions of vectors

- Distributed architecture for handling massive datasets

Benchmark the databases using datasets and query patterns similar to your production use case for a realistic performance assessment.
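If you want a starting point for your own benchmark, two numbers are worth tracking together: how fast queries come back, and how many of the true nearest neighbors the database actually returns (recall). Here's a small sketch of both measurements. The helper names and the sample IDs are hypothetical, and `search_fn` stands for whatever client call you're testing.

```python
import time

def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbors that the approximate search also returned."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k

def time_queries(search_fn, queries):
    """Average wall-clock seconds per query for any callable search function."""
    start = time.perf_counter()
    for query in queries:
        search_fn(query)
    return (time.perf_counter() - start) / len(queries)

# Hypothetical results for one query: ground truth vs. what an index returned.
exact = [7, 3, 9, 1, 4]
approx = [7, 9, 2, 3, 8]
print(recall_at_k(approx, exact, k=5))
```

Speed on its own can be misleading, since an index can always get faster by returning worse results. Comparing databases at a similar recall level keeps the comparison fair.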

## Ease of Use and Integration

Consider how easy it is to get started with the database and integrate it into your existing workflow. Some factors to look for:

- Well-documented APIs and client libraries in your preferred programming language

- Compatibility with popular ML frameworks (e.g., PyTorch, TensorFlow)

- Managed cloud offerings or easy-to-deploy containers for quick setup

- Availability of GUI tools for data exploration and visualization

The easier it is to use and integrate the database, the faster you can iterate on your RAG application.

## Data and Query Flexibility

Different vector databases offer different levels of flexibility regarding the data they can store and the queries they support. Consider your requirements:

- Do you need to store and query additional metadata alongside the vectors?

- Do you require support for advanced query types like k-NN, range searches, or filtering by metadata?

- Are you working with dense vectors, sparse vectors, or a mix of both?

Choose a database that aligns with your data and querying needs to avoid costly workarounds or limitations down the line.
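To make those query types concrete, here's a small in-memory sketch of k-NN search, range search, and metadata filtering over a handful of made-up points. The `search` function and its parameters (`k`, `max_distance`, `where`) are hypothetical and don't mirror any specific database's API.

```python
import math

# Each point pairs a vector with metadata (often called a payload).
points = [
    {"id": 1, "vector": [0.9, 0.1], "payload": {"genre": "electronic", "year": 2021}},
    {"id": 2, "vector": [0.8, 0.2], "payload": {"genre": "rock", "year": 2019}},
    {"id": 3, "vector": [0.1, 0.9], "payload": {"genre": "electronic", "year": 2015}},
    {"id": 4, "vector": [0.7, 0.3], "payload": {"genre": "electronic", "year": 2023}},
]

def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search(query, k=None, max_distance=None, where=None):
    """k-NN when k is set, range search when max_distance is set, both filtered by metadata."""
    hits = []
    for point in points:
        if where and any(point["payload"].get(key) != value for key, value in where.items()):
            continue  # metadata filter: skip points whose payload doesn't match
        d = distance(query, point["vector"])
        if max_distance is None or d <= max_distance:
            hits.append((d, point["id"]))
    hits.sort()  # closest first
    return [point_id for _, point_id in hits[:k]]

query = [1.0, 0.0]
print(search(query, k=2))                                  # plain k-NN
print(search(query, k=2, where={"genre": "electronic"}))   # k-NN with a metadata filter
print(search(query, max_distance=0.3))                     # range search
```

If your application needs filters like these, check how each database applies them. Filtering during the index search and filtering the results afterwards can behave very differently on large collections.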

## Community and Ecosystem

The strength of the community and ecosystem around a vector database can significantly impact its long-term viability and the support you receive. Look for:

- Active development and regular releases

- Responsive community forums or chat channels for getting help

- Extensive documentation, tutorials, and examples

- Integrations with popular tools and frameworks in your domain

- Commercial support options if you require enterprise-grade assistance

A thriving community and ecosystem can make a big difference in the success of your project.

## Cost and Licensing

Finally, consider the total cost of ownership and licensing model of the vector database:

- Is it open-source or proprietary?

- What are the costs for development, production, and scaling?

- Are there any usage or feature limitations in the free/community edition?

- How does the pricing compare to other options with similar capabilities?

Carefully evaluate the costs and licensing to ensure they align with your budget and long-term requirements.

# Why I Chose Qdrant

After carefully evaluating the vector database landscape and considering the unique requirements of my RAG project, I ultimately decided to use Qdrant as my vector database of choice. 

<img src="https://qdrant.tech/images/logo_with_text.svg" alt="Qdrant Logo" style="display: block; margin: auto; width: 400px">

Several key factors led me to this decision, but ultimately, it came down to Qdrant's impressive performance, flexibility, and developer-friendly ecosystem.

## Unmatched Speed and Scalability

One of the standout features of Qdrant is its lightning-fast performance. 

Qdrant compares favorably with other vector databases in both indexing and querying speed, thanks to highly optimized indexing structures and search algorithms designed to handle high-dimensional vectors efficiently.

But it's not just about raw speed – Qdrant also scales incredibly well. 

It can handle billion-scale vector collections without breaking a sweat, making it a great fit for large-scale RAG applications. Distributing the database across multiple nodes allows for horizontal scalability, so performance holds up even as the dataset grows.

## Flexible Data Model and Querying

Another area where Qdrant shines is its flexible data model. 

In addition to storing vectors, Qdrant allows you to attach arbitrary metadata to each vector. This is incredibly handy for RAG, enabling the storage and querying of additional context alongside the vector embeddings.

Qdrant's querying capabilities are also top-notch. 

It supports various query types, including k-NN similarity search, range searches, and metadata filtering. This flexibility allows for crafting precise retrieval queries that go beyond simple similarity matching.

## Easy Integration and Deployment

From a developer perspective, Qdrant is delightful to work with. 

It has well-documented APIs and client libraries in popular languages like Python, making integration into existing parts of the GenAI stack a breeze. The compatibility with frameworks like PyTorch and TensorFlow is a huge plus, allowing for seamless integration with the rest of the RAG pipeline.

Deployment is also straightforward, with Qdrant offering a managed cloud service and easy-to-use Docker containers for self-hosting. 

This flexibility allows for getting up and running quickly without requiring extensive infrastructure setup. 

<img src="https://raw.githubusercontent.com/ramonpzg/mlops-sydney-2023/main/images/qdrant_overview_high_level.png">


## Vibrant Community and Ecosystem

Qdrant has built up a vibrant community and ecosystem. 

The development team is active and responsive, with regular releases and improvements based on user feedback. The documentation is comprehensive and well-structured, with plenty of examples and tutorials to help you get started. The [Discord community](https://discord.gg/qdrant) is a great place to get help and share knowledge with other Qdrant users.

## Attractive Cost Structure

Finally, Qdrant's cost structure was a significant factor in my decision. 

As an open-source project, Qdrant is free to use when you self-host it. Its managed cloud service also has a generous free tier!

Even for larger-scale deployments, Qdrant's managed cloud service offers competitive pricing compared to other vector database providers. The transparent and predictable pricing model makes it easy to estimate and control costs as the application scales.

The developer-friendly ecosystem and active community sealed the deal, giving me confidence in Qdrant's long-term viability and support. 

I'm excited to use Qdrant throughout this blog series and in my book, and to show you how to build a RAG application on top of it, leveraging its powerful capabilities to create truly intelligent and context-aware experiences.

In the next post in this series, I'll teach you how to set up your environment and get started with Qdrant so we can use it in our RAG workflows.