Commit
Add Milvus integration for vector create and search (#1269)
Integrated Milvus vector store into EvaDB. Added a `MilvusVectorStore` class and Milvus type for query parsing and execution.

Below are environment values for the use of the Milvus index:

* `MILVUS_URI` is the URI of the Milvus instance (which would be http://localhost:19530 when running locally). **This value is required**
* `MILVUS_USER` is the name of the user for the Milvus instance.
* `MILVUS_PASSWORD` is the password of the user for the Milvus instance.
* `MILVUS_DB_NAME` is the name of the database to be used. This will default to the `default` database if not provided.
* `MILVUS_TOKEN` is the authorization token for the Milvus instance.

---------

Co-authored-by: Andy Xu <xzdandy@gmail.com>
1 parent af696d6, commit 71b9aca
Showing 15 changed files with 304 additions and 5 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
Milvus | ||
========== | ||
|
||
Milvus is an open-source, distributed vector database designed for similarity search and analytics on large-scale vector data. | ||
The connection to Milvus is based on the `pymilvus <https://pymilvus.readthedocs.io/en/latest>`_ library. | ||
|
||
Dependency | ||
---------- | ||
|
||
* pymilvus | ||
|
||
Parameters | ||
---------- | ||
|
||
To use Milvus, you need the URI of a running Milvus instance. Here are the `instructions to spin up a local instance <https://milvus.io/docs/install_standalone-docker.md>`_. | ||
If you run it locally, the Milvus instance should be reachable at ``http://localhost:19530``. Make sure the Milvus version is 2.3.0 or later. The Milvus integration uses the following values: | ||
|
||
* ``MILVUS_URI`` is the URI of the Milvus instance (which would be ``http://localhost:19530`` when running locally). **This value is required.** | ||
* ``MILVUS_USER`` is the name of the user for the Milvus instance. | ||
* ``MILVUS_PASSWORD`` is the password of the user for the Milvus instance. | ||
* ``MILVUS_DB_NAME`` is the name of the database to be used. This will default to the ``default`` database if not provided. | ||
* ``MILVUS_TOKEN`` is the authorization token for the Milvus instance. | ||
|
||
The above values can be set either via the ``SET`` statement or as the OS environment variables ``MILVUS_URI``, ``MILVUS_USER``, ``MILVUS_PASSWORD``, ``MILVUS_DB_NAME``, and ``MILVUS_TOKEN``. | ||
|
||
|
||
.. code-block:: sql | ||
SET MILVUS_URI = 'http://localhost:19530'; | ||
Create Index | ||
----------------- | ||
|
||
.. code-block:: sql | ||
CREATE INDEX index_name ON table_name (data) USING MILVUS; |
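
The lookup order described in the documentation above (a value given explicitly, for example via ``SET``, wins over the environment variable of the same name, and only ``MILVUS_URI`` is mandatory) can be sketched in plain Python. This is a minimal illustration, not the code from this commit: the helper name `resolve_milvus_params` and the `overrides`/`environ` arguments are hypothetical, and treating an empty string as "not set" is an assumption.

```python
import os
from typing import Dict, Mapping, Optional

# Parameter names taken from the documentation above.
MILVUS_KEYS = (
    "MILVUS_URI",
    "MILVUS_USER",
    "MILVUS_PASSWORD",
    "MILVUS_DB_NAME",
    "MILVUS_TOKEN",
)


def resolve_milvus_params(
    overrides: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: explicit value first, then environment, then ""."""
    environ = os.environ if environ is None else environ
    resolved = {}
    for key in MILVUS_KEYS:
        # An empty or missing override falls through to the environment.
        resolved[key] = overrides.get(key) or environ.get(key, "")
    if not resolved["MILVUS_URI"]:
        # Only the URI is required; the other values stay optional.
        raise ValueError("MILVUS_URI must be set explicitly or in the environment")
    return resolved


if __name__ == "__main__":
    env = {"MILVUS_URI": "http://localhost:19530", "MILVUS_USER": "env_user"}
    print(resolve_milvus_params({"MILVUS_USER": "set_user"}, env))
```

Passing the environment as an argument keeps the sketch testable without modifying the real process environment.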
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,146 @@ | ||
# coding=utf-8 | ||
# Copyright 2018-2023 EvaDB | ||
# | ||
# Licensed under the Apache License, Version 2.0 (the "License"); | ||
# you may not use this file except in compliance with the License. | ||
# You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, software | ||
# distributed under the License is distributed on an "AS IS" BASIS, | ||
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
import os | ||
from typing import List | ||
|
||
from evadb.third_party.vector_stores.types import ( | ||
FeaturePayload, | ||
VectorIndexQuery, | ||
VectorIndexQueryResult, | ||
VectorStore, | ||
) | ||
from evadb.utils.generic_utils import try_to_import_milvus_client | ||
|
||
allowed_params = [ | ||
"MILVUS_URI", | ||
"MILVUS_USER", | ||
"MILVUS_PASSWORD", | ||
"MILVUS_DB_NAME", | ||
"MILVUS_TOKEN", | ||
] | ||
required_params = [] | ||
_milvus_client_instance = None | ||
|
||
|
||
def get_milvus_client( | ||
milvus_uri: str, | ||
milvus_user: str, | ||
milvus_password: str, | ||
milvus_db_name: str, | ||
milvus_token: str, | ||
): | ||
global _milvus_client_instance | ||
if _milvus_client_instance is None: | ||
try_to_import_milvus_client() | ||
import pymilvus | ||
|
||
_milvus_client_instance = pymilvus.MilvusClient( | ||
uri=milvus_uri, | ||
user=milvus_user, | ||
password=milvus_password, | ||
db_name=milvus_db_name, | ||
token=milvus_token, | ||
) | ||
|
||
return _milvus_client_instance | ||
|
||
|
||
class MilvusVectorStore(VectorStore): | ||
def __init__(self, index_name: str, **kwargs) -> None: | ||
# Milvus URI is the only required parameter | ||
self._milvus_uri = kwargs.get("MILVUS_URI") | ||
|
||
if not self._milvus_uri: | ||
self._milvus_uri = os.environ.get("MILVUS_URI") | ||
|
||
assert ( | ||
self._milvus_uri | ||
), "Please set your Milvus URI in evadb.yml file (third_party, MILVUS_URI) or environment variable (MILVUS_URI)." | ||
|
||
# Check other Milvus variables for additional customization | ||
self._milvus_user = kwargs.get("MILVUS_USER") | ||
|
||
if not self._milvus_user: | ||
self._milvus_user = os.environ.get("MILVUS_USER", "") | ||
|
||
self._milvus_password = kwargs.get("MILVUS_PASSWORD") | ||
|
||
if not self._milvus_password: | ||
self._milvus_password = os.environ.get("MILVUS_PASSWORD", "") | ||
|
||
self._milvus_db_name = kwargs.get("MILVUS_DB_NAME") | ||
|
||
if not self._milvus_db_name: | ||
self._milvus_db_name = os.environ.get("MILVUS_DB_NAME", "") | ||
|
||
self._milvus_token = kwargs.get("MILVUS_TOKEN") | ||
|
||
if not self._milvus_token: | ||
self._milvus_token = os.environ.get("MILVUS_TOKEN", "") | ||
|
||
self._client = get_milvus_client( | ||
milvus_uri=self._milvus_uri, | ||
milvus_user=self._milvus_user, | ||
milvus_password=self._milvus_password, | ||
milvus_db_name=self._milvus_db_name, | ||
milvus_token=self._milvus_token, | ||
) | ||
self._collection_name = index_name | ||
|
||
def create(self, vector_dim: int): | ||
if self._collection_name in self._client.list_collections(): | ||
self._client.drop_collection(self._collection_name) | ||
self._client.create_collection( | ||
collection_name=self._collection_name, | ||
dimension=vector_dim, | ||
metric_type="COSINE", | ||
) | ||
|
||
def add(self, payload: List[FeaturePayload]): | ||
milvus_data = [ | ||
{ | ||
"id": feature_payload.id, | ||
"vector": feature_payload.embedding.reshape(-1).tolist(), | ||
} | ||
for feature_payload in payload | ||
] | ||
ids = [feature_payload.id for feature_payload in payload] | ||
|
||
# MilvusClient does not have an upsert operation; delete then insert to emulate it | ||
self._client.delete(collection_name=self._collection_name, pks=ids) | ||
|
||
self._client.insert(collection_name=self._collection_name, data=milvus_data) | ||
|
||
def persist(self): | ||
self._client.flush(self._collection_name) | ||
|
||
def delete(self) -> None: | ||
self._client.drop_collection( | ||
collection_name=self._collection_name, | ||
) | ||
|
||
def query(self, query: VectorIndexQuery) -> VectorIndexQueryResult: | ||
response = self._client.search( | ||
collection_name=self._collection_name, | ||
data=[query.embedding.reshape(-1).tolist()], | ||
limit=query.top_k, | ||
)[0] | ||
|
||
distances, ids = [], [] | ||
for result in response: | ||
distances.append(result["distance"]) | ||
ids.append(result["id"]) | ||
|
||
return VectorIndexQueryResult(distances, ids) |
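
The `add` and `query` methods above follow two small patterns: emulating an upsert with a delete followed by an insert, and unpacking search hits into parallel `distances` and `ids` lists. The sketch below reproduces both against a toy in-memory stand-in so it runs without a Milvus server. `InMemoryCollection`, `upsert`, and `top_k` are hypothetical names, not pymilvus or EvaDB APIs, and the stand-in assumes that a larger cosine score means a closer match and that `insert` simply appends rows.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryCollection:
    """Hypothetical stand-in for one collection (not the pymilvus client)."""

    def __init__(self) -> None:
        self.rows: List[Dict] = []

    def delete(self, pks: List[int]) -> None:
        # Removing an id that is not present is a no-op.
        doomed = set(pks)
        self.rows = [row for row in self.rows if row["id"] not in doomed]

    def insert(self, data: List[Dict]) -> None:
        # Toy assumption: insert appends and never checks for duplicate ids.
        self.rows.extend({"id": r["id"], "vector": list(r["vector"])} for r in data)

    def search(self, vector: List[float], limit: int) -> List[Dict]:
        hits = [
            {"id": row["id"], "distance": cosine_similarity(vector, row["vector"])}
            for row in self.rows
        ]
        # Assumption: higher cosine score ranks first.
        hits.sort(key=lambda hit: hit["distance"], reverse=True)
        return hits[:limit]


def upsert(collection: InMemoryCollection, data: List[Dict]) -> None:
    """Delete any existing rows with the same ids, then insert the new rows."""
    collection.delete(pks=[row["id"] for row in data])
    collection.insert(data)


def top_k(
    collection: InMemoryCollection, vector: List[float], k: int
) -> Tuple[List[float], List[int]]:
    """Unpack search hits into parallel distance and id lists."""
    distances, ids = [], []
    for hit in collection.search(vector, limit=k):
        distances.append(hit["distance"])
        ids.append(hit["id"])
    return distances, ids


if __name__ == "__main__":
    coll = InMemoryCollection()
    upsert(coll, [{"id": 1, "vector": [1.0, 0.0]}, {"id": 2, "vector": [0.0, 1.0]}])
    upsert(coll, [{"id": 1, "vector": [1.0, 1.0]}])  # replaces the row with id 1
    print(top_k(coll, [1.0, 0.0], k=2))
```

In this stand-in, skipping the delete step would leave two rows with the same `id`, which is the situation the delete-then-insert sequence avoids.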