feat: qdrant document index (#1321)
* Initial implementation of Qdrant document index

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Initial implementation of Qdrant document index

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Update poetry.lock

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Initial implementation of _filter and _text_search, also with batched versions

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Return separate scores from batched text search requests

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Return separate scores from batched find requests

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Add empty test_query_builder.py for Qdrant

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Upgrade Qdrant to 1.1.1

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Implement QueryBuilder

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Add tensorflow tests

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Supported optional vectors

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Fix mypy and formatting

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Update docarray/index/backends/qdrant.py

Co-authored-by: Johannes Messner <44071807+JohannesMessner@users.noreply.github.com>
Signed-off-by: Kacper Łukawski <kacperlukawski@users.noreply.github.com>

* Remove the test with custom type (np.array)

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Update Qdrant to 1.1.4

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Refactor tests

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* WIP: Raw query execution

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Add raw Qdrant query support in .execute_query

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Switch to local mode in Qdrant tests

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Code formatting with black

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

* Update poetry.lock

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>

---------

Signed-off-by: Kacper Łukawski <lukawski.kacper@gmail.com>
Signed-off-by: Kacper Łukawski <kacperlukawski@users.noreply.github.com>
Co-authored-by: Johannes Messner <44071807+JohannesMessner@users.noreply.github.com>
kacperlukawski and JohannesMessner committed Apr 14, 2023
1 parent 42523c7 commit 2ea0acd
Showing 30 changed files with 2,913 additions and 887 deletions.
2 changes: 0 additions & 2 deletions docarray/array/doc_vec/doc_vec.py
@@ -138,7 +138,6 @@ def __init__(
                     tensor_columns[field_name] = TensorFlowTensor(stacked)

                 elif issubclass(field_type, AbstractTensor):
-
                     tensor = getattr(docs[0], field_name)
                     column_shape = (
                         (len(docs), *tensor.shape)
@@ -378,7 +377,6 @@ def _set_data_column(
             self._storage.tensor_columns[field] = values

         elif field in self._storage.doc_columns.keys():
-
             values_ = parse_obj_as(
                 DocVec.__class_getitem__(self._storage.doc_columns[field].doc_type),
                 values,
2 changes: 0 additions & 2 deletions docarray/base_doc/mixins/io.py
@@ -219,7 +219,6 @@ def from_protobuf(cls: Type[T], pb_msg: 'DocProto') -> T:
         fields: Dict[str, Any] = {}

         for field_name in pb_msg.data:
-
             if field_name not in cls.__fields__.keys():
                 continue  # optimization we don't even load the data if the key does not
                 # match any field in the cls or in the mapping
@@ -265,7 +264,6 @@ def _get_content_from_node_proto(
         elif content_key is None:
             return_field = None
         elif docarray_type is None:
-
             arg_to_container: Dict[str, Callable] = {
                 'list': list,
                 'set': set,
1 change: 0 additions & 1 deletion docarray/computation/torch_backend.py
@@ -103,7 +103,6 @@ def shape(cls, tensor: 'torch.Tensor') -> Tuple[int, ...]:

     @classmethod
     def reshape(cls, tensor: 'torch.Tensor', shape: Tuple[int, ...]) -> 'torch.Tensor':
-
         """
         Gives a new shape to tensor without changing its data.
5 changes: 4 additions & 1 deletion docarray/index/__init__.py
@@ -10,6 +10,7 @@
     from docarray.index.backends.elastic import ElasticDocIndex  # noqa: F401
     from docarray.index.backends.elasticv7 import ElasticV7DocIndex  # noqa: F401
     from docarray.index.backends.hnswlib import HnswDocumentIndex  # noqa: F401
+    from docarray.index.backends.qdrant import QdrantDocumentIndex  # noqa: F401

 __all__ = []

@@ -25,7 +26,9 @@ def __getattr__(name: str):
     elif name == 'ElasticV7DocIndex':
         import_library('elasticsearch', raise_error=True)
         import docarray.index.backends.elasticv7 as lib
-
+    elif name == 'QdrantDocumentIndex':
+        import_library('qdrant_client', raise_error=True)
+        import docarray.index.backends.qdrant as lib
     else:
         raise ImportError(
             f'cannot import name \'{name}\' from \'{_get_path_from_docarray_root_level(__file__)}\''
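
Note on the docarray/index/__init__.py change: the new branch plugs QdrantDocumentIndex into a module-level __getattr__, so the backend module is only imported when the name is first requested. The sketch below illustrates that general pattern (PEP 562) and is not the docarray implementation. The _LAZY table, the _lazy_getattr helper, the 'index_sketch' module and the 'MissingIndex' entry are hypothetical names chosen for illustration, and a standard-library module stands in for a real backend so the example runs without extra packages.

```python
import importlib
import importlib.util
import types

# Hypothetical table for illustration only:
# public name -> (package that must be installed, module defining the name).
_LAZY = {
    'OrderedDict': ('collections', 'collections'),
    'MissingIndex': ('surely_not_installed_pkg', 'surely_not_installed_pkg.index'),
}


def _lazy_getattr(name: str):
    """Module-level __getattr__ (PEP 562): called only when normal lookup fails."""
    if name not in _LAZY:
        raise ImportError(f'cannot import name {name!r}')
    required, module_path = _LAZY[name]
    # Check for the optional dependency first so the error message is readable.
    if importlib.util.find_spec(required) is None:
        raise ImportError(f'{name!r} requires the optional package {required!r}')
    lib = importlib.import_module(module_path)
    return getattr(lib, name)


# Attach the hook to a throwaway module object. In a real package this would
# simply be a function named __getattr__ defined in __init__.py.
index = types.ModuleType('index_sketch')
index.__getattr__ = _lazy_getattr

print(index.OrderedDict)  # the backing module is imported on first access
try:
    index.MissingIndex
except ImportError as err:
    print(err)  # the missing optional dependency is reported by name
```

With this layout, importing the package does not require every optional backend to be installed; a missing dependency only surfaces when the corresponding name is accessed.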
