💫 Release v0.21.0

@github-actions github-actions released this 17 Jan 09:11
ca2973f

Release Note (0.21.0)

Release time: 2023-01-17 09:10:50

This release contains 3 new features, 7 bug fixes and 5 documentation improvements.

🆕 Features

OpenSearch Document Store (#853)

This version of DocArray adds a new Document Store: OpenSearch!

You can use the OpenSearch Document Store to index your Documents and perform ANN search on them:

from docarray import Document, DocumentArray
import numpy as np

# Connect to OpenSearch instance
n_dim = 3

da = DocumentArray(
    storage='opensearch',
    config={'n_dim': n_dim},
)

# Index Documents
with da:
    da.extend(
        [
            Document(id=f'r{i}', embedding=i * np.ones(n_dim))
            for i in range(10)
        ]
    )

# Perform ANN search
np_query = np.ones(n_dim) * 8
results = da.find(np_query, limit=10)

Additionally, the OpenSearch Document Store can perform filter queries, search by text, and search by tags.

Learn more about its usage in the official documentation.

Add color to point cloud display (#961)

You can now include color information in your point cloud data, which can be visualized using display_point_cloud_tensor():

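import numpy as np
from docarray import Document, DocumentArray

# Load point coordinates and the matching per-point colors
coords = np.load('a_red_motorbike/coords.npy')
colors = np.load('a_red_motorbike/coord_colors.npy')

# Colors are attached as a chunk named 'point_cloud_colors'
doc = Document(
    tensor=coords,
    chunks=DocumentArray([Document(tensor=colors, name='point_cloud_colors')])
)
doc.display()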

Add language attribute to Redis Document Store (#953)

The Redis Document Store now supports text search in multiple languages. To choose one, set the language parameter in the Redis configuration:

from docarray import DocumentArray

da = DocumentArray(
    storage='redis',
    config={
        'n_dim': 128,
        'index_text': True,
        'language': 'chinese',
    },
)

🐞 Bug Fixes

Replace newline with whitespace to fix display in plot embeddings (#963)

Whenever the string "\n" was contained in any Document field, doc.plot() would result in a rendering error. This fix resolves those errors by rendering "\n" as whitespace.

Fix unwanted coercion in to_pydantic_model (#949)

This bug caused all strings of the form 'Infinity' to be coerced to the string 'inf' when calling to_pydantic_model() or to_dict(). This is fixed now, leaving such strings unchanged.

Calculate relevant docs on index instead of queries (#950)

In the embed_and_evaluate() method, the number of relevant Documents per label used to be calculated from the Documents in self, that is, from the queries. This is not correct in general, because the Documents that can be retrieved come from the index data. After this fix, the quantity is calculated from the Documents in the index data.
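The difference between the two counts can be sketched with plain Python. The labels below are made-up illustration data, and the variable names are hypothetical rather than part of embed_and_evaluate():

```python
from collections import Counter

# Hypothetical labels; in DocArray these would be read from each Document.
query_labels = ['cat', 'dog']
index_labels = ['cat', 'cat', 'cat', 'dog', 'bird']

# Counting on the query set (the old behaviour, conceptually).
relevant_from_queries = Counter(query_labels)

# Counting on the index data (the behaviour after the fix, conceptually).
relevant_from_index = Counter(index_labels)

for label in query_labels:
    print(label, relevant_from_queries[label], relevant_from_index[label])
```

With this toy data, a 'cat' query has three relevant Documents in the index, while counting on the queries reports only one, which would distort recall-style metrics.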

Remove offset index create on list like false (#936)

When a Document Store has list-like behavior disabled, it no longer creates an offset-to-id mapping, which improves performance.
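As a rough sketch of why the mapping is only needed for list-like access, consider the toy store below. ToyStore, its list_like argument and its _offset2id attribute are hypothetical names chosen for illustration; this is not DocArray's implementation:

```python
from typing import Dict, List, Optional, Union


class ToyStore:
    """Toy in-memory store for illustration only; not DocArray's implementation."""

    def __init__(self, list_like: bool = True) -> None:
        self._docs: Dict[str, str] = {}
        # The positional index is only kept when list-like access is enabled.
        self._offset2id: Optional[List[str]] = [] if list_like else None

    def add(self, doc_id: str, content: str) -> None:
        is_new = doc_id not in self._docs
        self._docs[doc_id] = content
        if is_new and self._offset2id is not None:
            self._offset2id.append(doc_id)

    def __getitem__(self, key: Union[int, str]) -> str:
        if isinstance(key, int):
            if self._offset2id is None:
                raise TypeError('integer access needs list_like=True')
            return self._docs[self._offset2id[key]]
        return self._docs[key]


store = ToyStore(list_like=False)
store.add('r0', 'hello')
print(store['r0'])  # access by id still works without the mapping
```

In this sketch, skipping the mapping saves one bookkeeping step per insert at the cost of losing access by integer position.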

Add support for remote audio files (#933)

Loading audio files from a remote URL used to raise a FileNotFoundError. This is now fixed.

Query operator $exists does not work correctly with tags (#911) (#923)

Before this fix, $exists treated falsy values such as 0 or [] as nonexistent. Such values are now correctly reported as existing.
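The distinction can be illustrated without DocArray. The two helper functions below are hypothetical and only mirror the symptom and the intended semantics; they are not the library's query code:

```python
from typing import Any, Dict

tags: Dict[str, Any] = {'price': 0, 'labels': [], 'name': 'chair'}


def exists_by_truthiness(tags: Dict[str, Any], key: str) -> bool:
    """Mirrors the reported symptom: falsy values look missing."""
    return bool(tags.get(key))


def exists_by_presence(tags: Dict[str, Any], key: str) -> bool:
    """Intended semantics: a key exists if it is present, whatever its value."""
    return key in tags


for key in ('price', 'labels', 'name', 'color'):
    print(key, exists_by_truthiness(tags, key), exists_by_presence(tags, key))
```

Only 'color' is actually absent from the tags; a truthiness check also misreports 'price' and 'labels' as missing.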

Document from dataclass with singleton list (#1018)

When casting from a dataclass to Document, singleton lists were treated like an individual element, even if the corresponding field was annotated with List[...]. This case is now handled: accessing such a field yields a DocumentArray, even for singleton inputs.
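A minimal sketch of the idea, using plain lists as a stand-in for DocumentArray. Both helper functions are hypothetical and assume that the decision should follow the field annotation rather than the length of the value; they do not reproduce DocArray's casting code:

```python
from typing import Any, List, get_origin


def cast_by_value(value: Any) -> Any:
    """Mirrors the reported symptom: a one-element list is unwrapped."""
    if isinstance(value, list) and len(value) == 1:
        return value[0]
    return value


def cast_by_annotation(annotation: Any, value: Any) -> Any:
    """Behaviour after the fix, conceptually: List[...] always yields a list."""
    if get_origin(annotation) is list:
        return list(value)
    return value


print(cast_by_value(['a.png']))
print(cast_by_annotation(List[str], ['a.png']))
```

Deciding by annotation keeps the result type stable, so code that iterates over the field does not need a special case for single-element inputs.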

📗 Documentation Improvements

  • Link to Discord (#1010)
  • Have less versions to avoid deployment timeout (#977)
  • Fix data management section not appearing in Documentation (#967)
  • Link to OpenSearch docs in sidebar (#960)
  • Multimodal to datatypes (#934)

🤟 Contributors

We would like to thank all contributors to this release: