tl;dr
If an object is updated but its vector is not altered, the object should be updated in place, without modifying the vector index.
Background
With the current logic, every update creates a new object with a new doc id and marks the previous doc id as deleted. This is because objects are immutable inside Weaviate and there is no true update option: every update is an insert+delete. This becomes very costly when metadata is updated frequently but the vector is not. In that case, changing the object in place (keeping the same doc id) without touching the vector would be preferred.
Reproducing
Here is a minimal example to show that updating the same object leads to a lot of HNSW updates:
Create one object
Update non-vector properties of the object continuously
Check tombstones:
curl -s localhost:2112/metrics | grep vector_index_tombstones
# HELP vector_index_tombstones Number of active vector index tombstones
# TYPE vector_index_tombstones gauge
vector_index_tombstones{class_name="Test",shard_name="c3ieorjafMqI"} 10000
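The steps above could be scripted roughly as follows. This is a sketch under several assumptions, not the script used for the numbers shown: it assumes a local instance with the REST API on port 8080, a class `Test` that either exists or is created by auto-schema, and a made-up non-vector property `counter`. The helper names are hypothetical. It uses the `POST /v1/objects` and `PATCH /v1/objects/{className}/{id}` endpoints and never sends a vector after the initial create.

```python
import json
import urllib.request
import uuid

BASE_URL = "http://localhost:8080"  # assumption: REST port of a local instance
CLASS_NAME = "Test"                 # class name as seen in the metrics output


def _json_request(url: str, body: dict, method: str) -> urllib.request.Request:
    """Build a JSON request without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def build_create_request(object_id: str) -> urllib.request.Request:
    """Create one object with an explicit vector."""
    body = {
        "class": CLASS_NAME,
        "id": object_id,
        "properties": {"counter": 0},  # hypothetical non-vector property
        "vector": [0.1, 0.2, 0.3],     # arbitrary values, never changed later
    }
    return _json_request(f"{BASE_URL}/v1/objects", body, "POST")


def build_update_request(object_id: str, counter: int) -> urllib.request.Request:
    """Update only a non-vector property; the body carries no vector."""
    body = {"class": CLASS_NAME, "properties": {"counter": counter}}
    url = f"{BASE_URL}/v1/objects/{CLASS_NAME}/{object_id}"
    return _json_request(url, body, "PATCH")


def reproduce(n_updates: int, send=urllib.request.urlopen) -> str:
    """Create one object, then update it n_updates times. Returns its id."""
    object_id = str(uuid.uuid4())
    send(build_create_request(object_id)).close()
    for i in range(1, n_updates + 1):
        send(build_update_request(object_id, i)).close()
    return object_id
```

Calling `reproduce(10000)` against a running instance and then checking the `vector_index_tombstones` metric should show whether metadata-only updates still produce tombstones. The `send` parameter exists only so the request sequence can be inspected without a server.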
Tech Notes
There is one other process where we can already do in-place updates without altering the doc id: The References Batch API. I believe we can reuse that logic.
The logic could be as follows:
identify the old inverted index entries, so we know what needs to be cleaned up
identify the new inverted index entries, so we know what needs to be added
Perform additions and deletions on the same doc id
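To make the three steps concrete, here is a minimal sketch using a plain dictionary as a stand-in for the inverted index. It is not Weaviate's actual data structure or the References Batch API logic. The names `index_entries` and `update_in_place` are hypothetical, and property values are assumed to be hashable scalars or lists of them.

```python
from typing import Any

# Hypothetical stand-in for the inverted index:
# (property name, value) -> set of doc ids containing that value
Entry = tuple[str, Any]
InvertedIndex = dict[Entry, set[int]]


def index_entries(properties: dict[str, Any]) -> set[Entry]:
    """Flatten an object's properties into (name, value) index entries."""
    entries: set[Entry] = set()
    for name, value in properties.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            entries.add((name, v))
    return entries


def update_in_place(
    index: InvertedIndex,
    doc_id: int,
    old_props: dict[str, Any],
    new_props: dict[str, Any],
) -> tuple[set[Entry], set[Entry]]:
    """Apply a property update to the index while keeping the same doc id."""
    old_entries = index_entries(old_props)  # step 1: what exists today
    new_entries = index_entries(new_props)  # step 2: what should exist after
    to_delete = old_entries - new_entries
    to_add = new_entries - old_entries

    # Step 3: deletions and additions, both against the unchanged doc id.
    for entry in to_delete:
        postings = index.get(entry)
        if postings is not None:
            postings.discard(doc_id)
            if not postings:
                del index[entry]
    for entry in to_add:
        index.setdefault(entry, set()).add(doc_id)
    return to_add, to_delete
```

Entries present in both the old and new state are left untouched, so an update that changes one property only rewrites the postings for that property.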
Acceptance Criteria
When an update involves a vector change, the existing logic is used: the doc id is retired and a new doc id is created.
When the update does not involve a vector change, the doc id is kept and the inverted index is altered (according to the logic outlined above or something similar).
All filters still work correctly, i.e. they match for the updated values and don't match for previous values.
Objects are considered identical if all user-settable properties are identical. This means they can be identical even with different update/create timestamps.
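The criteria above could be expressed as a small decision helper. This is only a sketch: `ObjectState`, `vector_changed`, `is_identical` and `update_path` are hypothetical names, and treating the vector as one of the user-settable fields is an assumption rather than something the criteria state.

```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence


@dataclass
class ObjectState:
    """Hypothetical snapshot of an object before or after an update."""
    properties: dict[str, Any]
    vector: Optional[Sequence[float]] = None
    creation_time_unix: int = 0
    last_update_time_unix: int = 0


def vector_changed(old: ObjectState, new: ObjectState) -> bool:
    """True if the update adds, removes, or modifies the vector."""
    if old.vector is None or new.vector is None:
        return old.vector is not new.vector
    return list(old.vector) != list(new.vector)


def is_identical(old: ObjectState, new: ObjectState) -> bool:
    """Compare user-settable fields only; timestamps are ignored on purpose."""
    return old.properties == new.properties and not vector_changed(old, new)


def update_path(old: ObjectState, new: ObjectState) -> str:
    """Pick between retiring the doc id and updating in place."""
    return "new_doc_id" if vector_changed(old, new) else "in_place"
```

For example, two states that differ only in `last_update_time_unix` compare as identical, while a change to `properties` alone selects the `"in_place"` path.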