Releases: weaviate/weaviate
v1.23.6 - Support for OpenAI's V3 embedding models, gRPC nested objects missing value and shard and replica selection Fixes
Breaking Changes
none
New Features
none
Fixes
- Refine shard and replica selection to distribute replicas across more nodes by @redouan-rhazouani in #4077
- Fix nested objects with missing values by @dirkkul in #4072
- Run CI (incl docker-push) after new commits have been merged into stable by @dirkkul in #4081
- Add support for OpenAI's new V3 embedding models by @antas-marcin in #4085
Full Changelog: v1.23.5...v1.23.6
v1.23.5 - AWS module gRPC headers, OpenAI error response, Hybrid vector search and Keyword search with special characters Fixes
Breaking Changes
none
New Features
none
Fixes
- Fix: handle hybrid vector search without BM25 by @parkerduckworth in #4049
- execute near vector search for hybrid with vector only by @tsmith023 in #4048
- Fix regex error in keyword search when passing special characters in query by @antas-marcin in #4061
- pull key from grpc ctx if not found otherwise by @tsmith023 in #4066
- Fix parsing of OpenAI's error response by @antas-marcin in #4068
- chore: rewrite flaky test by @asdine in #4074
Full Changelog: v1.23.4...v1.23.5
v1.22.11 - Regex error in keyword search when passing special characters Fix
Breaking Changes
none
New Features
none
Fixes
- Fix regex error in keyword search when passing special characters in query by @antas-marcin in #4061
Full Changelog: v1.22.10...v1.22.11
v1.23.4 - gRPC API Fixes, Added support for SageMaker in text2vec-aws module, Sharded locks lock contention Fix
Breaking Changes
none
New Features
none
Fixes
- Re-enable Sagemaker service in AWS modules by @antas-marcin in #3922
- Fix to class autodetection for MT by @dirkkul in #4018
- adapters/grpc: remove redundant nil check in extractPrimitiveProperties by @Juneezee in #4025
- map correct dtype when filtering references by count by @tsmith023 in #4024
- add NearX messages for multi2vec-bind gRPC searches by @tsmith023 in #4008
- [GRPC] Batch empty list type by @dirkkul in #4031
- Class and Property name max length validation by @aliszka in #4036
- Fix sharded locks lock contention by @asdine in #4034
- replace implicit nil behaviour with explicit grpc message by @tsmith023 in #4041
Full Changelog: v1.23.3...v1.23.4
v1.22.10 - Sharded locks lock contention Fix
Breaking Changes
none
New Features
none
Fixes
- [1.22] Update license header copyright date by @parkerduckworth in #3981
- Fix sharded locks lock contention by @asdine in #4034
Full Changelog: v1.22.9...v1.22.10
v1.23.3 - gRPC API improvements
Breaking Changes
none
New Features
none
Fixes
- Add support for returning phone number properties in gRPC queries by @tsmith023 in #3961
- async: fix double close of chunk channel by @asdine in #3998
- Unify text2vec module components by @antas-marcin in #3990
- Fix uuid casing for references added with object, single refs and bat… by @dirkkul in #4001
- Unify text2vec-contextionary module by @antas-marcin in #4004
- [GRPC] Add structured references for filters + GRPC batch delete endpoint by @dirkkul in #3994
- Allow discerning between nil and [] in ref props by @tsmith023 in #4006
Full Changelog: v1.23.2...v1.23.3
v1.23.2 - gRPC API generative search Fix, support for baseURL in Azure OpenAI endpoints in text2vec-openai module
Breaking Changes
none
New Features
none
Fixes
- [1.22] Update license header copyright date by @parkerduckworth in #3981
- [1.23] Update license header copyright date by @parkerduckworth in #3982
- hotfix regressions with GQL resp mapping in gRPC API by @tsmith023 in #3984
- Fix #3588 let the baseURL for Azure OpenAI be configurable. by @jlewi in #3966
New Contributors
Full Changelog: v1.23.1...v1.23.2
v1.23.1 - gRPC API enhancements, PQ stability Fixes, Cycle Manager Improvements
Breaking Changes
none
New Features
none
Fixes
- Rename id_bytes field to id_as_bytes by @antas-marcin in #3932
- Adjust go dependencies in e2e tests with go client by @antas-marcin in #3933
- Add reranking functionalities to the gRPC API by @tsmith023 in #3952
- Add support for returning blob properties in gRPC queries by @tsmith023 in #3960
- [v1.22.x] Fix issue with deleted node before PQ fitting started by @etiennedi in #3955
- [v1.22.x] Handle vector cache miss during PQ fitting by @etiennedi in #3958
- [v1.22.x] Fix broken PQ empty checks by @etiennedi in #3954
- [v1.23.x] Fix issue with deleted node before PQ fitting started by @etiennedi in #3956
- [v1.23.x] Handle vector cache miss during PQ fitting by @etiennedi in #3959
- cyclemanager: abort routine if running while attempting to unregister by @asdine in #3964
- ensure tenant is passed to cref validation logic by @tsmith023 in #3965
- Grpc/support groupby with gen and rerank by @tsmith023 in #3975
Full Changelog: v1.23.0...v1.23.1
v1.22.9 - PQ stability Fixes, Cycle Manager Improvements
Breaking Changes
none
New Features
none
Fixes
- [v1.22.x] Fix issue with deleted node before PQ fitting started by @etiennedi in #3955
- [v1.22.x] Handle vector cache miss during PQ fitting by @etiennedi in #3958
- [v1.22.x] Fix broken PQ empty checks by @etiennedi in #3954
- cyclemanager: abort routine if running while attempting to unregister by @asdine in #3964
Full Changelog: v1.22.8...v1.22.9
v1.23.0 - Binary quantization support, Startup time improvements, New Generative Anyscale module, gRPC API performance improvements
Breaking Changes
Nodes Status Response Verbosity
Getting the status for all nodes in a cluster can be a very expensive query when each node contains a large number of shards, because the metadata for each shard is included in the response. A new output verbosity option sets the default verbosity level to minimal, omitting individual shard metadata. The new verbose output level includes shard metadata, so using it returns a response body identical to the nodes status response from before this release.
- Add output verbosity option to Nodes API by @parkerduckworth in #3864
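To illustrate the change, here is a minimal sketch of how a client might request either verbosity level. It assumes the nodes status is served at /v1/nodes and that the level is passed as an output query parameter; the base URL is a placeholder, and the helper name nodes_status_url is hypothetical. Check the API reference for your version before relying on it.

```python
from urllib.parse import urlencode

# Verbosity levels named in the release notes above.
VALID_OUTPUTS = ("minimal", "verbose")


def nodes_status_url(base_url: str, output: str = "minimal") -> str:
    """Build a nodes status URL for the requested verbosity level.

    Assumption: the endpoint is /v1/nodes and the level is sent as
    the `output` query parameter.
    """
    if output not in VALID_OUTPUTS:
        raise ValueError(f"output must be one of {VALID_OUTPUTS}, got {output!r}")
    return f"{base_url.rstrip('/')}/v1/nodes?{urlencode({'output': output})}"


# Placeholder host; "verbose" asks for the pre-1.23 response shape.
print(nodes_status_url("http://localhost:8080", "verbose"))
# http://localhost:8080/v1/nodes?output=verbose
```

Clients that parsed per-shard metadata from the old response would need to ask for the verbose level explicitly after upgrading.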
New Features
Binary Quantization / Brute Force Index
Exciting news! We've added a brute force search feature that runs efficiently straight from disk. You can choose between using the original vectors or binary compression for faster processing and fewer disk reads. Currently, compression is best for specific data types, but stay tuned! We're working towards an update where, with compression enabled, we'll mix disk and memory operations for even better performance. This enhancement is on the horizon, so keep an eye out! 🚀✨
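As a rough illustration of the idea (not Weaviate's implementation), the sketch below binarizes each vector to one bit per dimension and runs a brute force scan ranked by Hamming distance. The function names, the sign-based thresholding, and the sample vectors are all assumptions made for the example.

```python
def binarize(vector):
    """Pack the sign of each dimension into an int: bit i is 1 when vector[i] > 0."""
    bits = 0
    for i, value in enumerate(vector):
        if value > 0:
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Count the bit positions where two binary codes differ."""
    return bin(a ^ b).count("1")


def brute_force_search(query, codes, k=2):
    """Scan every code and return the indices of the k closest by Hamming distance."""
    q = binarize(query)
    ranked = sorted(range(len(codes)), key=lambda i: hamming(q, codes[i]))
    return ranked[:k]


# Illustrative data only.
vectors = [
    [0.9, -0.2, 0.4, -0.7],
    [-0.5, 0.3, -0.1, 0.8],
    [0.7, -0.1, 0.6, -0.3],
]
codes = [binarize(v) for v in vectors]
print(brute_force_search([1.0, -1.0, 1.0, -1.0], codes, k=2))
# [0, 2]
```

Each dimension shrinks to a single bit, so a scan reads far less data than one over the original floats. Only signs are kept, which fits the note above that compression currently suits specific data types.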
Startup Time / MTTR Improvements
Previously, our mean time to recovery (MTTR) / node startup time was significantly impacted by nodes that contain a large number of shards or tenants. This is because the database had to synchronously load each shard from disk before startup was complete and requests could be served. Well, say goodbye to those days!
We've introduced an amazing new feature: a lazy-loaded shard abstraction layer. This game-changer drastically speeds up node startup and recovery times. How? By loading shards in the background without blocking startup. This means your nodes are up and running almost instantly - talk about efficiency!
And here's the best part: if a request arrives for a shard that's not yet loaded, Weaviate fetches it on the spot and serves the request right away. The rest of the shards continue to load in the background seamlessly. This is a huge leap forward in performance and responsiveness. So, gear up to experience a smoother, faster, and more efficient DB.
- Lazy load shards to improve startup time, and conserve resources by @donomii in #3783
- Lazy load shards (part 2) by @aliszka in #3830
- Lazy load shards (part 3) by @aliszka in #3859
- Add metrics for shard lazy loading/unloading by @donomii in #3893
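The lazy-loading behaviour described above can be sketched in a few lines. This is an illustration of the pattern only, not Weaviate's code: LazyShard, start_node, and the loader callable are hypothetical names, and a real shard load would read from disk rather than build a dict.

```python
import threading


class LazyShard:
    """Wraps a shard and defers the expensive load until it is needed."""

    def __init__(self, name, loader):
        self.name = name
        self._loader = loader  # hypothetical callable standing in for a disk read
        self._data = None
        self._loaded = False
        self._lock = threading.Lock()

    def load(self):
        # Checked under a lock so the background loader and a request that
        # arrives at the same time trigger only one load.
        with self._lock:
            if not self._loaded:
                self._data = self._loader(self.name)
                self._loaded = True
        return self._data

    def get(self, key):
        # A request for a shard that is not loaded yet loads it on the spot.
        return self.load().get(key)


def start_node(shard_names, loader):
    """Return immediately with lazy shards; load them in a background thread."""
    shards = {name: LazyShard(name, loader) for name in shard_names}

    def load_all():
        for shard in shards.values():
            shard.load()

    background = threading.Thread(target=load_all, daemon=True)
    background.start()
    return shards, background


def fake_loader(name):
    return {"id": name}


shards, background = start_node(["tenant-a", "tenant-b"], fake_loader)
print(shards["tenant-b"].get("id"))  # tenant-b
background.join()
```

Startup returns as soon as the wrappers exist, and whichever of the background thread or an incoming request reaches a shard first performs the load.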
Auto-Compression
The introduction of Product Quantization (PQ) was a huge step forward in efficient vector operations. And now, we're pushing the envelope even further with auto-compression.
Here's the deal: when your in-memory vector index hits a certain threshold, PQ kicks in automatically, compressing the index. This means smarter, smoother, and super-efficient handling of your data without lifting a finger.
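A minimal sketch of the trigger logic, under stated assumptions: the threshold is treated as a vector count, and the compress callable is a crude rounding function standing in for PQ, which is considerably more involved. AutoCompressingIndex is a hypothetical name for illustration.

```python
class AutoCompressingIndex:
    """Stores raw vectors until a threshold is reached, then compresses everything."""

    def __init__(self, threshold, compress):
        self.threshold = threshold  # assumption: number of vectors that triggers compression
        self._compress = compress   # stand-in for PQ; any vector -> vector function
        self.vectors = []
        self.compressed = False

    def add(self, vector):
        # Once compression is on, new vectors are compressed as they arrive.
        self.vectors.append(self._compress(vector) if self.compressed else vector)
        if not self.compressed and len(self.vectors) >= self.threshold:
            # Threshold reached: compress what is already stored, exactly once.
            self.vectors = [self._compress(v) for v in self.vectors]
            self.compressed = True


index = AutoCompressingIndex(threshold=3, compress=lambda v: [round(x, 1) for x in v])
for v in ([0.12, 0.98], [0.51, 0.49], [0.33, 0.77], [0.25, 0.64]):
    index.add(v)
print(index.compressed, index.vectors[0])
# True [0.1, 1.0]
```

The caller never invokes compression directly; crossing the threshold switches the index over without any further action.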
Modules
- Generative Anyscale Module by @CShorten in #3845
- Adding Mixtral-8x7B-Instruct-v0.1 by @CShorten in #3913
- Add support for Google's Gemini model by @antas-marcin in #3891
- Adjust generateContent API call for Gemini model by @antas-marcin in #3911
- Add support for new 002 and 003 Gecko models by @antas-marcin in #3905
Resource Guardrails
Performance Optimizations
- Removes tombstones when merging with root segment by @aliszka in #3666
- Add option to force compactions every cycle, where it is advantageous by @donomii in #3675
- Improve Cursor Performance by reusing memory by @etiennedi in #3660
- Improvement of stop condition within SearchByVectorDistance by @aliszka in #3742
- Improved filtered flat search stop condition by @aliszka in #3753
- Optional bloom filter and count net additions calculations by @aliszka in #3756
- Implement pread for replace strategy cursor by @parkerduckworth in #3727
- Setting optimal segments on default, based on the dimensions by @abdelr in #3790
- Send uuid as a byte by @dirkkul in #3894
gRPC Improvements
- Add config and support for gRPC TLS credentials by @mikewyer in #3794
- Add filter for metadata to GRPC by @dirkkul in #3861
- Introduce custom pb.Properties message to contain type-aware properties within search result by @tsmith023 in #3820
- Add support for geo coordinates in GRPC by @dirkkul in #3883
- Introduce ReturnAllNonrefProperties bool to PropertiesRequest by @tsmith023 in #3899
Nodes API
- Add output verbosity option to Nodes API by @parkerduckworth in #3864
- Add compressed to NodeShardStatus by @trengrj in #3806
Internal System Restructuring
- Restructure internal file structure to align with the class structure by @parkerduckworth in #3719
- Migration of brute force index's buckets to main store by @aliszka in #3740
- Unification of compressed vectors bucket names by @aliszka in #3865
- Merge PQ bucket into shard store, migrate PQ files from flat file structure by @parkerduckworth in #3726
Other
- Restores cleanup tombstones on compaction tests by @aliszka in #3710
- Removing hnsw config dependency by @abdelr in #3738
- Validating user config by @abdelr in #3754
- Defining constants once by @abdelr in #3755
- Add Github issue template by @dudanogueira in #3766
- Change Google Cloud Storage tests to use prebuilt docker image by @antas-marcin in #3854
- Improve Weaviate's e2e test pipelines by @antas-marcin in #3871
- Generic priority queue by @parkerduckworth in #3877
- Refactor all lsmkv error equality checks to use errors.Is by @parkerduckworth in #3900
- Exclude vendor and temporary directories in the swagger script by @redouan-rhazouani in #3904
Fixes
- Flat index refactor/fixes by @aliszka in #3724
- TypeAssertVectorIndex fix to work with hnsw and flat index by @aliszka in #3736
- Fix clusterapi comms when gRPC searching required UUID only by @tsmith023 in #3747
- Improvements of HNSW locks by @aliszka in #3784
- Flat index fix: rescoreLimit by @dirkkul in #3884
- Fixes for GRPC filters by @dirkkul in #3888
- Secure access to consistency level during batch deletes by @redouan-rhazouani in #3869
- Fix filtering on array types with GRPC by @dirkkul in #3892
- Enhance backup procedure for replicated classes by @redouan-rhazouani in #3889
- Fix batch deletions when replication is enabled by @redouan-rhazouani in #3919
New Contributors
- @dudanogueira made their first contribution in #3766
- @mikewyer made their first contribution in #3794
Full Changelog: v1.22.6...v1.23.0