Releases: weaviate/weaviate
v1.25.12 - RAFT fixes, BM25 speedups and experimental repair endpoint
Breaking Changes
none
New Features
Fixes
- Fix tombstone cleanup if GOMAXPROCS is set to 1 by @andrewisplinghoff in #5593
- RAFT: don't update lastAppliedIndexToDB on non restore in reloadDBFromSchema by @moogacs in #5594
- Speed up bm25 by concurrent LSM lookups by @dirkkul in #5567
- Fix go.mod dependencies by @antas-marcin in #5400
- chore improve err handling when fetching shard node name by @jeroiraz in #5415
- chore: adjust pull_requests.yaml to allow quick docker push on prefix 'build' branches by @moogacs in #5016
- add more parameters to configure sentry by @reyreaud-l in #5424
- fix ci push docker typo by @reyreaud-l in #5430
- debug: add reindex endpoint by @asdine in #5305
- fix: add missing return after http.Error() by @moogacs in #5452
- Add fgprof for better profiling by @dirkkul in #5453
- chore: log ctx's with timeout DEBUG LEVEL by @moogacs in #5445
- add sentry middlewares by @reyreaud-l in #5464
- Fix bug when all objects fail to parse in rpc.BatchObjects by @tsmith023 in #5467
- add more trace filter to sentry sampler by @reyreaud-l in #5473
- add cluster id and cluster owner info to sentry by @reyreaud-l in #5475
- swap sentry sub-config to be enabled by default by @reyreaud-l in #5476
- refact: Distance methods by @moogacs in #5474
- Keeping graph, cache and store in sync during tombstone cleanup cycles by @abdelr in #5463
- refact:distance: handle deleted nodes from caller only by @moogacs in #5483
- Change transformers image to baai-bge-small-en-v1.5-onnx-1.9.4 by @antas-marcin in #5530
- Adjust pipelines for larger github runners by @antas-marcin in #5539
- Fallback to Different Hosts Quickly When Coordinator Pull Op Errors by @nathanwilk7 in #5472
- Handle dirty reads/writes of deleted objects in replicated concurrent patching by @tsmith023 in #5516
- fix debug reindexing with non-target vectors by @asdine in #5502
Full Changelog: v1.25.11...v1.25.12
v1.24.22 - Replication with PATCH requests Fix, Vector index tombstone cleanup process Fix, Sentry & Profiling Improvements
Breaking Changes
none
New Features
none
Fixes
- Fix go.mod dependencies by @antas-marcin in #5400
- chore improve err handling when fetching shard node name by @jeroiraz in #5415
- chore: adjust pull_requests.yaml to allow quick docker push on prefix 'build' branches by @moogacs in #5016
- add more parameters to configure sentry by @reyreaud-l in #5424
- fix ci push docker typo by @reyreaud-l in #5430
- debug: add reindex endpoint by @asdine in #5305
- fix: add missing return after http.Error() by @moogacs in #5452
- Add fgprof for better profiling by @dirkkul in #5453
- chore: log ctx's with timeout DEBUG LEVEL by @moogacs in #5445
- add sentry middlewares by @reyreaud-l in #5464
- Fix bug when all objects fail to parse in rpc.BatchObjects by @tsmith023 in #5467
- add more trace filter to sentry sampler by @reyreaud-l in #5473
- add cluster id and cluster owner info to sentry by @reyreaud-l in #5475
- swap sentry sub-config to be enabled by default by @reyreaud-l in #5476
- refact: Distance methods by @moogacs in #5474
- Keeping graph, cache and store in sync during tombstone cleanup cycles by @abdelr in #5463
- refact:distance: handle deleted nodes from caller only by @moogacs in #5483
- Change transformers image to baai-bge-small-en-v1.5-onnx-1.9.4 by @antas-marcin in #5530
- Adjust pipelines for larger github runners by @antas-marcin in #5539
- Fallback to Different Hosts Quickly When Coordinator Pull Op Errors by @nathanwilk7 in #5472
- Handle dirty reads/writes of deleted objects in replicated concurrent patching by @tsmith023 in #5516
- fix debug reindexing with non-target vectors by @asdine in #5502
Full Changelog: v1.24.21...v1.24.22
v1.25.11 - Flat index improvements, Shard initialization and Raft Fixes
Breaking Changes
none
New Features
none
Fixes
- Improve LSM store initialization of shard by @etiennedi in #5471
- Make loading of Flat-Index BQ-cache more SSD-friendly by @etiennedi in #5468
- fix: ensure all instances are lazy instances if laziness is not disabled by @jeroiraz in #5499
- Parallel rescore in bq flat by @dirkkul in #5477
- Optimize vector cache initialization by @dirkkul in #5509
- Annotate vector index error for sentry by @etiennedi in #5513
- add raft snapshot restore unit test by @reyreaud-l in #5514
- raft: refact bits for class info by @moogacs in #5515
- fix flat index del obj by @jeroiraz in #5519
- refact: RAFT remove not used bit in schema updateTenants by @moogacs in #5504
- refact:raft: log cache part of the store by @moogacs in #5525
- refact: raft: last applied directly from store by @moogacs in #5526
- fix: lazy shard initialization taking into account implicit shard loading by @jeroiraz in #5522
- Change transformers image to baai-bge-small-en-v1.5-onnx-1.9.4 by @antas-marcin in #5530
Full Changelog: v1.25.10...v1.25.11
v1.25.10 - BM25 performance improvements, Vector index tombstone cleanup Fix, Sentry monitoring improvements
Breaking Changes
none
New Features
none
Fixes
- Fix bug when all objects fail to parse in rpc.BatchObjects by @tsmith023 in #5467
- add more trace filter to sentry sampler by @reyreaud-l in #5473
- add cluster id and cluster owner info to sentry by @reyreaud-l in #5475
- swap sentry sub-config to be enabled by default by @reyreaud-l in #5476
- refact: Distance methods by @moogacs in #5474
- Keeping graph, cache and store in sync during tombstone cleanup cycles by @abdelr in #5463
- refact:distance: handle deleted nodes from caller only by @moogacs in #5483
- Concurrently read objects in BM25 by @dirkkul in #5481
Full Changelog: v1.25.9...v1.25.10
v1.25.9 - Hybrid search and Object store retrieval performance Fix, Updates with empty arrays and lists Fix, Sentry configuration improvements
Breaking Changes
none
New Features
none
Fixes
- Fix go.mod dependencies by @antas-marcin in #5400
- chore: further divide acceptance with go client e2e tests by @antas-marcin in #5405
- chore improve err handling when fetching shard node name by @jeroiraz in #5415
- chore: adjust pull_requests.yaml to allow quick docker push on prefix 'build' branches by @moogacs in #5016
- Fix batch updates of empty arrays by @dirkkul in #5418
- fix: py test conftest by @moogacs in #5425
- RAFT: retry leader not found and refact the retry policy by @moogacs in #5412
- improve update tenant in raft schema by @reyreaud-l in #5422
- add more parameters to configure sentry by @reyreaud-l in #5424
- fix ci push docker typo by @reyreaud-l in #5430
- debug: add reindex endpoint by @asdine in #5305
- Parallel vector and keyword for hybrid search by @amourao in #5436
- fix: add missing return after http.Error() by @moogacs in #5452
- Add fgprof for better profiling by @dirkkul in #5453
- chore: log ctx's with timeout DEBUG LEVEL by @moogacs in #5445
- Fix updating object with empty list by @dirkkul in #5462
- add sentry middlewares to servers and internal clients by @reyreaud-l in #5431
- add sentry middlewares by @reyreaud-l in #5464
- Parallelize object store retrieval by @etiennedi in #5443
Full Changelog: v1.25.8...v1.25.9
v1.26.1 - Hybrid search performance Fix, Tenants create API Fix, New JinaAI Reranker module
Breaking Changes
none
New Features
- Add support for JinaAI reranker API (#5421) by @antas-marcin in #5440
Fixes
- Parallel vector and keyword for hybrid search by @amourao in #5436
- allow over 100 on tenants creation by @moogacs in #5442
Full Changelog: v1.26.0...v1.26.1
v1.26.0 - Tenant Offloading, Multi-Target Vector Search, Scalar Quantization, Async Replication, Improved Range Queries
Breaking Changes
Tenant activity status update requests are now limited to 100 tenants per request. The official client libraries mitigate this by splitting requests that contain more than 100 tenants into batches in the background.
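For callers that do not go through an official client, the 100-tenant limit can be respected by splitting the tenant list before sending. The following is a minimal sketch of that splitting step only; `chunk_tenants`, `MAX_TENANTS_PER_REQUEST` and the tenant names are hypothetical, and the actual HTTP or gRPC call is deliberately left out.

```python
from typing import Iterator, List, Sequence

# Limit taken from the breaking-change note above; the constant name is made up.
MAX_TENANTS_PER_REQUEST = 100


def chunk_tenants(
    tenants: Sequence[str], size: int = MAX_TENANTS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of `tenants`, each holding at most `size` names."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(tenants), size):
        yield list(tenants[start:start + size])


# Hypothetical tenant names, for illustration only.
tenant_names = [f"tenant-{i}" for i in range(250)]
batches = list(chunk_tenants(tenant_names))

# Each batch would be sent as its own status-update request.
print([len(b) for b in batches])  # [100, 100, 50]
```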
New Features
Tenant Offloading
Designed to optimize storage costs and improve resource utilization by separating compute and storage for inactive tenants. This new feature allows inactive tenant data to be offloaded to object storage, thus reducing expenses by not paying for compute resources when the data is not actively used. This initial iteration is S3 compatible, and official support for other object storage solutions is coming soon.
- Allow tenant FROZEN status REST & gRPC by @moogacs in #5072
- S3 offload tenants module by @moogacs in #5074
- Offload shards to S3 upload/download paths by @moogacs in #5119
- Add offloading tenant states to gRPC API by @tsmith023 in #5224
- Add offload upload/download metric by @moogacs in #5273
- Remove tenant unused UNFROZEN status by @moogacs in #5296
- Add read-only tenant statuses to OpenAPI spec by @tsmith023 in #5289
- Delete frozen tenants in cloud on tenant or class deletion by @moogacs in #5288
- Pass error correctly in index.dropCloudShards by @moogacs in #5303
- Rename S3_ENDPOINT_URL flag to OFFLOAD_S3_ENDPOINT by @moogacs in #5304
- Allow new tenant names as input for Add and Update by @dirkkul in #5309
- Increase offload module timeout default to 120s by @moogacs in #5357
- Prevent tenant freezing when offload module not available by @parkerduckworth in #5354
- Offload tenants stuck on unfreezing by @moogacs in #5361
- Offload auto tenant de/activate by @moogacs in #5366
- Allow concurrent tenants update by @moogacs in #5367
- Update tenants offload tests timeout by @moogacs in #5371
- Offloading unfreeze concurrency loop variable capture by @moogacs in #5376
- Offload module: log level error only for default by @moogacs in #5377
- Improve performance update tenant upload/download process all at once by @moogacs in #5383
- Refactor update tenants validation by @moogacs in #5390
- Make offload bucket auto creation configurable by @moogacs in #5392
- Acceptance test for backup & offload by @moogacs in #5395
- Adjust the WAL recovery log to include on offloaded by @moogacs in #5409
- Delete class will check for frozen tenant before cloud deletion by @moogacs in #5408
- Abort on any shard error during offloading by @moogacs in #5416
Multi-Target Vector Search
Significantly enhances search capabilities by allowing users to perform searches across multiple vectors simultaneously. This feature improves search efficiency and relevance by querying multiple vectors in a single operation, enabling more comprehensive and accurate data retrieval.
- Named vector multi search by @dirkkul in #5099
- Refactor dto.Get struct by @dirkkul in #5173
- Refactor multi target search by @dirkkul in #5190
- Add GQL support for multi target search by @dirkkul in #5172
- More multi target search fixes by @dirkkul in #5192
- Add separate vector per target for multi target search by @dirkkul in #5216
- Fix targets for near vector GQL subsearch by @dirkkul in #5329
- Disable certainty for multi vector search by @dirkkul in #5344
- More multi target tests and fixes by @dirkkul in #5346
- Multi target/fix near object with different length vectors by @tsmith023 in #5389
- Fix multi target search for BQ without cache by @dirkkul in #5391
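As a rough illustration of what searching several target vectors in one operation involves, the sketch below ranks candidates by a weighted sum of their per-target distances. This is an assumption made for illustration, not a description of how Weaviate combines results; the function names, the target names `title` and `body`, and the distances are all made up.

```python
from typing import Dict, List, Tuple


def combine(distances: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of one candidate's distances, one entry per target vector."""
    return sum(weights[target] * dist for target, dist in distances.items())


def rank(
    candidates: Dict[str, Dict[str, float]], weights: Dict[str, float]
) -> List[Tuple[str, float]]:
    """Sort candidates by combined distance, closest first."""
    scored = [(obj_id, combine(d, weights)) for obj_id, d in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical distances of three objects to the query, per target vector.
candidates = {
    "a": {"title": 0.1, "body": 0.9},
    "b": {"title": 0.4, "body": 0.4},
    "c": {"title": 0.7, "body": 0.2},
}

equal = rank(candidates, {"title": 1.0, "body": 1.0})
title_heavy = rank(candidates, {"title": 2.0, "body": 0.5})

print([obj_id for obj_id, _ in equal])        # ['b', 'c', 'a']
print([obj_id for obj_id, _ in title_heavy])  # ['a', 'b', 'c']
```

Changing the weights changes the ranking, which is why a single combined query can differ from running one search per vector and merging afterwards.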
Scalar Quantization
Optimizes storage and enhances search performance. Scalar Quantization compresses vector data by mapping the vector's floating point values to integers, significantly reducing storage size while maintaining accuracy. This compression allows for faster and more efficient searches by decreasing the amount of data processed.
- Add SIMD l2 distance calculation by @asdine in #5095
- Dot byte implementation by @asdine in #5097
- Scalar Quantization by @abdelr in #5125
- Fix compressor logger for scalar quantization by @trengrj in #5307
- Add vector generator function for BQ and SQ by @robbespo00 in #5399
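The mapping from floating point values to integers described above can be sketched as follows. This is a generic 8-bit scalar quantizer written for illustration, assuming a fixed value range `[lo, hi]` known in advance; it is not Weaviate's implementation, and `quantize` and `dequantize` are hypothetical names.

```python
from typing import List


def quantize(vector: List[float], lo: float, hi: float) -> List[int]:
    """Map each float in [lo, hi] to an integer code in 0..255."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = 255.0 / (hi - lo)
    codes = []
    for x in vector:
        clamped = min(max(x, lo), hi)  # values outside the range saturate
        codes.append(round((clamped - lo) * scale))
    return codes


def dequantize(codes: List[int], lo: float, hi: float) -> List[float]:
    """Approximate the original floats from their 8-bit codes."""
    step = (hi - lo) / 255.0
    return [lo + c * step for c in codes]


vec = [-1.0, -0.5, 0.0, 0.5, 1.0]
codes = quantize(vec, lo=-1.0, hi=1.0)
restored = dequantize(codes, lo=-1.0, hi=1.0)

# The reconstruction error is bounded by half a quantization step.
worst = max(abs(a - b) for a, b in zip(vec, restored))
print(codes[0], codes[-1], worst <= (2.0 / 255.0) / 2 + 1e-9)  # 0 255 True
```

Under these assumptions each dimension takes one byte instead of the four bytes of a 32-bit float, at the cost of a small, bounded rounding error per value.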
Async Replication
Ensures data consistency and integrity across replicas by asynchronously replicating changes. The Merkle tree-based implementation allows for efficient comparison and synchronization of data, identifying differences quickly and reducing replication lag. This asynchronous approach minimizes the impact on performance while ensuring that all replicas are kept up-to-date, providing a robust and reliable HA solution.
- Asynchronous Replication by @jeroiraz in #4245
- Do not omit async replication attribute when disabled by @jeroiraz in #5175
- Reduce hashtree sizing by @jeroiraz in #5176
- Fix async replication leaks by @jeroiraz in #5222
- Ensure shardstate is up to date by @jeroiraz in #5233
- Consider the case when the hashtree has been explicitly set to nil by @jeroiraz in #5275
- Support deleted obj during async replication by @jeroiraz in #5314
- Fix enabling async replication on existing class, and multi-tenancy validation by @parkerduckworth in #5343
- Use compact hashtree even when multi-tenancy is not enabled by @jeroiraz in #5333
- Set a limit of objects being propagated per hashbeat iteration by @jeroiraz in #5368
- Improve err handling when fetching shard node name by @jeroiraz in #5415
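To show why a Merkle tree (hash tree) makes replica comparison cheap, here is a small self-contained sketch. It assumes objects are hashed into a fixed, power-of-two number of leaf buckets and that two replicas compare trees from the root down; the bucket layout, function names and data are illustrative assumptions and do not reflect Weaviate's hashtree format.

```python
import hashlib
from typing import Dict, List


def leaf_hashes(objects: Dict[str, str], num_leaves: int) -> List[bytes]:
    """Assign objects to buckets by key hash, then hash each bucket's contents."""
    buckets: List[List[str]] = [[] for _ in range(num_leaves)]
    for key, value in objects.items():
        digest = hashlib.sha256(key.encode()).digest()
        idx = int.from_bytes(digest[:4], "big") % num_leaves
        buckets[idx].append(f"{key}={value}")
    return [hashlib.sha256("\n".join(sorted(b)).encode()).digest() for b in buckets]


def build_tree(leaves: List[bytes]) -> List[List[bytes]]:
    """Return tree levels from root (index 0) down to the leaves."""
    if len(leaves) == 0 or len(leaves) & (len(leaves) - 1):
        raise ValueError("number of leaves must be a power of two")
    levels = [leaves]
    while len(levels[0]) > 1:
        prev = levels[0]
        parents = [
            hashlib.sha256(prev[i] + prev[i + 1]).digest()
            for i in range(0, len(prev), 2)
        ]
        levels.insert(0, parents)
    return levels


def diff_leaves(a: List[List[bytes]], b: List[List[bytes]],
                level: int = 0, index: int = 0) -> List[int]:
    """Descend only into subtrees whose hashes differ; return differing leaf indexes."""
    if a[level][index] == b[level][index]:
        return []  # identical subtree, nothing below needs to be compared
    if level == len(a) - 1:
        return [index]
    return (diff_leaves(a, b, level + 1, 2 * index)
            + diff_leaves(a, b, level + 1, 2 * index + 1))


# Two hypothetical replicas that differ in a single object.
replica_a = {f"obj-{i}": "v1" for i in range(20)}
replica_b = dict(replica_a)
replica_b["obj-7"] = "v2"

tree_a = build_tree(leaf_hashes(replica_a, 8))
tree_b = build_tree(leaf_hashes(replica_b, 8))
stale = diff_leaves(tree_a, tree_b)
print(len(stale))  # 1
```

Only the bucket containing the changed object is reported, so a follow-up synchronization step would need to exchange that bucket's objects rather than the whole data set.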
Improved Range Queries
Introduces a new range filter index type, drastically improving the performance of large scale queries on numeric ranges.
- Range roaring set index by @aliszka in #5128
- Bucket range reader: improved panic handling by @aliszka in #5236
- Range roaring set index tests by @aliszka in #5280
- Range roaring set index - rename indexRangeble to indexRangeFilters by @aliszka in #5299
- Migrate existing class properties so IndexRangeFilters won't be nil by @parkerduckworth in #5347
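The property-level setting referred to in the list above is named `indexRangeFilters`. A minimal sketch of how it might appear in a collection definition, together with a numeric range filter, is shown below as Python dictionaries. The `Product` class, the `price` property and the filter bounds are hypothetical, and the surrounding JSON layout is assumed to follow the usual Weaviate schema and `where` filter shape; check the official documentation before relying on it.

```python
import json

# Hypothetical collection; only the `indexRangeFilters` name comes from the notes above.
product_class = {
    "class": "Product",
    "properties": [
        {"name": "price", "dataType": ["number"], "indexRangeFilters": True},
    ],
}

# A numeric range expressed as two comparison operands joined with And.
price_between_10_and_50 = {
    "operator": "And",
    "operands": [
        {"path": ["price"], "operator": "GreaterThanEqual", "valueNumber": 10},
        {"path": ["price"], "operator": "LessThan", "valueNumber": 50},
    ],
}

print(json.dumps(product_class, indent=2))
print(json.dumps(price_between_10_and_50, indent=2))
```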
Other
- Added an environment variable for disabling the go profiler setup by @bennycortese in #4290
- Remove duplicate contextionary url in dev setup by @aminst in #4806
- Deprecate old schema impl. before RAFT by @moogacs in #4943
- Use SafeErrorCompounder in migrator by @moogacs in #5388
- Add reindex endpoint by @asdine in #5305
Module Improvements
- Add API based modules by default by @databyjp in #5002
- Add dynamic generative modules syntax (module system and GraphQL support) by @antas-marcin in #5238
- Fix gRPC handling of dynamic generate result by @antas-marcin in #5261
- Module Generative Anthropic by @cdpierse in #5210
- Add Generative Anthropic e2e tests by @antas-marcin in #5290
- Add multi2vec-palm to the list of API-based modules to enable by @databyjp in #5313
Performance Improvements
v1.26.0-rc.1 - Tenant Offloading, Scalar Quantization, Async Replication, Improved Range Queries, Multi-tenancy Compression Support
This is a release candidate for the upcoming v1.26.0 release
A release candidate (RC) means the release is considered feature complete and has finished beta-testing. Any issues discovered during the RC phase will lead to new RC releases. The final RC release becomes the stable release. We welcome your feedback on this pre-release.
This pre-release contains:
- Tenant Offloading
- Scalar Quantization
- Async Replication
- Improved Range Queries
- Async Indexing Observability
v1.25.8 - Tombstone panic prevention Fix, HNSW PQ Fixes, GraphQL with primitive type named classes Fix, Schema V2 loading in metadata only mode Fix, Sentry logging integration
Breaking Changes
none
New Features
none
Fixes
- fix: gql dependency on RAFT schema by @moogacs in #5285
- Add new flag & logic to force search to query all replicas of a shard if possible by @reyreaud-l in #5295
- [PQ] Skip empty key on preload of compressed vectors by @trengrj in #5319
- fix search deduplication using wrong index by @reyreaud-l in #5322
- [v1.24] Improved Error handling during shard start up by @etiennedi in #5320
- add unit test for search dedup by @reyreaud-l in #5323
- Add more tests for search_dedup by @etiennedi in #5327
- Fixing the ReadThisRound var for PQ by @abdelr in #5283
- rescoring when flat searching by @abdelr in #5326
- Add debug log when starting to load shard by @etiennedi in #5334
- Forbid loading the local DB if the node is a metadata only node by @reyreaud-l in #5359
- Better error for missing vectorizer by @dirkkul in #5348
- disallow reloading the DB if metadata node only by @reyreaud-l in #5360
- Bugfix/tombstone panic prevention v2 by @jbendotnet in #5353
- add warning log for full replicas search by @reyreaud-l in #5336
- Fix GraphQL schema error when class name is a GraphQL primitive type name by @antas-marcin in #5370
- Enable recover Roaring as list WAL from 1.26 by @amourao in #5379
- Integrate sentry (Opt-in, disabled by default) by @etiennedi in #5221
- Report common unexpected errors (vector search failure, shard init) to Sentry by @etiennedi in #5386
Full Changelog: v1.25.7...v1.25.8
v1.24.21 - Tombstone panic prevention Fix, GraphQL with primitive type named classes Fix, Sentry logging integration
Breaking Changes
none
New Features
none
Fixes
- Add debug log when starting to load shard by @etiennedi in #5334
- Bugfix/tombstone panic prevention v2 by @jbendotnet in #5353
- add warning log for full replicas search by @reyreaud-l in #5336
- Fix GraphQL schema error when class name is a GraphQL primitive type name by @antas-marcin in #5370
- Integrate sentry (Opt-in, disabled by default) by @etiennedi in #5221
- Report common unexpected errors (vector search failure, shard init) to Sentry by @etiennedi in #5386
Full Changelog: v1.24.20...v1.24.21