Releases · AnswerDotAI/RAGatouille
0.0.8post1
Minor fix: corrects a `from time import time` import introduced in the indexing overhaul, which caused crashes because `time` was then used improperly.
0.0.8
0.0.8 is finally here!
Major changes:
- Indexing overhaul contributed by @jlscheerer #158
- Relaxed dependencies to ensure lower install load #173
- Indexing for under 100k documents will by default no longer use Faiss, performing K-Means in pure PyTorch instead. This is a somewhat experimental change, but benchmark results are encouraging and it greatly increases compatibility. #173
- CRUD improvements by @anirudhdharmarajan. Feature is still experimental/not fully supported, but rapidly improving!
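The pure-PyTorch K-Means change above can be sketched roughly as follows. This is a minimal, hypothetical Lloyd's-algorithm sketch (function name and defaults are illustrative, not RAGatouille's actual implementation):

```python
import torch

def kmeans(embeddings: torch.Tensor, k: int, iters: int = 20) -> torch.Tensor:
    """Minimal Lloyd's k-means in pure PyTorch (illustrative sketch only)."""
    # Initialise centroids from k randomly chosen embeddings.
    idx = torch.randperm(embeddings.size(0))[:k]
    centroids = embeddings[idx].clone()
    for _ in range(iters):
        # Assign each embedding to its nearest centroid.
        dists = torch.cdist(embeddings, centroids)
        assignments = dists.argmin(dim=1)
        # Recompute each centroid as the mean of its assigned points.
        for c in range(k):
            mask = assignments == c
            if mask.any():
                centroids[c] = embeddings[mask].mean(dim=0)
    return centroids
```

Running clustering this way avoids the Faiss dependency entirely, which is the compatibility win referred to above.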
Fixes:
- Many small bug fixes, mainly around typing
- Training triplets improvement (already present in 0.0.7 post versions) by @JoshuaPurtell
0.0.7post3
0.0.7post2
Fixes & tweaks to the previous release:
- Automatically adjust batch size on longer contexts (32 for 512 tokens, 16 for 1024, 8 for 2048, halving like this down to a minimum of 1)
- Apply dynamic max context length to reranking
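The halving rule above (batch size halves each time the maximum length doubles past 512 tokens) can be sketched like this; the function name and signature are hypothetical, not RAGatouille's actual code:

```python
def adjusted_batch_size(max_length: int, base_batch: int = 32) -> int:
    """Halve the batch size each time max_length doubles beyond 512 tokens,
    never going below 1. Illustrative sketch of the scaling rule only."""
    batch = base_batch
    length = 512
    while length < max_length and batch > 1:
        batch //= 2   # 32 -> 16 -> 8 -> ...
        length *= 2   # 512 -> 1024 -> 2048 -> ...
    return batch
```

So `adjusted_batch_size(2048)` yields 8, matching the numbers listed above.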
0.0.7post1
Release focusing on length adjustments. Much more dynamism and on-the-fly adaptation, both for query length and maximum document length!
- Remove hardcoded maximum length: it is now inferred from your base model's maximum position encodings. This enables support for longer-context ColBERT, such as Jina ColBERT
- Upstream changes to `colbert-ai` to allow any base model to be used, rather than only pre-defined ones.
- Query length now adjusts dynamically, from 32 (the hardcoded minimum) up to your model's maximum context window for longer queries.
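The dynamic query-length behaviour described above amounts to clamping between the hardcoded floor of 32 and the model's maximum context window. A minimal sketch, with hypothetical names:

```python
def dynamic_query_maxlen(n_query_tokens: int, model_max_length: int) -> int:
    """Clamp the query length between the hardcoded minimum of 32 and the
    model's maximum context window. Hypothetical illustrative sketch, not
    RAGatouille's exact logic."""
    return max(32, min(n_query_tokens, model_max_length))
```

Short queries are still padded up to 32 tokens as in standard ColBERT, while long queries are no longer truncated until they hit the model's own limit.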
0.0.6c2
(notes encompassing changes in the last few PyPi releases that were undocumented until now)
Changes:
- Query only a subset of documents based on doc ids by @PrimoUomo89 #94
- Return chunk ids in results thanks to @PrimoUomo89 #125
- Lower the number of k-means iterations when running more is unnecessary #129
- Properly license the library as Apache-2 on PyPi
Fixes:
- Dynamically increase search hyperparameters for large `k` values and lower document counts, reducing the number of situations where the total number of documents returned is substantially below `k` #131
- Fix to enable training data processing with hard negatives turned off by @corrius #117
- Proper handling of different input types when pre-processing training triplets by @GautamR-Samagra #115
0.0.6b5
0.0.6b2
0.0.6b0
0.0.6a1
Fixes & minor improvements:
- Better verbosity control, especially for high-CRUD encode() scenarios
- Fixed max document length often being set too low when using in-memory encode() or rerank()
- Allow forced overwrite of indexes (#63)
- Fix wrong argument being passed to negative miner (should not have had any impact in practice)