Select model for inserting slices according to the model id #52
This pull request introduces several changes to improve the handling of embedding models and their corresponding Qdrant collections in the `qdrant_handler` module and associated test files. The most significant updates include transitioning from language-specific collection names to embedding model-specific naming, refactoring the `classify_documents_per_collection` function for clarity and efficiency, and updating the test cases to align with these changes.

Updates to embedding model handling:

* Refactored the `classify_documents_per_collection` function in `welearn_datastack/modules/qdrant_handler.py` to use a mapping (`model_name_collection_name`) for resolving collections based on embedding model names, replacing the previous language-based logic. This simplifies the logic and reduces redundancy.

Test case updates for embedding model-specific collections:

* Updated the collection names in `tests/qdrant_syncronizer/test_qdrant_handler.py` and `tests/qdrant_syncronizer/test_qdrant_syncronizer.py` to follow the new embedding model-specific naming convention (e.g., `collection_welearn_en_embmodel` → `collection_welearn_en_english-embmodel`). [1] [2]
* Updated `test_qdrant_handler.py` to include the embedding model name as a parameter for `FakeSlice` instances, ensuring proper alignment with the new naming logic. [1] [2] [3]
* Added an `EmbeddingModel` instance in `test_qdrant_syncronizer.py` to simulate embedding model metadata and updated `DocumentSlice` instances to reference this model. [1] [2]

Additional improvements:

* Added the `defaultdict` import in `welearn_datastack/modules/qdrant_handler.py` to simplify the initialization of the `ret` dictionary in `classify_documents_per_collection`.
* Added `Mock` and `EmbeddingModel` imports in test files to support the new embedding model logic. [1] [2]
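For readers who want a feel for the model-based lookup, here is a minimal, self-contained sketch of the idea. It is not the code from this pull request: `SliceStub`, `build_model_to_collection`, and `classify_slices_sketch` are hypothetical names, and the sketch assumes collection names follow the `collection_welearn_<lang>_<model name>` pattern seen in the test examples. Slices whose model has no matching collection are simply skipped here, which may differ from the real behavior.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SliceStub:
    # Hypothetical stand-in for a document slice; field names are assumptions.
    document_id: str
    embedding_model_name: str


def build_model_to_collection(collection_names):
    """Map each embedding model name to the collection that stores it.

    Assumes names look like 'collection_welearn_<lang>_<model name>'.
    """
    mapping = {}
    for name in collection_names:
        # maxsplit=3 keeps any underscores inside the model name intact.
        parts = name.split("_", 3)
        if len(parts) == 4:
            mapping[parts[3]] = name
    return mapping


def classify_slices_sketch(slices, collection_names):
    """Group document ids by collection, resolved via the embedding model name."""
    model_name_collection_name = build_model_to_collection(collection_names)
    # defaultdict removes the need to initialize each collection's set by hand.
    ret = defaultdict(set)
    for s in slices:
        collection = model_name_collection_name.get(s.embedding_model_name)
        if collection is None:
            continue  # no collection for this model; skipped in this sketch
        ret[collection].add(s.document_id)
    return dict(ret)


collections = [
    "collection_welearn_en_english-embmodel",
    "collection_welearn_fr_french-embmodel",
]
slices = [
    SliceStub("d1", "english-embmodel"),
    SliceStub("d2", "french-embmodel"),
    SliceStub("d3", "english-embmodel"),
]
result = classify_slices_sketch(slices, collections)
```

Keying the lookup on the model name rather than the language means two models for the same language can coexist in separate collections without extra branching.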