Conversation

@JSv4
Collaborator

@JSv4 JSv4 commented Mar 15, 2025

pgvector doesn't make it easy to store variable-dimension vectors, and we will want to mix and match vectors of the same dimension produced by different embedders. This change leverages the pipeline and a restructured data model for embeddings to support a mixed-embeddings environment. The initial plan is to set the embedder at the Corpus level, but in theory a single corpus could hold mixed vectors too.
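
For readers who want a concrete picture, here is a minimal in-memory sketch of the idea, not the code in this PR. It assumes one fixed-width column per supported dimension and keys every row by the embedder that produced it, so two embedders with the same dimension never collide. All names (`SUPPORTED_DIMS`, `column_for`, `EmbeddingStore`, the `vector_<dim>` column naming, the embedder labels) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical set of supported dimensions: one fixed-width column per entry.
SUPPORTED_DIMS = (384, 768, 1536, 3072)


def column_for(dim: int) -> str:
    """Map a vector length to the fixed-width column that can hold it."""
    if dim not in SUPPORTED_DIMS:
        raise ValueError(f"unsupported embedding dimension: {dim}")
    return f"vector_{dim}"


@dataclass
class EmbeddingRow:
    """One stored embedding: the owning object, the embedder, one filled column."""
    object_id: int
    embedder: str
    columns: Dict[str, List[float]] = field(default_factory=dict)


class EmbeddingStore:
    """In-memory stand-in for an embeddings table keyed by (object, embedder)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[int, str], EmbeddingRow] = {}

    def add(self, object_id: int, embedder: str, vector: List[float]) -> EmbeddingRow:
        # The column is chosen from the vector length; the embedder name keeps
        # same-dimension vectors from different models apart.
        column = column_for(len(vector))
        row = EmbeddingRow(object_id, embedder, {column: list(vector)})
        self._rows[(object_id, embedder)] = row
        return row

    def get(self, object_id: int, embedder: str, dim: int) -> Optional[List[float]]:
        row = self._rows.get((object_id, embedder))
        if row is None:
            return None
        return row.columns.get(column_for(dim))

    def ids_for(self, embedder: str, dim: int) -> List[int]:
        """Object ids that have a vector from this embedder at this dimension."""
        column = column_for(dim)
        return sorted(
            row.object_id
            for row in self._rows.values()
            if row.embedder == embedder and column in row.columns
        )
```

A corpus-level setting would then only need to name the embedder to query with; the dimension follows from the vectors that embedder produces.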

JSv4 added 30 commits March 11, 2025 00:39
… kludgy. If we want to separately search description via embeddings, make it a note.
…eation functions. Switched embeddings generation for doc to a corpus action when doc is added as the embeddings are now based on corpus and not same dimension in all cases.
… version AND is going to be deprecated anyway.
@codecov

codecov bot commented Mar 23, 2025

Codecov Report

Attention: Patch coverage is 93.33710% with 118 lines in your changes missing coverage. Please review.

Project coverage is 89.63%. Comparing base (9a97509) to head (c24dfc5).
Report is 55 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| opencontractserver/utils/embeddings.py | 83.11% | 13 Missing ⚠️ |
| opencontractserver/shared/Managers.py | 80.35% | 11 Missing ⚠️ |
| opencontractserver/annotations/admin.py | 88.75% | 9 Missing ⚠️ |
| opencontractserver/tasks/embeddings_task.py | 82.35% | 9 Missing ⚠️ |
| opencontractserver/pipeline/utils.py | 84.31% | 8 Missing ⚠️ |
| opencontractserver/shared/mixins.py | 78.37% | 8 Missing ⚠️ |
| opencontractserver/utils/files.py | 42.85% | 8 Missing ⚠️ |
| opencontractserver/tests/test_embeddings_task.py | 97.34% | 7 Missing ⚠️ |
| opencontractserver/llms/vector_stores.py | 92.50% | 6 Missing ⚠️ |
| opencontractserver/annotations/models.py | 90.69% | 4 Missing ⚠️ |

... and 15 more
Additional details and impacted files

@@            Coverage Diff            @@
##             main     #339     +/-   ##
=========================================
  Coverage   89.63%   89.63%             
=========================================
  Files         261      278     +17     
  Lines       10515    12021   +1506     
=========================================
+ Hits         9425    10775   +1350     
- Misses       1090     1246    +156     
| Files with missing lines | Coverage Δ |
|---|---|
| ..._annotation_embeddings_note_embeddings_and_more.py | 100.00% <100.00%> (ø) |
| opencontractserver/corpuses/apps.py | 81.81% <100.00%> (-5.69%) ⬇️ |
| ..._corpusaction_options_corpus_preferred_embedder.py | 100.00% <100.00%> (ø) |
| ...migrations/0017_alter_corpus_preferred_embedder.py | 100.00% <100.00%> (ø) |
| opencontractserver/corpuses/models.py | 92.72% <100.00%> (+0.49%) ⬆️ |
| opencontractserver/corpuses/signals.py | 100.00% <ø> (ø) |
| opencontractserver/documents/apps.py | 88.23% <100.00%> (+0.73%) ⬆️ |
| ...ions/0015_remove_document_description_embedding.py | 100.00% <100.00%> (ø) |
| opencontractserver/documents/models.py | 98.75% <100.00%> (+0.06%) ⬆️ |
| opencontractserver/extracts/models.py | 100.00% <ø> (ø) |

... and 38 more

... and 2 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
JSv4 added 22 commits March 23, 2025 16:55
@JSv4 JSv4 merged commit 6e9f117 into main Apr 2, 2025
5 of 6 checks passed
@JSv4 JSv4 deleted the JSv4/pluggable-vectors branch April 2, 2025 12:47

2 participants