[TST] Property Test Generation Fixes #2383
base: main
Conversation
…nt in favor of point test
Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving:

- Testing, Bugs, Errors, Logs, Documentation
- System Compatibility
- Quality
Closes #2377 #2379

## Description of changes

*Summarize the changes made by this PR.*

- Improvements & Bug fixes
  - Make dimension and version lookup optional in Collection model creation in the fastapi client

## Test plan

*How are these changes tested?*

- [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust

## Documentation Changes

N/A
## Description of changes

Fix a typo in a comment in chromadb/db/system.py:

```
"""
Create a new collection any (-> and) associated resources in the SysDB.
"""
```

## Test plan

No test needed.

## Documentation Changes

Not a public-facing API documentation change.
Merging into stacked branch
metadatas = []
for i in range(len(ids)):
    metadatas.append(generated_metadatas[i % len(generated_metadatas)])
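The loop above tiles a small pool of generated metadata dicts cyclically over the full id list. A minimal standalone sketch of that modulo distribution (the `ids` and `generated_metadatas` values here are illustrative stand-ins, not the test's actual data):

```python
# Distribute a small pool of metadata dicts cyclically over all ids.
ids = [f"id{i}" for i in range(5)]
generated_metadatas = [{"tag": "a"}, {"tag": "b"}]

metadatas = []
for i in range(len(ids)):
    metadatas.append(generated_metadatas[i % len(generated_metadatas)])

# Even-indexed ids get {"tag": "a"}, odd-indexed ids get {"tag": "b"}.
```

Because only `len(generated_metadatas)` distinct dicts are drawn, the randomness Hypothesis has to supply stays small no matter how large the record set grows.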
cool
All looks good to me.
@@ -75,6 +118,8 @@ def test_add(
)

# Hypothesis struggles to generate large record sets so we explicitly create
# a large record set
def create_large_recordset(
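The signature is truncated in the diff above, but the comment describes the intent: build the large record set directly rather than asking Hypothesis to generate it. A hedged sketch of what such a helper could look like; the field names, sizes, and return shape are assumptions for illustration, not the PR's actual implementation:

```python
import random

def create_large_recordset(min_size: int = 50000, dim: int = 2) -> dict:
    """Explicitly build a large record set, since Hypothesis struggles to
    generate one within its internal byte-buffer limits.

    All names and defaults here are illustrative assumptions.
    """
    return {
        "ids": [str(i) for i in range(min_size)],
        # Randomize embeddings so the HNSW graph is not degenerate
        # (identical vectors would produce a bad graph).
        "embeddings": [
            [random.random() for _ in range(dim)] for _ in range(min_size)
        ],
        "metadatas": [{"i": i} for i in range(min_size)],
    }
```

Generating the set imperatively sidesteps Hypothesis's buffer bound entirely, at the cost of losing shrinking for this particular input.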
Feel like we could add some more randomization in here. For example, all embeddings are the same, which is guaranteed to produce a bad HNSW graph. However, that's unrelated to the focus of this PR.
Agreed. I tried to replace this with Hypothesis, but it still needs some munging; I cut myself a task.
Description of changes
Summarize the changes made by this PR.
The primary intent of this PR is to remove the `is_metadata_valid` invariant, which was a workaround for our metadata strategy generating faulty metadata, forcing us to special-case every use of the record set strategy to handle invalid generations. This PR patches the metadata generation to no longer produce invalid metadata.
Adds modes in `test_add` to add a medium-sized record set. This was initially timing out during Hypothesis's generation. Hypothesis bounds the buffer size of the bytes it uses for random generation, so generating larger metadata resulted in examples being marked as OVERRUN by conjecture (gleaned from issues like "Tests fail with StopTest (OVERRUN) when generating a random integer (strategies.randoms)" HypothesisWorks/hypothesis#3999, plus reading the Hypothesis code and stepping through it). This PR adds the ability to generate N fixed metadata entries and distribute them uniformly over the record set, reducing the overall entropy.
Fixes a bug where `test_embeddings` was not handling None as a possible metadata state, since that state was never generated. Added an explicit test for this.
Fixes a bug in the reference filtering implementation in `test_filtering` that did not handle None metadata, since that state was never generated.
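Once None becomes a reachable metadata state, the reference filter has to decide what a metadata clause means for a record with no metadata. A hedged sketch of the kind of None-handling involved; `matches_where` and its equality-only semantics are illustrative, not the PR's actual reference implementation:

```python
def matches_where(metadata, where: dict) -> bool:
    """Reference predicate: does a record's metadata satisfy a where clause?

    A record with no metadata (None) cannot match any non-empty metadata
    clause; an empty clause matches everything. Illustrative sketch only,
    with simple equality semantics.
    """
    if not where:
        return True
    if metadata is None:
        return False
    return all(metadata.get(key) == value for key, value in where.items())
```

Without the explicit `metadata is None` branch, the reference implementation would raise on attribute access instead of correctly reporting a non-match, which is the shape of the bug described above.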
This PR is forced to touch types related to metadata, which are incorrect and cause typing errors. I ignored the errors to minimize the surface area of this change and defer those changes to the pass mentioned in #2292.
Test plan
How are these changes tested?
These changes are covered by existing tests, and:

- `pytest` for python
- `yarn test` for js
- `cargo test` for rust

Documentation Changes

No external changes required.