[TST] Property Test Generation Fixes #2383

HammadB · 2024-06-19T19:46:39Z

Description of changes

Summarize the changes made by this PR.
The primary intent of this PR is to remove the is_metadata_valid invariant. That invariant was a workaround: our metadata strategy generated faulty metadata, and every use of the record set strategy was special-cased to handle the invalid generations. This PR patches the metadata generation so that it no longer generates invalid metadata.

Adds modes in test_add to add a medium-sized record set. This was initially timing out in hypothesis's generation. Hypothesis bounds the size of the byte buffer it uses for random generation, so generating larger metadata resulted in examples being marked as OVERRUN by conjecture (gleaned from issues like "Tests fail with StopTest (OVERRUN) when generating a random integer (strategies.randoms)", HypothesisWorks/hypothesis#3999, plus reading the hypothesis code and stepping through it). This PR adds the ability to generate N fixed metadata entries and uniformly distribute them over the record set, reducing the overall entropy.
Fixes a bug where test_embeddings did not handle None as a possible metadata state, since this state was never generated. Adds an explicit test for this.
Fixes a bug in the reference filtering implementation in test_filtering, which did not handle None metadata since that state was never generated.
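
As a rough illustration of the None handling described in the last two items, a reference filter can treat a record without metadata as matching no non-empty filter. This is a minimal sketch, not the reference implementation in test_filtering: the names `matches_where` and `filter_ids` are hypothetical, and only top-level equality filters are modelled.

```python
from typing import Any, Dict, List, Optional

# Assumption: a record's metadata is either a flat dict or None.
Metadata = Optional[Dict[str, Any]]


def matches_where(metadata: Metadata, where: Dict[str, Any]) -> bool:
    """Return True when every key in `where` equals the record's value."""
    if metadata is None:
        # A record with no metadata can only satisfy an empty filter.
        return not where
    return all(
        key in metadata and metadata[key] == value
        for key, value in where.items()
    )


def filter_ids(
    ids: List[str], metadatas: List[Metadata], where: Dict[str, Any]
) -> List[str]:
    """Return the ids whose metadata satisfies `where`, preserving order."""
    return [id_ for id_, md in zip(ids, metadatas) if matches_where(md, where)]
```

The explicit `None` branch is the point of the sketch: without it, `key in metadata` raises a TypeError as soon as a record without metadata is generated.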

This PR is forced to touch types related to metadata, which are incorrect and cause typing errors. I ignored the errors to minimize the surface area of this change and deferred the type fixes to the pass mentioned in #2292.

Test plan

How are these changes tested?
These changes are covered by existing tests.

Tests pass locally with pytest for python, yarn test for js, cargo test for rust

Documentation Changes

No external changes required.

github-actions · 2024-06-19T19:46:50Z

HammadB · 2024-06-19T19:46:56Z

[TST] Fix test filtering and add recordset test tiers #2385
[TST] Property Test Generation Fixes #2383 👈
main

This stack of pull requests is managed by Graphite.

) Closes #2377 #2379

## Description of changes

*Summarize the changes made by this PR.*
- Improvements & Bug fixes
  - Making dimension and version lookup optional in the Collection model creation in fastapi client

## Test plan

*How are these changes tested?*
- [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust

## Documentation Changes

N/A

## Description of changes

Fix a typo in comment section in chromadb/db/system.py

```
"""
Create a new collection any (-> and) associated resources in the SysDB.
"""
```

## Test plan

Do not need test

## Documentation Change

Not public facing API documentation change.

Merging into stacked branch

codetheweb · 2024-06-20T21:00:46Z

chromadb/test/property/strategies.py

+    metadatas = []
+    for i in range(len(ids)):
+        metadatas.append(generated_metadatas[i % len(generated_metadatas)])
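
The round-robin assignment in this hunk can be sketched as a standalone function. `distribute_metadatas` and the `pool` parameter are hypothetical names used for illustration; the sketch assumes the pool of pre-generated metadata entries is non-empty and may itself contain None.

```python
from typing import Any, Dict, List, Optional

# Assumption: a metadata entry is either a flat dict or None.
Metadata = Optional[Dict[str, Any]]


def distribute_metadatas(ids: List[str], pool: List[Metadata]) -> List[Metadata]:
    """Assign one pool entry to each id by cycling through the pool in order."""
    if not pool:
        # i % len(pool) would divide by zero on an empty pool.
        raise ValueError("pool must contain at least one metadata entry")
    return [pool[i % len(pool)] for i in range(len(ids))]
```

Because only `len(pool)` entries are ever drawn from the generator, the amount of randomness consumed stays fixed as the record set grows. Note that records sharing a pool entry also share the same dict object, so a test that mutates metadata in place would need to copy it first.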
+


atroyn

All looks good to me.

atroyn · 2024-06-20T21:53:59Z

chromadb/test/property/test_add.py

@@ -75,6 +118,8 @@ def test_add(
    )


+# Hypothesis struggles to generate large record sets so we explicitly create
+# a large record set
 def create_large_recordset(


I feel like we could add some more randomization in here. For example, all embeddings are the same, which is guaranteed to produce a bad HNSW graph. This is unrelated to the focus of this PR, however.

Agreed. I tried to replace this with hypothesis, but it still needs some munging, so I cut myself a task for it.
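
One possible direction for the randomization suggested above, as a sketch only: draw each embedding from a seeded generator so that the large record set stays reproducible while no two embeddings are identical in practice. `random_embeddings` and its parameters are hypothetical and are not part of create_large_recordset.

```python
import random
from typing import List


def random_embeddings(count: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Return `count` embeddings of length `dim`, drawn uniformly from [-1, 1]."""
    # A private Random instance keeps the output reproducible for a given seed
    # without touching the global random state that other tests may rely on.
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(count)]
```

Seeding trades coverage for determinism: a fixed seed always exercises the same vectors, so a failing run can be replayed exactly.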

[TST] Make metadata strategy return valid metadata and remove invaria…

d0a7570

…nt in favor of point test

HammadB and others added 5 commits June 19, 2024 12:57

Make type chane less intrusive

1f36f2f

Remove redundant type

cff6b44

Make test_embeddings handle exisiting None

4b3081c

import uuid

ab93b62

HammadB mentioned this pull request Jun 20, 2024

[TST] Fix test filtering and add recordset test tiers #2385

Merged

1 task

imaffe and others added 7 commits June 20, 2024 08:34

[TST] Fix test filtering and add recordset test tiers (#2385)

dafd1ff

Merging into stacked branch

[TST] Make metadata strategy return valid metadata and remove invaria…

d5c53e1

…nt in favor of point test

Make type chane less intrusive

873e80a

Remove redundant type

5ec1d10

Make test_embeddings handle exisiting None

ece9608

import uuid

eb278d2

HammadB changed the title ~~[TST] Make metadata strategy return valid metadata and remove invariant in favor of point test~~ [TST] Property Test Generation Fixes Jun 20, 2024

HammadB added 2 commits June 20, 2024 13:43

merge

548c3ab

Cleanup

81e63fe

HammadB marked this pull request as ready for review June 20, 2024 20:49

Empty commit

a32e8aa

codetheweb reviewed Jun 20, 2024

HammadB added 2 commits June 20, 2024 14:18

healthcheck settings

5e93cde

healthcheck enable all

486563b

atroyn approved these changes Jun 20, 2024
