-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Update][Ease of Use] Fix Types #2292
Labels
Milestone
Comments
HammadB
added a commit
that referenced
this issue
Jun 26, 2024
## Description of changes *Summarize the changes made by this PR.* The primary intent of this PR is to remove the is_metadata_valid invariant which was a workaround for our metadata strategy generating faulty metadata and then us special casing all uses of the record set strategy to handle invalid generations. This PR patches the metadata generation to not generate invalid metadata. - Adds modes in test_add to add a medium sized record set. This was initially timing out in hypothesis's generation. Hypothesis bounds the buffer size of the bytes it uses to do random generation, so generating larger metadata was resulting in examples being marked at OVERRUN by conjecture (gleaned from issues like HypothesisWorks/hypothesis#3999 + reading hypothesis code + stepping through it). This PR adds the ability to generate N fixed metadata entries and uniformly distribute them over the record set, reducing the overall entropy. - Fixes a bug that test_embeddings was not handling None as a possible metadata state, since this state was never generated. Added an explicit test for this. - Fixes a bug in the reference filtering implementation in test_filtering that did not handle None metadata since that state was never generated. This PR is forced to touch types related to metadata, which are incorrect and cause typing errors. I ignored the errors to minimize the surface area of this change and defer those changes to the pass mentioned in #2292. ## Test plan *How are these changes tested?* These changes are covered by existing tests, and - [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes No external changes required.
Anush008
pushed a commit
to Anush008/chroma
that referenced
this issue
Jun 27, 2024
## Description of changes *Summarize the changes made by this PR.* The primary intent of this PR is to remove the is_metadata_valid invariant which was a workaround for our metadata strategy generating faulty metadata and then us special casing all uses of the record set strategy to handle invalid generations. This PR patches the metadata generation to not generate invalid metadata. - Adds modes in test_add to add a medium sized record set. This was initially timing out in hypothesis's generation. Hypothesis bounds the buffer size of the bytes it uses to do random generation, so generating larger metadata was resulting in examples being marked at OVERRUN by conjecture (gleaned from issues like HypothesisWorks/hypothesis#3999 + reading hypothesis code + stepping through it). This PR adds the ability to generate N fixed metadata entries and uniformly distribute them over the record set, reducing the overall entropy. - Fixes a bug that test_embeddings was not handling None as a possible metadata state, since this state was never generated. Added an explicit test for this. - Fixes a bug in the reference filtering implementation in test_filtering that did not handle None metadata since that state was never generated. This PR is forced to touch types related to metadata, which are incorrect and cause typing errors. I ignored the errors to minimize the surface area of this change and defer those changes to the pass mentioned in chroma-core#2292. ## Test plan *How are these changes tested?* These changes are covered by existing tests, and - [x] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust ## Documentation Changes No external changes required.
We also alias |
This was referenced Aug 27, 2024
Closed
1 task
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Fix Types
Types are currently broken in many places throughout the project. Some of these, like
CollectionMetadata
have been broken for over a year. A good type system improves code correctness and hence development velocity.We must have 𝔬𝔯𝔡𝔢𝔯.
API Design
N/A
[Complexity] Subtask
The text was updated successfully, but these errors were encountered: