[TST] More benchmark queries for regex #4910
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving.
- Testing, Bugs, Errors, Logs, Documentation
- System Compatibility
- Quality
Expand Regex Benchmark Coverage and Update Dependencies

This PR significantly expands the regex and full-text search benchmark queries, using the bigcode/the-stack-dedup Rust dataset for more comprehensive and realistic benchmarks. It also updates a set of core dependencies (notably Arrow and Parquet to 55.1, with lockfile and cargo file adjustments), and adapts the affected k8s WAL integration tests to the new fragment sizes and manifest expectations that follow from the Arrow/Parquet upgrades. Additionally, a new dataset runner for the Rust dataset is implemented for more robust benchmarking.

Key Changes
- Substantially broadened regex and literal search benchmark patterns for more realistic evaluation, especially in rust/index/benches/literal.rs and rust/worker/benches/regex.rs

Affected Areas
- Benchmarks: regex.rs, literal.rs

This summary was automatically generated by @propel-code-bot
4d57d3b to 99c96f1
Discussed offline: we will babysit staging, just to be safe, to make sure the Arrow version increment does not break old data.
## Description of changes

_Summarize the changes made by this PR._

- Improvements & Bug fixes
  - This PR adds more regex patterns to the benchmark. The benchmark also serves as an integration test for regex, since it compares the results against brute-force evaluation.
  - Updates a few dependencies. Verified that there should be no breaking changes.
  - Updates some wal3 tests because the fragment size changed after the dependency update. Existing fragments should remain compatible, and the manifest should still be valid.
- New functionality
  - N/A

## Test plan

_How are these changes tested?_

- [ ] Tests pass locally with `pytest` for python, `yarn test` for js, `cargo test` for rust

## Documentation Changes

_Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the [docs section](https://github.com/chroma-core/chroma/tree/main/docs/docs.trychroma.com)?_
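
For reviewers unfamiliar with the cross-check mentioned under "Description of changes", here is a minimal sketch of the idea: answer a query through an index, answer it again with a plain scan, and assert that the two results match. It is an illustration only and not the code in this PR. `TrigramIndex`, `build`, `search`, and `brute_force` are hypothetical names, and it uses literal substring search with the standard library alone as a stand-in for the regex path.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical trigram index (an assumption for illustration, not Chroma's index):
/// maps every 3-byte window to the ids of the documents that contain it.
struct TrigramIndex<'a> {
    docs: &'a [&'a str],
    postings: HashMap<[u8; 3], BTreeSet<usize>>,
}

impl<'a> TrigramIndex<'a> {
    /// Build the posting lists by sliding a 3-byte window over each document.
    fn build(docs: &'a [&'a str]) -> Self {
        let mut postings: HashMap<[u8; 3], BTreeSet<usize>> = HashMap::new();
        for (id, doc) in docs.iter().enumerate() {
            for w in doc.as_bytes().windows(3) {
                postings.entry([w[0], w[1], w[2]]).or_default().insert(id);
            }
        }
        Self { docs, postings }
    }

    /// Indexed search: intersect the posting lists of the query's trigrams,
    /// then verify each surviving candidate with a real substring check.
    fn search(&self, literal: &str) -> BTreeSet<usize> {
        let bytes = literal.as_bytes();
        if bytes.len() < 3 {
            // Queries shorter than one trigram cannot use the index; scan instead.
            return brute_force(self.docs, literal);
        }
        let mut candidates: Option<BTreeSet<usize>> = None;
        for w in bytes.windows(3) {
            let ids = match self.postings.get(&[w[0], w[1], w[2]]) {
                Some(ids) => ids,
                // A trigram that appears nowhere means no document can match.
                None => return BTreeSet::new(),
            };
            candidates = Some(match candidates {
                None => ids.clone(),
                Some(current) => current.intersection(ids).cloned().collect(),
            });
        }
        candidates
            .unwrap_or_default()
            .into_iter()
            .filter(|&id| self.docs[id].contains(literal))
            .collect()
    }
}

/// Reference answer: scan every document and keep the ids that contain the literal.
fn brute_force(docs: &[&str], literal: &str) -> BTreeSet<usize> {
    docs.iter()
        .enumerate()
        .filter(|(_, doc)| doc.contains(literal))
        .map(|(id, _)| id)
        .collect()
}
```

A benchmark built this way can run `assert_eq!(index.search(q), brute_force(docs, q))` for every query before timing it, so a regression in the indexed path shows up as a failed assertion as well as a changed timing.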