Jegao/label hot fix with main2 #430
Merged
* Transferring Varun's changes from external fork with squash merge
* generating multiple gt's for each filter label + search with multiple filter labels (code cleanup)
* supporting no-filter + one filter label + filter label file (multiple filters) while computing GT
* generating multiple gt's + refactoring code for readability & cleanliness
* adding more tests for filtered search
* updating pr-test to test filtered cases
* lowering recall requirement for disk index
* transferred functions to filter_utils
* adding more tests for build and search without universal label
* adding one_per_point distribution to generate_synthetic_labels + cleaning up artifacts after compute gt + removing minor errors
* refactoring search_disk_index to use a query filter vector
---------
Co-authored-by: patelyash <patelyash@microsoft.com>
Co-authored-by: Varun Sivashankar <t-varunsi@microsoft.com>
- add code for two variants of filtered index, readme and CI tests
- add utils for synthetic label generation and CI tests.
* Add co-authors
Co-authored-by: ravishankar <rakri@microsoft.com>
Co-authored-by: Varun Sivashankar <t-varunsi@microsoft.com>
---------
Co-authored-by: ravishankar <rakri@microsoft.com>
Co-authored-by: David Kaczynski <dkaczynski@microsoft.com>
Co-authored-by: Siddharth Gollapudi <t-gollapudis@microsoft.com>
Co-authored-by: Neelam Mahapatro <nmahapatro@microsoft.com>
Co-authored-by: Harsha Vardhan Simhadri <harshasi@microsoft.com>
Co-authored-by: Harsha Vardhan Simhadri <harsha-simhadri@users.noreply.github.com>
Co-authored-by: REDMOND\patelyash <patelyash@microsoft.com>
Co-authored-by: Varun Sivashankar <t-varunsi@microsoft.com>
* Rather than sift through all the *.cpp and *.h in the root directory, we're looking for only the sources in our main repository for formatting. Git submodules are excluded
* Removing the --Werror flag only until we actually format all of the code in a future commit
* We're choosing to base our style on the Microsoft style guide and not make any changes
* Running format action on source code. Settling on Google styling. Settled on '.clang-format' instead of '_clang-format'. Fixed instructions such that only clang-format 12 is installed (13 changes SortIncludes options from true/false to a trinary set of options, none of which include the word 'false')
* Enabling error on malformatted file
* Revert "Enabling error on malformatted file" This reverts commit fa33e82.
* Revert "Running format action on source code. Settling on Google styling. Settled on '.clang-format' instead of '_clang-format'. Fixed instructions such that only clang-format 12 is installed (13 changes SortIncludes options from true/false to a trinary set of options, none of which include the word 'false')" This reverts commit e0281be.
* Trying again; formatting rules based on Google rules, disables sorting includes as that breaks us, and enabling check on build.
* Somehow this was missed in the mass format. Formatting include/distance.h.
* Manually fixing the formatting because clang-format wouldn't, but WOULD flag it as invalid
Fix typo in SSD index readme
Remove warnings affecting internal build pipelines
---------
Co-authored-by: Yiyong Lin <yiyolin@microsoft.com>
* Add support for multiple frozen points
* Add the missing parameters to the constructor.
* Added filtered disk index readme
* remove _u, _s typedefs
* converting uint64's to size_t where they represent array offsets
---------
Co-authored-by: harsha vardhan simhadri <harsha.v.simhadri@gmail.com>
add codebook passing and pq/opq dim overwrite.
* some bug fixes when enabling EXEC_ENV_OLS
* avoid unit test failure
* unit test testing
* changed based on gopal's suggestion
* update load_impl(AlignedFileReader &reader)
* change the load_impl to be identical to objectstore
* remove blank
* Output distance file
* fix
---------
Co-authored-by: Shengjie Qian <shenqian@microsoft.com>
* Add WIN macro for non-win function
* fix vc16 compile issue
* fix compile issue
* fix compile issue
* fix compile issue
* clean up code
* small bug fix
* test ubuntu fail
* formatting
* re-triggering unit test
* Refactor of diskannpy module code.
* 0.5.0.rc1 for python and enabling the build-python portion of the pr-test process.
* clang-format changes
* In theory this should speed up the python build drastically by only building the wheel for the python version and OS we're attempting to fan out to in our CICD job tree
* Missed a dollar sign
* Copy/pasting left a CICD step name that implied we were running a code formatting check when instead we were building a wheel. This is now fixed.
* In theory, readying the release action too. We won't know if it works until it merges and we cut a release, but at least the paths have been fixed
* Designated initializers just happened to work on linux but shouldn't have as they weren't added until C++20
* Formatting
* small bug fix
* test ubuntu fail
* formatting
* re-triggering unit test
* cause error, remove two character params
* cause error, remove two character params
* unit test fix
* clean up code
* add more accurate error handling
* fix filter build
* re-trigger test
* try lower recall number
* test with more value
* revert back to test unit test
Github actions fix: composite action `python-wheel` publishes wheels to the `wheels` artifact. `python-release` workflow then looks for it in the `dist` artifact, which does not exist. This is a CICD change only.
* Fixed inputs typo
* Action 'checkout@v2' is deprecated
Trying a new release of the python lib to see if there was a packaging error in the publication of rc1.
* Fixed param name in comments
* Hide rust/target
* Removed the logger and verified that the logging capability is the root cause of our consistent segfault errors in python. Perhaps it also will fix any issues in our label test too? I'd like to push it to GH and see.
* Formatting fixes
* Revert "Formatting fixes" This reverts commit 9042595.
* Revert "Removed the logger and verified that the logging capability is the root cause of our consistent segfault errors in python. Perhaps it also will fix any issues in our label test too? I'd like to push it to GH and see." This reverts commit 7561009.
* The custom logging implementation is causing segfaults in python. We're not sure exactly where, but this is the easiest and quickest way to get a working python release.
* All the integration tests are failing, and there's a chance the virtual dtor on AbstractDataStore might be the culprit, though I am not sure why. I'm hoping it is so it won't fall on the logging changes.
* Formatting. Again.
* Added utilities to standardize help across cli tools. #370
* Made three option groupings (required/optional/print)
* Moved common parameter descriptions to a common file. #370
* Updated usage statement for search_disk_app #370
* Updated range_search_disk_index to use the new required/optional format. #370
* Updated test apps to use the new help format. #370
* Fixed format issue. #370
* Updated help format for the 'build' apps. #370
* Fixed code formatting. #370
* Added src/*.hpp to the clang format. #370
* Moved header into the headers directory. #370
* Added missing configs. #370
* Removed superfluous paths from include. #370
* Added #pragma once. #370
* Typo fixes. #370
* Fixed capitalization of constant. #370
* Make fail_if_recall description more accurate. #370
* Changed to using set notation. #370
* Better explanations for some options. #370
* Added short explanation of file format. #370
---------
Co-authored-by: Jon McLean <none@example.com>
Co-authored-by: Jonathan McLean <Jonathan.McLean@microsoft.com>
* Identified the appropriate build flags to get a working python build that doesn't rely on -march=native or -mtune=native. We've run benchmarks on multiple computers that indicate the only important flag other than -mavx2 -msse2 -mfma is -funroll-loops. Optimization levels such as -O1, -O2, or -O3 actually make for less performant code. -Ofast is unavailable for use in Python, as it causes problems with floating point math in Python
* 1.22 was left in a comment despite 1.25 being the value specified
* Python 3.8 is not supported by numpy 1.25, so we're removing it.
* Work-in-progress commit adding JSON output for timings. in-mem-static is complete
* Added timings to dynamic and total-time to static
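As a rough illustration of the JSON timing output this commit describes, here is a minimal sketch. The helper names (`timed`, `run_with_timings`) and the key names (`"total"` and the step names) are hypothetical and are not taken from the DiskANN apps.

```python
import json
import time


def timed(timings, name, fn, *args):
    """Run fn(*args) and record its wall-clock duration under `name`."""
    start = time.perf_counter()
    result = fn(*args)
    timings[name] = time.perf_counter() - start
    return result


def run_with_timings(steps):
    """Run (name, callable) steps in order; return the timings as a JSON string.

    The "total" key is a hypothetical name for the end-to-end duration.
    """
    timings = {}
    total_start = time.perf_counter()
    for name, fn in steps:
        timed(timings, name, fn)
    timings["total"] = time.perf_counter() - total_start
    return json.dumps(timings, indent=2)
```

A caller could pass something like `[("build", build_fn), ("search", search_fn)]` and write the returned string to a file for later comparison across runs.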
Using the correct README for our publication to pypi.
* small bug fix
* test ubuntu fail
* formatting
* re-triggering unit test
* add small fix for in_mem_data_store when EXEC_ENV_OLS is enabled
* fix: use the passed in io_limit
* fix to be clang-formatted
* While simply creating a unit test to repro Issue #400, I found a number of bugs that I needed to address just to get it to work the way I had intended. This does not yet have what I would consider a comprehensive suite of test coverage for the DynamicMemoryIndex, but we at least do save it with the metadata file, we can load it correctly, and saving *always* calls consolidate_deletes() prior to save if any item has been marked for deletion.
* We actually cannot save without compacting before save anyway. Removing the parameter from save() and hardcoding it to True until we can actually support it.
* Addressing some PR comments and readying a 0.5.0.rc5 release
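The save behaviour described in this commit (always compacting pending deletions before writing) can be sketched with a toy class. `ToyDynamicIndex` and all of its methods are hypothetical stand-ins written for illustration; this is not the diskannpy `DynamicMemoryIndex` API, and the JSON file layout is an assumption.

```python
import json
from pathlib import Path


class ToyDynamicIndex:
    """Hypothetical in-memory index; illustrates compact-before-save only."""

    def __init__(self):
        self._vectors = {}     # id -> vector
        self._deleted = set()  # ids marked for deletion, not yet removed

    def insert(self, id_, vector):
        self._vectors[id_] = list(vector)
        self._deleted.discard(id_)

    def mark_deleted(self, id_):
        # Lazy delete: the vector stays in memory until consolidation.
        if id_ in self._vectors:
            self._deleted.add(id_)

    def consolidate_deletes(self):
        # Physically drop every vector that was marked for deletion.
        for id_ in self._deleted:
            del self._vectors[id_]
        self._deleted.clear()

    def save(self, path):
        # No opt-out parameter: pending deletes are always compacted first.
        if self._deleted:
            self.consolidate_deletes()
        payload = {str(key): vector for key, vector in self._vectors.items()}
        Path(path).write_text(json.dumps(payload))

    @classmethod
    def load(cls, path):
        index = cls()
        for key, vector in json.loads(Path(path).read_text()).items():
            index.insert(int(key), vector)
        return index
```

With this shape, a saved file never contains an item that was marked for deletion, so a later load cannot resurrect it.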
…ueue<SSDThreadData*> type, otherwise the default null_T is uninitialized and could point to arbitrary memory (#408)
* Some early staging for README updates and pyproject updates for a 0.6.0 release for diskannpy.
* Trying to fix the CI badge to point toward main's latest build
* Updating documentation for pdoc generation
* Documentation updates. Tightened up the API to drop list support (there were entirely too many cases where it wouldn't work, and it's easier to just tell people to convert it themselves)
* Some module reorganization to make pdoc actually display the docstrings for variables re-exported at the top level
* A copy paste happened that shouldn't have.
* Updating the apps to use the new 0.6.0 api
* Addressing PR feedback
* Some of the documentation changes didn't get made in both from_file or the constructor
Reference Issues/PRs
What does this implement/fix? Briefly explain your changes.
Any other comments?