Stabilize vector renormalisation #3928

Merged
timvisee merged 4 commits into dev from test-stable-normalize
Mar 27, 2024

@timvisee (Member) commented Mar 27, 2024

Preprocessing or normalizing vectors should be stable: preprocessing or normalizing a vector a second time should produce exactly the same values.

We need this stability because transferring points to another shard with stream records transfer (and SyncPoints) normalizes vectors again, and their values must remain exactly the same.

This PR makes renormalization stable by using 1e-6 as the normalization threshold. It adds a test to assert that it works as expected for our use case.
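To illustrate the idea, here is a minimal sketch of threshold-based stable normalization. It is not the code from this PR: the names `NORMALIZATION_THRESHOLD`, `is_zero_or_normalized` and `normalize_stable` are hypothetical, and comparing the threshold against the squared length (rather than the length) is an assumption. Only the value 1e-6 comes from the description above.

```rust
/// Hypothetical constant name; the value 1e-6 is the threshold stated in this PR.
const NORMALIZATION_THRESHOLD: f32 = 1.0e-6;

/// Returns true if the squared length is (almost) zero, or already close
/// enough to 1 that the vector should be treated as normalized.
/// Assumption: the threshold is applied to the squared length.
fn is_zero_or_normalized(squared_length: f32) -> bool {
    squared_length < f32::EPSILON
        || (squared_length - 1.0).abs() <= NORMALIZATION_THRESHOLD
}

/// Normalizes a vector to unit length. If the vector is already normalized
/// within the threshold, it is returned untouched, so a second pass cannot
/// change any value.
fn normalize_stable(vector: Vec<f32>) -> Vec<f32> {
    let squared_length: f32 = vector.iter().map(|x| x * x).sum();
    if is_zero_or_normalized(squared_length) {
        // Keep the old values: dividing again could shift the last bits.
        return vector;
    }
    let length = squared_length.sqrt();
    vector.iter().map(|x| x / length).collect()
}
```

With this shape, `normalize_stable(normalize_stable(v))` returns the same values as `normalize_stable(v)` whenever the first pass lands within the threshold of unit length, which is the property the transfer path relies on.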

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you formatted your code locally using cargo +nightly fmt --all command prior to submission?
  3. Have you checked your code using cargo clippy --all --all-features command?

@timvisee timvisee changed the title from "Add stable renormalization test" to "Stabilize vector renormalisation" Mar 27, 2024
@timvisee timvisee marked this pull request as ready for review March 27, 2024 15:56
@timvisee timvisee requested a review from generall March 27, 2024 15:56
@timvisee timvisee merged commit 6cfc97c into dev Mar 27, 2024
17 checks passed
@timvisee timvisee deleted the test-stable-normalize branch March 27, 2024 16:08
timvisee added a commit that referenced this pull request Apr 22, 2024
* Add normalization stability test

* Move stable preprocess test closer to implementation

* Make renormalizing stable, keep old values if within certain threshold

* Test with 1500 dimensions