-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature arbitrary size encodings #366
Feature arbitrary size encodings #366
Conversation
Encodings of any number of bytes are now supported. Closes #163
Codecov Report
@@ Coverage Diff @@
## master #366 +/- ##
=======================================
Coverage 94.57% 94.57%
=======================================
Files 16 16
Lines 792 792
=======================================
Hits 749 749
Misses 43 43 |
Bumps [hypothesis](https://github.com/HypothesisWorks/hypothesis) from 5.41.0 to 5.41.1. - [Release notes](https://github.com/HypothesisWorks/hypothesis/releases) - [Commits](HypothesisWorks/hypothesis@hypothesis-python-5.41.0...hypothesis-python-5.41.1) Signed-off-by: dependabot-preview[bot] <support@dependabot.com>
Encodings of any number of bytes are now supported. Closes #163
@wilko77 did you find a way to make the coverage work for external PRs? As far as I can tell this is passing all the tests on each system and is ready for review. |
I started looking into the pipeline (#365). I made the coverage optional, that works. However, we then have another step where we upload the artifacts to a feed. And this still fails. It complains that the Anonlink build service hasn't got the permissions to publish there. |
My guess is that "Publish package to test feed" stage could be only enabled for the main branches - https://docs.microsoft.com/en-us/azure/devops/pipelines/process/conditions?view=azure-devops&tabs=yaml
Yeah I think if you could get the release pipeline to pick up all the artifacts directly you wouldn't need the test feed. I made it so we could test linked feature branches across the multiple projects. Primarily so anonlink-entity-service could test a not-yet released version of the anonlink library by pulling from the test feed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice.
A few minor things. See comments.
Thank you.
If you merge master, then you'll get the updated CI definition, and with it the satisfaction of seeing the CI pipeline succeed.
anonlink/similarities/_dice.pyx
Outdated
# To permit arbitrary size input_data, we may need to pad with zeros to align to a word boundary? | ||
# Eg say our input is 32 bits and our array size is 8, our output_size is already correct at 4 | ||
# but we may want to make the input 64 bits? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand this. I though you implemented arbitrary sized inputs? What is this issue here exactly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was me recording my thinking early on. As you know, before this PR the CPP code assumed inputs would be a multiple of WORD_SIZE
(8 bytes) in length, at least for supporting popcount
we could have padded the input with zeros up to the nearest word size boundary and avoided modifying the CPP code.
Ultimately I didn't take that approach anywhere so I'll remove the comment.
…gs' into feature-arbitrary-size-encodings # Conflicts: # anonlink/similarities/_dice.pyx # requirements.txt
Builds on #363 as suggested by @nbgl in #163
Implements in C a path to popcount and dice one to many when the encoding size is not word aligned. This means you can finally work on encodings of just 1 byte or 13 bytes (if you are so inclined).
I haven't done any benchmarking although I've expanded and added several tests.