doc: basic hash bin delegation repo example + test #1700

Merged
merged 4 commits into from
Dec 10, 2021

Conversation

lukpueh
Member

@lukpueh lukpueh commented Dec 1, 2021

Fixes #1673 (together with #1685)

Description of the changes being introduced by the pull request:

As 'repository_tool' and 'repository_lib' are being deprecated, hash bin delegation interfaces are no longer available in this implementation. The example code in this file demonstrates how to easily implement those interfaces, and how to use them together with the TUF metadata API, to perform hash bin delegation.

Note, the hash bin delegation logic in this example is largely copied from 'repository_{lib, tool}', but modernized and simplified for this purpose.
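
As background for readers of this thread, the binning arithmetic behind hash bin delegation can be sketched roughly as follows. This is a minimal illustration, not the code added in this pull request: the value `NUMBER_OF_BINS = 32` and the helper names `bin_name` and `generate_hash_bins` are assumptions chosen for the example.

```python
# Assumed for illustration; a power of two divides the prefix space evenly.
NUMBER_OF_BINS = 32

# Number of hex digits needed to write the highest bin index (31 -> "1f").
PREFIX_LEN = len(f"{NUMBER_OF_BINS - 1:x}")

# Every possible hash prefix of that length, and how many fall into one bin.
NUMBER_OF_PREFIXES = 16 ** PREFIX_LEN
BIN_SIZE = NUMBER_OF_PREFIXES // NUMBER_OF_BINS


def bin_name(low: int, high: int) -> str:
    """Name a bin after the zero-padded hex range of prefixes it serves."""
    if low == high:
        return f"{low:0{PREFIX_LEN}x}"
    return f"{low:0{PREFIX_LEN}x}-{high:0{PREFIX_LEN}x}"


def generate_hash_bins():
    """Yield (bin name, list of hash prefixes) for every bin."""
    for low in range(0, NUMBER_OF_PREFIXES, BIN_SIZE):
        high = low + BIN_SIZE - 1
        prefixes = [f"{p:0{PREFIX_LEN}x}" for p in range(low, high + 1)]
        yield bin_name(low, high), prefixes


bins = dict(generate_hash_bins())
```

In a real repository, each bin name would presumably become a delegated targets role, and its prefix list would go into that delegation's hash prefix field; that wiring is left out here.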

Notes to reviewers

  • I sneaked in a tiny unrelated docfix (954c159). If someone is appalled by this breach of pull request protocol, I'm happy to move it to a separate one. :P
  • I would appreciate simplification-suggestions regarding my probably sometimes long-winded phrasing in the explanatory texts/comments of hashed_bin_delegation.py. (@jhdalek55 to the rescue!)
  • Besides purely technical linguistic help, I'd appreciate an assessment of whether the example needs more/less/other explanations about TUF/Metadata API/Hash bin delegation, etc.

Thanks!!

Please verify and check that the pull request fulfills the following requirements:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

@coveralls

coveralls commented Dec 1, 2021

Pull Request Test Coverage Report for Build 1562712885

  • 59 of 60 (98.33%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.005%) to 97.577%

Changes Missing Coverage                         Covered Lines   Changed/Added Lines   %
examples/repo_example/hashed_bin_delegation.py   59              60                    98.33%

Totals
Change from base Build 1554266096: -0.005%
Covered Lines: 4091
Relevant Lines: 4176

💛 - Coveralls

lukpueh added a commit to lukpueh/tuf that referenced this pull request Dec 2, 2021
Add 1.0.0 announcement document and point to it in main README.

TODO:
- Commit message
- PR (blocks on theupdateframework#1693, theupdateframework#1675, maybe theupdateframework#1700)
@lukpueh lukpueh mentioned this pull request Dec 3, 2021
3 tasks
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
As 'repository_tool' and 'repository_lib' are being deprecated,
hash bin delegation interfaces are no longer available in this
implementation. The example code in this file demonstrates how to
easily implement those interfaces, and how to use them together
with the TUF metadata API, to perform hash bin delegation.

Note, the hash bin delegation logic in this example is largely
copied from repository_{lib, tool}, and modernized and simplified
for this purpose.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Member

@jku jku left a comment

LGTM, I don't have any specific comments on this.

In the future this definitely looks like something that would make sense to package with the repository API somehow:

  • manage a "bin container" Targets automatically so repository API user can just "add a new target to bin container" (and in reality target is added to the correct bin)
  • figure out how to handle NUMBER_OF_BINS changes (at least provide advice on when and how to handle growth in targets numbers) and other maintenance things.

... but like a lot of repository functionality, it looks like something I'd really rather see experimented with in an implementation before we add it to any library code.
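
The "bin container" idea above could look roughly like the following sketch. The class `BinContainer` and its methods are hypothetical and not part of any TUF library; the sketch assumes the target path is hashed with SHA-256 and keeps plain lists of paths instead of real Targets metadata.

```python
import hashlib


class BinContainer:
    """Hypothetical helper that routes target paths to hash bins."""

    def __init__(self, number_of_bins: int = 32):
        # Hex digits needed to write the highest bin index.
        self.prefix_len = len(f"{number_of_bins - 1:x}")
        # Number of hash prefixes served by each bin.
        self.bin_size = 16 ** self.prefix_len // number_of_bins
        # Maps bin name to the target paths stored in that bin.
        self.bins = {}

    def bin_for(self, path: str) -> str:
        """Return the name of the bin responsible for the given path."""
        digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
        prefix = int(digest[: self.prefix_len], 16)
        low = prefix - prefix % self.bin_size
        high = low + self.bin_size - 1
        width = self.prefix_len
        if low == high:
            return f"{low:0{width}x}"
        return f"{low:0{width}x}-{high:0{width}x}"

    def add_target(self, path: str) -> str:
        """Store the path in its bin and return that bin's name."""
        name = self.bin_for(path)
        self.bins.setdefault(name, []).append(path)
        return name


container = BinContainer()
chosen_bin = container.add_target("packages/example-1.0.tar.gz")
```

The caller only ever names the target; which bin it lands in follows from the hash of its path.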

@lukpueh
Member Author

lukpueh commented Dec 7, 2021

figure out how to handle NUMBER_OF_BINS changes

I had the same thought while revising (and trying to re-understand) the binning logic. In reality, a user does not care about the exact NUMBER_OF_BINS, but rather about an "appropriate" distribution of target files over bins. So it probably makes more sense to provide an interface where the user specifies the expected number of target files (or corresponding classes of target numbers).
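
One possible shape for such an interface is sketched below. The function `suggest_number_of_bins` and its `targets_per_bin` default are purely hypothetical; the default is an arbitrary placeholder, not a recommendation from this discussion. The sketch rounds up to a power of two so that the bins divide the hexadecimal prefix space evenly.

```python
import math


def suggest_number_of_bins(expected_targets: int, targets_per_bin: int = 1000) -> int:
    """Hypothetical helper: derive a bin count from an expected target count."""
    if expected_targets < 1 or targets_per_bin < 1:
        raise ValueError("both arguments must be positive")
    # Minimum number of bins so that no bin is expected to exceed its share.
    needed = math.ceil(expected_targets / targets_per_bin)
    # Round up to the next power of two.
    return 1 << (needed - 1).bit_length()
```

For example, `suggest_number_of_bins(32000)` yields 32 under these assumptions, while one more expected target pushes the result to 64.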

# The available digits in the hexadecimal representation of the number of bins
# (minus one, counting starts at zero) determines the length of any hash prefix,
# i.e. how many left digits need to be considered to assign the hash to a bin.
PREFIX_LEN = len(f"{NUMBER_OF_BINS - 1:x}") # 2
Collaborator

Maybe add a simple word or comment on what 1:x does.
I had to find it online and because I hadn't used it, it wasn't trivial to me.
In that context, it could be useful to explain somewhere why we use hexadecimal strings.
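
For readers with the same question, here is a small sketch of what the `:x` part does. In `f"{NUMBER_OF_BINS - 1:x}"`, the expression is `NUMBER_OF_BINS - 1` and `x` is a format specifier that renders the integer as lowercase hexadecimal without the `0x` prefix. The value `NUMBER_OF_BINS = 32` is assumed here for illustration.

```python
NUMBER_OF_BINS = 32  # assumed value for this illustration

highest_bin_index = NUMBER_OF_BINS - 1  # 31, since counting starts at zero

# "x" formats an integer as lowercase hexadecimal digits.
as_hex = f"{highest_bin_index:x}"

# Equivalent spellings using the built-ins format() and hex().
assert as_hex == format(highest_bin_index, "x") == hex(highest_bin_index)[2:]

# The number of hex digits is the prefix length used for binning.
PREFIX_LEN = len(as_hex)

# The same specifier with zero padding produces fixed-width prefixes.
padded = f"{5:0{PREFIX_LEN}x}"
```

Hexadecimal is a natural fit because hash digests are usually handled as hex strings, so the leading `PREFIX_LEN` characters of a digest can be sliced off and compared directly with bin prefixes.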

Member Author

Thanks for the comment, @MVrachev! I just pushed a commit trying to make this clearer. Let me know what you think.

Collaborator

@MVrachev MVrachev left a comment

Only one small suggestion.
Other than that, as a reader who has never read anything specific about hash bin delegations, it seems understandable.
LGTM!

# by each individual bin (BIN_SIZE):
#
# The prefix length is the number of digits in the hexadecimal representation
# (see ':x') of the number of bins minus one (counting starts at zero), i.e. ...
Collaborator

Suggested change
# (see ':x') of the number of bins minus one (counting starts at zero), i.e. ...
# (see "x" integer representation from https://docs.python.org/3/library/string.html#module-string)
of the number of bins minus one (counting starts at zero), i.e. ...

Member Author

Thanks, Martin! I included a slightly modified version of this suggestion. Merging when CI/CD returns...

Tries to clarify the introductory text in the hash bin delegation
example.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
@lukpueh lukpueh merged commit 8209189 into theupdateframework:develop Dec 10, 2021
lukpueh added a commit to lukpueh/tuf that referenced this pull request Dec 10, 2021
Following parallel merges of theupdateframework#1700 (added new test method),
and theupdateframework#1710 (started running mypy on tests), ci/cd fails in the
develop branch. This is fixed in this patch.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Successfully merging this pull request may close these issues.

Docs: Add repository tutorial based on metadata API
4 participants