Investigate metadata scalability #39

Closed
trishankkarthik opened this issue Mar 15, 2013 · 5 comments

Comments

@trishankkarthik
Member

commented Mar 15, 2013

How will the implementation handle metadata for a software update repository with a large number of targets and target delegations? At present, it looks like uncompressed metadata will become quite large once a repository has enough targets and target delegations.

A few solutions:

  1. Compress metadata with standard (e.g. GZIP) techniques.
  2. Investigate metadata difference schemes.
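
To get a rough feel for option 1, here is a minimal sketch that serializes a made-up targets metadata dictionary to JSON and compares its size before and after gzip. The field names (`targets`, `length`, `hashes`) and the `compress_metadata` helper are assumptions for illustration, not the project's actual metadata format or API.

```python
import gzip
import json
from typing import Tuple


def compress_metadata(metadata: dict) -> Tuple[bytes, bytes]:
    """Return (raw, gzipped) JSON bytes for a metadata dictionary."""
    # Compact, key-sorted JSON so the uncompressed size is not inflated by whitespace.
    raw = json.dumps(metadata, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return raw, gzip.compress(raw)


# Hypothetical metadata listing 1000 targets; paths and hashes are placeholders.
metadata = {
    "targets": {
        "packages/project{}/release.tar.gz".format(i): {
            "length": 1024 + i,
            "hashes": {"sha256": "{:064x}".format(i)},
        }
        for i in range(1000)
    }
}

raw, packed = compress_metadata(metadata)
print("uncompressed bytes:", len(raw))
print("gzip bytes:", len(packed))
```

Real metadata contains high-entropy hashes and signatures, so the ratio seen with this placeholder data says little about what a real repository would achieve; measuring on real metadata is still needed.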

ghost assigned trishankkarthik Mar 15, 2013

@trishankkarthik

Member Author

commented Mar 19, 2013

#44 will give us some data about this issue.

@trishankkarthik

Member Author

commented Apr 1, 2013

Two things to do efficiently: download only the subset of target metadata relevant to the target file in question, and download as much as possible in as few HTTP requests as possible.
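
As a rough illustration of the first point, the sketch below walks a toy delegation tree and returns only the roles whose metadata would be needed for one target. The `DELEGATIONS` table, the role names, and `roles_to_fetch` are all hypothetical; the sketch assumes each parent role lists the explicit paths it delegates to each child.

```python
from typing import Dict, List

# Hypothetical delegation table: parent role -> {child role: delegated target paths}.
DELEGATIONS: Dict[str, Dict[str, List[str]]] = {
    "targets": {
        "targets/project_a": ["project_a/a-1.0.tar.gz", "project_a/a-1.1.tar.gz"],
        "targets/project_b": ["project_b/b-2.0.tar.gz"],
    },
    "targets/project_a": {
        "targets/project_a/releases": ["project_a/a-1.1.tar.gz"],
    },
}


def roles_to_fetch(target: str, role: str = "targets") -> List[str]:
    """Return the roles whose metadata is needed to look up `target`."""
    needed = [role]  # the current role's metadata is always read
    for child, delegated_paths in DELEGATIONS.get(role, {}).items():
        # Descend only into children that the parent trusts with this target.
        if target in delegated_paths:
            needed.extend(roles_to_fetch(target, child))
    return needed


print(roles_to_fetch("project_a/a-1.1.tar.gz"))
print(roles_to_fetch("project_b/b-2.0.tar.gz"))
```

With this layout a client looking for a `project_b` file never touches the `project_a` metadata, which is the saving the comment is after. The second point (fewer HTTP requests) is not addressed by this sketch.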

@trishankkarthik

Member Author

commented Apr 1, 2013

See #57 for a method to reduce metadata size in the common case where a delegated role is trusted with wildcard target paths.

@trishankkarthik

Member Author

commented Apr 2, 2013

Maybe consider binary data exchange formats, such as Protocol Buffers or Cap'n Proto.

vladimir-v-diaz added a commit that referenced this issue Jul 29, 2013
Continue design changes to address issues #57, #39, #48
A directory listed under the "paths" field of a parent metadata delegation is understood to cover all
subdirectories and files beneath it that the delegated role is trusted to update.  The "paths" field may
still list multiple, arbitrary, and explicit file paths & directories.  The previous implementation
allowed only explicit file paths in the "paths" field of the parent role metadata.  This commit changes
that behaviour to also allow directories (replicating wildcards) to minimize the size of parent metadata.
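
The directory rule described in the commit message can be sketched as a prefix check. This is a minimal illustration, not the code from the commit: the `is_trusted_path` helper is hypothetical, and it assumes that a "paths" entry ending in `/` denotes a directory while any other entry is an explicit file path.

```python
import posixpath
from typing import List


def is_trusted_path(target_path: str, delegated_paths: List[str]) -> bool:
    """Return True if `target_path` is covered by an entry in `delegated_paths`."""
    # Normalize first so "a/../b" style paths cannot escape a delegated directory.
    target = posixpath.normpath(target_path)
    for entry in delegated_paths:
        normalized = posixpath.normpath(entry)
        if target == normalized:
            return True  # explicit file path match
        # Assumption: a trailing slash marks a directory covering everything below it.
        if entry.endswith("/") and target.startswith(normalized + "/"):
            return True
    return False


paths = ["README.txt", "packages/project_a/"]
print(is_trusted_path("packages/project_a/1.0/a-1.0.tar.gz", paths))
print(is_trusted_path("packages/project_ab/ab-1.0.tar.gz", paths))
```

A single directory entry stands in for every file below it, so the parent metadata no longer grows with the number of files the delegated role publishes.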

trishankkarthik referenced this issue Jul 31, 2013

@trishankkarthik

Member Author

commented Aug 5, 2013

The tentatively-named "lazy bin walk" scheme to address metadata scalability is discussed in our design document for PyPI+TUF.
