Investigate metadata scalability #39

Closed
trishankkarthik opened this Issue Mar 15, 2013 · 5 comments

@trishankkarthik
The Update Framework (TUF) member

How does the implementation plan to handle metadata for a software update repository with a large number of targets and target delegations? At present, the uncompressed metadata appears to grow quite large as the number of targets and target delegations increases.

A few solutions:
1. Compress metadata with a standard technique (e.g. gzip).
2. Investigate metadata difference schemes.
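
To get a rough feel for option 1, here is a minimal sketch that serializes a made-up targets-like document as JSON and compresses it with gzip. The field layout, the `build_sample_metadata` and `compress_metadata` helpers, and the file names are all assumptions for illustration, not the actual TUF metadata schema, and the placeholder hashes compress far better than real digests would.

```python
import gzip
import json


def build_sample_metadata(num_targets):
    """Build a hypothetical targets-like document (NOT the real TUF schema)."""
    targets = {
        f"packages/pkg{i}/pkg{i}-1.0.tar.gz": {
            "length": 1000 + i,
            # Placeholder digest: the index rendered as 64 hex characters.
            "hashes": {"sha256": f"{i:064x}"},
        }
        for i in range(num_targets)
    }
    return {"signed": {"_type": "Targets", "targets": targets}}


def compress_metadata(metadata):
    """Return (raw_bytes, gzip_bytes) for a JSON-serializable document."""
    raw = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return raw, gzip.compress(raw)


raw, compressed = compress_metadata(build_sample_metadata(1000))
print(f"uncompressed: {len(raw)} bytes, gzip: {len(compressed)} bytes")
```

Measuring real repository metadata the same way would give a more honest ratio than this synthetic sample.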

@trishankkarthik

#44 will give us some data about this issue.

@trishankkarthik

Things to do efficiently: downloading only the subset of target metadata relevant to the target file in question; downloading as much as possible in as few HTTP requests as possible.
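
As a sketch of the first goal, the walk below fetches only the role metadata that could cover a given target and skips unrelated delegations. The in-memory `REPOSITORY` dict, the role names, and the `roles_needed_for` helper are hypothetical stand-ins for real metadata downloads. The prefix test on `paths` is a simplifying assumption, and the sample tree has no delegation cycles.

```python
# Hypothetical in-memory "repository": role name -> role metadata.
# A real client would download each role's metadata file instead.
REPOSITORY = {
    "targets": {
        "targets": [],
        "delegations": [
            {"name": "django", "paths": ["packages/django/"]},
            {"name": "flask", "paths": ["packages/flask/"]},
        ],
    },
    "django": {"targets": ["packages/django/django-1.5.tar.gz"], "delegations": []},
    "flask": {"targets": ["packages/flask/flask-0.9.tar.gz"], "delegations": []},
}


def roles_needed_for(target_path, repository, root_role="targets"):
    """Return the roles visited, in order, while looking for target_path.

    Only delegations whose paths cover the target are followed, so
    metadata for unrelated roles is never requested.
    """
    fetched = []
    stack = [root_role]
    while stack:
        role = stack.pop()
        fetched.append(role)  # stands in for one metadata download
        metadata = repository[role]
        if target_path in metadata["targets"]:
            return fetched
        # Push in reverse so delegations are visited in their listed order.
        for delegation in reversed(metadata["delegations"]):
            if any(target_path.startswith(p) for p in delegation["paths"]):
                stack.append(delegation["name"])
    return fetched  # target not found; these roles were still consulted


print(roles_needed_for("packages/flask/flask-0.9.tar.gz", REPOSITORY))
```

The second goal (fewer HTTP requests) is not addressed here: each visited role still costs one request in this sketch.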

@trishankkarthik

See #57 for a method to reduce metadata size in the common case where a delegated role is trusted with wildcard target paths.
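
To illustrate why wildcard-style trust can shrink parent metadata, the sketch below compares a delegation that lists every file explicitly with one that names a single directory. The trailing-slash convention for directories, the `is_trusted_path` helper, and the sample paths are assumptions made for this example, not the behaviour defined in #57.

```python
import json


def is_trusted_path(target_path, delegated_paths):
    """Check whether target_path is covered by a delegation's "paths" list.

    Assumption for this sketch: an entry ending in "/" is a directory and
    covers everything beneath it; any other entry must match exactly.
    """
    for entry in delegated_paths:
        if entry.endswith("/"):
            if target_path.startswith(entry):
                return True
        elif target_path == entry:
            return True
    return False


files = [f"packages/django/django-1.{i}.tar.gz" for i in range(200)]

# The parent lists every file the delegated role may sign for...
explicit_delegation = {"name": "django", "paths": files}
# ...versus naming the directory once.
directory_delegation = {"name": "django", "paths": ["packages/django/"]}

explicit_size = len(json.dumps(explicit_delegation))
directory_size = len(json.dumps(directory_delegation))
print(f"explicit: {explicit_size} bytes, directory: {directory_size} bytes")
```

With the directory form, the parent metadata stays the same size no matter how many files the delegated role later adds under that directory.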

@trishankkarthik

Maybe consider binary data exchange formats, such as Protocol Buffers or Cap'n Proto.

@vladimir-v-diaz vladimir-v-diaz added a commit that referenced this issue Jul 29, 2013
@vladimir-v-diaz vladimir-v-diaz Continue design changes to address issues #57, #39, #48
A directory listed under the "paths" field of a parent metadata delegation is understood to mean all
subdirectories and files the delegated role is trusted to update.  The delegated role has the option
of specifying multiple, arbitrary, and explicit file paths & directories.  The previous implementation
allowed explicit file paths in the "paths" field of the parent role metadata.  This commit modified
this behaviour to allow directories (replicating wildcards) to minimize the size of parent metadata.
ef7a551
@trishankkarthik trishankkarthik referenced this issue Jul 31, 2013
Merged

Metadata #79

@trishankkarthik

The tentatively-named "lazy bin walk" scheme to address metadata scalability is discussed in our design document for PyPI+TUF.
