-
Notifications
You must be signed in to change notification settings - Fork 56
Description
This is about the pattern matching done on "paths" attribute of targets delegation. The language in the spec seems to contradict the reference implementation but neither is very clear.
I would like to see a mention of path separator "/" in the text: is it special cased WRT pattern matching or not.
Specification:
PATHPATTERN can include shell-style wildcards and supports the Unix filename pattern matching convention. Its format may either indicate a path to a single file, or to multiple paths with the use of shell-style wildcards.
I've been reading this spec for a year now and this has so far seemed clear to me: Unix filename pattern matching convention clearly means a glob where the path separator character never matches the wildcards. this is further emphasized by mention of shell-style wildcards: it's clear that the paths separator never matches in the shell.
Based on the above it seems clear that *.tgz should not match targets/foo.tgz -- this is how shell pattern matching works.
Examples in specification:
For example, the path pattern "targets/*.tgz" would match file paths "targets/foo.tgz" and "targets/bar.tgz", but not "targets/foo.txt". Likewise, path pattern "foo-version-?.tgz" matches "foo-version-2.tgz" and "foo-version-a.tgz", but not "foo-version-alpha.tgz"
The examples do not cover the interesting case: does *.tgz match targets/foo.tgz?
The reference implementation:
This is a simplified version, see https://github.com/theupdateframework/tuf/blob/develop/tuf/client/updater.py#L2868 for real code
if fnmatch.fnmatch("targets/foo.tgz", "*.tgz"):
# path pattern matches
fnmatch documentation explains that the filename separator ('/' on Unix) is not special to this module. This means that in the reference implementation *.tgz does match targets/foo.tgz -- this is not what I expected from reading the spec.