upset


The upset script on sourceware does lots of stuff. This document attempts to describe what it currently does, and enumerate its problems and shortcomings.

A package

A package consists of: at most one install tar file (package-version-release.tar.compression), at most one source tar file (package-version-release-src.tar.compression), and a setup.hint file.

.xz, .bz2 and .gzip compression are permitted. genini also allows .lzma.

version and release must start with a digit. release must not contain a hyphen.

Issue: Both version and release have historically been allowed to contain a hyphen. The ambiguity in splitting package-version can be resolved in upset since the containing directory is always named package, but elsewhere this may not be possible. In some places (?) the version is assumed to start at the first digit after a hyphen. This makes package names like qt-3-gcc3 invalid.
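A minimal sketch of how knowing the containing directory name removes the ambiguity. This is illustrative Python, not code taken from upset; the helper name split_archive_name and the loose pattern for the compression suffix are assumptions of the example.

```python
import re

# Hypothetical pattern: "<stem>[-src].tar.<compression>". The compression
# suffix is matched loosely because the exact permitted spellings are not
# something this sketch tries to enforce.
ARCHIVE_RE = re.compile(r"^(?P<stem>.+?)(?P<src>-src)?\.tar\.(?P<comp>[A-Za-z0-9]+)$")


def split_archive_name(package, filename):
    """Return (version, release, is_source) for a tar file found in directory `package`."""
    m = ARCHIVE_RE.match(filename)
    if m is None or not m.group("stem").startswith(package + "-"):
        raise ValueError(f"{filename!r} is not an archive of {package!r}")
    # Everything after "package-" is "version-release".
    version_release = m.group("stem")[len(package) + 1:]
    # release must not contain a hyphen, so split at the last hyphen.
    version, sep, release = version_release.rpartition("-")
    if not sep or not version[:1].isdigit() or not release[:1].isdigit():
        raise ValueError(f"bad version-release in {filename!r}")
    return version, release, m.group("src") is not None
```

Because the package name is supplied rather than guessed, a name such as qt-3-gcc3 splits cleanly, which a "first digit after a hyphen" heuristic cannot do.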

Issue: Since setup.hint is not package-version-release named, all packages in the same directory must share the same setup.hint. This causes problems when the dependencies change between package versions.

Issue: If more than one install or source file exists for the same package-version-release (i.e. with different compression), it’s undefined which one will be used, and this is not reported as a problem.

A maintainer

A maintainer has the ability to upload arbitrary files to their upload area.

Each package has zero (if it is orphaned) or more maintainers.

upset accepts files for a set of packages from the upload area.

The base names of the packages are listed in cygwin-pkg-maint. A package is considered to match if it is located in a path which contains a directory starting with the base package name (ignoring case), followed by a word boundary.

Certain maintainers are granted the ability to upload all orphaned packages as well.

Issue: The package name matching rule avoids the need to explicitly specify foo, foo-doc, foo-debug, foo_rebase, libfoo1, girepository-foo, etc., but means that a maintainer can upload a package with any name in the paths they are permitted to use. The only thing preventing arbitrary package uploads is that the same package is not allowed to exist at multiple paths.
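The matching rule, and why it is so permissive, can be sketched as follows. This is a hedged Python illustration of the rule as described above, not upset's actual code; path_matches_base is a hypothetical name, and splitting the path on "/" is an assumption.

```python
import re


def path_matches_base(base, relative_path):
    """True if any directory in the path starts with `base` followed by a word boundary."""
    # re.IGNORECASE mirrors "ignoring case"; \b is the word boundary.
    pattern = re.compile(re.escape(base) + r"\b", re.IGNORECASE)
    return any(pattern.match(part) for part in relative_path.split("/"))
```

Note that only one directory in the path has to match, so anything placed below a matching directory is accepted regardless of its own name.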

Issue: Rather than parsing cygwin-pkg-maint, upset reads a !packages file left in the maintainer upload directory by mkpkgdir for the list of packages for a maintainer. Hopefully permissions prevent a maintainer from overwriting !packages.

Issue: Maintainers are allowed to upload a replacement for an existing tar file. This should not be permitted as (i) packages are supposed to be immutable, and (ii) when someone has already installed the previous tar file, setup (correctly) reports that it has the file in the download cache, but setup.ini gives a different hash, and requires manual removal of the previous tar file to proceed. Enforcing this does open up the possibility of not being able to replace a corrupt upload, so perhaps some additional checking should be done of the upload (e.g. check against an uploaded sha512.sum) before the file is accepted.
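One possible shape for such a pre-acceptance check, as a sketch only. It assumes the uploaded sum file uses the usual "hexdigest, whitespace, filename" layout produced by the sha512sum tool; the function names are hypothetical and nothing like this exists in upset today.

```python
import hashlib


def parse_sum_file(text):
    """Map filename -> lowercase hex digest from sha512sum-style lines."""
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, name = line.split(None, 1)
        # sha512sum prefixes the name with "*" in binary mode.
        sums[name.strip().lstrip("*")] = digest.lower()
    return sums


def upload_matches_sum(path, name, sums):
    """True if the file at `path` hashes to the digest recorded for `name`."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return sums.get(name) == h.hexdigest()
```

A file with no entry in the sum file is treated as a mismatch here, which is a design choice of the sketch rather than anything the text above specifies.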

setup.hint parsing

Issue: The encoding of setup.hint is unspecified. Historically both ISO-8859-1 and UTF-8 have been used. UTF-8 probably displays correctly in the HTML package pages, but not in setup. Either specify UTF-8 and eventually fix setup, or specify ASCII.

Issue: The setup.hint sdesc gets mangled before being put into setup.ini (but not the HTML package list). In particular, it is forced to start with a capital letter (which is incorrect when the sdesc starts with a command name which is properly lower-case), and any text up to and including the first colon is removed, presumably in an effort to prevent people writing the package name again (which mangles perl and ruby module names).
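To make the effect concrete, here is a small Python imitation of the two transformations. The order of operations and the stripping of whitespace after the colon are assumptions; mangle_sdesc is a hypothetical name, not a function in upset.

```python
import re


def mangle_sdesc(sdesc):
    """Imitate the described mangling: strip up to the first colon, then capitalise."""
    # Remove everything up to and including the first colon, if there is one.
    stripped = re.sub(r"^[^:]*:", "", sdesc, count=1).lstrip()
    # Force the first character to upper case, leaving the rest untouched.
    return stripped[:1].upper() + stripped[1:]
```

With this imitation, "grep: searches files" becomes "Searches files", "ls replacement" becomes "Ls replacement", and a module name such as "Date::Calc Perl module" loses its first component.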

Issue: In setup.ini key values, a multi-line double-quoted value is terminated only by a double-quote at the end of the line, and embedded double-quotes are transformed to single-quotes. There is no escaping of embedded double-quotes, and no way to represent one. In addition, spaces after the leading quote are magically stripped.
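The quoting behaviour described above can be modelled roughly like this. It is a sketch of the described rules only, with a hypothetical function name; how upset actually tokenises its input is not shown in this document.

```python
def read_quoted_value(lines):
    """Parse a double-quoted value spanning one or more lines (already split, no newlines)."""
    if not lines or not lines[0].startswith('"'):
        raise ValueError("value does not start with a double-quote")
    # Spaces directly after the opening quote are dropped.
    pending = [lines[0][1:].lstrip(" ")] + list(lines[1:])
    collected = []
    for line in pending:
        if line.endswith('"'):
            # Only a double-quote at the end of a line terminates the value.
            collected.append(line[:-1])
            # Any remaining double-quotes are embedded: turn them into single-quotes.
            return "\n".join(collected).replace('"', "'")
        collected.append(line)
    raise ValueError("unterminated quoted value")
```

The model also shows the failure mode: a line that legitimately ends in an embedded double-quote ends the value early, because there is no escape syntax to say otherwise.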

Issue: sdesc is expected to be a single line, but this is not checked for.

Issue: Erroneously transposed sdesc and ldesc seem to happen occasionally, and are not detected.

Issue: The documentation claims that a category name could be multiple words enclosed in quotation marks, but I don’t think this would be handled correctly.

Issue: genini validates that category names are in a fixed list. upset does not.

Issue: genini requires that sdesc and ldesc are double-quoted. upset does not.

Issue: It’s unclear if or how source-only packages should be explicitly identified as such. Some are marked with skip, which leads to them being omitted from setup.ini and the HTML package list, whereas some just have no install versions, which leads to them appearing in those lists.

Paths

Various paths are hardcoded into upset or the script used to invoke it.

  • The package search and file list web pages /www/sourceware/htdocs/cygwin/packages/
  • The release area /var/ftp/pub/cygwin/arch/release/
  • The maintainer upload directory /sourceware/cygwin-staging/home/
  • The deleted files vault /sourceware/snapshot-tmp/cygwin

What it does

  1. Upset is run from cron at 10 minutes past the hour
  2. Extract the current setup version from the setup-arch.exe symlink’s target
  3. Make a list of all the HTML package list pages
  4. Read the existing setup.ini
  5. Recursively scan directories given on the command line (the release area and maintainer upload directory)
  • it’s an error if the same package name is used with different relative paths below the base directory
  • it’s an error if there is no setup.hint file
  • files with a timestamp newer than !ready aren’t processed
  • parse setup.hint file
  • record tar files
  • for each package
    • unless package is marked skip
    • assign versions to stability levels
  6. Delete any !ready files
  7. Move files requested for deletion to vault, and build a list of files to move from upload to release area
  • stop if an error occurred
  8. Update the package listing pages
  • If it doesn’t already exist
    • Write an HTML file named after the tar file package/package-version-release(-src), listing its contents, to be used by the website package search.
    • Also create a .htaccess file in package/
  • Remove listing files for which there is no corresponding tar file
  • Remove any empty listing directories
  • Write packages.inc, an HTML list of packages used by the website package list page.
  9. As a side effect of the previous step, if we are reading the file list for the package, each filename is checked for autodep. Otherwise, any autodep requires from the existing setup.ini are preserved. For details, see below.
  10. Write temporary file containing new setup.ini contents
  • header
    • a comment indicating the file is automatically generated
    • release, a release identifier, as given on command line (‘cygwin’)
    • arch, as given on command line
    • setup-timestamp, the current time
    • setup-version, as determined above
  • for each package
    • do nothing if skip or no install tar files (i.e. is source only)
    • must have category and sdesc
    • ldesc, requires and message lines are suppressed if empty
    • packages listed in requires must exist, and have a curr version
    • category names are forced to start with a capital letter
    • if there is no source tar file, follow external-source
    • write package entry
  • stop if an error occurred
  11. Move files in the list of files to move
  • stop if an error occurred
  12. If more than the setup-timestamp has changed, replace setup.ini with the temporary file, and bz2 compress it, otherwise discard the temporary file
  13. Send email of errors and warnings to all maintainers
  14. gpg sign setup.ini, setup.bz2
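The "replace setup.ini only if more than the setup-timestamp has changed" decision in the list above could be sketched like this. It assumes the timestamp appears on a line beginning with "setup-timestamp:", and the function names are hypothetical.

```python
def without_timestamp(text):
    """Return the lines of a setup.ini, minus the setup-timestamp line."""
    return [line for line in text.splitlines()
            if not line.startswith("setup-timestamp:")]


def needs_replacing(old_text, new_text):
    """True if the new setup.ini differs in more than its timestamp."""
    return without_timestamp(old_text) != without_timestamp(new_text)
```

The same boolean would be a natural guard for the compression and signing that follow, so that an unchanged setup.ini produces no new files for mirrors to fetch.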

Issue: Step 1 means maintainers have to wait up to an hour to discover if there were any problems with their upload. Perhaps it would be better if upset ran on-demand.

Issue: Step 2 seems very fragile. There is fallback code to extract the version from the setup binary, but this won’t work on UPX compressed binaries. UPX does have the ability to leave some resources uncompressed, for use with the manifest; maybe something could be done using that.

Issue: The only data which appears to be retained from step 4 is autodep dependencies, as described later.

Issue: To reuse the step 5 mechanism for the release area, a file !always_ready is treated as a !ready file with a mtime of MAXINT, and such a file must exist in the release area.
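The !ready cut-off and the !always_ready special case might look like this in outline. The sketch assumes the marker applies to the files in the same directory, and that a directory with neither marker has nothing processed; both are assumptions, as are the function names.

```python
import os
import sys


def ready_cutoff(directory):
    """Return the newest mtime eligible for processing, or None if nothing is ready."""
    if os.path.exists(os.path.join(directory, "!always_ready")):
        return sys.maxsize  # stands in for "a mtime of MAXINT"
    ready = os.path.join(directory, "!ready")
    if os.path.exists(ready):
        return os.path.getmtime(ready)
    return None


def processable_files(directory):
    """List files whose timestamp is not newer than the cut-off."""
    cutoff = ready_cutoff(directory)
    if cutoff is None:
        return []
    result = []
    for name in sorted(os.listdir(directory)):
        if name in ("!ready", "!always_ready"):
            continue
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.path.getmtime(path) <= cutoff:
            result.append(name)
    return result
```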

Issue: Step 8 has the problem that if a package’s sdesc is updated, this page does not change to reflect it. (Although re-writing these pages on every run would have a prohibitive performance cost, since reading the file catalogue from a tar file is expensive.)

Issue: In step 8, cleaning up empty listing directories isn’t working because there is always a .htaccess present which causes rmdir to fail.
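One possible fix, sketched in Python under the assumption that .htaccess is the only file that should be ignored when deciding whether a listing directory is empty (remove_if_effectively_empty is a hypothetical name):

```python
import os


def remove_if_effectively_empty(directory):
    """Remove `directory` if it holds nothing except an optional .htaccess file."""
    entries = set(os.listdir(directory))
    if entries - {".htaccess"}:
        return False  # real listing files remain; leave the directory alone
    if ".htaccess" in entries:
        os.remove(os.path.join(directory, ".htaccess"))
    os.rmdir(directory)
    return True
```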

Issue: Step 8 presents the file timestamps as reported by tar. They are, I think, in UTC, but perhaps we could stand to be clearer about that, and ensure that is the case if the script is run in a non-UTC timezone?

Issue: The upshot of step 9 is that the HTML package file listing acts as a marker that the package file list has been read on a previous run, and the requires for the package read from setup.ini in step 4 contain autodeps which should be preserved.

Issue: In step 13, all maintainers are sent the same mail. It would be more logical to send a maintainer errors for only their packages, and send project leads all errors.

Issue: In step 14, gpg signatures are always regenerated, even if setup.ini didn’t change, which causes some unnecessary mirror traffic.

Issue: No verbose mode. If you want to investigate a problem you have to add debug output to diagnose what happened.

Issue: The environment that upset is supposed to run in is undocumented. This makes duplicating it to test changes a challenge.

Issue: There are numerous other package set integrity checks which could be done.

Issue: upset tries to operate in an atomic fashion, only changing the package set if the result is valid, but fails in several areas, e.g. nothing prevents deleting a file which is required by another package.

Sorting

For setup.ini and packages.inc, package names are sorted in an order which puts numbers first, then letters (case-insensitively), then punctuation.
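A sort key with this behaviour could be written as below. Applying the rule character by character is an assumption of the sketch, and package_sort_key is a hypothetical name.

```python
def package_sort_key(name):
    """Rank each character: digits first, then letters (case-insensitive), then the rest."""
    def rank(ch):
        if ch.isdigit():
            return 0
        if ch.isalpha():
            return 1
        return 2
    return [(rank(ch), ch.lower()) for ch in name]


names = ["_foo", "Zlib", "a2ps", "4ti2"]
ordered = sorted(names, key=package_sort_key)
```

This differs from a plain ASCII sort, where upper-case letters and some punctuation would come before lower-case letters.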

autodep, noautodep, incver_ifdep

If we are reading the file list for the package, each non-symlink pathname is matched against the autodep regexes for all packages. If a regex matches, the package with the autodep is added to the requires for that package, unless it is listed in noautodep.

incver_ifdep causes the package and source package to be renamed with an incremented version number. The unimplemented verpat gives the format of package-version-release for the package. At the moment version is assumed to be a sequence of only digits, at least 3 digits long.

Otherwise, any autodep requires from the existing setup.ini are preserved.
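The autodep matching described in this section can be sketched as follows. The data shapes are assumptions: autodeps maps each package that declares an autodep to its regex, noautodep is the list from the package being scanned, and the pathnames are assumed to have had symlinks filtered out already. Skipping a package's own autodep and using an unanchored search are also choices of this sketch.

```python
import re


def apply_autodeps(package, pathnames, requires, autodeps, noautodep=()):
    """Return `requires` extended with every package whose autodep regex matches a pathname."""
    result = list(requires)
    for provider, pattern in autodeps.items():
        if provider == package or provider in noautodep or provider in result:
            continue
        regex = re.compile(pattern)
        if any(regex.search(path) for path in pathnames):
            result.append(provider)
    return result
```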

Empty archives

A compressed empty file may appear in place of a compressed empty tar archive. A file of 32 bytes or less is assumed to be one of these compressed empty files. The reasons for this are lost in the mists of history...
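The size heuristic is simple enough to state as code. The constant and function names below are hypothetical; only the 32-byte threshold comes from the text above.

```python
import os

EMPTY_ARCHIVE_MAX_BYTES = 32  # threshold stated in this document


def looks_like_empty_archive(path):
    """True if the file is small enough to be treated as a compressed empty file."""
    return os.path.getsize(path) <= EMPTY_ARCHIVE_MAX_BYTES
```

For example, bzip2-compressing zero bytes yields a file well under this limit, so it would be classified as empty by this check.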