Skip to content

Conversation

LukeShu
Copy link
Contributor

@LukeShu LukeShu commented Jun 22, 2018

This patchset is intended to clean up dbscripts' interaction with SVN.
Besides (IMO) improved code clarity, this should make it easier for those
who wish to replace SVN with something else.

There are two major themes here:

  1. Avoid asking SVN for information; to the extent possible, get all
    information directly from the pacman repos.
  2. Isolate SVN-interfacing code in to functions; making it clear what
    interfaces the PKGBUILD tracking needs to provide.

This does not touch the test suite, which is still firmly SVN-dependent.

I've cc'd Florian Pritz because of his related work on migrating to git.
https://wiki.archlinux.org/index.php/User:Bluewind/dbscripts-rewrite

v2:

  • Add a test to verify that db-move works with single-arch packages;
    which was broken in v1 of this patchset
  • Don't break db-move with single-arch packages

This is identical to v2 of the patchset submitted to the mailing list https://lists.archlinux.org/pipermail/arch-projects/2018-June/004930.html

@eli-schwartz
Copy link
Contributor

Compare #12

@LukeShu
Copy link
Contributor Author

LukeShu commented Jul 7, 2018

v3:

  • rebase, resolving conflicts in db-move

@LukeShu LukeShu force-pushed the lukeshu/to-upstream/dont-parse-pkgbuild branch from e6f6763 to 95341f0 Compare July 7, 2018 09:04
@LukeShu
Copy link
Contributor Author

LukeShu commented Jul 7, 2018

v4:

  • Be shaped more like Preliminary work on refactoring svn into private functions #12 (partially cherry picking a commit from it)
    • Name the file db-functions-$VCS, rather than db-abs
    • Add a commit search/replacing "svn" to "vcs" as appropriate in variable names and such
  • Name the functions vcs_* instead of abs_*
  • Add "usage:" doc comments to all public functions in db-functions-svn

For any reviewers:

  • First commit: Generic addition of a test for something that v1 broke, but no test caught
  • Next 4 commits: Where it makes sense, use the pacman repos as the source of truth, rather than SVN
  • Final 4 commits: Pull remaining SVN code in to a separate file with clean interfaces

/cc @Bluewind @gbsf

@LukeShu
Copy link
Contributor Author

LukeShu commented Jul 9, 2018

If #13 gets merged, I have a modified version of this PR ready to go that uses the dbquery function added there, rather than the bsdtar+awk that @eli-schwartz expressed concern about on the ML.

@LukeShu LukeShu force-pushed the lukeshu/to-upstream/dont-parse-pkgbuild branch from 95341f0 to 821c0b9 Compare July 26, 2018 17:22
@LukeShu
Copy link
Contributor Author

LukeShu commented Jul 26, 2018

v5:

Copy link
Contributor

@eli-schwartz eli-schwartz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall these test changes look good independently of the actual refactoring work.

done

if bsdtar xfO "$FTP_BASE/extra/os/x86_64/extra.db" pkg-split-a1-1-1/desc >/dev/null; then return 1; fi
bsdtar xfO "$FTP_BASE/extra/os/x86_64/extra.db" pkg-split-a2-1-1/desc >/dev/null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't this use the same style as checkRemovedPackageDB? Acknowledging that that function currently checks if the pkgname appears at all, what about extending it to fulfill this use case as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkRemovedPackageDB takes a pkgbase, but this test very specifically needs to deal with a specific pkgname in a split package. I suppose doing the check in the same style:

if bsdtar xfO "$FTP_BASE/extra/os/x86_64/extra.db" | grep pkg-split-a1; then return 1; fi
   bsdtar xfO "$FTP_BASE/extra/os/x86_64/extra.db" | grep pkg-split-a2

would work, but I don't really see the point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it should use the -xf option to bsdtar instead of the xf format which is strictly for compatibility.

Also why do we check there in every db (for all arches and also, apparently, in the files database), using DBEXT as relevant...

But right here you hardcode one db and check just it. Also, oneliner if statement and incorrectly indented check for pkg-split-a2, plus not quieting the output of grep.

Copy link
Contributor Author

@LukeShu LukeShu Oct 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1) Sure, I'll edit it to use the -

(2) You're right, hard-coding x86_64 was a mistake, I should have done this inside of the loop.

(3) Why do the other tests use $DBEXT instead of hard-coding .db? pacman itself hard-codes .db; we can count on that symlink working for read-only operations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I guess using DBEXT without stripping the suffix, would almost make sense, but currently I'm not sure why it does what it does.


local dirarches=() pkgbuildarches=()
local pkgbuild dirarch pkgbuildver
for pkgbuild in "${TMP}/svn-packages-copy/${pkgbase}/repos/${repo}-"+([^-])"/PKGBUILD"; do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mention a fear of regular globs not correctly handling $repo with a - in it... I guess the other solution would be to match on ${ARCHES[@]}.

svn up -q "${TMP}/svn-packages-copy/${pkgbase}"

if __isGlobfile "${TMP}/svn-packages-copy/${pkgbase}/repos/${repo}-"+([^-])"/PKGBUILD"; then
return 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"While we're at it, just use an array assignment" -- it's not array assignment? It's just glob expansion plus checking that it expanded non-null.

Commit message for this is slightly confusing IMO.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that was left over in the commit message from a previous revision. You're correct, there is no array assignment in this revision.

# every arch).
(( ${#dirarches[@]} > 0 ))
read -d '' -r -a dirarches < <(printf '%s\0' "${dirarches[@]}" | sort -uz)
read -d '' -r -a pkgbuildarches < <(printf '%s\0' "${pkgbuildarches[@]}" | sort -uz)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use mapfile here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah.

@LukeShu LukeShu force-pushed the lukeshu/to-upstream/dont-parse-pkgbuild branch from 821c0b9 to 90e8cf7 Compare October 8, 2018 01:15
@LukeShu
Copy link
Contributor Author

LukeShu commented Oct 8, 2018

v6:

  • Rebase onto master
  • Contents:
    • db-remove.bats: Check .files databases, not just .db
    • db-remove.bats: Check all arches, not just x86_64
    • db-remove.bats: Change how we check for presence in a database
    • common(-svn).bash: Use mapfile -d '' instead of read -d '' -r -a
  • Commit messages:
    • "test: checkPackageDB() ": Add a missing space
    • "test: Resolve ": Reword the final bullet, removing reference to
      array assignment that isn't actually added in the commit.

@eli-schwartz
Copy link
Contributor

Cherry-picking test fixes here: https://github.com/archlinux/dbscripts/commits/test-fixes

Huh, suddenly an error in if bsdtar -xf "$FTP_BASE/extra/os/x86_64/extra.${db}" -O | grep pkg-split-a1 >/dev/null; then

@LukeShu
Copy link
Contributor Author

LukeShu commented Oct 8, 2018

Oops! that needs to be ${arch} instead of x86_64.

@LukeShu LukeShu force-pushed the lukeshu/to-upstream/dont-parse-pkgbuild branch from 90e8cf7 to 1a6cb0a Compare October 8, 2018 02:54
@LukeShu
Copy link
Contributor Author

LukeShu commented Oct 8, 2018

v7:

  • Fix that issue x86_64 being hard-coded.

LukeShu and others added 11 commits October 7, 2018 23:10
It is important that db-remove be able to remove a single pkgname, without
being able to look it up by pkgbase in SVN.  For instance, when a split
package update removes one of its members; there will be no reference to
the removed pkgname in SVN, and it won't be removed by db-update.  If
db-remove doesn't accept pkgnames, then this outdated orphan could not be
removed.
 - Implement the TODO by keeping a list/set of architectures found via
   "$repo-$arch" directory names, and another list/set of architectures
   named in arch=() in the PKGBUILD(s).  This means turning the simple
   `compgen` in to a loop.

 - While we're at it loading PKGBUILDs in a loop, fix that it clearly
   isn't doing anything with the $pkgver argument; it should verify that
   the version in the found PKGBUILD(s) matches that argument.

 - Use extglob to more strictly match the "arch" part of the "repo-arch"
   dirname; the existing glob wouldn't have behaved correctly for values
   of $repo containing a "-" (like "community-testing").  We don't
   currently test with any of those, but it makes me nervous.

 - Also make that extglob change in checkRemovedPackage.  While we're at
   it, let Bash do the glob expansion normally and check it with
   __isGlobfile, instead of compgen; it means we don't have to worry about
   escaping the non-glob part of the string.  Not that we expect it to
   contain anything needing escaping, but again, it makes me nervous.
In a patchset that I recently submitted, Eli was concerned that I was
parsing .db files with bsdtar+awk, when the format of .db files isn't
"public"; the only guarantees made about it are that libalpm can parse it.

https://lists.archlinux.org/pipermail/arch-projects/2018-June/004932.html

I wasn't too concerned, because `ftpdir-cleanup` and `sourceballs` already
parse the .db files in the same way.  Nonetheless, I think Eli is right: we
shouldn't be parsing these files ourselves.

So, add an `arch_expac` function that uses `expac` to parse the .db files.
`expac` is particularly palatable as a new dependency, as it seems likely
that it will be included in future releases of pacman.  `expac` (like the
underlying libalpm) doesn't offer an easy way to say "parse this DB file
for me"; instead, we must construct a pacman.conf that has a repo pointing
to that file with a `Server=file://...` line.
Don't try to parse PKGBUILD files from SVN; all of the information we need
is already in the DBEXT files.  Several programs use [[ -f PKGBUILD ]] or
[[ -r PKGBUILD ]] on files from SVN; those checks can stay, just remove all
instances of actually trying to *parse* those files.

As an exception, don't modify parse_pkgbuilds.sh (which is called by
check_packages.py, which is called by cron-jobs/integrity-check).

Most of the attributes we need have been present in the DBEXT files for as
long as `repo-add` has been part of pacman:

    attribute |   git    |  git date  |  ver  |  ver date
    ----------+----------+------------+-------+-----------
    FILENAME  | aa1c0ba9 | 2006-11-20 | 3.0.0 | 2007-03-25
    NAME      | aa1c0ba9 | 2006-11-20 | 3.0.0 | 2007-03-25
    VERSION   | aa1c0ba9 | 2006-11-20 | 3.0.0 | 2007-03-25
    ARCH      | aa1c0ba9 | 2006-11-20 | 3.0.0 | 2007-03-25
    BASE      | 4b21504f | 2009-07-22 | 3.3.0 | 2009-08-02

However, `BASE` for split packages is slightly newer, and its presence
relies not only on the `repo-add` version, but also the `makepkg` version
used to build the package.

As of 2018-06-15, the oldest package in any of the Arch Linux repos is
community/gimp-refocus, built on 2013-07-22; I don't believe it is
necessary to handle packages from before that change was made (before
`BASE` was set).

"Ignore space change" might be useful when viewing this diff.
This should have the same result, but be a bit less confusing to read.

Also, have a slightly more specific error message on a missing PKGBUILD.
testing2x needs to:
 1. Verify that the new version of the package is found in TESTING_REPO
 2. Determine which of STABLE_REPOS the old version of the package is in

Currently, to do those things, it checks for the existence of PKGBUILD
files in SVN.  That information is already stored in the DBEXT files; get
it from there, instead of talking to SVN.

"Ignore space change" might be useful when viewing this diff.
Introduce "db-functions-$VCS" which will eventually contain all
VCS-specific code, and make this configurable in config.

Move private arch_svn function and svn acl handling here.

(Luke Shumaker: this is a partial cherry pick of commit 3bb6775)
…-svn' file

Moving all SVN code in to a separate file means both that:
 1. It is easier to identify the interactions with SVN, when considering a
    replacement.
 2. It is easier to swap out the one file if/when replacing SVN with
    something else.

Put another way: try to be less tightly coupled with SVN.

Some of these functions could stand to be cleaned up a touch--for
obviousness' sake, to the extent possible, the code is copied verbatim
from where it was inline before, and modified as little as possible.
…-svn.bash' file

Similarly to the main code, factor out all SVN-specific code in the tests
in to a separate `common-$VCS.bash`.  This will allow anyone writing a
different `db-functions-$VCS` file to actually test that file.
Currently, vcs_move_preflight_check and vcs_move_arch require the
exact pkgarch (i.e. "any" or "x86_64").  Modify them so that they take
the tarch (i.e. "x86_64") and figure out themselves if it needs to set
pkgarch=$tarch or pkgarch=any.

vcs_export could probably stand to be modified to do something
similar, but it's more work to adjust its callers.

"Ignore space change" might be useful when viewing this diff.
This does not make any "real" changes, it just gets rid of saying "svn" in
function-names, variable-names, file-names, and comments; when the code is
actually now VCS-agnostic.

This makes no changes more sophisticated than the simple search/replace.
@LukeShu LukeShu force-pushed the lukeshu/to-upstream/dont-parse-pkgbuild branch from 1a6cb0a to 089ed94 Compare October 8, 2018 03:11
@LukeShu
Copy link
Contributor Author

LukeShu commented Oct 8, 2018

v8:

  • rebase on to master

@eli-schwartz
Copy link
Contributor

I merged my version, but cherry-picked the testsuite fixes from here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants