dvzrv/arch-repo-management

arch-repo-management

This is a proof of concept (PoC) for a modular, Python-based tool that manages git-based package repositories (explained in detail below) and their binary repository counterparts. The two do not necessarily have to be located on the same host.

The implementation details below need to be mocked and tested against each other.

Git repository layout

The package repositories (one git repository for each pkgbase) are managed by their maintainers:

package_a
package_b
package_c

The following is an attempt to lay out a couple of potential designs for tracking which version of which package is currently in which (binary package) repository. They all share a common design: the PKGBUILD files of all packages are tracked in a single, architecture-specific (e.g. x86_64) repository, grouped by binary repository:

x86_64
  core
    package_a
  core-debug
  extra
    package_b
  extra-debug
  staging
  testing
  community
    package_c
  community-debug
  community-staging
  community-testing
  gnome-unstable
  kde-unstable
  multilib
  multilib-debug
  multilib-staging
  multilib-testing
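
A minimal sketch of how a tool could answer "which repository currently holds this pkgbase?" under this layout. It assumes that each package directory contains a PKGBUILD file; the names `repos_containing` and `demo_lookup` are hypothetical and not part of the PoC.

```python
import tempfile
from pathlib import Path


def repos_containing(monorepo: Path, arch: str, pkgbase: str) -> list[str]:
    """Return the repositories below `arch` that hold a PKGBUILD for `pkgbase`."""
    arch_dir = monorepo / arch
    return sorted(
        repo.name
        for repo in arch_dir.iterdir()
        if (repo / pkgbase / "PKGBUILD").is_file()
    )


def demo_lookup() -> list[str]:
    """Build a tiny throwaway layout and look up package_a in it."""
    with tempfile.TemporaryDirectory() as tmp:
        monorepo = Path(tmp)
        for repo in ("core", "core-debug", "extra"):
            (monorepo / "x86_64" / repo).mkdir(parents=True)
        package_dir = monorepo / "x86_64" / "core" / "package_a"
        package_dir.mkdir()
        (package_dir / "PKGBUILD").write_text("pkgbase=package_a\n")
        return repos_containing(monorepo, "x86_64", "package_a")
```

A pkgbase that shows up in more than one repository (for example in both testing and core) would simply yield several names.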

Per-architecture monorepo using subtree

The different packages are tracked using git-subtree.

Pros:

  • low complexity operations with logging (e.g. git-mv to move packages from one repo to the other)
  • git-log can carry all information of all packages
  • can manage the subtrees the same as the main repository

Cons:

  • more features than required
  • potentially opaque to use
  • potentially very slow over time
  • potentially convoluted git-log, as it's harder to define specific commit messages
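
As a rough illustration of the subtree approach, the sketch below only builds the git command lines involved and does not execute them. The helper names, the example URL and the ref are assumptions made for illustration; how `git subtree pull` behaves after a package directory has been moved is one of the things the mocked tests would have to establish.

```python
def subtree_add(arch: str, repo: str, pkgbase: str, url: str, ref: str) -> list[str]:
    """Command that imports a package repository below <arch>/<repo>/<pkgbase>."""
    prefix = f"{arch}/{repo}/{pkgbase}"
    return ["git", "subtree", "add", f"--prefix={prefix}", url, ref]


def subtree_move(arch: str, source: str, target: str, pkgbase: str) -> list[list[str]]:
    """Commands that move a package between repositories and record the move."""
    return [
        ["git", "mv", f"{arch}/{source}/{pkgbase}", f"{arch}/{target}/{pkgbase}"],
        ["git", "commit", "-m", f"Move {pkgbase} from {source} to {target}"],
    ]
```

Each returned list could be passed to `subprocess.run(..., check=True)` with the monorepo as working directory.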

Per-architecture monorepo using read-tree

The different packages are merged in using git-read-tree.

Pros:

  • low complexity operations with logging (e.g. git-mv to move packages from one repo to the other)
  • git-log messages can be defined more easily
  • higher transparency of what gets merged

Cons:

  • potentially very slow over time
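
A hedged sketch of the read-tree variant, again only building command lines. It assumes the package is fetched from a URL at a given ref and that the target prefix does not yet exist in the monorepo; the function name, URL and version string are hypothetical. The explicit commit step is where the log message can be chosen freely.

```python
def read_tree_add(
    arch: str, repo: str, pkgbase: str, url: str, ref: str, version: str
) -> list[list[str]]:
    """Commands that merge a package's tree into <arch>/<repo>/<pkgbase>/."""
    prefix = f"{arch}/{repo}/{pkgbase}/"
    return [
        # Fetch the requested ref; git stores the result as FETCH_HEAD.
        ["git", "fetch", url, ref],
        # Read that tree into the index below the prefix and update the work tree.
        ["git", "read-tree", f"--prefix={prefix}", "-u", "FETCH_HEAD"],
        # The commit message is fully under the tool's control.
        ["git", "commit", "-m", f"Add {pkgbase} {version} to {repo}"],
    ]
```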

Per-architecture monorepo using submodule

The different packages are tracked using git-submodule.

Pros:

  • declarative syntax in .gitmodules
  • transparent locking/ pinning

Cons:

  • moving becomes potentially more complex (deinit/init)
  • updating becomes potentially more complex
  • git-log only reflects to which commit a submodule was updated
  • needs git submodule update --init --recursive to be fully functional
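
To illustrate the declarative aspect, the following sketch derives the repository contents from the text of a .gitmodules file. It assumes submodule paths follow the `<arch>/<repo>/<pkgbase>` layout; the example URLs and the function name are made up, and Python's `configparser` only approximates git's config syntax (querying the file with `git config -f .gitmodules --list` would be the more robust route).

```python
import configparser

# Hypothetical .gitmodules content for two packages.
EXAMPLE_GITMODULES = (
    '[submodule "x86_64/core/package_a"]\n'
    "\tpath = x86_64/core/package_a\n"
    "\turl = https://git.example.org/package_a.git\n"
    '[submodule "x86_64/extra/package_b"]\n'
    "\tpath = x86_64/extra/package_b\n"
    "\turl = https://git.example.org/package_b.git\n"
)


def packages_by_repo(gitmodules_text: str) -> dict[str, list[str]]:
    """Group the pkgbases listed in .gitmodules by the repository they are in."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(gitmodules_text)
    result: dict[str, list[str]] = {}
    for section in parser.sections():
        # Each path is expected to look like <arch>/<repo>/<pkgbase>.
        _arch, repo, pkgbase = parser[section]["path"].split("/")
        result.setdefault(repo, []).append(pkgbase)
    return result
```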

Binary repository layout

The git repository layout directly reflects the binary repository layout. This means that the location of a package's git repository in the monorepo needs to match the location of its built package in the respective binary repository (which is implemented by a symlink from a pool directory).

If package_a in version 1:2-3 is in:

x86_64
  core
    package_a

its binary package will be symlinked from the pool to the respective location:

core
  os
    x86_64
      core.db
      [..]
      package_a-1:2-3-x86_64.pkg.tar.xz -> ../../../pool/package_a-1:2-3-x86_64.pkg.tar.xz
      package_a-1:2-3-x86_64.pkg.tar.xz.sig -> ../../../pool/package_a-1:2-3-x86_64.pkg.tar.xz.sig
      [..]
pool
  [..]
  package_a-1:2-3-x86_64.pkg.tar.xz
  package_a-1:2-3-x86_64.pkg.tar.xz.sig
  [..]
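
The symlink scheme above can be sketched as follows. This is a minimal illustration that assumes a POSIX file system and the directory names shown in the example; `link_from_pool` and `demo_link` are hypothetical names.

```python
import os
import tempfile
from pathlib import Path


def link_from_pool(root: Path, repo: str, arch: str, filename: str) -> Path:
    """Create <repo>/os/<arch>/<filename> as a relative symlink into pool/."""
    pool_file = root / "pool" / filename
    if not pool_file.is_file():
        raise FileNotFoundError(pool_file)
    target_dir = root / repo / "os" / arch
    target_dir.mkdir(parents=True, exist_ok=True)
    link = target_dir / filename
    # A relative target keeps the link valid if the whole tree is relocated.
    link.symlink_to(os.path.relpath(pool_file, start=target_dir))
    return link


def demo_link() -> str:
    """Place an empty package file in a temporary pool and link it into core."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        name = "package_a-1:2-3-x86_64.pkg.tar.xz"
        (root / "pool").mkdir()
        (root / "pool" / name).write_bytes(b"")
        link = link_from_pool(root, "core", "x86_64", name)
        return os.readlink(link)
```

The signature file would be linked the same way with a second call.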

Workflows

This section lists the different workflows to give an overview of what they would mean for the different git repository layouts.

Adding a Package

Developer machine/ build server:

  1. Create repository
  2. Update the package's PKGBUILD, build the package and commit the changes
  3. Tag release
  4. Sign package
  5. Upload built package and signature
  6. Call application on repository/ package server to add package

Repository server/ package server:

Important

The following steps need to be atomic (reversible).

  1. Verify user permissions
  2. Lock package database and monorepo
  3. Inspect built files of package
  4. Lock tags (by storing them in package's bare repository)
  5. Modify monorepo to reflect changes
  6. Verify that package file versioning and tag are consistent
  7. Copy built package and signature to pool and create symlink to them in target repository
  8. Add package to the package database
  9. Unlock package database and monorepo
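
One way to read the "atomic (reversible)" requirement for the steps above is an undo stack that is unwound when any step fails. The sketch below assumes each step can be expressed as a pair of callables (do, undo) and uses a `threading.Lock` as a stand-in for the real lock on the package database and monorepo; all names are hypothetical.

```python
import threading
from typing import Callable, ContextManager, Iterable, Tuple

Step = Tuple[Callable[[], None], Callable[[], None]]  # (do, undo)


def run_reversible(lock: ContextManager, steps: Iterable[Step]) -> None:
    """Run all steps under `lock`; on failure undo completed steps in reverse."""
    with lock:
        undo_stack = []
        try:
            for do, undo in steps:
                do()
                undo_stack.append(undo)
        except Exception:
            # Roll back only what actually completed, newest first.
            while undo_stack:
                undo_stack.pop()()
            raise


def demo_rollback() -> list[str]:
    """Simulate a failure in the last step and return the remaining state."""
    state: list[str] = []

    def fail() -> None:
        raise RuntimeError("package database update failed")

    steps = [
        (lambda: state.append("monorepo"), lambda: state.remove("monorepo")),
        (lambda: state.append("pool"), lambda: state.remove("pool")),
        (fail, lambda: None),
    ]
    try:
        run_reversible(threading.Lock(), steps)
    except RuntimeError:
        pass
    return state
```

After the simulated failure the state is empty again, and the lock is released in both the success and the failure case.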

Updating a Package

All steps except the first of Developer machine/ build server in Adding a Package apply.

All steps of Repository server/ package server in Adding a Package apply.

Removing a Package

Developer machine/ build server:

  1. Call application on repository/ package server to remove package

Repository server/ package server:

Important

The following steps need to be atomic (reversible).

Note

The remove command should be able to remove stale packages (e.g. leftover packages when removing a member of a split package).

  1. Verify user permissions
  2. Lock package database and monorepo
  3. Modify monorepo to reflect changes
  4. Remove package from the package database
  5. Remove built package and signature from pool and remove symlink to them in target repository
  6. Unlock package database and monorepo

Moving a Package

Developer machine/ build server:

  1. Call application on repository/ package server to move package

Repository server/ package server:

Important

The following steps need to be atomic (reversible).

  1. Verify user permissions
  2. Lock source and target package databases and monorepo
  3. Modify monorepo to reflect changes
  4. Remove package from the source package database
  5. Add package to the target package database
  6. Remove symlinks to package and signature files from source repository and add them to the target repository
  7. Unlock source and target package databases and monorepo
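
The symlink part of a move can be sketched as a rename of the links themselves, leaving the pool untouched. This assumes that source and target repository sit at the same depth below a common root (so relative link targets stay valid) and on the same file system; `move_links` and `demo_move` are hypothetical names.

```python
import os
import tempfile
from pathlib import Path


def move_links(root: Path, source: str, target: str, arch: str, filenames: list[str]) -> None:
    """Move package symlinks from <source>/os/<arch> to <target>/os/<arch>."""
    src_dir = root / source / "os" / arch
    dst_dir = root / target / "os" / arch
    dst_dir.mkdir(parents=True, exist_ok=True)
    for name in filenames:
        # os.replace renames the link itself, not the pool file it points to.
        os.replace(src_dir / name, dst_dir / name)


def demo_move() -> tuple[bool, bool, str]:
    """Move one package link from testing to core in a temporary tree."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        name = "package_a-1:2-3-x86_64.pkg.tar.xz"
        (root / "pool").mkdir()
        (root / "pool" / name).write_bytes(b"")
        testing = root / "testing" / "os" / "x86_64"
        testing.mkdir(parents=True)
        (testing / name).symlink_to(f"../../../pool/{name}")
        move_links(root, "testing", "core", "x86_64", [name])
        moved = root / "core" / "os" / "x86_64" / name
        # (link still in source?, moved link resolves?, link target)
        return (testing / name).is_symlink(), moved.exists(), os.readlink(moved)
```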

TODO

The following is a set of proposed tests to derive the best possible implementation.

Unit Tests

All submitted code should have 100% unit test coverage and be documented.

Integration Tests

The different repository layout approaches need to be mockable by creating fixtures from scratch in each test run (for reproducibility). The tests should cover use cases in which a couple of thousand operations are run in sequence, to track and measure the eventual turnaround time of each approach.
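
A minimal sketch of such a measurement, assuming each operation (add, update, move, remove) can be wrapped in a callable that takes the sequence index. The helper names are hypothetical; comparing the mean duration of the first and last operations is just one simple way to spot an approach that slows down over time.

```python
import time
from statistics import mean
from typing import Callable


def time_operations(operation: Callable[[int], None], count: int) -> list[float]:
    """Run `operation` for indices 0..count-1 and return each call's duration in seconds."""
    durations = []
    for index in range(count):
        start = time.perf_counter()
        operation(index)
        durations.append(time.perf_counter() - start)
    return durations


def first_and_last_mean(durations: list[float], window: int) -> tuple[float, float]:
    """Mean duration of the first and of the last `window` operations."""
    return mean(durations[:window]), mean(durations[-window:])
```

In an integration test, `operation` would act on a freshly created fixture repository for the layout under test.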
