Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to building stage repository by combining production and diff … #280

Merged
merged 1 commit into from Oct 27, 2023

Conversation

ehelms
Copy link
Member

@ehelms ehelms commented Sep 29, 2023

…of Copr

When the version is nightly, this will not pull from production but instead treat what is in Copr as the source of truth. At the end, the output for a versioned repository is a list of unsigned packages.

There is a lot of change here, and I can likely do some re-factoring now or after merge. I wanted to get a working version of the concept available.

Here is how this generates the stage repository.

For nightly:

  • Copy all RPMs from Copr for the given repository to local
  • Filter the downloaded RPMs through comps file, removing anything not in the comps file
  • Runs createrepo

This is done for RPM and SRPMs. For this workflow, Jenkins will run this script and push the RPMs via rsync to stagingyum.theforeman.org.

For releases:

  • Copy all RPMs from production (yum.theforeman.org) to a local repository
  • Run repodiff to identify new packages in Copr for the release
  • Download the new packages from Copr
  • Filter the downloaded RPMs through comps file, removing anything not in the comps file
  • Runs createrepo
  • Returns as stdout the list of unsigned RPMs and SRPMs

For this workflow, the release engineer will run this script, perform a signing of unsigned packages and push the RPMs via rsync to stagingyum.theforeman.org.

The idea is then that either one of these options will happen (via a follow up PR):

  1. New script will take as input the list of unsigned RPMs and sign them
  2. New script will calculate the unsigned RPMs with the location of the repository as input and sign them

@ehelms ehelms requested review from evgeni and ekohl September 29, 2023 01:42
@ehelms ehelms force-pushed the download-unsigned-packages branch 3 times, most recently from 59b8d6f to c54955a Compare September 29, 2023 18:08
sign_stage_rpms Fixed Show resolved Hide resolved
sign_stage_rpms Fixed Show fixed Hide fixed
sign_stage_rpms Fixed Show fixed Hide fixed
sign_stage_rpms Fixed Show fixed Hide fixed
sign_stage_rpms Fixed Show fixed Hide fixed
@ehelms
Copy link
Member Author

ehelms commented Sep 29, 2023

I decided to split this into:

  1. Build stage repository with script and new method described in original comment
  2. New script that for a given directory will list unsigned packages
  3. New script that calls list unsigned packages and then feeds them into the sign_rpms script

sign_stage_rpms Fixed Show fixed Hide fixed
sign_stage_rpms Fixed Show fixed Hide fixed
@ehelms ehelms force-pushed the download-unsigned-packages branch 2 times, most recently from 24a8598 to d572019 Compare October 2, 2023 12:32
upload_stage_rpms Fixed Show fixed Hide fixed
upload_stage_rpms Fixed Show fixed Hide fixed
upload_stage_rpms Fixed Show fixed Hide fixed
@@ -0,0 +1,8 @@
#!/bin/bash -e
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script is designed to work with this change theforeman/foreman-infra#1948 that allows infra users to rsync as yumrepostage user.

list_unsigned_rpms Outdated Show resolved Hide resolved
list_unsigned_rpms Outdated Show resolved Hide resolved
list_unsigned_rpms Outdated Show resolved Hide resolved
sign_stage_rpms Outdated Show resolved Hide resolved
@evgeni
Copy link
Member

evgeni commented Oct 18, 2023

Why do you need to "Copy all RPMs from production (yum.theforeman.org) to a local repository"?
For repodiff, you can just point it at the existing live repo instead.

It it so that you can generate "composed" repo (old plus new) locally, including module metadata? (Today, we push the diff only and the remote regenerates the repodata)

@ehelms
Copy link
Member Author

ehelms commented Oct 18, 2023

Why do you need to "Copy all RPMs from production (yum.theforeman.org) to a local repository"? For repodiff, you can just point it at the existing live repo instead.

It it so that you can generate "composed" repo (old plus new) locally, including module metadata? (Today, we push the diff only and the remote regenerates the repodata)

Yes. This was in service of:

  • the ability to generate a full stage repository locally
  • keep web01 dumb
  • generate module metadata outside of web01

Thinking about it more I could optimize and have support for that with an option to generate the whole thing locally but require web01 to be a tad smarter than it is today.

Optimized Workflow

For a release:

  1. Calculate repodiff between production and Copr
  2. Download identified packages from Copr
  3. Filter with comps
  4. Sign packages
  5. Copy signed packages to web01
  6. generate module metadata on web01
  7. update repo metadata

For nightly:

  1. Download all packages from Copr
  2. Filter with comps
  3. Copy all packages to stage
  4. generate module metadata on web01
  5. Createrepo

@evgeni
Copy link
Member

evgeni commented Oct 18, 2023

I think I am cool with the current flow (and generally in favor of dumbing down web01). Just needed to have a clearer picture.

Does downloading the existing RPMs from prod keep their filesystem timestamps, or would these be updated during rsync back to web01? If possible, I'd try to retain them, so people have a clearer picture when things changed.

@ehelms
Copy link
Member Author

ehelms commented Oct 18, 2023

I think I am cool with the current flow (and generally in favor of dumbing down web01). Just needed to have a clearer picture.

Does downloading the existing RPMs from prod keep their filesystem timestamps, or would these be updated during rsync back to web01? If possible, I'd try to retain them, so people have a clearer picture when things changed.

That will require some testing.

build_stage_repository Outdated Show resolved Hide resolved
@ehelms
Copy link
Member Author

ehelms commented Oct 24, 2023

import os
from email.utils import parsedate_to_datetime
from urllib.request import urlretrieve
filename, headers = urlretrieve(…)
if 'Last-Modified' in headers:
    modification_ts = parsedate_to_datetime(headers['Last-Modified']).timestamp()
    os.utime(filename, (modification_ts, modification_ts))

Do I understand correctly:

Your suggestion is to swap away from reposync to a method that downloads the RPMs individually and using this code chunk to do so in order to maintain the modification date.

@evgeni
Copy link
Member

evgeni commented Oct 24, 2023

Nah, just make me read the code better 🙈
The part that fetches prod repo is using reposync, not urlretrieve, that is used only for "new" stuff coming from copr.
Back to square one evgeni.

@ehelms
Copy link
Member Author

ehelms commented Oct 25, 2023

Nah, just make me read the code better 🙈

Does the new structure help at all?

@ehelms ehelms force-pushed the download-unsigned-packages branch 2 times, most recently from ea4e4b5 to b476b65 Compare October 25, 2023 12:14
build_stage_repository Outdated Show resolved Hide resolved
sign_stage_rpms Outdated Show resolved Hide resolved
sign_stage_rpms Outdated Show resolved Hide resolved
list_unsigned_rpms Outdated Show resolved Hide resolved
list_unsigned_rpms Outdated Show resolved Hide resolved

output = check_output(cmd, universal_newlines=True, stderr=STDOUT)

if gpgkey.lower() not in output:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we should lower case it in handle_args?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was to make absolutely sure it's lowered when it's needed to avoid any future changes that might would break this identification of what is signed or not signed.

build_stage_repository Outdated Show resolved Hide resolved
Copy link
Member

@ekohl ekohl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also modify this as needed:

- Sign the RPMs in the release
- [ ] <%= rel_eng_script('download_rpms') %>
- [ ] <%= rel_eng_script('sign_rpms') %>
- [ ] <%= rel_eng_script('upload_rpm_signatures') %>
- [ ] <%= rel_eng_script('upload_rpms') %>
- Sign RPMs for client repos (call scripts with `PROJECT=client`)
- [ ] <%= rel_eng_script('download_rpms') %>
- [ ] <%= rel_eng_script('sign_rpms') %>
- [ ] <%= rel_eng_script('upload_rpm_signatures') %>
- [ ] <%= rel_eng_script('upload_rpms') %>

- [ ] <%= rel_eng_script('download_rpms') %>, <%= rel_eng_script('sign_rpms') %>, <%= rel_eng_script('upload_rpm_signatures') %>, <%= rel_eng_script('upload_rpms') %>

sign_stage_rpms Outdated Show resolved Hide resolved
@ehelms
Copy link
Member Author

ehelms commented Oct 25, 2023

Please also modify this as needed:

This is handled over here -- #285. I do not want to update the procedures until all the infrastructure is in place.

@ehelms
Copy link
Member Author

ehelms commented Oct 25, 2023

@evgeni I was able to add a partial optimization with filtering on download. The reason we cannot do a complete is repodiff returns the full NEVRA, e.g.

Added rubygem-pg-debuginfo-1.5.3-1.el8

And comps contains just the name rubygem-pg so we can at best do a token in match of the name.

@evgeni
Copy link
Member

evgeni commented Oct 25, 2023

You can split NEVRAs like this:

  package, _version, _release = nevra.rsplit('-', 2)

build_stage_repository Outdated Show resolved Hide resolved
build_stage_repository Outdated Show resolved Hide resolved
list_unsigned_rpms Outdated Show resolved Hide resolved
generate_stage_repository Outdated Show resolved Hide resolved
build_stage_repository Outdated Show resolved Hide resolved
build_stage_repository Outdated Show resolved Hide resolved
build_stage_repository Outdated Show resolved Hide resolved
@ehelms
Copy link
Member Author

ehelms commented Oct 25, 2023

I noticed that rpmsign does seem to help prevent us from re-signing an already signed package:

warning: tmp/foreman/3.8/el8/source/rubygem-unf_ext-0.0.8.2-1.el8.src.rpm already contains identical signature, skipping

settings Dismissed Show dismissed Hide dismissed
@@ -73,6 +73,7 @@

# Short GPGKEY is used by koji and is the last 8 chars or the full key
GPGKEY="$(echo ${FULLGPGKEY: -8} | tr '[A-Z]' '[a-z]')"
HALFGPGKEY="$(echo ${FULLGPGKEY: -16} | tr '[A-Z]' '[a-z]')"

Check notice

Code scanning / shellcheck

Double quote to prevent globbing and word splitting. Note

Double quote to prevent globbing and word splitting.
@@ -73,6 +73,7 @@

# Short GPGKEY is used by koji and is the last 8 chars or the full key
GPGKEY="$(echo ${FULLGPGKEY: -8} | tr '[A-Z]' '[a-z]')"
HALFGPGKEY="$(echo ${FULLGPGKEY: -16} | tr '[A-Z]' '[a-z]')"

Check notice

Code scanning / shellcheck

Don't use [] around classes in tr, it replaces literal square brackets. Note

Don't use [] around classes in tr, it replaces literal square brackets.
@@ -73,6 +73,7 @@

# Short GPGKEY is used by koji and is the last 8 chars or the full key
GPGKEY="$(echo ${FULLGPGKEY: -8} | tr '[A-Z]' '[a-z]')"
HALFGPGKEY="$(echo ${FULLGPGKEY: -16} | tr '[A-Z]' '[a-z]')"

Check notice

Code scanning / shellcheck

Don't use [] around classes in tr, it replaces literal square brackets. Note

Don't use [] around classes in tr, it replaces literal square brackets.
sign_stage_rpms Fixed Show fixed Hide fixed
sign_stage_rpms Fixed Show fixed Hide fixed
@ehelms
Copy link
Member Author

ehelms commented Oct 25, 2023

Updated with some included README workflow layout. This can be merged prior to (theforeman/foreman-infra#1948) for nightly, but theforeman/foreman-infra#1948 must be merged prior to this if I am to test this for releases and rsyncing to stagingyum.

@evgeni
Copy link
Member

evgeni commented Oct 27, 2023

One thing that I realized while reviewing theforeman/foreman-infra#1948:
Today, nothing cleans up ~yumrepostage/rsync_cache and the script that copies over the cache to the final destination doesn't use --delete. This is fine™, as we're aiming at a "merged" workflow for the release repos anyway, but it also means that the nightly repo might grow and not get cleanup (at least that's how I read the code). We could add a cleanup step once things get copied to the "final destination", but then the rsync from the release engineer workstation will be longer (as it will have to transfer the already signed files again).

Shouldn't block this PR, but something to keep an eye on.

Edit (after reading man rsync(1) for the tenth time): this PR uses --delete-after which implies --delete, so the cache will be cleaned up but the caller.

All good.

Copy link
Member

@evgeni evgeni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sprinkled a few comments here and there, but overall I like the flow!

list_unsigned_rpms Show resolved Hide resolved
list_unsigned_rpms Show resolved Hide resolved
generate_stage_repository Show resolved Hide resolved
build_stage_repository Outdated Show resolved Hide resolved
…of Copr

When the version is nightly, this will not pull from production but instead
treat what is in Copr as the source of truth. At the end, the output for
a versioned repository is a list of unsigned packages.
@ehelms ehelms merged commit c45ad89 into theforeman:master Oct 27, 2023
2 of 3 checks passed
Comment on lines +242 to +245
if ':' in version:
version_without_epoch = version.rsplit(':')[1]
else:
version_without_epoch = version
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this too smart? version_without_epoch = version.rsplit(':', 1)[-1]
#286

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

3 participants