Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent failed upgrade from restarting in initramfs #996

Merged
merged 2 commits into from
Jan 20, 2023

Conversation

matejmatuska
Copy link
Member

@matejmatuska matejmatuska commented Dec 7, 2022

When the upgrade fails in the initramfs the dracut shell is entered.
Upon exiting the dracut shell, the upgrade.target is restarted which
causes the upgrade.service, which runs the leapp upgrade, to rerun as
well.

This commit fixes that by creating a "flag" file when the upgrade
fails, whose existence is checked before reruning the upgrade and the
upgrade is prevented in such case.

Also, a new removeupgradeartifacts actor is introduced to clean up leftover upgrade artifacts, including the upgrade failed flag file, at the beginning of the upgrade process.

Jira ref.: OAMG-4224
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1905519

@github-actions
Copy link

github-actions bot commented Dec 7, 2022

Thank you for contributing to the Leapp project!

Please note that every PR needs to comply with the Leapp Guidelines and must pass all tests in order to be mergeable.
If you want to request a review or rebuild a package in copr, you can use following commands as a comment:

  • review please to notify leapp developers of review request
  • /packit copr-build to submit a public copr build using packit

To launch regression testing public members of oamg organization can leave the following comment:

  • /rerun to schedule basic regression tests using this pr build and leapp*master* as artifacts
  • /rerun 42 to schedule basic regression tests using this pr build and leapp*PR42* as artifacts
  • /rerun-all to schedule all tests (including sst) using this pr build and leapp*master* as artifacts
  • /rerun-all 42 to schedule all tests (including sst) using this pr build and leapp*PR42* as artifacts

Please open ticket in case you experience technical problem with the CI. (RH internal only)

Note: In case there are problems with tests not being triggered automatically on new PR/commit or pending for a long time, please consider rerunning the CI by commenting leapp-ci build (might require several comments). If the problem persists, contact leapp-infra.

@matejmatuska matejmatuska force-pushed the initramfs-upgrade-failed-restart branch from b2d5f2f to f9bbe50 Compare December 8, 2022 11:18
@matejmatuska matejmatuska force-pushed the initramfs-upgrade-failed-restart branch 2 times, most recently from c790e43 to f1cf392 Compare December 8, 2022 16:15
@matejmatuska matejmatuska force-pushed the initramfs-upgrade-failed-restart branch from f50bb28 to 47f9a9f Compare December 20, 2022 16:07
@matejmatuska matejmatuska marked this pull request as ready for review December 20, 2022 16:08
@matejmatuska matejmatuska force-pushed the initramfs-upgrade-failed-restart branch from 47f9a9f to 25f8fb8 Compare January 3, 2023 14:24
@matejmatuska matejmatuska force-pushed the initramfs-upgrade-failed-restart branch 2 times, most recently from e381f2b to eef4ba2 Compare January 16, 2023 14:24
@matejmatuska matejmatuska force-pushed the initramfs-upgrade-failed-restart branch from 0a3a947 to 0dcdb6a Compare January 18, 2023 19:54
@matejmatuska matejmatuska force-pushed the initramfs-upgrade-failed-restart branch from 51ecc7d to ed9eaf6 Compare January 19, 2023 12:48
@matejmatuska matejmatuska force-pushed the initramfs-upgrade-failed-restart branch from a0ff9f8 to 77510a3 Compare January 19, 2023 13:17
When the upgrade fails in the initramfs the dracut shell is entered.
Upon exiting the dracut shell, the upgrade.target is restarted which
causes the upgrade.service, which runs the leapp upgrade, to rerun as
well.

This commit fixes that by creating a "flag" file when the upgrade
fails, whose existence is checked before reruning the upgrade and the
upgrade is prevented in such case.

Also, a new removeupgradeartifacts actor is introduced to clean up leftover upgrade artifacts, including the upgrade failed flag file, at the beginning of the upgrade process.

Jira ref.: OAMG-4224
@matejmatuska matejmatuska force-pushed the initramfs-upgrade-failed-restart branch from 77510a3 to 501805a Compare January 19, 2023 13:18
Copy link
Member

@Rezney Rezney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works as expected.

@Rezney Rezney merged commit f5a3d62 into oamg:master Jan 20, 2023
@pirat89 pirat89 added the changelog-checked The merger/reviewer checked the changelog draft document and updated it when relevant label Jan 20, 2023
[ -f "$NEWROOT$LEAPP_FAILED_FLAG_FILE" ] && {
echo >&2 "Found file $NEWROOT$LEAPP_FAILED_FLAG_FILE"
echo >&2 "Error: Leapp previously failed and cannot continue, returning back to emergency shell"
echo >&2 "Please file a support case with $NEWROOT/var/log/leapp/leapp-upgrade.log attached"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thinking about that, most likely the rdsosreport.txt file will be needed also. @Rezney @matejmatuska wdyt?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think rdsosreport.txt could be useful to have.

pirat89 added a commit to pirat89/leapp-repository that referenced this pull request Feb 21, 2023
## Packaging
- Requires cpio (oamg#979)
- Requires python3-gobject-base, NetworkManager-libnm (oamg#969)
- Bump leapp-repository-dependencies to 9 (oamg#969, oamg#979)

## Upgrade handling
### Fixes
- Add leapp RHUI packages to an allowlist to drop confusing reports (oamg#995)
- Check only mounted XFS partitions (oamg#1027)
- Detect the kernel-core RPM instead of kernel to prevent an error during post-upgrade phases (oamg#1024)
- Disable the amazon-id DNF plugin on AWS during the upgrade stage to omit error messages during the upgrade process caused by missing network connection (oamg#990)
- Do not create new *pyc files when running leapp after the DNF upgrade transaction (oamg#1017)
- Enable upgrades on s390x when /boot is part of rootfs (oamg#991)
- Extend the allow list of RHUI clients by azure-sap-apps to omit confusing report (oamg#974)
- Filter out PES events unrelated for the used upgrade path and handle overlapping events (oamg#1008)
- Fix scan of ceph volumes on systems without ceph-osd (oamg#1011)
- Fix scan of ceph volumes when ceph-osd container is not found (oamg#986)
- Fix systemd symlinks that become incorrect during the IPU (oamg#972)
- Fix the check of memory (RAM) limits (oamg#984)
- Fix the upgrade of IBM Z machines configured with ZFCP (oamg#983)
- Ignore external accounts in /etc/passwd (oamg#958)
- Inhibit the upgrade when entries in /etc/fstab cause overshadowing during the upgrade (oamg#1009)
- Prevent leapp failures caused by re-run of leapp in the upgrade initramfs after previous failure, which causes additional confusing error message hiding original bugs (oamg#996)
- Prevent the upgrade with RHSM when a baseos and an appstream target repositories are not discovered (oamg#1001)
- RHUI(Azure) handle correctly various SAP images (oamg#1037)
- Rework the network configuration handling and parse the configuration data properly (oamg#969)
- Set RHSM release for non-ga and non-beta channels (oamg#1033)
- Use the "grub" library to find the GRUB device (oamg#989)
- [IPU 7 -> 8] Detect corrupted grubenv file (oamg#1012)
- [IPU 7 -> 8] Ensure that rsyncd stays enabled if it is enabled prior the upgrade(oamg#1043)
- [IPU 7 -> 8] Ensure that satellite metapackages are installed after the upgrade (oamg#994)
- [IPU 7 -> 8] Ensure the device_cio_free service stays enabled on s390x after the upgrade (oamg#977)
- [IPU 7 -> 8] Fixed checks for RHEL SAP IPU 7.9 -> 8.6 (oamg#978)
- [IPU 7 -> 8] Fixed migration of ntp to chrony when the ntp service is masked (oamg#966)
- [IPU 7 -> 8] Prevent the traceback during migration of sendmail configuration files when the package is not installed (oamg#1041)
- [IPU 7 -> 8] Satellite: reindex all related databases to fix issues due to new locales in RHEL 8 (oamg#1007, oamg#1018)
- [IPU 7 -> 8] Use the correct domain name in SSSD reports (oamg#1040)
- [IPU 8 -> 9] Added checks for RHEL SAP IPU 8.6 -> 9.0 (oamg#978)
- [IPU 8 -> 9] CheckVDO: Ask user for the confirmation only on failures and undetermined devices (oamg#961)
- [IPU 8 -> 9] Fix the kernel detection during initramfs creation for new kernel on RHEL 9.2+ (oamg#1048)
- [IPU 8 -> 9] Fix the upgrade on Azure using RHUI for SAP Apps images (oamg#975)
- [IPU 8 -> 9] Handle correctly firewalld version 0.8 (oamg#963)

### Enhancements
- Set new upgrade paths (oamg#988):
-- RHEL 7.9 -> 8.8, 8.6 (default: 8.8)
-- RHEL 8.6 -> 9.0
-- RHEL 8.8 -> 9.2
- Check that used leapp data are valid and compatible with the installed leapp-repository (oamg#1003)
- Detect a proxy configuration in YUM/DNF and adjust an error msg on issues caused by the configuration (oamg#914)
- Detect and report systemd symlinks that are broken before the upgrade (oamg#972)
- Drop obsoleted upgrade paths (oamg#1047)
- Improve remediation instructions for packages in unknown repositories (oamg#1010)
- Improve the error message to guide users when discovered more space is needed (oamg#956)
- Introduce --nogpgcheck option to skip checking of RPM signatures (oamg#910)
- Introduced an option to use an ISO file as a target RHEL version content source (oamg#979)
- Introduced possibility to specify what systemd services should be enabled/disabled on the upgraded system (oamg#964)
- Map the target repositories also based on the installed content (oamg#967)
- Provide common information about systemd services (oamg#959)
- Register subscribed systems automatically to Red Hat Insights unless --no-insights-register is used (oamg#1000)
- Remove obsoleted GPG keys provided by RH after the upgrade to prevent errors (oamg#1022)
- Run upgrade process with checking RPM signatures by default (oamg#910, oamg#993, oamg#1025)
- Save breadcrumbs results as RHSM facts (oamg#1002)
- Small improvements in various reports (oamg#1038, oamg#1039, oamg#1032)
- [IPU 8 -> 9] Detect CIFS also when upgrading from RHEL8 to RHEL9 (PR1035)
- [IPU 8 -> 9] Detect RoCE on IBM Z machines and check the configuration is safe for the upgrade (oamg#1030)
- [IPU 8 -> 9] Enable upgrades of RHEL 8 for SAP HANA to RHEL 9 on ppc64le (oamg#1042)
- [IPU 8 -> 9] Improve the handling of blocklisted certificates (oamg#992)

## Additional changes interesting for devels
- Started work on bringing up networking inside the upgrade initramfs - currently available for testing and development purposes when LEAPP_DEVEL_INITRAM_NETWORK is set (oamg#960)
- Add leapp debug tools to the upgrade initramfs - dracut upgrade module (oamg#997)
- Enable disabling of DNF plugins in the dnfplugin library (oamg#990)
@pirat89 pirat89 mentioned this pull request Feb 21, 2023
pirat89 added a commit that referenced this pull request Feb 21, 2023
## Packaging
- Requires cpio (#979)
- Requires python3-gobject-base, NetworkManager-libnm (#969)
- Bump leapp-repository-dependencies to 9 (#969, #979)

## Upgrade handling
### Fixes
- Add leapp RHUI packages to an allowlist to drop confusing reports (#995)
- Check only mounted XFS partitions (#1027)
- Detect the kernel-core RPM instead of kernel to prevent an error during post-upgrade phases (#1024)
- Disable the amazon-id DNF plugin on AWS during the upgrade stage to omit error messages during the upgrade process caused by missing network connection (#990)
- Do not create new *pyc files when running leapp after the DNF upgrade transaction (#1017)
- Enable upgrades on s390x when /boot is part of rootfs (#991)
- Extend the allow list of RHUI clients by azure-sap-apps to omit confusing report (#974)
- Filter out PES events unrelated for the used upgrade path and handle overlapping events (#1008)
- Fix scan of ceph volumes on systems without ceph-osd (#1011)
- Fix scan of ceph volumes when ceph-osd container is not found (#986)
- Fix systemd symlinks that become incorrect during the IPU (#972)
- Fix the check of memory (RAM) limits (#984)
- Fix the upgrade of IBM Z machines configured with ZFCP (#983)
- Ignore external accounts in /etc/passwd (#958)
- Inhibit the upgrade when entries in /etc/fstab cause overshadowing during the upgrade (#1009)
- Prevent leapp failures caused by re-run of leapp in the upgrade initramfs after previous failure, which causes additional confusing error message hiding original bugs (#996)
- Prevent the upgrade with RHSM when a baseos and an appstream target repositories are not discovered (#1001)
- RHUI(Azure) handle correctly various SAP images (#1037)
- Rework the network configuration handling and parse the configuration data properly (#969)
- Set RHSM release for non-ga and non-beta channels (#1033)
- Use the "grub" library to find the GRUB device (#989)
- [IPU 7 -> 8] Detect corrupted grubenv file (#1012)
- [IPU 7 -> 8] Ensure that rsyncd stays enabled if it is enabled prior the upgrade(#1043)
- [IPU 7 -> 8] Ensure that satellite metapackages are installed after the upgrade (#994)
- [IPU 7 -> 8] Ensure the device_cio_free service stays enabled on s390x after the upgrade (#977)
- [IPU 7 -> 8] Fixed checks for RHEL SAP IPU 7.9 -> 8.6 (#978)
- [IPU 7 -> 8] Fixed migration of ntp to chrony when the ntp service is masked (#966)
- [IPU 7 -> 8] Prevent the traceback during migration of sendmail configuration files when the package is not installed (#1041)
- [IPU 7 -> 8] Satellite: reindex all related databases to fix issues due to new locales in RHEL 8 (#1007, #1018)
- [IPU 7 -> 8] Use the correct domain name in SSSD reports (#1040)
- [IPU 8 -> 9] Added checks for RHEL SAP IPU 8.6 -> 9.0 (#978)
- [IPU 8 -> 9] CheckVDO: Ask user for the confirmation only on failures and undetermined devices (#961)
- [IPU 8 -> 9] Fix the kernel detection during initramfs creation for new kernel on RHEL 9.2+ (#1048)
- [IPU 8 -> 9] Fix the upgrade on Azure using RHUI for SAP Apps images (#975)
- [IPU 8 -> 9] Handle correctly firewalld version 0.8 (#963)

### Enhancements
- Set new upgrade paths (#988):
-- RHEL 7.9 -> 8.8, 8.6 (default: 8.8)
-- RHEL 8.6 -> 9.0
-- RHEL 8.8 -> 9.2
- Check that used leapp data are valid and compatible with the installed leapp-repository (#1003)
- Detect a proxy configuration in YUM/DNF and adjust an error msg on issues caused by the configuration (#914)
- Detect and report systemd symlinks that are broken before the upgrade (#972)
- Drop obsoleted upgrade paths (#1047)
- Improve remediation instructions for packages in unknown repositories (#1010)
- Improve the error message to guide users when discovered more space is needed (#956)
- Introduce --nogpgcheck option to skip checking of RPM signatures (#910)
- Introduced an option to use an ISO file as a target RHEL version content source (#979)
- Introduced possibility to specify what systemd services should be enabled/disabled on the upgraded system (#964)
- Map the target repositories also based on the installed content (#967)
- Provide common information about systemd services (#959)
- Register subscribed systems automatically to Red Hat Insights unless --no-insights-register is used (#1000)
- Remove obsoleted GPG keys provided by RH after the upgrade to prevent errors (#1022)
- Run upgrade process with checking RPM signatures by default (#910, #993, #1025)
- Save breadcrumbs results as RHSM facts (#1002)
- Small improvements in various reports (#1038, #1039, #1032)
- [IPU 8 -> 9] Detect CIFS also when upgrading from RHEL8 to RHEL9 (PR1035)
- [IPU 8 -> 9] Detect RoCE on IBM Z machines and check the configuration is safe for the upgrade (#1030)
- [IPU 8 -> 9] Enable upgrades of RHEL 8 for SAP HANA to RHEL 9 on ppc64le (#1042)
- [IPU 8 -> 9] Improve the handling of blocklisted certificates (#992)

## Additional changes interesting for devels
- Started work on bringing up networking inside the upgrade initramfs - currently available for testing and development purposes when LEAPP_DEVEL_INITRAM_NETWORK is set (#960)
- Add leapp debug tools to the upgrade initramfs - dracut upgrade module (#997)
- Enable disabling of DNF plugins in the dnfplugin library (#990)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog-checked The merger/reviewer checked the changelog draft document and updated it when relevant
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants