[AI Generated] BugFix: Retry zypper operations on lock contention (exit code 7)#4400

Merged
LiliDeng merged 5 commits into main from
bugfix/zypper-lock-retry_270326_084743
Apr 2, 2026

@johnsongeorge-w
Collaborator

Summary

Zypper exits with code 7 (ZYPPER_EXIT_ERR_ZYPP) when another process holds the system management lock. This is a transient condition, but add_repository(), _install_packages(), and _uninstall_packages() in the Suse class all treated it as a hard failure.

This PR adds retry logic (up to 5 attempts, with wait_running_process plus a 10s delay between them) that detects exit code 7, waits for the competing zypper process to finish, and then retries the command.
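
For readers skimming the thread, the retry behaviour described above can be sketched roughly as follows. This is a minimal standalone illustration, not the code merged into lisa/operating_system.py: `Result`, `run_with_lock_retry`, and the `run`, `wait_for_zypper`, and `sleep` parameters are hypothetical names, and the constants only mirror the numbers quoted in this description.

```python
import time
from dataclasses import dataclass
from typing import Callable

# Values taken from the PR description; the names are assumptions.
ZYPPER_EXIT_LOCK = 7      # ZYPPER_EXIT_ERR_ZYPP: another process holds the lock
LOCK_MAX_ATTEMPTS = 5
LOCK_RETRY_DELAY = 10     # seconds


@dataclass
class Result:
    """Hypothetical stand-in for the result of a remote command."""
    exit_code: int
    stdout: str = ""


def run_with_lock_retry(
    run: Callable[[], Result],
    wait_for_zypper: Callable[[], None] = lambda: None,
    sleep: Callable[[float], None] = time.sleep,
) -> Result:
    """Run a zypper command, retrying while it reports lock contention."""
    result = Result(exit_code=ZYPPER_EXIT_LOCK)
    for _ in range(LOCK_MAX_ATTEMPTS):
        result = run()
        if result.exit_code != ZYPPER_EXIT_LOCK:
            break  # success or an unrelated failure: let the caller handle it
        # Lock contention: wait for the competing process, pause, then retry.
        wait_for_zypper()
        sleep(LOCK_RETRY_DELAY)
    return result
```

In this sketch the last result is returned even when every attempt hit the lock, so the caller still has to decide how to report exhaustion.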

Validation Results

| Image | Result |
| --- | --- |
| SUSE sles-15-sp6 gen2 2026.01.23 | PASSED |

…it code 7)

Zypper exits with code 7 (ZYPPER_EXIT_ERR_ZYPP) when another process
holds the system management lock.  This is a transient condition, but
add_repository(), _install_packages(), and _uninstall_packages() all
treated it as a hard failure.

Add retry logic (up to 5 attempts with 10s delay) that detects exit
code 7, waits for the competing zypper process to finish, then retries
the command.
Copilot AI review requested due to automatic review settings March 27, 2026 16:08
Contributor

Copilot AI left a comment

Pull request overview

This PR improves SUSE package management robustness by retrying zypper operations when they fail due to transient lock contention (exit code 7), instead of treating it as a hard failure.

Changes:

  • Add SUSE-specific constants for zypper lock exit code, max retries, and retry delay.
  • Wrap zypper ar, zypper rm, and zypper in executions with retry logic that waits for competing zypper processes and then retries.
Comments suppressed due to low confidence (1)

lisa/operating_system.py:2496

  • After the retry loop, if zypper still returns exit code 7, the code falls into the generic “Unexpected exit_code” path. Consider handling the “lock contention after max retries” case explicitly to produce a clearer, actionable error (include retry count/delay and indicate it was zypper lock contention).
            if install_result.exit_code == self._ZYPPER_EXIT_LOCK:
                self._log.debug(
                    f"zypper lock contention (exit code 7), "
                    f"retry {retry_num + 1}/{self._ZYPPER_LOCK_MAX_RETRIES}"
                )
                self.wait_running_process("zypper")
                time.sleep(self._ZYPPER_LOCK_RETRY_DELAY)
                continue
            break

        # zypper exit codes that indicate dependency/resolution issues:
        # 1: ZYPPER_EXIT_ERR_BUG - Unexpected situation
        # 4: ZYPPER_EXIT_INF_CAP_NOT_FOUND - Capability not found or dependency problem
        # 100: ZYPPER_EXIT_INF_UPDATE_NEEDED - Updates available
        # If installation failed due to dependency conflicts, retry with
        # --force-resolution to allow zypper to automatically resolve conflicts
        if install_result.exit_code in (1, 4, 100):
            self._log.debug(
                f"Installation failed with exit code {install_result.exit_code}, "
                "retrying with --force-resolution to resolve dependency conflicts."
            )
            command_with_force = f"zypper --non-interactive {add_args}"
            if not signed:
                command_with_force += " --no-gpg-checks "
            command_with_force += f" in --force-resolution {' '.join(packages)}"
            install_result = self._node.execute(
                command_with_force, shell=True, sudo=True, timeout=timeout
            )

        if install_result.exit_code in (1, 4, 100):
            raise LisaException(
                f"Failed to install {packages}. exit_code: {install_result.exit_code}, "
                f"stdout: {install_result.stdout}, stderr: {install_result.stderr}"
            )
        elif install_result.exit_code == 0:
            self._log.debug(f"{packages} is/are installed successfully.")
        else:
            raise LisaException(

Comment thread lisa/operating_system.py
Comment thread lisa/operating_system.py Outdated
Copilot AI requested a review from LiliDeng March 28, 2026 02:45
@LiliDeng
Collaborator

@copilot no need to change the code format, it brings below CI check error.

nox > flake8 
./lisa/operating_system.py:230:5: E704 multiple statements on one line (def)
./lisa/operating_system.py:230:68: BLK100 Black would make changes.
./lisa/operating_system.py:740:17: E701 multiple statements on one line (colon)
./lisa/operating_system.py:761:19: E701 multiple statements on one line (colon)
./lisa/operating_system.py:1611:19: E701 multiple statements on one line (colon)

Copilot AI review requested due to automatic review settings March 28, 2026 03:32
@johnsongeorge-w johnsongeorge-w review requested due to automatic review settings March 28, 2026 03:32
Contributor

Copilot AI commented Mar 28, 2026

@copilot no need to change the code format, it brings below CI check error.

nox > flake8 

...

Fixed in commit e4194d4. Reverted the four pre-existing lines that Black reformatted (lines 230, 740, 761, 1611) back to their original multi-line form, so those flake8 errors (E704/E701/BLK100) are gone. The actual code changes are untouched.

- Move wait_running_process into retry wrapper (before every attempt)
- Wrap _initialize_package_installation refresh with retry
- Wrap _update_packages with retry
- Wrap --force-resolution install path with retry
- Elevate lock contention log from debug to warning
- Remove dead code after exhaustion raise
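
A rough sketch of the wrapper shape this commit message describes: wait before every attempt, log contention at warning level, and raise explicitly once retries are exhausted. The names `retry_on_lock` and `ZypperLockError`, and the callable parameters, are hypothetical and chosen for illustration only; the actual change lives in the Suse class and may differ in detail.

```python
import logging
import time
from typing import Callable

log = logging.getLogger("zypper_retry_sketch")

ZYPPER_EXIT_LOCK = 7  # exit code quoted in this PR


class ZypperLockError(RuntimeError):
    """Hypothetical error type for lock contention that outlasts all retries."""


def retry_on_lock(
    run: Callable[[], int],
    wait_for_zypper: Callable[[], None],
    max_attempts: int = 5,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Return the first exit code that is not lock contention, else raise."""
    for attempt in range(1, max_attempts + 1):
        # Wait for any competing zypper process before every attempt.
        wait_for_zypper()
        exit_code = run()
        if exit_code != ZYPPER_EXIT_LOCK:
            return exit_code
        log.warning(
            "zypper lock contention (exit code %d), attempt %d/%d",
            ZYPPER_EXIT_LOCK, attempt, max_attempts,
        )
        if attempt < max_attempts:
            sleep(delay)
    # Raising here means no caller code has to re-check for exit code 7.
    raise ZypperLockError(
        f"zypper lock contention persisted after {max_attempts} attempts "
        f"({delay}s delay between attempts)"
    )
```

Raising on exhaustion keeps a persistent lock from falling through to a generic "unexpected exit code" path, which is the concern raised in the earlier review comment.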
Copilot AI review requested due to automatic review settings April 1, 2026 04:52
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Comment thread lisa/operating_system.py
Comment thread lisa/operating_system.py
Zypper exits with code 8 (ZYPPER_EXIT_ERR_COMMIT) when the RPM
transaction fails because another process holds .rpm.lock. The
previous retry logic only caught exit code 7 (zypper-level lock)
but missed this RPM-level lock that occurs during the transaction
phase.

Add _is_zypper_lock_error() helper that detects both:
- Exit code 7: zypper management lock
- Exit code 8 + 'can't create transaction lock on ...rpm.lock':
  RPM transaction lock
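
The detection rule in this commit message could look roughly like the sketch below. It is an assumption-laden illustration: the merged helper is a method named _is_zypper_lock_error(), whose signature and exact string matching are not shown in this thread, so the free function `is_zypper_lock_error` and its `output` parameter are hypothetical.

```python
ZYPPER_EXIT_LOCK = 7    # zypper management lock held by another process
ZYPPER_EXIT_COMMIT = 8  # ZYPPER_EXIT_ERR_COMMIT: RPM transaction failed


def is_zypper_lock_error(exit_code: int, output: str) -> bool:
    """Return True when a zypper failure looks like transient lock contention."""
    if exit_code == ZYPPER_EXIT_LOCK:
        return True
    if exit_code == ZYPPER_EXIT_COMMIT:
        # Exit code 8 covers many commit failures; only treat it as a lock
        # error when the output mentions the RPM transaction lock.
        lowered = output.lower()
        return "can't create transaction lock" in lowered and "rpm.lock" in lowered
    return False
```

Checking the message text as well as the exit code matters here because retrying every exit code 8 would also retry genuine transaction failures.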
@LiliDeng LiliDeng merged commit 8427ce3 into main Apr 2, 2026
58 checks passed
@LiliDeng LiliDeng deleted the bugfix/zypper-lock-retry_270326_084743 branch April 2, 2026 02:56