Ubuntu 24.04 jobs directly fail after installing libssl-dev package #9937

Closed
2 of 14 tasks
jwillemsen opened this issue May 27, 2024 · 13 comments
Assignees
Labels
Area: Common Tools awaiting-deployment Code complete; awaiting deployment and/or deployment in progress bug report OS: Ubuntu

Comments

@jwillemsen

Description

I am adding gcc13 to ubuntu 24.04 using

  sudo apt-get --yes update
  sudo apt-get --yes install libxerces-c-dev libssl-dev g++-13

Directly after this the jobs all fail with

Restarting services...
 /etc/needrestart/restart.d/systemd-manager
 systemctl restart packagekit.service php8.3-fpm.service runner-provisioner.service systemd-journald.service systemd-networkd.service systemd-resolved.service systemd-udevd.service udisks2.service walinuxagent.service
Error: The operation was canceled.

See https://github.com/RemedyIT/taox11/actions/runs/9252204973/job/25449380677?pr=387

Platforms affected

  • Azure DevOps
  • GitHub Actions - Standard Runners
  • GitHub Actions - Larger Runners

Runner images affected

  • Ubuntu 20.04
  • Ubuntu 22.04
  • Ubuntu 24.04
  • macOS 11
  • macOS 12
  • macOS 13
  • macOS 13 Arm64
  • macOS 14
  • macOS 14 Arm64
  • Windows Server 2019
  • Windows Server 2022

Image version and build link

20240516.4.0

Is it regression?

Yes

Expected behavior

Job runs

Actual behavior

Job fails

Repro steps

See the taox11 repo at https://github.com/RemedyIT/taox11

@jwillemsen
Author

Log file
job-logs.txt

@SlawekNowy

SlawekNowy commented May 28, 2024

Confirmed. This is due to libssh2-1 being pulled in alongside libssh-dev. Because of a change in the needrestart defaults, this now causes the remote-provider service to exit early.

See
https://discourse.ubuntu.com/t/needrestart-changes-in-ubuntu-24-04-service-restarts/44671 for details.

@jwillemsen
Author

jwillemsen commented May 28, 2024

I moved to 24.04 because gcc13 was removed on 22.04 (#9866). This will probably hit more users.

@jwillemsen jwillemsen changed the title from "Ubuntu 24.04 jobs directly fail after installing packages" to "Ubuntu 24.04 jobs directly fail after installing g++-13 packages" May 28, 2024
Silverlan added a commit to Silverlan/pragma that referenced this issue May 28, 2024
This should be reverted once the core issue has been resolved: actions/runner-images#9937
@jmarrec
Contributor

jmarrec commented May 28, 2024

I'm actually hitting the same issue on my fork of actions/python-versions, because it has 'libssl-dev' in there: https://github.com/actions/python-versions/blob/c990e6da9586f6b33bf19aba61c934ded6ec28c5/builders/ubuntu-python-builder.psm1#L71

I've tried setting combinations of these env vars, to no avail:

NEEDRESTART_UI="NeedRestart::UI::Debconf"
NEEDRESTART_MODE="l"
NEEDRESTART_SUSPEND="y"
DEBIAN_FRONTEND="noninteractive"

https://github.com/liske/needrestart/blob/d3e33025543cc8459e470379a38c967c362cd4df/man/needrestart.1#L71-L73

In the end I, ahem, did sudo apt remove needrestart...
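
The workaround described above could be scripted roughly as follows. This is a minimal sketch and not something posted in the thread: the function name `install_without_needrestart`, the `run` helper and the `DRY_RUN` variable are hypothetical, and it assumes that removing needrestart is acceptable on a throwaway CI runner.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and DRY_RUN are illustrative assumptions.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Remove needrestart first so that later package upgrades cannot
# trigger a restart of the runner's systemd services, then install.
install_without_needrestart() {
  run sudo apt-get --yes remove needrestart
  run sudo apt-get --yes update
  run sudo apt-get --yes install "$@"
}

# Dry run: only print what would be executed for the packages
# named in the issue description.
DRY_RUN=1
install_without_needrestart libxerces-c-dev libssl-dev g++-13
```

With `DRY_RUN` unset or set to 0, the same function would run the commands for real, so the removal always happens before any package upgrade can invoke needrestart.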

@jwillemsen jwillemsen changed the title from "Ubuntu 24.04 jobs directly fail after installing g++-13 packages" to "Ubuntu 24.04 jobs directly fail after installing libssl-dev package" May 28, 2024
jmarrec added a commit to jmarrec/python-versions that referenced this issue May 28, 2024
@jdetaeye

I used "sudo apt install --no-upgrade ..." as workaround/solution.

@jwillemsen
Author

The --no-upgrade workaround doesn't seem to work for me. The moment the CI links with libssl, it just fails with ##[error]The operation was canceled. and no real error message.

@MichaIng

MichaIng commented May 29, 2024

In the end I, ahem, did sudo apt remove needrestart...

This is IMO the only real solution for GitHub runners. Because of needrestart, any upgrade of a package built from the OpenSSL sources triggers a restart of ALL running systemd services. This naturally kills the connection to the runner. OpenSSL/LibSSL (-dev) upgrades are common, and these packages are a dependency of many other packages and of many of the builds that GitHub Actions is typically used for. So whenever one installs libssl-dev to link against it, all installed packages from the OpenSSL source package are implicitly upgraded if an upgrade is available, which triggers the global systemd service restart and crashes the runner.

So remove needrestart from the runner image, and all is good! Of course, a general package update that brings all OpenSSL packages to the latest version alongside this change makes sense as well.

PR up: #9956
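
To check whether a failed job matches this explanation, one could scan its log for a needrestart restart line that names the runner service. The following is a rough sketch under that assumption; the function name `restart_hits_runner` is hypothetical, and the sample line is the one quoted in the issue description.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is an illustrative assumption.

# Read job log text on stdin and succeed (exit 0) if a
# "systemctl restart ..." line lists runner-provisioner.service.
restart_hits_runner() {
  grep -E '^[[:space:]]*systemctl restart ' \
    | grep -q -w -F 'runner-provisioner.service'
}

# Example using the restart line quoted in the issue description.
log_line=' systemctl restart packagekit.service php8.3-fpm.service runner-provisioner.service systemd-journald.service systemd-networkd.service systemd-resolved.service systemd-udevd.service udisks2.service walinuxagent.service'

if printf '%s\n' "$log_line" | restart_hits_runner; then
  echo "runner-provisioner.service is in the restart list"
fi
```

A downloaded log file could be checked the same way with `restart_hits_runner < job-logs.txt`.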

MichaIng added a commit to MichaIng/runner-images that referenced this issue May 29, 2024
since it triggers a restart of all systemd services, including the `runner-provisioner`, crashing the workflow, whenever packages from the OpenSSL/LibSSL package source are upgraded.

Solves: actions#9937

Signed-off-by: MichaIng <micha@dietpi.com>
@kishorekumar-anchala kishorekumar-anchala self-assigned this May 31, 2024
@kishorekumar-anchala
Contributor

Hi @jwillemsen,

We're unable to reproduce the error. Please retry now by forking the main branch.

@jmarrec
Contributor

jmarrec commented May 31, 2024

@kishorekumar-anchala What?

It takes little effort to reproduce.

name: Test needrestart

on:
  push:

jobs:

  build:
    runs-on: ubuntu-24.04

    steps:
    - name: install libssl-dev
      run: |
        sudo apt update -qq
        sudo apt install -y libssl-dev
    - name: ok?
      run: |
        echo "ok"

https://github.com/jmarrec/test-Github-Actions/actions/runs/9316043037/job/25643443846

Looking above, you also have about 10 linked pull requests from repos experiencing the same issue.

@kishorekumar-anchala
Contributor

Hi @jmarrec,

Could you please try with the latest Ubuntu image?

  build:
    runs-on: ubuntu-latest

@MichaIng

MichaIng commented May 31, 2024

@kishorekumar-anchala
Of course the issue does not appear when you build a fresh image with up-to-date APT packages, since it appears when you upgrade OpenSSL source based packages during a workflow run.

Within a GitHub Actions workflow run, it must never happen that all systemd services are restarted, the runner-provisioner.service in particular. Hence what needrestart does is naturally harmful for the GitHub runner use case: #9956
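
As an alternative to removing the package, needrestart can be told to leave individual services alone through its `override_rc` setting. The sketch below writes such a config snippet. It is untested against the runner image and is not what the thread proposes: the function name `write_needrestart_override`, the file name `90-keep-runner.conf` and the use of a scratch directory are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: function name, file name and target directory are assumptions.

# Write a needrestart config snippet into the given directory that
# excludes runner-provisioner.service from automatic restarts.
write_needrestart_override() {
  conf_dir="$1"
  mkdir -p "$conf_dir"
  # Quoted heredoc delimiter: the shell must not expand $nrconf.
  cat > "$conf_dir/90-keep-runner.conf" <<'EOF'
# Perl syntax: 0 means matching services are not selected for restart.
$nrconf{override_rc}{qr(^runner-provisioner\.service$)} = 0;
EOF
}

# Write to a scratch directory for inspection. On a real system the
# snippet would go to /etc/needrestart/conf.d, which requires root.
scratch_dir="$(mktemp -d)"
write_needrestart_override "$scratch_dir"
cat "$scratch_dir/90-keep-runner.conf"
```

This only protects the one named service. Other services in the restart list would still be restarted, which is one reason removing needrestart entirely may be the simpler choice for CI images.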

@kishorekumar-anchala
Contributor

Hi @MichaIng,

Thank you for your input. I have gone through it and investigated. I also ran the pipeline with the same change, and it works. I have raised a PR based on your input, credits to you 😊. Thank you!

@jmarrec, you can review @MichaIng's input.

@mikhailkoliada
Member

Fixed and rolled out now.

jmarrec added a commit to jmarrec/python-versions that referenced this issue Jun 10, 2024
jmarrec added a commit to jmarrec/python-versions that referenced this issue Jun 10, 2024
This reverts commit bcfac4118b1177f5bfae530a3eb118fe823b6932.
8 participants