quadlet kube: correctly mark unit as failed #25481

Luap99 · 2025-03-05T13:53:14Z

When no containers could be started we need to make sure the unit status
reflects this. This means we should not send the READY=1 message and not
keep the service container running when we were unable to start any
container.

There is the question of what should happen when only a subset was started.
For systemd a unit can only be either running or failed. And as podman kube
play also just keeps the partially started pods running, I opted to let
systemd keep considering this as success.

Fixes #20667
Fixes https://issues.redhat.com/browse/RHEL-80471

And two more minor fixes, see the first two commits

Does this PR introduce a user-facing change?

Fixes an issue where kube quadlet units would not report an error (and stay running) even when the pod failed to start.

It is very bad practice to print to stdout in our backend code without any real context. The exact same error message is returned to the caller and printed in the CLI frontend, where it should be. Therefore drop this print as it is redundant. Signed-off-by: Paul Holzinger <pholzing@redhat.com>

The first condition checks an error where no error is returned, and the second checks err even though err == nil was already matched above, so we know the error is not nil here. Also replace os.IsNotExist(err) with errors.Is(err, os.ErrNotExist), as that should be used for new code. This should not change behavior in any way. Signed-off-by: Paul Holzinger <pholzing@redhat.com>

openshift-ci · 2025-03-05T13:53:20Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Luap99

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [Luap99]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

packit-as-a-service · 2025-03-05T13:56:12Z

Ephemeral COPR build failed. @containers/packit-build please check.

ygalblum

Small nits

pkg/domain/infra/abi/play.go

When no containers could be started we need to make sure the unit status reflects this. This means we should not send the READY=1 message and not keep the service container running when we were unable to start any container. There is the question of what should happen when only a subset was started. For systemd a unit can only be either running or failed. And as podman kube play also just keeps the partially started pods running, I opted to let systemd keep considering this as success. Fixes containers#20667 Fixes https://issues.redhat.com/browse/RHEL-80471 Signed-off-by: Paul Holzinger <pholzing@redhat.com>

mheon · 2025-03-05T15:45:07Z

/lgtm

ygalblum · 2025-03-05T15:56:36Z

pkg/domain/infra/abi/play.go

+	setRanContainers := func(r *entities.PlayKubeReport) {
+		if !ranContainers {
+			for _, p := range r.Pods {
+				// If the list of container errors is less then the total number of pod containers then we know it didn't start.


Should the comment be that it did start?

Oh yes, I guess I'll just sneak it into my next PR as a small typo-fix commit.

It did start there, as pointed out by Ygal on containers#25481. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
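For readers following the comment discussion, here is a self-contained sketch of the check the corrected comment describes: a pod with fewer container errors than containers has at least one container that did start. The types `podReport` and `playReport` and the helper `newRanTracker` are hypothetical simplifications, not podman's entities package, and the sketch assumes each failed container contributes at most one error.

```go
package main

import "fmt"

// podReport and playReport are hypothetical, trimmed-down report types
// used only for this illustration.
type podReport struct {
	Containers      []string // containers that belong to the pod
	ContainerErrors []string // assumed: at most one entry per failed container
}

type playReport struct {
	Pods []podReport
}

// newRanTracker returns a record closure that latches to true once any
// report contains a pod with at least one started container, and a getter
// for the latched value.
func newRanTracker() (record func(r *playReport), ran func() bool) {
	ranContainers := false
	record = func(r *playReport) {
		if ranContainers {
			return // already known, nothing more to learn
		}
		for _, p := range r.Pods {
			// Fewer errors than containers means at least one did start.
			if len(p.ContainerErrors) < len(p.Containers) {
				ranContainers = true
				return
			}
		}
	}
	ran = func() bool { return ranContainers }
	return record, ran
}

func main() {
	record, ran := newRanTracker()

	// Every container of the only pod failed: the flag stays false.
	record(&playReport{Pods: []podReport{{
		Containers:      []string{"a"},
		ContainerErrors: []string{"a: failed to start"},
	}}})
	fmt.Println(ran())

	// One of two containers failed: something did start, the flag latches.
	record(&playReport{Pods: []podReport{{
		Containers:      []string{"a", "b"},
		ContainerErrors: []string{"b: failed to start"},
	}}})
	fmt.Println(ran())
}
```

Latching the flag means that once any report shows a started container, later reports with only failures cannot flip it back, which fits the decision to treat a partial start as success.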

Luap99 added 2 commits March 5, 2025 14:50

openshift-ci bot added the release-note label Mar 5, 2025

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 5, 2025

Luap99 force-pushed the kube-sdnotify-error branch from 68c31df to bacab64 Compare March 5, 2025 14:05

ygalblum reviewed Mar 5, 2025


Luap99 force-pushed the kube-sdnotify-error branch from bacab64 to 945aade Compare March 5, 2025 14:54

openshift-ci bot assigned mheon Mar 5, 2025

openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 5, 2025

ygalblum reviewed Mar 5, 2025


openshift-merge-bot bot merged commit 919247a into containers:main Mar 5, 2025
65 of 82 checks passed

Luap99 deleted the kube-sdnotify-error branch March 5, 2025 16:11

Luap99 added a commit to Luap99/libpod that referenced this pull request Mar 13, 2025

pkg/domain/infra/abi/play.go: fix one comment

5207fee

It did start there, as pointed out by Ygal on containers#25481. Signed-off-by: Paul Holzinger <pholzing@redhat.com>

stale-locking-app bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Jun 4, 2025

stale-locking-app bot locked as resolved and limited conversation to collaborators Jun 4, 2025


3 participants
