@Luap99 Luap99 commented Mar 5, 2025

When no containers could be started we need to make sure the unit status
reflects this. This means we should not send the READY=1 message and not
keep the service container running when we were unable to start any
container.

There is the question of what should happen when only a subset was started.
For systemd the unit can only be either running or failed. And as podman kube
play also just keeps the partially started pods running, I opted to let
systemd keep considering this a success.
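The decision described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `podResult`, `anyContainerStarted`, and `unitState` are hypothetical names, and the only rule taken from the description is that the unit counts as running as soon as any container started.

```go
package main

import "fmt"

// podResult is a hypothetical stand-in for the per-pod part of a play
// report; it is not podman's actual entities type.
type podResult struct {
	Containers      []string // containers defined for the pod
	ContainerErrors []string // one entry per container that failed to start
}

// anyContainerStarted reports whether at least one container started,
// i.e. some pod has fewer start errors than containers.
func anyContainerStarted(pods []podResult) bool {
	for _, p := range pods {
		if len(p.ContainerErrors) < len(p.Containers) {
			return true
		}
	}
	return false
}

// unitState maps the outcome onto the two states systemd can represent.
func unitState(pods []podResult) string {
	if anyContainerStarted(pods) {
		// Partial success still counts: READY=1 would be sent here.
		return "running"
	}
	// Nothing started: no READY=1, and the service container is not kept.
	return "failed"
}

func main() {
	allFailed := []podResult{{
		Containers:      []string{"a", "b"},
		ContainerErrors: []string{"a: failed", "b: failed"},
	}}
	partial := []podResult{{
		Containers:      []string{"a", "b"},
		ContainerErrors: []string{"b: failed"},
	}}
	fmt.Println(unitState(allFailed)) // failed
	fmt.Println(unitState(partial))   // running
}
```

Only the all-failed case changes the unit status in this sketch; a partially started pod is still reported as running, matching the choice explained above.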

Fixes #20667
Fixes https://issues.redhat.com/browse/RHEL-80471

And two more minor fixes, see the first two commits

Does this PR introduce a user-facing change?

Fixes an issue where kube quadlet units would not report an error (and stay running) even when the pod failed to start.

Luap99 added 2 commits March 5, 2025 14:50
It is very bad practice to print to stdout in our backend code without
any real context. The exact same error message is returned to the caller
and printed in the CLI frontend where it should be.

Therefore drop this print as it is redundant.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
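As a rough illustration of the pattern this commit message argues for, the sketch below keeps printing out of the backend and leaves reporting to the frontend. The names `startPod` and `run` are hypothetical and not taken from podman.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// startPod is a hypothetical backend function. It adds context and
// returns the error instead of printing it.
func startPod(name string) error {
	err := errors.New("no such image")
	return fmt.Errorf("starting pod %q: %w", name, err)
}

// run is a hypothetical CLI frontend and the only place that prints.
// It returns the process exit code.
func run() int {
	if err := startPod("web"); err != nil {
		fmt.Fprintln(os.Stderr, "Error:", err)
		return 1
	}
	return 0
}

func main() {
	fmt.Println("exit code:", run())
}
```

With this split the message appears once, on stderr, with the context the backend attached to it.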
The first condition checks for an error where none is returned, and the
second check is redundant because err == nil was already matched above,
so we know the error is not nil here.

Then also replace os.IsNotExist(err) with errors.Is(err, os.ErrNotExist)
as that should be used for new code.
This should not change behavior in any way.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
openshift-ci bot commented Mar 5, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Luap99

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 5, 2025
@packit-as-a-service

Ephemeral COPR build failed. @containers/packit-build please check.

@Luap99 Luap99 force-pushed the kube-sdnotify-error branch from 68c31df to bacab64 Compare March 5, 2025 14:05
@ygalblum ygalblum left a comment

Small nits

When no containers could be started we need to make sure the unit status
reflects this. This means we should not send the READY=1 message and not
keep the service container running when we were unable to start any
container.

There is the question of what should happen when only a subset was started.
For systemd the unit can only be either running or failed. And as podman kube
play also just keeps the partially started pods running, I opted to let
systemd keep considering this a success.

Fixes containers#20667
Fixes https://issues.redhat.com/browse/RHEL-80471

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
@Luap99 Luap99 force-pushed the kube-sdnotify-error branch from bacab64 to 945aade Compare March 5, 2025 14:54
mheon commented Mar 5, 2025

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 5, 2025
setRanContainers := func(r *entities.PlayKubeReport) {
if !ranContainers {
for _, p := range r.Pods {
// If the list of container errors is less then the total number of pod containers then we know it didn't start.

Should the comment be that it did start?

Member Author

Oh yes, I guess I'll just sneak it into my next PR as a small typo-fix commit.

@openshift-merge-bot openshift-merge-bot bot merged commit 919247a into containers:main Mar 5, 2025
65 of 82 checks passed
@Luap99 Luap99 deleted the kube-sdnotify-error branch March 5, 2025 16:11
Luap99 added a commit to Luap99/libpod that referenced this pull request Mar 13, 2025
It did start there, as pointed out by Ygal on containers#25481.

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
@stale-locking-app stale-locking-app bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Jun 4, 2025
@stale-locking-app stale-locking-app bot locked as resolved and limited conversation to collaborators Jun 4, 2025
Development

Successfully merging this pull request may close these issues.

Quadlet: pod fails to start, but unit is reported as online
