-
Notifications
You must be signed in to change notification settings - Fork 176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report error events on pull failures #1842
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rhatdan The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@vrothberg PTAL |
@vrothberg @mheon Would this work? |
f2e7cfd
to
200b9b7
Compare
@benoitf PTAL |
thanks @rhatdan for the notice ! I was wondering if in the error message it will display as image name, the image name that has been resolved or the image name the user entered. for example if I use 'podman pull hello' and I've an error, will it provide |
Currently what I have written here will not work, because Events do not currently cover errors. We would need to handle the Error struct in podman. I think the Name would be the expanded name. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This breaks compatibility with Docker and previous versions of Podman. Until now, a pull event has only been emitted on successful pull. With this change, we'd always emit an event - even in case of errors.
We can - for sure - do this, but we'd be more and more leaving Docker compat. An alternative could be to make this behavior opt-in via a containers.conf field.
could it be a separate channel event like 'pull_error' (so it doesn't break the 'pull' channel) |
It would be |
That would also be a good option. I don't expect a potential |
I don't think this is a good idea long term, events are for things that happened not for things that failed. From the issue:
Pull errors are fundamentally flawed because registries can report whatever they choose. For example pulling |
It would always be an approximation. The good news is that we only care about registry.redhat.io and we know how this registry behaves.
How I understand the approach is to also consider CLI workflows which are common for, let's say, using compose. Podman Desktop can then, after the fact, look into the logs/events and see whether there are errors related to the registry we care about. If there are such errors, Podman Desktop may choose to propose installing a given extension and to sign into the RH account. |
Ok, I think this should be a cabal issue, where we all talk about the pros and cons of allowing this information flow. Options:
|
I am lazy but not that lazy :^)
It's breaking backwards compat, so we if go forward, we need a good justification.
I am perfectly fine with doing that only for pull. For that, we have a good use case which is currently lacking for other kinds of events. If other events need it in the future, we could use pull as a precedent. So I am in favor of option 3): create pull_error event. |
Ok so this is a normal event, just a pull error event. Where does the text go then? |
Good question. I see beauty in both options but have a slight tendency toward adding an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A test would be good. Set the channel, pull a bogus image (e.g., with an invalid transport) and make sure the .Error
is set.
|
||
// Dispatch the copy operation. | ||
switch ref.Transport().Name() { | ||
// DOCKER REGISTRY | ||
case registryTransport.Transport.Name(): | ||
pulledImages, pullError = r.copyFromRegistry(ctx, ref, possiblyUnqualifiedName, pullPolicy, options) | ||
pulledImages, err = r.copyFromRegistry(ctx, ref, possiblyUnqualifiedName, pullPolicy, options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this continue to be pullError
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could leave it as pullError, but wanted to avoid the confusion that pullError is only set here, when code above returning an error will get set as a pullError in the defer block.
@@ -18,6 +18,8 @@ const ( | |||
EventTypeUnknown EventType = iota | |||
// EventTypeImagePull represents an image pull. | |||
EventTypeImagePull | |||
// EventTypeImagePullError represents an image pull failed. | |||
EventTypeImagePullError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need that here as well OR should we let Podman deal with it? I feel like Podman should handle it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine with going that way, but thought this might not be a breaking change, since we are adding a new event and not forcing apps that are looking for pullevent as a success indicator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can break here as Podman is the only user/caller of c/common and we're below 1.0.0.
What I meant by breaking change is that we cannot make this user visible. So external users of Podman reading the events should not notice a change. So we need a new pull_error (or something like that) for podman events
(and the backends).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assume some automation that is reading the logs/events for pulled images. Now, pull errors would show up and cause false positives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to know that a pull happened, if we want to put this into Podman then we have to plumb this into tons of different places, and we would probably need to plumb it into buildah as well.
@mheon WDYT? |
I'm inclined not to do this. We're already abusing the events system too much, and we have more than one request to scale back the amount we're logging. Adding errors on top of everything else makes things an order of magnitude worse. |
Well this would just be on pull errors. Not all errors. |
We can make this opt-in, and hide it behind a containers.conf option. We did the very same for the container audit events. Let's circle back to the problem at hand to make sure we're not getting stuck in discussing a solution: Podman Desktop has a number of extensions that plug into the Red Hat ecosystem. The SSO extension, for instance, is used to register subscription-manager and to log into the RH registry. The goal is to make Podman Desktop (and the extensions) a bit smarter in guiding/helping the users. Imagine somebody tries to pull a protected image but isn't logged in. In such a case, the user can be prompted by the SSO extension to sign into their Red Hat account. Such prompting is fairly easy when using Podman Desktop directly. The new /resolve endpoint can help to catch such issues pro-actively and errors can be parsed. BUT here we are looking into catching errors after the fact which includes CLI usage. Imagine a So we need some record for pull errors. The events subsystem seems like the right place to me. |
I understand the use case but I disagree that using events is the right thing. Today you say it is only pull error but tomorrow comes the next ask... We need to draw the line somewhere and I think it is here. Errors are not events IMO. As you said if podman desktop makes the API request it can already see the error and offer solutions. For cli users I do not see how offering them some GUI pop-up to prompt them to log in is that helpful. It seems to be you are trying to solve the problem backwards. If the problem is cli users having no idea how to log in into the RH registry then the podman error message is bad in the first place and should be improved to guide the user, however I don't think this is the case. The message seems good to me:
I think you should look into getting the link updated to mention the Podman Desktop login method as well. |
The problem is that we can't do the request if we don't know what users are trying to do in case of errors |
I don't perceive this as a problem as tomorrow's asks guarantee our paychecks ;-) I understand we're talking about it from a technical perspective, but there is a precedent for the audit events.
I think the devil lies in the details. It is certainly something versed users can deal with, but we are looking into improving on that and assist users who just want to get their job done. Making it as easy as possible to create and sign into a Red Hat account is incremental. It's important to understand that it is more than "just logging in" and more than just accessing the container registry. Trying to pull a trusted/protected container image will very very very likely entail accessing protected content/repositories inside the container which in turn requires subscription-manager to be installed and registered. There are certain workflows where users are just not going through the GUI. So it is quite useful for Podman Desktop to know whether a image pull from a certain registry has failed. The assumption is not that users are incapable to do that own their own but to help them do it. So in other words, for Podman Desktop to know which image pulls have failed is an important corner stone for the UX of https://github.com/redhat-developer/podman-desktop-redhat-account-ext. |
@Luap99 what other method could we allow machine readable events to improve the user experience? If I get this type of error inside of compose of build, the message will be logged to stderr, but there is no real way for podman desktop to react. I think enhancing events to record pull failures might be a slippery sloap but as long as it can be turned off, I don't see the big issue. I don't think failures on pull is all that common. Most people complaining about events are around flooding of events, and not using something like Journal. |
Fixes: containers/podman#21458 Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
code LGTM ... @mheon im tagging you for the merge |
/lgtm |
Fixes: containers/podman#21458