[RFE] Display problems from pods #2919

Closed
marusak opened this issue Mar 27, 2018 · 8 comments
Labels
lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

marusak commented Mar 27, 2018

We on the ABRT team focus on catching, processing, and reporting problems. So far our main focus has been servers and workstations. Some time ago we started supporting the capture of core-dump problems from containers, and we have now shown that we can catch exceptions from interpreted languages running in containers as well [1]. That means we can now give a full picture of what is misbehaving in a container. A universal tool was created for that purpose [2].

With that done, we want to surface this information in a more discoverable way. One step, aimed mainly at servers, was the integration into Cockpit [3]. Now we want to continue and help make OpenShift even better.
Therefore I propose this RFE. It could look something like this:
(Mockup screenshots: origin_s1, origin_s2, origin_s3)

What do you think?

[1] http://post-office.corp.redhat.com/archives/aos-devel/2018-February/msg00402.html
[2] https://github.com/abrt/container-exception-logger
[3] https://abrt.github.io/abrt/cockpit/2017/06/29/ABRT-in-cockpit/

spadgett (Member) commented

Are the problems captured as events in Kubernetes? Or some other way? I think that would change how we present them. It's not clear to me from the mocks since sometimes they're displayed with events and sometimes displayed separately.

@openshift/team-ux-review

marusak commented Mar 27, 2018

The problems are available in the logs. Other tools (for example node-problem-detector, ABRT, etc.) can parse these logs. I had the same approach in mind for the console.

The mock-up with events was just an idea; I am not sure how those work. I am fairly confident the problems can be displayed on pods as shown in mock-ups 2 and 3.
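To make the log-parsing idea concrete, here is a minimal sketch using the official Python kubernetes client. The one-line JSON problem record is a hypothetical format chosen for illustration, not what container-exception-logger actually emits:

```python
import json

from kubernetes import client, config


def find_problems(namespace, pod_name):
    """Scan a pod's log for problem records left by an exception logger.

    Assumes each problem is a single JSON line with a "type" field set to
    "problem" -- a hypothetical format used for this sketch only.
    """
    config.load_kube_config()  # or load_incluster_config() inside a pod
    v1 = client.CoreV1Api()
    log = v1.read_namespaced_pod_log(name=pod_name, namespace=namespace)
    problems = []
    for line in log.splitlines():
        try:
            record = json.loads(line)
        except ValueError:
            continue  # ordinary log line, not a problem record
        if isinstance(record, dict) and record.get("type") == "problem":
            problems.append(record)
    return problems
```

The console could run the same kind of scan over the pod log it already fetches and surface the matches on the pod detail page.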

ncameronbritt commented

I'm not sure what the correct level to surface this kind of information is. We have logs for the pods, so that level makes sense to me. How would users be made aware of these problems, or how would they know they need to look at their pods?

Currently events and warnings are more at the platform level, and a user can take action through the platform to do something about these problems. That doesn't seem to be the case if the problem is with the code running inside of my container. From OpenShift's perspective, everything could be fine--the container is running. Given that, surfacing every individual problem as an event does not seem like the right level. But maybe aggregating errors in an event that says something like "there are runtime errors in pod-x" could make sense?
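For illustration, here is a minimal sketch of what such an aggregated event could look like with the Python kubernetes client; the reason and message strings are made up for the example, not a proposed final design:

```python
from kubernetes import client, config


def emit_runtime_errors_event(namespace, pod):
    """Create one aggregated Warning event for a problematic pod.

    Note: V1Event is named CoreV1Event in newer releases of the client.
    """
    config.load_kube_config()
    v1 = client.CoreV1Api()
    event = client.V1Event(
        metadata=client.V1ObjectMeta(generate_name=f"{pod.metadata.name}."),
        involved_object=client.V1ObjectReference(
            kind="Pod",
            name=pod.metadata.name,
            namespace=namespace,
            uid=pod.metadata.uid,
        ),
        reason="RuntimeErrors",  # illustrative reason string
        message=f"There are runtime errors in pod {pod.metadata.name}",
        type="Warning",
        count=1,
    )
    v1.create_namespaced_event(namespace=namespace, body=event)
```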

Is there a need for different levels/views of monitoring, something like application and platform, depending on your persona, or what you're interested in?

beanh66 commented Mar 29, 2018

@marusak I'm also curious if you have seen the notification drawer or interacted with it at all? Right now the drawer is not configurable but I wonder if it would be possible in the future to allow users to configure the types of events they want to be notified about.

(Screenshot: the notification drawer, 2018-03-29)

marusak commented Mar 31, 2018

Hi @ncameronbritt

> I'm not sure what the correct level to surface this kind of information is. We have logs for the pods, so that level makes sense to me.

Agreed. Pods are the correct place to show this, no doubt about that.

> How would users be made aware of these problems, or how would they know they need to look at their pods?

Great question. I thought about that and came up with showing it in Monitoring, both in events and next to the pod name (a warning triangle indicating that something is wrong; see the first mockup). However, I am not very satisfied with that; I was only sketching ideas. And as you explained, events do not seem to be the correct place.

> But maybe aggregating errors in an event that says something like "there are runtime errors in pod-x" could make sense?

Possibly, but it would only make sense to have one event per problematic pod, and even that can still be a lot.
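One way to keep that bounded (a sketch under the same assumptions as the event example above): look up the pod's existing RuntimeErrors event and bump its count instead of creating another one, so each problematic pod contributes at most one event:

```python
from kubernetes import client, config


def bump_runtime_errors_event(namespace, pod_name):
    """Increment the count on a pod's existing RuntimeErrors event.

    Returns False when no such event exists yet, in which case the
    caller would create one (as sketched earlier in this thread).
    """
    config.load_kube_config()
    v1 = client.CoreV1Api()
    selector = (f"involvedObject.kind=Pod,"
                f"involvedObject.name={pod_name},reason=RuntimeErrors")
    events = v1.list_namespaced_event(namespace, field_selector=selector).items
    if not events:
        return False  # nothing to bump; create the event instead
    ev = events[0]
    v1.patch_namespaced_event(
        ev.metadata.name, namespace, {"count": (ev.count or 1) + 1})
    return True
```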

> Is there a need for different levels/views of monitoring, something like application and platform, depending on your persona, or what you're interested in?

I would rather see this integrated into the existing environment. It is hard to imagine an admin who scrolls through separate monitoring views in their free time. I think the best solution would be one tab in pods as suggested, then possibly a warning triangle on the pods overview, plus notifying the user somehow. I really like the notification drawer that @beanh66 suggested; I was not aware of it, and I believe it is a more suitable place for this than events.

marusak commented Apr 9, 2018

ping @spadgett @ncameronbritt Do you agree with my last comment? Can I start working on this?

ncameronbritt commented

IMO this should be discussed at a higher level in terms of how/whether/where this feature gets integrated into the console, particularly as we look to integrate with the Tectonic console. Thoughts, @jwforres?

openshift-bot commented

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-ci-robot added the lifecycle/stale label on Aug 20, 2020