Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mgr/dashboard: perform daemon actions #44609

Merged
merged 1 commit into from Feb 2, 2022

Conversation

pereman2
Copy link
Contributor

@pereman2 pereman2 commented Jan 17, 2022

Ceph.2.mp4

Fixes: https://tracker.ceph.com/issues/50322

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox

Copy link
Member

@nizamial09 nizamial09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pereman2: Didn't yet tested this manually but found some minor improvements in the UI section. Also I think its good to use pre-commit hook before commiting so that it could take care of most of the linting issues in code. You can find out how to setup pre-commit hook in ceph-dev..

Dashboard automation moved this from In progress to Review in progress Jan 17, 2022
@nizamial09
Copy link
Member

nizamial09 commented Jan 17, 2022

Forgot one thing: when the daemon is already in a running state and you select the daemon, instead of showing the Start Action, you can disable the Start Action and show the Stop Action first. That way you don't need to click the drop down action again. Similar to when you select a Pool and it shows Edit right away in the action instead of Create. @pereman2

Screenshot 2022-01-17 192726

@pereman2
Copy link
Contributor Author

@nizamial09 thanks! I should've opened this as draft since it was mostly to show alfonso the frontend test :p

@pereman2
Copy link
Contributor Author

jenkins test make check

@nizamial09
Copy link
Member

@pereman2 I just saw we already have a tracker dedicated for this. Instead of using the new one you created, shall we use that? https://tracker.ceph.com/issues/50322

@pereman2
Copy link
Contributor Author

@pereman2 I just saw we already have a tracker dedicated for this. Instead of using the new one you created, shall we use that? https://tracker.ceph.com/issues/50322

Let's use that one, I'll close the current one.

@pereman2
Copy link
Contributor Author

jenkins test make check

Copy link
Member

@epuertat epuertat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! But I think we should start covering all day-2 ops with e2e, so here it comes the challenge:

Given that you already have experience with BDD/Behave/Cucumber
When you implement this new day-2 operation
Then it'd be awesome if a new BDD e2e test is added 😻 

If you want to take the challenge, Cypress has a Cucumber-like library.

{
permission: 'update',
icon: Icons.deploy,
click: () => this.daemonAction('redeploy'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the docs I see reconfig and/or redeploy?

Not sure if tableActions support tool-tips, but wouldn't be worthy to explain what redeploy and reconfig do (start and stop are self-explanatory though)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was self-explanatory both restart and redeploy. Restart is stop and start I'd say, and redeploy recreates the systemd unit files, stops and restarts the service again. Reconfig I haven't checked since I thought it would increase the complexity of this pr rn

class Daemon(RESTController):
@raise_if_no_orchestrator([OrchFeature.DAEMON_ACTION])
@handle_orchestrator_error('daemon')
@RESTController.MethodMap(version=APIVersion.EXPERIMENTAL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is experimental the new black? :P Now serious: I see the point of always starting with experimental version, but then we'll have to force ourselves to periodically review and freeze endpoints as stable.

Copy link
Member

@nizamial09 nizamial09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks almost good to me apart from some minor hiccups.

I have an rgw service deployed in such a way that 2 daemons are in a same host and 1 in other. When I try to select a daemon in the first host, both of the daemons in that host gets selected.
https://user-images.githubusercontent.com/71764184/150755853-40673a7d-58dc-4111-8aa7-d1a64b8ceeae.mp4

Also, whenever I try to stop an rgw daemon it goes into an error state (same happening while using cephadm commands) so maybe we need to raise a bug or talk with cephadm folks...

Dashboard automation moved this from Review in progress to Reviewer approved Jan 27, 2022
Copy link
Member

@nizamial09 nizamial09 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Awesome work @pereman2
One last request from me and I think that'll be it.

@nizamial09 nizamial09 added the needs-quincy-backport backport required for quincy label Jan 27, 2022
@nizamial09
Copy link
Member

jenkins test dashboard cephadm

@nizamial09
Copy link
Member

jenkins test dashboard

@epuertat epuertat moved this from Reviewer approved to Ready-to-merge in Dashboard Jan 27, 2022
@epuertat
Copy link
Member

Moved to RTB @pereman2 , but please squash commits (it'd facilitate backporting) and add the Fixes: to the commit message too.

Copy link
Member

@epuertat epuertat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly good, but some comments from my end:

  • When I stop a service it enters "error" state. Shouldn't it be "stopped" instead? "Error" is something unexpected, and no action that we allow users to do should directly result in an error state (IMHO we should prevent that). @adk3798 any clue on this?

image

  • I also see the mon/mgr actions are grayed out? Aren't they supported?

  • Another interesting behaviour I observed is that when performing multiple actions (start/stop/redeploy), no more than 6 events are displayed, but they're not the latest 6, but the first six (2 events from 1 hour ago are there, while I just did many restarts and they aren't displayed: the issue is probably in the back-end code). Additionally, I'd expect to see the events in reverse chronological order (newer at the top). Does that make sense?:

image

  • When I check the service events, I get:

image

Any clue about if that's expected?

  • Just a nit UI question: could we add some padding between icons and buttons (I know it's not specific of this PR, but let's incrementally improve the UI?)

image

This is a sample from the Patternfly website:

image

@pereman2
Copy link
Contributor Author

Looks mostly good, but some comments from my end:

  • When I stop a service it enters "error" state. Shouldn't it be "stopped" instead? "Error" is something unexpected, and no action that we allow users to do should directly result in an error state (IMHO we should prevent that). @adk3798 any clue on this?

2022-01-28 06:10:34,210 7fd195dfeb80 DEBUG /usr/bin/docker: Error: No such object: ceph-00000000-0000-0000-0000-0000deadbeef-crash-7b37b3131b1a this is what I see in cephadm.log

image

  • I also see the mon/mgr actions are grayed out? Aren't they supported?

Do we want to support this? If you kill a mon or mgr you risk making the dashboard unusable.

  • Another interesting behaviour I observed is that when performing multiple actions (start/stop/redeploy), no more than 6 events are displayed, but they're not the latest 6, but the first six (2 events from 1 hour ago are there, while I just did many restarts and they aren't displayed: the issue is probably in the back-end code). Additionally, I'd expect to see the events in reverse chronological order (newer at the top). Does that make sense?:

image

Events come from cephadm we don't modify them.
Fixed:
image

  • When I check the service events, I get:

image

Any clue about if that's expected?

Those are service events not daemon events which come from a different datasource.
image

  • Just a nit UI question: could we add some padding between icons and buttons (I know it's not specific of this PR, but let's incrementally improve the UI?)

image

This is a sample from the Patternfly website:

image

done too:
image

@epuertat
Copy link
Member

jenkins test api

@epuertat
Copy link
Member

jenkins test make check

@epuertat
Copy link
Member

jenkins test dashboard cephadm

@pereman2
Copy link
Contributor Author

pereman2 commented Feb 1, 2022

jenkins test make check

1 similar comment
@pereman2
Copy link
Contributor Author

pereman2 commented Feb 1, 2022

jenkins test make check

@pereman2
Copy link
Contributor Author

pereman2 commented Feb 1, 2022

jenkins test api

@epuertat epuertat merged commit 8505861 into ceph:master Feb 2, 2022
10 checks passed
Dashboard automation moved this from Ready-to-merge to Done Feb 2, 2022
@epuertat epuertat deleted the daemon_action branch February 2, 2022 13:07
@epuertat epuertat removed the needs-quincy-backport backport required for quincy label Feb 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Dashboard
  
Done
5 participants