-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: smartnode notification functionality for bounty BA022308 (PR 1 of 3) #449
feat: smartnode notification functionality for bounty BA022308 (PR 1 of 3) #449
Conversation
@jshufro Will appreciate your review of this before I move onto the additional "harder" alerts. Specifically looking for any feedback on technical direction/factoring and the remaining Todos above which I believe, once completed, will complete the bounty. Feel free to tag others from GMC or dev team that might be appropriate to weigh in here. |
Will give this a read over tomorrow! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very strong start.
I do think we want more flexibility in the TUI, at a minimum the ability to enable/disable specific receivers and groups of alerts for specific receivers.
You don't want node operators editing yaml themselves... it's not pretty.
Okay how about I add a TUI entry for each "built in" alert with the choices "Do not alert" and "Discord" for now. I can use that to template the default rules file. Then other receivers can be added to the choices later (as well as an "All" I think?). Ok? |
Sounds like a good approach. You might want to make a new TUI page sooner rather than later. It sounds like it's going to get 'busy' and having an alerts/notifications dedicated page will buy you more real-estate. I'm kind of envisioning a checkbox style grid:
something like this. Ideally hiding receivers that aren't configured. It's a big ask, but anything you can do directionally towards this would be helpful. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a few comments
- Bounty: https://longforwisdom.github.io/rp-bounty/BA022308 - Thread: https://discord.com/channels/1109303903767507016/1204627769724637275 - Gist: https://gist.github.com/jshufro/e86b61fcd131d4c99938680934d854eb Also requires a corresponding PR in smartnode-install repo to start an alertmanager container and corresponding config.
…on client is unavailable
…ipool staking succeeded/failed
a88bbe6
to
d0a96dd
Compare
@@ -205,3 +205,23 @@ func (layout *standardLayout) mapParameterizedFormItems(params ...*parameterized | |||
layout.parameters[param.item] = param | |||
} | |||
} | |||
|
|||
// Sets up a handler to return to the specified homePage when the user presses escape on the layout. | |||
func (layout *standardLayout) setupEscapeReturnHomeHandler(md *mainDisplay, homePage *page) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: I found 10 copies of this code throughout the codebase and replaced them all with this function.
Co-authored-by: Jacob Shufro <116244+jshufro@users.noreply.github.com>
@jshufro When ya'll are back this PR and the corresponding PR in rocket-pool/smartnode-install#119 are ready to have a "final" review and some testing. I believe this is now feature complete, I just need to get a PR together for docs. |
I believe that the combination of the 3 PRs mentioned above now meets the requirements of the bounty. While I think there are things that could be added (e.g. more delivery channels, more alerts, more TUI configuration options) this has become a significant body of work and a meaningful improvement for node operators so I'd like to follow up additional functionality in subsequent milestones. I do have some notes above pointing out some things that went beyond the requirements in the bounty and any somewhat "grey areas" (e.g. native mode) in the interest of transparency. Happy to discuss anything at all. What else can I do to move this forward? |
@@ -14,8 +14,13 @@ require ( | |||
github.com/ferranbt/fastssz v0.1.3 | |||
github.com/gdamore/tcell/v2 v2.6.0 | |||
github.com/glendc/go-external-ip v0.1.0 | |||
github.com/go-openapi/errors v0.21.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: I needed the openapi dependencies for the client to call alertmanager. The other version bumps were done by go. I didn't notice any problems with those bumps but they were really just done by the go mod...
command (get or tidy, I forget which).
I hope/think it's only stalled because of conference travel. @0xfornax should be able to approve and merge. @jclapis is the final verdict on whether the bounty is fulfilled but he's on paternity leave so we may have to ask the GMC if they're willing to take fornax/my word instead. I just got back from Denver so I'll test a bit and do another review pass. @shfryn fyi |
Greetings @activescott and @jshufro! I've started testing/reviewing the 3 PRs. |
…per review feedback
const maxItems = 3 | ||
for i, alert := range status.Alerts { | ||
fmt.Println(alert.ColorString()) | ||
if i == maxItems-1 { | ||
break | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const maxItems = 3 | |
for i, alert := range status.Alerts { | |
fmt.Println(alert.ColorString()) | |
if i == maxItems-1 { | |
break | |
} | |
} | |
const maxItems = 3 | |
for i, alert := range status.Alerts { | |
if i == maxItems { | |
break | |
} | |
fmt.Println(alert.ColorString()) | |
} |
upside- no need to reason about off-by-one errors
downside- one extra iteration if the compiler doesn't catch and optimize it.
this is the softest suggestion i think i've ever given though, so feel free to ignore
This is an implementation of Smartnode Notification Functionality bounty BA022308.
The approach is based on an ongoing discord discussion thread and @jshufro was kind enough to provide some implementation guidance.
This PR is related to:
Below is a screenshot of the TUI:
Below is a screenshot of what some alerts look like with a configured Discord webhook URL:
Below is a screenshot of
rocketpool node status
output:Todo
This is a draft to get some initial feedback. What's included and remaining todos are (across both of these PRs):
Alertmanager container managed by rocketpool-cli
Support for Discord Notifications (Configuration and docs in TUI)
Support for notifications to all Alertmanager receivers (email, msteams, opsgenie, pagerduty, pushover, slack, sns, telegram, victorops, webhook, wechat, webex) using yml configuration.
Alertmanager client implementation to implement remaining ephemeral alerts. NOTE: Alertmanager client will handle things such as notification rate limiting (if an alert is sent by the node, resolves, and resent rapidly) and auto-resolving (e.g. once the node stop sending an alert to alertmanager, it will be automatically resolved based on a configurable time-based threshold which can be set per alert and globally).
Configured alerts from bounty:
ClientSyncStatusBeacon
,ClientSyncStatusExecution
, implemented via new prometheus metrics namedrocketpool_node_sync_progress
using the same sync status calculation thatrocketpool node sync
uses).LowDiskSpaceWarning
,LowDiskSpaceCritical
provide a warning level and a critical level both with thresholds)UpcomingProposal
using the same metric shown on the grafana dashboard)UpcomingSyncCommittee
using the same metric shown on the grafana dashboard)ActiveSyncCommittee
using the same metric shown on the grafana dashboard)RPUpdatesAvailable
)Additional features beyond what is mentioned in bounty:
/alerting/rules/*.yml
using Prometheus rule configuration.OSUpdatesAvailable
)rocketpool node status
shows current actively firing alertsBasic TUI: Ability to configure alertmanager port & a discord webhook receiver/notification
Advanced TUI: Ability to enable/disable specific receivers and groups of alerts for specific receivers + new TUI page
Native Mode users support:
instructions on creating and managing a systemd service for Native Mode usersI'd like to recommend dropping this. The implementation is heavily related to setting up the Prometheus/Grafana stack and the current native mod docs direct the user to the docker section to set those up. Largely this is going to be installing Prometheus and Alertmanager and configuring them. If someone is advanced enough to do that I doubt they'll bother with our docs.a pull request to our documentation guides repository (GitHub - rocket-pool/docs.rocketpool.net: Rocket Pool Documentation & Guide Hub) with complete and thorough documentation describing its configuration and usage for Docker, Hybrid, and Native Mode users alike. – see feat: docs for smartnode notification functionality for bounty BA022308 (PR 3 of 3) docs.rocketpool.net#73
Open Questions
~/.rocketpool/alerting/rules/default.yml
. I can add each alert to the TUI, but this is of questionable value in IMHO considering the documentation and flexibility of the alertmanager rules file. Does the GMC want each notification in the TUI or is the rules yml acceptable?