Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New email notification template #11219

Merged
merged 4 commits into from Jun 10, 2021

Conversation

MrZammler
Copy link
Contributor

Summary

This PR adds the new email template for notifications.

Component Name

Health

Test Plan

Requires an agent host to be able to send emails (sendmail, etc), and the agent to be configured to send emails on alerts.

There are a few things to notice in the emails if they are correct:

  • The "Go to chart" link if it's ok.
  • If global and local time are ok.
  • The edit command if it correctly executes on the node.
  • If the small table on the bottom of the email correctly contains the list of other active alerts, along with the time they are active for.
Additional Information

@MrZammler
Copy link
Contributor Author

If anyone wants to try, but doesn't have (and doesn't want!) to configure and install sendmail, there is a small trick:

In line ~3424 of alarm-notify.sh.in file, an echo command can be added (e.g. echo $email_html_part > /a/path/somewhere/email.html) which will dump the constructed email message to a file, which can then be viewed from a web browser.

But because the way it works, we actually need to have a sendmail command somewhere in the $PATH. An empty, executable file will do. Without it, email notifications are disabled.

@stelfrag
Copy link
Collaborator

stelfrag commented Jun 3, 2021

The chart link will not work if the alert is triggered by a child node on the parent. It will attempt to redirect to the chart on the parent.

The edit command says

image

The host cherry0 is not where I need to go to edit this

@MrZammler
Copy link
Contributor Author

Yes... we will need to pass the localhost and put it in the edit part... Not sure about the chart link....

@stelfrag
Copy link
Collaborator

stelfrag commented Jun 3, 2021

The email for alerts on the parent looks ok

  • Chart link works
  • Number of other alerts (warning / critical)
  • Time looks ok

@stelfrag
Copy link
Collaborator

stelfrag commented Jun 3, 2021

Yes... we will need to pass the localhost and put it in the edit part... Not sure about the chart link....

The following is a link to a chart for a child node

https://staging.netdata.cloud/alarms/redirect?agentId=eb1a6f9a-9d5b-4d0a-8581-62eb5484a2a7&host=cherry0&chart=system.cpu&family=cpu&alarm=10min_cpu_usage&alarm_unique_id=1622366972&alarm_id=1622360286&alarm_event_id=121&alarm_when=1622730780&alarm_status=CRITICAL&alarm_chart=system.cpu&alarm_value=2.99%25

This will go to http://127.0.0.1:19999/#top;after=1622730480000;before=1622731080000;theme=white;help=true

but I guess it should know and include the /host/cherry0/ appropriately to show the chart for the child. Right now it will just go to the corresponding chart on the parent and just put a label 2.99% @jacekkolasa

<!--[if mso | IE]><table role="presentation" border="0" cellpadding="0" cellspacing="0"><tr><td class="" style="vertical-align:top;width:418.6px;" ><![endif]-->
<div
class="mj-column-per-70 mj-outlook-group-fix" style="font-size:0px;text-align:left;direction:ltr;display:inline-block;vertical-align:top;width:70%;"
>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the div was also split in three different lines.

@MrZammler
Copy link
Contributor Author

The chart link will not work if the alert is triggered by a child node on the parent. It will attempt to redirect to the chart on the parent.

The edit command says

image

The host cherry0 is not where I need to go to edit this

Ok, thanks Stelios. This should be fixed with 787c665

<tbody>
<tr>
<td align="left" style="font-size:0px;padding:10px 25px;padding-left:0;word-break:break-word;">
<div style="font-family:Open Sans, sans-serif;font-size:16px;font-weight:700;line-height:1;text-align:left;color:#35414A;">Want to know more about this alert?</div>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jacekkolasa Jacek, do you think we should change text-align:left to center? There are a couple of those, and might look better if centered?

@jacekkolasa
Copy link
Contributor

jacekkolasa commented Jun 4, 2021

Yes... we will need to pass the localhost and put it in the edit part... Not sure about the chart link....

The following is a link to a chart for a child node

https://staging.netdata.cloud/alarms/redirect?agentId=eb1a6f9a-9d5b-4d0a-8581-62eb5484a2a7&host=cherry0&chart=system.cpu&family=cpu&alarm=10min_cpu_usage&alarm_unique_id=1622366972&alarm_id=1622360286&alarm_event_id=121&alarm_when=1622730780&alarm_status=CRITICAL&alarm_chart=system.cpu&alarm_value=2.99%25

This will go to http://127.0.0.1:19999/#top;after=1622730480000;before=1622731080000;theme=white;help=true

but I guess it should know and include the /host/cherry0/ appropriately to show the chart for the child. Right now it will just go to the corresponding chart on the parent and just put a label 2.99% @jacekkolasa

Is the eb1a6f9a-9d5b-4d0a-8581-62eb5484a2a7 machine_guid of the child?
If it is proper child's guid, can you please look for the child in Visited Nodes sidebar section and see if it forwards correctly? The Visited Nodes section is backed by the same cloud registry as email-redirect.

Edit: In other words, if the guid is proper, what could happen here is that child was once accessed with 127.0.0.1:19999 url. You were probably signed in to the cloud at that time and that url was remembered as child's url.
I think that can be the case and we'll need to make a fix in the cloud redirect for this - when having list of possible urls in redirect phase, make additional registry/hello call for machine_guid and only redirect if that guid is matched.
Redirect link is probably good (if this is really child's guid)

@MrZammler
Copy link
Contributor Author

Hi @jacek! Can we proceed with merging this? I suppose the url on the mail template is ok and the fix is only on the cloud side, right?

@jacekkolasa
Copy link
Contributor

@MrZammler yes, it looks fine to me! Thanks!

@thiagoftsm thiagoftsm self-requested a review June 10, 2021 11:34
Copy link
Contributor

@thiagoftsm thiagoftsm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@MrZammler
Copy link
Contributor Author

Thank you guys!

@MrZammler MrZammler merged commit 86c9ada into netdata:master Jun 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants