feat: added total time for outage in notification msg #3394

Closed
prabhsharan36 wants to merge 4 commits

Conversation

prabhsharan36

@prabhsharan36 prabhsharan36 commented Jul 9, 2023

Description

Resolves #177

Type of change

I have changed the message body sent to notification providers to include the total duration of the outage.

Please delete any options that are not relevant.

  • New feature (non-breaking change which adds functionality)

Checklist

  • My code follows the style guidelines of this project
  • I ran ESLint and other linters for modified files
  • I have performed a self-review of my own code and tested it
  • I have commented my code, particularly in hard-to-understand areas
    (including JSDoc for methods)
  • My changes generate no new warnings

Screenshots

Screenshot 2023-07-10 020006

Collaborator

@CommanderStorm CommanderStorm left a comment

I have left comments on the style and logic points that were unclear to me.
Given that this targets all notification providers, I tried to be thorough.

if (bean.status === UP) {
const downAfterPreviousUpBeat = await R.findOne("heartbeat", " monitor_id = ? AND status = ? AND time > ( SELECT time from heartbeat WHERE monitor_id = ? AND status = ? ORDER BY time DESC ) ORDER BY time ASC", [
this.id,
0,
Collaborator

What is the rationale behind extracting the status like this?
Would using status = 0 directly, or the DOWN/UP constants, be better?

Author

it's done

0
]);
const downTime = dayjs(mostRecentDownBeat?.time).diff(dayjs(downAfterPreviousUpBeat?.time), "minutes", true);
bean.msg = `${bean.msg} <Down for ${Math.floor(downTime)} mintue(s) ${(downTime % 1).toFixed(2).replace(/^0\./, "")} second(s)>`;
Collaborator

This line does a lot of magic in one line.
Please extract complex statements.

This comment especially targets (downTime % 1).toFixed(2).replace(/^0\./, "").
I would prefer downTime to be formatted via dayjs instead (probably more readable)
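
The statement this comment targets can be split into named steps. Note that downTime is a fractional number of minutes, so (downTime % 1).toFixed(2) yields hundredths of a minute rather than seconds (3.5 minutes would print as "50 second(s)"). A minimal sketch in plain JavaScript, assuming the two timestamps are ISO strings; splitDownTime is a hypothetical helper and not part of this PR:

```javascript
// Hypothetical helper: split the gap between two timestamps into whole minutes and seconds.
function splitDownTime(firstDownTime, lastDownTime) {
    const elapsedMs = new Date(lastDownTime).getTime() - new Date(firstDownTime).getTime();
    const totalSeconds = Math.floor(elapsedMs / 1000);
    return {
        minutes: Math.floor(totalSeconds / 60),
        seconds: totalSeconds % 60,
    };
}

const parts = splitDownTime("2023-07-09T10:00:00Z", "2023-07-09T10:03:30Z");
console.log(parts); // { minutes: 3, seconds: 30 }
```

Working from whole seconds keeps each step readable and avoids the string manipulation on a fractional minute.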

Author

it's done

this.id,
0
]);
const downTime = dayjs(mostRecentDownBeat?.time).diff(dayjs(downAfterPreviousUpBeat?.time), "minutes", true);
Collaborator

Just to be sure:
You are using ?. (optional chaining) here. Is this safe? (The other parts of the code don't use optional chaining.)
Read: have you tested this edge case?
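
To make the edge case concrete: if either lookup finds no row, ?.time evaluates to undefined. As far as I know, Day.js treats dayjs(undefined) like dayjs(), i.e. the current time, so the diff would not throw but would silently produce a misleading number. A minimal sketch of an explicit guard in plain JavaScript; downSecondsOrNull and the beat shape are assumptions for illustration:

```javascript
// Hypothetical guard: return null instead of a misleading duration when a beat is missing.
function downSecondsOrNull(firstDownBeat, lastDownBeat) {
    if (!firstDownBeat?.time || !lastDownBeat?.time) {
        return null; // no matching heartbeat row was found
    }
    const elapsedMs = new Date(lastDownBeat.time).getTime() - new Date(firstDownBeat.time).getTime();
    // An unparsable timestamp yields NaN, which is also reported as "unknown".
    return Number.isFinite(elapsedMs) ? elapsedMs / 1000 : null;
}

console.log(downSecondsOrNull(null, { time: "2023-07-09T10:03:30Z" })); // null
console.log(downSecondsOrNull({ time: "2023-07-09T10:00:00Z" }, { time: "2023-07-09T10:03:30Z" })); // 210
```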

Author

it's done

// Change the notification msg if status becomes UP from DOWN
// Adding msg for the downtime
if (bean.status === UP) {
const downAfterPreviousUpBeat = await R.findOne("heartbeat", " monitor_id = ? AND status = ? AND time > ( SELECT time from heartbeat WHERE monitor_id = ? AND status = ? ORDER BY time DESC ) ORDER BY time ASC", [
Collaborator

(Likely just paranoia:)
This query sounds expensive (two ordered table scans with filtering on the largest table).

Given the current scaling issues, this is not ideal, but given that this query is only executed on notification if the bean is UP, this probably won't be an issue.

Have you measured the impact of this query? How long does it take? (SQLite does not allow concurrent queries, so long-running queries could cause downtime on bigger databases or slower I/O, e.g. Raspberry Pis using SD cards.)
Could waiting here for longer be an issue?
This is testable by inserting:

const sleep = ms => new Promise(r => setTimeout(r, ms));
await sleep(3000);

Author

Given the current scaling issues, this is not ideal, but given that this query is only executed on notification if the bean is UP, this probably won't be an issue.

Yes, this query only runs when a service goes from DOWN to UP, which is not a daily occurrence for most services. Also, to cut down the number of rows, I filter on the monitor ID so that only that monitor's heartbeats are considered.
There is also no way to get the most recent UP beat without ordering the rows.

Have you measured the impact of this query? How long does it take? (SQLite does not allow concurrent queries => long-running queries could cause downtime on bigger databases or slower IO (e.g. Raspberry Pis using SD cards))
Could waiting here for longer be an issue?

This varies from system to system and with the size of the table. On my system there are only around 100 rows for a single monitor, and the query returns within a few milliseconds. So I can't really tell how it will affect systems of different sizes.
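
One way to get a number rather than an impression is to wrap the call in a small timer. A minimal sketch, assuming a Node.js version that exposes performance globally; timeAsync and fakeQuery are hypothetical names, and fakeQuery only stands in for the real database call:

```javascript
// Hypothetical helper: time any async operation and report the elapsed milliseconds.
async function timeAsync(label, operation) {
    const start = performance.now();
    const result = await operation();
    const elapsedMs = performance.now() - start;
    console.log(`${label} took ${elapsedMs.toFixed(1)} ms`);
    return { result, elapsedMs };
}

// Stand-in for the real database call; replace it with the actual query invocation.
const fakeQuery = () => new Promise((resolve) => setTimeout(() => resolve({ id: 1 }), 20));

timeAsync("downtime query", fakeQuery);
```

Running this against a copy of a large heartbeat table, rather than a 100-row one, would give a more representative figure.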

// Change the notification msg if status becomes UP from DOWN
// Adding msg for the downtime
if (bean.status === UP) {
const downAfterPreviousUpBeat = await R.findOne("heartbeat", " monitor_id = ? AND status = ? AND time > ( SELECT time from heartbeat WHERE monitor_id = ? AND status = ? ORDER BY time DESC ) ORDER BY time ASC", [
Collaborator

I am a bit unsure whether this query results in the correct accounting.

So we are currently in bean.status === UP and are searching for the last DOWN after being UP at least once.
=> We are searching for UP-.*-DOWN-.*-DOWN-.*-UP:
image
(Red: DOWN, green: UP, blue: MAINTENANCE or PENDING)
And we are counting from the first red beat to the last red beat.

Concerns:

  • Do we need to do special accounting for MAINTENANCE or PENDING?
  • What happens in this case? (is this reported as down 0 minute(s) 0 second(s)?)
    image
  • Would the current beat's UP to the last UP (or first event) be more accurate?
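
The third option in the list above (counting up to the current UP beat) can be sketched over an in-memory list of beats. This is only an illustration under stated assumptions: the status codes are assumed to match the project's constants, outageSeconds is a hypothetical function, and the beats are assumed to be sorted by time with the current UP beat last.

```javascript
// Status codes assumed to mirror the project's constants; treat them as assumptions.
const DOWN = 0;
const UP = 1;

// beats: heartbeats for one monitor, sorted by time ascending, ending with the current UP beat.
// Returns the seconds between the first DOWN beat after the previous UP and the current UP beat,
// or null when no DOWN beat lies between the two UP beats.
function outageSeconds(beats) {
    const current = beats[beats.length - 1];
    if (!current || current.status !== UP) {
        return null;
    }
    let firstDown = null;
    for (let i = beats.length - 2; i >= 0; i--) {
        if (beats[i].status === UP) {
            break; // reached the previous UP beat
        }
        if (beats[i].status === DOWN) {
            firstDown = beats[i]; // keep walking back to find the earliest DOWN
        }
    }
    if (firstDown === null) {
        return null;
    }
    return (Date.parse(current.time) - Date.parse(firstDown.time)) / 1000;
}

const exampleBeats = [
    { status: UP, time: "2023-07-09T10:00:00Z" },
    { status: DOWN, time: "2023-07-09T10:01:00Z" },
    { status: UP, time: "2023-07-09T10:02:00Z" },
];
console.log(outageSeconds(exampleBeats)); // 60
```

With this accounting a single DOWN beat is reported as 60 seconds rather than 0. PENDING and MAINTENANCE beats never start an outage here, but time spent in them after the first DOWN is still counted, which is exactly the open question in the first bullet.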

Author

Do we need to do special accounting for MAINTENANCE or PENDING?

In my opinion we shouldn't count them towards the downtime: during MAINTENANCE the service may be down, but that is expected, and in PENDING status we don't know whether the service is UP or DOWN.

What happens in this case? (is this reported as down 0 minute(s) 0 second(s)?)

I have changed this condition and am now adding up the durations of the down beats.

Would the current beat's UP to the last UP (or first event) be more accurate?

In my opinion that should be the condition, but I am happy to hear your views on this as well.

0
]);
const downTime = dayjs(mostRecentDownBeat?.time).diff(dayjs(downAfterPreviousUpBeat?.time), "minutes", true);
bean.msg = `${bean.msg} <Down for ${Math.floor(downTime)} mintue(s) ${(downTime % 1).toFixed(2).replace(/^0\./, "")} second(s)>`;
Collaborator

Suggested change
bean.msg = `${bean.msg} <Down for ${Math.floor(downTime)} mintue(s) ${(downTime % 1).toFixed(2).replace(/^0\./, "")} second(s)>`;
bean.msg += ` <Down for ${Math.floor(downTime)} minute(s) ${(downTime % 1).toFixed(2).replace(/^0\./, "")} second(s)>`;

Author

I think the previous version is more readable.

0
]);
const downTime = dayjs(mostRecentDownBeat?.time).diff(dayjs(downAfterPreviousUpBeat?.time), "minutes", true);
bean.msg = `${bean.msg} <Down for ${Math.floor(downTime)} mintue(s) ${(downTime % 1).toFixed(2).replace(/^0\./, "")} second(s)>`;
Collaborator

I don't like how this generates <Down for 3minute(s) 1 second(s)>.

  • (s) should only be added if pluralisation requires it ^^
  • The <> is a bit counterintuitive imo ⇒ I think round braces would fit better
  • Could relying on dayjs' formatting simplify this code?
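
The pluralisation and bracket points above could be handled in one small formatter. A minimal sketch in plain JavaScript, assuming the duration is available in seconds; formatDownDuration is a hypothetical name and the exact wording of the output is only a suggestion:

```javascript
// Hypothetical formatter: round braces and "s" only where pluralisation requires it.
function formatDownDuration(totalSeconds) {
    const plural = (count, unit) => `${count} ${unit}${count === 1 ? "" : "s"}`;
    const whole = Math.max(0, Math.floor(totalSeconds));
    const minutes = Math.floor(whole / 60);
    const seconds = whole % 60;
    const parts = [];
    if (minutes > 0) {
        parts.push(plural(minutes, "minute"));
    }
    if (seconds > 0 || minutes === 0) {
        parts.push(plural(seconds, "second")); // always show seconds for sub-minute outages
    }
    return `(Down for ${parts.join(" ")})`;
}

console.log(formatDownDuration(181)); // (Down for 3 minutes 1 second)
console.log(formatDownDuration(45));  // (Down for 45 seconds)
```

Dropping the zero-valued unit also avoids output such as "00 minutes" when the outage lasted only a few seconds.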

Author

image

I have changed the format to this. However, I am not happy with the trailing "minutes" keyword, since in this case we only have seconds and 00 minutes. Could you suggest a better format?

0
]);
const downTime = dayjs(mostRecentDownBeat?.time).diff(dayjs(downAfterPreviousUpBeat?.time), "minutes", true);
bean.msg = `${bean.msg} <Down for ${Math.floor(downTime)} mintue(s) ${(downTime % 1).toFixed(2).replace(/^0\./, "")} second(s)>`;
Collaborator

Appending to msg might not be ideal.
This prevents notification providers like Discord/Slack from including the downtime in their fancy embeds.

This is something that Luis will have to give his opinion on (both paths are workable imo).

Collaborator

I think a better solution would be to add a parameter downDuration to send(), which is the duration of downtime in seconds, then individual notification providers can format the message as needed. This would provide better customization, and be non-breaking for the existing notifications.
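
A rough illustration of this idea, not the project's actual provider API: buildProviderMessage is a hypothetical helper, and the only thing taken from the comment above is that downDuration is a number of seconds that a provider may or may not use.

```javascript
// Hypothetical provider-side helper; downDuration is the outage length in seconds, or null if unknown.
function buildProviderMessage(msg, downDuration = null) {
    if (downDuration === null) {
        return msg; // existing behaviour is unchanged when no duration is supplied
    }
    return `${msg} (down for ${Math.round(downDuration)} s)`;
}

console.log(buildProviderMessage("[My Monitor] Up"));     // [My Monitor] Up
console.log(buildProviderMessage("[My Monitor] Up", 95)); // [My Monitor] Up (down for 95 s)
```

Because the raw number is passed along, a provider with rich embeds could render it as a separate field instead of appending it to the text.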

Author

@chakflying I assumed we would send the same message to every notification provider. For instance, we would report how many minutes or hours the service was down in the same way to every provider. In the refactored code I use an (mm:ss) format.

Collaborator

If you check provider discord.js, it doesn't even use the msg parameter for up/down notifications. To be clear, I'm not saying this PR needs to update all the notification providers. You can update the ones that you are using. I'm just saying this is not a good solution considering the current situation.

Author

If you check provider discord.js, it doesn't even use the msg parameter for up/down notifications.

Oh, now I get your point. Thanks for letting me know. I have added one more parameter to the send() function, which carries the downDuration value in seconds, so each notification provider can use it the way it wants.
Currently, I have only used downDuration in the Telegram notification provider.

Collaborator

I think this change should be added to the heartbeatJSON instead.
What is the rationale behind adding another parameter?

If adding it to heartbeatJSON is not an option, or adding a parameter is better, please add this parameter to every function invocation to make it obvious to the authors of the notification providers.

Author

I have added downTime to heartbeatJSON.
code1

But in the "sendNotification" function I have kept the downTime parameter rather than adding another property to the "bean" parameter. The "bean" data is eventually stored in the database, so adding a property to it would mean either adding a table column that is of no use or deleting the added downTime property from the "bean" object again.

code

@louislam louislam added this to the 2.0.0 milestone Jul 10, 2023
@RobertCSternberg
Sponsor

Many SMS services will deny sending messages containing special characters to the carrier networks. It would make the notification more friendly to read and possibly pass through a few more notification services if we removed the encapsulating <> characters.

@prabhsharan36
Author

Many SMS services will deny sending messages containing special characters to the carrier networks. It would make the notification more friendly to read and possibly pass through a few more notification services if we removed the encapsulating <> characters.

I have already changed this and now the response looks something like this:
Screenshot 2023-07-12 015020

@CommanderStorm

This comment was marked as resolved.

@CommanderStorm

This comment was marked as resolved.

@CommanderStorm

This comment was marked as resolved.

Collaborator

@CommanderStorm CommanderStorm left a comment

The current implementation adds the downTime parameter even when this does not make sense
⇒ The people who use this parameter will use it wrongly.

I have attached an alternative way of handling this. What do you think?

@@ -809,8 +809,39 @@ class Monitor extends BeanModel {
bean.important = true;

if (Monitor.isImportantForNotification(isFirstBeat, previousBeat?.status, bean.status)) {
// Change the notification msg if status becomes UP from DOWN
// Adding msg for the downtime
let downTimeInSeconds;
Collaborator

Suggested change
let downTimeInSeconds;
let downTimeInSeconds = null;

Contributor

Why bother setting it to null? It is already undefined so it doesn't really make much of a difference.

Comment on lines +840 to +844
let downTime = 0;
if (downTimeInSeconds) {
downTime = downTimeInSeconds.duration;
}
await Monitor.sendNotification(isFirstBeat, this, bean, downTime);
Collaborator

Suggested change
let downTime = 0;
if (downTimeInSeconds) {
downTime = downTimeInSeconds.duration;
}
await Monitor.sendNotification(isFirstBeat, this, bean, downTime);
await Monitor.sendNotification(isFirstBeat, this, bean, downTimeInSeconds.duration);

Author

What if downTimeInSeconds is undefined here? Won't downTimeInSeconds.duration throw an error?
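
The concern is valid: reading .duration from undefined throws a TypeError. One way to keep the suggested one-liner without that risk is optional chaining combined with nullish coalescing. A minimal sketch; durationOrNull is a hypothetical wrapper used only to make the behaviour testable:

```javascript
// downTimeInSeconds stands in for the value returned by the lookup; it may be undefined or null.
function durationOrNull(downTimeInSeconds) {
    // ?. short-circuits on null/undefined, and ?? supplies the fallback without swallowing 0.
    return downTimeInSeconds?.duration ?? null;
}

console.log(durationOrNull(undefined));        // null
console.log(durationOrNull({ duration: 0 }));  // 0
console.log(durationOrNull({ duration: 42 })); // 42
```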

@@ -1233,7 +1264,7 @@ class Monitor extends BeanModel {
* @param {Monitor} monitor The monitor to send a notificaton about
* @param {Bean} bean Status information about monitor
*/
static async sendNotification(isFirstBeat, monitor, bean) {
static async sendNotification(isFirstBeat, monitor, bean, downTime = 0) {
Collaborator

Suggested change
static async sendNotification(isFirstBeat, monitor, bean, downTime = 0) {
static async sendNotification(isFirstBeat, monitor, bean, downTime = null) {

Contributor

I am not sure I really understand the point of setting it to null

@@ -1259,6 +1290,7 @@ class Monitor extends BeanModel {
heartbeatJSON["timezone"] = await UptimeKumaServer.getInstance().getTimezone();
heartbeatJSON["timezoneOffset"] = UptimeKumaServer.getInstance().getTimezoneOffset();
heartbeatJSON["localDateTime"] = dayjs.utc(heartbeatJSON["time"]).tz(heartbeatJSON["timezone"]).format(SQL_DATETIME_FORMAT);
heartbeatJSON["downTime"] = downTime;
Collaborator

Suggested change
heartbeatJSON["downTime"] = downTime;
if (downTime !== null) {
heartbeatJSON["downTime"] = downTime;
}

Contributor

Would a simple truthiness check not be fine here?

Collaborator

Being able to distinguish between information not available and "0" is a good thing here IMO.

Author

Sending the downTime property in heartbeatJSON even when downTime is 0 is a good thing, as it tells the notification handler that heartbeatJSON always has a "downTime" property whose value can be anywhere between 0 and [MAX NUMBER]. Providing the downTime property only when it is larger than 0 will confuse the notification handler, because the property is sometimes present and sometimes not.
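
The three positions in this thread differ only in what a provider sees for 0 and for null. A small sketch to compare them side by side; attachDownTime and its mode names are hypothetical and exist only for this comparison:

```javascript
// Hypothetical helper showing the three ways of attaching downTime discussed in this thread.
function attachDownTime(heartbeatJSON, downTime, mode) {
    const copy = { ...heartbeatJSON };
    if (mode === "always") {
        copy.downTime = downTime; // property is always present, possibly null
    } else if (mode === "not-null" && downTime !== null) {
        copy.downTime = downTime; // 0 is kept, "unknown" is omitted
    } else if (mode === "truthy" && downTime) {
        copy.downTime = downTime; // both 0 and "unknown" are omitted
    }
    return copy;
}

console.log(attachDownTime({}, 0, "truthy"));      // {}
console.log(attachDownTime({}, 0, "not-null"));    // { downTime: 0 }
console.log(attachDownTime({}, null, "not-null")); // {}
console.log(attachDownTime({}, null, "always"));   // { downTime: null }
```

Only the truthiness check loses the distinction between a zero-length outage and missing information.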

@@ -809,8 +809,39 @@ class Monitor extends BeanModel {
bean.important = true;

if (Monitor.isImportantForNotification(isFirstBeat, previousBeat?.status, bean.status)) {
// Change the notification msg if status becomes UP from DOWN
// Adding msg for the downtime
let downTimeInSeconds;
Contributor

Why bother setting it to null? It is already undefined so it doesn't really make much of a difference.

@@ -9,9 +10,17 @@ class Telegram extends NotificationProvider {
let okMsg = "Sent Successfully.";

try {
let message = msg;
Contributor

Not sure what the point of this is?

Author

This was added because it is recommended not to reassign function parameters directly, so I store the value in another variable and modify that by concatenating another string.

@@ -1259,6 +1290,7 @@ class Monitor extends BeanModel {
heartbeatJSON["timezone"] = await UptimeKumaServer.getInstance().getTimezone();
heartbeatJSON["timezoneOffset"] = UptimeKumaServer.getInstance().getTimezoneOffset();
heartbeatJSON["localDateTime"] = dayjs.utc(heartbeatJSON["time"]).tz(heartbeatJSON["timezone"]).format(SQL_DATETIME_FORMAT);
heartbeatJSON["downTime"] = downTime;
Contributor

Would a simple truthiness check not be fine here?

@Computroniks
Contributor

Oops, sorry about the duplication. GitHub has for some reason added my replies also as comments.

@louislam
Owner

louislam commented Sep 19, 2023

Unfortunately, as I am changing the uptime calculation method, I believe this no longer works in 2.0.0.

Changed in 2.0.0:

  • The heartbeat table stores only 24-hour data
  • heartbeat.duration is no longer used
  • Data will be aggregated into tables stat_daily and stat_minutely

These changes affect this PR.

Since the message below was removed, please make sure you have read it, and ideally start a discussion first next time.

⚠️⚠️⚠️ Since we do not accept all types of pull requests and do not want to waste your time, please be sure that you have read the pull request rules:
https://github.com/louislam/uptime-kuma/blob/master/CONTRIBUTING.md#can-i-create-a-pull-request-for-uptime-kuma

@louislam louislam modified the milestones: 2.0.0, Pending Sep 19, 2023
@louislam louislam added the question Further information is requested label Sep 19, 2023
@mabed-fr

mabed-fr commented Oct 5, 2023

Unfortunately, as I am changing the uptime calculation method, I believe it no longer works in version 2.0.0.

Changed in version 2.0.0:

  • The heartbeat table stores only 24 hours of data
  • heartbeat.duration is no longer used
  • Data will be aggregated into the tables stat_daily and stat_minutely

Which affects this PR.

Since the message has been removed, make sure you have read this, and it would be better to have a discussion next time.

⚠️⚠️⚠️ Since we do not accept all types of pull requests and do not want to waste your time, please make sure you have read the pull request rules:
https://github.com/louislam/uptime-kuma/blob/master/CONTRIBUTING.md#can-i-create-a-pull-request-for-uptime-kuma

Do you have an idea how this feature could still be enabled?

@CommanderStorm
Collaborator

@prabhsharan36
Thank you for this contribution. Unfortunately, in v2.0 we changed how we calculate uptime considerably. This makes this change incompatible, as stated by Louis in #3394 (comment).

Given how different this approach is from what is needed and the progress this has made in the last few months, I am going to close this PR.

As documented in our contribution guide, we believe that new features create less wasted effort if a discussion takes place beforehand (⇒ we can tell you if we are about to make significant changes to this area of code).
This feature is still possible and should likely be implemented by, for example, adding a field such as last_up_heartbeat_at on each monitor. If the necessary changes are made, we can reopen the PR or handle this in a different PR ^^
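
To illustrate the suggested direction only: the sketch below models a last_up_heartbeat_at style field in memory. The property names, the status code and recordBeat are all assumptions for illustration; a real implementation would persist the field on the monitor and is not described in this thread.

```javascript
const UP = 1; // assumed status code

// Hypothetical in-memory model of the suggested last_up_heartbeat_at field.
// Returns the seconds since the monitor was last UP when it recovers, otherwise null.
function recordBeat(monitor, beat) {
    let downSeconds = null;
    if (beat.status === UP) {
        if (monitor.previousStatus !== UP && monitor.lastUpHeartbeatAt) {
            downSeconds = (Date.parse(beat.time) - Date.parse(monitor.lastUpHeartbeatAt)) / 1000;
        }
        monitor.lastUpHeartbeatAt = beat.time; // refresh on every UP beat
    }
    monitor.previousStatus = beat.status;
    return downSeconds;
}

const monitor = {};
recordBeat(monitor, { status: UP, time: "2023-07-09T10:00:00Z" });
recordBeat(monitor, { status: 0, time: "2023-07-09T10:01:00Z" });
console.log(recordBeat(monitor, { status: UP, time: "2023-07-09T10:05:00Z" })); // 300
```

This needs no query against the heartbeat table at notification time. It measures from the last UP beat to the recovering UP beat, so the result includes the check interval before the first DOWN beat and any PENDING time.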

Labels
area:notifications (Everything related to notifications), question (Further information is requested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Duration of downtime in notifications
7 participants