-
-
Notifications
You must be signed in to change notification settings - Fork 4.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: added total time for outage in notification msg #3394
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made some style and logic comments which were unclear to me.
Given that this targets all notification providers, I tried to be thorough.
server/model/monitor.js
Outdated
if (bean.status === UP) { | ||
const downAfterPreviousUpBeat = await R.findOne("heartbeat", " monitor_id = ? AND status = ? AND time > ( SELECT time from heartbeat WHERE monitor_id = ? AND status = ? ORDER BY time DESC ) ORDER BY time ASC", [ | ||
this.id, | ||
0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the rationale behind extracting the status like this?
Would using status = 0
or using DOWN
/ UP
be better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's done
server/model/monitor.js
Outdated
0 | ||
]); | ||
const downTime = dayjs(mostRecentDownBeat?.time).diff(dayjs(downAfterPreviousUpBeat?.time), "minutes", true); | ||
bean.msg = `${bean.msg} <Down for ${Math.floor(downTime)} mintue(s) ${(downTime % 1).toFixed(2).replace(/^0\./, "")} second(s)>`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line does a lot of magic in one line.
Please extract complex statements.
This comment especially targets (downTime % 1).toFixed(2).replace(/^0\./, "")
.
I would prefer downTime
to be formatted via dayjs instead (probably more readable)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's done
server/model/monitor.js
Outdated
this.id, | ||
0 | ||
]); | ||
const downTime = dayjs(mostRecentDownBeat?.time).diff(dayjs(downAfterPreviousUpBeat?.time), "minutes", true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be shure:
You are using ?
here. Is this save? (the other parts of the code don't use optional chaining)
Read: have you tested this edgecase?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's done
server/model/monitor.js
Outdated
// Change the notification msg if status becomes UP from DOWN | ||
// Adding msg for the downtime | ||
if (bean.status === UP) { | ||
const downAfterPreviousUpBeat = await R.findOne("heartbeat", " monitor_id = ? AND status = ? AND time > ( SELECT time from heartbeat WHERE monitor_id = ? AND status = ? ORDER BY time DESC ) ORDER BY time ASC", [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Likely just paranoia:)
This query sounds expensive (2xordered Table scans with filtering of the largest table).
Given the current scaling issues, this is not ideal, but given that this query is only executed on notification if the bean is UP
, this probably won't be an issue.
Have you measured the impact of this query? How long does it take? (sqlite does not allow concurrent queries => long-running queries could cause downtime on bigger databases or slower IO (e.g. rasperry Pis using SD-Cards))
Could waiting here for longer be an issue?
Testable via inserting
const sleep = ms => new Promise(r => setTimeout(r, ms));
await sleep(3000);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the current scaling issues, this is not ideal, but given that this query is only executed on notification if the bean is UP, this probably won't be an issue.
Yes, basically this query will only run when the service goes from DOWN to UP which is not a daily scenario for most of the service. Also, for cutting down the rows I am using filtering to get only those related to specific monitor by it's ID.
There is also no other way we can get the most recent UP beat without using the ordering of rows.
Have you measured the impact of this query? How long does it take? (sqlite does not allow concurrent queries => long-running queries could cause downtime on bigger databases or slower IO (e.g. rasperry Pis using SD-Cards))
Could waiting here for longer be an issue?
This actually varies from system to system and also how big the table size is. In my system, there are only around 100 rows for a single monitor and it gives results in some milliseconds from DB. So, we can't really find out how it will impact other systems of different sizes.
server/model/monitor.js
Outdated
// Change the notification msg if status becomes UP from DOWN | ||
// Adding msg for the downtime | ||
if (bean.status === UP) { | ||
const downAfterPreviousUpBeat = await R.findOne("heartbeat", " monitor_id = ? AND status = ? AND time > ( SELECT time from heartbeat WHERE monitor_id = ? AND status = ? ORDER BY time DESC ) ORDER BY time ASC", [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit unsure if these query results in the correct accounting.
So we are currently in bean.status === UP
and are searching for the last DOWN
after being UP
at least once.
=> We are searching for UP-.*-DOWN-.*-Down-.*-Up
:
(Red: DOWN
, green: UP
, blue: MAINTANANCE
or PENDING
)
And are counting from first Red to last Red.
Concerns:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to do special accounting for MAINTANANCE or PENDING?
According to me we shouldn't count them into the downtime period as in MAINTANANCE the service may be down but that is expected and in PENDING status we don't know whether service is UP or DOWN.
What happens in this case? (is this reported as down 0 minute(s) 0 second(s)?)
I have changed this condition now and now I am adding the duration of down beats.
Would the current beat's UP to the last UP (or first event) be more accurate?
According to me, that should be the condition, but I am happy about your views also on this.
server/model/monitor.js
Outdated
0 | ||
]); | ||
const downTime = dayjs(mostRecentDownBeat?.time).diff(dayjs(downAfterPreviousUpBeat?.time), "minutes", true); | ||
bean.msg = `${bean.msg} <Down for ${Math.floor(downTime)} mintue(s) ${(downTime % 1).toFixed(2).replace(/^0\./, "")} second(s)>`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bean.msg = `${bean.msg} <Down for ${Math.floor(downTime)} mintue(s) ${(downTime % 1).toFixed(2).replace(/^0\./, "")} second(s)>`; | |
bean.msg += ` <Down for ${Math.floor(downTime)} minute(s) ${(downTime % 1).toFixed(2).replace(/^0\./, "")} second(s)>`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the before one looks more readable
server/model/monitor.js
Outdated
0 | ||
]); | ||
const downTime = dayjs(mostRecentDownBeat?.time).diff(dayjs(downAfterPreviousUpBeat?.time), "minutes", true); | ||
bean.msg = `${bean.msg} <Down for ${Math.floor(downTime)} mintue(s) ${(downTime % 1).toFixed(2).replace(/^0\./, "")} second(s)>`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like how this generates <Down for 3minute(s) 1 second(s)>
.
(s)
should only be added if pluralisation requires it ^^- The
<>
is a bit counterintuitive imo ⇒ I think round braces would fit better - Could relying on dayjs' formatting simplify this code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
server/model/monitor.js
Outdated
0 | ||
]); | ||
const downTime = dayjs(mostRecentDownBeat?.time).diff(dayjs(downAfterPreviousUpBeat?.time), "minutes", true); | ||
bean.msg = `${bean.msg} <Down for ${Math.floor(downTime)} mintue(s) ${(downTime % 1).toFixed(2).replace(/^0\./, "")} second(s)>`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a message to the msg
is something which might not be ideal.
This prevents notification providers like discord
/slack
to including them in the fancy embeds.
This is something that Luis will have to give his opinion on (both paths are workable imo)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better solution would be to add a parameter downDuration
to send(), which is the duration of downtime in seconds, then individual notification providers can format the message as needed. This would provide better customization, and be non-breaking for the existing notifications.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chakflying I guess we will send the same msg to every notification provider. For instance, we will send how many minutes or hours this service goes down in the same manner to every provider. In refactored code, I have put the syntax like (mm: ss) format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you check provider discord.js
, it doesn't even use the msg
parameter for up/down notifications. To be clear, I'm not saying this PR needs to update all the notification providers. You can update the ones that you are using. I'm just saying this is not a good solution considering the current situation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you check provider discord.js, it doesn't even use the msg parameter for up/down notifications.
Oh, I now get your point. Thanks for letting me know. I have added one more parameter in the send() function which will have downDuration value in seconds and then all notification providers can use it the way they want.
Currently, I have added downDuration in Telegram Notification Provider only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change should be added to the heartbeatJSON
instead.
What is the rationale behind adding another parameter?
If adding it to heartbeatJSON
is not an option / adding a parameter this is better, please add this parameter to every function invocation to make this obvious to the authors of the notificationproviders)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added downTime to hearbeatJSON.
But in the "sendNotification" function I have kept the downTime parameter and not added another property to the "bean" parameter as eventually we are storing the data of "bean" into the database and adding a property to it will mean either we have to add coulmn in table which is of no use or we have to delete that added (downTime) property from the "bean" object.
658df28
to
562285f
Compare
Many SMS services will deny sending messages containing special characters to the carrier networks. It would make the notification more friendly to read and possibly pass through a few more notification services if we removed the encapsulating <> characters. |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current implementation does add the downTime
parameter, even if this does not make sense
⇒ The people who will use this parameter will use it wrong.
I have attached an alternative way of handling this. what do you think?
@@ -809,8 +809,39 @@ class Monitor extends BeanModel { | |||
bean.important = true; | |||
|
|||
if (Monitor.isImportantForNotification(isFirstBeat, previousBeat?.status, bean.status)) { | |||
// Change the notification msg if status becomes UP from DOWN | |||
// Adding msg for the downtime | |||
let downTimeInSeconds; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let downTimeInSeconds; | |
let downTimeInSeconds = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why bother setting it to null
? It is already undefined
so it doesn't really make much of a difference.
let downTime = 0; | ||
if (downTimeInSeconds) { | ||
downTime = downTimeInSeconds.duration; | ||
} | ||
await Monitor.sendNotification(isFirstBeat, this, bean, downTime); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let downTime = 0; | |
if (downTimeInSeconds) { | |
downTime = downTimeInSeconds.duration; | |
} | |
await Monitor.sendNotification(isFirstBeat, this, bean, downTime); | |
await Monitor.sendNotification(isFirstBeat, this, bean, downTimeInSeconds.duration); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if downTimeInSeconds is undefined here, isn't downTimeInSeconds.duration will give error?
@@ -1233,7 +1264,7 @@ class Monitor extends BeanModel { | |||
* @param {Monitor} monitor The monitor to send a notificaton about | |||
* @param {Bean} bean Status information about monitor | |||
*/ | |||
static async sendNotification(isFirstBeat, monitor, bean) { | |||
static async sendNotification(isFirstBeat, monitor, bean, downTime = 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static async sendNotification(isFirstBeat, monitor, bean, downTime = 0) { | |
static async sendNotification(isFirstBeat, monitor, bean, downTime = null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I really understand the point of setting it to null
@@ -1259,6 +1290,7 @@ class Monitor extends BeanModel { | |||
heartbeatJSON["timezone"] = await UptimeKumaServer.getInstance().getTimezone(); | |||
heartbeatJSON["timezoneOffset"] = UptimeKumaServer.getInstance().getTimezoneOffset(); | |||
heartbeatJSON["localDateTime"] = dayjs.utc(heartbeatJSON["time"]).tz(heartbeatJSON["timezone"]).format(SQL_DATETIME_FORMAT); | |||
heartbeatJSON["downTime"] = downTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
heartbeatJSON["downTime"] = downTime; | |
if (downTime !== null) { | |
heartbeatJSON["downTime"] = downTime; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a simple truthiness check not be fine here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Being able to distinguish between information not available and "0" is a good thing here IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sending the downTime property in heartbeatJSON even when downTime is 0 is a good thing as it tells the notification handler that there is a property called "downTime" in heartbeatJSON which can be anywhere between 0 - [MAX NUMBER]. Giving the downTime property only when it is larger than 0 will confuse the notification handler where the property sometimes comes and sometimes does not.
@@ -809,8 +809,39 @@ class Monitor extends BeanModel { | |||
bean.important = true; | |||
|
|||
if (Monitor.isImportantForNotification(isFirstBeat, previousBeat?.status, bean.status)) { | |||
// Change the notification msg if status becomes UP from DOWN | |||
// Adding msg for the downtime | |||
let downTimeInSeconds; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why bother setting it to null
? It is already undefined
so it doesn't really make much of a difference.
@@ -9,9 +10,17 @@ class Telegram extends NotificationProvider { | |||
let okMsg = "Sent Successfully."; | |||
|
|||
try { | |||
let message = msg; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what the point of this is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is added as it is recommended not to change the function attributes directly, so I am saving into another variable and changing it further by concatenating it with another string.
@@ -1259,6 +1290,7 @@ class Monitor extends BeanModel { | |||
heartbeatJSON["timezone"] = await UptimeKumaServer.getInstance().getTimezone(); | |||
heartbeatJSON["timezoneOffset"] = UptimeKumaServer.getInstance().getTimezoneOffset(); | |||
heartbeatJSON["localDateTime"] = dayjs.utc(heartbeatJSON["time"]).tz(heartbeatJSON["timezone"]).format(SQL_DATETIME_FORMAT); | |||
heartbeatJSON["downTime"] = downTime; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a simple truthiness check not be fine here?
Oops, sorry about the duplication. GitHub has for some reason added my replies also as comments. |
Unfortunately, as I am changing the uptime calculation method, I believe it is not working in 2.0.0 anymore. Changed in 2.0.0:
Which are affect this pr. Since the message is removed, be sure that you have read this and better have a discussion first next time.
|
Do you have an idea to still enable this feature? |
@prabhsharan36 Given how different this approach is from what is needed and the progress this has made in the last few months, I am going to close this issue. As documented in our contribution guide, we believe that new features create less wasted effort if a discussion takes place beforehand (⇒ we can tell you if we are about to make significant changes to this area of code). |
Description
Resolves #177
Type of change
I have changed the msg body which goes to notification providers to give the total time for the outage.
Please delete any options that are not relevant.
Checklist
(including JSDoc for methods)
Screenshots