Push monitor status parameter overrides retries #3208

Closed
2 tasks done
dallyger opened this issue May 30, 2023 · 3 comments · Fixed by #3814
Labels
bug Something isn't working

Comments

@dallyger

⚠️ Please verify that this bug has NOT been raised before.

  • I checked and didn't find a similar issue

🛡️ Security Policy

Description

Hi,

thank you for this project. Helps a lot.

We have a couple of scripts whose results we push to push monitors.
If they fail, we push with status=down and attach the error message via the msg= parameter.
This works beautifully.
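The pattern described above can be sketched as follows. The base URL and push token are placeholders, not real values, and `buildPushUrl` is a hypothetical helper, not part of Uptime Kuma:

```javascript
// Placeholders; substitute your own instance URL and push token.
const BASE_URL = "http://localhost:3001";
const TOKEN = "dOS7AMWHLg";

function buildPushUrl(status, msg) {
    // Uptime Kuma's push endpoint accepts status and msg as query
    // parameters; encodeURIComponent guards the free-text message.
    return `${BASE_URL}/api/push/${TOKEN}?status=${status}&msg=${encodeURIComponent(msg)}`;
}

// On failure, a wrapper script would then call something like:
// fetch(buildPushUrl("down", "backup failed: disk full"));
console.log(buildPushUrl("down", "backup failed: disk full"));
```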

However, this ignores the "Retries" option and immediately marks the monitor as "Down", firing notifications. Most of the time this is a temporary issue that resolves within a minute or two.
This makes those notifications less impactful: users start ignoring them, thinking "Oh, it's most likely just a temporary issue", or worse, silence them altogether.

It's not clear whether this is a bug or intended behavior (we do explicitly send status=down).
But even if it is a bug, someone probably relies on this behavior already.

I think the best course of action would be to add a third status value, pending (alongside up and down), for opting into the retry functionality in push monitors (and to update the docs shown in the UI).

Maybe someone has a scenario like this and therefore relies on the current behavior:

  • A custom script monitors something and sends a heartbeat
  • Retries are set to prevent frequent "No heartbeat in time window" notifications
  • The script sends a heartbeat with status=down to get notified immediately

👟 Reproduction steps

  • Create Uptime-Kuma instance
docker run --rm -it --name uk -P --env=NODE_ENV=development louislam/uptime-kuma:1
  • Visit http://localhost:49153 (port may differ)
  • Create User
  • Create Push Monitor with "Retries" set to positive number like 5
  • Send heartbeat with status=down
curl 'http://localhost:49153/api/push/dOS7AMWHLg?status=down&msg=OK&ping='

Reproduction environment (required fields represent the production server):
Uptime-Kuma: 1.21.3
OS: Pop!_OS 22.04 LTS (x86_64)
Docker: 20.10.21, build 20.10.21-0ubuntu1~22.04.3

👀 Expected behavior

The monitor switches to "Pending", as the retry threshold has not yet been exceeded.

😓 Actual Behavior

The monitor switches to "Down", ignoring the retry option.
It then switches from "Down" to "Pending" if no further heartbeats are sent.

🐻 Uptime-Kuma Version

1.21.3

💻 Operating System and Arch

Linux 5.10.0-19-amd64 #1 SMP Debian 5.10.149-2 (2022-10-21) x86_64 GNU/Linux

🌐 Browser

Firefox Developer Edition 114.0b9

🐋 Docker Version

20.10.5+dfsg1, build 55c4c88

🟩 NodeJS Version

v16.20.0 (from docker image)

📝 Relevant log output

2023-05-30T13:00:56Z [AUTH] INFO: Successfully logged in user uk. IP=172.17.0.1
2023-05-30T13:00:56Z [SETTINGS] DEBUG: Get Setting: initServerTimezone: null
2023-05-30T13:00:56Z [SERVER] DEBUG: emit initServerTimezone
2023-05-30T13:00:56Z [GENERALSOCKETHANDLER] DEBUG: Timezone: Europe/Berlin
2023-05-30T15:00:56+02:00 [SETTINGS] DEBUG: Get Setting (cache): primaryBaseURL: null
2023-05-30T15:00:56+02:00 [SETTINGS] DEBUG: Get Setting: serverTimezone: Europe/Berlin
2023-05-30T15:01:12+02:00 [SETTINGS] DEBUG: Get Setting (cache): trustProxy: null
2023-05-30T15:01:12+02:00 [ENTRY] DEBUG: Request Domain: 127.0.0.1
2023-05-30T15:01:13+02:00 [SETTINGS] DEBUG: Cache Cleaner is just started.
2023-05-30T15:01:24+02:00 [MANAGE] INFO: Resume Monitor: 1 User ID: 1
2023-05-30T15:01:24+02:00 [MONITOR] INFO: Added Monitor: undefined User ID: 1
2023-05-30T15:01:34+02:00 [ROUTER] DEBUG: /api/push/ called at 2023-05-30 15:01:34.090
2023-05-30T15:01:34+02:00 [ROUTER] DEBUG: PreviousStatus: 0
2023-05-30T15:01:34+02:00 [ROUTER] DEBUG: Current Status: 0
2023-05-30T15:01:34+02:00 [UPTIMECACHELIST] DEBUG: clearCache: 1
2023-05-30T15:01:34+02:00 [UPTIMECACHELIST] DEBUG: addUptime: 1 24
2023-05-30T15:01:34+02:00 [UPTIMECACHELIST] DEBUG: addUptime: 1 720
2023-05-30T15:01:40+02:00 [ROUTER] DEBUG: /api/push/ called at 2023-05-30 15:01:40.029
2023-05-30T15:01:40+02:00 [ROUTER] DEBUG: PreviousStatus: 0
2023-05-30T15:01:40+02:00 [ROUTER] DEBUG: Current Status: 1
2023-05-30T15:01:40+02:00 [UPTIMECACHELIST] DEBUG: clearCache: 1
2023-05-30T15:01:40+02:00 [UPTIMECACHELIST] DEBUG: addUptime: 1 24
2023-05-30T15:01:40+02:00 [UPTIMECACHELIST] DEBUG: addUptime: 1 720
2023-05-30T15:01:41+02:00 [ROUTER] DEBUG: /api/push/ called at 2023-05-30 15:01:41.683
2023-05-30T15:01:41+02:00 [ROUTER] DEBUG: PreviousStatus: 1
2023-05-30T15:01:41+02:00 [ROUTER] DEBUG: Current Status: 0
2023-05-30T15:01:41+02:00 [UPTIMECACHELIST] DEBUG: clearCache: 1
2023-05-30T15:01:41+02:00 [UPTIMECACHELIST] DEBUG: addUptime: 1 24
2023-05-30T15:01:41+02:00 [UPTIMECACHELIST] DEBUG: addUptime: 1 720
2023-05-30T15:01:42+02:00 [ROUTER] DEBUG: /api/push/ called at 2023-05-30 15:01:42.574
2023-05-30T15:01:42+02:00 [ROUTER] DEBUG: PreviousStatus: 0
2023-05-30T15:01:42+02:00 [ROUTER] DEBUG: Current Status: 0
2023-05-30T15:01:42+02:00 [UPTIMECACHELIST] DEBUG: clearCache: 1
2023-05-30T15:01:42+02:00 [UPTIMECACHELIST] DEBUG: addUptime: 1 24
2023-05-30T15:01:42+02:00 [UPTIMECACHELIST] DEBUG: addUptime: 1 720
2023-05-30T15:01:43+02:00 [ROUTER] DEBUG: /api/push/ called at 2023-05-30 15:01:43.737
2023-05-30T15:01:43+02:00 [ROUTER] DEBUG: PreviousStatus: 0
2023-05-30T15:01:43+02:00 [ROUTER] DEBUG: Current Status: 1
2023-05-30T15:01:43+02:00 [UPTIMECACHELIST] DEBUG: clearCache: 1
2023-05-30T15:01:43+02:00 [UPTIMECACHELIST] DEBUG: addUptime: 1 24
2023-05-30T15:01:43+02:00 [UPTIMECACHELIST] DEBUG: addUptime: 1 720
2023-05-30T15:02:12+02:00 [SETTINGS] DEBUG: Get Setting (cache): trustProxy: null
2023-05-30T15:02:12+02:00 [ENTRY] DEBUG: Request Domain: 127.0.0.1
2023-05-30T15:02:13+02:00 [SETTINGS] DEBUG: Cache Cleaner is just started.
2023-05-30T15:02:13+02:00 [SETTINGS] DEBUG: Cache Cleaner deleted: title
2023-05-30T15:02:13+02:00 [SETTINGS] DEBUG: Cache Cleaner deleted: dnsCache
2023-05-30T15:02:13+02:00 [SETTINGS] DEBUG: Cache Cleaner deleted: entryPage
2023-05-30T15:02:13+02:00 [SETTINGS] DEBUG: Cache Cleaner deleted: cloudflaredTunnelToken
2023-05-30T15:02:13+02:00 [SETTINGS] DEBUG: Cache Cleaner deleted: checkUpdate
2023-05-30T15:02:13+02:00 [SETTINGS] DEBUG: Cache Cleaner deleted: checkBeta
2023-05-30T15:02:13+02:00 [SETTINGS] DEBUG: Cache Cleaner deleted: trustProxy
2023-05-30T15:02:13+02:00 [SETTINGS] DEBUG: Cache Cleaner deleted: primaryBaseURL
2023-05-30T15:02:13+02:00 [SETTINGS] DEBUG: Cache Cleaner deleted: disableAuth
2023-05-30T15:02:13+02:00 [SETTINGS] DEBUG: Cache Cleaner deleted: serverTimezone
@dallyger dallyger added the bug Something isn't working label May 30, 2023
@chakflying
Collaborator

Related: #1863

Thinking about how to fix this: since the push route handling and the heartbeat loop are separate, there is currently no way to track the number of retries in the push route. What is the best way to get this information? We could recursively query the database for the last beat until we find the last successful one, but that builds up a stack and gets slow if the number of retries is high.

I guess we could query the last successful beat, then based on the returned ID use a second query to get all the beats since then.
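The two-query idea above could look roughly like this. The table and column names follow Uptime Kuma's heartbeat table but should be treated as assumptions here, and `db.query` is a stand-in for the project's actual database layer:

```javascript
// Assumed status code for a successful beat (1 = UP in Uptime Kuma).
const UP = 1;

async function countBeatsSinceLastUp(db, monitorId) {
    // 1. Find the most recent successful beat for this monitor.
    const lastUp = await db.query(
        "SELECT id FROM heartbeat WHERE monitor_id = ? AND status = ? ORDER BY id DESC LIMIT 1",
        [monitorId, UP]
    );
    const sinceId = lastUp.length > 0 ? lastUp[0].id : 0;

    // 2. Count every beat recorded after it; this is the number of
    //    consecutive non-up beats, i.e. the effective retry counter.
    const rows = await db.query(
        "SELECT COUNT(*) AS n FROM heartbeat WHERE monitor_id = ? AND id > ?",
        [monitorId, sinceId]
    );
    return rows[0].n;
}
```

Two bounded queries instead of a recursive walk keeps the cost independent of how high the retry count is.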

There is also the problem that, this way, retries would persist across restarts, which would differ from other monitors. We could mitigate this by saving the server restart time and adding it to the query, but that is yet another point of complexity.

@chakflying
Collaborator

chakflying commented Jun 1, 2023

I like your idea of being able to set pending via the push route, but there is an issue: the heartbeat loop would be unaware of this, so you could get more than your configured number of "retries" before the timer sets the monitor to down.

@ebricca

ebricca commented Aug 28, 2023

Even a push sending status=pending would be OK in my case.
It would allow me to represent services that are degraded or busy, together with a msg notice.

(The calling script itself could be made to set the status to down after x failures.)
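The client-side workaround in that parenthetical can be sketched as a small state machine. The function name and the threshold are illustrative, not part of Uptime Kuma; the script would send the returned status on each push:

```javascript
// Escalate to "down" only after this many consecutive failures.
const MAX_RETRIES = 3;
let failures = 0;

function nextStatus(checkSucceeded) {
    if (checkSucceeded) {
        failures = 0;       // any success resets the counter
        return "up";
    }
    failures += 1;
    // Report "pending" while below the threshold, "down" once reached.
    return failures >= MAX_RETRIES ? "down" : "pending";
}
```

This keeps the retry logic entirely in the calling script, so it works even if Uptime Kuma itself never gains retry handling for pushed statuses.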
