
Monitors are stopped after database is locked error #3130

Closed
deluxghost opened this issue May 6, 2023 · 11 comments · Fixed by #3174
Labels
bug Something isn't working

Comments

@deluxghost (Contributor)

⚠️ Please verify that this bug has NOT been raised before.

  • I checked and didn't find a similar issue

🛡️ Security Policy

Description

I updated Uptime Kuma to the latest version recently. I use push monitors only; they were all created in an old version.

One day my push clients were offline for about an hour because of my local firewall, but when I checked the uptime dashboard, there was no DOWN record at all.

The heartbeat intervals of the monitors are 60s/30s and the retries are 5/3, so they should normally go down within 5 minutes without any push request.

The chart of an affected monitor looks like this:

[image: monitor status chart]

I stopped the push client for about 10 minutes, with no requests during that period. The chart shows no data for that period either, but it stays green the whole time.

Editing an affected monitor without actually changing any settings and saving it directly fixes the issue for that one monitor ("Mumble" in the logs below).

A new monitor ("test" in the logs below) works fine.

👟 Reproduction steps

  1. Update from an old version that has push monitors (I can't remember the actual version, sorry)
  2. Stop the push client
  3. Wait until the monitor should go down
  4. It does not go down
  5. Start the push client
  6. The missing data appears in the chart, but there is no down event

👀 Expected behavior

Push monitors created in an old version should work as expected (going down when no push requests arrive) without needing to be re-saved.

😓 Actual Behavior

Such monitors never go down.

🐻 Uptime-Kuma Version

1.21.2

💻 Operating System and Arch

Ubuntu 20.04.5 LTS

🌐 Browser

Chrome 112.0.5615.49

🐋 Docker Version

20.10.18

🟩 NodeJS Version

No response

📝 Relevant log output

uptime-kuma  | 2023-05-06T03:14:01+08:00 [PLUGIN] WARN: Warning: In order to enable plugin feature, you need to use the default data directory: ./data/
uptime-kuma  | 2023-05-06T03:14:01+08:00 [DB] INFO: Data Dir: data/
uptime-kuma  | 2023-05-06T03:14:02+08:00 [DB] INFO: SQLite config:
uptime-kuma  | [ { journal_mode: 'wal' } ]
uptime-kuma  | [ { cache_size: -12000 } ]
uptime-kuma  | 2023-05-06T03:14:02+08:00 [DB] INFO: SQLite Version: 3.41.1
uptime-kuma  | {
uptime-kuma  |   name: 'clear-old-data',
uptime-kuma  |   message: 'Clearing Data older than 180 days...'
uptime-kuma  | }
uptime-kuma  | { name: 'clear-old-data', message: 'done' }
uptime-kuma  | 2023-05-06T14:52:29+08:00 [AUTH] INFO: Login by token. IP=#############
uptime-kuma  | 2023-05-06T14:52:29+08:00 [AUTH] INFO: Username from JWT: #############
uptime-kuma  | 2023-05-06T14:52:29+08:00 [AUTH] INFO: Successfully logged in user #############. IP=#############
uptime-kuma  | 2023-05-06T16:05:55+08:00 [AUTH] INFO: Login by token. IP=#############
uptime-kuma  | 2023-05-06T16:05:55+08:00 [AUTH] INFO: Username from JWT: #############
uptime-kuma  | 2023-05-06T16:05:55+08:00 [AUTH] INFO: Successfully logged in user #############. IP=#############
uptime-kuma  | 2023-05-06T16:07:02+08:00 [MONITOR] INFO: Get Monitor: 12 User ID: 1
uptime-kuma  | 2023-05-06T16:08:30+08:00 [AUTH] INFO: Login by token. IP=#############
uptime-kuma  | 2023-05-06T16:08:30+08:00 [AUTH] INFO: Username from JWT: #############
uptime-kuma  | 2023-05-06T16:08:30+08:00 [AUTH] INFO: Successfully logged in user #############. IP=#############
uptime-kuma  | 2023-05-06T16:08:37+08:00 [MONITOR] INFO: Get Monitor: 12 User ID: 1
uptime-kuma  | 2023-05-06T16:09:18+08:00 [MANAGE] INFO: Resume Monitor: 28 User ID: 1
uptime-kuma  | 2023-05-06T16:09:18+08:00 [MONITOR] INFO: Added Monitor: undefined User ID: 1
uptime-kuma  | 2023-05-06T16:09:27+08:00 [MONITOR] INFO: Get Monitor: 28 User ID: 1
uptime-kuma  | 2023-05-06T16:09:38+08:00 [MONITOR] WARN: Monitor #28 'test': Pending: No heartbeat in the time window | Max retries: 1 | Retry: 1 | Retry Interval: 20 seconds | Type: push
uptime-kuma  | 2023-05-06T16:10:06+08:00 [MONITOR] WARN: Monitor #28 'test': Pending: No heartbeat in the time window | Max retries: 1 | Retry: 1 | Retry Interval: 20 seconds | Type: push
uptime-kuma  | 2023-05-06T16:10:08+08:00 [MONITOR] INFO: Get Monitor: 12 User ID: 1
uptime-kuma  | 2023-05-06T16:10:26+08:00 [MONITOR] WARN: Monitor #28 'test': Failing: No heartbeat in the time window | Interval: 20 seconds | Type: push | Down Count: 0 | Resend Interval: 0
uptime-kuma  | 2023-05-06T16:10:34+08:00 [MANAGE] INFO: Delete Monitor: 28 User ID: 1
uptime-kuma  | 2023-05-06T16:10:38+08:00 [MONITOR] INFO: Get Monitor: 12 User ID: 1
uptime-kuma  | 2023-05-06T16:10:43+08:00 [MANAGE] INFO: Resume Monitor: 12 User ID: 1
uptime-kuma  | 2023-05-06T16:10:43+08:00 [MONITOR] INFO: Get Monitor: 12 User ID: 1
uptime-kuma  | 2023-05-06T16:11:13+08:00 [MONITOR] WARN: Monitor #12 'Mumble': Pending: No heartbeat in the time window | Max retries: 3 | Retry: 1 | Retry Interval: 30 seconds | Type: push
uptime-kuma  | 2023-05-06T16:11:43+08:00 [MONITOR] WARN: Monitor #12 'Mumble': Pending: No heartbeat in the time window | Max retries: 3 | Retry: 2 | Retry Interval: 30 seconds | Type: push
uptime-kuma  | 2023-05-06T16:12:07+08:00 [MONITOR] INFO: Get Monitor: 8 User ID: 1
uptime-kuma  | 2023-05-06T16:12:12+08:00 [MONITOR] INFO: Get Monitor: 26 User ID: 1
uptime-kuma  | 2023-05-06T16:12:13+08:00 [MONITOR] WARN: Monitor #12 'Mumble': Pending: No heartbeat in the time window | Max retries: 3 | Retry: 3 | Retry Interval: 30 seconds | Type: push
uptime-kuma  | 2023-05-06T16:12:24+08:00 [MONITOR] INFO: Get Monitor: 14 User ID: 1
uptime-kuma  | 2023-05-06T16:12:44+08:00 [MONITOR] WARN: Monitor #12 'Mumble': Failing: No heartbeat in the time window | Interval: 30 seconds | Type: push | Down Count: 0 | Resend Interval: 0
uptime-kuma  | 2023-05-06T16:13:16+08:00 [MONITOR] INFO: Get Monitor Beats: 11 User ID: 1
uptime-kuma  | 2023-05-06T16:13:33+08:00 [MONITOR] INFO: Get Monitor Beats: 11 User ID: 1
uptime-kuma  | 2023-05-06T16:14:00+08:00 [AUTH] INFO: Login by token. IP=#############
uptime-kuma  | 2023-05-06T16:14:00+08:00 [AUTH] INFO: Username from JWT: #############
uptime-kuma  | 2023-05-06T16:14:00+08:00 [AUTH] INFO: Successfully logged in user #############. IP=#############
uptime-kuma  | 2023-05-06T16:45:11+08:00 [MONITOR] INFO: Get Monitor Beats: 11 User ID: 1
uptime-kuma  | 2023-05-06T16:45:14+08:00 [MONITOR] INFO: Get Monitor Beats: 11 User ID: 1
uptime-kuma  | 2023-05-06T16:56:42+08:00 [MONITOR] INFO: Get Monitor: 11 User ID: 1
@deluxghost added the bug label on May 6, 2023
@chakflying (Collaborator)

Does the issue reappear if you restart the server?

@deluxghost (Contributor, Author)

Does the issue reappear if you restart the server?

I think I restarted the server during the update process... but yeah, restarting the server seems to fix this issue.

@deluxghost (Contributor, Author)

Does the issue reappear if you restart the server?

Weird; I think I had already restarted the server, but when I shut down one of my nodes yesterday, the issue was still there. All services on that node should have gone down, but they did not.

[image: monitor status chart]

@deluxghost reopened this on May 14, 2023
@chakflying (Collaborator)

Unfortunately, a lot of the useful logging is hidden unless you run with NODE_ENV=development. Is this the same monitor that you previously "saved without changing", or is this another one?

I thought about this but there really shouldn't be any database changes that would affect the behavior of the push monitor.

@deluxghost (Contributor, Author)

Ah, so I need to add that variable to my compose file to get more info. I will do more testing later.

@deluxghost (Contributor, Author)

Looks like it might not be about the version. I have enabled debug logs and noticed that some of the monitors stopped after these errors:

^^^ There are more of the same errors above ^^^


uptime-kuma  | Trace: [Error: insert into `heartbeat` (`down_count`, `duration`, `important`, `monitor_id`, `msg`, `status`, `time`) values (0, 61, false, 8, 'No heartbeat in the time window', 2, '2023-05-17 19:14:33.005') - SQLITE_BUSY: database is locked] {
uptime-kuma  |   errno: 5,
uptime-kuma  |   code: 'SQLITE_BUSY'
uptime-kuma  | }
uptime-kuma  |     at process.<anonymous> (/app/server/server.js:1804:13)
uptime-kuma  |     at process.emit (node:events:513:28)
uptime-kuma  |     at emit (node:internal/process/promises:140:20)
uptime-kuma  |     at processPromiseRejections (node:internal/process/promises:274:27)
uptime-kuma  |     at processTicksAndRejections (node:internal/process/task_queues:97:32)
uptime-kuma  | If you keep encountering errors, please report to https://github.com/louislam/uptime-kuma/issues
uptime-kuma  | Trace: [Error: insert into `heartbeat` (`down_count`, `duration`, `important`, `monitor_id`, `msg`, `status`, `time`) values (0, 62, false, 10, 'No heartbeat in the time window', 2, '2023-05-17 19:14:34.030') - SQLITE_BUSY: database is locked] {
uptime-kuma  |   errno: 5,
uptime-kuma  |   code: 'SQLITE_BUSY'
uptime-kuma  | }
uptime-kuma  |     at process.<anonymous> (/app/server/server.js:1804:13)
uptime-kuma  |     at process.emit (node:events:513:28)
uptime-kuma  |     at emit (node:internal/process/promises:140:20)
uptime-kuma  |     at processPromiseRejections (node:internal/process/promises:274:27)
uptime-kuma  |     at processTicksAndRejections (node:internal/process/task_queues:97:32)
uptime-kuma  | If you keep encountering errors, please report to https://github.com/louislam/uptime-kuma/issues
uptime-kuma  | Trace: [Error: insert into `heartbeat` (`down_count`, `duration`, `important`, `monitor_id`, `msg`, `status`, `time`) values (0, 62, false, 1, 'No heartbeat in the time window', 2, '2023-05-17 19:14:34.033') - SQLITE_BUSY: database is locked] {
uptime-kuma  |   errno: 5,
uptime-kuma  |   code: 'SQLITE_BUSY'
uptime-kuma  | }
uptime-kuma  |     at process.<anonymous> (/app/server/server.js:1804:13)
uptime-kuma  |     at process.emit (node:events:513:28)
uptime-kuma  |     at emit (node:internal/process/promises:140:20)
uptime-kuma  |     at processPromiseRejections (node:internal/process/promises:274:27)
uptime-kuma  |     at processTicksAndRejections (node:internal/process/task_queues:97:32)
uptime-kuma  | If you keep encountering errors, please report to https://github.com/louislam/uptime-kuma/issues
uptime-kuma  | 2023-05-18T03:14:56+08:00 [UPTIMECACHELIST] DEBUG: clearCache: 10
uptime-kuma  | 2023-05-18T03:14:56+08:00 [MONITOR] DEBUG: No clients in the room, no need to send stats
uptime-kuma  | { name: 'clear-old-data', message: 'done' }
uptime-kuma  | 2023-05-18T03:14:56+08:00 [UPTIMECACHELIST] DEBUG: clearCache: 1
uptime-kuma  | 2023-05-18T03:14:56+08:00 [MONITOR] DEBUG: No clients in the room, no need to send stats
uptime-kuma  | 2023-05-18T03:14:56+08:00 [ROUTER] DEBUG: /api/push/ called at 2023-05-18 03:14:56.428

I'm using a Docker local bind mount (/local/path:/app/data), my data retention setting is 180 days, and the current database size is about 1400 MB.

I have seen another "database is locked" error before, but the log messages are different:

uptime-kuma  | Trace: [Error: insert into `heartbeat` (`down_count`, `duration`, `important`, `monitor_id`, `msg`, `status`, `time`) values (0, 61, false, 21, 'No heartbeat in the time window', 0, '2023-05-16 19:14:25.968') - SQLITE_BUSY: database is locked] {
uptime-kuma  |   errno: 5,
uptime-kuma  |   code: 'SQLITE_BUSY'
uptime-kuma  | }
uptime-kuma  |     at Timeout.safeBeat [as _onTimeout] (/app/server/model/monitor.js:811:25)
uptime-kuma  | 2023-05-17T03:14:35+08:00 [MONITOR] INFO: Try to restart the monitor

One of the messages says the monitor is restarted; the other doesn't.

After the batch of errors, I no longer see the Checking monitor at <date time> line for most of the monitors.
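As an aside, SQLITE_BUSY under WAL generally means a writer could not acquire the lock in time; one common mitigation is to retry the write briefly instead of failing on the first attempt. A minimal sketch of such a retry wrapper (a hypothetical helper, not part of Uptime Kuma's codebase):

```javascript
// Hypothetical helper (not part of Uptime Kuma): retry a write a few times
// with a short linear backoff when SQLite reports SQLITE_BUSY, instead of
// letting the rejection escape to the global error handler.
async function withBusyRetry(write, retries = 3, delayMs = 100) {
    for (let attempt = 0; ; attempt++) {
        try {
            return await write();
        } catch (e) {
            if (e.code !== "SQLITE_BUSY" || attempt >= retries) {
                throw e; // not lock contention, or out of retries
            }
            // Wait a bit longer on each successive attempt.
            await new Promise((resolve) => setTimeout(resolve, delayMs * (attempt + 1)));
        }
    }
}
```

Setting `PRAGMA busy_timeout` on the SQLite connection achieves a similar effect at the database level, making writers wait for the lock instead of failing immediately.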

@deluxghost (Contributor, Author)

The two error handlers:

uptime-kuma/server/server.js

Lines 1802 to 1807 in 9dc02bb

// Catch unexpected errors here
process.addListener("unhandledRejection", (error, promise) => {
    console.trace(error);
    UptimeKumaServer.errorLog(error, false);
    console.error("If you keep encountering errors, please report to https://github.com/louislam/uptime-kuma/issues");
});

const safeBeat = async () => {
    try {
        await beat();
    } catch (e) {
        console.trace(e);
        UptimeKumaServer.errorLog(e, false);
        log.error("monitor", "Please report to https://github.com/louislam/uptime-kuma/issues");
        if (! this.isStop) {
            log.info("monitor", "Try to restart the monitor");
            this.heartbeatInterval = setTimeout(safeBeat, this.interval * 1000);
        }
    }
};
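The asymmetry between these two handlers matches the symptom: an error that rejects the awaited beat() promise is caught by safeBeat, which reschedules the monitor ("Try to restart the monitor" in the logs), while a rejection that escapes the await chain only reaches the global unhandledRejection handler, which logs and never reschedules, so that monitor silently stops beating. A minimal sketch of that difference (hypothetical demo code, not Uptime Kuma's actual implementation):

```javascript
// Demo (hypothetical, not Uptime Kuma code): whether the monitor restarts
// depends on which handler sees the SQLITE_BUSY rejection.
function makeMonitor(beat) {
    const state = { rescheduled: false };
    const safeBeat = async () => {
        try {
            await beat();
        } catch (e) {
            // Mirrors the "Try to restart the monitor" branch above.
            state.rescheduled = true;
        }
    };
    return { state, safeBeat };
}

// Case 1: the failing write is awaited, so safeBeat catches it and reschedules.
const caught = makeMonitor(async () => {
    throw Object.assign(new Error("database is locked"), { code: "SQLITE_BUSY" });
});

// Case 2: the write is fired without being awaited, so the rejection escapes
// safeBeat's try/catch entirely. (Handled locally here so the demo exits
// cleanly; in the real server it would surface as an unhandledRejection.)
const escaped = makeMonitor(async () => {
    Promise.reject(
        Object.assign(new Error("database is locked"), { code: "SQLITE_BUSY" })
    ).catch(() => { /* would hit the global handler, which never reschedules */ });
});
```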

@deluxghost (Contributor, Author)

I have also encountered the KnexTimeoutError issue; maybe waiting for 2.0 with external database support is a better solution.

@deluxghost changed the title from "Push monitors from old version don't go DOWN state at all" to "Monitors are stopped after database is locked error" on May 18, 2023
@chakflying (Collaborator)

Those errors will show up even without debug logging enabled. If you haven't seen them before, it may be because printing the debug logs increased the server load too much.

@deluxghost (Contributor, Author)

The log was too long, so I didn't find them before I started checking the Checking monitor at... logs.

@chakflying (Collaborator)

Thank you for your investigation. This is a separate issue and a fix is now open in #3174.
