Dashboard takes forever to load #1397

Closed
2 tasks done
babecassis opened this issue Mar 21, 2022 · 23 comments · Fixed by #3515
Labels
bug Something isn't working

Comments

@babecassis

⚠️ Please verify that this bug has NOT been raised before.

  • I checked and didn't find similar issue

🛡️ Security Policy

Description

The dashboard sometimes loads with no monitored entities. This happens often, and when it does I see the following trace. Average load on my system is

Trace: KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
    at Client_SQLite3.acquireConnection (/app/node_modules/knex/lib/client.js:305:26)
    at runNextTicks (internal/process/task_queues.js:60:5)
    at listOnTimeout (internal/timers.js:526:9)
    at processTimers (internal/timers.js:500:7)
    at async Runner.ensureConnection (/app/node_modules/knex/lib/execution/runner.js:259:28)
    at async Runner.run (/app/node_modules/knex/lib/execution/runner.js:30:19)
    at async RedBeanNode.normalizeRaw (/app/node_modules/redbean-node/dist/redbean-node.js:570:22)
    at async RedBeanNode.getRow (/app/node_modules/redbean-node/dist/redbean-node.js:556:22)
    at async Function.calcUptime (/app/server/model/monitor.js:590:22)
    at async Function.sendUptime (/app/server/model/monitor.js:650:24) {
  sql: '\n' +
    ' SELECT\n' +
    ' -- SUM all duration, also trim off the beat out of time window\n' +
    ' SUM(\n' +
    ' CASE\n' +
    ' WHEN (JULIANDAY(time) - JULIANDAY(?)) * 86400 < duration\n' +
    ' THEN (JULIANDAY(time) - JULIANDAY(?)) * 86400\n' +
    ' ELSE duration\n' +
    ' END\n' +
    ' ) AS total_duration,\n' +
    '\n' +
    ' -- SUM all uptime duration, also trim off the beat out of time window\n' +
    ' SUM(\n' +
    ' CASE\n' +
    ' WHEN (status = 1)\n' +
    ' THEN\n' +
    ' CASE\n' +
    ' WHEN (JULIANDAY(time) - JULIANDAY(?)) * 86400 < duration\n' +
    ' THEN (JULIANDAY(time) - JULIANDAY(?)) * 86400\n' +
    ' ELSE duration\n' +
    ' END\n' +
    ' END\n' +
    ' ) AS uptime_duration\n' +
    ' FROM heartbeat\n' +
    ' WHERE time > ?\n' +
    ' AND monitor_id = ?\n' +
    ' ',
  bindings: [
    '2022-02-15 19:32:42',
    '2022-02-15 19:32:42',
    '2022-02-15 19:32:42',
    '2022-02-15 19:32:42',
    '2022-02-15 19:32:42',
    20
  ]
}
    at process.<anonymous> (/app/server/server.js:1553:13)
    at process.emit (events.js:400:28)
    at processPromiseRejections (internal/process/promises.js:245:33)
    at processTicksAndRejections (internal/process/task_queues.js:96:32)
    at runNextTicks (internal/process/task_queues.js:64:3)
    at listOnTimeout (internal/timers.js:526:9)
    at processTimers (internal/timers.js:500:7)
If you keep encountering errors, please report to https://github.com/louislam/uptime-kuma/issues
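
For readers skimming the trace, the query that timed out appears to sum heartbeat durations for one monitor inside a time window, clipping any beat whose duration reaches back before the window start. Below is a minimal in-memory sketch of that arithmetic. The function name, the row shape, and the ISO timestamps are assumptions for illustration; the real code runs the SQL shown in the trace.

```javascript
// Hypothetical in-memory equivalent of the SQL in the trace.
// Field names mirror the heartbeat columns used by the query; the function
// name and the row objects are illustrative, not Uptime Kuma's code.
function calcUptimeSketch(heartbeats, windowStart, monitorId) {
    const startMs = Date.parse(windowStart);
    let totalDuration = 0;
    let uptimeDuration = 0;

    for (const beat of heartbeats) {
        const beatMs = Date.parse(beat.time);
        // WHERE time > ? AND monitor_id = ?
        if (beat.monitorId !== monitorId || !(beatMs > startMs)) {
            continue;
        }
        // (JULIANDAY(time) - JULIANDAY(?)) * 86400: seconds since window start
        const sinceStart = (beatMs - startMs) / 1000;
        // CASE WHEN sinceStart < duration THEN sinceStart ELSE duration END
        const counted = sinceStart < beat.duration ? sinceStart : beat.duration;
        totalDuration += counted;
        if (beat.status === 1) {
            uptimeDuration += counted;
        }
    }
    // Note: SQL SUM() yields NULL when no row matches; this sketch yields 0.
    return { totalDuration, uptimeDuration };
}

const sampleBeats = [
    { monitorId: 20, time: "2022-02-15T19:33:12Z", duration: 60, status: 1 },
    { monitorId: 20, time: "2022-02-15T19:34:12Z", duration: 60, status: 0 },
    { monitorId: 20, time: "2022-02-15T19:35:12Z", duration: 60, status: 1 },
    { monitorId: 7, time: "2022-02-15T19:35:12Z", duration: 60, status: 1 },
];
const uptime = calcUptimeSketch(sampleBeats, "2022-02-15T19:32:42Z", 20);
console.log(uptime.uptimeDuration / uptime.totalDuration);
```

Because the query has to touch every heartbeat row in the window for each monitor, its cost grows with the amount of history kept.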

👟 Reproduction steps

I do not know what triggers this

👀 Expected behavior

I see my monitored services

😓 Actual Behavior

Dashboard loads w/ no monitored services

🐻 Uptime-Kuma Version

1.12.1

💻 Operating System and Arch

Raspbian / Raspberry pi 3

🌐 Browser

MS Edge

🐋 Docker Version

20.10.12, build e91ed57

🟩 NodeJS Version

No response

📝 Relevant log output

No response

@babecassis babecassis added the bug Something isn't working label Mar 21, 2022
@louislam
Owner

louislam commented Mar 22, 2022

This is usually related to disk read/write speed. Since you are using a Pi, make sure the SD card is fast, and do not use a network drive as the volume.
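
One way to picture why slow storage can surface as "Timeout acquiring a connection": if queries hold a connection for longer than new requests are willing to wait, the waiting requests fail even though nothing is broken. The toy model below is not Knex. It assumes a single connection, and every number in it is made up for illustration; the actual pool settings are not stated in this thread.

```javascript
// Deterministic toy model, not Knex itself. One connection serves requests
// in arrival order; a request fails if it would have to wait longer than
// acquireTimeoutMs for the connection. All durations are invented.
function simulatePool(requests, acquireTimeoutMs) {
    let freeAt = 0; // time (ms) at which the single connection becomes free
    return requests.map(({ arrivesAt, queryMs }) => {
        const waitMs = Math.max(0, freeAt - arrivesAt);
        if (waitMs > acquireTimeoutMs) {
            // Gave up waiting: this is the "timeout acquiring" case.
            return { arrivesAt, ok: false };
        }
        freeAt = arrivesAt + waitMs + queryMs;
        return { arrivesAt, ok: true };
    });
}

// Ten requests arriving one per second.
const arrivals = Array.from({ length: 10 }, (_, i) => ({ arrivesAt: i * 1000 }));
const fastDisk = arrivals.map((r) => ({ ...r, queryMs: 200 }));
const slowDisk = arrivals.map((r) => ({ ...r, queryMs: 5000 }));

const failures = (results) => results.filter((r) => !r.ok).length;
console.log("fast disk failures:", failures(simulatePool(fastDisk, 3500)));
console.log("slow disk failures:", failures(simulatePool(slowDisk, 3500)));
```

In this model the fast disk never makes anyone wait, while with the slow disk most requests give up before the connection frees up.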

@halfu

halfu commented May 18, 2022

This is usually related to disk read/write speed. Since you are using a Pi, make sure the SD card is fast, and do not use a network drive as the volume.

I can confirm this on DSM 6.2 docker, uptime version 1.15.1.

I had put the ./data folder on an HDD. When I got those "KnexTimeoutError" errors, I noticed high disk usage on that HDD (99% or 100% for quite long periods). So I moved the ./data folder to an SSD RAID 1 yesterday, and all those errors were gone. The usage of this SSD RAID is less than 5%, with write/read IOs at about 50/40.

FYI, I have set up 37 monitors in total: 2 of them are HTTPS monitors, 1 income monitor, 1 DNS monitor, and ping monitors for the rest. Most of these monitors are triggered every 1 minute.

@m3nu

m3nu commented Jul 2, 2022

I'm seeing the same issue and error in my own installations and some PikaPods.com users have reported them.

The workaround is to limit history days (to 14 or 30 days) and clear the history. As others have noticed, the issue starts at around 1 GB DB size.
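
The effect of the retention setting can be sketched as a cutoff date: anything older than the chosen number of days is dropped. The helper below works on plain objects only. `pruneHistory` and the row shape are hypothetical names, and this is not how Uptime Kuma itself clears history.

```javascript
// Hypothetical helper: keep only heartbeats newer than `keepDays` days.
// Models the effect of a retention limit on plain objects; it does not touch
// a database and is not Uptime Kuma's own code.
const DAY_MS = 24 * 60 * 60 * 1000;

function pruneHistory(heartbeats, keepDays, now = new Date()) {
    const cutoffMs = now.getTime() - keepDays * DAY_MS;
    return heartbeats.filter((beat) => Date.parse(beat.time) >= cutoffMs);
}

const now = new Date("2022-07-02T00:00:00Z");
const history = [
    { time: "2022-05-01T00:00:00Z" }, // 62 days old
    { time: "2022-06-10T00:00:00Z" }, // 22 days old
    { time: "2022-06-25T00:00:00Z" }, // 7 days old
];
console.log(pruneHistory(history, 14, now).length); // rows kept with 14 days
console.log(pruneHistory(history, 30, now).length); // rows kept with 30 days
```

A shorter retention keeps fewer rows, which is why it reduces both database size and the work each uptime query has to do.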

It's not a network drive, but by default we only assign 0.25 CPU cores. That gives about 3 MB/s read/write speed from what I see. That may be too slow and maybe we need more cores for more history.
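
As a rough sense of scale, the two figures mentioned here (about 1 GB of data and about 3 MB/s) can be combined. The sketch assumes one full sequential read of the file with no caching, and 1 GB = 1000 MB; real queries may read far less or, with repeated scans, far more.

```javascript
// Back-of-the-envelope only: time to read a whole database file once at a
// given throughput. Assumes 1 GB = 1000 MB and a single sequential read.
function fullReadSeconds(dbSizeMb, throughputMbPerSec) {
    return dbSizeMb / throughputMbPerSec;
}

const seconds = fullReadSeconds(1000, 3);
console.log(`${Math.round(seconds)} s (~${(seconds / 60).toFixed(1)} min)`);
```

Under these assumptions a single pass over the file takes about five and a half minutes, far longer than a dashboard request would normally wait.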

[Screenshot: Screen Shot 2022-07-02 at 10 21 26]

@Aterfax

Aterfax commented Jul 16, 2022

Seeing the same on the latest docker image.

@kalpik

kalpik commented Nov 11, 2022

I have the same issue on a Synology Diskstation.

@Aterfax

Aterfax commented Nov 12, 2022

Had this happen to me, solved it by doing a backup / export of the config, deleting all the appdata then re-importing.

@kalpik

kalpik commented Nov 12, 2022

Yes, but it keeps happening again and again after a while.

@m3nu

m3nu commented Nov 12, 2022

I'm guessing the solution would be to use a real database if one has "real" data. Or to limit the data, like I'm doing right now #1397 (comment)

@kalpik

kalpik commented Nov 12, 2022

I'm guessing the solution would be to use a real database if one has "real" data. Or to limit the data, like I'm doing right now #1397 (comment)

Yep, but a real database isn't supported unfortunately. Even limiting data to 14 days is not enough.

@hyperbart

Running on a Docker host, Ubuntu 22.04.1 LTS, latest and greatest.
Uptime Kuma has been getting slower lately: opening the page shows 0 monitors, and only after 20 to 30 seconds does it start populating.

Server is almost idle. Uptime Kuma is backed by NVMe SSD on ZFS.

Database is indeed quite big:

image

@hyperbart

Hit the Shrink button; nothing seemed to happen.
Hit the Clear all statistics button.
Meanwhile I went to Netdata monitoring, which shows Uptime Kuma was very busy. I don't know whether the busyness came from the shrink or from clearing the statistics.

image

The interface feels snappy now. I lowered the number of days to keep monitoring data to 14.

@tabimoba

tabimoba commented Nov 25, 2022

I have the same problem. I have 600+ monitors on Uptime Kuma.

Right after starting Uptime Kuma, the Dashboard appears immediately, but as time passes, the display slows down or does not appear at all. When that happens, the CPU is near 100%.

After checking the Network tab in Chrome Developer Tools, it appears that when the Dashboard is open in the web browser, there is a huge amount of websocket traffic flowing through the web browser.

It also appears that all data is being read when the Dashboard is displayed (data from all monitors and events is being read, including monitors and events that are not displayed on the screen). I assume that this results in increased disk I/O and excessive load on the DB.

I think this problem could be solved by adding lazy loading and pagination. Please consider this.
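
To make the pagination suggestion concrete, here is a minimal sketch in which the server hands out one page of monitors at a time instead of all of them. `paginate` and the page shape are illustrative names only, not an existing Uptime Kuma API.

```javascript
// Hypothetical sketch: return one page of monitors at a time instead of
// every monitor at once. Names and shapes are illustrative only.
function paginate(items, page, pageSize) {
    if (!Number.isInteger(page) || page < 1) {
        throw new RangeError("page must be an integer >= 1");
    }
    if (!Number.isInteger(pageSize) || pageSize < 1) {
        throw new RangeError("pageSize must be an integer >= 1");
    }
    const totalPages = Math.max(1, Math.ceil(items.length / pageSize));
    const start = (page - 1) * pageSize;
    return {
        page,
        totalPages,
        items: items.slice(start, start + pageSize),
    };
}

// 600 monitors, 50 per page: the dashboard would request 50 rows at a time.
const monitors = Array.from({ length: 600 }, (_, i) => ({ id: i + 1 }));
const firstPage = paginate(monitors, 1, 50);
const lastPage = paginate(monitors, 12, 50);
console.log(firstPage.items.length, firstPage.totalPages, lastPage.items.length);
```

With lazy loading, the client would request the next page only when the user scrolls to it, so data for monitors that are off screen would not be read up front.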

@kalpik

kalpik commented Nov 25, 2022

The number of monitors is not the issue. The data layer (SQLite) just locks up after a while. I only have about 15 monitors, and that locks up SQLite as well. The only solution IMO is running a proper DB, which as I understand it is difficult to do because of the way Uptime Kuma is built. So at this point, I'm mostly looking for alternatives :(

@m3nu

m3nu commented Dec 4, 2022

The only solution IMO is running a proper DB, which as I understand is difficult to do because of the way uptime Kuma is built

Why is it difficult to use another database, @louislam? I just looked through the code, and in most places it already uses an ORM (Redbean) that also supports MySQL. Here is an example of how it's used.

The only place that's more SQLite-specific is the migrations here. I'm assuming Redbean can also change DB tables, so that's solvable.

Since we already host a few Uptime-Kuma instances on PikaPods and Louis hasn't claimed any revenue share, I'd like to offer a bounty of US$1500 to implement MySQL support for this project under the same open license. This is much more than we can afford in relation to what our users pay, but I see it as an investment, since the rest of Uptime Kuma is great.

@JacksonChen666

I'd like to offer a bounty of US$1500 to implement MySQL support for this project under the same open license.

Personally, I would lean toward PostgreSQL support. But I think the most beneficial option would be support for multiple database systems such as PostgreSQL, MySQL, SQLite, etc.

@christopherpickering
Contributor

I'd vote postgres just because it is already running on my server :)

@Saibamen
Contributor

Please retest with the latest Uptime Kuma release

@aessing

aessing commented Sep 8, 2023

Please retest with the latest Uptime Kuma release

For me the issue still exists in the latest version.
Running 1.23.1 on Kubernetes and Longhorn backed by SSDs.

@CommanderStorm
Collaborator

@aessing your issue is possibly unrelated to the issue you are posting in.
Longhorn uses iSCSI/NFS under the hood, as I understand it.
=> uptime-kuma contains a database
=> you are running a database on a network share
=> possibly the added latency of reads/writes is killing the database performance and not #3515

Note that running SQLite databases on an NFS-style filesystem has soundness bugs due to faulty file locking, which may lead to corrupted databases.
Please run uptime-kuma on a local volume instead.
See https://github.com/louislam/uptime-kuma/wiki/%F0%9F%94%A7-How-to-Install#-docker and https://www.sqlite.org/howtocorrupt.html#_filesystems_with_broken_or_missing_lock_implementations

@aessing

aessing commented Sep 8, 2023

Thanks @CommanderStorm for your quick response.

Indeed Longhorn uses iSCSI and NFS. In my case iSCSI, as the volume is Read Write Once. If you go for Many, Longhorn will use NFS.

Local unfortunately isn't possible in a multi node cluster. Otherwise HA/DR will not work.

The SQLite DB is getting slower and slower as more data gets into it. It could be a latency issue, but other DBs like MySQL and Influx work fine and fast on Longhorn.

@CommanderStorm
Collaborator

CommanderStorm commented Sep 8, 2023

HA will not work with uptime-kuma. Please don't run multiple instances of the same docker container as this may corrupt the database.

v2 includes an option to connect to external databases (or continue with the embedded MariaDB/SQLite)

In the meantime, choose a lower retention to hide this issue.

@aessing

aessing commented Sep 8, 2023

@CommanderStorm
HA with K8s does not mean running 2 containers. It means that when a node fails, another node restarts the container.

It seems we have to wait and see what v2 looks like. Is there an ETA, or preview?

@CommanderStorm
Collaborator

CommanderStorm commented Sep 9, 2023

Is there an ETA

No, but you can watch the progress here https://github.com/louislam/uptime-kuma/milestone/24

or preview

We are not currently in beta
⇒ no public preview is available.

Some test tags can be found here https://hub.docker.com/r/louislam/uptime-kuma/tags, but be aware that they are by definition undefined behaviour and not kept up to date automatically
⇒ they can do anything, including making daemons fly out of your nose (read: don't create issues for things you find in them, as they are meant for internal testing ^^)
