
No data and no alerts if site goes down #1219

Closed
2 tasks done
katsil opened this issue Jan 24, 2022 · 9 comments
Labels
area:core (issues describing changes to the core of Uptime Kuma), bug (Something isn't working)

Comments

katsil commented Jan 24, 2022

⚠️ Please verify that this bug has NOT been raised before.

  • I checked and didn't find a similar issue

🛡️ Security Policy

Description

At some point, my monitoring stops checking the availability of the site: it "freezes" and keeps displaying the last recorded data for some period of time. The same thing happens when I use the Prometheus exporter.

👟 Reproduction steps

This also happened on version 1.1 (before switching to the new one). I even moved monitoring to a separate server, but the problem still recurs. I have about 15-20 HTTP checks, and none of them work.

👀 Expected behavior

Monitoring continues to operate and displays correct data.

😓 Actual Behavior

At some point in time, monitoring "freezes" and it's not clear to me how to fix it. This persists until I stop/start the monitors or restart the monitoring server.
Here are some screenshots:

How it looks in Uptime Kuma: [screenshot]

The 24 h view: [screenshot]

Grafana: [screenshot]

As you can see, it stopped posting status data after 00:00 on 24.01.22.

How can I fix this? My monitoring VM has about 100 GB of free NVMe space.

🐻 Uptime-Kuma Version

1.11.3

💻 Operating System and Arch

Ubuntu 18.04

🌐 Browser

Safari

🐋 Docker Version

No response

🟩 NodeJS Version

No response

📝 Relevant log output

No response

katsil added the bug label on Jan 24, 2022
katsil commented Jan 24, 2022

I see the error again after a restart.

chakflying (Collaborator) commented:
Are there any logs in the server output?

katsil commented Jan 25, 2022

Here are the logs from the Docker container; can you tell me where to find other debug logs?

https://pb0.superhub.xyz/?fdff20a2a4ba2348#KNj9yiwtEjvqlqkkBtTBBJIqUpFOwLQok1xJvYnRCjw=

katsil commented Jan 25, 2022

Also, error.log inside the container:

[2022-01-20 06:49:43] KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
    at Client_SQLite3.acquireConnection (/app/node_modules/knex/lib/client.js:305:26)
    at runMicrotasks (<anonymous>)
    at runNextTicks (internal/process/task_queues.js:60:5)
    at listOnTimeout (internal/timers.js:526:9)
    at processTimers (internal/timers.js:500:7)
    at async Runner.ensureConnection (/app/node_modules/knex/lib/execution/runner.js:259:28)
    at async Runner.run (/app/node_modules/knex/lib/execution/runner.js:30:19)
    at async RedBeanNode.storeCore (/app/node_modules/redbean-node/dist/redbean-node.js:166:26)
    at async RedBeanNode.store (/app/node_modules/redbean-node/dist/redbean-node.js:126:20)
    at async beat (/app/server/model/monitor.js:417:13) {
  sql: undefined,
  bindings: undefined
}
[2022-01-20 06:49:43] KnexTimeoutError: Knex: Timeout acquiring a connection. The pool is probably full. Are you missing a .transacting(trx) call?
    at Client_SQLite3.acquireConnection (/app/node_modules/knex/lib/client.js:305:26)
    at runMicrotasks (<anonymous>)
    at runNextTicks (internal/process/task_queues.js:60:5)
    at processTimers (internal/timers.js:497:9)
    at async Runner.ensureConnection (/app/node_modules/knex/lib/execution/runner.js:259:28)
    at async Runner.run (/app/node_modules/knex/lib/execution/runner.js:30:19)
    at async RedBeanNode.storeCore (/app/node_modules/redbean-node/dist/redbean-node.js:166:26)
    at async RedBeanNode.store (/app/node_modules/redbean-node/dist/redbean-node.js:126:20)
    at async beat (/app/server/model/monitor.js:417:13)
    at async Timeout.safeBeat [as _onTimeout] (/app/server/model/monitor.js:443:17) {
  sql: undefined,
  bindings: undefined
}
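This KnexTimeoutError means a query waited out the acquisition timeout without ever getting a connection from Knex's pool. For orientation only, here is a hypothetical knex configuration sketch (not Uptime Kuma's actual config; the filename path is an assumption) showing the settings involved:

```javascript
// Hypothetical knex setup, NOT Uptime Kuma's actual configuration.
// A query waits up to acquireConnectionTimeout ms for a free pool slot,
// then throws KnexTimeoutError. SQLite allows only one writer at a time,
// so a "full" pool usually means writes are queuing faster than the disk
// can absorb them.
const knex = require("knex")({
  client: "sqlite3",
  connection: { filename: "./data/kuma.db" }, // path is an assumption
  useNullAsDefault: true,
  pool: { min: 1, max: 10 },       // how many connections may be handed out
  acquireConnectionTimeout: 60000, // ms to wait before KnexTimeoutError
});
```

Note that raising these limits only hides the symptom; the underlying issue in this thread is the database not keeping up with writes.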

katsil commented Jan 26, 2022

Hey, guys, any news please?

chakflying (Collaborator) commented:
Relevant previous discussion in #218. Unfortunately it's a generic database connection error and there isn't much to go on.

katsil commented Jan 26, 2022

"generic database connection error"

But I'm using the native SQLite database inside the Docker container; how can there be an error connecting to the database?

louislam (Owner) commented Feb 1, 2022

It may be caused by a busy database.

The monitor should restart if there is any error in general.
Unfortunately, I don't know why, but most Knex errors are not caught by try-catch.
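A minimal sketch (not Uptime Kuma code) of why a try-catch around the code that schedules a beat cannot catch an error thrown later inside the scheduled callback: the try block has already exited by the time the timer fires, so the error is only reachable inside the callback itself.

```javascript
// Sketch: a try-catch around scheduling never sees errors thrown later
// inside the timer callback. The names below are illustrative only.
function scheduleBeat(onResult) {
  try {
    setTimeout(async () => {
      // Simulate the database call failing long after scheduling.
      try {
        throw new Error("KnexTimeoutError (simulated)");
      } catch (err) {
        // Catching *inside* the callback is the only place the error
        // is still reachable, which is what a safeBeat-style wrapper does.
        onResult({ caughtInside: true, message: err.message });
      }
    }, 0);
  } catch (err) {
    // Never reached: setTimeout itself does not throw here.
    onResult({ caughtOutside: true });
  }
}

scheduleBeat((result) => console.log(result));
// → { caughtInside: true, message: 'KnexTimeoutError (simulated)' }
```

This is also why, absent such a wrapper, these rejections can surface only as unhandled errors in the log rather than restarting the monitor.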

CommanderStorm (Collaborator) commented:

v1.23.X included some improvements in this direction (using incremental_vacuum), improving the situation.

A lot of performance improvements (using aggregated vs. non-aggregated tables to store heartbeats, letting users choose MariaDB as a database backend, pagination of important events) have been made in v2.0 (our next release), resolving™️ this problem area.
=> I'm going to close this issue.

You can subscribe to our releases to get notified when a new release (such as v2.0-beta.0) is made.
See #4171 for the bugs that need addressing before that can happen.

In the meantime (the issue is SQLite not reading data fast enough to keep up):

  • limit how much retention you have configured
  • limit yourself to a reasonable number of monitors (hardware-dependent; there is no good universal measure)
  • don't run on slow or high-latency disks such as HDDs, SD cards, or a USB stick attached to a router
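The incremental_vacuum improvement mentioned above can be illustrated at the SQLite level. A hedged sketch, assuming the sqlite3 CLI is installed; it uses a throwaway database, and pointing DB at a real kuma.db path is left to the reader:

```shell
# Throwaway database for illustration; substitute your real kuma.db to inspect it.
DB="$(mktemp /tmp/kuma-demo.XXXXXX.db)"

# auto_vacuum must be configured before tables exist, or be followed by a
# one-off full VACUUM on an existing database. Mode 2 = incremental.
sqlite3 "$DB" "PRAGMA auto_vacuum = INCREMENTAL; VACUUM;"
sqlite3 "$DB" "PRAGMA auto_vacuum;"   # prints: 2

# With incremental mode enabled, freed pages can be returned to the OS in
# small batches instead of one long blocking VACUUM of the whole file:
sqlite3 "$DB" "PRAGMA incremental_vacuum(1000);"

rm "$DB"
```

This matters here because deleting old heartbeats does not shrink the database file by itself; without some form of vacuuming, the file keeps its high-water-mark size.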

4 participants