vspheredb daemon fails all the time after update to 1.1.0 #143

terra-nova · 2019-12-03T08:45:11Z

Expected Behavior

The icinga-vspheredb.service systemd service should be running constantly.

Current Behavior

The service starts up, terminates, starts up again, terminates, ... ad infinitum

Logs:

Dec 03 09:20:33 xxx systemd[1]: Started Icinga vSphereDB Daemon.
Dec 03 09:20:43 xxx systemd[1]: icinga-vspheredb.service watchdog timeout (limit 10s)!
Dec 03 09:20:43 xxx systemd[1]: icinga-vspheredb.service: main process exited, code=killed, status=6/ABRT
Dec 03 09:20:43 xxx systemd[1]: Unit icinga-vspheredb.service entered failed state.
Dec 03 09:20:43 xxx systemd[1]: icinga-vspheredb.service failed.
Dec 03 09:21:14 xxx systemd[1]: icinga-vspheredb.service holdoff time over, scheduling restart.
Dec 03 09:21:14 xxx systemd[1]: Stopped Icinga vSphereDB Daemon.
Dec 03 09:21:14 xxx systemd[1]: Starting Icinga vSphereDB Daemon...
...

Shortly after starting up, the service reports Running DB cleanup (this could take some time) and attempts to run a MySQL OPTIMIZE TABLE command on the vspheredb_daemonlog table:

...
| 374 | icinga_vspheredb | localhost | icinga_vspheredb | Query   |     2 | altering table         | OPTIMIZE TABLE vspheredb_daemonlog |
...
+-----+------------------+-----------+------------------+---------+-------+------------------------+------------------------------------+

If I run that statement manually (using the same database user), I get this output:

mysql> optimize table vspheredb_daemonlog;
+--------------------------------------+----------+----------+-------------------------------------------------------------------+
| Table                                | Op       | Msg_type | Msg_text                                                          |
+--------------------------------------+----------+----------+-------------------------------------------------------------------+
| icinga_vspheredb.vspheredb_daemonlog | optimize | note     | Table does not support optimize, doing recreate + analyze instead |
| icinga_vspheredb.vspheredb_daemonlog | optimize | status   | OK                                                                |
+--------------------------------------+----------+----------+-------------------------------------------------------------------+
2 rows in set (5.15 sec)

So the operation does not seem to have failed. Shortly after, the daemon process is terminated (watchdog timeout (limit 10s)!).

Possible Solution

Steps to Reproduce (for bugs)

Your Environment

VMware vCenter®/ESXi™-Version:
Version/GIT-Hash of this module: v1.7.2
Icinga Web 2 version: 2.7.3
Operating System and version: CentOS x64 7.7.1908
Webserver, PHP versions: httpd 2.4.6-90, rh-php71-php 7.1.30-1, rh-mysql80-mysql-server-8.0.17-1


log1-c · 2019-12-04T14:28:19Z

Can confirm:

root@server01:/usr/share/icingaweb2/modules/vspheredb# systemctl status icinga-vspheredb.service                                                                                           
● icinga-vspheredb.service - Icinga vSphereDB Daemon
   Loaded: loaded (/etc/systemd/system/icinga-vspheredb.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2019-12-04 15:22:39 CET; 9s ago
     Docs: https://icinga.com/docs/icinga-vsphere/latest/
 Main PID: 29689 (icingacli)
   Status: "Running DB cleanup (this could take some time)"
    Tasks: 1 (limit: 4660)
   CGroup: /system.slice/icinga-vspheredb.service
           └─29689 Icinga::vSphereDB::main: 0 active runners

Dez 04 15:22:38 server01 systemd[1]: Starting Icinga vSphereDB Daemon...
Dez 04 15:22:39 server01 systemd[1]: Started Icinga vSphereDB Daemon.
root@server01:/usr/share/icingaweb2/modules/vspheredb# systemctl status icinga-vspheredb.service                                                                                           
● icinga-vspheredb.service - Icinga vSphereDB Daemon
   Loaded: loaded (/etc/systemd/system/icinga-vspheredb.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: watchdog) since Wed 2019-12-04 15:22:49 CET; 7s ago
     Docs: https://icinga.com/docs/icinga-vsphere/latest/
  Process: 29689 ExecStart=/usr/bin/icingacli vspheredb daemon run (code=dumped, signal=ABRT)
 Main PID: 29689 (code=dumped, signal=ABRT)
   Status: "Running DB cleanup (this could take some time)"
root@server01:/usr/share/icingaweb2/modules/vspheredb# systemctl status icinga-vspheredb.service                                                                                           
● icinga-vspheredb.service - Icinga vSphereDB Daemon
   Loaded: loaded (/etc/systemd/system/icinga-vspheredb.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: watchdog) since Wed 2019-12-04 15:22:49 CET; 15s ago
     Docs: https://icinga.com/docs/icinga-vsphere/latest/
  Process: 29689 ExecStart=/usr/bin/icingacli vspheredb daemon run (code=dumped, signal=ABRT)
 Main PID: 29689 (code=dumped, signal=ABRT)
   Status: "Running DB cleanup (this could take some time)"
root@server01:/usr/share/icingaweb2/modules/vspheredb# systemctl status icinga-vspheredb.service                                                                                           
● icinga-vspheredb.service - Icinga vSphereDB Daemon
   Loaded: loaded (/etc/systemd/system/icinga-vspheredb.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: watchdog) since Wed 2019-12-04 15:22:49 CET; 22s ago
     Docs: https://icinga.com/docs/icinga-vsphere/latest/
  Process: 29689 ExecStart=/usr/bin/icingacli vspheredb daemon run (code=dumped, signal=ABRT)
 Main PID: 29689 (code=dumped, signal=ABRT)
   Status: "Running DB cleanup (this could take some time)"
root@server01:/usr/share/icingaweb2/modules/vspheredb# systemctl status icinga-vspheredb.service                                                                                           
● icinga-vspheredb.service - Icinga vSphereDB Daemon
   Loaded: loaded (/etc/systemd/system/icinga-vspheredb.service; enabled; vendor preset: enabled)
   Active: activating (auto-restart) (Result: watchdog) since Wed 2019-12-04 15:24:10 CET; 7s ago
     Docs: https://icinga.com/docs/icinga-vsphere/latest/
  Process: 30116 ExecStart=/usr/bin/icingacli vspheredb daemon run (code=dumped, signal=ABRT)
 Main PID: 30116 (code=dumped, signal=ABRT)
   Status: "Running DB cleanup (this could take some time)"

Dez 04 15:24:10 server01 systemd[1]: icinga-vspheredb.service: Failed with result 'watchdog'.

Icinga Web 2 Version
2.7.3
Git Commit
06cabfe8ba28cf545a42c92f25484383191a4e51
PHP Version
7.2.24-0ubuntu0.18.04.1
Git Commit Date
2019-10-18

Module vspheredb
Status enabled
Version 1.1.0
Git Commit 5bc3546

uffsalot · 2019-12-05T11:53:13Z

Can confirm.

Icinga Web 2 Version
2.7.3
Git Commit
06cabfe8ba28cf545a42c92f25484383191a4e51
PHP Version
7.3.11-1~deb10u1
Git Commit Date
2019-10-18

VMware vCenter®/ESXi™-Version: 6.7
Version/GIT-Hash of this module: v1.1.0
Operating System and version: Debian 10 x64

Obivatelj · 2019-12-09T19:06:42Z

Confirmed here too.
When starting with icingacli vspheredb daemon run --debug it works. It seems that when Running DB cleanup takes longer than 10s, the watchdog timeout kicks in, which causes this output message and, ergo, the restart of the service.
I think it is related to #138

madmax01 · 2019-12-29T17:57:28Z

Is there any fix for this?

I did a fresh install today, and the service is not starting up. Does "Add ESXi/vCenter" only work once the service is up? When I click "Add", there is nothing to add: under "Create a new vCenter/ESXi-Connection" everything is blank.

Icinga Web 2 Version
2.7.3
PHP 7.2.26

dependencies all installed.

Errors:

`PHP Fatal error: Uncaught Error: Call to undefined function Icinga\Module\Vspheredb\Daemon\posix_getpid() in /usr/share/icingaweb2/modules/vspheredb/library/Vspheredb/Daemon/Daemon.php:64
Stack trace:
#0 /usr/share/icingaweb2/modules/vspheredb/library/Vspheredb/Daemon/Daemon.php(57): Icinga\Module\Vspheredb\Daemon\Daemon->detectProcessInfo()
#1 /usr/share/icingaweb2/modules/vspheredb/application/clicommands/DaemonCommand.php(25): Icinga\Module\Vspheredb\Daemon\Daemon->__construct()
#2 /usr/share/php/Icinga/Cli/Loader.php(265): Icinga\Module\Vspheredb\Clicommands\DaemonCommand->runAction()
#3 /usr/share/php/Icinga/Application/Cli.php(152): Icinga\Cli\Loader->dispatch()
#4 /usr/share/php/Icinga/Application/Cli.php(142): Icinga\Application\Cli->dispatchOnce()
#5 /usr/bin/icingacli(7): Icinga\Application\Cli->dispatch()
#6 {main}
thrown in /usr/share/icingaweb2/modules/vspheredb/library/Vspheredb/Daemon/Daemon.php on line 64

Fatal error: Uncaught Error: Call to undefined function Icinga\Module\Vspheredb\Daemon\posix_getpid() in /usr/share/icingaweb2/modules/vspheredb/library/Vspheredb/Daemon/Daemon.php:64
Stack trace:
#0 /usr/share/icingaweb2/modules/vspheredb/library/Vspheredb/Daemon/Daemon.php(57): Icinga\Module\Vspheredb\Daemon\Daemon->detectProcessInfo()
#1 /usr/share/icingaweb2/modules/vspheredb/application/clicommands/DaemonCommand.php(25): Icinga\Module\Vspheredb\Daemon\Daemon->__construct()
#2 /usr/share/php/Icinga/Cli/Loader.php(265): Icinga\Module\Vspheredb\Clicommands\DaemonCommand->runAction()
#3 /usr/share/php/Icinga/Application/Cli.php(152): Icinga\Cli\Loader->dispatch()
#4 /usr/share/php/Icinga/Application/Cli.php(142): Icinga\Application\Cli->dispatchOnce()
#5 /usr/bin/icingacli(7): Icinga\Application\Cli->dispatch()
#6 {main}
thrown in /usr/share/icingaweb2/modules/vspheredb/library/Vspheredb/Daemon/Daemon.php on line 64
`

madmax01 · 2019-12-29T18:29:05Z

Somehow I got the service to start, but I am not able to add a vCenter. Everything is blank below "Create a new vCenter/ESXi-Connection".

Thomas-Gelf · 2019-12-29T19:06:23Z

@madmax01: please check the requirements section in our installation documentation

guldil · 2020-01-04T13:58:32Z

I have the same issue: the service always restarts at "Running DB cleanup (this could take some time)", but if I run "icingacli vspheredb daemon run --debug" it works.

guldil · 2020-01-05T09:44:48Z

I found a solution: change WatchdogSec=10 to WatchdogSec=360 in /etc/systemd/system/icinga-vspheredb.service, then run systemctl daemon-reload and systemctl start icinga-vspheredb.
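
The change described above could be scripted roughly as follows. This is only a sketch: the helper name set_watchdog_sec is made up for illustration, it assumes GNU sed and a unit file that already contains a WatchdogSec= line, and the commented usage lines need root.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the WatchdogSec= line of a systemd unit file.
set_watchdog_sec() {
    unit_file=$1
    seconds=$2
    # Replace the existing WatchdogSec= line in place (GNU sed -i).
    sed -i "s/^WatchdogSec=.*/WatchdogSec=${seconds}/" "$unit_file"
}

# Usage with the path and value from this thread (run as root):
# set_watchdog_sec /etc/systemd/system/icinga-vspheredb.service 360
# systemctl daemon-reload
# systemctl start icinga-vspheredb
```

Lines other than WatchdogSec= are left untouched, so the edit is easy to revert once a fixed release is installed.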

stultitiophobia · 2020-01-07T20:49:13Z

Can confirm the fix is working here. Thanks for that!

slasse · 2020-01-09T08:33:36Z

We had the same problem and can also confirm the fix.

Wintermute2k6 · 2020-01-10T08:31:06Z

We can also confirm this as a working fix for the issue.

hhamester · 2020-03-12T09:00:16Z

Hi,

Does anyone else have the same problem?

# systemctl status icinga-vspheredb.service

● icinga-vspheredb.service - Icinga vSphereDB Daemon
     Loaded: loaded (/etc/systemd/system/icinga-vspheredb.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2020-03-12 07:28:23 CET; 26min ago
       Docs: https://icinga.com/docs/icinga-vsphere/latest/
   Main PID: 2156724 (icingacli)
     Status: "DB has been cleaned up"
      Tasks: 1 (limit: 9490)
     Memory: 13.3M
     CGroup: /system.slice/icinga-vspheredb.service
             └─2156724 Icinga::vSphereDB::main: 3 active runners

Mar 12 07:28:23 98lipmoni3 systemd[1]: Started Icinga vSphereDB Daemon.
Mar 12 07:32:42 98lipmoni3 icingacli[2156724]: Got invalid NetString data:
Mar 12 07:32:42 98lipmoni3 icingacli[2156724]: Fatal error: Uncaught Error: Class 'SoapVar' not found in /usr/share/icingaweb2/mod>
Mar 12 07:32:42 98lipmoni3 icingacli[2156724]: Got invalid NetString data:
Mar 12 07:32:42 98lipmoni3 icingacli[2156724]: Fatal error: Uncaught Error: Class 'SoapVar' not found in /usr/share/icingaweb2/mod>
Mar 12 07:32:42 98lipmoni3 icingacli[2156724]: Got invalid NetString data:
Mar 12 07:32:42 98lipmoni3 icingacli[2156724]: Fatal error: Uncaught Error: Class 'SoapVar' not found in /usr/share/icingaweb2/mod>
Mar 12 07:32:42 98lipmoni3 icingacli[2156724]: Server for vCenterID=2 failed, will try again in 30 seconds
Mar 12 07:32:42 98lipmoni3 icingacli[2156724]: Server for vCenterID=4 failed, will try again in 30 seconds
Mar 12 07:32:42 98lipmoni3 icingacli[2156724]: Server for vCenterID=6 failed, will try again in 30 seconds

# icingacli vspheredb daemon run --trace --debug

Got invalid NetString data: 
Fatal error: Uncaught Error: Class 'SoapVar' not found in /usr/share/icingaweb2/modules/vspheredb/l[..] truncated 1120 bytes [..] a\Module\Vspheredb in /usr/share/icingaweb2/modules/vspheredb/library/Vspheredb/Api.php on line 410
Got invalid NetString data: 
Fatal error: Uncaught Error: Class 'SoapVar' not found in /usr/share/icingaweb2/modules/vspheredb/l[..] truncated 1120 bytes [..] a\Module\Vspheredb in /usr/share/icingaweb2/modules/vspheredb/library/Vspheredb/Api.php on line 410
Got invalid NetString data: 
Fatal error: Uncaught Error: Class 'SoapVar' not found in /usr/share/icingaweb2/modules/vspheredb/l[..] truncated 1120 bytes [..] a\Module\Vspheredb in /usr/share/icingaweb2/modules/vspheredb/library/Vspheredb/Api.php on line 410
Server for vCenterID=2 failed, will try again in 30 seconds
Pid 2158637 stopped
Server for vCenterID=4 failed, will try again in 30 seconds
Pid 2158638 stopped
Server for vCenterID=6 failed, will try again in 30 seconds
Pid 2158639 stopped
SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry '\x8D\xA3\xF8\xE6I\x00s\xFE\xBA\x18K#\xCE\x92=\x1C' for key 'PRIMARY', query was: INSERT INTO vspheredb_daemon (instance_uuid, ts_last_refresh, process_info, pid, fqdn, username, php_version) VALUES (?, ?, ?, ?, ?, ?, ?)
Database connection has been closed

ChristianMoritz · 2020-03-16T14:05:17Z

I tested the option with the watchdog timer, and it doesn't work for me.

I started the module with --debug, and after about 1 hour I got a status update:

"DB has been cleaned up"

A while later I got the next error:

root@smon03:/# systemctl status icinga-vspheredb
● icinga-vspheredb.service - Icinga vSphereDB Daemon
   Loaded: loaded (/etc/systemd/system/icinga-vspheredb.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-03-16 14:59:36 CET; 1min 41s ago
     Docs: https://icinga.com/docs/icinga-vsphere/latest/
 Main PID: 11396 (icingacli)
   Status: "DB has been cleaned up"
   CGroup: /system.slice/icinga-vspheredb.service
           ├─11396 Icinga::vSphereDB::main: 5 active runners
           ├─12977 Icinga::vSphereDB::sync (shv19call01)
           ├─12978 Icinga::vSphereDB::sync (shv06call01)
           ├─12980 Icinga::vSphereDB::sync (shvwerm01)
           ├─12982 Icinga::vSphereDB::sync (shv1911: Event Stream)
           └─12987 Icinga::vSphereDB::sync (svcs: VM DataStore Usage)

Mar 16 14:59:36 smon03 systemd[1]: Starting Icinga vSphereDB Daemon...
Mar 16 14:59:36 smon03 systemd[1]: Started Icinga vSphereDB Daemon.
Mar 16 15:00:39 smon03 icingacli[11396]: Task perfCounterInfo failed: SQLSTATE[22001]: String data, right truncated: 1406 Data too long for column 'summary' at row 1, query was: INSERT INTO performance_counter (vcenter_uuid, counter_key, name, group_name, unit_name, label, summary, rollup_type, stats_type, level, per_device_level) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)

Does anyone have a hint for me?

My environment:
icinga2: r2.11.3-1
icingaweb2: 2.7.3
vspheredb: 1.1.0
ipl: 0.5.0
incubator: 0.5.0
reactbundle: 0.7.0

UPDATE: after about 2 hours, the module is working fine again (without my doing anything).

wp-perc · 2020-03-31T09:36:46Z

I have the same issue. I'm not deep into how systemd works, but it seems like there is some kind of "pulse" the vspheredb daemon process must send to systemd to avoid being killed.

Cleaning up the daemon log table can take a very long time, depending on both how often you restart the vspheredb service and how many virtual centers are monitored.
Because of this, I don't feel very comfortable increasing the watchdog timeout: in case of failure, the unit will take a long time to be automatically restarted... Or am I wrong?

Therefore, a trade-off is needed: how often you restart the vspheredb service vs. how much log you want to keep.

That said, increasing the watchdog timeout to 600 seconds resolved it for me. But I'm not that happy.
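
As far as I understand it, this "pulse" is systemd's watchdog keep-alive: with WatchdogSec= set, the service has to keep sending WATCHDOG=1 notifications. systemd passes the limit to the service as WATCHDOG_USEC, and its documentation recommends pinging every half of that limit. Below is a rough shell sketch of the idea. It is not how the vspheredb daemon (which is PHP) implements it, the function names are made up, and calling systemd-notify from a script may require a suitable NotifyAccess= setting in the unit.

```shell
#!/bin/sh
# Seconds between keep-alive pings: half of the limit systemd exports as
# WATCHDOG_USEC (microseconds). Prints 0 when no watchdog is configured.
# Integer seconds only; this sketch assumes a limit of at least 2s.
heartbeat_interval_sec() {
    usec=${WATCHDOG_USEC:-0}
    echo $(( $usec / 2 / 1000000 ))
}

# Hypothetical keep-alive loop: ping systemd, sleep, repeat.
run_with_heartbeat() {
    interval=$(heartbeat_interval_sec)
    [ "$interval" -gt 0 ] || return 0   # not started with WatchdogSec=
    while :; do
        systemd-notify WATCHDOG=1       # resets the watchdog timer
        sleep "$interval"
    done
}
```

With WatchdogSec=10 this would ping every 5 seconds. If the process that is supposed to ping is blocked inside one long-running call, no ping goes out in time, which would be consistent with the kills during the DB cleanup reported here.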

Thomas-Gelf · 2020-04-29T01:50:52Z

This has been fixed, see #138 for related commits. Please upgrade to the current master (or the upcoming v1.2.0 release), apply schema migrations, restore the former watchdog setting and restart your daemon.

log1-c mentioned this issue Feb 20, 2020

icinga-vspheredb.service failed to start #150

Closed

Thomas-Gelf added bug duplicate labels Apr 29, 2020

Thomas-Gelf self-assigned this Apr 29, 2020

Thomas-Gelf added this to the v1.2.0 milestone Apr 29, 2020

Thomas-Gelf closed this as completed Apr 29, 2020

Decstasy mentioned this issue May 5, 2021

Startup / stability problems caused by vspheredb_daemonlog #253

Closed
