Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Z-Way crashes on WebSocket push. #101

Open
PoltoS opened this issue Feb 24, 2016 · 16 comments
Open

Z-Way crashes on WebSocket push. #101

PoltoS opened this issue Feb 24, 2016 · 16 comments
Assignees

Comments

@PoltoS
Copy link
Member

PoltoS commented Feb 24, 2016

Hardly repeatable, pretty rare. Happens in mongose:

SIGPIPE on send:

Program received signal SIGPIPE, Broken pipe.
(gdb) backtrace
#0  0xb69882f4 in send () at ../sysdeps/unix/syscall-template.S:81
#1  0xb5af9b9c in ns_write_to_socket(ns_connection*) () from ./modules/modws.so
#2  0xb5afa478 in ns_mgr_poll () from ./modules/modws.so
#3  0xb5b0612c in mg_poll_server () from ./modules/modws.so
#4  0xb5ae86b0 in zwjs::WebServerPortContext::WebServerFunc(ZRefCountedPointer<zwjs::Thread>) () from ./modules/modws.so
#5  0xb6f87264 in zwjs::Thread::ThreadFuncNative(void*) () from ./libs/libzwayjs.so
#6  0xb697fe90 in start_thread (arg=0xb3dff450) at pthread_create.c:311
#7  0xb6436128 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

"man send" says:

EPIPE The local end has been shut down on a connection oriented socket. In this case the process will also receive a SIGPIPE unless MSG_NOSIGNAL is set.

@ronie-z-wave
Copy link

mhmm ... I think it happens after a specific count of vdev metrics changes

I've observed this crash with apps (since yet I'm not able to force it):

  • SensorsPolling
  • Astronomy
  • (NotificationMail)
  • (NotificationPushover)

Today I've added the Astronomy module from store (needs active Base Module) and after an hour (and lots of changes by the device created by this module) the server crashes with the same error

please have also a look at the log file:

ws_push_error_astronomy.txt

but here is a different error:

(gdb) bt
#0  0xb5798460 in zwjs::SocketConnection::IsConfigured() const () from ./modules/modsockets.so
#1  0xb579a760 in zwjs::SocketsContext::Poll(int) () from ./modules/modsockets.so
#2  0xb5799960 in zwjs::SocketsContext::ServerFunc(ZRefCountedPointer<zwjs::Thread>) () from ./modules/modsockets.so
#3  0xb6f87264 in zwjs::Thread::ThreadFuncNative(void*) () from ./libs/libzwayjs.so
#4  0xb697fe90 in start_thread (arg=0xb2bff450) at pthread_create.c:311
#5  0xb6436128 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

Also I did some changes in AutomationController last months affecting ws.push() ...

Z-Wave-Me/home-automation@08617c7#diff-1abed9a9f6d6a5c9957285958c12b3c0

Z-Wave-Me/home-automation@e997df8

Z-Wave-Me/home-automation@1f35aae

Z-Wave-Me/home-automation@d28a6af

Not sure if this matters ...

@ronie-z-wave
Copy link

now I've restarted the server multiple times during tests and v8 debugging on Notification
Pushover app and it has crashed with this error:

[2016-02-25 10:55:56.031] [I] [core] Loading module TagOnOff from modules/TagOnOff
[2016-02-25 10:55:56.048] [I] [core] Executing script: /*** TagOnOff Z-Way HA module ******************************************* ...
[2016-02-25 10:55:56.296] [I] [core] Instantiating module 26 from class TagOnOff
[2016-02-25 10:55:56.304] [I] [core] --- Starting module Tag devices with On/Off
[2016-02-25 10:55:57.064] [I] [core] Loading module HTTPDevice from modules/HTTPDevice
[2016-02-25 10:55:57.099] [I] [core] Executing script: /*** HTTPDevice Z-Way HA module ******************************************* ...
[2016-02-25 10:55:57.342] [I] [core] Loading module CustomUserCodeLoader from modules/CustomUserCodeLoader
[2016-02-25 10:55:57.357] [I] [core] Executing script:   /*** CustomUserCodeLoader ZAutomation module **************************************** ...
[2016-02-25 10:55:57.364] [I] [core] Loading module ThermostatDevice from modules/ThermostatDevice
[2016-02-25 10:55:57.674] [I] [core] Executing script: /*** ThermostatDevice Z-Way HA module ******************************************* ...
[2016-02-25 10:55:57.687] [I] [core] Loading module ClimateControl from userModules/ClimateControl
[2016-02-25 10:55:58.273] [I] [core] Executing script: /*** ClimateControl Z-Way HA module ******************************************* ...
[2016-02-25 10:55:58.313] [I] [core] Instantiating module 35 from class ClimateControl
[2016-02-25 10:55:58.324] [I] [core] --- Starting module Climate Control
[2016-02-25 10:55:58.957] [I] [core] Creating device sensorMultiline ClimateControl_35
[2016-02-25 10:55:59.002] [I] [core] Loading module RGB from modules/RGB
[2016-02-25 10:55:59.039] [I] [core] Executing script: /*** RGB Z-Way HA module ******************************************* ...
[2016-02-25 10:55:59.050] [I] [core] Loading module ScheduledScene from modules/ScheduledScene
[2016-02-25 10:55:59.316] [I] [core] Executing script: /*** ScheduledScene Z-Way HA module ******************************************* ...
[2016-02-25 10:55:59.334] [I] [core] Loading module AutoOff from modules/AutoOff
[2016-02-25 10:55:59.641] [I] [core] Executing script: /*** AutoOff Z-Way Home Automation module ************************************* ...
[2016-02-25 10:55:59.653] [I] [core] Loading module DeviceHistory from modules/DeviceHistory
 ...6-02-25 10:56:00.163] [I] [core] Executing script: /*** DeviceHistory Z-Way HA module *******************************************
[2016-02-25 10:56:00.192] [I] [core] Instantiating module 36 from class DeviceHistory
[2016-02-25 10:56:00.224] [I] [core] --- Starting module 24 Hours Device History
[2016-02-25 10:56:00.861] [I] [core] ---  histories polled
[2016-02-25 10:56:00.871] [I] [core] Loading module GlobalCache from modules/GlobalCache
[2016-02-25 10:56:00.896] [I] [core] Executing script: /*** GlobalCache Z-Way HA module ******************************************* ...
[2016-02-25 10:56:01.214] [I] [core] Loading module FosCam9828 from modules/FosCam9828
[2016-02-25 10:56:01.250] [I] [core] Executing script: /*** FosCam9828 Z-Way HA module ******************************************* ...
[2016-02-25 10:56:01.587] [I] [core] Loading module SwitchControlGenerator from modules/SwitchControlGenerator
[2016-02-25 10:56:01.635] [I] [core] Executing script: /*** SwitchControlGenerator Z-Way HA module ******************************************* ...
[2016-02-25 10:56:01.926] [I] [core] Instantiating module 24 from class SwitchControlGenerator
[2016-02-25 10:56:01.984] [I] [core] --- Starting module Trap events from Remotes and Sensors
[2016-02-25 10:56:02.069] [I] [core] Creating device switchControl ZWayVDev_zway_Remote_80-0-0-B
[2016-02-25 10:56:02.122] [I] [core] Creating device switchControl ZWayVDev_zway_Remote_156-0-0-B
[2016-02-25 10:56:02.143] [I] [core] Creating device switchControl ZWayVDev_zway_Remote_167-1-1-B
[2016-02-25 10:56:02.599] [I] [core] Loading module Notification from modules/Notification
[2016-02-25 10:56:02.621] [I] [core] Executing script: /*** Notification Z-Way HA module ******************************************* ...
[2016-02-25 10:56:02.912] [I] [core] Loading module Sonos from modules/Sonos
[2016-02-25 10:56:02.957] [I] [core] Executing script: /*** Sonos Z-Way HA module ******************************************* ...
[2016-02-25 10:56:03.324] [I] [core] Instantiating module 29 from class Sonos
[2016-02-25 10:56:03.332] [I] [core] --- Starting module Sonos
[New Thread 0xb1bff450 (LWP 28111)]
[2016-02-25 10:56:03.433] [I] [core] Waiting for Sonos to send announcement
[2016-02-25 10:56:03.456] [I] [core] Binding Sonos notifier
[2016-02-25 10:56:03.469] [I] [core] Loading module NotificationMail from userModules/NotificationMail
[2016-02-25 10:56:03.500] [I] [core] Executing script: /*** NotificationMail Z-Way HA module ******************************************* ...
[2016-02-25 10:56:03.820] [I] [core] Instantiating module 34 from class NotificationMail
[2016-02-25 10:56:03.832] [I] [core] --- Starting module Mail Notifications
[2016-02-25 10:56:03.919] [I] [core] Loading module LightScene from modules/LightScene
[2016-02-25 10:56:03.943] [I] [core] Executing script: /*** LightScene Z-Way HA module ******************************************* ...
[2016-02-25 10:56:04.296] [I] [core] Instantiating module 17 from class LightScene
[2016-02-25 10:56:04.311] [I] [core] --- Starting module Scene
[2016-02-25 10:56:04.383] [I] [core] Creating device toggleButton LightScene_17
[2016-02-25 10:56:04.430] [I] [core] Loading module NotificationSMSru from modules/NotificationSMSru
[2016-02-25 10:56:04.452] [I] [core] Executing script: /*** NotificationSMSru Z-Way HA module ******************************************* ...
[2016-02-25 10:56:04.737] [I] [core] Loading module Astronomy from userModules/Astronomy
[2016-02-25 10:56:04.757] [I] [core] Executing script: /*** Astronomy Z-Way HA module ******************************************* ...
[2016-02-25 10:56:04.780] [D] [zway] Job 0x07: deleted from queue
[2016-02-25 10:56:04.961] [D] [zway] Job 0x06: deleted from queue
[2016-02-25 10:56:04.975] [D] [zway] Job 0x17: deleted from queue
[2016-02-25 10:56:04.989] [D] [zway] Job 0xd2: deleted from queue
[2016-02-25 10:56:05.003] [D] [zway] Job 0x20: deleted from queue
[2016-02-25 10:56:05.018] [D] [zway] Job 0x05: deleted from queue
[2016-02-25 10:56:05.032] [D] [zway] Job 0x15: deleted from queue
[2016-02-25 10:56:05.056] [D] [zway] Job 0x56: deleted from queue
[2016-02-25 10:56:05.122] [D] [zway] Job 0x02: deleted from queue
[2016-02-25 10:56:05.126] [D] [zway] Job 0x62: deleted from queue
[2016-02-25 10:56:05.149] [D] [zway] Job 0x62: deleted from queue
[2016-02-25 10:56:05.163] [D] [zway] Job 0x62: deleted from queue
[2016-02-25 10:56:05.172] [I] [core] Instantiating module 37 from class Astronomy
[2016-02-25 10:56:05.183] [I] [core] --- Starting module Astronomy Module
[2016-02-25 10:56:05.198] [D] [zway] Job 0x62: deleted from queue
[2016-02-25 10:56:05.222] [D] [zway] Job 0x62: deleted from queue
[2016-02-25 10:56:05.247] [D] [zway] Job 0x62: deleted from queue
[2016-02-25 10:56:05.271] [D] [zway] Job 0x62: deleted from queue
[2016-02-25 10:56:05.296] [D] [zway] Job 0x62: deleted from queue
[2016-02-25 10:56:05.320] [D] [zway] Job 0x41: deleted from queue
[2016-02-25 10:56:05.354] [I] [core] Executing script: /* ...
[2016-02-25 10:56:05.367] [D] [zway] Job 0x41: deleted from queue
[2016-02-25 10:56:05.412] [D] [zway] Job 0x41: deleted from queue
[2016-02-25 10:56:05.468] [D] [zway] Job 0x41: deleted from queue
[2016-02-25 10:56:05.533] [D] [zway] Job 0x41: deleted from queue
[2016-02-25 10:56:05.578] [D] [zway] Job 0x41: deleted from queue
[2016-02-25 10:56:05.634] [D] [zway] Job 0x41: deleted from queue
[2016-02-25 10:56:05.685] [I] [core] Creating device sensorMultilevel Astronomy_37
[2016-02-25 10:56:05.690] [D] [zway] Job 0x41: deleted from queue
[2016-02-25 10:56:05.745] [D] [zway] Job 0x41: deleted from queue
[2016-02-25 10:56:05.774] [I] [core] [Astronomy] Calculate
[2016-02-25 10:56:05.790] [D] [zway] Job 0x80: deleted from queue
[2016-02-25 10:56:05.835] [D] [zway] Job 0x80: deleted from queue
[2016-02-25 10:56:05.881] [D] [zway] Job 0x80: deleted from queue
[2016-02-25 10:56:05.915] [D] [zway] Job 0x80: deleted from queue
[2016-02-25 10:56:05.949] [D] [zway] Job 0x80: deleted from queue
[2016-02-25 10:56:05.984] [D] [zway] Job 0x80: deleted from queue
[2016-02-25 10:56:06.018] [D] [zway] Job 0x80: deleted from queue
[2016-02-25 10:56:06.053] [D] [zway] Job 0x80: deleted from queue
[2016-02-25 10:56:06.088] [D] [zway] Job 0x80: deleted from queue
[2016-02-25 10:56:06.904] [I] [core] Loading module LogicalRules from modules/LogicalRules
 ...6-02-25 10:56:06.941] [I] [core] Executing script: /*** LogicalRules Z-Way HA module *******************************************
[2016-02-25 10:56:07.336] [I] [core] Instantiating module 22 from class LogicalRules
[2016-02-25 10:56:07.347] [I] [core] --- Starting module Logical Rule
[2016-02-25 10:56:07.476] [I] [core] Instantiating module 27 from class LogicalRules
[2016-02-25 10:56:07.481] [I] [core] --- Starting module Logical Rule
[2016-02-25 10:56:07.556] [I] [core] Loading module AutoLock from modules/AutoLock
[2016-02-25 10:56:07.577] [I] [core] Executing script: /*** AutoLock Z-Way Home Automation module ************************************* ...
[2016-02-25 10:56:07.879] [I] [core] Loading module SensorsPolling from modules/SensorsPolling
[2016-02-25 10:56:07.897] [I] [core] Executing script: /*** SensorsPolling Z-Way HA module ******************************************* ...
[2016-02-25 10:56:07.908] [I] [core] Loading module PoppCam from modules/PoppCam
[2016-02-25 10:56:08.275] [I] [core] Executing script: /*** PoppCam Z-Way HA module ******************************************* ...
[2016-02-25 10:56:08.291] [I] [core] Loading module InbandNotifications from modules/InbandNotifications
[2016-02-25 10:56:08.888] [I] [core] Executing script: /*** InbandNotifications Z-Way HA module ******************************************* ...
[2016-02-25 10:56:08.918] [I] [core] Instantiating module 3 from class InbandNotifications
[2016-02-25 10:56:08.940] [I] [core] --- Starting module Inband Notifier
[2016-02-25 10:56:09.055] [I] [core] Loading module FosCam9826 from modules/FosCam9826
[2016-02-25 10:56:09.086] [I] [core] Executing script: /*** FosCam9826 Z-Way HA module ******************************************* ...
[2016-02-25 10:56:09.384] [I] [core] Loading module OpenWeather from modules/OpenWeather
[2016-02-25 10:56:09.404] [I] [core] Executing script: /*** OpenWeather Extended Z-Way HA module ******************************************* ...
[2016-02-25 10:56:09.694] [I] [core] Loading module IfThen from modules/IfThen
[2016-02-25 10:56:09.711] [I] [core] Executing script: /*** IfThen Z-Way HA module ******************************************* ...
[2016-02-25 10:56:10.034] [I] [core] Instantiating module 18 from class IfThen
[2016-02-25 10:56:10.045] [I] [core] --- Starting module If -> Then
[2016-02-25 10:56:10.150] [I] [core] Instantiating module 19 from class IfThen
[2016-02-25 10:56:10.154] [I] [core] --- Starting module If -> Then
[2016-02-25 10:56:10.231] [I] [core] Starting automation...
[2016-02-25 10:56:10.368] [I] [core] Starting storage...
[2016-02-25 10:56:10.377] [I] [core] ZWay Automation started
[2016-02-25 10:56:11.011] [I] [core] Creating device sensorBinary ZWayVDev_zway_2-0-48-1
[2016-02-25 10:56:16.482] [D] [zway] Job 0x60: deleted from queue
[2016-02-25 10:56:16.503] [D] [zway] Job 0x03: deleted from queue
[2016-02-25 10:56:16.527] [D] [zway] Job 0x05: deleted from queue
[2016-02-25 10:56:55.808] [I] [core] Notification: device-info (device-OnOff): {"dev":"Flood Motion (2.0.48.1)","l":"off"}
[2016-02-25 10:56:56.088] [I] [core] Creating device battery ZWayVDev_zway_2-0-128
[2016-02-25 10:56:56.542] [I] [core] Creating device sensorBinary ZWayVDev_zway_2-0-156-5-A
[2016-02-25 10:56:56.586] [I] [core] Notification: device-info (device-OnOff): {"dev":"Flood Alarm (2.0.156.5)","l":"off"}
[2016-02-25 10:56:56.708] [I] [core] Creating device sensorMultilevel ZWayVDev_zway_2-2-49-1
[2016-02-25 10:56:56.746] [I] [core] Notification: device-info (device-temperature): {"dev":"Flood Temperature (2.2.49.1)","l":"22.2 °C"}
[2016-02-25 10:56:56.882] [I] [core] Creating device switchBinary ZWayVDev_zway_3-0-37
[2016-02-25 10:56:57.037] [I] [core] Notification: device-info (device-OnOff): {"dev":"Z-Wave.Me Switch (3.0)","l":"off"}
[2016-02-25 10:56:57.120] [I] [core] Creating device sensorBinary ZWayVDev_zway_3-0-48-2
[2016-02-25 10:56:57.146] [I] [core] Notification: device-info (device-OnOff): {"dev":"Z-Wave.Me Smoke (3.0.48.2)","l":"off"}
[2016-02-25 10:56:57.208] [I] [core] Creating device sensorBinary ZWayVDev_zway_3-0-48-8
[2016-02-25 10:56:57.236] [I] [core] Notification: device-info (device-OnOff): {"dev":"Z-Wave.Me Tamper (3.0.48.8)","l":"off"}
[2016-02-25 10:56:57.393] [I] [core] Creating device sensorBinary ZWayVDev_zway_3-0-113-1-2-A
[2016-02-25 10:56:57.425] [I] [core] Creating device sensorBinary ZWayVDev_zway_3-0-113-7-3-A
[2016-02-25 10:56:57.451] [I] [core] Creating device sensorBinary ZWayVDev_zway_3-0-113-9-1-A
[2016-02-25 10:56:57.531] [I] [core] Creating device battery ZWayVDev_zway_3-0-128
[2016-02-25 10:56:57.852] [I] [core] Creating device sensorMultilevel ZWayVDev_zway_4-0-49-1
[2016-02-25 10:56:57.881] [I] [core] Notification: device-info (device-temperature): {"dev":"Fibar Group Temperature (4.0.49.1)","l":"21.4 °C"}
[2016-02-25 10:56:58.192] [I] [core] Creating device battery ZWayVDev_zway_4-0-128
[2016-02-25 10:56:58.554] [I] [core] Creating device sensorBinary ZWayVDev_zway_4-0-113-1-2-A
[2016-02-25 10:56:58.579] [I] [core] Creating device sensorBinary ZWayVDev_zway_4-0-113-4-2-A
[2016-02-25 10:56:58.791] [I] [core] Creating device switchBinary ZWayVDev_zway_6-0-37
[2016-02-25 10:56:58.945] [I] [core] Notification: device-info (device-OnOff): {"dev":"Everspring Switch (6.0)","l":"on"}
[2016-02-25 10:56:59.046] [I] [core] Creating device sensorMultilevel ZWayVDev_zway_6-0-50-0
[2016-02-25 10:56:59.134] [I] [core] Creating device sensorMultilevel ZWayVDev_zway_6-0-50-2
[New Thread 0xb11ff450 (LWP 28112)]
[2016-02-25 10:56:59.468] [I] [core] [BaseModule-11] Set lastLevel to 26.983111154960746 for Astronomy_37 (was 26.905581824502356)
[Thread 0xb11ff450 (LWP 28112) exited]
[2016-02-25 10:56:59.668] [I] [core] after restart
[2016-02-25 10:56:59.679] [I] [core] Deleting device sensorMultiline ClimateControl_35
[2016-02-25 10:56:59.717] [I] [core] --- Stopping module Climate Control
[2016-02-25 10:56:59.722] [I] [core] --- Starting module Climate Control
[2016-02-25 10:56:59.802] [I] [core] Creating device sensorMultiline ClimateControl_35
[2016-02-25 10:56:59.842] [I] [core] Detected AV capable Sonos player on 192.168.10.205
[2016-02-25 10:56:59.854] [I] [core] Creating device switchBinary Sonos_Device_Play_192.168.10.205_29
[2016-02-25 10:56:59.873] [I] [core] Creating device switchMultilevel Sonos_Device_Volume_192.168.10.205_29

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb1bff450 (LWP 28111)]
0xb5798460 in zwjs::SocketConnection::IsConfigured() const () from ./modules/modsockets.so
(gdb) thread info
No symbol "info" in current context.
(gdb) info thread
  Id   Target Id         Frame
* 11   Thread 0xb1bff450 (LWP 28111) "zway/sockets" 0xb5798460 in zwjs::SocketConnection::IsConfigured() const () from ./modules/modsockets.so
  9    Thread 0xb23ff450 (LWP 28109) "v8:DbgAgntSessn" 0xb69880c0 in recv () at ../sysdeps/unix/syscall-template.S:81
  8    Thread 0xb2bff450 (LWP 28108) "zway/core" 0xb642f514 in select () at ../sysdeps/unix/syscall-template.S:81
  7    Thread 0xb35ff450 (LWP 28107) "zway/timers" 0xb6403d90 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  6    Thread 0xb3dff450 (LWP 28106) "zway/core" 0xb642f514 in select () at ../sysdeps/unix/syscall-template.S:81
  5    Thread 0xb4723450 (LWP 28105) "zway/core" 0xb69834b0 in __pthread_mutex_unlock_usercnt (mutex=0x1, decr=1) at pthread_mutex_unlock.c:57
  4    Thread 0xb4f23450 (LWP 28104) "v8:MsgDispHelpr" 0xb6986a40 in do_futex_wait (isem=isem@entry=0x5dc2c) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  3    Thread 0xb4f33450 (LWP 28103) "v8:SweeperThrea" 0xb6986a40 in do_futex_wait (isem=isem@entry=0x5db2c) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  2    Thread 0xb5733450 (LWP 28102) "zway" 0xb6987f60 in accept () at ../sysdeps/unix/syscall-template.S:81
  1    Thread 0xb634f000 (LWP 28099) "z-way-server" 0xb6403d90 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
(gdb) bt
#0  0xb5798460 in zwjs::SocketConnection::IsConfigured() const () from ./modules/modsockets.so
#1  0xb579a760 in zwjs::SocketsContext::Poll(int) () from ./modules/modsockets.so
#2  0xb5799960 in zwjs::SocketsContext::ServerFunc(ZRefCountedPointer<zwjs::Thread>) () from ./modules/modsockets.so
#3  0xb6f87264 in zwjs::Thread::ThreadFuncNative(void*) () from ./libs/libzwayjs.so
#4  0xb697fe90 in start_thread (arg=0xb1bff450, arg@entry=<error reading variable: Cannot access memory at address 0x38>) at pthread_create.c:311
#5  0xb6436128 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
/build/gdb-nrlhe3/gdb-7.7.1+dfsg/gdb/frame.c:472: internal-error: get_frame_id: Assertion `fi->this_id.p' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.
Quit this debugging session? (y or n)

@ronie-z-wave
Copy link

Another one ... looks a bit different:

Program received signal SIGPIPE, Broken pipe.
[Switching to Thread 0xb3dff450 (LWP 28576)]
0xb69882f4 in send () at ../sysdeps/unix/syscall-template.S:81
81      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) info thread
  Id   Target Id         Frame
  10   Thread 0xb23ff450 (LWP 28580) "zway/core" 0xb642f514 in select () at ../sysdeps/unix/syscall-template.S:81
  8    Thread 0xb2bff450 (LWP 28578) "zway/sockets" 0xb642f514 in select () at ../sysdeps/unix/syscall-template.S:81
  7    Thread 0xb33ff450 (LWP 28577) "zway/timers" 0xb6403d90 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
 6    Thread 0xb3dff450 (LWP 28576) "zway/core" 0xb69882f4 in send () at ../sysdeps/unix/syscall-template.S:81
  5    Thread 0xb4723450 (LWP 28575) "zway/core" 0xb6403d90 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  4    Thread 0xb4f23450 (LWP 28574) "v8:MsgDispHelpr" 0xb6986a40 in do_futex_wait (isem=isem@entry=0x5dc24) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  3    Thread 0xb4f33450 (LWP 28573) "v8:SweeperThrea" 0xb6986a40 in do_futex_wait (isem=isem@entry=0x5db24) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  2    Thread 0xb5733450 (LWP 28572) "zway" 0xb6987f60 in accept () at ../sysdeps/unix/syscall-template.S:81
  1    Thread 0xb634f000 (LWP 28569) "z-way-server" 0xb6403d90 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
(gdb) bt
#0  0xb69882f4 in send () at ../sysdeps/unix/syscall-template.S:81
#1  0xb5af9b9c in ns_write_to_socket(ns_connection*) () from ./modules/modws.so
#2  0xb5afa478 in ns_mgr_poll () from ./modules/modws.so
#3  0xb5b0612c in mg_poll_server () from ./modules/modws.so
#4  0xb5ae86b0 in zwjs::WebServerPortContext::WebServerFunc(ZRefCountedPointer<zwjs::Thread>) () from ./modules/modws.so
#5  0xb6f87264 in zwjs::Thread::ThreadFuncNative(void*) () from ./libs/libzwayjs.so
#6  0xb697fe90 in start_thread (arg=0xb3dff450, arg@entry=<error reading variable: Cannot access memory at address 0x4>) at pthread_create.c:311
#7  0xb6436128 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
/build/gdb-nrlhe3/gdb-7.7.1+dfsg/gdb/frame.c:472: internal-error: get_frame_id: Assertion `fi->this_id.p' failed.
A problem internal to GDB has been detected,
further debugging may prove unreliable.

@ronie-z-wave
Copy link

Stucked after some time using v8 debugger (not forced via browsers tap ~ idle) and DeviceHistory Module: poll every 1 min ....

it crashes about 50 min after restart ...

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb2bff450 (LWP 2509)]
0xb5798460 in zwjs::SocketConnection::IsConfigured() const () from ./modules/modsockets.so
(gdb) thread info
No symbol "info" in current context.
(gdb) ubfo thread
Undefined command: "ubfo".  Try "help".
(gdb) info thread
  Id   Target Id         Frame
  10   Thread 0xb23ff450 (LWP 2510) "zway/core" 0xb642f514 in select () at ../sysdeps/unix/syscall-template.S:81
* 9    Thread 0xb2bff450 (LWP 2509) "zway/sockets" 0xb5798460 in zwjs::SocketConnection::IsConfigured() const () from ./modules/modsockets.so
  7    Thread 0xb35ff450 (LWP 2507) "zway/timers" 0xb6403d90 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  6    Thread 0xb3dff450 (LWP 2506) "zway/core" 0xb642f514 in select () at ../sysdeps/unix/syscall-template.S:81
  5    Thread 0xb4723450 (LWP 2505) "zway/core" 0xb6403d90 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  4    Thread 0xb4f23450 (LWP 2504) "v8:MsgDispHelpr" 0xb6986a40 in do_futex_wait (isem=isem@entry=0x5dbf4) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  3    Thread 0xb4f33450 (LWP 2503) "v8:SweeperThrea" 0xb6986a40 in do_futex_wait (isem=isem@entry=0x5daf4) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  2    Thread 0xb5733450 (LWP 2502) "zway" 0xb6987f60 in accept () at ../sysdeps/unix/syscall-template.S:81
  1    Thread 0xb634f000 (LWP 2499) "z-way-server" 0xb6403d90 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
(gdb) bt
#0  0xb5798460 in zwjs::SocketConnection::IsConfigured() const () from ./modules/modsockets.so
#1  0xb579a760 in zwjs::SocketsContext::Poll(int) () from ./modules/modsockets.so
#2  0xb5799960 in zwjs::SocketsContext::ServerFunc(ZRefCountedPointer<zwjs::Thread>) () from ./modules/modsockets.so
#3  0xb6f87264 in zwjs::Thread::ThreadFuncNative(void*) () from /opt/z-way-server/libs/libzwayjs.so
#4  0xb697fe90 in start_thread (arg=0xb2bff450) at pthread_create.c:311
#5  0xb6436128 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

@ronie-z-wave
Copy link

ronie-z-wave commented Dec 12, 2016

@PoltoS , @pieceofsummer : Could please somebody of you guys have a look at this issue??

RPi 3 with latest v2.2.5 - UZB (5.06)

  Id   Target Id         Frame
 *11   Thread 0x72dff450 (LWP 5581) "zway/sockets" 0x757e750c in zwjs::SocketConnection::IsConfigured() const () from ./modules/modsockets.so
  10   Thread 0x735ff450 (LWP 5571) "zway/timers" 0x763ef360 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  8    Thread 0x742ff450 (LWP 5546) "zway/webserver" 0x763ef360 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  7    Thread 0x74ea6450 (LWP 5543) "zway/core" 0x763ef360 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
  6    Thread 0x74eb6450 (LWP 5537) "v8:SweeperThrea" 0x76972a40 in do_futex_wait (isem=isem@entry=0x64084) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  5    Thread 0x74ec6450 (LWP 5536) "v8:SweeperThrea" 0x76972a40 in do_futex_wait (isem=isem@entry=0x63f7c) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  4    Thread 0x74ed6450 (LWP 5535) "v8:SweeperThrea" 0x76972a40 in do_futex_wait (isem=isem@entry=0x63e74) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  3    Thread 0x74ee6450 (LWP 5534) "v8:SweeperThrea" 0x76972a40 in do_futex_wait (isem=isem@entry=0x63d6c) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  2    Thread 0x756e6450 (LWP 5533) "OptimizingCompi" 0x76972a40 in do_futex_wait (isem=isem@entry=0x63c24) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  1    Thread 0x76ff4000 (LWP 5484) "z-way-server" 0x763ef360 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
(gdb) bt
#0  0x757e750c in zwjs::SocketConnection::IsConfigured() const () from ./modules/modsockets.so
#1  0x757e980c in zwjs::SocketsContext::Poll(int) () from ./modules/modsockets.so
#2  0x757e8a0c in zwjs::SocketsContext::ServerFunc(ZRefCountedPointer<zwjs::Thread>) () from ./modules/modsockets.so
#3  0x76f8640c in zwjs::Thread::ThreadFuncNative(void*) () from ./libs/libzwayjs.so
#4  0x7696be90 in start_thread (arg=0x72dff450) at pthread_create.c:311
#5  0x76421598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92 from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb)

He's running a lot of Maros' apps.

I also observe regularly a killed z-way-server on my RPI B+ with UZB (5.16) v2.2.5+ ...
But I've no more apps running as the regular ones ...

There is also a german thread in forum.z-wave.me where a user has same problem with RPi B+, v2.2.5, RAZ 1 ... z-way-server is regularly killed after some hours ... we've questioned him about his apps in use

Thx.

@pz1
Copy link

pz1 commented Dec 12, 2016

There is also a german thread in forum.z-wave.me where a user has same problem with RPi B+, v2.2.5, RAZ 1 ... z-way-server is regularly killed after some hours ... we've questioned him about his apps in use

Maybe unrelated:
On my production Raspi2+UZB, the z-way-server always stops after a couple of days. Nothing irregular seen in the logs unfortunately.
I somehow suspect my UDP-send routines to OpenRemote .
Is there a general approach to debugging the "sudden" loss of z-way-server? IIRC a similar question has been asked before on the forum, but I haven't found it yet.

@ronie-z-wave
Copy link

There is also a german thread in forum.z-wave.me where a user has same problem with RPi B+, v2.2.5, RAZ 1 ... z-way-server is regularly killed after some hours ... we've questioned him about his apps in use

Here's the stack from the german guy:

0xb6402360 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
81	../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) info thread
  Id   Target Id         Frame 
  9    Thread 0xb1eff450 (LWP 1420) "zway/sockets" 0xb642d964 in select ()
    at ../sysdeps/unix/syscall-template.S:81
  6    Thread 0xb38ff450 (LWP 1417) "zway/timers" 0xb6402360 in nanosleep ()
    at ../sysdeps/unix/syscall-template.S:81
  5    Thread 0xb40ff450 (LWP 1416) "zway/core" 0xb642d964 in select ()
    at ../sysdeps/unix/syscall-template.S:81
  4    Thread 0xb4aff450 (LWP 1415) "zway/webserver" 0xb642d964 in select ()
    at ../sysdeps/unix/syscall-template.S:81
  3    Thread 0xb56d3450 (LWP 1414) "zway/core" 0xb6402360 in nanosleep ()
    at ../sysdeps/unix/syscall-template.S:81
  2    Thread 0xb56e3450 (LWP 1413) "v8:SweeperThrea" 0xb6985a40 in do_futex_wait (isem=isem@entry=0x63bb4) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
* 1    Thread 0xb6fef000 (LWP 1410) "z-way-server" 0xb6402360 in nanosleep ()
    at ../sysdeps/unix/syscall-template.S:81
(gdb) bt
#0  0xb6402360 in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1  0xb6402098 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#2  0x0000aeb4 in main ()

@PoltoS , @pieceofsummer can you please give some feedback?

@PoltoS
Copy link
Member Author

PoltoS commented Dec 15, 2016

We have made quite a lot of changes in 2.2.5, so the original issue seems to be irrelevant

Please show me how to reproduce the proble..

@ronie-z-wave
Copy link

Hmm how to reproduce - good question, think there are all different influences:

The german user for example has this setup:

Modul : RaZberry 1 Chip 500

Apps active on system:
Z-Wave Network Access (1)
Trap events from Remote and Sensors (1)
Remote Access (1)
System Clock (CRON) (1)
Periodic Sensor Polling (1)
Inband Notifier (1)
Battery Polling (1)
Sonos (1)
LOG external IP (1)
Logical Rule (2)
Open Remote Helpers
Smart Light
Scheduled Scene (2)
Astronomy (1)
Forecast.io Weather
Base Module (1)
Scene (1)
iF>Then (11)
Local Weather (1)
Pushover Notifications (2)
Load custom JavaScript file
Load custom JavaScript code
Automated Locking of Door
HTTP Device
SMS.ru Notifications
Sensor Values Logging
RGB Light from 3 Dimmers

All users have only start gdb debugger and waited till z-way-server crashes ...

What about OS or SD Card Class? Could this also influence such behaviour?
Maybe that could affect users that started with v2.2.0 and less ...

@pz1
Copy link

pz1 commented Dec 15, 2016

What about OS or SD Card Class?

FWIW:
I have done extensive reading with Google, because I had frequent corruptions of my Class 10 SD cards. Because there is not a lot of solid test data, I haven't made a systematic review. Some people experienced that lower class cards gave fewer corruption problems. The most serious concern was about the ext4 file system, and applications not properly flushing data to the SDCard (for the case of optimisation of Speed)

To solve the corruption problem, I first tried to use the syslog server on my Synology212+. Unfortunately the Syno could not cope with the data bombardment at zway startup.

Next I changed the configuration such that all logs were written to USB stick. No corruption since. Don't remember when I started that, but definitely over 6 months ago.

@ronie-z-wave
Copy link

@PoltoS , @pieceofsummer , @aivs

My log is now written to an USB stick, seems that this improves the behaviour on my RPi B+ (RaZ 2) ...
Crash rate seems to be reduced by ~80% (maybe more) ...

Im my case SD card is permanently stressed by z-way-server startup during developement :)

@davidc
Copy link

davidc commented Apr 22, 2017

Since upgrading to 2.3.0 a couple of months ago, this has been happening to me many times a day and causes the server to exit. I am now running 2.3.3 and it still happens.

Program received signal SIGPIPE, Broken pipe.
0x769772f4 in send () at ../sysdeps/unix/syscall-template.S:81
81      in ../sysdeps/unix/syscall-template.S
(gdb) bt
#0  0x769772f4 in send () at ../sysdeps/unix/syscall-template.S:81
#1  0x7631cb80 in mg_write_to_socket () from ./modules/modws.so
#2  0x7631d674 in mg_mgr_handle_conn () from ./modules/modws.so
#3  0x7631e1bc in mg_mgr_poll () from ./modules/modws.so
#4  0x76307ab0 in zwjs::ManagerContext::ThreadFunc(ZRefCountedPointer<zwjs::Thread>) ()
   from ./modules/modws.so
#5  0x76f86510 in zwjs::Thread::ThreadFuncNative(void*) () from ./libs/libzwayjs.so
#6  0x7696ee90 in start_thread (arg=0x651ff450) at pthread_create.c:311
#7  0x76424598 in ?? () at ../ports/sysdeps/unix/sysv/linux/arm/nptl/../clone.S:92
   from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) info thread
  Id   Target Id         Frame
  10   Thread 0x5dcff450 (LWP 28618) "zway/timers" 0x763f2360 in nanosleep ()
    at ../sysdeps/unix/syscall-template.S:81
  9    Thread 0x711ff450 (LWP 28596) "zway/core" 0x763f2360 in nanosleep ()
    at ../sysdeps/unix/syscall-template.S:81
* 8    Thread 0x651ff450 (LWP 28595) "zway/webserver" 0x769772f4 in send ()
    at ../sysdeps/unix/syscall-template.S:81
  7    Thread 0x74e87450 (LWP 28584) "zway/core" 0x76b6ce34 in v8::String::WriteUtf8(char*, int, int*, int) const () from ./libs/libv8.so
  6    Thread 0x74e97450 (LWP 28583) "v8:SweeperThrea" 0x76975a40 in do_futex_wait (
    isem=isem@entry=0x6476c) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  5    Thread 0x74ea7450 (LWP 28582) "v8:SweeperThrea" 0x76975a40 in do_futex_wait (
    isem=isem@entry=0x64664) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  4    Thread 0x74eb7450 (LWP 28581) "v8:SweeperThrea" 0x76975a40 in do_futex_wait (
    isem=isem@entry=0x6455c) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  3    Thread 0x75766450 (LWP 28580) "v8:SweeperThrea" 0x76975a40 in do_futex_wait (
    isem=isem@entry=0x64454) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  2    Thread 0x756b7450 (LWP 28579) "OptimizingCompi" 0x76975a40 in do_futex_wait 
    isem=isem@entry=0x6430c) at ../nptl/sysdeps/unix/sysv/linux/sem_wait.c:48
  1    Thread 0x76ff4000 (LWP 28276) "z-way-server" 0x763f2360 in nanosleep ()
    at ../sysdeps/unix/syscall-template.S:81

I do not have any apps besides the defaults.

@PoltoS
Copy link
Member Author

PoltoS commented Apr 22, 2017

This is known and fixed on 2.3.1. Now it is 2.3.4

@PoltoS
Copy link
Member Author

PoltoS commented Apr 22, 2017

In 2.3.3 gdb traps it but continue should resume the process as it ignores this signal now. Please try and fun the real reason for the process to crash

@PoltoS
Copy link
Member Author

PoltoS commented Apr 23, 2017

Do you mean that your crashes are due to bad alloc and SIG_PIPE is not the real reason?

@davidc
Copy link

davidc commented Apr 23, 2017

That's how it looks yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants