Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[rfc] kodi crashlogs #3657

Merged
merged 3 commits into from
Jan 4, 2015
Merged

[rfc] kodi crashlogs #3657

merged 3 commits into from
Jan 4, 2015

Conversation

stefansaraev
Copy link
Contributor

on ubuntu, when kodi crashes, it dumps a nice crashlog. crashlogs help us and kodi devs to find out WHY..

the kodi.sh script included in this PR is a shameless rip off, cleaned up a bit ;)

I have tested on x86_64 and so far all good (simulating crash via kill -SEGV). but this needs testing. on RPi/Cuboxi-i with slow sdcard it might take quite lot of time to dump a 300+ MB core. also sh*t may happen on netbooted clients (but we shouldnt care much for those, anyway) EDIT: oh, and those people who install x86_64 on usb sticks, too...

please dont merge (yet), this is for after OE 5 and should not go to 5.0 branch.

@sraue any comments?

//cc @fritsch

@popcornmix
Copy link

Excellent. I've wanted this feature for a long while. Will test on Pi.

@fritsch
Copy link
Contributor

fritsch commented Dec 2, 2014

That's excellent in deed! yes. +10!

@MilhouseVH
Copy link
Contributor

Works well on the Pi. Example output when simulating a seg fault can be seen here.

It takes about 30 seconds to generate the crash log on a 1GHz Model B R-Pi (384MB/128MB memory split, with 128MB swap, and /storage on USB3 flash).

It takes about 23 seconds to generate the crash log on a 1GHz Model B R-Pi (256MB/256MB memory split, with 128MB swap, and /storage on USB3 flash).

@MilhouseVH
Copy link
Contributor

What might be handy is having a sym link of /storage/.kodi/temp/kodi_crashlog to the last generated kodi_crashlog-*.log, so that it will be easier to get users to pastebin their last crashlog (getting them to try and work out the filename could be more trouble than it's worth).

Edit: Ah, so this only ever keeps a maximum of one crash log. It still might be easier having a sym link, or just give the single crashlog file that will ever exist a consistent file name (similar to kodi.log) as the datestamp is present within the file.

@stefansaraev
Copy link
Contributor Author

done @MilhouseVH
side note. this does NOT affect official (release) builds, as we dont install gdb on those. if we decide to do so, I can make this run only if "debugging" is enabled.

@MilhouseVH
Copy link
Contributor

Thanks.

So it seems that release builds (without gdb) may accumulate an unlimited number of core dumps, as single_stacktrace() will not be called which is where the core cleanup occurs - could this result in a potential space issue?

Perhaps only the last core should be retained in release builds (I'm assuming the core from a release build may still be useful to a developer if uploaded to dropbox etc.).

@stefansaraev
Copy link
Contributor Author

So it seems that release builds (without gdb) may accumulate an unlimited number of core dumps, as single_stacktrace() will not be called which is where the core cleanup occurs - could this result in a potential space issue?

nope. no coredumps will be stored at all if gdb is not installed.

"max core file size" (ulimit -c) for every process/daemon is 0 by default, unless it is set via "ulimit -c" or LimitCORE=.. in daemon's systemd unit (kodi.service)

https://github.com/OpenELEC/OpenELEC.tv/pull/3657/files#diff-28d0db85438a3e221913f31ca2c50c34R75

@MilhouseVH
Copy link
Contributor

Smashing, thanks.

ulimit -c unlimited
fi

/usr/lib/kodi/kodi.bin "$SAVED_ARGS"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be:

/usr/lib/kodi/kodi.bin $SAVED_ARGS

By quoting, the collection of args is treated as a single arg (and then you get no lircd with this commit)

@MilhouseVH
Copy link
Contributor

@stefansaraev, had a report of this PR causing a problem when starting Retroarch - any ideas?

http://forum.kodi.tv/showthread.php?tid=192380&pid=1855429#pid1855429

@stefansaraev
Copy link
Contributor Author

sure. when you call "systemctl stop kodi" - it basicaly commits suicide, as I have removed KillMode=process

@MilhouseVH
Copy link
Contributor

Yes, adding KillMode=process allows retroarch to start successfully, but now kodi.bin isn't terminated by "systemctl stop kodi" (kodi.sh has gone).

@stefansaraev
Copy link
Contributor Author

Yes, adding KillMode=process allows retroarch to start successfully, but now kodi.bin isn't terminated by "systemctl stop kodi" (kodi.sh has gone).

exactly.

why do "kodi" addons want to stop kodi ?

@MilhouseVH
Copy link
Contributor

I guess that's a good question, but not one I can answer.

@walterheisenberg
Copy link

because of loading RetroArch to play a Game
http://openelec.tv/forum/128-addons/72972-retroarch-addon-arm-rpi
btw: I'm Solo0815 from the Kodi-Forums

@stefansaraev
Copy link
Contributor Author

@walterheisenberg I understand it but "systemctl stop kodi" from an addon that is run from kodi is plain wrong (EDIT: it has always been wrong. not just now..)

people should use systemd-run to start their stuff in different cgroup, if they want to stop kodi and have their external stuff working

example:

openelec:~/.kodi/addons/test.test # cat default.py
import xbmc, xbmcaddon, subprocess, os

__settings__   = xbmcaddon.Addon(id='test.test')
__cwd__        = __settings__.getAddonInfo('path')
__start__      = xbmc.translatePath( os.path.join( __cwd__, 'bin', "test.start") )

subprocess.Popen(__start__, shell=True, close_fds=True)
openelec:~/.kodi/addons/test.test # cat bin/test.start
#!/bin/sh

systemd-run /storage/.kodi/addons/test.test/bin/launcher.sh
openelec:~/.kodi/addons/test.test # cat bin/launcher.sh
#!/bin/sh

systemctl stop kodi

# run whatever you want IN FOREGROUND. in my case "sleep 10"
sleep 10

systemctl start kodi

@MilhouseVH
Copy link
Contributor

Yep, changing /storage/.kodi/addons/emulator.tools.retroarch/bin/retroarch.sh

from

nohup /storage/.kodi/addons/emulator.tools.retroarch/bin/retroarch.start "$@" > /storage/.kodi/userdata/addon_data/emulator.tools.retroarch/config/nohup.out &

to

systemd-run /storage/.kodi/addons/emulator.tools.retroarch/bin/retroarch.start "$@"

fixes the issue - thanks @stefansaraev

@walterheisenberg
Copy link

OK, thank you guys for making that clear
I will change my scripts

@MilhouseVH
Copy link
Contributor

Next issue... :)

When selecting "Power off" from the GUI system menu, I believe the guisettings.xml is being corrupted by this PR - when the device (a Pi) is next booted, the guisettings.xml file is 0 bytes:

-rw-r--r--    1 root     root             0 Dec  7 12:56 /storage/.kodi/userdata/guisettings.xml

I've noticed that with this PR, there are two threads saving the settings on shutdown:

13:15:36 101.085030 T:3041517568   DEBUG: LIRC: Update - NEW at 85426:160 0 KEY_OK devinput (KEY_OK)
13:15:36 101.085411 T:3041517568   DEBUG: OnKey: 11 (0x0b) pressed, action is Select
13:15:36 101.103607 T:3041517568    INFO: Selected Logind/UPower as PowerSyscall
13:15:36 101.106201 T:3041517568   ERROR: DBus: Error org.freedesktop.DBus.Error.ServiceUnknown - The name org.freedesktop.UPower was not provided by any .service files
13:15:36 101.106422 T:3041517568    INFO: LogindUPowerSyscall - UPower not found, battery information will not be available
13:15:36 101.126610 T:3041517568   DEBUG: LogindUPowerSyscall - inhibit lock taken, fd 51
13:15:36 101.165619 T:3041517568   DEBUG: LogindUPowerSyscall - delay lock released
13:15:36 101.165916 T:3041517568   DEBUG: ------ Window Init (DialogBusy.xml) ------
13:15:36 101.169060 T:3041517568   DEBUG: LIRC: Update - NEW at 85510:160 0 KEY_OK_UP devinput (KEY_OK_UP)
13:15:37 101.409882 T:3031430208    INFO: Received SIGTERM...
13:15:37 101.410217 T:3031430208   DEBUG: CAnnouncementManager - Announcement: OnQuit from xbmc
13:15:37 101.410576 T:3031430208   DEBUG: GOT ANNOUNCEMENT, type: 8, from xbmc, message OnQuit
13:15:37 101.410965 T:3031430208  NOTICE: Storing total System Uptime
13:15:37 101.411972 T:3031430208  NOTICE: Saving settings
13:15:37 101.529541 T:3041517568   DEBUG: ------ Window Deinit (DialogButtonMenu.xml) ------
13:15:37 101.575813 T:3041517568    INFO: Received SIGTERM...
13:15:37 101.576088 T:3041517568   DEBUG: CAnnouncementManager - Announcement: OnQuit from xbmc
13:15:37 101.576347 T:3041517568   DEBUG: GOT ANNOUNCEMENT, type: 8, from xbmc, message OnQuit
13:15:37 101.576813 T:3041517568  NOTICE: Storing total System Uptime
13:15:37 101.577400 T:3041517568  NOTICE: Saving settings

but with the "old" kodi.service, there is only thread that saves the settings. Could the extra thread be trashing guisettings.xml?

I can't reproduce the guisettings.xml corruption with the old kodi.service, but with this new kodi.service it happens frequently.

@MilhouseVH
Copy link
Contributor

I've done some more testing (using autostart.sh to log the size of guisettings.xml to a file) and it's not always zero bytes, for instance I've seen guisettings.xml with a size of 12288 when it should be 25953+.

So I'd say there's a good chance the two threads are trampling on each other.

@stefansaraev
Copy link
Contributor Author

yep @MilhouseVH confirmed. will fix it. thanks ;)

EDIT: part of the problem is our "deadlock-on-exit fix". I forgot about it ;)

@stefansaraev
Copy link
Contributor Author

@MilhouseVH last commit should fix it.

@MilhouseVH
Copy link
Contributor

@stefansaraev - initial testing looks good, many thanks!

@MilhouseVH
Copy link
Contributor

@stefansaraev: There is another problem with guisettings.xml, but it's completely the opposite of the previous problem as the settings are now never saved on exit (power off, reboot, systemctl stop/restart etc.).

For instance, if you modify a setting with JSON, the modified setting will not be persisted in guisettings.xml on exit.

Example:

  1. Start with debug disabled in guisettings.xml.
  2. Restart kodi.bin - confirm debug is still disabled
  3. Enable debug using JSON:
    curl -sH 'Content-Type: application/json' -d '{"jsonrpc": "2.0", "params": {"setting": "debug.showloginfo", "value": true}, "method": "Settings.SetSettingValue", "id": "libSetSetting"}' localhost:8080/jsonrpc
    (I'm assuming you have the webserver on 8080 etc. - change as necessary)
  4. Shutdown kodi.bin - settings should be saved, meaning debug remains enabled on re-start.
  5. Start kodi.bin - debug is still disabled, confirming settings were not saved on last exit. You can also tell that the file hasn't been updated by observing the timestamp on guisettings.xml.

Without this PR, the changes are correctly persisted on exit.

Another way to confirm the problem is through the lack of any OnQuit message when kodi.bin shuts down - this message is not present when this PR is used:

07:06:03 T:3040444416    INFO: Received SIGTERM...
07:06:03 T:3040444416   DEBUG: CAnnouncementManager - Announcement: OnQuit from xbmc
07:06:03 T:3040444416   DEBUG: GOT ANNOUNCEMENT, type: 8, from xbmc, message OnQuit
07:06:03 T:3040444416  NOTICE: Storing total System Uptime
07:06:03 T:3040444416  NOTICE: Saving settings

but appears once when this PR is not used.

@stefansaraev
Copy link
Contributor Author

huh. I am sure it was fine when I pushed the kill $MAINPID change. but I'll doublecheck tonight.

@MilhouseVH
Copy link
Contributor

Thanks.

There's definitely something weird going on. I'd be inclined to think that kodi.bin is being killed before it can gracefully shut down and save the settings and complete any other exit tasks. There are however times when it will save the settings on exit, but this seems to be the exception rather than the rule.

@stefansaraev
Copy link
Contributor Author

gotcha. that happens when one (me) reads the systemd docs between the lines. I was blind.

if you are curious, here is what happened: systemd sends TERM to $MAINPID (kodi.sh) then TERM to all remaining processes in the cgroup (including kodi.bin). as kodi.sh exits immediately on TERM, the service is considered "clean stopped" and systemd continues the reboot/shutdown sequence. if kodi.bin had chance to save guisettings.xml - that was by pure luck (facepalm. I am blind). it never happens on "systemctl stop kodi". by design. but reboot/shutdown is different story.

now kodi.sh waits for kodi.bin to exit. if it does not exit within the TimeoutStopSec interval. well... systemd sends KILL. it has been that way for long, to prevent "deadlocks" on shutdown/reboot

and well I will update our sigterm handler patch to guard against double sigterm. but that's outside of the scope of this PR.

@MilhouseVH
Copy link
Contributor

@stefansaraev testing with the updates and there's still a problem in some of the shutdown scenarios.

  1. Enable debug using JSON. Reboot from GUI power menu, and debug is not enabled on restart
  2. Enable debug using JSON. Power Off from GUI power menu, and debug is not enabled on restart
  3. Enable debug using JSON. systemctl stop/restart kodi - debug is enabled on restart

So it looks like there's a problem saving settings when using the GUI power menu options.

@stefansaraev
Copy link
Contributor Author

are you sure the changes to kodi.sh are applied or no modified kodi.service in .config ?
you should have now short lag (up to 5 seconds) when doing reboot from GUI. just like when calling systemctl restart.
look at kodi.old.log for "Saving ..."

or is there any rpi-specific patch that does "reboot from gui" in different way and might interfere?

@MilhouseVH
Copy link
Contributor

are you sure the changes to kodi.sh are applied or no modified kodi.service in .config

I'm sure the changes are there and there is no kodi.service in .config (except when I'm testing the old service).

or is there any rpi-specific patch that does "reboot from gui" in different way and might interfere?

Not that I'm aware of.

d0f7237 which went in on the 14th did cause a problem with the Pi's "virtual suspend" - these patches are not yet upstream however I've tested a build without these "virual suspend" patches. I tested a build using only b7d591d plus d0f7237 and the latest PR3657/SIGTERM and it made no difference - still no settings saved with the new service/new SIGTERM patch, or the old service/new SIGTERM.

It may have something to do with the change you made to the SIGTERM patch as I'm not seeing Received SIGTERM... even when using the old service in a new build. The old service used to work reliably but in #1215+ using the updated SIGTERM patch, even the old service no longer works as the SIGTERM signal is never "seen" (nothing in the logs).

Here's four logs - two from my last test build #1215 (latest PR3657/latest SIGTERM) and two from the previous build #1214 (previous PR3657/previous SIGTERM).

In all cases I'm enabling debug via json, then rebooting from the GUI.

#1215, New Service (settings not saved on reboot):

http://sprunge.us/AUJJ

rpi512:~/.config # md5sum /usr/lib/kodi/kodi.sh /usr/lib/systemd/system/kodi.service
052c0666904e78bf442712dc8e02cb4b  /usr/lib/kodi/kodi.sh
4ad44e472aef59196c77d4b8ec993bda  /usr/lib/systemd/system/kodi.service

rpi512:~ # systemctl status kodi
● kodi.service - Kodi Media Center
   Loaded: loaded (/usr/lib/systemd/system/kodi.service; disabled)
   Active: active (running) since Tue 2014-12-16 09:48:11 GMT; 34s ago
  Process: 425 ExecStartPre=/usr/lib/kodi/kodi-config (code=exited, status=0/SUCCESS)
 Main PID: 430 (kodi.sh)
   CGroup: /system.slice/kodi.service
           ├─430 /bin/sh /usr/lib/kodi/kodi.sh --standalone -fs --lircdev /run/lirc/lircd
           └─441 /usr/lib/kodi/kodi.bin --standalone -fs --lircdev /run/lirc/lircd

Dec 16 09:48:34 rpi512 kodi.sh[430]: hwclock: can't open '/dev/misc/rtc': No such file or directory
#1215, Old Service (settings not saved on reboot):

http://sprunge.us/Befj

rpi512:~ # systemctl status kodi
● kodi.service - Kodi Media Center
   Loaded: loaded (/storage/.config/system.d/kodi.service; disabled)
   Active: active (running) since Tue 2014-12-16 09:52:30 GMT; 36s ago
  Process: 423 ExecStartPre=/usr/lib/kodi/kodi-config (code=exited, status=0/SUCCESS)
 Main PID: 428 (kodi.bin)
   CGroup: /system.slice/kodi.service
           └─428 /usr/lib/kodi/kodi.bin --standalone -fs --lircdev /run/lirc/lircd

Dec 16 09:52:55 rpi512 sh[428]: hwclock: can't open '/dev/misc/rtc': No such file or directory
#1214, New Service (settings not saved on reboot):

http://sprunge.us/SCPH

rpi512:~ # systemctl status kodi
● kodi.service - Kodi Media Center
   Loaded: loaded (/usr/lib/systemd/system/kodi.service; disabled)
   Active: active (running) since Tue 2014-12-16 10:09:40 GMT; 1min 20s ago
  Process: 421 ExecStartPre=/usr/lib/kodi/kodi-config (code=exited, status=0/SUCCESS)
 Main PID: 426 (kodi.sh)
   CGroup: /system.slice/kodi.service
           ├─426 /bin/sh /usr/lib/kodi/kodi.sh --standalone -fs --lircdev /run/lirc/lircd
           └─437 /usr/lib/kodi/kodi.bin --standalone -fs --lircdev /run/lirc/lircd

Dec 16 10:10:05 rpi512 kodi.sh[426]: hwclock: can't open '/dev/misc/rtc': No such file or directory
#1214, Old Service (settings saved on reboot):

http://sprunge.us/VTVE

rpi512:~ # systemctl status kodi
● kodi.service - Kodi Media Center
   Loaded: loaded (/storage/.config/system.d/kodi.service; disabled)
   Active: active (running) since Tue 2014-12-16 10:00:20 GMT; 1min 16s ago
  Process: 423 ExecStartPre=/usr/lib/kodi/kodi-config (code=exited, status=0/SUCCESS)
 Main PID: 428 (kodi.bin)
   CGroup: /system.slice/kodi.service
           └─428 /usr/lib/kodi/kodi.bin --standalone -fs --lircdev /run/lirc/lircd

Dec 16 10:00:45 rpi512 sh[428]: hwclock: can't open '/dev/misc/rtc': No such file or directory

With the #1215 build (old or new service, new SIGTERM) and #1214 build (new service, old SIGTERM) there is no OnQuit message, nor is there a Received SIGTERM... message in the log - kodi.bin just dies, and settings are not saved.

Only with the #1214 build (old service, old SIGTERM) do I see the OnQuit and SIGTERM messages being processed/received.

Given that the latest #1215 build no longer works correctly with the "old" service, I strongly suspect the updated SIGTERM handler is not working as expected.

@MilhouseVH
Copy link
Contributor

With 6597a82 (ie. SIGTERM update, reverted) this is now testing very well.

A new build (ie. new service, old SIGTERM) seems to be working well - no duplicate attempts to save settings (causing corruption), and always one attempt to save settings on all of the following shutdowns:

  • systemctl stop kodi (you said it should, 'by design'?)
  • systemctl restart kodi
  • GUI Power Off
  • GUI Reboot
  • JSON Application.Quit

Will test more, but thumbs up so far! Hopefully that's it nailed. :)

@stefansaraev
Copy link
Contributor Author

now. after xbmc/xbmc#5895 we have a problem. a big one, that will affect lot of stuff (not only this PR).

we will need:
http://sprunge.us/HJgV
or:
http://sprunge.us/YZEL

both tested and work. I'd prefer first one, but it's ugly... (XDG_DATA_HOME= )

the .xbmc -> .kodi move was enough pain. this xdg nonsense (sorry for the wording) will be even more pain.. so up to you. ping @sraue

@stefansaraev
Copy link
Contributor Author

ok, xbmc/5895 was reverted. IMO we are good to go. when (or if..) it is added back upstream, we will take care, it is unrelated to this PR.

@sraue any objections? this has been well tested and should be safe (for master only)

@sraue
Copy link
Contributor

sraue commented Jan 3, 2015

yeah please commit if you are sure its ok, on master i will stick with helix for some time or readd support for building master or helix based on make options. master will become 5.2 soon (with kernel 3.18/19, systemd 218, python 2.7.9 and mesa 10.4)

stefansaraev added a commit that referenced this pull request Jan 4, 2015
@stefansaraev stefansaraev merged commit bb70949 into OpenELEC:master Jan 4, 2015
@stefansaraev stefansaraev deleted the crashlogs branch January 4, 2015 01:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants