Segfault in Helix Master when accessing IFO over NFS #3692

Closed
MilhouseVH opened this issue Dec 14, 2014 · 13 comments

Comments

@MilhouseVH
Contributor

TrevorH reported this issue on #irc last night, while using a 32-bit x86 OpenELEC build (Atom d510).

I've been able to reproduce on a R-Pi with test build #1213.

This appears to be OpenELEC specific. @stefansaraev (seo) also took part in the conversation.

To reproduce
  1. Scan in an IFO over NFS (I'm using Universal Movie Scanner)
  2. Play it from the Library
  3. Result: SEGFAULT

Test setup

With current master, I added a folder imaginatively called "Test".

In sources.xml, it's added as an extra path on an existing source:

        <source>
            <name>My Movies</name>
            <path pathversion="1">nfs://192.168.0.3/mnt/share/media/Video/Comedy/</path>
            <path pathversion="1">nfs://192.168.0.3/mnt/share/media/Video/Documentaries/</path>
            <path pathversion="1">nfs://192.168.0.3/mnt/share/media/Video/MoviesSD/</path>
            <path pathversion="1">nfs://192.168.0.3/mnt/share/media/Video/MoviesHD/</path>
            <path pathversion="1">nfs://192.168.0.3/mnt/share/media/Video-Private/Comedy/</path>
            <path pathversion="1">nfs://192.168.0.3/mnt/share/media/Video-Private/Documentaries/</path>
            <path pathversion="1">nfs://192.168.0.3/mnt/share/media/Video-Private/MoviesSD/</path>
            <path pathversion="1">nfs://192.168.0.3/mnt/share/media/Video-Private/MoviesHD/</path>
            <path pathversion="1">nfs://192.168.0.3/mnt/share/media/Video-Private/Test/</path>
            <allowsharing>true</allowsharing>
        </source>

I had intended the VIDEO_TS and AUDIO_TS folders to be in a folder called "Hitchikers Guide - Extra", as this is what the DVD is; however, I messed up my cp command while creating the Test folder, which may have helped, who knows. The contents of the Test folder are as follows:

# ls -la Test
total 717604
drwxrwxr-x  4 neil  neil          9 Dec 13 23:11 ./
drwxr-xr-x  8 neil  neil          8 Sep 21 23:24 ../
drwxr-xr-x  2 neil  neil          2 Dec 13 23:11 AUDIO_TS/
-rw-rw-r--  1 neil  neil      79482 Feb  5  2013 Screamers (1995)[DVDRip]-fanart.jpg
-rw-rw-r--  1 neil  neil      60306 Mar 15  2014 Screamers (1995)[DVDRip]-logo.png
-rw-rw-r--  1 neil  neil     159241 Feb  5  2013 Screamers (1995)[DVDRip]-poster.jpg
-rw-rw-r--  1 neil  neil  733992960 Oct 18  2012 Screamers (1995)[DVDRip].avi
-rw-rw-r--  1 neil  neil       4809 Apr 22  2013 Screamers (1995)[DVDRip].nfo
drwxr-xr-x  2 neil  neil         61 Dec 13 23:13 VIDEO_TS/

And these are the contents of the VIDEO_TS folder (AUDIO_TS is empty):

# ls -la Test/VIDEO_TS
total 8021627
drwxr-xr-x  2 neil  neil          61 Dec 13 23:13 ./
drwxrwxr-x  4 neil  neil           9 Dec 13 23:11 ../
-rw-r--r--  1 neil  neil       24576 Dec 13 23:11 VIDEO_TS.BUP
-rw-r--r--  1 neil  neil       24576 Dec 13 23:11 VIDEO_TS.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:11 VIDEO_TS.VOB
-rw-r--r--  1 neil  neil       81920 Dec 13 23:12 VTS_01_0.BUP
-rw-r--r--  1 neil  neil       81920 Dec 13 23:13 VTS_01_0.IFO
-rw-r--r--  1 neil  neil   192430080 Dec 13 23:12 VTS_01_0.VOB
-rw-r--r--  1 neil  neil      991232 Dec 13 23:13 VTS_01_1.VOB
-rw-r--r--  1 neil  neil       36864 Dec 13 23:13 VTS_02_0.BUP
-rw-r--r--  1 neil  neil       36864 Dec 13 23:12 VTS_02_0.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:13 VTS_02_0.VOB
-rw-r--r--  1 neil  neil  1073565696 Dec 13 23:12 VTS_02_1.VOB
-rw-r--r--  1 neil  neil   439345152 Dec 13 23:11 VTS_02_2.VOB
-rw-r--r--  1 neil  neil       24576 Dec 13 23:13 VTS_03_0.BUP
-rw-r--r--  1 neil  neil       24576 Dec 13 23:12 VTS_03_0.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:11 VTS_03_0.VOB
-rw-r--r--  1 neil  neil   609685504 Dec 13 23:12 VTS_03_1.VOB
-rw-r--r--  1 neil  neil       20480 Dec 13 23:11 VTS_04_0.BUP
-rw-r--r--  1 neil  neil       20480 Dec 13 23:11 VTS_04_0.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:11 VTS_04_0.VOB
-rw-r--r--  1 neil  neil   425539584 Dec 13 23:11 VTS_04_1.VOB
-rw-r--r--  1 neil  neil       20480 Dec 13 23:11 VTS_05_0.BUP
-rw-r--r--  1 neil  neil       20480 Dec 13 23:11 VTS_05_0.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:11 VTS_05_0.VOB
-rw-r--r--  1 neil  neil   384049152 Dec 13 23:11 VTS_05_1.VOB
-rw-r--r--  1 neil  neil       18432 Dec 13 23:11 VTS_06_0.BUP
-rw-r--r--  1 neil  neil       18432 Dec 13 23:11 VTS_06_0.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:11 VTS_06_0.VOB
-rw-r--r--  1 neil  neil   129024000 Dec 13 23:11 VTS_06_1.VOB
-rw-r--r--  1 neil  neil       24576 Dec 13 23:11 VTS_07_0.BUP
-rw-r--r--  1 neil  neil       24576 Dec 13 23:11 VTS_07_0.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:11 VTS_07_0.VOB
-rw-r--r--  1 neil  neil   537753600 Dec 13 23:11 VTS_07_1.VOB
-rw-r--r--  1 neil  neil       22528 Dec 13 23:11 VTS_08_0.BUP
-rw-r--r--  1 neil  neil       22528 Dec 13 23:11 VTS_08_0.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:11 VTS_08_0.VOB
-rw-r--r--  1 neil  neil   492632064 Dec 13 23:11 VTS_08_1.VOB
-rw-r--r--  1 neil  neil       18432 Dec 13 23:11 VTS_09_0.BUP
-rw-r--r--  1 neil  neil       18432 Dec 13 23:11 VTS_09_0.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:11 VTS_09_0.VOB
-rw-r--r--  1 neil  neil   148426752 Dec 13 23:11 VTS_09_1.VOB
-rw-r--r--  1 neil  neil       18432 Dec 13 23:12 VTS_10_0.BUP
-rw-r--r--  1 neil  neil       18432 Dec 13 23:11 VTS_10_0.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:11 VTS_10_0.VOB
-rw-r--r--  1 neil  neil   142884864 Dec 13 23:11 VTS_10_1.VOB
-rw-r--r--  1 neil  neil       18432 Dec 13 23:11 VTS_11_0.BUP
-rw-r--r--  1 neil  neil       18432 Dec 13 23:11 VTS_11_0.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:11 VTS_11_0.VOB
-rw-r--r--  1 neil  neil    36808704 Dec 13 23:11 VTS_11_1.VOB
-rw-r--r--  1 neil  neil       18432 Dec 13 23:11 VTS_12_0.BUP
-rw-r--r--  1 neil  neil       18432 Dec 13 23:12 VTS_12_0.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:13 VTS_12_0.VOB
-rw-r--r--  1 neil  neil   195045376 Dec 13 23:11 VTS_12_1.VOB
-rw-r--r--  1 neil  neil       53248 Dec 13 23:11 VTS_13_0.BUP
-rw-r--r--  1 neil  neil       53248 Dec 13 23:11 VTS_13_0.IFO
-rw-r--r--  1 neil  neil      178176 Dec 13 23:11 VTS_13_0.VOB
-rw-r--r--  1 neil  neil  1073565696 Dec 13 23:12 VTS_13_1.VOB
-rw-r--r--  1 neil  neil  1073565696 Dec 13 23:12 VTS_13_2.VOB
-rw-r--r--  1 neil  neil  1073565696 Dec 13 23:13 VTS_13_3.VOB
-rw-r--r--  1 neil  neil   175679488 Dec 13 23:13 VTS_13_4.VOB

After the video scanner (Universal Movie Scanner) had finished scanning it added the following new movie:

  {
    "art": {},
    "file": "nfs://192.168.0.3/mnt/share/media/Video-Private/Test/VIDEO_TS/VIDEO_TS.IFO",
    "label": "Screamers",
    "movieid": 825,
    "title": "Screamers"
  }

Any attempt to play this movie via the library with a recent master build results in a segfault.

Tail end of kodi.log:

08:13:10  92.554596 T:2862609472  NOTICE: Thread DVDPlayer start, auto delete: false
08:13:10  92.556030 T:2785326144  NOTICE: Thread CMMALRenderer start, auto delete: false
08:13:10  92.557243 T:2862609472  NOTICE: Creating InputStream
08:13:10  92.558128 T:2862609472   DEBUG: SECTION:LoadDLL(special://xbmcbin/system/players/dvdplayer/libdvdnav-arm.so)
08:13:10  92.558556 T:2862609472   DEBUG: Loading: /usr/lib/kodi/system/players/dvdplayer/libdvdnav-arm.so
08:13:10  92.577690 T:2862609472    INFO:   msg: libdvdnav: Using dvdnav version 4.2.1
08:13:10  92.578926 T:2862609472   DEBUG: SECTION:LoadDLL(libnfs.so.4)
08:13:10  92.583954 T:2862609472   DEBUG: Loading: libnfs.so.4
08:13:10  92.683022 T:2862609472   DEBUG: NFS: Context for 192.168.0.3/mnt/share not open - get a new context.
08:13:10  92.827583 T:2862609472   DEBUG: NFS: Connected to server 192.168.0.3 and export /mnt/share
08:13:10  92.827797 T:2862609472   DEBUG: NFS: chunks: r/w 32768/32768
08:13:10  92.831871 T:2862609472    INFO: dll_fopen - something opened the mount file, let's hope it knows what it's doing
08:13:10  92.833260 T:2862609472   DEBUG:   msg: libdvdread: Couldn't find device name.
08:13:10  92.837830 T:2862609472   ERROR: NFS: Failed to stat(mnt/share/media/Video-Private/Test/VIDEO_TS.IFO) stat call failed with "NFS: Lookup of /media/Video-Private/Test/VIDEO_TS.IFO failed with NFS3ERR_NOENT(-2)"
08:13:10  92.850456 T:2862609472   DEBUG: CNFSFile::Open - opened mnt/share/media/Video-Private/Test/VIDEO_TS/VIDEO_TS.IFO
08:13:10  92.888268 T:2862609472   DEBUG: CNFSFile::Open - opened mnt/share/media/Video-Private/Test
08:13:10  92.892876 T:2862609472   ERROR: Read - Error( -14, read call failed with "" )

Running kodi.bin in gdb produces a fairly useless backtrace[1]:

rpi512:~ # gdb --args /usr/lib/kodi/kodi.bin --standalone -fs --lircdev /run/lirc/lircd
...
Program received signal SIGSEGV, Segmentation fault.
[Switching to LWP 2229]
0xb6ecfc50 in memcpy () from /lib/libarmmem.so
(gdb) thread apply all bt

Thread 6 (LWP 2229):
#0  0xb6ecfc50 in memcpy () from /lib/libarmmem.so
#1  0x00000800 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

I also added the same Test folder using SMB, with the following movie added to the Library:

  {
    "art": {},
    "file": "smb://NM-WIN7/Torrent/qTorrent/Samples/DVDTS/VIDEO_TS/VIDEO_TS.IFO",
    "label": "Screamers",
    "movieid": 826,
    "title": "Screamers"
  }

In build #1213, this SMB version of the movie plays without a problem from the Library while the NFS version does not.

Bisecting builds allowed me to narrow it down to build #1027 from the evening of 27 Oct:

26 Oct #1026: SMB crashes, NFS works
27 Oct #1027: SMB works, NFS crashes
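
The bisection step can be sketched roughly as below. This is only an illustration, not the procedure actually used: `FirstBadBuild` and the `isBad` callback are hypothetical names, the starting "known good" build number is made up, and the sketch assumes there is exactly one good-to-bad transition for NFS playback in the range being searched.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: returns the first build number in (knownGood, knownBad]
// for which isBad() reports a crash. Assumes every build up to some point is
// good and every build after it is bad (a single transition in the range).
int FirstBadBuild(int knownGood, int knownBad,
                  const std::function<bool(int)>& isBad)
{
  while (knownBad - knownGood > 1)
  {
    // Test the build halfway between the last good and first bad builds.
    const int mid = knownGood + (knownBad - knownGood) / 2;
    if (isBad(mid))
      knownBad = mid;   // crash reproduced: transition is at or before mid
    else
      knownGood = mid;  // no crash: transition is after mid
  }
  return knownBad;
}
```

In practice `isBad` stands for "install that test build and try to play the NFS IFO from the Library", so each call is a manual test rather than a function.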

The release notes for build #1027 are here.

My October test builds are using libnfs 1.9.5 while my latest #1213 test build is using libnfs 1.9.6, so it doesn't seem to be a libnfs issue, at least not one that is already fixed upstream.

The only relevant change in build #1027 is the revert of xbmc/xbmc#5534 in order to avoid an SMB problem introduced by the final commit, xbmc/xbmc@e502b09. The revert does indeed resolve the SMB issue, but then appears to break NFS.

Subsequent pull requests (one or more of: xbmc/xbmc#5660, xbmc/xbmc#5694, xbmc/xbmc#5700, xbmc/xbmc#5740, possibly others) then addressed the SMB-related problem in xbmc/xbmc#5534 meaning the revert could be dropped, but as a result we now seem to be left with this persistent NFS issue.

The strange thing, and the reason I'm reporting this issue here and not on the Kodi GitHub, is that a current Kodi master build (xbmc/xbmc@238c5ef) on Ubuntu 14.10/x64 (using the same shared MySQL Library and sources.xml as the Raspberry Pi) does not crash and plays the IFO correctly over both NFS and SMB, so this currently appears to be OpenELEC specific.

Ubuntu kodi.log showing playback over NFS, then SMB, is here.

Based on comparison of the Ubuntu log (where NFS works) and the R-Pi log (where NFS segfaults), the point of failure appears to be libdvdnav related, as the following entry doesn't make it into the R-Pi log:

08:57:20 T:140638762821376   DEBUG: libdvdnav: DVD Title:

It's entirely possible that xbmc/xbmc#5534 is a red herring: the NFS problem could have been fixed after 27 Oct by one of the other related pull requests, then broken again by an unrelated (and as yet unidentified) commit. However, given the history of the SMB/Posix fixes, it's the most likely cause.

I'm happy to test patches etc. if anyone has any ideas or is able to debug further.


  1. Despite the test build including the crashlog facility ([rfc] kodi crashlogs #3657), in this case it isn't capturing the backtrace, apparently due to build options:
[13-Dec-2014 23:33:04] <Milhouse> so no idea why this crash isn't generating a useful crashlog
[13-Dec-2014 23:33:32] <seo> because -fomit-frame-pointer
[13-Dec-2014 23:33:34] <seo> thats why.
[13-Dec-2014 23:36:42] <seo> sraue: ping.
[13-Dec-2014 23:36:44] <seo> sraue: maybe on arm drop -fomit-frame-pointer. or drop it for everything. -O2 will add it where it doesnt break debugging...
[13-Dec-2014 23:36:59] <seo> dunno why we have it in our cflags ;)

Building with debug is an option:

[13-Dec-2014 23:38:45] <seo> Milhouse: kodi with DEBUG=yes will help.
[13-Dec-2014 23:40:27] <seo> Milhouse: PROJECT=RPi ARCH=arm ./scripts/clean kodi
[13-Dec-2014 23:40:37] <seo> Milhouse: PROJECT=RPi ARCH=arm DEBUG=yes ./scripts/build kodi
[13-Dec-2014 23:40:45] <seo> Milhouse: PROJECT=RPi ARCH=arm make release
[13-Dec-2014 23:41:09] <seo> Milhouse: "export DEBUG=yes" in kodi/package.mk pre_configure_target() may help but I never tested.

The second solution (export ...) would IMHO be best but unfortunately it doesn't seem to actually work - ideas welcome!

However, building kodi with the first solution doesn't yield a better backtrace than kodi built with DEBUG=no, despite kodi.bin increasing from:

-rwxr-xr-x    1 root     root      18695288 Dec 13 21:46 /usr/lib/kodi/kodi.bin

to:

-rwxr-xr-x    1 root     root     151434580 Dec 14 07:48 /usr/lib/kodi/kodi.bin

so it obviously had an effect on the build, but not in terms of any resulting backtrace.

@stefansaraev
Contributor

@Karlson2k apologies for pinging you here, but this could be related to the vfs changes, and you are the one who knows that code best. any clue ?

@popcornmix

I believe both the libc and the optimised libarmmem versions of memcpy use assembly code without cfa directives, which means crashes in memcpy never produce a valid backtrace.

Ben did add these a while ago (bavison/arm-mem@2e6f275) so it might be worth updating to latest version of libarmmem which I think will fix this specific backtrace.

@MilhouseVH
Contributor Author

I'm testing later builds, after #1027, and it seems the NFS crash problem went away starting with build #1112 which included xbmc/xbmc#5700 and xbmc/xbmc#5707, amongst other changes.

At some point the problem seems to have returned, and I'm now working through the builds after #1112 in an effort to identify which build reintroduced the problem.

@MilhouseVH
Contributor Author

#1126 is the build where the problem reappears. This includes various SMB related fixes. Now in the process of reverting these commits and testing.

@MilhouseVH
Contributor Author

I think we've reached a conclusion.

The problem starts with build #1126, specifically due to the inclusion of xbmc/xbmc#5819.

With xbmc/xbmc#5819 reverted from master there is no problem with either NFS or SMB playback. With it, NFS crashes kodi.bin.

Adding xbmc/xbmc#5694 on top of master (ie. without reverting xbmc/xbmc#5819) does not help - kodi.bin still crashes when testing NFS (SMB is OK).

Will now build with the updated arm-mem to see if it's possible to obtain a better backtrace.

@MilhouseVH
Contributor Author

Backtrace of master including xbmc/xbmc#5819, with updated arm-mem:

Program received signal SIGSEGV, Segmentation fault.
[Switching to LWP 632]
0xb6eeec78 in memcpy () from /lib/libarmmem.so
(gdb) thread apply all bt

Thread 6 (LWP 632):
#0  0xb6eeec78 in memcpy () from /lib/libarmmem.so
#1  0xb56fa63c in std::basic_streambuf<char, std::char_traits<char> >::xsgetn(char*, int) () from /usr/lib/libstdc++.so.6
#2  0x00513d70 in XFILE::CFile::Read (this=this@entry=0x30434f8, lpBuf=lpBuf@entry=0xaa8fdab0, uiBufSize=uiBufSize@entry=2048) at File.cpp:521
#3  0x007f6f8c in dll_read (fd=<optimized out>, buffer=0xaa8fdab0, uiSize=2048) at emu_msvcrt.cpp:580
#4  0xae44cb74 in dvd_read_name (name=0x30a92dc "", serial=0x30a930e "", device=<optimized out>)
    at /home/neil/projects/OpenELEC.tv/build.OpenELEC-RPi.arm-devel/kodi-14-f4576be/lib/libdvd/libdvdnav/src/vm/vm.c:178
#5  0xab710100 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Full trace

@Karlson2k

This small patch should fix the crashes, but it will not fix the real bug:

diff --git a/xbmc/filesystem/File.cpp b/xbmc/filesystem/File.cpp
index c25667d..e0a612a 100644
--- a/xbmc/filesystem/File.cpp
+++ b/xbmc/filesystem/File.cpp
@@ -298,7 +298,7 @@ bool CFile::Open(const CURL& file, const unsigned int flags)
       return false;
     }

-    if (m_pFile->GetChunkSize() && !(m_flags & READ_CHUNKED))
+    if (m_pFile->GetChunkSize() > 1 && !(m_flags & READ_CHUNKED))
     {
       m_pBuffer = new CFileStreamBuffer(0);
       m_pBuffer->Attach(m_pFile);

@MilhouseVH
Contributor Author

This small patch should fix the crashes, but it will not fix the real bug:

@Karlson2k - I've given the patch a quick test (I mean, real quick - videos either played or the app crashed) and it does indeed stop the crashing, while also allowing playback over both NFS and SMB. Many thanks.

@fritsch
Contributor

fritsch commented Dec 14, 2014

@Karlson2k please take care that we get a valid "workaround" into helix, so that the next RC does not have that issue. Thanks much in advance.

@MilhouseVH
Contributor Author

@Karlson2k - will you be pushing this patch to Helix/master etc.?

@fritsch
Contributor

fritsch commented Dec 14, 2014

Still curious what happens with a chunkSize of 1 though, which code path is taken then?

@Karlson2k

@fritsch @MilhouseVH Already: xbmc/xbmc#5947 :)
chunkSize of 1 must be equal to chunkSize of 0.
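
To illustrate which path each chunk size takes, here is a minimal standalone sketch of the condition from the File.cpp patch above. It is not the Kodi code itself: `UseStreamBuffer`, `UseStreamBufferBeforePatch` and the value of `kReadChunked` are hypothetical stand-ins for the `CFile::Open` check and the `READ_CHUNKED` flag, and the thread does not state which chunk size the NFS backend actually reported.

```cpp
#include <cassert>

// Hypothetical stand-in for Kodi's READ_CHUNKED flag; the value is an assumption.
constexpr unsigned int kReadChunked = 0x02;

// Condition before the patch: any non-zero chunk size attaches the stream buffer.
constexpr bool UseStreamBufferBeforePatch(int chunkSize, unsigned int flags)
{
  return chunkSize != 0 && !(flags & kReadChunked);
}

// Condition after the patch: a chunk size of 1 is treated like 0, so the
// stream buffer is only attached for chunk sizes larger than 1.
constexpr bool UseStreamBuffer(int chunkSize, unsigned int flags)
{
  return chunkSize > 1 && !(flags & kReadChunked);
}

// Compile-time checks showing where the two conditions differ.
static_assert(UseStreamBufferBeforePatch(1, 0), "pre-patch: size 1 is buffered");
static_assert(!UseStreamBuffer(1, 0), "patched: size 1 is unbuffered, like 0");
static_assert(!UseStreamBuffer(0, 0), "size 0 is never buffered");
static_assert(UseStreamBuffer(32768, 0), "larger chunk sizes are still buffered");
static_assert(!UseStreamBuffer(32768, kReadChunked), "READ_CHUNKED skips the buffer");
```

The only behavioural difference between the two conditions is the chunk size of exactly 1, which now takes the same unbuffered path as 0.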

@stefansaraev
Contributor

this should now be fixed upstream. thanks to everyone involved.
