Crashing on Linux due to memory corruption #21
I've tried building and running Aleph One on the two Raspberry Pis listed above but am experiencing various memory-related crashes once I enter the actual game (i.e. pressing "Begin New Game" and clicking past the Chapter Screen). Most of the time it crashes immediately but sometimes I get to play for a few seconds, enough to see the BOBs slaughter all of the Pfhor at the beginning of Waterloo Waterpark.
I'm using the latest tarball from http://alephone.lhowon.org (20150620). The latest code from GitHub crashes immediately. I can get some output from that version as well if you want me to though.
Some sample error messages:
Output from GDB:
The "handle SIGILL pass nostop noprint" part is required or else Aleph One crashes immediately due to "Program received signal SIGILL, Illegal instruction".
A possibly related bug report over at SourceForge: Alephone crash ppc linux
Thread for this issue over on the Pfhorums: Aleph One on Raspberry Pi
I don't have a Raspberry Pi, but here are some tips for saving memory in Aleph One:
Best of luck to you! Feel free to document your progress here or on the Pfhorums.
I should probably have mentioned which settings I was using. I've done most of my testing in software mode, 1280×800, 16 bit. It seemed to run decently on both machines at those settings.
I don't think this issue is due to lack of memory, but I still tried out your suggestion to disable plugins. It appears the Enhanced HUD was the culprit. It uses Lua and the crash was in the Lua code, so there we have it. Disable it and the game runs just fine.
I also tried OpenGL by the way and it was horrendously slow. Completely unplayable on even the lowest settings and with HD textures disabled.
The next step is to see if I can get the latest code from GitHub to run. I'm also curious to see what kind of performance I'll get from the Raspberry Pi 3 Model B that was released today.
I took a stab at running the latest code from GitHub.
autogen.sh listed some additional SDL 2 dependencies not required by the tarball and not listed on the wiki, namely libsdl2-dev, libsdl2-ttf-dev and libsdl2-net-dev.
Aleph One starts, prints the usual credits and then exits with status 1 and no error message. The log file contains one entry: an unhandled exception.
I'd expect the program to then call the exit function on line 389, but when I load it up in GDB and set a breakpoint on exit it instead breaks on line 425.
So it seems to me that SDL initialization failed and that SDL_GetError() returned null which is why I got an exception instead of an error message.
Commit f750664 should avoid the exception when SDL_GetError() isn't working properly, but that doesn't change the fact that SDL isn't starting up properly.
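For readers following along, here is a minimal sketch of the kind of guard that avoids this class of exception. It is not the code from commit f750664; `string_or` is a hypothetical helper name used only for illustration. The point is that building a `std::string` straight from a null `const char*` is undefined behavior (libstdc++ typically throws `std::logic_error`), so the pointer has to be checked first.

```cpp
#include <string>

// Hypothetical helper (not part of Aleph One or SDL): converts a C string
// that may be null into a std::string, substituting a fallback when it is null.
std::string string_or(const char* text, const std::string& fallback = "")
{
    // Only construct from the pointer once we know it is not null.
    return text != nullptr ? std::string(text) : fallback;
}
```

With something like this, an error path could log `string_or(SDL_GetError(), "unknown SDL error")` without risking a second failure while reporting the first.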
This article contains a test SDL2 program; can you see if that runs? In addition to what you've already installed, it suggests installing libsdl2-image-dev and libsdl2-mixer-dev. Aleph One uses sdl2-image if present, but it should be optional, and Aleph One shouldn't care about the mixer part. However, package dependency errors do happen, so if the test program runs it's worth trying Aleph One again with those added packages, just in case.
Commit d724d36 was the last commit before SDL 2, so if that library is a sticking point and you feel like changing gears, you can roll back to that commit to work on whatever broke between June and February. Most of those changes should be unaffected by SDL 2, so I am interested if you find Raspbian crashes on that snapshot.
Installed additional SDL2 packages libsdl2-image-dev and libsdl2-mixer-dev.
Aleph One still exits due to the same exception. Loading it up in GDB shows that it manages to get past the SDL initialization step.
I still end up with the same type of exception. What makes things so difficult is that I can't get a backtrace for where the exception was thrown. By carefully stepping through the program I was finally able to locate the exact line that causes the exception.
Checking the return value confirms that getlogin returns null which in turn causes the exception since it cannot be converted to a string.
The line was last modified in 43627bb, committed on 2015-06-28, 8 days after the last release.
Adding a null check solved the whole problem. Well, except for the fact that Aleph One runs extremely slowly. We're talking sub-1 fps.
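A rough sketch of what such a null check can look like, for anyone hitting the same thing. This is not the actual patch; `login_name_or` is a made-up name, and falling back to the `USER` environment variable is my own assumption rather than something Aleph One does. `getlogin()` can return a null pointer, for example when the process has no controlling terminal.

```cpp
#include <cstdlib>
#include <string>
#include <unistd.h>

// Hypothetical helper: returns a login name without ever constructing a
// std::string from a null pointer.
std::string login_name_or(const std::string& fallback)
{
    // getlogin() may return null, so check before converting.
    const char* name = getlogin();
    if (name != nullptr && *name != '\0')
        return name;

    // Assumption for illustration: try the USER environment variable next.
    const char* user = std::getenv("USER");
    if (user != nullptr && *user != '\0')
        return user;

    // Nothing usable was found; hand back the caller's default.
    return fallback;
}
```

Calling `login_name_or("player")` then always yields a usable, non-empty string instead of throwing.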
I then went back to debugging the SDL issue. I tried uninstalling libsdl2-image-dev and libsdl2-mixer-dev and rebuilding. Aleph One runs without throwing any uncaught exception, although still very slow. Huh? So it never was SDL to begin with?
It looks like SDL_GetError is supposed to always return a string, even if just an empty one when there's been no error (SDL wiki reference). I think I've been chasing a red herring because I forgot to use the -O0 flag with my first few builds. Feels weird that it would cause GDB to break on the wrong exit call though. Need to investigate further.
Next up: getting back to the Enhanced HUD Lua issue – you know, the one this thread was originally about.
Glad to hear you did track down the true crash, and it's an easy fix!
SDL 2 has several rendering backends; in your case, it's probably using software OpenGL, which would explain the slowness. See this thread. I haven't tried it, but the documentation says you can do something like:
Try adding that after SDL_Init and see if it helps.
added a commit
Mar 6, 2016
Thanks, that did the trick!
The Enhanced HUD is slightly less crash-prone in the most recent code. It still crashes at about the same frequency when I try to start a game. If I do manage to get past that point then the Enhanced HUD works as expected. Well, almost. The health and oxygen bars are grey for some reason. I've played for a few minutes each time and seen no crashes during gameplay. Exiting back to the main menu has caused it to crash every time so far though.
The callstacks are intimidating but look fairly consistent. I was hoping Valgrind might be able to provide me with some more info, but unfortunately it's broken on Raspbian. The same illegal instruction that caused me a bit of grief when using GDB spells major trouble for Valgrind. As I understand it some libraries on Raspbian use ARM instructions that the Valgrind people have no intention of supporting. (Allegedly, one of them is for switching endianness.)
Callstack for crash on game start:
Callstack for crash on game exit:
Good to hear the hint helped. I'll add support for that somewhere in the next official release, probably in the advanced graphics prefs.
The gray bars in the HUD are normal. The graphics come straight from the Xbox version, which used the same graphic plus a color tint to draw the bars. Software mode doesn't support color tinting, so that doesn't work. You're the first person to mention this since its release in 2011, which tells you how often software mode and enhanced HUD are used together.
Hi there. I just started work on integrating Aleph One into RetroPie, the retrogaming distribution for Raspberry Pi, and I've been seeing the exact same behavior here as mentioned by OP in this thread. I end up getting "double free or corruption (out)" on the command line when a crash occurs. I have yet to try adding in the line you mentioned. I too am using software mode by default.
Sorry for the lack of updates. My SD card got corrupted and the latest backup was two months old. I decided to get myself a Raspberry Pi 3 as a consolation gift and to try out Ubuntu Mate. One advantage is that I get to try out a newer version of Valgrind (3.11.0 vs 3.7.0). It only reports one illegal instruction this time, but that's still one too many. I did however manage to find some tips for how to get rid of the weird Raspberry Pi libraries:
I tried removing "/usr/lib/arm-linux-gnueabihf/libarmmem.so" from /etc/ld.so.preload. Aleph One now runs in both GDB (without needing the "handle SIGILL" command) and Valgrind. It does not crash when running under Valgrind, though. Granted, it was so incredibly slow that I only started two games, but I was able to enter and exit both times without a crash. I'm still reading up on Valgrind and have only used it for finding leaks in the past. Any tips on what to look for and what flags to use are much appreciated. Be aware that the Pi 2 and 3 have 1 GB of RAM and it's easy to run out of memory.
I also tried building and running on an Ubuntu VM on my Intel iMac. No problems there. Judging by the SourceForge issue I linked previously it might be reproducible on PPC Linux. Unfortunately I don't have a suitable PPC machine available.
The Default HUD plugin crashes in pretty much the same spot as the Enhanced HUD. Switching to OpenGL mode doesn't help either. I haven't enabled the experimental hardware-accelerated OpenGL driver, but I only expect it to affect performance since the HUDs only crash during initialization and destruction.