Process has been terminated by signal {SIGSEGV::SEGV_MAPERR} #6000

Closed
nessotrin opened this issue Mar 23, 2020 · 91 comments · Fixed by #6669
Labels
bug, firmware, help-wanted, PocketBook
Milestone

Comments

@nessotrin

nessotrin commented Mar 23, 2020

  • KOReader version: latest master branch - v2020.03.2
  • Device: PB631/PB632 (Touch HD/ Touch HD2)

Issue

Constant and random crashes. "Process has been terminated by signal {SIGSEGV::SEGV_MAPERR}"
Book progress is lost every time. Koreader is completely unusable.

Steps to reproduce

Install any koreader version on the device.
Do anything inside koreader, including using the menu. You can read around 10 pages before it crashes. There is no correlation between what you do and when it happens.
Sometimes it crashes even before the GUI shows up when running with gdb.

crash.log & strace

crash.log
strace.txt

I've attached GDB, but it doesn't seem to catch the fault. I made a custom build environment with updated headers and libraries for FW5.19 (including libinkview), but it didn't help.
Koreader never worked on my device (been trying for 2 years).

I can't give away the device, but I'm willing to help if given instructions. I have gdbserver and telnet access. (GDB over telnet hangs for some reason)

@NiLuJe
Member

NiLuJe commented Mar 23, 2020

What do you mean, exactly, when you say that gdb can't catch the crash?

You can either attach to luajit at runtime, or simply run reader.lua via luajit under gdb, both approaches should work.

You may also try to let the kernel handle creating a coredump. You'll probably have to tweak the limits first and run KOReader in the same shell session. ulimit -c unlimited may not work in some crappy busybox versions, in which case, simply use a large integer value instead of unlimited.
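
For anyone following along, here is a minimal sketch of the two gdb approaches mentioned above. The install path is the one that appears later in this thread; the `./luajit` binary name inside it, the `run` dry-run helper and the function names are assumptions for illustration only, and KOReader may also need whatever environment its launcher script normally sets up.

```shell
#!/bin/sh
# Assumed install location (taken from later in this thread); adjust as needed.
KOREADER_DIR="${KOREADER_DIR:-/mnt/ext1/applications/koreader}"

# Hypothetical helper: print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "$DRY_RUN" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Approach 1: start reader.lua via luajit under gdb, print a backtrace once it faults.
gdb_launch() {
    (
        cd "$KOREADER_DIR" || exit 1
        run gdb -batch -ex run -ex bt --args ./luajit reader.lua
    )
}

# Approach 2: attach to an already running luajit by PID, let it continue until it faults.
gdb_attach() {
    run gdb -batch -ex continue -ex bt -p "$1"
}
```

Setting `DRY_RUN=1` before calling either function only prints the gdb command line, which is handy for checking it before running it on the device.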

@pazos
Member

pazos commented Mar 24, 2020

> I can't give away the device, but I'm willing to help if given instructions

Without the device just some random guesses (based on issues related to SEGV_MAPERR).

You can try to restrict JIT for your device &| try to allocate a "big" chunk of memory for machine code

@nessotrin
Author

nessotrin commented Mar 24, 2020

> > I can't give away the device, but I'm willing to help if given instructions
>
> Without the device just some random guesses (based on issues related to SEGV_MAPERR).
>
> You can try to restrict JIT for your device &| try to allocate a "big" chunk of memory for machine code

I just tried both. It seems to help a little, but doesn't fix the issue.
EDIT: Depending on the usage, I was able to swipe through an entire book without a crash. It's better than I initially thought.

@nessotrin
Author

> What do you mean, exactly, when you say that gdb can't catch the crash?
>
> You can either attach to luajit at runtime, or simply run reader.lua via luajit under gdb, both approaches should work.
>
> You may also try to let the kernel handle creating a coredump. You'll probably have to tweak the limits first and run KOReader in the same shell session. ulimit -c unlimited may not work in some crappy busybox versions, in which case, simply use a large integer value instead of unlimited.

Indeed, I got GDB to catch a segfault.
ulimit didn't work (it does report being set to unlimited, but no coredump is produced), so I used GDB to make one. It's quite big (200M); should I upload it?

It'd be great to build KOReader with debug symbols, but the wiki doesn't seem up to date. I know nothing about this code base; could somebody enlighten me?

@Frenzie
Member

Frenzie commented Mar 24, 2020

Where in the wiki does it not seem up to date?

Instructions here: https://github.com/koreader/koreader/blob/master/doc/Building.md

@NiLuJe
Member

NiLuJe commented Mar 24, 2020

@nessotrin: gcore may have a harder time creating a usable dump than the kernel (especially on older kernels such as the one that's probably used there). Try setting ulimit -c to a stupidly large value instead of unlimited (Kobos suffer from the same quirk, you just have to double-check that 'stupidly large' isn't actually too large to be ignored. IIRC, half INT32_MAX did the trick last time I checked).

IIRC, the kodev wrapper should be able to get you a debug build w/ symbols, I've relied on it on a few occasions.
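
A minimal sketch of the ulimit advice above, assuming a POSIX-style shell. The function names are made up for illustration, and whether the kernel actually honors the value is device-dependent.

```shell
#!/bin/sh
# Half of INT32_MAX, the value suggested above as large but not too large.
core_limit_fallback() {
    echo $(( 2147483647 / 2 ))
}

# Raise the core size limit for the current shell session only.
enable_core_dumps() {
    # Use an explicit integer: "unlimited" may be accepted yet have no effect.
    ulimit -c "$(core_limit_fallback)" || return 1
    # Show what the shell actually accepted.
    ulimit -c
}
```

Call `enable_core_dumps` and then launch KOReader from that same shell so the process inherits the limit. Where the core file ends up depends on the kernel's `/proc/sys/kernel/core_pattern` setting.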

@nessotrin
Author

> Where in the wiki does it not seem up to date?
>
> Instructions here: https://github.com/koreader/koreader/blob/master/doc/Building.md

That page doesn't say anything about a debug build.
The only thing I found: https://github.com/koreader/koreader-base/wiki/Remote-debugging-with-gdbserver

@Frenzie
Member

Frenzie commented Mar 25, 2020

Thanks, I'll update that page. The method described there annoyed me immensely a few years back so I gave builds with debug symbols their own folder. You can simply use a command like kodev release --debug pocketbook. The kodev script is self-documenting. ;-)

@Frenzie
Member

Frenzie commented Mar 25, 2020

PS The emulator defaults to debug.

@nessotrin
Author

> @nessotrin: gcore may have a harder time creating an usable dump than the kernel (especially on older kernels such as the one that's probably used there). Try setting ulimit -c to a stupidly large value instead of unlimited (Kobos suffer from the same quirk, you just have to double-check that 'stupidly large' isn't actually too large to be ignored. IIRC, half INT32_MAX did the trick last time I checked).
>
> IIRC, the kodev wrapper should be able to get you a debug build w/ symbols, I've relied on it on a few occasions.

I tried ulimit with different settings, no luck. Here are the kernel and busybox versions:
Linux pocketbook 3.0.35+ #1 PREEMPT Thu Oct 31 13:23:18 EET 2019 armv7l GNU/Linux
BusyBox v1.26.2 (2019-09-24 21:41:24 EEST) multi-call binary.

I found out how to make a debug release with kodev.
It could be helpful to mention in kodev's --help that you can get more information on sub-commands.
Something like "see kodev COMMAND --help for more information".

@Frenzie
Member

Frenzie commented Mar 25, 2020

I'd expect people to just read over that, but sure.

I guess I never thought about it because it's the same for Git.

@pazos pazos added the bug, firmware, help-wanted, and PocketBook labels Mar 27, 2020
@pazos pazos changed the title from "Koreader unstable on Pocketbook Touch HD (PB631)" to "Process has been terminated by signal {SIGSEGV::SEGV_MAPERR}" Mar 27, 2020
@bcm0

bcm0 commented Apr 3, 2020

I very much hope that you can fix this issue. It's been around for ages.
The stock reading app on PB631 can't even adjust the screen contrast.
Please tell me if I can help somehow.
Good luck!

@NiLuJe
Member

NiLuJe commented Jul 25, 2020

We can't do anything without a stacktrace and a strace log, much like we had to in order to investigate the libcurl + inkview issues earlier (#5861).

@bcm0

bcm0 commented Jul 25, 2020

crash.log
crash_debug_verbose.log
I can't get ssh to work for strace and the terminal emulator inside koreader crashes.
There are some tips to get ssh but it's hard
https://www.mobileread.com/forums/showthread.php?t=159636&page=7

@NiLuJe
Member

NiLuJe commented Jul 25, 2020

I meant a gdb stacktrace, my bad (although a verbose crash.log will potentially also be helpful to cross-reference).

@bcm0

bcm0 commented Jul 25, 2020

I used 'ssh -T -v -p 2222 reader@1.1.1.1'
and got a connection. There is no input prompt but it works.
I ran 'strace -f -o log.txt /mnt/ext1/applications/koreader.app'
and got this
https://raw.githubusercontent.com/adnion/PB631-strace/master/log.txt

@bcm0

bcm0 commented Jul 25, 2020

Can you provide instructions for creating a gdb stacktrace?

@NiLuJe
Member

NiLuJe commented Jul 25, 2020

It's detailed in the issue I linked to, IIRC.

@NiLuJe
Member

NiLuJe commented Jul 25, 2020

Appears to be crashing later than usual though, so, beats me. Quite likely still related to InkView and/or the "new" TC on FW 6 or whenever they happened to switch to Clang.

@bcm0

bcm0 commented Jul 25, 2020

@NiLuJe
Member

NiLuJe commented Jul 25, 2020

Nope, I thought I'd mentioned gdb vs. a ulimit coredump in #5861, but apparently not. Must have been in one of the other, numerous PB+InkView crash reports ;).

(Although, yeah, the wiki is probably a good start; but I'm more of a native on-device gdb guy myself, I've never actually used gdbserver).

@Uwe-B

Uwe-B commented Sep 17, 2020

_gdb_bt7(epub).txt
Can't crash it. No chance.

@ezdiy
Member

ezdiy commented Sep 17, 2020

@Uwe-B Neat! And after you re-enable self.mech_refresh = refresh_pocketbook ?

In general, try to undo steps you did, one by one: ie mech_refresh, mech_wait_update_complete and dev_no_c_blitter until you get the one that's crashing.

@Uwe-B

Uwe-B commented Sep 17, 2020

Does this help?
I changed this back to the previous settings:
self.mech_refresh = function() end
self.mech_wait_update_complete = nil
_gdb_bt8(epub).txt

@ezdiy
Member

ezdiy commented Sep 17, 2020

@Uwe-B try the following lines exactly:

self.mech_refresh = refresh_pocketbook
self.mech_wait_update_complete = nil

(this is a combo where the display still works, but may still avoid crashes if wait_update is the culprit)

@Uwe-B

Uwe-B commented Sep 17, 2020

That's exactly what I've been doing for the last few minutes. No crash that I could produce!!!

@NiLuJe
Member

NiLuJe commented Sep 17, 2020

Then you get to check with mech_wait_update_complete re-enabled, and if it still doesn't crash, you can try switching back to the Lua blitter, to cinch things up ;).

@ezdiy
Member

ezdiy commented Sep 17, 2020

Welp, finally narrowed it down. Now try a completely pristine KOReader from scratch (no gdb, no custom launch, no debug mode), with self.mech_wait_update_complete = nil as the only change, so we can be sure that's the fix. Then we can try to find a fix for the corruption in that routine.

@NiLuJe it seems like the collision field is written to, or even something beyond it. But only on kernels of peculiar devices.

@Uwe-B

Uwe-B commented Sep 17, 2020

Cool, it's working.
What I did:
Extracted koreader-pocketbook-v2020.08.1-62-gffa9857_2020-09-09 and replaced KOReader on this device with that version.
Then edited just one existing line in ffi/framebuffer_mxcfb.lua, in this branch:
elseif self.device:isPocketBook() then
require("ffi/mxcfb_pocketbook_h")

  • from: self.mech_wait_update_complete = pocketbook_mxc_wait_for_update_complete
  • to: self.mech_wait_update_complete = nil

Device: PocketBook 631+ (which has the same firmware as the 631, but with warm light).

Thanks guys, you are great, was waiting years for this fix!
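
For others who want to try the same temporary workaround, the one-line edit above can be sketched as a small shell function. The function name is made up for illustration. Keep in mind that this only disables the wait-for-update call rather than fixing it.

```shell
#!/bin/sh
# Apply the workaround from this comment to a KOReader install.
# Example argument (path as used elsewhere in this thread):
#   /mnt/ext1/applications/koreader/ffi/framebuffer_mxcfb.lua
disable_wait_update() {
    file="$1"
    [ -f "$file" ] || { echo "not found: $file" >&2; return 1; }
    # Keep a backup so the change is easy to undo.
    cp "$file" "$file.orig" || return 1
    # Rewrite only the PocketBook assignment, leaving every other line untouched.
    sed 's/\(self\.mech_wait_update_complete = \)pocketbook_mxc_wait_for_update_complete/\1nil/' \
        "$file.orig" > "$file"
}
```

Restoring `framebuffer_mxcfb.lua.orig` over the edited file undoes the change.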

@NiLuJe
Member

NiLuJe commented Sep 17, 2020

@ezdiy: Huh. Apparently the 631 kernel definitely supports both variants (or not, depending on how it's built), but we should definitely be calling the legacy variant.

Which, err, I imagine should just plain fail instead of crash weirdly if the kernels aren't built w/ MX50_IOCTL_IF defined?

@NiLuJe
Member

NiLuJe commented Sep 17, 2020

Oh. On the other hand, older kernels use the old ioctl address w/ the "new" struct, so, mystery solved?
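
To make the "old address, new struct" point concrete, here is a rough sketch of how the two request numbers would differ. It assumes the generic Linux ioctl encoding (2 direction bits, 14 size bits, 8 type bits, 8 number bits). It also assumes, without having checked this device's kernel headers, that the legacy variant is a 4-byte write-only call and the newer one an 8-byte read/write call, both on number 0x2F of type 'F'. The `ioc` helper is hypothetical.

```shell
#!/bin/sh
# Build an ioctl request number the way the generic _IOC() macro does.
# Arguments: direction (1 = write, 2 = read, 3 = both), type character, number, size in bytes.
ioc() {
    dir=$1
    type_code=$(printf '%d' "'$2")   # ASCII code of the type character
    nr=$3
    size=$4
    printf '0x%08x\n' $(( ($dir << 30) | ($size << 16) | ($type_code << 8) | $nr ))
}

# Assumed legacy variant: a bare 32-bit update marker, write-only.
ioc 1 F 0x2f 4    # prints 0x4004462f
# Assumed newer variant: two 32-bit fields (marker + collision flag), read/write.
ioc 3 F 0x2f 8    # prints 0xc008462f
```

If a kernel accepts the first number but treats the argument as the two-field struct, it could write a second field past the 4 bytes the caller provided, which would fit the corruption suspected above.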

@NiLuJe
Member

NiLuJe commented Sep 17, 2020

I'm going to build a proper franken-header + cdecl like on other platforms to make this kind of stuff slightly less crazy to debug...

Because I seem to remember that what we have is pretty much based on trial and error and old Kindle kernels, so, it's a bit of a miracle that it kinda works to begin with ;p.

@ezdiy
Member

ezdiy commented Sep 17, 2020

@NiLuJe I suppose it couldn't hurt just passing the two-field struct always (since the marker int is first in there), just to be on safe side?

@ezdiy
Member

ezdiy commented Sep 17, 2020

@Uwe-B If I'm not mistaken, this should work correctly with no crashes:

        self.mech_wait_update_complete = remarkable_mxc_wait_for_update_complete

@Uwe-B

Uwe-B commented Sep 17, 2020

> @Uwe-B If I'm not mistaken, this should work correctly with no crashes:
>
>         self.mech_wait_update_complete = remarkable_mxc_wait_for_update_complete

Yep, couldn't crash it!

@EastEriq

EastEriq commented Sep 17, 2020

Sorry, been late to the party.

So I installed koreader-pocketbook-v2020.08.1-74-geecdf5b_2020-09-16 from the nightly builds; edited /mnt/ext1/applications/koreader/ffi/framebuffer_mxcfb.lua and only changed line 702 to self.mech_wait_update_complete = nil.

I can confirm that everything seems to work for a long time without crashing.
Well, I see several quirks, like the frontlight not being controllable (maybe #6663 is not in yet?), the green activity LED always being on, and occasional redraw problems, but those all belong to different bug reports. Big difference. Thanks to all involved!

@ezdiy
Member

ezdiy commented Sep 18, 2020

@EastEriq Frontlight is in, as well as the autostandby issues.

The current workaround for the crash essentially disables part of the functionality. We need some detail about what really goes on on your device, by observing what the official reader does. Open an epub in the epub (v2) reader (long-tap and open-with, I think), strace the fb2 reader for ioctls while tapping through pages, and paste the output here:

/mnt/secure # strace -T -i -e trace=ioctl -p `ps | grep eink-reader|grep firsttime|awk {'print $1'}`
Process 1000 attached - interrupt to quit
[a1645d7c] ioctl(4, 0x80044655, 0xaedbd594) = 0 <0.001507>
[a1645d7c] ioctl(4, 0x80204656, 0xaedbad08) = 0 <0.000082>
[a1645d7c] ioctl(4, 0x4004462f, 0xa3015dc4) = 0 <0.000081>
[a1645d7c] ioctl(4, 0x4004462f, 0xa3015dc4) = 0 <0.000709>
[a1645d7c] ioctl(4, 0x4004462f, 0xa3015dc4) = 0 <0.000802>
[a1645d7c] ioctl(4, 0x80044655, 0xaedbd594) = 0 <0.001509>
[a1645d7c] ioctl(4, 0x4004462f, 0xa3015dc4) = 0 <0.000790>
[a1645d7c] ioctl(4, 0x4004462f, 0xa3015dc4) = 0 <0.401787>
[a1645d7c] ioctl(4, 0x4004462f, 0xa3015dc4) = 0 <0.000088>
[a1645d7c] ioctl(4, 0x80044655, 0xaedbd594) = 0 <0.001684>
.....
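
To read a trace like this, the request numbers can be split into their fields. A minimal sketch, assuming the generic Linux ioctl encoding (direction in the top 2 bits, then 14 size bits, 8 type bits, 8 number bits); `decode_ioctl` is a hypothetical helper name.

```shell
#!/bin/sh
# Split an ioctl request number into direction, argument size, type and number.
decode_ioctl() {
    req=$1
    dir=$(( ($req >> 30) & 0x3 ))
    size=$(( ($req >> 16) & 0x3fff ))
    type_code=$(( ($req >> 8) & 0xff ))
    nr=$(( $req & 0xff ))
    case $dir in
        0) dir_name=none ;;
        1) dir_name=write ;;
        2) dir_name=read ;;
        3) dir_name=read/write ;;
    esac
    # Convert the type byte back to its ASCII character via an octal escape.
    type_char=$(printf "\\$(printf '%03o' "$type_code")")
    printf '%s dir=%s size=%d type=%s nr=0x%02x\n' \
        "$req" "$dir_name" "$size" "$type_char" "$nr"
}

# The three distinct requests seen in the sample trace above.
for r in 0x80044655 0x80204656 0x4004462f; do
    decode_ioctl "$r"
done
```

Under that encoding, 0x4004462f comes out as a 4-byte write on type 'F', number 0x2f, which would be consistent with a single-marker wait call. What numbers 0x55 and 0x56 correspond to is not established in this thread.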

@NiLuJe
Member

NiLuJe commented Sep 18, 2020 via email

@EastEriq

Here you go. Attached soon after I opened the reader on the device, played a bit with it like turning back and forth a couple of pages, skimmed through settings, dictionary-looked a word, then exited. I'm a bit surprised by the periodic EINVAL appearing; they go at about 2/sec when the reader is left alone; but that's the way it is.

ereader_strace.txt

For the next two days I'll be offline, in case a new party starts...

@Uwe-B

Uwe-B commented Sep 18, 2020

I'll do my best to help out with testing if needed.

NiLuJe added a commit to NiLuJe/koreader that referenced this issue Sep 19, 2020
NiLuJe added a commit that referenced this issue Sep 19, 2020
* Fix WAIT_FOR_COMPLETE ioctl (fix #6000)

* Prevent a promotion to a flashing on fg/bg toggle

* Bump base for the matching PB updates

(koreader/koreader-base#1188)
@Frenzie Frenzie added this to the 2020.10 milestone Sep 19, 2020
10 participants