Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash accessing network menu on k3 #5780

Closed
icetbr opened this issue Jan 20, 2020 · 31 comments · Fixed by #5875
Closed

Crash accessing network menu on k3 #5780

icetbr opened this issue Jan 20, 2020 · 31 comments · Fixed by #5875
Milestone

Comments

@icetbr
Copy link

icetbr commented Jan 20, 2020

  • KOReader version: 2019.12
  • Device: K3

Issue

01/20/20-12:47:30 WARN  menu id not found: taps_and_gestures
Segmentation fault

Steps to reproduce

Menu -> Network

crash.log (if applicable)

crash.log

@Frenzie Frenzie changed the title Crash accessing network menu on k3 (taps_and_gestures not found) Crash accessing network menu on k3 Feb 10, 2020
@Frenzie
Copy link
Member

Frenzie commented Feb 10, 2020

As an aside, the menu id not found warning is quite unlikely to be related to the segmentation fault.

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

Oh, yeah, it's been there for a while (c.f., #4845).

I'll see if I can get a usable backtrace...

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

Also, obvious question is obvious: does it happen on the current stable/nightlies?

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

Okay, I can repro.

More info once I unfuck my GDB build ;D.

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

Program terminated with signal SIGSEGV, Segmentation fault.
#0  hb_atomic_int_t::get_relaxed (this=0xc1a61fac) at hb-atomic.hh:267
267     hb-atomic.hh: No such file or directory.
(gdb) bt full
#0  hb_atomic_int_t::get_relaxed (this=0xc1a61fac) at hb-atomic.hh:267
No locals.
#1  hb_reference_count_t::is_inert (this=0xc1a61fac) at hb-object.hh:157
No locals.
#2  hb_object_is_inert<hb_shape_plan_t> (obj=0xc1a61fac) at hb-object.hh:246
No locals.
#3  hb_shape_plan_execute (shape_plan=shape_plan@entry=0xc1a61fac, font=font@entry=0x21c8c8, buffer=buffer@entry=0x21c930, features=features@entry=0x21c9e0, num_features=num_features@entry=2) at hb-shape-plan.cc:391
        __PRETTY_FUNCTION__ = "hb_bool_t hb_shape_plan_execute(hb_shape_plan_t*, hb_font_t*, hb_buffer_t*, const hb_feature_t*, unsigned int)"
#4  0x40924bb4 in hb_shape_full (font=font@entry=0x21c8c8, buffer=buffer@entry=0x21c930, features=features@entry=0x21c9e0, num_features=num_features@entry=2, shaper_list=shaper_list@entry=0x0) at hb-shape.cc:139
        shape_plan = 0xc1a61fac
        res = <optimized out>
#5  0x40924bf0 in hb_shape (font=font@entry=0x21c8c8, buffer=buffer@entry=0x21c930, features=features@entry=0x21c9e0, num_features=num_features@entry=2) at hb-shape.cc:169
No locals.
#6  0x4097b350 in XText::measureSegment (hints=6, end=2, start=0, font_num=0, this=0x5d7100) at xtext.cpp:828
        len = <optimized out>
        glyph_pos = <optimized out>
        final_width = <optimized out>
        hcl = <optimized out>
        t_notdef_start = <optimized out>
        _hb_font = <optimized out>
        is_rtl = <optimized out>
        glyph_info = <optimized out>
        t_notdef_end = <optimized out>
        notdef_width = <optimized out>
        hb_data = <optimized out>
        _hb_buffer = <optimized out>
        hb_flags = <optimized out>
        cur_cluster = <optimized out>
        _hb_features = <optimized out>
        _hb_features_nb = <optimized out>
        glyph_count = <optimized out>
        hg = <optimized out>
        len = <optimized out>
        hb_data = <optimized out>
        _hb_font = <optimized out>
        _hb_buffer = <optimized out>
        _hb_features = <optimized out>
        _hb_features_nb = <optimized out>
        hb_flags = <optimized out>
        is_rtl = <optimized out>
        glyph_count = <optimized out>
        glyph_info = <optimized out>
        glyph_pos = <optimized out>
        final_width = <optimized out>
        cur_cluster = <optimized out>
        hg = <optimized out>
        hcl = <optimized out>
        t_notdef_start = <optimized out>
        t_notdef_end = <optimized out>
--Type <RET> for more, q to quit, c to continue without paging--
        notdef_width = <optimized out>
        t = <optimized out>
        cur_width = <optimized out>
        advance = <optimized out>
        fb_hints = <optimized out>
        fallback_width = <optimized out>
        fb_hints = <optimized out>
        fallback_width = <optimized out>
#7  XText::measure (this=0x5d7100) at xtext.cpp:751
        end = <optimized out>
        w = <optimized out>
        end = <optimized out>
        w = <optimized out>
        hints = <optimized out>
        hints = <optimized out>
        end_of_text = <optimized out>
        line_break = <optimized out>
        end_of_text = <optimized out>
        line_break = <optimized out>
        bidi_level_changed = <optimized out>
        last_direction = <optimized out>
        script_changed = <optimized out>
        bidi_level_changed = <optimized out>
        last_direction = <optimized out>
        script_changed = <optimized out>
        i = <optimized out>
        i = <optimized out>
        lbCtx = {lang = 0x0, lbpLang = 0x0, lbcCur = LBP_AL, lbcNew = LBP_AL, lbcLast = LBP_SP, fLb8aZwj = false, fLb10LeadSpace = false, fLb21aHebrew = false, cLb30aRI = 0}
        final_width = <optimized out>
        prev_para_start = <optimized out>
        new_bidi_level = <optimized out>
        unicode_funcs = <optimized out>
        prev_script = <optimized out>
        start = <optimized out>
        last_bidi_level = <optimized out>
        lbCtx = <optimized out>
        final_width = <optimized out>
        prev_para_start = <optimized out>
        start = <optimized out>
        last_bidi_level = <optimized out>
        new_bidi_level = <optimized out>
        unicode_funcs = <optimized out>
        prev_script = <optimized out>
        i = <optimized out>
        end_of_text = <optimized out>
        bidi_level_changed = <optimized out>
        last_direction = <optimized out>
        script_changed = <optimized out>
        line_break = <optimized out>
--Type <RET> for more, q to quit, c to continue without paging--
        script = <optimized out>
        ch = <optimized out>
        brk = <optimized out>
        pch = <optimized out>
        hints = <optimized out>
        end = <optimized out>
        w = <optimized out>
#8  XText_measure (L=<optimized out>) at xtext.cpp:2114
        xt = 0x5d7100
#9  0x0006e3a0 in lj_BC_FUNCC () at buildvm_arm.dasc:919
No locals.
#10 0x00064990 in lua_pcall (L=L@entry=0x401cd1c0, nargs=nargs@entry=1, nresults=-1, errfunc=errfunc@entry=2) at lj_api.c:1129
        g = 0x401cd1f0
        oldh = 0 '\000'
        ef = <optimized out>
        status = -1046077524
#11 0x000145c0 in docall (L=L@entry=0x401cd1c0, narg=narg@entry=1, clear=clear@entry=0) at luajit.c:121
        status = <optimized out>
        base = 2
#12 0x00014f50 in handle_script (L=L@entry=0x401cd1c0, argx=argx@entry=0xbedf4148) at luajit.c:292
        narg = 1
        status = <optimized out>
        fname = <optimized out>
#13 0x00015788 in pmain (L=0x401cd1c0) at luajit.c:553
        s = 0x871b8 <smain>
        argv = 0xbedf4144
        argn = 1
        flags = 0
#14 0x0006e3a0 in lj_BC_FUNCC () at buildvm_arm.dasc:919
No locals.
#15 0x00064af8 in lua_cpcall (L=L@entry=0x401cd1c0, func=<optimized out>, ud=ud@entry=0x0) at lj_api.c:1153
        g = 0x401cd1f0
        oldh = 0 '\000'
        status = -1046077524
#16 0x00015838 in main (argc=3, argv=0xbedf4144) at luajit.c:582
        status = <optimized out>
        L = 0x401cd1c0

Ping @poire-z ;).

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

This is an older CPU, armv6 (i.MX35, arm1136JF-S), and that was built with the usual GCC 7.5 Linaro TC (in case it matters, since it seems to die in a gnarly GCC intrinsics).

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

And in case the smaller screen helps reproducing this elsewhere: 600x800 (167dpi) ;).

@poire-z
Copy link
Contributor

poire-z commented Feb 10, 2020

No real idea.
measureSegment (hints=6, end=2, start=0... says it's a 2 chars string.
Might be interesting to see if it fails on the first attempt at measureing something, or if it's triggered by some specific string. This should tell you:

--- a/frontend/ui/widget/textboxwidget.lua
+++ b/frontend/ui/widget/textboxwidget.lua
@@ -217,2 +217,3 @@ function TextBoxWidget:_measureWithXText()
     if type(self.charlist) == "table" then
+        require("logger").warn("TextBoxWidget:_measureWithXText", self.charlist)
         self._xtext = xtext.new(self.charlist, self.face, self.auto_para_direction,
@@ -225,2 +226,3 @@ function TextBoxWidget:_measureWithXText()
         end
+        require("logger").warn("TextBoxWidget:_measureWithXText", self.text)
         self._xtext = xtext.new(self.text, self.face, self.auto_para_direction,
diff --git a/frontend/ui/widget/textwidget.lua b/frontend/ui/widget/textwidget.lua
index c0ccac3c..11760ece 100644
--- a/frontend/ui/widget/textwidget.lua
+++ b/frontend/ui/widget/textwidget.lua
@@ -156,2 +156,3 @@ function TextWidget:_measureWithXText()
     end
+    require("logger").warn("TextWidget:_measureWithXText", self.text)
     self._xtext = xtext.new(self.text, self.face, self.auto_para_direction,

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

FWIW, switching to the STL atomics doesn't help.

#0  std::__atomic_base<int>::load (__m=std::memory_order_relaxed, this=0xc1929fac) at /home/niluje/x-tools/arm-kindle-linux-gnueabi/arm-kindle-linux-gnueabi/include/c++/7.5.0/bits/atomic_base.h:396
396     /home/niluje/x-tools/arm-kindle-linux-gnueabi/arm-kindle-linux-gnueabi/include/c++/7.5.0/bits/atomic_base.h: No such file or directory.
(gdb) bt full
#0  std::__atomic_base<int>::load (__m=std::memory_order_relaxed, this=0xc1929fac) at /home/niluje/x-tools/arm-kindle-linux-gnueabi/arm-kindle-linux-gnueabi/include/c++/7.5.0/bits/atomic_base.h:396
        __b = <optimized out>
        __b = <optimized out>

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

@poire-z:

02/10/20-23:39:08 WARN  TextWidget:_measureWithXText ▢ 
Segmentation fault

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

That would be the first entry, [ ] Wi-Fi connection. o_O

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

Disabling xtext obviously works ;).

Disabling atomics entirely in HB also does it.

diff --git a/src/hb-atomic.hh b/src/hb-atomic.hh
index f9afd4ff..783a4198 100644
--- a/src/hb-atomic.hh
+++ b/src/hb-atomic.hh
@@ -49,7 +49,7 @@
 /* Defined externally, i.e. in config.h. */
 
 
-#elif !defined(HB_NO_MT) && defined(__ATOMIC_ACQUIRE)
+#elif !defined(HB_NO_MT) && defined(__ATOMIC_ACQUIRE) && defined(__NOPEY_NOPE)
 
 /* C++11-style GCC primitives. */
 
@@ -72,7 +72,7 @@ _hb_atomic_ptr_impl_cmplexch (const void **P, const void *O_, const void *N)
 }
 #define hb_atomic_ptr_impl_cmpexch(P,O,N)      _hb_atomic_ptr_impl_cmplexch ((const void **) (P), (O), (N))
 
-#elif !defined(HB_NO_MT) && __cplusplus >= 201103L
+#elif !defined(HB_NO_MT) && __cplusplus >= 201103L && defined(__NOPEY_NOPE)
 
 /* C++11 atomics. */

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

@poire-z : Do you want to investigate or should I should apply that on kindle-legacy and be done with it?

(EDIT: Probably on PB too).

@poire-z
Copy link
Contributor

poire-z commented Feb 10, 2020

I don't feel like I can investigate anything :| That's some low level cooking you understand a lot more than me :)
So, go ahead being done with it :)

@poire-z
Copy link
Contributor

poire-z commented Feb 10, 2020

That would be the first entry, [ ] Wi-Fi connection.

But... to go to Menu > Network, xtest has measuring and HB has shaped a lot of strings... So, it's not the first measuring, you've seen other WARN before the crash, right?
Does going to some other menu with a [ ] checkbox fail too ? Like Typeset icon > Styles ?

@poire-z
Copy link
Contributor

poire-z commented Feb 10, 2020

Or there's some multithreading on that old Kindle ?! :)
Like that menu item being updated aynchronously ? But even with coroutines, there should be only a single call happening at a time :|

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

Yep, even though I went the quick route (i.e., startup, menu, left, left, ..., Network) ;).

And, I hadn't really noticed because I was tail'ing post-crash, but this leads to... interesting results:

02/10/20-23:38:47 WARN  TextWidget:_measureWithXText Q
02/10/20-23:38:47 WARN  TextWidget:_measureWithXText W
02/10/20-23:38:48 WARN  TextWidget:_measureWithXText E
02/10/20-23:38:48 WARN  TextWidget:_measureWithXText R
02/10/20-23:38:48 WARN  TextWidget:_measureWithXText T
02/10/20-23:38:48 WARN  TextWidget:_measureWithXText Y
02/10/20-23:38:48 WARN  TextWidget:_measureWithXText U
02/10/20-23:38:48 WARN  TextWidget:_measureWithXText I
02/10/20-23:38:48 WARN  TextWidget:_measureWithXText O
02/10/20-23:38:48 WARN  TextWidget:_measureWithXText P
02/10/20-23:38:54 WARN  menu id not found: taps_and_gestures
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText History
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText Open last document
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText Favorites
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText System statistics
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText Update ▸
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText Version
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText Help ▸
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText Exit ▸
02/10/20-23:38:55 WARN  TextWidget:_measureWithXText 11:38 PM ⌁100%
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:01 WARN  TextWidget:_measureWithXText ▢ 
02/10/20-23:39:02 WARN  TextWidget:_measureWithXText Night mode
02/10/20-23:39:02 WARN  TextWidget:_measureWithXText Network ▸
02/10/20-23:39:02 WARN  TextWidget:_measureWithXText Screen ▸
02/10/20-23:39:02 WARN  TextWidget:_measureWithXText Navigation ▸
02/10/20-23:39:02 WARN  TextWidget:_measureWithXText Document ▸
02/10/20-23:39:02 WARN  TextWidget:_measureWithXText Language ▸
02/10/20-23:39:02 WARN  TextWidget:_measureWithXText Device ▸
02/10/20-23:39:02 WARN  TextWidget:_measureWithXText 11:39 PM ⌁100%
02/10/20-23:39:08 WARN  TextWidget:_measureWithXText ▢ 
Segmentation fault

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

(The QWERTYUIOP stuff is the non-touch labels for the keyboard shortcuts).

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

Ooh, err, the WiFi callback happens to be asynchronous on Kindle, yeah, would that matter?

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

Eh, the is_checked function just does a popen, I don't think the actual on-click callback should matter?

@poire-z
Copy link
Contributor

poire-z commented Feb 10, 2020

If you disable that callback stuff in the code, and all works fine, well, it gets more interesting :)
But no issue on other Kindles?
I remember writting:

class XText {
private:
    // Shared by all XText instances. Should not be used
    // across calls to shapeLine()
    static xtext_shapeinfo_t s_shape_result[MAX_LINE_GLYPHS];
    static bool s_libunibreak_init_done;
public:

and all the calls, measure, shaping, must be serialized - but Lua should ensure that, even if from multiple coroutines, only one should execute one call and wait before another can.

Does that popen would do some UI/Text stuff?

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

Nope, it basically returns 0 or 1 ^^. (Spoiler: always 0 on a K3 anyway ;p).

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

And no issue on my PW2 on a 2019.12+ nightly, yeah.

EDIT: (Where the is_checked function doesn't popen, though, it uses some native Lua bindings for the request).

@poire-z
Copy link
Contributor

poire-z commented Feb 10, 2020

It's a keyboard device? No other KB device to see if it's a common thing with that kind?
What if you disable flash UI?

@poire-z
Copy link
Contributor

poire-z commented Feb 10, 2020

liblipclua ? That's not something we build, right? It's some own Kindle stuff, and they would use Lua ?!

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

Yeah, it has a KB. And the only other KB device I have is also an older Kindle using the same CPU ;p.

Flash UI doesn't change anything.

And, yeah, that's lipclua. Everything since the K5 is using X with the awesome WM, which is written in Lua, so there's a bunch of Lua bits ;).

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

And finally, NOP'ing the is_checked doesn't help ;).

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

And, AFAICT, other menus w/ leading checkboxes are fine, both checked and unchecked.

@poire-z
Copy link
Contributor

poire-z commented Feb 10, 2020

Well, no real idea.
May be just G_reader_settings:saveSetting("use_xtext", false) early on that kind of Kindle - so we can easily re-enable it the day we get some lightbulb idea :)

@NiLuJe
Copy link
Member

NiLuJe commented Feb 10, 2020

Is it normal that I'm seeing far more 02/11/20-00:51:27 WARN TextWidget:_measureWithXText ▢ than there are actual checkboxes? And that the matching label never makes it?

@poire-z
Copy link
Contributor

poire-z commented Feb 11, 2020

I see a lot of that on the emulator:

02/11/20-01:25:03 WARN  TextWidget:_measureWithXText â¢
02/11/20-01:25:03 WARN  TextWidget:_measureWithXText â¢
02/11/20-01:25:03 WARN  TextWidget:_measureWithXText â¢
02/11/20-01:25:03 WARN  TextWidget:_measureWithXText â¢
02/11/20-01:25:03 WARN  TextWidget:_measureWithXText â¢
02/11/20-01:25:03 WARN  TextWidget:_measureWithXText â¢
02/11/20-01:25:03 WARN  TextWidget:_measureWithXText â¢
02/11/20-01:25:03 WARN  TextWidget:_measureWithXText â¢
02/11/20-01:25:03 WARN  TextWidget:_measureWithXText â¢

which is white square with rounded corners (U+25A2)
I think the chekbox/checkmarks is not really optimized, and may compute that for each instance.

self.dimen = unchecked_widget:getSize()

NiLuJe added a commit to koreader/koreader-base that referenced this issue Feb 11, 2020
* On Kindle legacy, disable atomics in HB

Workaround for koreader/koreader#5780
@NiLuJe NiLuJe mentioned this issue Feb 17, 2020
@Frenzie Frenzie added this to the 2020.03 milestone Feb 18, 2020
NiLuJe added a commit to NiLuJe/koreader that referenced this issue Apr 4, 2020
Because apparently stuff blows up in weird and interesting ways for...
reasons.

Might be relevant on older PB devices, too?

Re koreader#5780
Closes koreader#6024
NiLuJe added a commit to koreader/koreader-base that referenced this issue Apr 5, 2020
* Ditch thread-safety in HB on LEGACY & PB to avoid weird atomics issues on armv6

koreader/koreader#6024
koreader/koreader#5780

* Don't ship glib on Kindle

High kablooey potential, because we pull liblipc, which depends on it and
libgthread, so we'd be mixing glib versions between the two...

Regression since #1048
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants