PW4: Memory above 93 leads to device freeze on exit - a potential memory leak? #7853

Closed
easyrider opened this issue Jun 15, 2021 · 48 comments
Labels: can't fix (an issue that, by definition, cannot be fixed), firmware, help-wanted (We'd like help with this issue), Kindle

Comments

@easyrider

  • KOReader version: 2021.05
  • Device: PW4

Issue

The longer I read the higher memory consumption goes - memory leak?
The only solution that I'm aware of is to restart KOReader, then exit.
I tried adjusting DGLOBAL_CACHE_FREE_PROPORTION but without success.

Steps to reproduce

Once memory usage goes above 93, my Kindle freezes when I try to exit (OOM?).
Above 150, the device freezes while reading (OOM?).
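
To put numbers on "the longer I read, the higher it goes", one option is to sample the reader's resident memory at intervals. This is only a minimal sketch, assuming a Linux /proc filesystem; the helper names `rss_kb` and `log_rss` are made up for illustration and are not part of KOReader.

```shell
#!/bin/sh
# Print the resident set size (VmRSS, in kB) of a process, read from /proc.
# $1: PID to inspect
rss_kb() {
    awk '/^VmRSS:/ { print $2 }' "/proc/$1/status"
}

# Append "epoch-seconds rss-kB" samples for a PID to a log file.
# $1: PID, $2: number of samples, $3: seconds between samples, $4: log file
log_rss() {
    i=0
    while [ "$i" -lt "$2" ]; do
        echo "$(date +%s) $(rss_kb "$1")" >> "$4"
        i=$((i + 1))
        # Only wait between samples, not after the last one
        [ "$i" -lt "$2" ] && sleep "$3"
    done
    return 0
}
```

For example, `log_rss <pid of luajit> 60 10 /tmp/koreader_rss.log` would record one sample every 10 seconds for about 10 minutes (the log path is just an example).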

@pazos
Member

pazos commented Jun 15, 2021

The longer I read the higher memory consumption goes

Is that a PDF? In that case memory should grow up to a certain threshold, then stop there. On my device it reaches 300MB of usage, but everything works normally.

On the other hand, crengine memory usage depends a lot on the current document, but it never goes past 100/150MB in my own usage.

a potential memory leak?

Probably not, at least in crengine. I read on my Bq almost every day and don't update often, so it can be running the same LuaJIT instance for months.

OOM?

Probably. Kindle devices run a bunch of stuff under the hood.

@Frenzie
Member

Frenzie commented Jun 15, 2021

Kindles are apparently extremely memory constrained.

@NiLuJe
Member

NiLuJe commented Jun 15, 2021

I was going to ask on which FW version this is, but, given that it's a PW4, it's going to be recent enough, so, yeah: extremely memory constrained.

@NiLuJe NiLuJe added the can't fix, firmware, and Kindle labels Jun 15, 2021
@NiLuJe
Member

NiLuJe commented Jun 15, 2021

The difference in behavior on exit purely stems from what the OOM killer targets first. Given that the worst offender is the native system, it indeed makes perfect sense that it gets killed first, which yields confusing soft-locks on exit.
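
For anyone curious which processes the kernel would pick first, Linux exposes a per-process "badness" score in /proc/<pid>/oom_score, where a higher score means a more likely target. The following is a minimal sketch rather than anything KOReader ships; `oom_ranking` is a hypothetical helper, and it assumes the kernel also provides /proc/<pid>/comm.

```shell
#!/bin/sh
# Print "oom_score pid name" for every process, highest score first.
# $1: optional proc root (defaults to /proc), mainly useful for testing
oom_ranking() {
    root=${1:-/proc}
    for d in "$root"/[0-9]*; do
        [ -r "$d/oom_score" ] || continue
        # A process may exit between the glob and the read; skip it then
        score=$(cat "$d/oom_score" 2>/dev/null) || continue
        name=$(cat "$d/comm" 2>/dev/null)
        echo "$score ${d##*/} $name"
    done | sort -rn
}
```

Running `oom_ranking | head` shortly before exiting should show whether the native framework processes rank above the reader.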

@NiLuJe
Member

NiLuJe commented Jun 15, 2021

You could possibly cobble something together by stopping/restarting the KF8 & KFX readers and a few other non-essential auxiliary daemons, but unfortunately, the worst offender is the JVM, and that's a tougher nut to get rid of without hampering functionality.

@NiLuJe NiLuJe added the help-wanted label Jun 15, 2021
@NiLuJe
Member

NiLuJe commented Jun 15, 2021

I don't have an affected device, but if someone's willing to do the work, I can point you in the right direction.

@easyrider
Author

@NiLuJe, please give me directions and I'll investigate it further.
What other devices do you recommend with better memory management?

@easyrider
Author

The other strange thing I found is that memory is freed only after a KOReader restart and not after closing a book - maybe that's a good occasion to free memory/cache?

@NiLuJe
Member

NiLuJe commented Jun 15, 2021

@NiLuJe, please give me directions and I'll investigate it further.

Basically, check the process list, and look for stuff that isn't critical (e.g., cvm, pillow, blanket), and look for the matching upstart job, and look at that job's dep graph to see if it can be easily stopped/started without wreaking havoc.
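
As a starting point for the "matching upstart job" step, a plain grep over the job directory is usually enough. This is a rough sketch, assuming the jobs live in /etc/upstart as on the devices discussed here; `find_upstart_jobs` is a hypothetical helper name.

```shell
#!/bin/sh
# List the Upstart job files whose contents mention a given binary.
# $1: binary name (e.g. cvm), $2: job directory (defaults to /etc/upstart)
find_upstart_jobs() {
    # -l prints only file names, -s silences unreadable entries
    grep -l -s -- "$1" "${2:-/etc/upstart}"/* 2>/dev/null
}
```

For example, `find_upstart_jobs cvm` prints every file under /etc/upstart that references cvm, which narrows down which job to inspect next.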

I mentioned the KF8 & KFX readers because they're somewhat obvious, and I have experience with doing just that to the KF8 reader in the font hack.

What other devices do you recommend with better memory management?

Any Kobo (except the upcoming one which we can't support without basically a new port, because sunxi).

@NiLuJe
Member

NiLuJe commented Jun 15, 2021

The other strange thing I found is that memory is freed only after a KOReader restart and not after closing a book - maybe that's a good occasion to free memory/cache?

Anything that can be released on book close already is.

@mergen3107
Contributor

@NiLuJe

Basically, check the process list, and look for stuff that isn't critical (e.g., cvm, pillow, blanket)

I tried this on PW3, with ps -o pid,user,%mem,command ax | sort -b -k3 -r, and it showed:
PW3 memory usage.txt
KOReader was running at the moment, and free -m showed around 7 MB free.

killall cvm triggers a framework restart on top of running KOReader, which stays responsive. There was also a message from your ScreenSavers package (about shuffling the images) which is usually shown at the startup of Kindle. The free memory was around 70 MB before framework restart finished. KOReader was working fine, I even managed to restart it.

Killing blanket triggers the same framework restart.
Killing pillowd yields nothing but 1-2 MB free.

look for the matching upstart job

How to check this?

look at that job's dep graph

As well as this?
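
One way to approach the dep graph question: Upstart jobs declare their dependencies through `start on` and `stop on` stanzas, so grepping for those shows what a job waits for and what reacts to it. The sketch below is a guess at a workable approach, not an official tool; `job_deps` and `job_dependents` are hypothetical names, and conditions split across several lines will be missed.

```shell
#!/bin/sh
# Print the "start on" / "stop on" stanzas of an Upstart job file, i.e.
# the events (and therefore the other jobs) it depends on.
# $1: path to the job file
job_deps() {
    grep -E '^[[:space:]]*(start|stop) on' "$1"
}

# List job files that react to a given job's events, i.e. jobs that may be
# affected when that job is stopped or started.
# $1: job name, $2: job directory (defaults to /etc/upstart)
job_dependents() {
    grep -l -s -E "^[[:space:]]*(start|stop) on .*(started|starting|stopped|stopping) $1([[:space:])]|\$)" "${2:-/etc/upstart}"/*
}
```

If `job_dependents somejob` comes back empty, stopping that job is less likely to drag other jobs down with it, although this still needs checking on the device.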

@easyrider
Author

easyrider commented Jun 15, 2021

@NiLuJe, my initial findings are as follows:

M130 (a PDF file)

root      3927  0.7 27.6 519040 138948 ?       S<l  17:46   1:53 ./luajit ./reader.lua
9000      5659  1.1 26.8 284756 135016 ?       Sl   11:23   7:04 /usr/java/bin/cvm 
9000     16373  0.0  4.0 119520 20552 ?        Sl   11:25   0:01 /usr/bin/mesquite -d com.lab126.stored
root      3234  0.0  2.0  30280 10108 ?        S<l  11:23   0:19 Xorg -nolisten tcp +bs -ardelay 0 -arinterval 0
9000      4811  0.0  1.4  90516  7284 ?        S<sl 11:23   0:00 webreader
root      3749  0.0  1.3  26560  6588 ?        Sl   11:23   0:09 blanket -t splash screensaver langpicker blankwindow
root      2388  0.0  1.1  41720  5992 ?        Ssl  11:23   0:10 tmd -f
root      3748  0.0  1.1  24944  5748 ?        Tl   11:23   0:02 awesome

M90 (a PDF file)

9000      5659  1.1 26.8 284752 134764 ?       Sl   11:23   7:00
root      3927  0.6 18.4 487500 92628 ?        S<l  17:46   1:37 ./luajit ./reader.lua
9000     16373  0.0  4.0 119520 20552 ?        Sl   11:25   0:01 /usr/bin/mesquite -d com.lab126.stored
root      3234  0.0  2.0  30276 10108 ?        S<l  11:23   0:18 Xorg -nolisten tcp +bs -ardelay 0 -arinterval 0
9000      4811  0.0  1.4  90516  7284 ?        S<sl 11:23   0:00 webreader
root      3749  0.0  1.3  26560  6588 ?        Sl   11:23   0:09 blanket -t splash screensaver langpicker blankwindow
root      2388  0.0  1.1  41720  5992 ?        Ssl  11:23   0:10 tmd -f
root      3748  0.0  1.1  24944  5748 ?        Tl   11:23   0:02 awesome

after restart

9000      5659  1.1 27.1 284752 136496 ?       Sl   11:23   7:28 /usr/java/bin/cvm 
root     13025  0.7  5.7 184696 29108 ?        S<l  21:58   0:02 ./luajit ./reader.lua

So basically the biggest memory increase is in reader.lua, from 18.4% to 27.6%.
CVM stayed constant. The other processes are pretty minor.
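
In absolute terms, the RSS column (the sixth field of these listings) shows reader.lua growing from 92628 kB to 138948 kB between the two snapshots, about 45 MB, while cvm moves by only 252 kB. A small sketch for pulling those numbers out of a saved listing, with `rss_of` and `rss_delta` as made-up helper names:

```shell
#!/bin/sh
# Extract the RSS column (kB, 6th field of `ps aux`-style output) for the
# first line matching a pattern. Reads the listing from stdin.
# $1: pattern to look for in the line
rss_of() {
    awk -v pat="$1" '$0 ~ pat { print $6; exit }'
}

# Difference in kB between two RSS readings.
# $1: earlier reading, $2: later reading
rss_delta() {
    echo $(( $2 - $1 ))
}
```

For example, `rss_delta "$(rss_of reader.lua < m90.txt)" "$(rss_of reader.lua < m130.txt)"` gives the growth between two saved snapshots (the file names are placeholders).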

Please give more details on dep graph monitoring if needed.

@techmoan

Maybe you guys can make something like this #7823, it's a solid solution if you ask me... hashtag just sayin ;)

@mergen3107
Contributor

I dug a little further... (!) I have USBNetwork over Wi-Fi auto-starting on boot, as well as Rescue Pack and Coward's Rescue Pack, so make sure you have them too before trying this. (!)

In the /etc/upstart/framework.conf there are these lines:

  # check here so we don't engage respawning
  if [ -f /mnt/us/DONT_START_FRAMEWORK ] ; then
    f_log I framework dont_start
    stop
    exit 0
  fi

So, naturally, I went ahead and created this file. Then I did killall cvm again. It again looked as if the framework was restarting, i.e. splash screen + progress bar + ScreenSaver thingy. However, cvm did not come back after it finished. I had to do something blindly in KOReader so it would redraw. Now there are 100-102 MB free.
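
Since leaving that flag file behind across a reboot is risky, anyone repeating the experiment might wrap it so the file is removed again once the test command finishes. This is an untested sketch of that idea; `with_flag_file` is a hypothetical helper, and it does not protect against power loss or the shell being killed mid-run.

```shell
#!/bin/sh
# Run a command while a flag file exists, then always remove the flag.
# $1: flag file path; remaining arguments: command to run
with_flag_file() {
    flag=$1
    shift
    # Create (or truncate) the flag; give up if that fails
    : > "$flag" || return 1
    "$@"
    status=$?
    rm -f "$flag"
    return "$status"
}
```

For example, `with_flag_file /mnt/us/DONT_START_FRAMEWORK killall cvm` creates the file, runs the kill, and removes the file afterwards. Whether the framework stays down once the flag is gone is an assumption that would need checking on a device.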

Half-success I guess?

@mergen3107
Contributor

P.S. Obviously, I wouldn't try to restart Kindle with that file existing :D

@easyrider
Author

Maybe you guys can make something like this #7823, it's a solid solution if you ask me... hashtag just sayin ;)

Yeah, I've seen that - to me it's more of a workaround than a fix for the underlying problem (whatever causes the memory increase). Still better than constant crashes though.

@NiLuJe
Member

NiLuJe commented Jun 15, 2021

We can't kill the framework, because we lose suspend/buttons/wifi and a whole bunch of other things I might be forgetting, which is precisely why the "no framework" button is disabled on FW 5.x ;).

Of course, reinventing the wheel and making no framework behave on FW 5.x is another possibility. It's a much harder task requiring a lot of low-level, potentially device-specific stuff, though; and I'd probably veto it because it'd be a maintenance nightmare ^^.

@NiLuJe
Member

NiLuJe commented Jun 15, 2021

As for the ps output: of course cvm isn't budging: we're SIGSTOP'ing it. The issue is the insane memory pressure 'it' (and all its friends) puts the system under just by being, not what it does while we run.
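
For reference, stopping a process this way only freezes it; its memory stays allocated, which is why the RSS doesn't move. A stopped process shows a T in the STAT column of ps (as awesome does in the listings above). Below is a minimal sketch of the mechanism with made-up helper names; it is not the actual KOReader startup script.

```shell
#!/bin/sh
# Freeze a process without terminating it; its memory stays allocated.
# $1: PID
pause_proc() { kill -STOP "$1"; }

# Let a previously frozen process run again.
# $1: PID
resume_proc() { kill -CONT "$1"; }

# Print the one-letter state of a process from /proc (T means stopped).
# $1: PID
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}
```

After `pause_proc <pid>`, `proc_state <pid>` should report T until `resume_proc <pid>` is called.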

@NiLuJe
Member

NiLuJe commented Jun 15, 2021

Much like webreader & kfxreader, mesquite might be a safe target for a stop/start loop, though. Or not, depending on FW versions?
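
A stop/start loop of that kind could look roughly like the sketch below: stop a list of jobs, run the reader, then start the jobs again in reverse order. It assumes Upstart's `stop` and `start` commands are on the PATH; the job names passed in and the helper name `run_with_jobs_stopped` are placeholders, not a list of jobs known to be safe.

```shell
#!/bin/sh
# Commands used to stop and start a job; overridable for dry runs.
STOP_CMD=${STOP_CMD:-stop}
START_CMD=${START_CMD:-start}

# Stop the given jobs, run a command, then start the jobs again in reverse
# order, returning the command's exit status.
# $1: space-separated job names; remaining arguments: command to run
run_with_jobs_stopped() {
    jobs_list=$1
    shift
    reversed=""
    for j in $jobs_list; do
        $STOP_CMD "$j"
        # Prepend so the restart order is the reverse of the stop order
        reversed="$j $reversed"
    done
    "$@"
    status=$?
    for j in $reversed; do
        $START_CMD "$j"
    done
    return "$status"
}
```

Setting `STOP_CMD="echo stop"` and `START_CMD="echo start"` first gives a dry run that only prints what would happen.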

@easyrider
Author

easyrider commented Jun 16, 2021

After some parameter tweaking I've discovered that the best DGLOBAL_CACHE_FREE_PROPORTION value for PW4 (v2021.05) is 0.6.
I tried hard to freeze the PW4 using that value, without success - I'll continue today.
However, crashes/freezes occurred when the value was 0.2, 0.8, or the default 0.4.

Maybe it's a good idea to use different default values based on a device model?

@Frenzie
Member

Frenzie commented Jun 16, 2021

Maybe it's a good idea to use different default values based on a device model?

Err, 0.6 is higher than 0.4.

You could distinguish between Kindle (something like 0.2-0.4) and all the rest (more like 0.6-0.8) but there's no need for it really.
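
To make the "0.6 is higher than 0.4" point concrete: assuming the setting is the fraction of free memory that the global cache is allowed to use (as its name suggests; this is an assumption, not something spelled out in this thread), a higher value means a bigger cache, not a smaller one. With 50 MB free, 0.4 would allow about 20 MB of cache and 0.6 about 30 MB. A tiny sketch of that arithmetic, with `cache_budget_kb` as a made-up helper:

```shell
#!/bin/sh
# Cache budget in kB for a given amount of free memory and proportion,
# rounded to the nearest kB.
# $1: free memory in kB, $2: proportion between 0 and 1
cache_budget_kb() {
    awk -v free="$1" -v p="$2" 'BEGIN { printf "%d\n", free * p + 0.5 }'
}
```

For example, `cache_budget_kb 51200 0.4` and `cache_budget_kb 51200 0.6` give the two budgets mentioned above in kB.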

@easyrider
Author

easyrider commented Jun 16, 2021

Err, 0.6 is higher than 0.4.

Right, it seems counter-intuitive but works so far - I still need to investigate it further though.
IMHO it looks like an issue with KOReader memory management on that specific PW4 hardware.

@Frenzie
Member

Frenzie commented Jun 16, 2021

You need a certain amount of memory to render something like a high res PDF. That's not so much an issue as just a thing that is unavoidable (of course you can preprocess them to be more embedded device friendly, pdf2djvu perhaps being the quickest and easiest method). That's why it's at a value like 0.4 and not at 0.8.

It'd probably be possible to more or less detect that and refuse in advance but I think that's about it.

If there were an actual issue it'd be crashing constantly on Kobo too, which it doesn't. But it does need a little breathing room — quite little really.

PS You can also try disabling all plugins you don't use. The impact should normally be negligible but even just a few MB might make a difference in these extreme conditions.

@mergen3107
Contributor

In /etc/upstart/framework there are these lines:

if [ $MEMTOTAL -gt 511 ]; then
  HEAP="-minimal -Xmx30m -XX:MaxNewSize=4m -XX:SurvivorRatio=2 -XX:TargetSurvivorRatio=80 -Xss100k -XX:ReservedCodeCacheSize=3m"
  HEAP="$HEAP -XX:CompileThreshold=5000 -XX:CodeCacheMinimumFreeSpace=100k"
  HEAP="$HEAP -XX:NmethodSweepFraction=6 -XX:NmethodSweepActivity=2 -XX:NmethodSweepMaxWaitTime=8 -XX:NmethodHotnessCounterResetValue=64"
elif [ $MEMTOTAL -gt 255 ]; then
  HEAP="-minimal -Xmx23m -XX:MaxNewSize=3m -XX:SurvivorRatio=2 -XX:TargetSurvivorRatio=80 -Xss100k -XX:ReservedCodeCacheSize=2m"
  HEAP="$HEAP -XX:CompileThreshold=8000 -XX:CodeCacheMinimumFreeSpace=100k"
  HEAP="$HEAP -XX:NmethodSweepFraction=4 -XX:NmethodSweepActivity=3 -XX:NmethodSweepMaxWaitTime=5 -XX:NmethodHotnessCounterResetValue=64"
fi
HEAP="$HEAP -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=15 -XX:+UseAggressiveHeapShrink"
HEAP="$HEAP -XX:MaxInlineSize=16 -XX:MaxInlineLevel=2 -XX:-InlineSynchronizedMethods"

I tried to comment out the first section to make it think this is a 256 MB RAM device, so framework would take less RAM. However, it didn't help. The same 7-10 MB free in KOReader.

Does anybody know what it is supposed to do?
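
Reading the snippet literally, the branch only picks between two sets of JVM options, and the maximum heap (-Xmx) differs by just 7 MB between them (30m versus 23m). Since -Xmx is a cap rather than actual usage, a small change in free memory is not surprising. The sketch below merely mirrors the branching quoted above; `heap_tier` is a hypothetical helper, and how the real script computes MEMTOTAL is not shown in this thread.

```shell
#!/bin/sh
# Mirror of the branching quoted above: report which -Xmx value would be
# selected for a given total memory size in MB.
# $1: total memory in MB
heap_tier() {
    if [ "$1" -gt 511 ]; then
        printf '%s\n' "-Xmx30m"
    elif [ "$1" -gt 255 ]; then
        printf '%s\n' "-Xmx23m"
    else
        # Neither branch matches, so no -Xmx is set by this block
        printf '%s\n' "none"
    fi
}
```

Note that a value of exactly 511 already falls into the lower tier, so a device reporting slightly less than 512 MB would never take the first branch at all.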

@mergen3107
Contributor

By the way, does KOReader work with a swap file if one has been created, as explained here?

@NiLuJe
Member

NiLuJe commented Jun 16, 2021

A swap file on eMMC is a terrible, terrible idea.

@mergen3107
Contributor

@NiLuJe
Ok ._.

@NiLuJe
Member

NiLuJe commented Jun 16, 2021

To be fair, a swap file anywhere is almost always a terrible idea ;o). Sane modern systems in this situation should vastly prefer zram/zswap.

Unfortunately, these do not qualify as "modern" systems ;).

@hius07
Member

hius07 commented Jun 17, 2021

Since we are talking about memory:
The "Memory usage" indicator in the status bar was introduced in 2017 in 4316284.

Would it be more informative to show "Free memory" instead? We can get it from

function Cache:_calcFreeMem()

If so, the warning in #7857 might be set up depending on the "Free memory" as well.
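
For comparison outside KOReader, a "free memory" figure can be estimated from /proc/meminfo. The sketch below is not a description of what Cache:_calcFreeMem does; it is a generic approach that uses MemAvailable where the kernel provides it (Linux 3.14 and later) and otherwise falls back to MemFree + Buffers + Cached as a rough approximation. `avail_kb` is a made-up name.

```shell
#!/bin/sh
# Print an estimate of available memory in kB from a meminfo-style file.
# $1: optional path (defaults to /proc/meminfo)
avail_kb() {
    awk '
        $1 == "MemAvailable:" { avail = $2; have = 1 }
        $1 == "MemFree:"      { free = $2 }
        $1 == "Buffers:"      { buffers = $2 }
        $1 == "Cached:"       { cached = $2 }
        # Prefer the kernel estimate; otherwise sum the classic fields
        END { print (have ? avail : free + buffers + cached) }
    ' "${1:-/proc/meminfo}"
}
```

Calling `avail_kb` with no argument reads the live /proc/meminfo.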

@NiLuJe
Member

NiLuJe commented Jun 17, 2021

I'm much more interested in KOReader's actual memory consumption than the available memory as a user, FWIW (as it's a non-issue on sane devices ;p).

Code & user have different goals there.

@hius07
Member

hius07 commented Jun 17, 2021

Returning to your #7857 (comment), how can we set a threshold based on a parameter which we cannot watch anywhere?

@NiLuJe
Member

NiLuJe commented Jun 17, 2021

We covered that already: #7857 (comment)

(i.e.,: we don't ;p, you can forget that earlier comment ;)).

@mergen3107
Contributor

@NiLuJe
Thank you for the latest fixes in #7858 ! I applied the fixes in koreader.sh manually and had to fully exit KOReader and start it again for the fixes to apply (apparently, restart in KOReader isn't enough).
PW3 with 5.9.7 now usually has about 30-40 MB free RAM with 3 pdf, 1 djvu and 1 fb2 books opened in a row. KOReader Memory usage topped out at 185 MB (after 50 pages advance in each book) and doesn't grow further than that with more books. Previously it usually had <10 MB free in the same scenario, however I have never experienced any crashes during reading or exiting, as described by the OP.

@mergen3107
Contributor

mergen3107 commented Jun 18, 2021

@NiLuJe
Maybe I celebrated too soon. However I am not sure that the new koreader.sh script is the reason.
I was trying to install new fonts to /mnt/us/fonts/ , while KOReader was open, then tried to restart KOReader. Instead, it exited and crashlog has the following lines:

06/18/21-01:44:03 INFO  quitting uimanager
06/18/21-01:44:03 INFO  no dialog left to show
lipc-wait-event exited normally with status: 0
Segmentation fault

Is this related to the new koreader.sh or the fonts? Should I try to catch debug logs? This happened to me twice this evening, but I can't reproduce it consistently.

@mergen3107
Contributor

P.S. All the installed fonts showed up fine, and there are no warnings during startup in the crashlog

@easyrider
Author

easyrider commented Jun 18, 2021

The new koreader.sh from #7858 seems to work fine on the PW4 - no crashes.
RAM usage can go up to 175 MB, while at the same time:
Free mem: 32 MB
Available mem: 52 MB

@NiLuJe
Member

NiLuJe commented Jun 18, 2021

@mergen3107: Unrelated. Possibly mildly fonts-related, though? (maybe #7806)

@mergen3107
Contributor

@NiLuJe
I thought about fonts and removed most of the installed fonts, which I don’t use anyway. Maybe having 30+ extra fonts was a little bit too much :D Now I have only 12. I haven’t seen the issue since then.

@mergen3107
Contributor

As for 7806, no, I don’t think it looks the same in my case. My crashes happened only after I hit restart. Also, I wasn’t testing fonts in KOReader yet, because I just uploaded them via WinSCP and wanted to restart KOReader first, to make these fonts appear.

@NiLuJe
Member

NiLuJe commented Jun 18, 2021

I'll chalk it up to Kindle fsp gremlins then ;).

(In any case, can't really tell much about a segfault without a gdb backtrace from a debug build).

@mergen3107
Contributor

What debug build should I get? Is there a newer version since you mentioned in 7806?

@NiLuJe
Member

NiLuJe commented Jun 18, 2021

That one's for Kobo, so, nowhere ^^.

If you can actually somewhat reliably reproduce it, I can build you one, though.

@mergen3107
Contributor

OK! I’ll try to dump all the new fonts I uploaded yesterday back to Kindle to see if I can catch it again

@easyrider
Author

@NiLuJe, it looks like the change from #7858 causes the Wi-Fi icon to disappear and the battery percentage to stop refreshing - I suppose some of the processes aren't restarted when exiting KOReader.

@mergen3107
Contributor

Adding to that, when KOReader crashed in #7869, all my collections in Kindle Menu were gone. I think it might be related to the fact that KOReader didn't have a chance to relaunch these processes it killed at the startup. I am not sure though. I just rebuilt collections in LibrarianSync and they showed up again.

@NiLuJe
Member

NiLuJe commented Jun 19, 2021

Both sound unrelated, and just possibly the OOM killer having killed some stuff in the meantime... (We certainly don't kill anything related to networking or device monitoring, for obvious reasons ;)).

I'll double-check, but, IIRC, that behaved just fine on my end.

@NiLuJe
Member

NiLuJe commented Jun 19, 2021

Yup, works just fine on my end (PW2, 5.9.7, for reference).

I tweaked the ordering in #7868 to make the fact that we're finished clearer, but it shouldn't change a thing in practice.

@easyrider
Author

(We certainly don't kill anything related to networking or device monitoring, for obvious reasons ;)).

On my side it was just an icon refresh problem that didn't happen before. My guess was that after the crash it didn't call restart as expected. However, it may still be unrelated, as you said.

@pazos pazos closed this as completed Jul 14, 2022

7 participants