Memory Error & No Auto Complete when binary flict is large ~ 6mb #403
Comments
Thanks for reporting this issue! I am currently analyzing a few memory crashes (related both to suggestions and to other features). Somewhere in the code some references aren't released, which very quickly builds up garbage and crashes the app on some devices. Will comment here once I have found out more about what could be the cause.
Hmm, I am surprised that the load method doesn't crash for this size. I think I will start there (because the entire binary file is currently read into memory and then analyzed, instead of reading only small chunks and processing them bit by bit), then move on to the actual suggestion algorithm.
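The chunked-reading idea could be sketched roughly like this (a minimal sketch using plain java.io streams; `processInChunks` is a hypothetical helper, not existing FlorisBoard code):

```kotlin
import java.io.ByteArrayInputStream
import java.io.InputStream

// Hypothetical sketch: process a dictionary stream in fixed-size chunks
// instead of reading the whole file into one large ByteArray first.
fun processInChunks(
    input: InputStream,
    chunkSize: Int = 4096,
    onChunk: (ByteArray, Int) -> Unit
): Long {
    val buffer = ByteArray(chunkSize)
    var total = 0L
    while (true) {
        val n = input.read(buffer)
        if (n <= 0) break          // -1 signals end of stream
        onChunk(buffer, n)         // only the first `n` bytes of `buffer` are valid
        total += n
    }
    return total
}
```

The peak allocation stays at one chunk buffer regardless of the file size, instead of growing with the dictionary.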
@patrickgold There's actually a bigger and more serious issue, I believe: the flict compression algorithm actually makes the file significantly bigger than it has to be. I used the
@patrickgold It may help to run the default https://github.com/remi0s/aosp-dictionary-tools on a wordlist and also run the same wordlist through the florisboard dict-tools to verify my results.
@sabzo One thing the Python library currently does not do is utilize the end byte grouping the Flict spec defines, because it always glitched out in the calculation and at the time I just commented out the code: https://github.com/florisboard/dictionary-tools/blob/cc712ffc70b8485ea8fbd25d06381a4fc2b9b906/flict.py#L129-L140 For your file size of >6MB, fixing this can definitely save a lot of bytes in the file size.
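The end byte grouping idea is essentially run-length encoding of tree closes: instead of one end byte per closed level, a single END command byte carries the number of levels to pop. A rough sketch (the constants and masks here are illustrative, not the actual Flict spec values):

```kotlin
// Assumed, illustrative encoding: high bit marks an END command,
// the low bits carry how many tree levels to pop at once.
const val CMDB_END = 0x80
const val ATTR_END_COUNT = 0x3F  // max levels a single END byte can encode

// Collapse `count` consecutive level-closes into as few END bytes as possible.
fun encodeEnds(count: Int): List<Int> {
    require(count > 0) { "must close at least one level" }
    val out = mutableListOf<Int>()
    var remaining = count
    while (remaining > 0) {
        val n = minOf(remaining, ATTR_END_COUNT)
        out.add(CMDB_END or n)
        remaining -= n
    }
    return out
}
```

Closing, say, 3 nested nodes then costs one byte instead of three, which adds up quickly across hundreds of thousands of nodes.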
Since Luminiso doesn't support higher n-grams, is fixing the calculation error in
No, the end bug fix is something I prefer to still include in phase 1, because it reduces the file size and also reduces the number of bytes read into memory at runtime when the binary file is cached.
@sabzo I've pushed a fix for the end count bug on both the dictionary-tools a) the file size of the binary Flictionary decreases and Thanks in advance!
@patrickgold File size decreased by 2 MB to 4.9 MB, however the b)
Memory management + large file sizes seem to be contributing to the above -- makes me wonder if this is why C++ is used in the AOSP keyboards out there for word completion/prediction...
Java is known not to be the best language in terms of memory management, but it is definitely possible to write Kotlin/JVM code which runs smoothly; it just requires a lot of attention and care. I don't want to use C++ alongside Kotlin for this, because then the code would be as readable as the AOSP prediction/dictionary code... I will rewrite the load function either later this evening or tomorrow midday to not preload the binary file all at once but to read the InputStream bit by bit, then I will investigate the behavior with the Android Studio Memory Profiler.
The above PR fixes a lot in the prediction algorithm (both the Flictionary load function and memory management). Could you have a look and check whether it fixes the memory crash for you?
@patrickgold The same memory error and keyboard freezing and restarting still persists, unfortunately. The new flict algorithm produced a 4.9 MB binary, while the same combined wordlist built with the aosp tools still produced a 1.9 MB binary. The compression worked, but it doesn't do better than the aosp compression algorithm.
@patrickgold Revisiting this -- are there plans to reduce the size of flicts to be <= their equivalent AOSP binary size?
@sabzo I will definitely try to decrease the file size of a Flictionary, but the 4.9MB of your Flictionary is not the problem; the problem lies in the representation of the data after it is parsed at runtime, which in its current state is far from memory-efficient. Hence the OOM errors and crashes. In the next few days I will begin phase 2 of the suggestion feature and then I will see where I can improve.
@patrickgold Understood. If you can point me to the part of the code where that representation of the data happens, I can help troubleshoot as well. I have a good amount of time on my hands.
This is the definition of a node: florisboard/app/src/main/java/dev/patrickgold/florisboard/ime/nlp/FlorisLanguageModel.kt, lines 32 to 38 at e4f5fcf
The main issue here is the MutableList (which is an ArrayList in the JVM runtime), which incurs a lot of overhead for each node, even if the list is empty. In the current representation this occurs 2 times per node, and in my precompiled version the Ngram node occurs about 670k times. This is even worse for your 4.9MB Flictionary, thus crashing the app with OOM errors. The problem is that I can't really find a more memory-efficient mutable array than ArrayList (except if I had primitives, which I don't have here). Another thing I see is that I can use a
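One mitigation for the two-empty-ArrayLists-per-node overhead would be to allocate the children lists lazily, so leaf nodes (the overwhelming majority of the ~670k nodes) carry only two null references. This is a sketch under that assumption, not the actual FlorisLanguageModel code:

```kotlin
// Sketch: children lists are created only on first write, so pure leaf
// nodes never allocate an ArrayList at all.
class CompactNgramNode(val order: Int, val char: Char, val freq: Int) {
    private var _same: MutableList<CompactNgramNode>? = null
    private var _higher: MutableList<CompactNgramNode>? = null

    val sameOrderChildren: MutableList<CompactNgramNode>
        get() = _same ?: ArrayList<CompactNgramNode>(2).also { _same = it }

    val higherOrderChildren: MutableList<CompactNgramNode>
        get() = _higher ?: ArrayList<CompactNgramNode>(2).also { _higher = it }

    // A node is a leaf as long as neither list was ever requested for writing.
    fun isLeaf() = _same == null && _higher == null
}
```

The small initial capacity (2) also avoids ArrayList's default backing array of 10 slots for the many nodes with only one or two children.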
I didn't want to believe this when you originally said it, but after investigating and learning more about the JVM, it is increasingly clear to me why the AOSP keyboard uses native C++ for NLP. The JVM just represents data in a very inefficient way, which works well for normal applications and small data sets but quickly becomes a big problem for large data sets like a parsed Flictionary. I will look into rewriting the dictionary base implementation (everything which requires a lot of memory) in native C++ or Rust (depending on whether the F-Droid build servers support compiling Rust; otherwise I will resort to C++), and this should hopefully be the solution to the OOM errors.
@patrickgold F-Droid servers can probably compile Rust, since this app https://github.com/jensstein/oandbackup
@tsiflimagas Good to know, thanks for linking!
@patrickgold Heads up that the latest release https://github.com/florisboard/florisboard/releases/tag/v0.3.10-beta04 breaks the suggestions on the default en.flict. While more suggestions show, tapping on any of them does nothing. Perhaps a memory error too. The app doesn't crash, but the selection of suggestions doesn't work either. On an emulator it works (as the computer has more resources than a regular phone, I assume), but on real devices I've had multiple issues.
The fact that the suggestions show but you can't select them makes me believe there's a bug in the CandidateView touch logic rather than in the suggestions core. If you go to Settings > Typing > Suggestions display mode and set it to classic or dynamic width, does it work then?
@patrickgold You're right about that. Only classic and dynamic width work; dynamic width & scrollable doesn't.
I have an idea what happens: for all modes, a suggestion is marked as soon as your pointer touches down on it. When your pointer moves while down, the scrollable display mode automatically cancels the selection in order to scroll the candidate view. I suspect that an ACTION_MOVE is triggered almost immediately because your touch pointer (finger) can't hold pixel-perfectly still. On an emulator, holding still without moving is far easier because you use a mouse. Anyway, I guess I will have to introduce a minimum distance threshold before the scrolling starts to prevent this kind of bug.
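The minimum distance threshold could look roughly like this (a simplified sketch, detached from the real CandidateView; the class and method names are hypothetical):

```kotlin
import kotlin.math.abs

// Sketch: only cancel the pressed candidate and switch to scrolling once the
// pointer has moved further than a threshold, so natural finger jitter on
// ACTION_MOVE no longer kills the selection.
class CandidateTouchHelper(private val touchSlop: Int) {
    private var downX = 0f
    private var scrolling = false

    fun onDown(x: Float) {
        downX = x
        scrolling = false
    }

    // Returns true once the gesture should be treated as a scroll;
    // until then, small movements keep the candidate selected.
    fun onMove(x: Float): Boolean {
        if (!scrolling && abs(x - downX) > touchSlop) scrolling = true
        return scrolling
    }
}
```

On Android the threshold would typically come from `ViewConfiguration.get(context).scaledTouchSlop` rather than a hard-coded value.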
@patrickgold Understood. I've been experimenting with loading multiple flicts at the same time so users can have multiple-language support without having to manually switch between languages. I know
Into one runtime flict? Untested, but this hacked version could work; it just loops over all passed assetRefs. For n-grams which appear multiple times, the last loaded Flictionary takes precedence:

fun load(context: Context, assetRefs: List<AssetRef>): Result<Flictionary> {
val buffer = ByteArray(5000) { 0 }
var headerStr: String? = null
var date: Long = 0
var version = 0
val ngramTree = NgramTree()
for (assetRef in assetRefs) {
val inputStream: InputStream
if (assetRef.source == AssetSource.Assets) {
inputStream = context.assets.open(assetRef.path)
} else {
return Result.failure(Exception("Only AssetSource.Assets is currently supported!"))
}
var pos = 0
val ngramOrderStack = mutableListOf<Int>()
val ngramTreeStack = mutableListOf<NgramNode>()
while (true) {
if (inputStream.readNext(buffer, 0, 1) <= 0) {
break
}
val cmd = buffer[0].toInt() and 0xFF
when {
(cmd and MASK_BEGIN_PTREE_NODE) == CMDB_BEGIN_PTREE_NODE -> {
if (pos == 0) {
inputStream.close()
return Result.failure(
ParseException(
errorType = ParseException.ErrorType.UNEXPECTED_CMD_BEGIN_PTREE_NODE,
address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
)
)
}
val order = ((cmd and ATTR_PTREE_NODE_ORDER) shr 4) + 1
val type = ((cmd and ATTR_PTREE_NODE_TYPE) shr 2)
val size = (cmd and ATTR_PTREE_NODE_SIZE) + 1
val freq: Int
val freqSize: Int
when (type) {
ATTR_PTREE_NODE_TYPE_CHAR -> {
freq = NgramNode.FREQ_CHARACTER
freqSize = 0
}
ATTR_PTREE_NODE_TYPE_WORD_FILLER -> {
freq = NgramNode.FREQ_WORD_FILLER
freqSize = 0
}
ATTR_PTREE_NODE_TYPE_WORD -> {
if (inputStream.readNext(buffer, 1, 1) > 0) {
freq = buffer[1].toInt() and 0xFF
} else {
inputStream.close()
return Result.failure(
ParseException(
errorType = ParseException.ErrorType.UNEXPECTED_EOF,
address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
)
)
}
freqSize = 1
}
                        else -> {
                            inputStream.close()
                            return Result.failure(Exception("TODO: shortcut not supported"))
                        }
}
if (inputStream.readNext(buffer, freqSize + 1, size) > 0) {
val char = String(buffer, freqSize + 1, size, Charsets.UTF_8)[0]
val node = NgramNode(order, char, freq)
val lastOrder = ngramOrderStack.lastOrNull()
if (lastOrder == null) {
ngramTree.higherOrderChildren.add(node)
} else {
if (lastOrder == order) {
ngramTreeStack.last().sameOrderChildren.add(node)
} else {
ngramTreeStack.last().higherOrderChildren.add(node)
}
}
ngramOrderStack.add(order)
ngramTreeStack.add(node)
pos += (freqSize + 1 + size)
} else {
inputStream.close()
return Result.failure(
ParseException(
errorType = ParseException.ErrorType.UNEXPECTED_EOF,
address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
)
)
}
}
(cmd and MASK_BEGIN_HEADER) == CMDB_BEGIN_HEADER -> {
if (pos != 0) {
inputStream.close()
return Result.failure(
ParseException(
errorType = ParseException.ErrorType.UNEXPECTED_CMD_BEGIN_HEADER,
address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
)
)
}
version = cmd and ATTR_HEADER_VERSION
if (version != VERSION_0) {
inputStream.close()
return Result.failure(
ParseException(
errorType = ParseException.ErrorType.UNSUPPORTED_FLICTIONARY_VERSION,
address = pos,
cmdByte = cmd.toByte(),
absoluteDepth = ngramTreeStack.size
)
)
}
if (inputStream.readNext(buffer, 1, 9) > 0) {
val size = (buffer[1].toInt() and 0xFF)
date =
((buffer[2].toInt() and 0xFF).toLong() shl 56) +
((buffer[3].toInt() and 0xFF).toLong() shl 48) +
((buffer[4].toInt() and 0xFF).toLong() shl 40) +
((buffer[5].toInt() and 0xFF).toLong() shl 32) +
((buffer[6].toInt() and 0xFF).toLong() shl 24) +
((buffer[7].toInt() and 0xFF).toLong() shl 16) +
((buffer[8].toInt() and 0xFF).toLong() shl 8) +
((buffer[9].toInt() and 0xFF).toLong() shl 0)
if (inputStream.readNext(buffer, 10, size) > 0) {
headerStr = String(buffer, 10, size, Charsets.UTF_8)
ngramOrderStack.add(-1)
ngramTreeStack.add(NgramTree())
pos += (10 + size)
} else {
inputStream.close()
return Result.failure(
ParseException(
errorType = ParseException.ErrorType.UNEXPECTED_EOF,
address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
)
)
}
} else {
inputStream.close()
return Result.failure(
ParseException(
errorType = ParseException.ErrorType.UNEXPECTED_EOF,
address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
)
)
}
}
(cmd and MASK_END) == CMDB_END -> {
if (pos == 0) {
inputStream.close()
return Result.failure(
ParseException(
errorType = ParseException.ErrorType.UNEXPECTED_CMD_END,
address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
)
)
}
val n = (cmd and ATTR_END_COUNT)
if (n > 0) {
if (n <= ngramTreeStack.size) {
for (c in 0 until n) {
ngramOrderStack.removeLast()
ngramTreeStack.removeLast()
}
} else {
inputStream.close()
return Result.failure(
ParseException(
errorType = ParseException.ErrorType.UNEXPECTED_ABSOLUTE_DEPTH_DECREASE_BELOW_ZERO,
address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size - n
)
)
}
} else {
inputStream.close()
return Result.failure(
ParseException(
errorType = ParseException.ErrorType.UNEXPECTED_CMD_END_ZERO_VALUE,
address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
)
)
}
pos += 1
}
else -> {
inputStream.close()
return Result.failure(
ParseException(
errorType = ParseException.ErrorType.INVALID_CMD_BYTE_PROVIDED,
address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
)
)
}
}
}
inputStream.close()
if (ngramTreeStack.size != 0) {
return Result.failure(
ParseException(
errorType = ParseException.ErrorType.UNEXPECTED_ABSOLUTE_DEPTH_NOT_ZERO_AT_EOF,
address = pos, cmdByte = 0x00.toByte(), absoluteDepth = ngramTreeStack.size
)
)
}
}
return Result.success(
Flictionary(
name = "flict",
label = "flict",
authors = listOf(),
headerStr = headerStr ?: "",
date = date,
version = version,
languageModel = FlorisLanguageModel(ngramTree)
)
)
}
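The "last loaded takes precedence" behavior can be illustrated independently of the Flict format with a toy merge over plain maps (hypothetical helper, not the real Flictionary types):

```kotlin
// Toy illustration: frequencies from dictionaries loaded later overwrite
// entries for the same n-gram key, mirroring the intended merge semantics.
fun mergeFrequencies(dicts: List<Map<String, Int>>): Map<String, Int> {
    val merged = mutableMapOf<String, Int>()
    for (dict in dicts) merged.putAll(dict)  // later maps win on duplicate keys
    return merged
}
```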
Hmmm, I can see that working -- but the more I try to debug, even switching between languages doesn't work well. There's an incredible delay. I tried to clear the
Referring to this feature: I generated a binary flict using the dict-tools for en, where the size was 6.8 MB -- about 3 times the default en binary flict. The size is larger due to more words, and the highest n-gram order used was 2. However, when the file is reduced to a small size with significantly fewer tokens, word completion works fine.

Short description
Memory runs out when flict is too large
Steps to reproduce
Environment information
38baac1af92fea80a86f5fdd8850bc677a27a3d2
@patrickgold