
Memory Error & No Auto Complete when binary flict is large ~ 6mb #403

Open
sabzo opened this issue Feb 27, 2021 · 26 comments
Labels
area: word-prediction (Word predictions related stuff, both current and next-word), bug (A bug report), bug-confirmed (A confirmed and reproducible bug report)

Comments

@sabzo

sabzo commented Feb 27, 2021

Referring to this feature

I generated a binary flict using the dict-tools for en, where the size was 6.8 MB -- about 3 times the size of the default en binary flict. The size is larger due to more words; the highest dimension for n-grams used was 2. However, when the file is reduced to a significantly smaller size (far fewer tokens), word-complete works fine.

Short description

Memory runs out when the flict is too large

Steps to reproduce

  1. Generate a large flict (> 6 MB)
  2. Type an entire paragraph as quickly as possible
  3. No auto-complete will show, and the keyboard will crash within seconds
  4. See error

Environment information

  • FlorisBoard Version: current main branch, commit 38baac1af92fea80a86f5fdd8850bc677a27a3d2.
  • Install Source: GitHub
  • Device: Emulator Android Studio
  • Android version, ROM: Android 11
~~~ 1614445172221.stacktrace ~~~

java.lang.OutOfMemoryError: OutOfMemoryError thrown while trying to throw an exception; no stack trace available
~~~ 1614445172167.stacktrace ~~~

java.lang.OutOfMemoryError: Failed to allocate a 80 byte allocation with 8 free bytes and 8B until OOM, target footprint 201326592, growth limit 201326592
	at java.util.Arrays.copyOf(Arrays.java:3257)
	at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
	at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
	at java.lang.StringBuilder.append(StringBuilder.java:137)
	at dev.patrickgold.florisboard.ime.core.FlorisBoard.onUpdateSelection(FlorisBoard.kt:388)
	at android.inputmethodservice.InputMethodService$InputMethodSessionImpl.updateSelection(InputMethodService.java:906)
	at android.inputmethodservice.IInputMethodSessionWrapper.executeMessage(IInputMethodSessionWrapper.java:104)
	at com.android.internal.os.HandlerCaller$MyHandler.handleMessage(HandlerCaller.java:44)
	at android.os.Handler.dispatchMessage(Handler.java:106)
	at android.os.Looper.loop(Looper.java:223)
	at android.app.ActivityThread.main(ActivityThread.java:7656)
	at java.lang.reflect.Method.invoke(Native Method)
	at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:592)
	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:947)

@patrickgold

@sabzo sabzo added the bug A bug report label Feb 27, 2021
@patrickgold
Member

patrickgold commented Feb 27, 2021

Thanks for reporting this issue! I am currently analyzing a few memory crashes (related both to suggestions and to other features). Somewhere in the code some references aren't cleared up, which very quickly builds up garbage and crashes the app on some devices. I will comment here once I've found out more about the cause.

Generate large flict > 6mb

Hmm, I am surprised that the load method doesn't crash at this size. I think I will start there (the entire binary file is currently read into memory and then analyzed, instead of being read and processed in small chunks), then move on to the actual suggestion algorithm.
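Reading the stream in small chunks rather than all at once could look roughly like this (a minimal Kotlin sketch, not FlorisBoard's actual loader; `readInChunks` and `processChunk` are hypothetical names standing in for the per-chunk parsing):

```kotlin
import java.io.ByteArrayInputStream
import java.io.InputStream

// Sketch: process a binary stream in fixed-size chunks instead of loading
// the whole file into memory at once. Only one `chunkSize` buffer is ever
// allocated, regardless of how large the file is.
fun readInChunks(
    input: InputStream,
    chunkSize: Int = 4096,
    processChunk: (ByteArray, Int) -> Unit = { _, _ -> }
): Long {
    val buffer = ByteArray(chunkSize)
    var total = 0L
    while (true) {
        val n = input.read(buffer, 0, chunkSize) // returns -1 at EOF
        if (n <= 0) break
        processChunk(buffer, n) // only the first `n` bytes of `buffer` are valid
        total += n
    }
    return total
}
```

The same idea applies to any `InputStream`, so an in-memory `ByteArrayInputStream` can be used to exercise it without touching disk.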

@sabzo
Author

sabzo commented Feb 27, 2021

@patrickgold There's actually a bigger and more serious issue, I believe: the flict compression algorithm makes the file significantly bigger than it has to be. I used the dict-tools from AOSP and the binary created was 1.9 MB. The same combined wordlist with FlorisBoard's tools creates a 6.8 MB file. So besides the memory error, there's something wrong with the compression when running a premade combined wordlist through flict.py.

@sabzo
Author

sabzo commented Feb 27, 2021

@patrickgold It may help to run the default https://github.com/remi0s/aosp-dictionary-tools on a wordlist and also run the same wordlist through the FlorisBoard dict-tools to verify my results.

@patrickgold
Member

@sabzo One thing the Python library currently does not do is utilize the end byte grouping the Flict spec defines, because it always glitched out in the calculation and at the time I just commented out the code: https://github.com/florisboard/dictionary-tools/blob/cc712ffc70b8485ea8fbd25d06381a4fc2b9b906/flict.py#L129-L140

For your file size of > 6 MB, this can definitely save a lot of bytes once fixed.

@sabzo
Author

sabzo commented Feb 27, 2021

One thing the Python library currently does not do is utilize the end byte grouping the Flict spec defines, because it always glitched out in the calculation and at the time I just commented out the code: https://github.com/florisboard/dictionary-tools/blob/cc712ffc70b8485ea8fbd25d06381a4fc2b9b906/flict.py#L129-L140

Since Luminiso doesn't support higher n-grams, is fixing the calculation error in flict.py part of phase two work? Or will the solution be more about reducing memory so that large binaries don't cause a crash?

@patrickgold
Member

Since Luminiso doesn't support higher n-grams, is fixing the calculation error in flict.py part of phase two work? Or will the solution be more about reducing memory so that large binaries don't cause a crash?

No, the end bug is something I'd still prefer to have included in phase 1, because it reduces the file size and also the number of bytes read into memory at runtime when the binary file is cached.

@patrickgold
Member

patrickgold commented Feb 27, 2021

@sabzo I've pushed a fix for the end count bug on both the dictionary-tools main branch and the FlorisBoard master branch. Could you try out whether

a) the file size of the binary Flictionary decreases, and
b) the smaller version of your sample still loads correctly (the large 6 MB file will most likely still fail, as I haven't made improvements in that direction yet).

Thanks in advance!

@sabzo
Author

sabzo commented Feb 27, 2021

@patrickgold File size decreased by 2 MB to 4.9 MB; however, the aosp-tools jar still produces a 1.9 MB file for the same wordlist. So aosp-tools saves 3 MB more than the flict algorithm.

b)

I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 493us total 222.374ms
I/orisboard.debu: Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Alloc young concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 1.537ms total 9.717ms
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 668us total 215.366ms
    Forcing collection of SoftReferences for 32B allocation
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 6(144B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 2.449ms total 480.191ms
I/orisboard.debu: Starting a blocking GC Alloc
I/orisboard.debu: Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Alloc young concurrent copying GC freed 1(16B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 1.671ms total 7.485ms
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 421us total 280.729ms
I/orisboard.debu: Forcing collection of SoftReferences for 56B allocation
I/orisboard.debu: Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Thread[6,tid=16689,WaitingInMainSignalCatcherLoop,Thread*=0xdab02a10,peer=0x12c452b0,"Signal Catcher"]: reacting to signal 3
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 1(32B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 2.830ms total 1.408s
I/orisboard.debu: Starting a blocking GC Alloc
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Alloc young concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 920us total 15.779ms
I/orisboard.debu: Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 424us total 319.413ms
I/orisboard.debu: WaitForGcToComplete blocked Background on HeapTrim for 3.215s
I/orisboard.debu: WaitForGcToComplete blocked Alloc on HeapTrim for 5.091s
    Starting a blocking GC Alloc
I/orisboard.debu: Forcing collection of SoftReferences for 80B allocation
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 12(360B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 1.957ms total 913.433ms
I/orisboard.debu: WaitForGcToComplete blocked HeapTrim on HeapTrim for 914.009ms
I/orisboard.debu: WaitForGcToComplete blocked Alloc on HeapTrim for 4.132s
I/orisboard.debu: Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Alloc young concurrent copying GC freed 5(104B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 1.989ms total 11.307ms
    Starting a blocking GC Alloc
I/orisboard.debu: Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 2(40B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 462us total 293.508ms
    Starting a blocking GC Alloc
I/orisboard.debu: WaitForGcToComplete blocked ObjectsAllocated on HeapTrim for 2.835s
I/orisboard.debu: WaitForGcToComplete blocked Alloc on ObjectsAllocated for 303.973ms
    Starting a blocking GC Alloc
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Alloc young concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 2.534ms total 15.820ms
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 388us total 263.354ms
    Forcing collection of SoftReferences for 56B allocation
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 4(96B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 2.081ms total 658.383ms
    Starting a blocking GC Alloc
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Alloc young concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 1.401ms total 7.697ms
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 291us total 215.581ms
    Forcing collection of SoftReferences for 80B allocation
I/orisboard.debu: Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 1.367ms total 483.564ms
I/orisboard.debu: WaitForGcToComplete blocked Alloc on ObjectsAllocated for 1.645s
    Starting a blocking GC Alloc
W/orisboard.debu: Throwing OutOfMemoryError "Failed to allocate a 80 byte allocation with 8 free bytes and 8B until OOM, target footprint 201326592, growth limit 201326592" (VmSize 1373984 kB)
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Alloc young concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 922us total 9.618ms
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 289us total 205.918ms
I/orisboard.debu: Forcing collection of SoftReferences for 32B allocation
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 1.629ms total 471.029ms
I/orisboard.debu: WaitForGcToComplete blocked Alloc on ObjectsAllocated for 6.915s
I/orisboard.debu: Starting a blocking GC Alloc
    Starting a blocking GC Alloc
W/orisboard.debu: Throwing OutOfMemoryError "Failed to allocate a 32 byte allocation with 8 free bytes and 8B until OOM, target footprint 201326592, growth limit 201326592" (VmSize 1373984 kB)
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Alloc young concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 941us total 7.843ms
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 392us total 226.733ms
    Forcing collection of SoftReferences for 16B allocation
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC ObjectsAllocated
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Clamp target GC heap from 215MB to 192MB
    Alloc concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 1.703ms total 463.760ms
W/orisboard.debu: Throwing OutOfMemoryError "Failed to allocate a 16 byte allocation with 8 free bytes and 8B until OOM, target footprint 201326592, growth limit 201326592" (VmSize 1373984 kB)
I/orisboard.debu: WaitForGcToComplete blocked ObjectsAllocated on ObjectsAllocated for 3.030s
I/orisboard.debu: WaitForGcToComplete blocked Alloc on ObjectsAllocated for 1.391s
    Starting a blocking GC Alloc
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Alloc young concurrent copying GC freed 0(0B) AllocSpace objects, 0(0B) LOS objects, 0% free, 191MB/192MB, paused 1.370ms total 8.237ms
    Starting a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Waiting for a blocking GC Alloc
I/orisboard.debu: Wrote stack traces to tombstoned
~~~ 1614447754638.stacktrace ~~~

java.lang.OutOfMemoryError: OutOfMemoryError thrown while trying to throw an exception; no stack trace available
~~~ 1614447794240.stacktrace ~~~

java.lang.NoClassDefFoundError: kotlinx.coroutines.CoroutineExceptionHandlerImplKt
	at kotlinx.coroutines.CoroutineExceptionHandlerImplKt.handleCoroutineExceptionImpl(CoroutineExceptionHandlerImpl.kt:27)
	at kotlinx.coroutines.CoroutineExceptionHandlerKt.handleCoroutineException(CoroutineExceptionHandler.kt:33)
	at kotlinx.coroutines.DispatchedTask.handleFatalException$kotlinx_coroutines_core(DispatchedTask.kt:146)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:115)
	at android.os.Handler.handleCallback(Handler.java:938)
	at android.os.Handler.dispatchMessage(Handler.java:99)
	at android.os.Looper.loop(Looper.java:223)
	at android.app.ActivityThread.main(ActivityThread.java:7656)
	at java.lang.reflect.Method.invoke(Native Method)
	at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:592)
	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:947)
Caused by: java.lang.OutOfMemoryError: OutOfMemoryError thrown while trying to throw OutOfMemoryError; no stack trace available
~~~ 1614454619716.stacktrace ~~~

java.lang.OutOfMemoryError: Failed to allocate a 40 byte allocation with 16 free bytes and 16B until OOM, target footprint 201326592, growth limit 201326592
	at sun.misc.Cleaner.create(Cleaner.java:133)
	at libcore.util.NativeAllocationRegistry.registerNativeAllocation(NativeAllocationRegistry.java:245)
	at android.os.BinderProxy.getInstance(BinderProxy.java:420)
	at android.os.Parcel.nativeReadStrongBinder(Native Method)
	at android.os.Parcel.readStrongBinder(Parcel.java:2483)
	at com.android.internal.view.IInputMethod$Stub.onTransact(IInputMethod.java:203)
	at android.os.Binder.execTransactInternal(Binder.java:1159)
	at android.os.Binder.execTransact(Binder.java:1123)
~~~ 1614454619721.stacktrace ~~~

java.lang.OutOfMemoryError: OutOfMemoryError thrown while trying to throw an exception; no stack trace available

@sabzo
Author

sabzo commented Feb 27, 2021

Memory management + large file sizes seem to be contributing to the above -- makes me wonder if this is why C++ is used in the AOSP keyboards out there for word completion/prediction...

@patrickgold
Member

Memory management + large file sizes seem to be contributing to the above -- makes me wonder if this is why C++ is used in the AOSP keyboards out there for word completion/prediction...

Java is known not to be the best language in terms of memory management, but it is definitely possible to write Kotlin/JVM code that runs smoothly; it just requires a lot of attention and care. I just don't want to use C++ alongside Kotlin for this, because then the code would be about as readable as the AOSP prediction/dictionary code...

I will rewrite the load function either later this evening or tomorrow midday so it doesn't preload the entire binary file at once but reads the InputStream bit by bit; then I will investigate the behavior with the Android Studio Memory Profiler.

@patrickgold
Member

The above PR fixes a lot in the prediction algorithm (both the Flictionary load function and the memory management). Could you have a look and check whether it fixes the memory crash for you?

@sabzo
Author

sabzo commented Mar 1, 2021

@patrickgold Unfortunately, the same memory error and the keyboard freezing and restarting still persist. The new flict algorithm produced a 4.9 MB binary, while the same combined wordlist built with the AOSP tools is still 1.9 MB. The compression worked, but it doesn't do better than the AOSP compression algorithm.

@sabzo
Author

sabzo commented Mar 9, 2021

@patrickgold Revisiting this -- are there plans to reduce the size of flicts to at most their equivalent AOSP binary size?

@patrickgold
Member

@sabzo I will definitely try to decrease the file size of a Flictionary, but the 4.9 MB of your Flictionary is not the problem; the problem lies in the runtime representation of the data after parsing, which in its current state is far from memory-efficient. Hence the OOM errors and crashes. In the next few days I will begin with phase 2 of the suggestion feature, and then I will see where I can improve.

@sabzo
Author

sabzo commented Mar 9, 2021

@patrickgold Understood. If you can point me to the part of the code where that data representation happens, I can help troubleshoot as well. I have a good amount of time on my hands.

@patrickgold
Member

This is the definition of a NgramNode:

open class NgramNode(
    val order: Int,
    val char: Char,
    val freq: Int,
    val sameOrderChildren: MutableList<NgramNode> = mutableListOf(),
    val higherOrderChildren: MutableList<NgramNode> = mutableListOf()
) {
    // ...
}

The main issue here is the MutableList (an ArrayList in the JVM runtime), which carries a lot of overhead for each node even when the list is empty. In the current representation this happens twice per node, and in my precompiled version the NgramNode occurs about 670k times. This is even worse for your 4.9 MB Flictionary, thus crashing the app with OOM errors. The problem is I can't really find a more memory-efficient mutable list than ArrayList (except if I had primitives, which I don't have here).

Another thing I see is that I can use a Short instead of an Int for freq and a Byte for order; this would save 5 bytes per node, which adds up to a few MB per Flictionary at runtime. I will begin experimenting with this in the next few days.
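A compacted node along those lines could be sketched like this (illustrative Kotlin only; `CompactNgramNode` and its method names are made up, not FlorisBoard's API): narrower primitives for order and freq, plus lazily allocated child lists so empty leaf nodes don't pay for two empty ArrayLists:

```kotlin
// Sketch of a more compact trie node: Byte for order, Short for freq,
// and child lists that stay null until the first child is added.
class CompactNgramNode(
    val order: Byte,
    val char: Char,
    val freq: Short,
) {
    private var sameOrderChildren: MutableList<CompactNgramNode>? = null
    private var higherOrderChildren: MutableList<CompactNgramNode>? = null

    fun addSameOrder(node: CompactNgramNode) {
        (sameOrderChildren ?: mutableListOf<CompactNgramNode>()
            .also { sameOrderChildren = it }).add(node)
    }

    fun addHigherOrder(node: CompactNgramNode) {
        (higherOrderChildren ?: mutableListOf<CompactNgramNode>()
            .also { higherOrderChildren = it }).add(node)
    }

    // Read access returns an empty list for leaves without allocating one per node.
    fun sameOrder(): List<CompactNgramNode> = sameOrderChildren ?: emptyList()
    fun higherOrder(): List<CompactNgramNode> = higherOrderChildren ?: emptyList()
}
```

With hundreds of thousands of nodes, mostly leaves, skipping the two per-node ArrayList allocations is where most of the savings would come from; the narrower primitives help on top of that.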

@patrickgold
Member

makes me wonder if this is why C++ is used in the AOSP keyboards out there for word completion/prediction...

I didn't want to believe this when you originally said it, but after investigating and learning more about the JVM, it is more and more clear to me why the AOSP keyboard uses native C++ for NLP. The JVM just represents data in a very inefficient way, which works well for normal applications and small data sets but quickly becomes a big problem for large data sets like a parsed Flictionary. I will look into rewriting the dictionary base implementation (everything that requires a lot of memory) in native C++ or Rust (depending on whether the F-Droid build servers support compiling Rust; otherwise I will resort to C++), and this should hopefully be the solution to the OOM errors.

@tsiflimagas
Collaborator

@patrickgold The F-Droid servers can probably compile Rust, since this app https://github.com/jensstein/oandbackup required it and it's on F-Droid.

@patrickgold
Member

@tsiflimagas Good to know, thanks for linking!

@sabzo
Author

sabzo commented Mar 29, 2021

@patrickgold Heads up that the latest release https://github.com/florisboard/florisboard/releases/tag/v0.3.10-beta04 breaks suggestions with the default en.flict. While more suggestions show, tapping on any of them does nothing. Perhaps a memory error too. The app doesn't crash, but selecting suggestions doesn't work either. On an emulator it works (the computer has more resources than a regular phone, I assume), but on real devices I've had multiple issues.

@patrickgold
Member

The fact that the suggestions show but you can't select them makes me believe there's a bug in the CandidateView touch logic rather than in the suggestions core. If you go to Settings > Typing > Suggestions display mode and set it to classic or dynamic width, does it work then?

@sabzo
Author

sabzo commented Mar 29, 2021

@patrickgold You're right about that. Only classic and dynamic width work; dynamic width & scrollable doesn't.

@patrickgold
Member

I have an idea what happens: for all modes, a suggestion is marked as soon as your pointer touches down on it. When your pointer moves while down, the scrollable display mode automatically cancels the selection in order to scroll the candidate view. I suspect an ACTION_MOVE is triggered almost immediately, because a touch pointer (finger) can't be held pixel-perfectly still. On an emulator, holding still without moving is far easier because you use a mouse. Anyway, I guess I will have to introduce a minimum distance threshold before the scrolling starts to prevent this kind of bug.
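That threshold idea ("touch slop") could be sketched like this (illustrative Kotlin only; `ScrollGate` and `touchSlopPx` are made-up names, not FlorisBoard's actual touch logic): moves within the slop radius keep the selection alive, and only a larger move switches to scrolling:

```kotlin
import kotlin.math.hypot

// Sketch: ignore ACTION_MOVE jitter until the pointer has travelled more
// than `touchSlopPx` from the touch-down point; only then cancel the
// candidate selection and start scrolling.
class ScrollGate(private val touchSlopPx: Float) {
    private var downX = 0f
    private var downY = 0f
    var isScrolling = false
        private set

    fun onDown(x: Float, y: Float) {
        downX = x
        downY = y
        isScrolling = false
    }

    // Returns true once the pointer has moved beyond the slop radius.
    fun onMove(x: Float, y: Float): Boolean {
        if (!isScrolling && hypot(x - downX, y - downY) > touchSlopPx) {
            isScrolling = true // selection would be cancelled from here on
        }
        return isScrolling
    }
}
```

On Android the platform even provides a standard slop value via ViewConfiguration.getScaledTouchSlop(), which would be a natural choice for the threshold.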

@sabzo
Author

sabzo commented Mar 30, 2021

@patrickgold Understood. I've been experimenting with loading multiple flicts at the same time, so users can have multi-language support without having to manually switch between languages. I know flictionary.load loads a single flict. Is there a quick, easy way to hack this to load multiple flicts?

@patrickgold
Member

patrickgold commented Mar 30, 2021

Is there a quick, easy way to hack this to load multiple flicts?

Into one runtime flict? Untested, but this hacked version could work; it just loops over all passed assetRefs. For n-grams which appear multiple times, the last loaded Flictionary takes precedence:

    fun load(context: Context, assetRefs: List<AssetRef>): Result<Flictionary> {
            val buffer = ByteArray(5000) { 0 }

            var headerStr: String? = null
            var date: Long = 0
            var version = 0
            val ngramTree = NgramTree()

            for (assetRef in assetRefs) {
                val inputStream: InputStream
                if (assetRef.source == AssetSource.Assets) {
                    inputStream = context.assets.open(assetRef.path)
                } else {
                    return Result.failure(Exception("Only AssetSource.Assets is currently supported!"))
                }
                var pos = 0
                val ngramOrderStack = mutableListOf<Int>()
                val ngramTreeStack = mutableListOf<NgramNode>()

                while (true) {
                    if (inputStream.readNext(buffer, 0, 1) <= 0) {
                        break
                    }
                    val cmd = buffer[0].toInt() and 0xFF
                    when {
                        (cmd and MASK_BEGIN_PTREE_NODE) == CMDB_BEGIN_PTREE_NODE -> {
                            if (pos == 0) {
                                inputStream.close()
                                return Result.failure(
                                    ParseException(
                                        errorType = ParseException.ErrorType.UNEXPECTED_CMD_BEGIN_PTREE_NODE,
                                        address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
                                    )
                                )
                            }
                            val order = ((cmd and ATTR_PTREE_NODE_ORDER) shr 4) + 1
                            val type = ((cmd and ATTR_PTREE_NODE_TYPE) shr 2)
                            val size = (cmd and ATTR_PTREE_NODE_SIZE) + 1
                            val freq: Int
                            val freqSize: Int
                            when (type) {
                                ATTR_PTREE_NODE_TYPE_CHAR -> {
                                    freq = NgramNode.FREQ_CHARACTER
                                    freqSize = 0
                                }
                                ATTR_PTREE_NODE_TYPE_WORD_FILLER -> {
                                    freq = NgramNode.FREQ_WORD_FILLER
                                    freqSize = 0
                                }
                                ATTR_PTREE_NODE_TYPE_WORD -> {
                                    if (inputStream.readNext(buffer, 1, 1) > 0) {
                                        freq = buffer[1].toInt() and 0xFF
                                    } else {
                                        inputStream.close()
                                        return Result.failure(
                                            ParseException(
                                                errorType = ParseException.ErrorType.UNEXPECTED_EOF,
                                                address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
                                            )
                                        )
                                    }
                                    freqSize = 1
                                }
                                else -> {
                                    // Close the stream before bailing out, like every other failure path does
                                    inputStream.close()
                                    return Result.failure(Exception("TODO: shortcut not supported"))
                                }
                            }
                            if (inputStream.readNext(buffer, freqSize + 1, size) > 0) {
                                val char = String(buffer, freqSize + 1, size, Charsets.UTF_8)[0]
                                val node = NgramNode(order, char, freq)
                                val lastOrder = ngramOrderStack.lastOrNull()
                                if (lastOrder == null) {
                                    ngramTree.higherOrderChildren.add(node)
                                } else {
                                    if (lastOrder == order) {
                                        ngramTreeStack.last().sameOrderChildren.add(node)
                                    } else {
                                        ngramTreeStack.last().higherOrderChildren.add(node)
                                    }
                                }
                                ngramOrderStack.add(order)
                                ngramTreeStack.add(node)
                                pos += (freqSize + 1 + size)
                            } else {
                                inputStream.close()
                                return Result.failure(
                                    ParseException(
                                        errorType = ParseException.ErrorType.UNEXPECTED_EOF,
                                        address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
                                    )
                                )
                            }
                        }

                        (cmd and MASK_BEGIN_HEADER) == CMDB_BEGIN_HEADER -> {
                            if (pos != 0) {
                                inputStream.close()
                                return Result.failure(
                                    ParseException(
                                        errorType = ParseException.ErrorType.UNEXPECTED_CMD_BEGIN_HEADER,
                                        address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
                                    )
                                )
                            }
                            version = cmd and ATTR_HEADER_VERSION
                            if (version != VERSION_0) {
                                inputStream.close()
                                return Result.failure(
                                    ParseException(
                                        errorType = ParseException.ErrorType.UNSUPPORTED_FLICTIONARY_VERSION,
                                        address = pos,
                                        cmdByte = cmd.toByte(),
                                        absoluteDepth = ngramTreeStack.size
                                    )
                                )
                            }
                            if (inputStream.readNext(buffer, 1, 9) > 0) {
                                val size = (buffer[1].toInt() and 0xFF)
                                date =
                                    ((buffer[2].toInt() and 0xFF).toLong() shl 56) +
                                        ((buffer[3].toInt() and 0xFF).toLong() shl 48) +
                                        ((buffer[4].toInt() and 0xFF).toLong() shl 40) +
                                        ((buffer[5].toInt() and 0xFF).toLong() shl 32) +
                                        ((buffer[6].toInt() and 0xFF).toLong() shl 24) +
                                        ((buffer[7].toInt() and 0xFF).toLong() shl 16) +
                                        ((buffer[8].toInt() and 0xFF).toLong() shl 8) +
                                        ((buffer[9].toInt() and 0xFF).toLong() shl 0)
                                if (inputStream.readNext(buffer, 10, size) > 0) {
                                    headerStr = String(buffer, 10, size, Charsets.UTF_8)
                                    ngramOrderStack.add(-1)
                                    ngramTreeStack.add(NgramTree())
                                    pos += (10 + size)
                                } else {
                                    inputStream.close()
                                    return Result.failure(
                                        ParseException(
                                            errorType = ParseException.ErrorType.UNEXPECTED_EOF,
                                            address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
                                        )
                                    )
                                }
                            } else {
                                inputStream.close()
                                return Result.failure(
                                    ParseException(
                                        errorType = ParseException.ErrorType.UNEXPECTED_EOF,
                                        address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
                                    )
                                )
                            }
                        }

                        (cmd and MASK_END) == CMDB_END -> {
                            if (pos == 0) {
                                inputStream.close()
                                return Result.failure(
                                    ParseException(
                                        errorType = ParseException.ErrorType.UNEXPECTED_CMD_END,
                                        address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
                                    )
                                )
                            }
                            val n = (cmd and ATTR_END_COUNT)
                            if (n > 0) {
                                if (n <= ngramTreeStack.size) {
                                    for (c in 0 until n) {
                                        ngramOrderStack.removeLast()
                                        ngramTreeStack.removeLast()
                                    }
                                } else {
                                    inputStream.close()
                                    return Result.failure(
                                        ParseException(
                                            errorType = ParseException.ErrorType.UNEXPECTED_ABSOLUTE_DEPTH_DECREASE_BELOW_ZERO,
                                            address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size - n
                                        )
                                    )
                                }
                            } else {
                                inputStream.close()
                                return Result.failure(
                                    ParseException(
                                        errorType = ParseException.ErrorType.UNEXPECTED_CMD_END_ZERO_VALUE,
                                        address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
                                    )
                                )
                            }
                            pos += 1
                        }
                        else -> {
                            inputStream.close()
                            return Result.failure(
                                ParseException(
                                    errorType = ParseException.ErrorType.INVALID_CMD_BYTE_PROVIDED,
                                    address = pos, cmdByte = cmd.toByte(), absoluteDepth = ngramTreeStack.size
                                )
                            )
                        }
                    }
                }
                inputStream.close()

                if (ngramTreeStack.size != 0) {
                    return Result.failure(
                        ParseException(
                            errorType = ParseException.ErrorType.UNEXPECTED_ABSOLUTE_DEPTH_NOT_ZERO_AT_EOF,
                            address = pos, cmdByte = 0x00.toByte(), absoluteDepth = ngramTreeStack.size
                        )
                    )
                }
            }

            return Result.success(
                Flictionary(
                    name = "flict",
                    label = "flict",
                    authors = listOf(),
                    headerStr = headerStr ?: "",
                    date = date,
                    version = version,
                    languageModel = FlorisLanguageModel(ngramTree)
                )
            )
        }
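As a side note on the header parsing above: the long chain of shifts decodes bytes 2..9 of the buffer as a big-endian signed 64-bit timestamp. The same decode can be expressed more compactly with `java.nio.ByteBuffer`, which is big-endian by default (the helper name `decodeTimestamp` below is made up for illustration, not part of the parser):

```kotlin
import java.nio.ByteBuffer

// Equivalent to the manual shift chain: reads 8 bytes starting at
// offset 2 as a big-endian signed long. ByteBuffer's default byte
// order is BIG_ENDIAN, so no explicit order() call is needed.
fun decodeTimestamp(buffer: ByteArray): Long =
    ByteBuffer.wrap(buffer, 2, 8).long
```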

@sabzo

sabzo commented Mar 30, 2021

Hmm, I can see that working, but the more I try to debug, the worse it looks: even switching between languages doesn't work well, and there's a noticeable delay. I tried clearing the dictionaryCache to reduce the memory footprint, but it didn't help either. The NLP work desperately needs a native implementation. Since Rust doesn't have official support in the NDK (although there are workarounds), it seems C/C++ is the only way forward.
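Before going native, one JVM-side mitigation might be worth trying: the OOM is driven by allocating one `NgramNode` object (plus child lists) per trie entry, and on ART each small object carries a header and reference overhead. Packing the nodes into parallel primitive arrays removes most of that. The sketch below is purely hypothetical (the names `FlatNgramStore` and `addNode` are invented for illustration, not FlorisBoard API), just to show the shape of the idea:

```kotlin
// Hypothetical flat trie storage: one entry per n-gram node, held in
// parallel primitive arrays instead of one object per node. Children
// are linked via first-child / next-sibling indices, so no per-node
// collection objects are allocated.
class FlatNgramStore(capacity: Int = 1024) {
    private var chars = CharArray(capacity)
    private var freqs = IntArray(capacity)
    private var firstChild = IntArray(capacity) { -1 }
    private var nextSibling = IntArray(capacity) { -1 }
    var size = 0
        private set

    private fun ensureCapacity() {
        if (size == chars.size) {
            val n = chars.size * 2
            chars = chars.copyOf(n)
            freqs = freqs.copyOf(n)
            // copyOf() zero-fills the new slots; reset them to -1 ("no link")
            firstChild = firstChild.copyOf(n).also { for (i in size until n) it[i] = -1 }
            nextSibling = nextSibling.copyOf(n).also { for (i in size until n) it[i] = -1 }
        }
    }

    /** Appends a node, links it as the newest child of [parent] (or as a
     *  root if parent < 0), and returns its index. */
    fun addNode(parent: Int, char: Char, freq: Int): Int {
        ensureCapacity()
        val idx = size++
        chars[idx] = char
        freqs[idx] = freq
        if (parent >= 0) {
            nextSibling[idx] = firstChild[parent]
            firstChild[parent] = idx
        }
        return idx
    }

    fun charAt(idx: Int): Char = chars[idx]
    fun freqAt(idx: Int): Int = freqs[idx]
}
```

Whether this buys enough headroom for a 6.8 MB flict is an open question, but it would at least decouple dictionary size from object-count pressure on the GC.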

@patrickgold patrickgold added the area: word-prediction Word predictions related stuff (both current and next-word) label May 23, 2021