Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TIMOB-25599] Android: Fix Java object referencing #9675

Merged
merged 5 commits into from Feb 24, 2018

Conversation

garymathews
Copy link
Contributor

@garymathews garymathews commented Dec 12, 2017

  • Maintain a soft Java object reference to allow JVM to manage our Java objects
TEST CASE
  • Select a couple of large images
  • Repeatedly view each image until crash
var window = Ti.UI.createWindow(),
    tableView = Ti.UI.createTableView(),
    start = Ti.UI.createButton({
        title: 'START',
        left: '10dp',
        bottom: '10dp',
    });

start.addEventListener('click',(e) => {
    Ti.Media.openPhotoGallery({
        allowMultiple: true,
        allowEditing: false,
        autohide: true,
        mediaTypes: Ti.Media.MEDIA_TYPE_PHOTO,
        success:(e) => {
        
            for (let i in e.images) {
                var row = Ti.UI.createTableViewRow({title: e.images[i].media.file.name});

                row.addEventListener('click',() => {
                    var child = Ti.UI.createWindow();
                    child.add(Ti.UI.createImageView({
                        image: e.images[i].media
                    }));

                    child.addEventListener('open',() => {
                        setTimeout(() => { child.close() }, 500);
                    });

                    child.open();
                });

                tableView.appendRow(row);
            }
        }
    });
});

window.add([tableView, start]);
window.open();

JIRA Ticket

@@ -172,8 +174,9 @@ void JavaObject::MakeJavaStrong()
if (stored == NULL) {
// Sanity check. Did we get into a state where it was weak on Java, got GC'd but the C++ proxy didn't get deleted yet?
LOGE(TAG, "!!! OH NO! We tried to move a weak Java object back to strong, but it's aleady been GC'd by JVM! We're in a bad state! Key: %d", refTableKey_);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this fix should definitely be applied. I assume if the store ref is NULL, trying to delete the local ref will be very bad.

@@ -123,8 +123,10 @@ void JavaObject::MakeJSWeak()
// but this time, we say call us back as a finalizer so we can resurrect the
// object (save it from really being GCd by V8) and move it's Java object twin
// to a weak reference in the JVM. (where we can track when that gets GC'd by the JVM to call back and kill this)
persistent().SetWeak(this, DetachCallback, v8::WeakCallbackType::kFinalizer); // MUST BE kFinalizer or our object cannot be resurrected!
persistent().MarkIndependent();
if (!isDetached()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if you're using this quard, it's specifically checking isWeakRef_ is true, which only happens if #MakeJavaWeak() was called (and you've commented that call out from #detach() now). So I'm not sure this guard will ever be false now.

Copy link
Contributor

@sgtcoolguy sgtcoolguy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think the last else block is correct and will avoid some crashes, but I still think this points to a memory management issue.

Ideally we should never reach a state where the JS side is "weak" and the Java side has been GC'd but suddenly the JS side is re-referenced.

The memory lifecycle here is that we basically wrap the Java object and tie it to a JS object and then explicitly tell V8 that the JS object is "independent" and to let us know when it thinks it can be GC'd because there's no more references. When we get that notification we basically stop the GC call and make the Java side weak. So at that point V8 thinks it no longer needs it and Java can try and clean it up at any point.

It's here that somehow our C++ code is trying to call getJavaObject() to resurrect the references and the Java side was already GC'd (if you're seeing that log message). So it looks like we have a reference issue on the C++ lifecycle.

@sgtcoolguy
Copy link
Contributor

Looking at the ticket, it's actually crashing in MakeJSWeak() so it's likely that the JS object already got GC'd.

So from the logging you noted and the crash location in this ticket, it would appear that both/either Java/JS sides are getting GC'd before we've expected them to, so somehow/where we're not properly tracking references in our C++ code, I believe.

@garymathews garymathews force-pushed the TIMOB-25599 branch 2 times, most recently from fd32375 to 082bc21 Compare December 12, 2017 23:59
@garymathews
Copy link
Contributor Author

garymathews commented Dec 13, 2017

@sgtcoolguy So I added the ability to create soft references to our Java objects. This also fixes the issue and makes GC less inclined to collect objects that are still in use, but still allows them to be collected when under memory constraints.

Also, isDetached can still return true even if isWeakRef_ is false. I added this condition to prevent the hard crash if the Java object had already been collected.

@sgtcoolguy
Copy link
Contributor

So my only concern is that we may potentially be holding on to objects for a long time until memory constraints cause the JVM to GC them and we then can GC the JS object. But, this isn't as much of an issue since...

I think this whole weak/soft reference handling was always a sort of safety-valve/catch for possible memory leaks anyways. Ideally for any proxies we (the SDK internally) are generating we should be explicitly removing the Java object and wiping the persistent JS reference. So I assume this is more about user JS code which instantiate proxies/wrap Java objects and somehow we or the VMs are getting confused about trying to clean them up too quickly.

If using soft references keeps the objects around longer and solves memory issues and they don't get collected until the JVM starts really trying to clean up, then that's probably OK (well, it's better than our current state of being too aggressive and GC'ing them early leading to crashes). But it would be interesting to be able to add some debug details to try and track what soft references are still alive after certain periods of time so we can see exactly what it is that's living a long time here and try and get a better handle on what use cases we may not be explicitly handling the lifecycle of (and possible ways to improve it).

Maybe some sort of periodic timer that spits out the list of soft references in the table which are non-null. The wrapped java class name would likely be enough as a first pass to identify proxy types that may be an issue, though I'd assume we'd eventually want to try and track to the JS object it's tied to.

@sgtcoolguy
Copy link
Contributor

Any chance we can automate this test by just repeatedly opening/closing image views on some large image file in the test app?

@garymathews
Copy link
Contributor Author

I don't think we can create a test case for this, I wasn't able to reproduce this through automation and it's very temperamental

@lokeshchdhry
Copy link
Contributor

lokeshchdhry commented Jan 8, 2018

@garymathews ,the app crashes for me:

01-08 16:00:33.157 13585 13585 F DEBUG   : Build fingerprint: 'google/sailfish/sailfish:7.1.1/NMF26O/3514931:user/release-keys'
01-08 16:00:33.157 13585 13585 F DEBUG   : Revision: '0'
01-08 16:00:33.157 13585 13585 F DEBUG   : ABI: 'arm64'
01-08 16:00:33.158 13585 13585 F DEBUG   : pid: 13476, tid: 13476, name: .app.timob25599  >>> com.app.timob25599 <<<
01-08 16:00:33.158 13585 13585 F DEBUG   : signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
01-08 16:00:33.160 13585 13585 F DEBUG   : Abort message: 'art/runtime/indirect_reference_table.cc:80] JNI ERROR (app bug): attempt to use stale Global 0x2 (should be 0x100002)'
01-08 16:00:33.160 13585 13585 F DEBUG   :     x0   0000000000000000  x1   00000000000034a4  x2   0000000000000006  x3   0000000000000008
01-08 16:00:33.161 13585 13585 F DEBUG   :     x4   0000000000000114  x5   0000000000000000  x6   0000000000000000  x7   feff666e6b61686b
01-08 16:00:33.161 13585 13585 F DEBUG   :     x8   0000000000000083  x9   ffffffffffffffdf  x10  0000000000000000  x11  0000000000000001
01-08 16:00:33.161 13585 13585 F DEBUG   :     x12  ffffffffffffffff  x13  ffffffffffffffff  x14  ff00000000000000  x15  ffffffffffffffff
01-08 16:00:33.161 13585 13585 F DEBUG   :     x16  00000070d4e6fee0  x17  00000070d4e19250  x18  0000000000000000  x19  00000070d70f7b40
01-08 16:00:33.161 13585 13585 F DEBUG   :     x20  0000000000000006  x21  00000070d70f7a98  x22  000000000000000b  x23  0000000000000016
01-08 16:00:33.161 13585 13585 F DEBUG   :     x24  0000000000000000  x25  0000000000000000  x26  6b51d13d77d4c5d5  x27  0000007fe8d69021
01-08 16:00:33.161 13585 13585 F DEBUG   :     x28  00000070d31febc8  x29  0000007fe8d68f20  x30  00000070d4e166f8
01-08 16:00:33.161 13585 13585 F DEBUG   :     sp   0000007fe8d68f00  pc   00000070d4e19258  pstate 0000000060000000
01-08 16:00:33.181 13585 13585 F DEBUG   : 
01-08 16:00:33.181 13585 13585 F DEBUG   : backtrace:
01-08 16:00:33.181 13585 13585 F DEBUG   :     #00 pc 000000000006b258  /system/lib64/libc.so (tgkill+8)
01-08 16:00:33.181 13585 13585 F DEBUG   :     #01 pc 00000000000686f4  /system/lib64/libc.so (pthread_kill+64)
01-08 16:00:33.181 13585 13585 F DEBUG   :     #02 pc 0000000000023ce8  /system/lib64/libc.so (raise+24)
01-08 16:00:33.181 13585 13585 F DEBUG   :     #03 pc 000000000001c76c  /system/lib64/libc.so (abort+52)
01-08 16:00:33.181 13585 13585 F DEBUG   :     #04 pc 00000000004300fc  /system/lib64/libart.so (_ZN3art7Runtime5AbortEPKc+456)
01-08 16:00:33.181 13585 13585 F DEBUG   :     #05 pc 00000000000e5818  /system/lib64/libart.so (_ZN3art10LogMessageD2Ev+1576)
01-08 16:00:33.181 13585 13585 F DEBUG   :     #06 pc 000000000024b7ec  /system/lib64/libart.so (_ZN3art22IndirectReferenceTable17AbortIfNoCheckJNIERKNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEE+192)
01-08 16:00:33.181 13585 13585 F DEBUG   :     #07 pc 00000000002f3a98  /system/lib64/libart.so (_ZNK3art22IndirectReferenceTable10GetCheckedEPv+432)
01-08 16:00:33.181 13585 13585 F DEBUG   :     #08 pc 00000000002efe7c  /system/lib64/libart.so (_ZN3art9JavaVMExt12DecodeGlobalEPv+24)
01-08 16:00:33.181 13585 13585 F DEBUG   :     #09 pc 0000000000453134  /system/lib64/libart.so (_ZNK3art6Thread13DecodeJObjectEP8_jobject+216)
01-08 16:00:33.182 13585 13585 F DEBUG   :     #10 pc 000000000032a054  /system/lib64/libart.so (_ZN3art3JNI12NewGlobalRefEP7_JNIEnvP8_jobject+584)
01-08 16:00:33.182 13585 13585 F DEBUG   :     #11 pc 0000000000456f74  /data/app/com.app.timob25599-2/lib/arm64/libkroll-v8.so (_ZN8titanium10JavaObject

@lokeshchdhry
Copy link
Contributor

@garymathews , I still see a crash with your latest fixes but this time it took more try's:
https://gist.github.com/lokeshchdhry/0b21e9fbb04b9acfc89ebebcdc49afe5

@garymathews garymathews force-pushed the TIMOB-25599 branch 2 times, most recently from 47bba10 to 3f65f63 Compare February 8, 2018 19:41
@build build added the android label Feb 8, 2018
@build
Copy link
Contributor

build commented Feb 8, 2018

Messages
📖

💾 Here's the generated SDK zipfile.

Generated by 🚫 dangerJS

@lokeshchdhry lokeshchdhry merged commit 05c2bd4 into tidev:master Feb 24, 2018
@lokeshchdhry
Copy link
Contributor

Forgot to comment.

FR passed.

No app crash seen.

Studio Ver: 5.0.0.201712081732
SDK Ver: 7.2.0 local build
OS Ver: 10.13.2
Xcode Ver: Xcode 9.2
Appc NPM: 4.2.12
Appc CLI: 7.0.2
Daemon Ver: 1.0.1
Ti CLI Ver: 5.0.14
Alloy Ver: 1.11.0
Node Ver: 8.9.1
NPM Ver: 5.5.1
Java Ver: 1.8.0_101
Devices: ⇨ google Nexus 5 --- Android 6.0.1
⇨ google Nexus 6P --- Android 8.0.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants