New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Android JNI code leaks local references in some cases #175

Merged
merged 3 commits into from Feb 3, 2018

Conversation

Projects
None yet
4 participants
@acthompson-google-com
Contributor

acthompson-google-com commented Jan 31, 2018

The current Android JNI code does not explicitly delete all of the local references it creates. This works in cases where DetachLocalThread is called because that function will delete all local references that were created. However, in cases where DetachLocalThread is not called, the local references accumulate, which eventually causes the process to abort. Sample code and crash are below.

While fixing that issue, I noticed that the JNI method IDs were being looked up every time through ares_get_android_server_list. Those can be cached, so I also moved those lookups to ares_library_init_android. See also: https://www.ibm.com/developerworks/library/j-jni/#notc

This does change the behavior of ares_library_init_android slightly. For older Android API versions that don't have these classes or methods, previously ares_library_init_android would return ARES_SUCCESS (although subsequent calls to ares_get_android_server_list would always return NULL). With this change, ares_library_init_android will return ARES_ENOTINITIALIZED (ares_get_android_server_list continues to return NULL). That seems more truthful to me, but we can keep the original semantics if needed.

This code snippet will crash when run in an Android app:

`/* ares_library_init, ares_library_init_jvm, and ares_library_android_init must be called first. */
void test_ares_android_jni(JavaVM *android_jvm) {
JNIEnv *env = NULL;
int need_detatch = 0;
int res;
int i;
ares_channel channel;

res = (*android_jvm)->GetEnv(android_jvm, (void **)&env, JNI_VERSION_1_6);
if (res == JNI_EDETACHED)
{
env = NULL;
res = (*android_jvm)->AttachCurrentThread(android_jvm, &env, NULL);
need_detatch = 1;
}
if (res != JNI_OK || env == NULL)
abort();

for (i = 0; i < 1000; i++)
{
ares_init(&channel);
ares_destroy(channel);
}

if (need_detatch)
(*android_jvm)->DetachCurrentThread(android_jvm);
}
`

The crash will look something like this:

JNI ERROR (app bug): local reference table overflow (max=512)
local reference table dump:
Last 10 entries (of 509):
508: 0x12d3fae0 java.lang.String "8.8.4.4"
507: 0x12d383e0 java.net.Inet4Address
506: 0x12d3fac0 java.lang.String "8.8.8.8"
505: 0x12d383c0 java.net.Inet4Address
504: 0x12d33dd0 java.lang.String "2001:4860:4860::... (20 chars)
503: 0x12d39240 java.net.Inet6Address
502: 0x12d33d98 java.lang.String "2001:4860:4860::... (20 chars)
501: 0x12d39218 java.net.Inet6Address
500: 0x70925de8 java.lang.Class<java.net.InetAddress>
499: 0x70940e88 java.lang.Class<java.util.List>
Summary:
170 of java.lang.Class (6 unique instances)
169 of java.lang.String (169 unique instances)
84 of java.net.Inet4Address (84 unique instances)
84 of java.net.Inet6Address (84 unique instances)

@daviddrysdale

This comment has been minimized.

Member

daviddrysdale commented Jan 31, 2018

@user-none, would you be able to take a look?

@user-none

This comment has been minimized.

Contributor

user-none commented Jan 31, 2018

Some of the references are indeed not being cleaned up and the code is not trying to rely on detach thread for clean up. That is an issue that needs to be resolved.

Caching is certainly possible. Honestly, I didn't do it because and ares_init(_options) should be called so infrequently the performance penalty would be negligible. More work and complexity for little real benefit. One channel should be used throughout an applications life cycle except when there is an event that can cause the name servers to change. Like switching from LTE to wifi... I’ll concede that this can happen often on Android so caching might end up being be a benefit.

The issue I see with the initialization returning ARES_ENOTINITIALIZED is an app running on an older Android version will see that return and fail to start. Or will error saying it can’t properly run. Or be in some other bad situation because it doesn’t think it can do DNS lookups even though it can using __system_property_get.

I’d rather only have ARES_ENOTINITIALIZED on a true failure where DNS cannot be used period. It’s currently returning success because the fallback to __system_property_get will most likely work on older versions that don’t have some of the needed ConnectivityManager functions.

Checking if it’s a true failure would be sufficient. Such as if there is a failure in the initialization code, check if __system_property_get will return DNS servers and only return ARES_ENOTINITIALIZED if it doesn’t. I think it's more misleading to reaturn ARES_ENOTINITIALIZED when setting up Android even though DNS servers can still be queried.

Other than the initialize return causing problems if the fallback method can still be used I don’t see anything wrong with the code.

@acthompson-google-com

This comment has been minimized.

Contributor

acthompson-google-com commented Jan 31, 2018

Thanks very much for reviewing.

Caching is certainly possible. Honestly, I didn't do it because and ares_init(_options) should be called so infrequently the performance penalty would be negligible.

Curl calls ares_init every time an easy handle is created (https://github.com/curl/curl/blob/master/lib/asyn-ares.c#L122), so an application that does large numbers of transfers using Curl is going to benefit as well.

The issue I see with the initialization returning ARES_ENOTINITIALIZED is an app running on an older Android version will see that return and fail to start. Or will error saying it can’t properly run. Or be in some other bad situation because it doesn’t think it can do DNS lookups even though it can using __system_property_get.

That's fair, but unfortunately the flip side is also true. Returning ARES_SUCCESS to an app running on a newer Android when one of the FindClass or GetMethodID calls fails will cause it to continue even though its DNS lookups will definitely fail. Admittedly, it should be very uncommon for those to fail on newer Android.

I’d rather only have ARES_ENOTINITIALIZED on a true failure where DNS cannot be used period. It’s currently returning success because the fallback to __system_property_get will most likely work on older versions that don’t have some of the needed ConnectivityManager functions.
Checking if it’s a true failure would be sufficient. Such as if there is a failure in the initialization code, check if __system_property_get will return DNS servers and only return ARES_ENOTINITIALIZED if it doesn’t. I think it's more misleading to reaturn ARES_ENOTINITIALIZED when setting up Android even though DNS servers can still be queried.

Agreed, it would be better if ARES_ENOTINITIALIZED was returned if and only if permanent failure is confirmed, but that's a bigger change than I'd like to make as part of this pull request. I've added another commit to this request that keeps the current semantics. Just to reiterate, the current semantics have the opposite problem, that ARES_SUCCESS may be returned when in fact there is a permanent failure, but I'm fine with keeping that behavior for now.

@user-none

This comment has been minimized.

Contributor

user-none commented Jan 31, 2018

Lastest commit looks good to me.

@bradh352 bradh352 merged commit 2ee16d7 into c-ares:master Feb 3, 2018

1 check failed

continuous-integration/travis-ci/pr The Travis CI build failed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment