Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CompositeException issue #1405

Closed
tomrozb opened this issue Jul 3, 2014 · 26 comments
Closed

CompositeException issue #1405

tomrozb opened this issue Jul 3, 2014 · 26 comments

Comments

@tomrozb
Copy link
Contributor

tomrozb commented Jul 3, 2014

I'm using RxJava on Android and I've encountered issues with CompositeException. Because it collects throwables as Collection<Throwable> errors the exceptions are not logged to the logcat output if there is an unhandled Exception in onError callback.

Here is a sample output:

2739:E/AndroidRuntime(14639): FATAL EXCEPTION: main
2740:E/AndroidRuntime(14639): rx.exceptions.OnErrorFailedException: Error occurred when trying to propagate error to Observer.onError
2741:E/AndroidRuntime(14639):   at rx.observers.SafeSubscriber._onError(SafeSubscriber.java:182)
2742:E/AndroidRuntime(14639):   at rx.observers.SafeSubscriber.onError(SafeSubscriber.java:103)
2743:E/AndroidRuntime(14639):   at rx.internal.operators.NotificationLite.accept(NotificationLite.java:144)
2744:E/AndroidRuntime(14639):   at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber.pollQueue(OperatorObserveOn.java:139)
2745:E/AndroidRuntime(14639):   at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber.access$000(OperatorObserveOn.java:61)
2746:E/AndroidRuntime(14639):   at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber$1.call(OperatorObserveOn.java:121)
2747:E/AndroidRuntime(14639):   at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:43)
2748:E/AndroidRuntime(14639):   at android.os.Handler.handleCallback(Handler.java:730)
2749:E/AndroidRuntime(14639):   at android.os.Handler.dispatchMessage(Handler.java:92)
2750:E/AndroidRuntime(14639):   at android.os.Looper.loop(Looper.java:137)
2751:E/AndroidRuntime(14639):   at android.app.ActivityThread.main(ActivityThread.java:5103)
2752:E/AndroidRuntime(14639):   at java.lang.reflect.Method.invokeNative(Native Method)
2753:E/AndroidRuntime(14639):   at java.lang.reflect.Method.invoke(Method.java:525)
2754:E/AndroidRuntime(14639):   at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:737)
2755:E/AndroidRuntime(14639):   at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:553)
2756:E/AndroidRuntime(14639):   at dalvik.system.NativeStart.main(Native Method)
2757:E/AndroidRuntime(14639): Caused by: rx.exceptions.CompositeException: 2 exceptions occurred. See them in causal chain below.
2758:E/AndroidRuntime(14639):   ... 16 more

That's it. No logs about real cause. I can wrap my onError code in try catch block and log the real cause, but I think this can be handled better by RxJava. What do you guys think?

@benjchristensen
Copy link
Member

How do you recommend doing it differently?

@tomrozb
Copy link
Contributor Author

tomrozb commented Jul 3, 2014

What about using initCause (API 1) or I'm sure addSuppressed is dedicated for this case but as far as I remember it was introduced in java 7, so it's available in KitKat (API 19)

@benjchristensen
Copy link
Member

We print them out when printStackTrace() is called, but we no longer add them as part of the actual causal chain as it could end up causing circular loops under odd edge-cases. This was just barely changed in the last release: #1388

Here is the code: https://github.com/Netflix/RxJava/blob/master/rxjava-core/src/main/java/rx/exceptions/CompositeException.java

Note how it will print all causes: https://github.com/Netflix/RxJava/blob/master/rxjava-core/src/main/java/rx/exceptions/CompositeException.java#L111

What version of RxJava are you running to get that stack trace?

In the example you give, it looks like the chain is being cropped when it had more to display (2 causes).

I'm wondering if Android does something different than normal Java and doesn't actually invoke the printStackTrace() method and thus won't see the causal chain?

@tomrozb
Copy link
Contributor Author

tomrozb commented Jul 3, 2014

I'm using 0.19.2. I've also encountered problems with the previous version when unhandled exception in onError ended up in circular loop and crash of a StringBuilder but current implementation doesn't provide any useful information for debugging. If you are using error logging service like Bugsense or Crashlytics the only information you see is the stacktrace I've pasted above - it's useless.

Solution with initCause works perfect for me. What do you think about it?

@benjchristensen
Copy link
Member

As I said above, we used to use initCause, but because this is a composite of many exceptions, we are creating an artificial causal chain and it could result in infinite loops, hence it being removed in 0.19.2.

I can't speak to Bugsense or Crashlytics, but if something printing a stack trace calls printStackTrace then it would work and print out the full list of causes.

For historical information, here is the version that used initCause: https://github.com/Netflix/RxJava/blob/ea2249afecbf4d1380f2d7b0775fec60e37e3369/rxjava-core/src/main/java/rx/exceptions/CompositeException.java#L108 and the current one without it: https://github.com/Netflix/RxJava/blob/master/rxjava-core/src/main/java/rx/exceptions/CompositeException.java#L111

We need a solution that either does not use initCause due to the circular-reference problem, or solves the circular-reference problem. The change done in #1388 was based on the assumption that anything printing a stack trace would do it correctly via the printStackTrace() method and thus we can override the printing of the causes so the causal chain is shown, but not actually mutate the chain by using initCause which is what causes the issues.

@tomrozb
Copy link
Contributor Author

tomrozb commented Jul 3, 2014

Ok I've dive deeper into the implementation and it turns out that none of the printStackTrace methods are called when an OnErrorFailedException is thrown. Only getMessage and getCause methods are called so the real cause is not printed to the output. Here is my simple test with logged methods of the CompositeException class.

        AndroidRuntime  D  Shutting down VM
              dalvikvm  W  threadid=1: thread exiting with uncaught exception (group=0x4195e700)
 gt_CompositeException  I  CompositeException.getMessage
                        I  CompositeException.getCause
              dalvikvm  D  GC_FOR_ALLOC freed 453K, 5% free 11287K/11776K, paused 29ms, total 30ms
 gt_CompositeException  I  CompositeException.getCause
                        I  CompositeException.getMessage
                        I  CompositeException.getCause
        AndroidRuntime  E  FATAL EXCEPTION: main
                        E  rx.exceptions.OnErrorFailedException: a
                        E   at test.MainActivity.refreshTasksLists(MainActivity.java:267)
                        E   at test.MainActivity.onCreateOptionsMenu(MainActivity.java:156)
                        E   at android.app.Activity.onCreatePanelMenu(Activity.java:2504)
                        E   at com.android.internal.policy.impl.PhoneWindow.preparePanel(PhoneWindow.java:413)
                        E   at com.android.internal.policy.impl.PhoneWindow.doInvalidatePanelMenu(PhoneWindow.java:775)
                        E   at com.android.internal.policy.impl.PhoneWindow$1.run(PhoneWindow.java:198)
                        E   at android.view.Choreographer$CallbackRecord.run(Choreographer.java:749)
                        E   at android.view.Choreographer.doCallbacks(Choreographer.java:562)
                        E   at android.view.Choreographer.doFrame(Choreographer.java:531)
                        E   at android.view.Choreographer$FrameDisplayEventReceiver.run(Choreographer.java:735)
                        E   at android.os.Handler.handleCallback(Handler.java:730)
                        E   at android.os.Handler.dispatchMessage(Handler.java:92)
                        E   at android.os.Looper.loop(Looper.java:137)
                        E   at android.app.ActivityThread.main(ActivityThread.java:5103)
                        E   at java.lang.reflect.Method.invokeNative(Native Method)
                        E   at java.lang.reflect.Method.invoke(Method.java:525)
                        E   at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:737)
                        E   at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:553)
                        E   at dalvik.system.NativeStart.main(Native Method)
                        E  Caused by: test.CompositeException: 2 exceptions occurred. See them in causal chain below.
                        E   ... 19 more
 gt_CompositeException  I  CompositeException.getMessage
                        I  CompositeException.getCause
                        I  CompositeException.getMessage
                        I  CompositeException.getCause

                           Process 28215 ended

It turns out Android has its own implementation of a Throwable with uses private void private void printStackTrace(Appendable err, String indent, StackTraceElement[] parentStack) to print the stacktrace. If the cause is null nothing is printed to the output.

I don't see any solution for this issue. For KitKat and above the addSuppressed is an easy fix, but what about older versions?

@tomrozb
Copy link
Contributor Author

tomrozb commented Jul 3, 2014

What about appending the stack trace to the message of a CompositeException? This is some kind of a dirty hack but it will work on any platform regardless of printStackTrace implementation.

@benjchristensen
Copy link
Member

Appending the stacktrace to the message is bad practice and would be despised in non-Android environments.

It's very annoying that Android does the wrong thing for this.

If we do anything, it would need to special-case behavior just for Android (such as appending the whole stack trace as a message).

@tomrozb
Copy link
Contributor Author

tomrozb commented Jul 3, 2014

I think any working solution would be appreciated by the Android developers society. I ended up wrapping the whole implementation of onError in try-catch block so I'm able to log the cause. So far this is a simple workaround but better error logging by the RxJava should be really addressed in the future release.

@benjchristensen
Copy link
Member

@mttkay What do you suggest be done for Android and CompositeException since it doesn't behave correctly with Throwable.printStackTrace()?

@tomrozb According to the code I see for Android, this should work if the logger is calling printStackTrace() as it looks exactly like the Java one: https://android.googlesource.com/platform/libcore.git/+/android-4.2.2_r1/luni/src/main/java/java/lang/Throwable.java

How are the logs being generated? Is this default Android logging that bypasses printStackTrace() or a 3rd party logging library?

@mttkay
Copy link
Contributor

mttkay commented Jul 4, 2014

Thanks for flagging this. We're still on 0.19.1 so haven't had this problem yet. I'll have a look as well.

@tomrozb
Copy link
Contributor Author

tomrozb commented Jul 4, 2014

@benjchristensen
I'm not using any 3rd party logging library. The logs I've attached are from the standard logging tool called logcat. Logs are generated by Android system. The second posted log is slightly different because I'm using tool to present it in a user-friendly way, but the log is still being generated by Android OS.

Crash reporting tools like Crashlytics or Bugsense utilize the Thread.UncaughtExceptionHandler to obtain the stack trace and send it to the server. As I wrote above, stack traces obtained this way are exactly the same as printed by the system.

The Throwable implementation is different for Android and plain Java. If they are using the private printStackTrace method with 3 arguments to print the stack trace, other public printStackTrace methods, which are overridden by RxJava, will not be called when a crash occurred.

I can confirm this bug on:
Galaxy Nexus 4.3 (stock Android)
G900F 4.4.2
GT-I9506 4.4.2
GT-I9195 4.4.2
GT-N7100 4.4.2
SM-T335 4.4.2
SM-T320 4.4.2

I've not tested this on non-samsung devices but I'm sure this apply to all Android devices.

@tomrozb
Copy link
Contributor Author

tomrozb commented Jul 4, 2014

It is very easy to understand what is going under the hood even without writing a line of code.

The simplest test case:

throw new RuntimeException(new CompositeException(Arrays.asList(new RuntimeException("the real cause"))));

No matter which one of the printStackTrace methods is called in the first place it end up calling the printStackTrace(err, "", null) on a first RuntimeException.

Now take a look at the Throwable implementation (Android 4.2.2)

    private void printStackTrace(Appendable err, String indent, StackTraceElement[] parentStack)
            throws IOException {
        err.append(toString());
        err.append("\n");
        StackTraceElement[] stack = getInternalStackTrace();
        if (stack != null) {
            int duplicates = parentStack != null ? countDuplicates(stack, parentStack) : 0;
            for (int i = 0; i < stack.length - duplicates; i++) {
                err.append(indent);
                err.append("\tat ");
                err.append(stack[i].toString());
                err.append("\n");
            }
            if (duplicates > 0) {
                err.append(indent);
                err.append("\t... ");
                err.append(Integer.toString(duplicates));
                err.append(" more\n");
            }
        }
        // Print suppressed exceptions indented one level deeper.
        if (suppressedExceptions != null) {
            for (Throwable throwable : suppressedExceptions) {
                err.append(indent);
                err.append("\tSuppressed: ");
                throwable.printStackTrace(err, indent + "\t", stack);
            }
        }
        Throwable cause = getCause();
        if (cause != null) {
            err.append(indent);
            err.append("Caused by: ");
            cause.printStackTrace(err, indent, stack);
        }
    }

What is important the only method used on a cause is printStackTrace with 3 arguments. Our cause is the CompositeException which doesn't have a cause and there is no way to override private printStackTrace method so that's why it doesn't work this way.

@mttkay
Copy link
Contributor

mttkay commented Jul 10, 2014

Just wondering, if the problem is that Android's Throwable refuses to print the stack trace when the cause is null, why not simply return this from CompositeException.getCause? That way we still don't have to initCause on the wrapped exceptions and make Android forward to CompositeException's custom printStackTrace impl.

This is just by reading the discussion mind you, so sorry if I'm not making any sense. I still haven't had time to bump RxJava to a newer version in our app .

@mattrjacobs
Copy link
Contributor

We see more threads getting stuck in infinite loops in removeDuplicatedCauses. Re-opening as we will need to re-fix this.

@mattrjacobs mattrjacobs reopened this Aug 20, 2014
@benjchristensen
Copy link
Member

So should we revert to what it was when we fixed it last time and special case Android? Or does someone have a legit fix for this?

I'm planning on reverting to what was a known fix for non-Android, and then following up with a special case for Android.

benjchristensen pushed a commit to benjchristensen/RxJava that referenced this issue Aug 26, 2014
- reverting to version that just does printStackTrace
- see ReactiveX#1405
@benjchristensen
Copy link
Member

Closing out again as I believe this is solved in #1632

As per comment #1632 (comment):

This passed the Netflix production canary over night. There are no threads spinning on circular references.

This pull request should work for both Android and Oracle JVMs as it supports both printStackTrace and getCause mechanisms. The printStackTrace approach is better, but the getCause one should work normally and then bail if it sees a circular reference.

@tomrozb
Copy link
Contributor Author

tomrozb commented Sep 3, 2014

There are still issues with stack trace printing on Android with RxJava 0.20.3. For example when UnknownHostException is thrown from OnSubscribe implementation nothing is printed to the logcat output. I've tested it on S5 4.4.2, S4 4.4.2, Galaxy Nexus 4.3.

Simple example to reproduce:

        Observable.create(new OnSubscribe<Object>() {
            @Override
            public void call(Subscriber<? super Object> subscriber) {
                subscriber.onError(new UnknownHostException());
            }
        }).subscribeOn(Schedulers.io())
                .observeOn(AndroidSchedulers.mainThread())
                .subscribe(new Action1<Object>() {
                    @Override
                    public void call(Object o) {
                        // Does nothing intentionally.
                    }
                }, new Action1<Throwable>() {
                    @Override
                    public void call(Throwable e) {
                        throw new RuntimeException(e);
                    }
                });

Logcat output:

        AndroidRuntime  D  Shutting down VM
              dalvikvm  W  threadid=1: thread exiting with uncaught exception (group=0x417a7da0)
        AndroidRuntime  E  FATAL EXCEPTION: main
                        E  Process: pl.example.test, PID: 25099

Change UnknownHostException to IOException, now logcat output looks like below:

        AndroidRuntime  D  Shutting down VM
              dalvikvm  W  threadid=1: thread exiting with uncaught exception (group=0x417a7da0)
        AndroidRuntime  E  FATAL EXCEPTION: main
                        E  Process: pl.example.test, PID: 9153
                        E  rx.exceptions.OnErrorFailedException: Error occurred when trying to propagate error to Observer.onError
                        E      at rx.observers.SafeSubscriber._onError(SafeSubscriber.java:175)
                        E      at rx.observers.SafeSubscriber.onError(SafeSubscriber.java:97)
                        E      at rx.internal.operators.NotificationLite.accept(NotificationLite.java:144)
                        E      at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber.pollQueue(OperatorObserveOn.java:169)
                        E      at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber.access$000(OperatorObserveOn.java:66)
                        E      at rx.internal.operators.OperatorObserveOn$ObserveOnSubscriber$2.call(OperatorObserveOn.java:151)
                        E      at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:43)
                        E      at android.os.Handler.handleCallback(Handler.java:733)
                        E      at android.os.Handler.dispatchMessage(Handler.java:95)
                        E      at android.os.Looper.loop(Looper.java:136)
                        E      at android.app.ActivityThread.main(ActivityThread.java:5586)
                        E      at java.lang.reflect.Method.invokeNative(Native Method)
                        E      at java.lang.reflect.Method.invoke(Method.java:515)
                        E      at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:1268)
                        E      at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1084)
                        E      at dalvik.system.NativeStart.main(Native Method)
                        E  Caused by: rx.exceptions.CompositeException: 2 exceptions occurred.
                        E      ... 16 more
                        E  Caused by: rx.exceptions.CompositeException$CompositeExceptionCausalChain: Chain of Causes for CompositeException In Order Received =>
                        E      at android.util.Log.getStackTraceString(Log.java:421)
                        E      at android.util.Slog.e(Slog.java:151)
                        E      at com.android.internal.os.RuntimeInit$UncaughtHandler.uncaughtException(RuntimeInit.java:109)
                        E      at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:693)
                        E      at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:690)
                        E      ... 1 more
                        E  Caused by: java.io.IOException
                        E      at pl.example.test.MyActivity$3.call(MyActivity.java:48)IOException
                        E      at pl.example.test.MyActivity$3.call(MyActivity.java:45)
                        E      at rx.Observable.unsafeSubscribe(Observable.java:8587)
                        E      at rx.internal.operators.OperatorSubscribeOn$1$1.call(OperatorSubscribeOn.java:62)
                        E      at rx.internal.schedulers.ScheduledAction.run(ScheduledAction.java:43)
                        E      at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:422)
                        E      at java.util.concurrent.FutureTask.run(FutureTask.java:237)
                        E      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:152)
                        E      at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:265)
                        E      at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1112)
                        E      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:587)
                        E      at java.lang.Thread.run(Thread.java:841)
                        E  Caused by: java.lang.RuntimeException: Duplicate found in causal chain so cropping to prevent loop ...
                        E      at android.util.Log.getStackTraceString(Log.java:421)
                        E      at android.util.Slog.e(Slog.java:151)
                        E      at com.android.internal.os.RuntimeInit$UncaughtHandler.uncaughtException(RuntimeInit.java:109)
                        E      at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:693)
                        E      at java.lang.ThreadGroup.uncaughtException(ThreadGroup.java:690)
                        E      at dalvik.system.NativeStart.main(Native Method)

Can anyone confirm this issue with the code I've attached?

@benjchristensen
Copy link
Member

I can't help to test this as I don't do Android development or have an environment to test with.

The current implementation uses getCause() to provide the causes, so as long as Android calls getCause() then it should work. If however it even bypasses that (as it bypasses printStackTrace) and reaches into the internal super field, then it won't work, as we don't wire up the causal chain unless getCause() is invoked (instead of construction time as non-Android platforms don't need or want that). If Android is doing that ... then apparently it doesn't obey any APIs.

If Android is completely ignoring even getCause() then the only way I can think of fixing this is having two different implementations and an Android specific check that delegates to a different solution.

/cc @mttkay

@tomrozb
Copy link
Contributor Author

tomrozb commented Sep 4, 2014

Maybe there's something wrong with my code. I've put the code above in onResume of a totally empty Activity to nail down where's the issue. If somebody can test this on other devices it would be very helpful. I don't have any non-Samsung devices at the moment.

@mttkay
Copy link
Contributor

mttkay commented Sep 8, 2014

I'll put some time aside either tomorrow or Thursday to work through a backlog of Rx specific things. I can try to reproduce then.

I haven't personally seen this issue yet (we're on 0.20.3) but I'll double check the crash logs.

@mttkay
Copy link
Contributor

mttkay commented Sep 9, 2014

Just a quick heads up that I can confirm this issue.

Not sure yet why it's happening, but swapping out the exception type does appear to fix it. Which is odd, since UnknownHostException is an IOException? I also looked at Exceptions.throwIfFatal if it had to do with rethrowing exceptions or not, but it does not mention either type.

@tomrozb
Copy link
Contributor Author

tomrozb commented Sep 9, 2014

UnknownHostException is not the only one that is not logged. I'm sure I've found at least 3 different exceptions that do not produce any output to the logcat. I'll post these exceptions here once I figure out which one causing problems. Good to hear that this issue is confirmed and easy to reproduce.

@mttkay
Copy link
Contributor

mttkay commented Sep 9, 2014

It's really hard to debug, unfortunately. I was able to pour your example in an automated test and run it on a Genymotion 4.3 instance -- that's the easy part. What I'm not able to do is debug into Android's stack trace methods (parts of it are native code, too, but the problem might exist on the Java layer). This might be an issue with the new project setup, I'm usually able to debug platform classes.

I'll poke around some more.

@mttkay
Copy link
Contributor

mttkay commented Sep 9, 2014

Sorry I'm still fighting the tools :-/ I might have to pick this up again another day...

@benjchristensen
Copy link
Member

Moving Android specific discussion to ReactiveX/RxAndroid#21

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants