-
Notifications
You must be signed in to change notification settings - Fork 6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ANR on player release #4352
Comments
|
Agreed we need a full bug report (or at least the stack traces of all of the threads) to debug this. |
|
@andrewlewis sent you an e-mail with ANR bug reports from different devices. |
Looking at the stack traces, it looks like We do see this kind of ANR happen occasionally on devices, when the underlying platform media server gets into a bad state. This is not something we'd expect to occur more frequently with 2.8.0 than earlier releases, however, and I'm not aware of anything that we did that's likely to have exacerbated the issue. Are you certain these ANRs are increased as of 2.8.0, rather than some other change, such as them being better reported than they were previously? What ANR rate did you see when releasing the player previously? It should have been higher than 0 if ANRs were being reported correctly, since we've been aware of rare/sporadic ANRs when calling |
@ojw28 Thanks! We have not seen any of ExoPlayer related ANR reports before. Even can't find any of them. Do you have any ideas why could we have not seen ExoPlayer related ANRs before? |
That's (unfortunately!) strange, unless you were previously not calling |
Hey, |
After the final release we do not see any valuable ANR growth. However, we saw it (ExoPlayer-related ANRs) in reports of Beta-versions. Probably, was caught "problematc" beta testers. Also, during research we've found some specific devices on which changing Surface on-the-fly causes ANRs. Will open separate issue on this. Thanks! |
Hi, I'm seeing the same ANR issue on both the release() method of the exoplayer and also the blockingSendMessages of the exoplayer. Am on exoplayer 2.2.0. Seeing that this issue has still not been resolved, is there good reason to update to the latest version and try? Also, any workarounds for this issue? I have attached stack traces for both issues. |
We are experiencing this issue in 2.7.1 as well, stacktrace from the Google Play Console, as you can see we call at java.lang.Object.wait (Object.java)
- waiting on <0x057d9834> (a com.google.android.exoplayer2.k)
at com.google.android.exoplayer2.ExoPlayerImplInternal.release (ExoPlayerImplInternal.java:232)
- locked <0x057d9834> (a com.google.android.exoplayer2.k)
at com.google.android.exoplayer2.ExoPlayerImpl.release (ExoPlayerImpl.java:342)
at com.google.android.exoplayer2.SimpleExoPlayer.release (SimpleExoPlayer.java:681)
at com.plexapp.plex.videoplayer.local.v2.Exo2VideoPlayer.disconnect This looks to be only picked up now because of the changes to ANR's on the Google Play Console, rather than an issue in a new version of ExoPlayer v2 (at least from what we can tell). |
@ojw28 / @tonihei After all discussions about threading and the adding of your internal stat handler that creates the race conditions #4278. I finally followed the recommendations and switched to all access on mainthread and I'm now facing this ANR too. (Google Vitals, unable to reproduce yet)
Related code is;
IMO you can't ask people to call that on main thread as there's so many possible reasons for a message sent to a thread to be delayed or not arriving at all. Hard to choose between very rare race conditions or rare ANR. And a little disappointed that the issue was known yet pushing toward main thread :( Is there a way to reduce risk of race conditions with your stat handler when moving back the release call to background thread? |
The rare/sporadic ANRs occur when
I don't think this is an accurate reflection of the choice. It's more like choosing between:
|
You send a message to a thread and wait for something that is supposed to be done there for infinite time. About the choice, currently the app only use AudioPlayer part, so being a bad citizen is less an issue, and this is just a small part of the app. Release occurs when the user changes it's player to something not ExoPlayer related so hopefully won't reuse the player until the codec is released. If it's never released then maybe there's something to address in ExoPlayer too if possible, else in all cases, crash or issue user is not happy about the app. And currently app is rated 4.75 and in top 2% of all Play Store apps about crash and 7% for ANR (Both remaining issues being due to bugs in Support Library that should be fixed in 28 and Android 8 bug with startForeground). I'm lucky to have an active community that report issues and allow building such app, so everything I can do on my side to ensure them a proper experience is important. |
This is hypothetical and not true in practice. The message queue is completely under our control, and will normally only have one or two messages queued at any given point in time.
There are some cases where not waiting is significantly worse, as is the case here. There are other cases where waiting on the main thread is unavoidable. For example if you've passed a |
@ojw28 Is there a way to detect that? I don't mind showing a wait dialog or anything useful to the users but randoms ANR or app killing for P without a message or anything to warn the user for something that is known and I hope can be detected does not sounds OK to provide quality app. I'm open to any suggestion or workaround. Just don't want ANR / app killing without proper user message. PS: Not wanting to argue more but
Just 2 messages and a wait is some ms and dropped frames, of course it's not that awful and very far from the ANRs, but it's still dropped frames so lag to the user. And just saying normally implies that there could be anormal cases. |
Unfortunately the only real fix for the MediaCodec issue is to better prevent devices from shipping with the underlying issues that cause it in the first place. Which we've been working on, but it's a hard problem. The percentages are so low and the failures so sporadic, that it's not possible to write a test that will reliably detect such issues during Android device certification. I'm not aware of a way to kill the ExoPlayer thread without killing the process. I don't think that's possible.
Whether there are dropped frames depends entirely on how long those messages take to process, so this in itself does not seem like a strong argument. I'm not saying there are never dropped frames, but I am saying (a) it's not inevitable that the piece of code in question will cause dropped frames, and (b) the benefit of blocking until we've released the limited resources is more important. The ANR issue affects our own applications (e.g. YouTube) as well. If we thought there were a better solution then we would have implemented it. |
Well there was no offense I'm not native English speaker so often sound harsh. I totally suppose that you'll have fixed it if it was possible. All I'm saying is that ANR / crash are bad, a message saying there's a problem press ok to restart the app is better. I can understand it does not fit Youtube, but for my user target it's a better option. About sporadic, I only have 7500 beta testers and the audio player is used by 5% max, beta have run for 3 days and already happened twice on an S6 edge and Moto X. So for my use case the remaining solution would be to propose to the user to restart the app when there's the lock. The best solution (for my need) would be that my app only propose that when the next player start and can't access the MediaCodec since it's still locked by previous unreleased one. (That I'll release in background thread). For more background, my app is not primarily a player, it's more a global controller, that connect and control things. At some point user might be playing audio from multiple source to local device and that's the part where ExoPlayer kicks in. User can then stop playing locally and start playing on Chromecast or anything else. Having the app hang and ANR because of ExoPlayer that is no more used and will probably won't until next app start, and have the user loose it's music and queue running on the other player is not something cool if it can be avoided. |
Facing the same issue exoplayer version 2.9.1 Affected android versions 8.0,8.1,9.0
stack trace
|
I can reproduce it quite regularly with the attached sample after a few "play + exit the player" action on a Nexus 6 - API 7.1.1. The sample recording shows no problems in other devices and Android versions. About the attached sample, it isn't a regular video. Don't expect any image or audio. The video content is dark and is just for testing purposes. It has been recorded using the Android MediaRecorder with a time-lapse Camcorder profile, setting the recording rate at 30fps and a capture rate at 90fps. |
We are facing this same issue on version 2.9.4 of ExoPlayer, unlikely described above, we didn't have any errors on Android versions 7.0 and above, but this is likely because of our user base. We noticed that this happens on certain streams, notably a Is there any news about this issue? a workraround or something like it? thx |
In addition when the error happens this is the logcat thrown by the MediaCodec:
|
I had some success in getting around the ANR, although for the time being looks more like black-yu-yu-magic than a proper solution. And still, it has some side effects but which at least can be managed. So, by calling seekToDefaultPosition before the stop and release methods the ANR disappears, yet the logs show that there is failure from MediaCodec. It is likely that the MediaCodec stop flow fails altogether, but this is just a guess. Then the player is not usable for a couple of seconds. If trying to use it again right away then ExoPlayer throws an exception, but since it is thrown via the onPlayerError callback then this one can be gracefully handled. After a couple of seconds, the player recovers, and everything work fine again. So, to resolve the ANR, I do as next:
Also noticed that if seekToDefaultPosition is called after the stop method then the ANR still happens so must be called before the stop & release. And these are the 2 new logs, thrown only when the error happens, instead of having the ANR. This first log is shown when releasing the player, since we started to also call seekToDefaultPosition:
This second log, is the exception thrown by ExoPlayer via onPlayerError when trying to use the player again right away, but which no longer happens after a short moment:
|
As you say, it's likely that the calls to stop/release the codec are still failing (throwing instead of just blocking). When you create a player shortly afterwards, |
@andrewlewis |
This comment has been minimized.
This comment has been minimized.
Hey guys. we also facing mediacodec release or stop ANR. logcat only this hints: @perracolabs did u mean call flush before stop & release could workaround? |
Facing ANR in release();
|
Either leave the ANR as Google want expecting this to be a proper user experience .... Specially on P+ where ANR just close the app. (Hint: This is not proper user experience at all) Or workaround yourself on the app side, release on the background, properly manage initialisation errors with users messages and if after a time you evaluate long enough init still does not work, propose to the user to restart the app to fix the issue and restart the process. |
Device: Samsung SM-N9109W Android 6.0.1, API 23 Bug reproduces steps:
But if i put the prepare, play, release player action in single activity, everything is fine! protected boolean flushOrReleaseCodec() {
if (codec == null) {
return false;
}
if (codecDrainAction == DRAIN_ACTION_REINITIALIZE
|| codecNeedsFlushWorkaround
|| (codecNeedsEosFlushWorkaround && codecReceivedEos)) {
releaseCodec();
return true;
}
codec.flush(); // here is blocked
....
} the stracktrace is
|
@liungkejin Can you provide an example app for this? If it's 100% reproducible it should be possible to figure out what's wrong. |
This is caused by the fact that ExoPlayer calls into MediaCodec.setVideoScalingMode() directly from its OnFrameRenderedListener() callback. The Android stagefright library has a bug where it holds a lock while calling back onFrameRenderedListener callbacks. I believe it is a bug for any component to hold a lock while calling an arbitrarily registered listener because you cannot know whether that listener will then need to acquire the same lock, which would then lead to a deadlock - which is what is happening here. The setVideoScalingMode() call itself goes into native stagefright code which then waits for some other event from MediaCodec ... but that event cannot be delivered by the MediaCodec thread that would deliver it because to do so would require acquiring the same MediaCodec lock that is already being held by the thread calling setVideoScalingMode. A workaround is to have ExoPlayer's MediaCodecVideoRenderer.OnFrameRenderedListenerV23 instance defer the call to 'onProcessedTunneledBuffer' to a runnable posted to the handler, which will allow the onFrameRenderedListenerCallback to release its lock and return before the call to setVideoScalingMode that onProcessedTunneledBuffer will be made. We are currently accomplishing this by deferring a Runnable to call onProcessedTunneledBuffer() from within OnFrameRenderedListenerV23.onFrameRendered() ... but we recognize that this would be accomplished in a manner more coherent with the MediaCodecVideoRenderer implementation by posting an event to be handled by the player thread instead, and we are working on a pull request that will accomplish this. By the way, the actual bug is in libstagefright. It should NOT hold locks while calling back listeners; that's elementary in any multithreaded code that uses callbacks. I would submit this to the stagefright project but I couldn't figure out how to do that; perhaps the ExoPlayer library maintainers know how to do this ... |
@tonihei I can do one better... Here's the threads that are deadlocked (in our case, tunneling causes the issue, but in the general case it is calling back into I'll send a bug report, too big to post... Here's the threads involved:
As per the docs, MediaCodec is calling back and the creator's So the wait down in native here is: At the same time, the native side of handling the video format change is trying to post a message that will complete the java side, but it is waiting on the same lock (0x0bfaa167):
Meanwhile our lowly application main thread is waiting forever to for the MSG_RELEASE to process, presumably it's in the queue behind the deadlocked frame rendered callback.
|
That's a great find! I'll try to get this fixed in the platform code if possible [Done - Will be fixed in future Android releases] To clarify for other readers - this bug only affects tunneled playbacks, so the majority of reported ANRs on player release won't be fixed by this. |
Thanks for the heads up, I deleted our branch. You’ll have to check with the Google folks when the will merge dev-v2. You might want to
… On Mar 26, 2020, at 2:56 PM, Sergei Dryganets ***@***.***> wrote:
Wonder what is the plan for this particular case:
1ec0519 <1ec0519>
It seems to be was merged to the dev-2 a while ago. How long it usually takes for the changes to get to the release-v2?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub <#4352 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AADBF6ECAWIQNUSRJQHR6QTRJPFPDANCNFSM4FD6CAQA>.
|
I also suffered from this issue. (No response at MediaCodec.flush) After some tests, it wasn't clear, but the issue appeared after the setSurface call. I was using SurfaceView to hand over the Surface, but as a test I used TextureView and the issue disappeared. I leave a post in case there is anything that might help. |
@Tolriq May I know if releasing player in backgrond thread solves this locking problem? Btw, I found that for Instagram's they have quite an interesting implementation. It looks like they spin up another process for ExoPlayer and make use of IPC to do playing stuff. |
Yes it does works perfectly for my use case as when I release the player I know I won't use it instantly later and since there's still no way to properly detect / handle this case. And failure to initialize player is something that can be handled and show to user for better experience. I also use in R8
To have this check removed from production release for a very insignificant performance gain :p I still do not understand why there's no better workaround so that this that could be handled by devs and not hidden behind potential ANRs. ANRs outside of dev control are bad. |
Using a timeout prevents ANRs in cases where the underlying platform gets blocked forever, so we enable this feature by default. Issue: #4352 PiperOrigin-RevId: 335642485
We now enabled a change that times out the release call so that it's less likely that the app runs into ANRs. Instead the player will report an error through The underlying platform issue for this ANR is solved from Android 11, and this workaround is probably all we can do for the existing Android versions. |
Issue description
According to log reports from users, after upgrade to r2.8.1 (from r2.6.1), on about 1% of app sessions we have ANR on SimpleExoPlayer.release().
Never seen this issue (reports) on versions <=r2.6.1.
Reproduction steps
Play some stream and then call SimpleExoPlayer.release(), called in main thread.
Link to test content
Unknown for a while.
Version of ExoPlayer being used
r2.8.1.
Device(s) and version(s) of Android being used
Different devices and Android versions: Nvidia Shield TV, Xiaomi Mi-Box, etc.
A full bug report captured from the device
The text was updated successfully, but these errors were encountered: