Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gracefully handle WebView Crashes #5820

Conversation

david-allison
Copy link
Member

@david-allison david-allison commented Mar 14, 2020

Purpose / Description

Out of Memory issues in the WebView Renderer (e.g. large fonts) were causing the application to crash and restart, going to the deck browser. This was listed as a soft crash in the bug report, but was actually killing the PID.

If a card fails once, we reset the WebView and try to render the card again.

If a card has not been rendered, or if the same card fails twice, we fail gracefully and return to the deck browser. We show a Toast to the user explaining that an OOM occurred.

Fixes

Fixes #5780

Approach

We catch the OOM using onRenderProcessGone() (API >= 26) and recreate the WebView to allow the application to continue with minor disruption.

How Has This Been Tested?

Locally on my Android device (Android 9). Placing crashWebViewRenderer() inside displayCardAnswer()

  • A single failure recreates the WebView
  • A second failure on the same card takes the user to the Deck Browser.
  • A failure on a different card recreates the WebView.

Learning (optional, can help others)

StackOverflow example: https://stackoverflow.com/questions/46737984/onrenderprocessgonewebview-view-renderprocessgonedetail-detail-example

API: https://developer.android.com/reference/android/webkit/WebViewClient#onRenderProcessGone(android.webkit.WebView,%20android.webkit.RenderProcessGoneDetail)

https://developer.android.com/guide/webapps/managing-webview#termination-handle

https://chromium.googlesource.com/chromium/src/+/master/android_webview/browser/aw_browser_terminator.cc#93

Checklist

  • [ x ] You have not changed whitespace unnecessarily (it makes diffs hard to read)
  • [ x ] You have a descriptive commit message with a short title (first line, max 50 chars).
  • [ x ] Your code follows the style of the project (e.g. never omit braces in if statements)
  • [ x ] You have commented your code, particularly in hard-to-understand areas
  • [ x ] You have performed a self-review of your own code

Review Goals

  • I'm unfamiliar with internationalisation. May have screwed up
  • Do we need to do anything more to handle lower API levels?
  • inflateNewView - doesn't feel like good code.
  • We currently use Toasts because they're easy, but they don't stay on the screen for long or have much room.
  • lastCrashingCardId - may not be a perfect test.
  • We only report the failing card ID to the user. We could do better (not sure if out of scope).

Out of Memory issues in the WebView Renderer (e.g. large fonts)
were causing the application to crash and restart, going to the deck
browser.

We catch the OOM using `onRenderProcessGone()` (API >= 26) and recreate
the WebView to allow the application to continue with minor disruption.

If a card has not been rendered, or if the same card fails twice, we
fail gracefully and return to the deck browser.
@david-allison david-allison force-pushed the issue-5780-webview-render-crash-design-review branch from 454b1f7 to 9cc0244 Compare March 14, 2020 18:20
@mikehardy
Copy link
Member

Wow, this is great - the re-render attempt is a nice touch.

  • I18N looks fine
  • I am sad it is API26 and higher only but at least Android upgrades are becoming more common, and this covers as many as is possible anyway, can only do what we can do. I'm not aware of any API that covers lower versions though this might be where trying to figure out why the reviews were showing up as new again as interesting, under API26 this will keep happening
  • inflateNewView does seem a little chewy but in the absence of better ideas I have no problem with tested code that fixes WebView crashes since that's definitely our Update 2.0.1 branch from flerda #1 crash at this point
  • we have a choice of Toasts (long or short), Snackbars, and Dialogs. A long toast actually does hang around for a while, and I think if someone has cards (or a device) where the WebView crashes they will see this repeatedly and a long toast might be good enough. I think you chose the long toast, so that works for me
  • perhaps the last crash card id should be a member, persisted across Activity restarts (onSaveInstanceState etc) but otherwise the mechanism seems like a solid idea, but yes with care that we know the actual id the WebView crashed on, and know what card is loaded and is currently attempting to display

@david-allison
Copy link
Member Author

I'm not aware of any API that covers lower versions

Likewise, it's a shame

though this might be where trying to figure out why the reviews were showing up as new again as interesting, under API26 this will keep happening

I didn't see this when I studied.

Theory: New card data is stored in col, which is saved at a different time from cards (grep: col.save()). Crashing the application loses this data. I think new cards are more likely to have a backlog greater than the daily setting, whereas reviews, learnings and lapses typically provide all possible cards at once.

The effect would be that the "new cards studied today" count is reset, not that reviews are being lost/undone.

we have a choice of Toasts (long or short), Snackbars, and Dialogs. A long toast actually does hang around for a while, and I think if someone has cards (or a device) where the WebView crashes they will see this repeatedly and a long toast might be good enough. I think you chose the long toast, so that works for me

Perfect! I chose a long toast, but the other options sound better.

I'd like to turn this into a Dialog, and display a short preview of the first few words of the first field of the card (obtained via Spannable.fromHtml, or the text used by the Card Browser).

Will screenshot to confirm it's appropriate.


perhaps the last crash card id should be a member, persisted across Activity restarts (onSaveInstanceState etc) but otherwise the mechanism seems like a solid idea, but yes with care that we know the actual id the WebView crashed on, and know what card is loaded and is currently attempting to display

Sounds solid at first thought.

Documentation states we should not try to clean up unrelated instances
of WebViews.

Returning false will cause a crash. Therefore return true to state
that we don't need to crash.
@david-allison
Copy link
Member Author

@mikehardy Coming back to this as my next priority.

What's the git strategy with review comments?

Append + Squash merge, or should I rebase?

It was an error which showed the card details (id). It's best to allow
the user to explicitly dismiss it, as they may want to screenshot the
details.
Sometimes, even after writing a handler, the application crashed.

I believe that the C++ counts Java references when determining what to
check when a WebView process dies. We perform a GC to ensure these are
cleaned up by the time te C++ executes.
@david-allison
Copy link
Member Author

david-allison commented Mar 16, 2020

Review comments should be fixed, except

perhaps the last crash card id should be a member, persisted across Activity restarts (onSaveInstanceState etc) but otherwise the mechanism seems like a solid idea, but yes with care that we know the actual id the WebView crashed on, and know what card is loaded and is currently attempting to display

I feel there's minimal value in this. If we're in a crash loop, it won't need user input to trigger. It seems like added complexity for 1 less iteration of the crash loop before returning to the browser.


That all being said, say the word and I'll do it.

@mikehardy
Copy link
Member

@mikehardy Coming back to this as my next priority.

Yeah - you have a lot of plates spinning :)

What's the git strategy with review comments?

Append + Squash merge, or should I rebase?

In general we want a squeaky clean commit history. So when I'm working on AnkiDroid stuff I'm always doing rebase HEAD~N, fixing things, and re-force-pushing.

For final merge if the commit history is really clean (via the above strategy) I just rebase and merge, if it's messy I squash and merge

Copy link
Member

@mikehardy mikehardy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 nits and one testing remnant (I think) - but this is looking really good, I'm excited to merge this

@mikehardy
Copy link
Member

Excellent - just waiting on CI then. I'm obviously motivated by crash stats at the moment, the impact this PR should have on the stats should be incredible, even with just the higher APIs supporting it.

@david-allison
Copy link
Member Author

@mikehardy Do you have a list of most impactful crash bugs? Would be great to power through these over the next week.

@mikehardy mikehardy changed the title Fix #5780 - WebView OOM Crashes Application Gracefully handle WebView Crashes Mar 16, 2020
@mikehardy mikehardy merged commit 9c12b9a into ankidroid:master Mar 16, 2020
@mikehardy
Copy link
Member

Honestly I will have to re-evaluate after this OOM fix, and the Custom Study Dialog fixes go out. It will take a week or so before I'm prepared to point at any issue now as I'm not sure there are any left that will have significant impact but it's hard to say. The last couple months has gotten almost all of them.

If you really want to look at something impactful you might take on the zombiest issue of all time #1377 - a WYSIWYG editor mode. I'm sure there's a component for that somewhere (there has to be?) that can be bolted on? Not having the ability to do basic bold / underline / italic is a huge issue for many.

@mikehardy
Copy link
Member

User report - success!

I just got the update a little while earlier, while doing my usual routine it still seems like crashing sometimes like earlier, but then it doesn't really crash and resumes like normal just showing a pop-up in the lower part of the screen (webview renderer something, I didn't bother to actually write it up because I think you already know what it is), and most importantly my new cards never gets reset, it's basically normal operation with an hiccup here and there.
Thanks again, from what I can see the bothersome part of this problem has been fixed!

@mikehardy
Copy link
Member

Incidentally, this covers 58.6% of our userbase already (API26 / Android 8+)

@david-allison david-allison deleted the issue-5780-webview-render-crash-design-review branch March 22, 2020 09:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Soft crash to decklist during reviews.
2 participants