Skip to content
This repository was archived by the owner on Nov 1, 2022. It is now read-only.

Conversation

@jonalmeida
Copy link
Contributor

Previously, we wanted to throw on all unknown push errors so that we
were notified on them. Since this seems to be more common than
originally expected, we should just catch them and in a future version,
we should log them without crashing.

All of these push errors can be considered recoverable except
for InternalPanic.


Spoke offline with @jrconlin, and it's safe to not crash on these errors.

Pull Request checklist

  • Quality: This PR builds and passes detekt/ktlint checks (A pre-push hook is recommended)
  • Tests: This PR includes thorough tests or an explanation of why it does not
  • Changelog: This PR includes a changelog entry or does not need one
  • Accessibility: The code in this PR follows accessibility best practices or does not include any user facing features

After merge

  • Milestone: Make sure issues closed by this pull request are added to the milestone of the version currently in development.
  • Breaking Changes: If this is a breaking change, please push a draft PR on Reference Browser to address the breaking issues.

@jonalmeida jonalmeida added the 🕵️‍♀️ needs review PRs that need to be reviewed label Jan 22, 2020
@jonalmeida jonalmeida requested a review from grigoryk January 22, 2020 18:47
@codecov
Copy link

codecov bot commented Jan 22, 2020

Codecov Report

Merging #5679 into master will increase coverage by 1.9%.
The diff coverage is 88%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master    #5679      +/-   ##
============================================
+ Coverage     79.76%   81.66%    +1.9%     
+ Complexity     5209     1597    -3612     
============================================
  Files           565      193     -372     
  Lines         25107     7665   -17442     
  Branches       3762      988    -2774     
============================================
- Hits          20026     6260   -13766     
+ Misses         3669      982    -2687     
+ Partials       1412      423     -989
Impacted Files Coverage Δ Complexity Δ
...mozilla/components/feature/push/AutoPushFeature.kt 96.21% <87.5%> (+4.27%) 30 <1> (+1) ⬆️
...a/mozilla/components/concept/push/PushProcessor.kt 95.83% <88.88%> (-4.17%) 0 <0> (ø)
.../components/concept/engine/prompt/PromptRequest.kt 73.33% <0%> (ø) 0% <0%> (ø) ⬇️
...ts/feature/pwa/feature/WebAppHideToolbarFeature.kt 58.82% <0%> (ø) 8% <0%> (ø) ⬇️
...a/mozilla/components/concept/engine/media/Media.kt 72.72% <0%> (ø) 4% <0%> (ø) ⬇️
...ozilla/components/feature/downloads/ext/Context.kt 0% <0%> (-4.77%) 0% <0%> (ø)
...ents/service/fretboard/source/kinto/KintoClient.kt 94.73% <0%> (ø) 9% <0%> (ø) ⬇️
...wser/session/engine/request/LoadRequestMetadata.kt 100% <0%> (+8.33%) 6% <0%> (ø) ⬇️
...ava/mozilla/components/feature/pwa/ext/Activity.kt 100% <0%> (ø) 0% <0%> (ø) ⬇️
...n/java/mozilla/components/feature/qr/QrFragment.kt 45.97% <0%> (ø) 24% <0%> (ø) ⬇️
... and 529 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e1d27a2...c712dad. Read the comment docs.

crashReporter?.submitCaughtException(GenericPushError(error.desc))
}
}
crashReporter?.submitCaughtException(error)
Copy link
Contributor

@grigoryk grigoryk Jan 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm don't think this is right - and your changes to PushError. Questions:

  • If we get a PushError.Rust, will Sentry/Soccorro display the cause exception? Before we submitted it explicitly, now we're assuming it'll somehow show up? Can you check?
  • I think we will drop desc on the floor for all other exceptions. You're not overriding Exception's message, nor are you passing desc in Exception's constructor, nor do you have your own toString impl.

TBH, I'd just keep things as they were.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These questions were answered offline. I've overridden the message for extra certainty and for the RustError we override the throwable as well.

is TranscodingError,
is RecordNotFoundError,
is UrlParseError -> false
else -> true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, new exception types will throw? Not sure that's worth creating us extra work in the future (to keep this list up-to-date). Either way, these exceptions will end-up in Sentry - either as infos, or as fatal.

There's an idea in building robust software systems - be lenient with what you accept, and strict with what you send out. I think it applies here. We're handling an external message, subject to all sorts of unexpected issues outside of our control. If we crash here in response to a bad message, we're not gaining much! So, maybe our push stack got into a pickle and doesn't function correctly (either due to a server issue, or a client issue, or some network corruption, or..) - what do we gain by crashing here? The rest of the app is, most likely, perfectly functional. You'll just be getting in a user's way.

As long as we're aware of these problems, we can hopefully fix them - and that's what submitCaughtException already buys you. Submit these would-be crashes as infos, and let the app live :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the original discussion, we wanted to crash on newly added exceptions but we had to list them all out, so we opted for a whitelist instead. Now that we know that we don't need to crash on fatal errors, we're just catching and submitting them and we're at least aware of changes to the system this way.

Are you suggesting we don't crash at all? The exceptions that would go through here would be non-push specific (besides InternalPanic) since we're only logging them with submitCaughtException.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's differentiate between error handling for onMessageReceived and for other types of exceptions (one way is to add onMessageReceivedError to AutoPushFeature in addition to onError - but see last paragraph).

And not crash at all (but rather, log and submit via crash reporter) in case of those types of errors. The other errors we can be more crashy with, I think.

Underlying idea is that it doesn't feel right to me to crash in case of a failure of a fairly isolated system due to an external event.

Also, you'll probably notice that there's currently a bit of a circle dance around error handling - e.g. AbstractFirebasePushService.onMessageReceived -> AutopushFeature.onMessageReceived -> throws -> AbstractFirebasePushService catches -> AutopushFeature.onError.

It seems like this could be simplified. e.g. AutopushFeature.onMessageReceived can just deal with exceptions directly, and if it throws, consumers should re-throw.

Copy link
Contributor

@grigoryk grigoryk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, our conversation about never throwing in onMessageReceived notwithstanding :) I'm fine with either direction, but would prefer to never see fatal crashes coming out of that method.

Previously, we wanted to throw on all unknown push errors so that we
were notified on them. Since this seems to be more common than
originally expected, we should just catch them and in a future version,
we should log them without crashing.

All of these push errors can be considered recoverable except
for InternalPanic.
@jonalmeida
Copy link
Contributor Author

LGTM, our conversation about never throwing in onMessageReceived notwithstanding :) I'm fine with either direction, but would prefer to never see fatal crashes coming out of that method.

I've filed #5691 as follow-up.

@jonalmeida
Copy link
Contributor Author

bors r=grigoryk

@jonalmeida
Copy link
Contributor Author

bors status

bors bot pushed a commit that referenced this pull request Jan 23, 2020
5679: Closes #5677: Catch all known non-fatal push errors r=grigoryk a=jonalmeida

Previously, we wanted to throw on all unknown push errors so that we
were notified on them. Since this seems to be more common than
originally expected, we should just catch them and in a future version,
we should log them without crashing.

All of these push errors can be considered recoverable except
for InternalPanic.




5683: Closes #5682: Remove failedToLaunch property from AppLinksUseCases r=NotWoods a=rocketsroger



5688: Closes #5684: Intermittent failures in WebExtensionBrowserMenuItemTest r=Amejia481,psymoon a=csadilek



Co-authored-by: Jonathan Almeida <jalmeida@mozilla.com>
Co-authored-by: Roger Yang <royang@mozilla.com>
Co-authored-by: Christian Sadilek <christian.sadilek@gmail.com>
@bors
Copy link

bors bot commented Jan 23, 2020

Build succeeded

  • complete-push

@bors bors bot merged commit c712dad into mozilla-mobile:master Jan 23, 2020
@jonalmeida jonalmeida deleted the issue-5677 branch December 15, 2020 00:50
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

🕵️‍♀️ needs review PRs that need to be reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants