Closes #5677: Catch all known non-fatal push errors #5679
          Codecov Report
 @@             Coverage Diff              @@
##             master    #5679      +/-   ##
============================================
+ Coverage     79.76%   81.66%    +1.9%     
+ Complexity     5209     1597    -3612     
============================================
  Files           565      193     -372     
  Lines         25107     7665   -17442     
  Branches       3762      988    -2774     
============================================
- Hits          20026     6260   -13766     
+ Misses         3669      982    -2687     
+ Partials       1412      423     -989
| crashReporter?.submitCaughtException(GenericPushError(error.desc)) | ||
| } | ||
| } | ||
| crashReporter?.submitCaughtException(error) | 
I don't think this is right, and the same goes for your changes to PushError. Questions:
- If we get a PushError.Rust, will Sentry/Socorro display the cause exception? Before, we submitted it explicitly; now we're assuming it'll somehow show up? Can you check?
- I think we will drop desc on the floor for all other exceptions. You're not overriding Exception's message, nor are you passing desc in Exception's constructor, nor do you have your own toString impl.
TBH, I'd just keep things as they were.
These questions were answered offline. I've overridden the message for extra certainty and for the RustError we override the throwable as well.
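For readers following along, here is a minimal sketch of what "overriding the message" and "overriding the throwable" could look like. SketchPushError and its subclasses are hypothetical stand-ins, not the actual PushError classes from this PR; the point is only that the description ends up in message (and therefore in toString) and that the wrapped exception is exposed as cause.

```kotlin
// Hypothetical stand-in for the PushError hierarchy discussed above.
// Overriding `message` means crash reporters that read Throwable.message
// (or toString) will see the description instead of null.
sealed class SketchPushError(override val message: String) : Exception() {
    // A plain error that only carries a description.
    class Network(message: String) : SketchPushError(message)

    // Wraps an underlying exception; overriding `cause` keeps it attached
    // so it shows up in the reported stack trace.
    class Rust(override val cause: Throwable?, message: String) : SketchPushError(message)
}
```

With this shape, SketchPushError.Rust(e, "desc").cause returns e and toString includes "desc", which is what the two review questions were asking about.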
| is TranscodingError, | ||
| is RecordNotFoundError, | ||
| is UrlParseError -> false | ||
| else -> true | 
So, new exception types will throw? I'm not sure that's worth the extra work it creates for us in the future (keeping this list up to date). Either way, these exceptions will end up in Sentry, either as infos or as fatal.
There's an idea in building robust software systems - be lenient with what you accept, and strict with what you send out. I think it applies here. We're handling an external message, subject to all sorts of unexpected issues outside of our control. If we crash here in response to a bad message, we're not gaining much! So, maybe our push stack got into a pickle and doesn't function correctly (either due to a server issue, or a client issue, or some network corruption, or..) - what do we gain by crashing here? The rest of the app is, most likely, perfectly functional. You'll just be getting in a user's way.
As long as we're aware of these problems, we can hopefully fix them - and that's what submitCaughtException already buys you. Submit these would-be crashes as infos, and let the app live :)
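A rough sketch of the "submit as infos, and let the app live" idea, under stated assumptions: CaughtExceptionReporter is a hypothetical one-method interface modelled on the submitCaughtException call seen in the diff, and handleLeniently is an illustrative helper, not code from this PR.

```kotlin
// Hypothetical minimal reporter interface; the real crash reporter API may differ.
interface CaughtExceptionReporter {
    fun submitCaughtException(throwable: Throwable)
}

// Test double that only records what was submitted.
class RecordingReporter : CaughtExceptionReporter {
    val submitted = mutableListOf<Throwable>()
    override fun submitCaughtException(throwable: Throwable) {
        submitted.add(throwable)
    }
}

// Runs the handler for an incoming message. Any Exception is reported
// and swallowed, so a bad external message cannot crash the app.
// Returns true when the handler completed without throwing.
fun handleLeniently(reporter: CaughtExceptionReporter?, handler: () -> Unit): Boolean =
    try {
        handler()
        true
    } catch (e: Exception) {
        reporter?.submitCaughtException(e)
        false
    }
```

Note that this catches Exception only, so JVM Errors such as OutOfMemoryError still propagate.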
From the original discussion, we wanted to crash on newly added exceptions but we had to list them all out, so we opted for a whitelist instead. Now that we know that we don't need to crash on fatal errors, we're just catching and submitting them and we're at least aware of changes to the system this way.
Are you suggesting we don't crash at all? The exceptions that would go through here would be non-push specific (besides InternalPanic) since we're only logging them with submitCaughtException.
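To make the whitelist trade-off concrete, here is a small sketch. DemoPushError and isFatal are hypothetical names; only the three listed error names and InternalPanic come from this thread. Anything not explicitly listed, including error types added later, falls through to the else branch and is treated as fatal.

```kotlin
// Hypothetical error types standing in for the push error hierarchy.
sealed class DemoPushError(message: String) : Exception(message) {
    class TranscodingError(message: String) : DemoPushError(message)
    class RecordNotFoundError(message: String) : DemoPushError(message)
    class UrlParseError(message: String) : DemoPushError(message)
    class InternalPanic(message: String) : DemoPushError(message)
}

// Whitelist of known recoverable errors. Everything else, including
// errors that are not push-specific at all, is considered fatal.
fun isFatal(error: Throwable): Boolean = when (error) {
    is DemoPushError.TranscodingError,
    is DemoPushError.RecordNotFoundError,
    is DemoPushError.UrlParseError -> false
    else -> true
}
```

The maintenance cost raised above is visible here: every new recoverable error type has to be added to the list by hand, or it will be treated as fatal.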
Let's differentiate between error handling for onMessageReceived and for other types of exceptions (one way is to add onMessageReceivedError to AutoPushFeature in addition to onError - but see last paragraph).
And not crash at all (but rather, log and submit via crash reporter) in case of those types of errors. The other errors we can be more crashy with, I think.
Underlying idea is that it doesn't feel right to me to crash in case of a failure of a fairly isolated system due to an external event.
Also, you'll probably notice that there's currently a bit of a circle dance around error handling - e.g. AbstractFirebasePushService.onMessageReceived -> AutoPushFeature.onMessageReceived -> throws -> AbstractFirebasePushService catches -> AutoPushFeature.onError.
It seems like this could be simplified, e.g. AutoPushFeature.onMessageReceived can just deal with exceptions directly, and if it throws, consumers should re-throw.
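One possible shape for that simplification, as a sketch only: SketchPushFeature, its report callback, and the trimming step are all hypothetical and merely stand in for the real feature, crash reporter, and message handling. The feature handles failures where they occur, so the service never needs to catch and call back into onError.

```kotlin
// Hypothetical feature that deals with its own failures instead of
// throwing back to the push service and being called again via onError.
class SketchPushFeature(private val report: (Throwable) -> Unit) {

    // Returns the processed payload, or null when processing failed.
    // The trim() call is a placeholder for whatever work can fail.
    fun onMessageReceived(payload: String?): String? =
        try {
            requireNotNull(payload) { "empty push payload" }
            payload.trim()
        } catch (e: Exception) {
            // Log and submit, but never crash on an external message.
            report(e)
            null
        }
}
```

A service would then just call feature.onMessageReceived(payload) without a surrounding try/catch.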
d843041 to 6b7243c
LGTM, our conversation about never throwing in onMessageReceived notwithstanding :) I'm fine with either direction, but would prefer to never see fatal crashes coming out of that method.
Previously, we wanted to throw on all unknown push errors so that we were notified on them. Since this seems to be more common than originally expected, we should just catch them and in a future version, we should log them without crashing. All of these push errors can be considered recoverable except for InternalPanic.
6b7243c to c712dad
I've filed #5691 as follow-up.
bors r=grigoryk
bors status
5679: Closes #5677: Catch all known non-fatal push errors r=grigoryk a=jonalmeida
Previously, we wanted to throw on all unknown push errors so that we were notified on them. Since this seems to be more common than originally expected, we should just catch them and in a future version, we should log them without crashing. All of these push errors can be considered recoverable except for InternalPanic.
5683: Closes #5682: Remove failedToLaunch property from AppLinksUseCases r=NotWoods a=rocketsroger
5688: Closes #5684: Intermittent failures in WebExtensionBrowserMenuItemTest r=Amejia481,psymoon a=csadilek
Co-authored-by: Jonathan Almeida <jalmeida@mozilla.com>
Co-authored-by: Roger Yang <royang@mozilla.com>
Co-authored-by: Christian Sadilek <christian.sadilek@gmail.com>
Build succeeded
Previously, we wanted to throw on all unknown push errors so that we
were notified on them. Since this seems to be more common than
originally expected, we should just catch them and in a future version,
we should log them without crashing.
All of these push errors can be considered recoverable except
for InternalPanic.
Spoke offline with @jrconlin, and it's safe to not crash on these errors.