Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

7927 unhandled error tb v2 #1022

Closed

Conversation

jlitzingerdev
Copy link
Contributor

@jlitzingerdev jlitzingerdev commented Jun 3, 2018

Contributor Checklist:

This changeset addresses ticket 7927. Much of this code started with a PR for ticket 9333 [1], now closed as a duplicate. In the interest of preserving the review comments and feedback, the original commit changes and messages are preserved, while the content has been rebased onto trunk. I'm happy to clean the history up as needed.

Another key part of this is the creation of a new API instead of modifying formatEvent() directly. The rationale is that modifying formatEvent will likely result in duplicate tracebacks for any users who have made changes to append tracebacks to the exising result of formatEvent.

[1] #939

This API will format log events and append a traceback if the
log_failure key is present.  The existing formatEvent function cannot
be used without altering its semantics and breaking at least one
consumer (users (e.g. textFileLogObserver).

This API is intentionally unexported from twisted.logger.  If it has
value as a public API it can be exported later.
Currently, events from the new log system with a log_failure key do
not show tracebacks.  The formatEvent API does not append tracebacks,
so use the formatEventWithTraceback API to ensure tracebacks are printed.
Caught during review, the code as-is would catch everything, including
SystemExit, which was not the intent.
Changes requested in review.
Changes made as a result of review feedback.
Requested during review.
The review comment requested removal of the Unformattable() object in
the event key as it was extraneous to the test.  This is true, but it
does represent a valid test path on its own -- the path where the
object throws an exception when __repr__ is invoked.  Rather than
discard this case, I split the test into two tests.
These changes were requested during review, they ensure the behavior
matches the legacy log system.
Two line length issues, and one missing blank line between class methods.
Formatted log events must be unicode, but on Python2,
failure.getTraceback() returns bytes.  While UTF-8 will work for ascii
compatible encodings, other encodings would raise a UnicodeDecodeError
and lose valuable traceback information.  To make every effort to
prevent loss of traceback data, a replacment error strategy is used
for decoding.
Currently, formatEvent does not handle tracebacks, yet it is commonly
used to format log events.  A common approach to dealing with this
shortcoming is to append a traceback to the result of formatEvent.
This is the approach taken by formatEventAsClassicLogText, and likely
some third party code as well.  Modifying the behavior of formatEvent
will cause duplicate tracebacks, which, while not necessarily the
worst error, will be confusing and is probably best avoided.

The alternate approach is a new API that allows code that calls
formatEvent directly to migrate deliberately to the new API.
In addition to fixing the case where beginLoggingTo is not called, a
new API is exposed for formatting events with traceback.
The references followed both in-tree and in PEP-263 have a closing
-*-, so should test_format.
Ensure the user is aware of both the legacy formatEvent API and the
new formatEventWithTraceback API.  Recommend the latter unless
tracebacks aren't important.
This allows the API to declare when it was introduced when a release
is generated.
@codecov
Copy link

codecov bot commented Jun 3, 2018

Codecov Report

Merging #1022 into trunk will decrease coverage by 0.29%.
The diff coverage is 100%.

@@            Coverage Diff            @@
##            trunk    #1022     +/-   ##
=========================================
- Coverage   91.84%   91.55%   -0.3%     
=========================================
  Files         844      844             
  Lines      150593   150695    +102     
  Branches    13141    13146      +5     
=========================================
- Hits       138318   137974    -344     
- Misses      10178    10606    +428     
- Partials     2097     2115     +18

Copy link
Member

@glyph glyph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some reservations so I'm not merging - but ultimately I do think this is in mergeable shape and it's just up to you how much you want to address :).

traceback = failure.getTraceback()
if isinstance(traceback, bytes):
traceback = traceback.decode('utf-8', errors='replace')
except Exception as e:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be BaseException; we really want to catch everything here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up refactoring this based on a review from @markrwilliams (#939 (comment)). Originally I had bare except, specifically to catch everything, is there a reason BaseException is preferred? To me the bare except more clearly communicates "everything", in part because it is normally avoided. Just curious, I'll use BaseException in the update.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some linters complain about it, but more practically speaking: you can't except as e:. So if you want to capture the variable you have to specify some kind of exception, and BaseException is the synonym for except:. (I do wish this were spelled differently, like except * as e or something, but I'm not in charge of python syntax :))

if isinstance(traceback, bytes):
traceback = traceback.decode('utf-8', errors='replace')
except Exception as e:
traceback = u"(unable to obtain traceback from event):" + unicode(e)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicates logic from formatEventAsClassicLogText; it would be good to refactor these together.

It differs in that it does add the unicode() of the exception though; however, this (the invocation of unicode) should be something like twisted.python.reflect.safe_repr - we really don't want to be bailing out of here with an exception.

I'm OK with deferring both of these fixes to a subsequent branch, but let me know if you want to just fix them here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! I only have a moment right now and wanted to quickly make a few comments.

As to the logical duplication, it does intentionally duplicate some of the logic in formatEventAsClassicLogText, but there is a subtle difference -- this code isn't (I don't think based on tests I added) subject to 9456 (https://twistedmatrix.com/trac/ticket/9456#ticket).

As to why the intentional duplication -- to give a choice to the maintainers that reviewed. Short, contained changes are easy to review and can always be expanded, while massive change-sets are exhausting (to me at least). My preference would be to refactor them together and address your comments now, but I'll also commit to subsequent work if the desire to merge this as-is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good instinct; shorter changesets are indeed preferable.

"""
try:
traceback = failure.getTraceback()
if isinstance(traceback, bytes):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that this happens randomly sometimes feels like a bug in getTraceback. I kinda want to fix it there eventually instead. It looks like this method isn't public though so I'm OK with this strategy for now :)

@@ -136,6 +136,30 @@ def formatTime(when, timeFormat=timeFormatRFC3339, default=u"-"):



def formatFailureTraceback(failure):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a new public API, this needs a @since: Twisted NEXT

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My original intent was to use this to subsequently address 9456, but with the new eventAsText api it I see no benefit in keeping this method standalone at all.

@@ -202,7 +202,7 @@ Format strings
~~~~~~~~~~~~~~


Format strings provide observers with a standard way to format an event as text suitable for a human being to read, via the function :api:`twisted.logger.formatEvent <formatEvent>` .
Format strings provide observers with a standard way to format an event as text suitable for a human being to read. Formatting is accomplished using the functions :api:`twisted.logger.formatEvent <formatEvent>` and :api:`twisted.logger.formatEventWithTracebck <formatEventWithTraceback>`. The latter should be preferred unless potential traceback information in the event is not important.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love the fact that we now have 3 subtly different formatting functions (formatEvent, formatEventWithTraceback, formatEventAsClassicLogText). These are separate APIs just to preserve the compatibility contract; it's not a thoughtful API surface.

However, in addition to new public API, this is also a serious bugfix, so I don't want to waffle on it much longer.

What I think I'd like to see in the future is a single API: eventAsText, which takes

  • event: the event, obviously
  • includeTraceback=True: include a traceback if log_failure
  • includeTimestamp=True: include a prefixed timestamp
  • includeSystem=True: include the [system] string before the message, but after the timestamp if included

formatEvent, formatEventWithTraceback, and formatEventAsClassicLogText would then just become aliases for various combinations of parameters to this function, and we could just document this function.

We could just skip straight to that endpoint if you wanted to do extra work right now though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with your comments on the thoughtfulness of the API. For personal context, backward compatibility is something that is extremely important to me (code I wrote against 13.x hasn't needed much updating to run on 17.9, that is amazing, kudos to the years of hard work in Twisted), so when I make changes such as this I always err on the side of compatibility...unfortunately, that can lead to this kind of mess.

@glyph I'm in no rush, I'd rather see a better public API if maintainers are open to it, and I like what you've suggested above. One change to your proposal I suggest is, since formatEventWithTraceback was a new API proposed in this PR, it needn't exist as an alias if eventAsText exists with the semantics above. The other APIs are legacy and as such should be preserved.

I'd also like to fix 9456 as well, if there are no objections?

Do you want a new ticket for all of this or can I just use 7927?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like we're very much in sync on the compatibility / thoughtfulness stuff. I'd like to get this landed soon, since we've got lots of great traceback changes coming in 18.7, and I'd really like to resolve this additional annoyance quickly following on the heels of that release, but I don't want to rush it especially if you're available to respond to the feedback promptly.

Smaller PRs are better, but, I don't think 9456 would substantially increase the size of this one, and it would be nice to close more tickets, so feel free to pull it in here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also going to throw a mention for @wsanchez on here since this API involved a lot of feedback from him and if we're lucky he'll have a minute to look at the suggestion of eventAsText and see if there are any corner cases I've forgotten.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@glyph I should be able to get to it fairly quickly. Obviously things come up, so there's an element out of my hands, but barring that I'd expect the changes soon. A procedure question, where this is moving in a different direction -- do you want me to use the same PR or start fresh with a new branch and clean history?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dealer's choice - just make it clear on the ticket where the code is when you put it back into review.

@jlitzingerdev jlitzingerdev mentioned this pull request Jul 12, 2018
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants