New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

create singleton ExceptionSink object to centralize logging of fatal errors #6533

Merged
merged 4 commits into from Sep 20, 2018

Conversation

Projects
None yet
2 participants
@cosmicexplorer
Copy link
Contributor

cosmicexplorer commented Sep 19, 2018

Problem

See #6530. We want to make it simpler to understand where logging of fatal errors occurs and to consume those logs in a structured way.

Solution

  • First step in implementing #6530, a (process-)global singleton to manage where logs are kept (no logging is done yet).
  • Add some very basic validation of input and very basic tests.

Result

We now have an object that can be used to centralize the logging of fatal errors and enables the creation of a single point of failure (by which I mean, a single area of code which describes where this process's exceptions are logged and how to access and append to them), to be implemented in followup PRs.

cosmicexplorer added some commits Sep 19, 2018

@cosmicexplorer cosmicexplorer requested a review from stuhood Sep 19, 2018

@stuhood
Copy link
Member

stuhood left a comment

Thanks! I think moving Exiter._log_exception to this class blocks landing this, but it looks fine otherwise.

return cls()

def __init__(self):
self._cached_destination = None

This comment has been minimized.

@stuhood

stuhood Sep 19, 2018

Member

This should probably default to cwd so that it is always useable, even before the destination has been set the first time.

Also I'm not sure it's really a "cached" destination... maybe just "destination".

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 19, 2018

Contributor

That sounds great, implementing this and the below now.


class ExceptionSinkError(Exception): pass

def _get_timestamp(self):

This comment has been minimized.

@stuhood

stuhood Sep 19, 2018

Member

Rather than adding timestamp logic, I'd recommend moving the Exiter._log_exception method into this class.

And I think that if you guarantee that self._cached_destination is always "something" by initializing it to cwd, then you should be able to have set_destination log to the previous destination if it fails.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 20, 2018

Contributor

I have done that and added thorough testing in c562a49!

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor

cosmicexplorer commented Sep 20, 2018

Moving Exiter._log_exception() to ExceptionSink was larger in scope (conceptually, in my head) than I expected this PR to be, but it works perfectly -- the implementation is now just:

  def _log_exception(self, msg):
    if self._workdir:
      ExceptionSink.instance.set_destination(self._workdir)
    ExceptionSink.instance.log_exception(msg)

It's great.

@stuhood
Copy link
Member

stuhood left a comment

Looks good, thanks!

Please: add the per-pid logs before landing.

class ExceptionSink(object):
"""A mutable singleton object representing where exceptions should be logged to."""

@memoized_classproperty

This comment has been minimized.

@stuhood

stuhood Sep 20, 2018

Member

A pattern we've used elsewhere to do the "mutable singleton" thing would be to ban calls to __new__, and then put all properties directly on the class.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 20, 2018

Contributor

Done!

return cls()

def __init__(self, destination=None):
# TODO: what assumptions can we make about the current directory of a process? Does it have to

This comment has been minimized.

@stuhood

stuhood Sep 20, 2018

Member

The point of initializing it to something is just to increase the chances that even before it's been configured that we're still able to capture exceptions somewhere... so we're not really make assumptions, other than trying to increase our chances. The goal isn't every to actually use it without configuring it... rather to just still catch errors if we happen to fail that early.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 20, 2018

Contributor

Yes, my concern here was more "can we assume we can write to os.getcwd()?"

This comment has been minimized.

@stuhood

stuhood Sep 20, 2018

Member

We can't, but that's not cromulent given the above.

def _iso_timestamp_for_now(self):
return datetime.datetime.now().isoformat()

# NB: This includes a trailing newline, but no leading newline.

This comment has been minimized.

@stuhood

stuhood Sep 20, 2018

Member

Making the logs structured is fine. But we are still going to want to have a separate log per $pid. The client shouldn't have to parse anything... just look for a log with the relevant pid.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 20, 2018

Contributor

Done!

@stuhood
Copy link
Member

stuhood left a comment

Thanks!

@classmethod
def set_destination(cls, dir_path):
if not is_writable_dir(dir_path):
# TODO: when this class sets up excepthooks, raising this should be safe, because we always

This comment has been minimized.

@stuhood

stuhood Sep 20, 2018

Member

This could just be a note.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 21, 2018

Contributor

I can see you've removed it in #6539, sounds great.

This comment has been minimized.

@stuhood

stuhood Sep 21, 2018

Member

If you need any of this in your next PR, let me know and I can backport it. Or feel free to just cherry-pick to your branch.

This comment has been minimized.

@stuhood

stuhood Sep 21, 2018

Member

Alternatively, I could land #6539 with the test marked xfail.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 21, 2018

Contributor

I'll cherry-pick whatever I need, I was thinking -- the changes are mostly orthogonal to what I have now. Landing with it marked xfail is a-ok to me as long as the changes don't break other tests (wouldn't think so).

class ExceptionSink(object):
"""A mutable singleton object representing where exceptions should be logged to."""

_destination = os.getcwd()

This comment has been minimized.

@stuhood

stuhood Sep 20, 2018

Member

Should probably explain why we do this ("blah blah increase the chances that if an exception occurs early in initialization we still record it somewhere").

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 21, 2018

Contributor

Absolutely agree, this is highly mysterious as is.

# avoids any errors with concurrent modification of the shared log affecting the per-pid log.
with safe_open(cls._exceptions_log_path(for_pid=pid), 'a') as pid_error_log:
pid_error_log.write(fatal_error_log_entry)
# TODO: we should probably guard this against concurrent modification somehow.

This comment has been minimized.

@stuhood

stuhood Sep 20, 2018

Member

I think that because this is a single write call it's fairly likely it goes to the log atomically. Not sure though.

This comment has been minimized.

@cosmicexplorer

cosmicexplorer Sep 21, 2018

Contributor

The implication from a few minutes of searching is that people have lots of ways to create a file atomically, but appending atomically is a less common use case. I might check the CPython source for fun later. My first question would be: if I open a file to append with mode 'a', does python seek to the end of the file in the open call, or in the write() call? I would assume the latter and since this is a small message I could see atomicity. I will flesh out this TODO somewhat in the followup PR.

@stuhood stuhood merged commit 17ddd39 into pantsbuild:master Sep 20, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

stuhood added a commit that referenced this pull request Sep 22, 2018

pantsd client logs exceptions from server processes (#6539)
### Problem

#6530 describes cases where when `pantsd` (or a `pantsd-runner` instance) is killed, there can be gaps in our exception logging (and rendering) that result in exceptions not reaching either the client or a log.

### Solution

Build atop #6533 to consume written logs and render them for a `pantsd` client. When either `pantsd` or a `pantsd-runner` instance die, if the `ExceptionSink` (or `Exiter`) has installed an exception hook that renders the exception to the appropriate `pid`, it will be logged, and rendered to the user.

### Result

Killed `pantsd-runner` or `pantsd` processes have increased visibility and debuggability.

cosmicexplorer added a commit that referenced this pull request Oct 3, 2018

create singleton ExceptionSink object to centralize logging of fatal …
…errors (#6533)

### Problem

See #6530. We want to make it simpler to understand where logging of fatal errors occurs and to consume those logs in a structured way.

### Solution

- First step in implementing #6530, a (process-)global singleton to manage where logs are kept (no logging is done yet).
- Add some very basic validation of input and very basic tests.

### Result

We now have an object that can be used to centralize the logging of fatal errors and enables the creation of a single point of failure (by which I mean, a single area of code which describes where this process's exceptions are logged and how to access and append to them), to be implemented in followup PRs.

cosmicexplorer added a commit that referenced this pull request Oct 3, 2018

pantsd client logs exceptions from server processes (#6539)
### Problem

#6530 describes cases where when `pantsd` (or a `pantsd-runner` instance) is killed, there can be gaps in our exception logging (and rendering) that result in exceptions not reaching either the client or a log.

### Solution

Build atop #6533 to consume written logs and render them for a `pantsd` client. When either `pantsd` or a `pantsd-runner` instance die, if the `ExceptionSink` (or `Exiter`) has installed an exception hook that renders the exception to the appropriate `pid`, it will be logged, and rendered to the user.

### Result

Killed `pantsd-runner` or `pantsd` processes have increased visibility and debuggability.

cosmicexplorer added a commit that referenced this pull request Oct 10, 2018

create singleton ExceptionSink object to centralize logging of fatal …
…errors (#6533)

### Problem

See #6530. We want to make it simpler to understand where logging of fatal errors occurs and to consume those logs in a structured way.

### Solution

- First step in implementing #6530, a (process-)global singleton to manage where logs are kept (no logging is done yet).
- Add some very basic validation of input and very basic tests.

### Result

We now have an object that can be used to centralize the logging of fatal errors and enables the creation of a single point of failure (by which I mean, a single area of code which describes where this process's exceptions are logged and how to access and append to them), to be implemented in followup PRs.

cosmicexplorer added a commit that referenced this pull request Oct 10, 2018

pantsd client logs exceptions from server processes (#6539)
### Problem

#6530 describes cases where when `pantsd` (or a `pantsd-runner` instance) is killed, there can be gaps in our exception logging (and rendering) that result in exceptions not reaching either the client or a log.

### Solution

Build atop #6533 to consume written logs and render them for a `pantsd` client. When either `pantsd` or a `pantsd-runner` instance die, if the `ExceptionSink` (or `Exiter`) has installed an exception hook that renders the exception to the appropriate `pid`, it will be logged, and rendered to the user.

### Result

Killed `pantsd-runner` or `pantsd` processes have increased visibility and debuggability.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment