Skip to content
This repository has been archived by the owner on Mar 26, 2021. It is now read-only.

ensure logging is consistent policy- and mechanism-wise #35

Closed
phritz opened this issue Apr 15, 2020 · 0 comments
Closed

ensure logging is consistent policy- and mechanism-wise #35

phritz opened this issue Apr 15, 2020 · 0 comments
Assignees

Comments

@phritz
Copy link
Contributor

phritz commented Apr 15, 2020

Background:

Problem:

  • when running esp in production we want to have a clear signal that something truly unexpected has gone wrong and a developer should look at the problem. often by convention this signal is an ERROR log line. we don't have a consistent policy or mechanism for making this signal. (note that what we do with an ERROR can be context dependent. maybe in a server in production we log it and have an alert notify us. or when in development of the client we panic as a way to require the issue not be ignored.)
  • similarly, since we haven't had the need for conventions around logging there is no way to tune how spammy or what kind of log information is desired beyond the noms verbose setting. or even to know what is or is not appropriate to log. some conventions would be helpful here, as well as mechanisms. for example we may want to log verbosely in the client for development but only log errors in production so we don't fill up their phone. or we might want logs to normally be quiet to save money but to be able to enable spewing debugging information while we're tracking down a problem.
  • noms as a library takes an opinion about how logging should happen: it says logging should use golang's log package and callers can't change that. if a consumer of noms wants to log differently, say using a more featureful logger, a structure dlogger, or adding some additional context to log lines, it can't because there is no mechanism. concretely if say replicache wanted to annotate log lines with a requestid, it can't do that to noms lines, they're inaccessible to the caller. (libraries often solve this problem by defining the log interface they expect and enabling callers to pass in a thingy that maches.)

I think it's also important to look down the road at problems that are likely coming soon on the server side:

  • when using a cloud logging service like we must with zeit...
    • these services often work better if our server emits structured logs (eg, json). this makes searching the logs easier because the logging service doesn't have to be taught how to parse log lines and it can infer data types. for example a structured json log field emitted as an integer is typically immediately available for range queries without having to tell the log service to parse out /lastMutationId: (\d+)/ and treat $1 as an Integer. (note that IF we had structured logging this doesn't mean we would HAVE to use structured logs everywhere, eg when developing locally; it's easy enough to change log formats depending on environment)
    • we are likely to want to include meta information in log lines that enables us to aggregate across requests eg the accountid or path requested, or to correlate across process boundaries eg a requestid enabling us to locate the log lines from the server from a particular client request. it is annoying and error prone to try to include this information manually in each log line so often a context logger is created for the duration of a request that automatically annotates log lines with this information.
  • we will likely want to start tracking timing of operations and suboperations. this is often easiest to roll into something like a context logger, though there are many strategies.

High-level Proposal

Let's create a logging policy and just a little more structure to give us a place to add/tweak logging behavior that we know is coming. Let's do this in a way that:

  • is as simple as possible and easy to use
  • is easily satisfiable by the plain vanilla golang 'log'
  • introduces only small changes to the code
  • gives us a place to to put logging logic should we need it

Sketch of a Proposal

  1. Establish three log levels:

Having the ERROR level makes it easy to do something with this important signal. Having INFO makes it easy to know what to show by default. Having DEBUG makes it possible to spew lots of information when you need it. (We probably log at DEBUG level by default until things are stable.)

There are obviously some grey areas but I'm confident we can work them out.

  1. Expand https://github.com/rocicorp/diff-server/blob/master/util/log/log.go to have three new public functions that our code uses to log. These functions have the signature of Printf and are thus easily satisfiable by golang's log.Printf:
func Error(format string, v ...interface{}) {...}
func Info(format string, v ...interface{}) {...}
func Debug(format string, v ...interface{}) {...}

For now all they do is write to golang's log.Printf with a "ERROR"/"INFO"/"DEBUG" prefix.

  1. Convert replicache client and diffserver to use rlog.Error/Info/Debug consistent with the policy above.

  2. In order to get consistent log output (eg that has the appropriate annotations) I think it is probably also worthwhile to convert noms to use these interfaces (and maybe for rlog to live there). If we don't want to change the output of noms for some reason we can always do this transparently, eg by configuring noms rlog to write all of error/info/debug directly to log.Printf without a prefix. Consumers of noms should be able to easily pass their own logger in if they want to control what it does (eg if they use a more feature logging library it should be easy to wrap it and pass it in for noms to use).

Additional Info

  • at some point we might want a way to get logs from clients in the field
  • this issue is somewhat related to Followup replicache#34
@phritz phritz added this to the Public Announcement 1 milestone Apr 20, 2020
@aboodman aboodman self-assigned this May 13, 2020
aboodman referenced this issue in rocicorp/replicache-sdk-flutter May 14, 2020
Addresses rocicorp/replicache#35 for the flutter repo.
aboodman referenced this issue in rocicorp/replicache-sdk-flutter May 14, 2020
Addresses rocicorp/replicache#35 for the flutter repo.
aboodman referenced this issue in rocicorp/replicache-sdk-flutter May 14, 2020
Addresses rocicorp/replicache#35 for the flutter repo.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants