Make logging output location more programmatically configurable #6213

slimsag · 2016-04-05T04:08:08Z

Overview

This PR includes various improvements to make InfluxDB's logging output location more programmatically configurable in general. This feature is only important for users who embed InfluxDB into their Go programs, and was already half-supported. This change just improves the existing feature.

For an in-depth example of why this change is important, please see the description of sourcegraph/appdash#131

Required for all non-trivial PRs

Rebased/mergable
Tests pass
CHANGELOG.md updated
Sign CLA (if not already signed)

slimsag · 2016-04-05T08:10:50Z

The CI failure is now listed as:

--- FAIL: TestServer_BackupAndRestore (2.05s)
    backup_restore_test.go:97: query results wrong:
            exp: {"results":[{"series":[{"name":"myseries","columns":["time","host","value"],"values":[["1970-01-01T00:00:00.001Z","A",23]]}]}]}
            got: {"results":[{}]}

But unless I am crazy wrong, I don't see how this change could cause it exactly? Is it perhaps a flaky test?

jonseymour · 2016-04-06T00:56:12Z

@slimsag might be worth repushing your change - it seems the CI failure affected more than just your test and I wasn't able to reproduce it locally.

I am intrigued by your use case. Are you intending to imbed a full influxdb server in instrumented apps rather than just an influx client? Or is influxdb just an example of an app that you could instrument with your solution?

e-dard · 2016-04-06T10:56:09Z

@slimsag that's a flaky test. I'll run a rebuild of your branch.

e-dard · 2016-04-06T11:04:58Z

tsdb/engine/tsm1/engine.go

@@ -184,7 +184,10 @@ func (e *Engine) Close() error {
 }

 // SetLogOutput is a no-op.
-func (e *Engine) SetLogOutput(w io.Writer) {}
+func (e *Engine) SetLogOutput(w io.Writer) {


It's no longer a no-op.

Again, either need a note on usage (that SetLogOutput should not be called once the shard is open) or we're going to need a lot more locking around access to e.logger.

Fixed in d0283c9

e-dard · 2016-04-06T11:27:35Z

@slimsag I like the idea of making it easier to make the logging output more configurable.

Since I doubt anyone would want to dynamically update the logging locations once influx is running, some notes about calling SetLogOutput on each service, before calling Open on the server would be useful.

e-dard · 2016-04-15T10:25:42Z

@slimsag needs a rebase. Happy to look at this again when you've taken a look through my comments.

slimsag · 2016-04-15T20:56:12Z

Thanks for the CI help! I apologize for the delays in updating this PR -- really sorry about that :)

I've rebased against master and will address all comments now.

slimsag · 2016-04-15T21:40:31Z

@jonseymour

I am intrigued by your use case. Are you intending to imbed a full influxdb server in instrumented apps rather than just an influx client? Or is influxdb just an example of an app that you could instrument with your solution?

So Appdash, as you may or may not know, is an open-source project for tracing the performance of applications (both standalone and distributed ones). Basically we measure real-time performance of operations for e.g. everything that happens in serving an HTTP request, and then we need a place to store those numbers for later viewing by the user (typically a developer, but sometimes even the end user of a product). In turn, we have a massive number of writes (~1KB of data per HTTP request), and not very many reads (maybe 5% of data inserted is actually read).

In the past, we used some rather simple but primitive storage mechanisms. Like literally just a Go map[int64][]byte in memory. As the project has grown, we want to answer more important queries like "what are the slowest HTTP endpoints in the entire system as a whole?", so we wrote some more primitive storage mechanisms to handle this sort of data-aggregation, but it quickly became very complex and very non-performant. Of course, I know of one other open-source time series database written in Go and so @chris-ramon has been working hard over the past few months to see how compatible it is with Appdash, and the results are looking very promising.

One major constraint for Appdash, however, is that it is typically no-setup-required. For example, Appdash should ideally scale to multi-node distributed setups like what you might observe inside high-throughput environments like Google, Facebook, etc. But very importantly, it also needs to be literally zero-setup for use within small web servers, CLI tools, and in the case of Sourcegraph, the company I work at, easy to setup on-premises and local dev machine deployments (because when users see our app going slow, asking them to set up tracing software + database software is not a nice response compared to "just copy+paste the JSON trace" or something like that =) ).

This last no-setup-required case, is exactly where Appdash needs to be able to embed InfluxDB within the process itself. To do this we're using github.com/influxdata/influxdb/cmd/influxd/run.NewServer to spin up a new InfluxDB server in-process; and this works surprisingly well! The only issue is that InfluxDB logs pollute our app's logs, and thus PR will let us direct them to a log file and make debugging issues much easier.

Hopefully this clarifies everything and, while I know it was very verbose, I hope this wasn't an unwarranted response! Hopefully I didn't come across as trying to pitch anything, and if this type of comment isn't right to put here just let me know and I'll be glad to remove it.

slimsag · 2016-04-15T21:42:42Z

CHANGELOG.md updated

It's not 100% clear to me if I should update the changelog for this PR; and would I update the Features or Bug Fixes section?

slimsag · 2016-04-16T00:36:17Z

The AppVeyor failure looks fishy:

=== RUN TestTimer_Elapsed
--- FAIL: TestTimer_Elapsed (2.00s)
    stress_test.go:58: expected around 2s got 1.9998922s

Is this an open issue somewhere else?

e-dard · 2016-04-16T07:01:25Z

@slimsag you can update CHANGELOG to say you added this as a feature. Yes the flaky Windows test is known to us. We will sort it next week.

e-dard · 2016-04-16T07:02:56Z

@slimsag I just need to get another reviewer to look at this before it can be accepted. Unless you decide to do it yourself and repush, we will also squash your commits down when we merge.

slimsag · 2016-04-18T06:15:07Z

Got it; I've updated the changelog. Please squash as you see fit :)

jwilder · 2016-04-19T16:14:56Z

👍

e-dard · 2016-04-19T16:18:22Z

@slimsag we're ready to merge this. You may want to diff with master and see if anything has landed in the last two weeks that involves logging stuff you might want to make more customisable.

When you're ready ping me and I'll merge this (or any updates you add).

…SetLogger This change causes all services to be added at the time `NewServer` is called. Effectively, this enables writing code that calls `NewServer`, iterates over each `Server.Services` in order to specify where log output should go via `Service.SetLogger`. Without this change, it is not possible to catch the output of various startup logs like these: ``` [subscriber] 2016/04/04 18:33:59 opened service [monitor] 2016/04/04 18:33:59 Starting monitor system [monitor] 2016/04/04 18:33:59 'build' registered for diagnostics monitoring [monitor] 2016/04/04 18:33:59 'runtime' registered for diagnostics monitoring [monitor] 2016/04/04 18:33:59 'network' registered for diagnostics monitoring [monitor] 2016/04/04 18:33:59 'system' registered for diagnostics monitoring [cluster] 2016/04/04 18:33:59 Starting cluster service [shard-precreation] 2016/04/04 18:33:59 Starting precreation service with check interval of 10m0s, advance period of 30m0s [snapshot] 2016/04/04 18:33:59 Starting snapshot service [copier] 2016/04/04 18:33:59 Starting copier service [admin] 2016/04/04 18:33:59 Starting admin service [admin] 2016/04/04 18:33:59 Listening on HTTP: [::]:8083 [monitor] 2016/04/04 18:33:59 Storing statistics in database '_internal' retention policy 'monitor', at interval 10s [continuous_querier] 2016/04/04 18:33:59 Starting continuous query service ``` because these services were previously appended at, and wrote these log messages at, `Server.Open` time.

This adds a top-level `Server.Logger` field which can be overridden after a call to `NewServer` such that all `Server`-related logs end up being sent to this `Logger`. This is primarily useful for embedding InfluxDB within other Go programs.

This has various benefits: - Users embedding InfluxDB within other Go programs can specify a different logger / prefix easily. - More consistent with code used elsewhere in InfluxDB (e.g. services, other `run.Server.*` fields, etc). - This is also more efficient, because it means `executeQuery` no longer allocates a single `*log.Logger` each time it is called.

… Logger too Prior to this change, you could invoke `SetLogger` but it would only omit certain logs, not inclusive of the underlying HTTP handler logs for basic route logging or panic logging. These logs looked like: ``` [http] 2016/04/04 19:27:15 127.0.0.1 - demo [04/Apr/2016:19:27:15 -0700] GET /query?db=&q=CREATE+DATABASE+IF+NOT+EXISTS+appdash+WITH+DURATION+1d+NAME+one_day_only HTTP/1.1 200 40 - InfluxDBClient e9147081-fad5-11e5-8001-000000000000 165.773786ms ``` After this change, invoking `SetLogger` properly sets the underlying `Handler.Logger` too, such that it is possible to change the HTTPD logger post-creation and direct its logs somewhere else (this is valuable when embedding InfluxDB in other Go programs).

Prior to this change the log output location for tsdb was fixated to `os.Stderr` in various locations. After this change, a `SetLogOutput` on `tsdb.Store` will effectively set the log output location for underlying shards / tsdb engines. In an ideal world, the `tsdb.Store.Logger` would bubble down, but because `tsdb/engine` can be an abstract implementation, these engines need their own log prefix and thus this is not easily achievable.

…ogOutput

In 9cf4baea223ae972b24f34eceab20980f24e2467 I mistakenly copied the wrong code around. Bring back the exact service appension code from latest `master` as it should have been originally.

Given that tsdb/engine is an abstract implementation and that it is not really valuable to strip out the `[servicename]` prefix to all logs, instead of adding `SetLogger` everywhere add a `SetLogOutput` everywhere instead. This lets the service still choose their name prefix, and clients can choose at a high level (or lower level, if they choose) where logs go. Actual implementation of services SetLogger -> SetLogOutput change left for followup commit for readability purposes.

slimsag · 2016-04-20T02:14:00Z

Hi @e-dard, and thanks for the help on this! I'm sorry to leave a larger change for later, I hope it's okay, but I've made a somewhat verbose change that I think is better than having SetLogger methods.

At this point I can easily call Server.SetLogOutput(somefile) to direct all my InfluxDB server output to a log file; so I am 100% content with this PR as-is.

I've rebase, so PTAL at my new commits starting with commit 8c43c75 (tsdb/engine/tsm1: add and use CacheLoader.SetLogOutput).

slimsag · 2016-04-20T02:58:21Z

I'm thinking this is unrelated, let me know if not:

--- FAIL: TestServer_BackupAndRestore (2.03s)
    backup_restore_test.go:97: query results wrong:
            exp: {"results":[{"series":[{"name":"myseries","columns":["time","host","value"],"values":[["1970-01-01T00:00:00.001Z","A",23]]}]}]}
            got: {"results":[{}]}

slimsag mentioned this pull request Apr 5, 2016

InfluxDBStore: add InfluxDBStoreConfig.LogOutput for configurable log output sourcegraph/appdash#131

Merged

slimsag force-pushed the sg/logging branch 2 times, most recently from e0a2bdd to 32836b6 Compare April 5, 2016 04:54

e-dard reviewed Apr 6, 2016
View reviewed changes

slimsag force-pushed the sg/logging branch from 32836b6 to d0283c9 Compare April 15, 2016 20:55

jwilder added this to the 0.13.0 milestone Apr 19, 2016

e-dard self-assigned this Apr 19, 2016

slimsag added 5 commits April 19, 2016 18:23

slimsag added 18 commits April 19, 2016 18:23

cluster: update test to use QueryExecuter.Logger instead of LogOutput

11531cf

cmd/influxd/run: update test to use QueryExecuter.Logger instead of L…

1a7bbdb

…ogOutput

tsdb/engine/tsm1: update the Engine.SetLogOutput doc strings

944f29f

tsdb/engine/tsm1: simplify NewWAL by inlining SetLogOutput

ed41ccb

tsdb/engine/tsm1: update the WAL.SetLogOutput method docstring

fff0313

tsdb: fix Shard.SetLogOutput and clarify docstring

e7fb688

tsdb: update Store.SetLogOutput method docstring

d4e1f6a

influxql: fix build

b57a8e9

cmd/influxd/run: fix build / mistake during rebase

83a5a93

In 9cf4baea223ae972b24f34eceab20980f24e2467 I mistakenly copied the wrong code around. Bring back the exact service appension code from latest `master` as it should have been originally.

CHANGELOG: mention PR influxdata#6213

574e9f3

tsdb/engine/tsm1: add and use CacheLoader.SetLogOutput

8c43c75

tsdb/engine/tsm1: add and use FileStore.SetLogOutput

56f85b7

services: SetLogger -> SetLogOutput

28fe9b8

cluster: SetLogger -> SetLogOutput

4049b55

influxql: add SetLogOutput

223ff97

monitor: SetLogger -> SetLogOutput

d8ef40c

tsdb: SetLogger -> SetLogOutput

01f623a

slimsag force-pushed the sg/logging branch from 1fb6685 to 01f623a Compare April 20, 2016 02:10

slimsag added 2 commits April 19, 2016 19:20

all: fix test failures

7aac6fa

empty commit to trigger CircleCI

8646f68

e-dard merged commit 9dc09c5 into influxdata:master Apr 20, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make logging output location more programmatically configurable #6213

Make logging output location more programmatically configurable #6213

slimsag commented Apr 5, 2016 •

edited by jwilder

Loading

slimsag commented Apr 5, 2016

jonseymour commented Apr 6, 2016

e-dard commented Apr 6, 2016

e-dard Apr 6, 2016

e-dard Apr 6, 2016

slimsag Apr 15, 2016

e-dard commented Apr 6, 2016

e-dard commented Apr 15, 2016

slimsag commented Apr 15, 2016

slimsag commented Apr 15, 2016 •

edited

Loading

slimsag commented Apr 15, 2016

slimsag commented Apr 16, 2016

e-dard commented Apr 16, 2016

e-dard commented Apr 16, 2016

slimsag commented Apr 18, 2016

jwilder commented Apr 19, 2016

e-dard commented Apr 19, 2016

slimsag commented Apr 20, 2016

slimsag commented Apr 20, 2016

Make logging output location more programmatically configurable #6213

Make logging output location more programmatically configurable #6213

Conversation

slimsag commented Apr 5, 2016 • edited by jwilder Loading

Overview

Required for all non-trivial PRs

slimsag commented Apr 5, 2016

jonseymour commented Apr 6, 2016

e-dard commented Apr 6, 2016

e-dard Apr 6, 2016

Choose a reason for hiding this comment

e-dard Apr 6, 2016

Choose a reason for hiding this comment

slimsag Apr 15, 2016

Choose a reason for hiding this comment

e-dard commented Apr 6, 2016

e-dard commented Apr 15, 2016

slimsag commented Apr 15, 2016

slimsag commented Apr 15, 2016 • edited Loading

slimsag commented Apr 15, 2016

slimsag commented Apr 16, 2016

e-dard commented Apr 16, 2016

e-dard commented Apr 16, 2016

slimsag commented Apr 18, 2016

jwilder commented Apr 19, 2016

e-dard commented Apr 19, 2016

slimsag commented Apr 20, 2016

slimsag commented Apr 20, 2016

slimsag commented Apr 5, 2016 •

edited by jwilder

Loading

slimsag commented Apr 15, 2016 •

edited

Loading