
[query] Log at debug level resolved cluster namespaces, use same logger in query and coordinator #1775

Merged
merged 10 commits into master from r/debug-logs-resolve-cluster-namespaces on Jun 26, 2019

Conversation

robskillington
Collaborator

@robskillington robskillington commented Jun 25, 2019

What this PR does / why we need it:

A few people have requested at least some basic insight into which namespaces are being resolved; this adds the first level of debugging in this area. (More to come later; this we can get out quickly.)

Also, I needed to flow the config that controls the logging level through to the query/coordinator, so I ended up consolidating all our logging so that it's the exact same loggers, carried via instrument options, all the way down in query and coordinator.
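As a rough sketch of the consolidation idea (stand-in types only, not m3's real instrument options): the logger is constructed once from config and every component receives the same instance through the options value it is handed.

package main

import "go.uber.org/zap"

// instrumentOptions is a stand-in for m3's instrument options; the real type
// also carries metrics scopes and other instrumentation.
type instrumentOptions struct {
	logger *zap.Logger
}

func (o instrumentOptions) Logger() *zap.Logger { return o.logger }

// Both the coordinator handler and the query storage layer receive the same
// options value, so they log through the exact same logger instance.
func newHandler(opts instrumentOptions)       { opts.Logger().Info("handler ready") }
func newFanoutStorage(opts instrumentOptions) { opts.Logger().Info("storage ready") }

func main() {
	logger, err := zap.NewDevelopment() // level would come from the logging config
	if err != nil {
		panic(err)
	}
	opts := instrumentOptions{logger: logger}
	newHandler(opts)
	newFanoutStorage(opts)
}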

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:

NONE

Does this PR require updating code package or user-facing documentation?:

NONE

@vjtm

vjtm commented Jun 25, 2019

It works great.

Example output in debug mode (it shows two namespaces depending on the retention value):
jun 25 14:55:22 lgmadcengvic01v m3coordinator[53282]: 2019-06-25T14:55:22.268+0200 DEBUG query resolved cluster namespace, will use most granular per result {"query": "", "start": "2019-06-25T14:20:15.000+0200", "end": "2019-06-25T14:55:30.000+0200", "fanoutType": "coversPartialQueryRange", "namespace": "metricsagg", "type": "aggregated", "retention": "20m0s", "resolution": "10s"}
jun 25 14:55:32 lgmadcengvic01v m3coordinator[53282]: 2019-06-25T14:55:32.447+0200 DEBUG query resolved cluster namespace, will use most granular per result {"query": "", "start": "2019-06-25T14:46:45.000+0200", "end": "2019-06-25T14:54:00.000+0200", "fanoutType": "coversAllQueryRange", "namespace": "metrics", "type": "unaggregated", "retention": "10m0s", "resolution": "0s"}

Thx!

@vjtm

vjtm commented Jun 25, 2019

One detail: I put the log level back to info, but the message still appears.

@robskillington
Collaborator Author

@vjtm interesting, this may be due to the fact that two loggers exist when the coordinator and dbnode run side by side. I might fix that too in this PR.

@robskillington
Collaborator Author

Updated with all loggers now being constructed identically in the same process; that should avoid what you're seeing with the logs being emitted regardless of config.

@robskillington robskillington changed the title [query] Log at debug level resolved cluster namespaces [query] Log at debug level resolved cluster namespaces, use same logger in query and coordinator Jun 26, 2019
@vtomasr5

vtomasr5 commented Jun 26, 2019

Hi,
Something weird is happening now.

I have the following in place (just for testing):

  • three VMs for the etcd cluster
  • VM: prometheus + m3coordinator + m3dbnode1
  • VM: m3dbnode2

Note: both the m3dbnode and m3coordinator processes are recent builds from this PR's code.

m3dbnode1, with debug mode activated, shows debug messages in the logs, but no "query resolved cluster namespace" log lines.
m3dbnode2, with debug mode activated, doesn't show debug messages at all.

The cluster works fine. I see data in Grafana. Metrics are coming from https://github.com/open-fresh/avalanche.

I am @vjtm, just using a different account :)


@robskillington
Collaborator Author

Hey @vtomasr5 @vjtm, the reason I believe you're not seeing the message anymore is that you need to set the log level to debug for the coordinator specifically in its config file.

So add this at the top of m3coordinator.yml:

logging:
  level: debug

I now see the correct debug output for the resolved namespaces when sending queries to the coordinator itself.
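For reference, a minimal sketch (not m3's actual config plumbing) of how a level string such as debug from that YAML ends up gating zap output:

package main

import (
	"go.uber.org/zap"
	"go.uber.org/zap/zapcore"
)

// buildLogger turns a configured level string ("debug", "info", ...) into a
// zap logger; only entries at or above that level are emitted.
func buildLogger(level string) (*zap.Logger, error) {
	var lvl zapcore.Level
	if err := lvl.UnmarshalText([]byte(level)); err != nil {
		return nil, err
	}
	cfg := zap.NewProductionConfig()
	cfg.Level = zap.NewAtomicLevelAt(lvl)
	return cfg.Build()
}

func main() {
	logger, err := buildLogger("debug")
	if err != nil {
		panic(err)
	}
	logger.Debug("query resolved cluster namespace, will use most granular per result",
		zap.String("namespace", "metricsagg"))
}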

@arnikola arnikola left a comment

Approved with nits; the only real point is that we may lose visibility into local vs remote storage.

SetLookbackDuration(h.lookbackDuration).
SetGlobalEnforcer(nil).
SetInstrumentOptions(h.instrumentOpts.
SetMetricsScope(h.instrumentOpts.MetricsScope().SubScope("debug_engine")))
Collaborator

nit: split this up into 2 statements?

}

// NewOptions enforces that fields are set when options is created.
func NewOptions(p OptionsParams) (Options, error) {
Collaborator

What's this for, just removing direct struct generation here?

Collaborator Author

Yeah, it was possible to create it without InstrumentOptions, which could easily lead to nil pointer panics. Now it's protected against that by always enforcing that the field is set when constructing Options.
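A compact sketch of that pattern (simplified stand-ins, not the actual m3 types): the constructor refuses to build Options when the required field is missing, instead of silently allowing a nil InstrumentOptions.

package options

import (
	"errors"

	"go.uber.org/zap"
)

// InstrumentOptions stands in for m3's instrument options interface.
type InstrumentOptions interface {
	Logger() *zap.Logger
}

// OptionsParams collects the required fields for constructing Options.
type OptionsParams struct {
	InstrumentOptions InstrumentOptions
}

// Options is a simplified stand-in for the real options type.
type Options struct {
	instrumentOpts InstrumentOptions
}

// NewOptions enforces that InstrumentOptions is set, so later calls can never
// hit a nil pointer.
func NewOptions(p OptionsParams) (Options, error) {
	if p.InstrumentOptions == nil {
		return Options{}, errors.New("instrument options not set")
	}
	return Options{instrumentOpts: p.InstrumentOptions}, nil
}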

SetStore(backendStorage).
SetLookbackDuration(*cfg.LookbackDuration).
SetGlobalEnforcer(perQueryEnforcer).
SetInstrumentOptions(instrumentOptions.
Collaborator

nit: split into 2 statements

debugLog := s.logger.Check(zapcore.DebugLevel,
"query resolved cluster namespace, will use most granular per result")
if debugLog != nil {
for _, n := range namespaces {
Collaborator

Quick note: currently I'm not sure you'll be able to tell if you're serving the query yourself or serving a remote query here.

Collaborator Author

That's probably ok for just simple debugging. We can get more complex later perhaps?

Collaborator

Sure, that's fine; only concern is if you get random spam in your logs from remote queries
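For readers unfamiliar with the Check call in the diff above, it is zap's standard pattern for gating expensive log construction on the configured level; a standalone sketch (not the exact m3 code):

package main

import (
	"go.uber.org/zap"
	"go.uber.org/zap/zapcore"
)

func main() {
	logger, _ := zap.NewDevelopment() // debug enabled here
	namespaces := []string{"metrics", "metricsagg"}

	// Check returns nil when the entry would not be logged, so the
	// per-namespace loop and field construction are skipped entirely
	// when the logger runs at info level or above.
	msg := "query resolved cluster namespace, will use most granular per result"
	if ce := logger.Check(zapcore.DebugLevel, msg); ce != nil {
		for _, n := range namespaces {
			logger.Debug(msg, zap.String("namespace", n))
		}
	}
}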

@@ -241,7 +245,7 @@ func retrieveMetadata(streamCtx context.Context) context.Context {
}
}

- return logging.NewContextWithID(streamCtx, id)
+ return logging.NewContextWithID(streamCtx, instrumentOpts, id)
Collaborator

nit: put options last here

Collaborator Author

Sure thing.

// Attach a rqID with all logs so that its simple to trace the whole call stack
rqID := uuid.NewRandom()
- return NewContextWithID(ctx, rqID.String())
+ return NewContextWithID(ctx, instrumentOpts, rqID.String())
Collaborator

nit: options last here

Collaborator Author

Sure thing.

- func NewContextWithID(ctx context.Context, id string) context.Context {
+ func NewContextWithID(
+ 	ctx context.Context,
+ 	instrumentOpts instrument.Options,
Collaborator

nit: set the options as last param

Collaborator Author

Sure thing.
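For concreteness, a sketch of the signature with the options moved to the last parameter as suggested (stand-in types; the real m3 function may differ):

package logging

import (
	"context"

	"go.uber.org/zap"
)

// InstrumentOptions stands in for m3's instrument options interface.
type InstrumentOptions interface {
	Logger() *zap.Logger
}

type loggerKey struct{}

// NewContextWithID attaches a request-scoped logger (tagged with the request
// ID) to the context, with the options parameter placed last per the review
// nits above.
func NewContextWithID(ctx context.Context, id string, opts InstrumentOptions) context.Context {
	return context.WithValue(ctx, loggerKey{}, opts.Logger().With(zap.String("rqID", id)))
}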

@vtomasr5

I've added those lines at the top of m3coordinator.yml and restarted the service, but I am unable to see the debug logging.

This is what I see:

Jun 26 10:49:55 lgmadcengvic01v m3coordinator[8056]: 2019/06/26 10:49:55 Go Runtime version: go1.12.6
Jun 26 10:49:55 lgmadcengvic01v m3coordinator[8056]: 2019/06/26 10:49:55 Build Version:      v0.10.2
Jun 26 10:49:55 lgmadcengvic01v m3coordinator[8056]: 2019/06/26 10:49:55 Build Revision:     880ba430
Jun 26 10:49:55 lgmadcengvic01v m3coordinator[8056]: 2019/06/26 10:49:55 Build Branch:       r/debug-logs-resolve-cluster-namespaces
Jun 26 10:49:55 lgmadcengvic01v m3coordinator[8056]: 2019/06/26 10:49:55 Build Date:         2019-06-26-10:32:40
Jun 26 10:49:55 lgmadcengvic01v m3coordinator[8056]: 2019/06/26 10:49:55 Build TimeUnix:     1561537960
Jun 26 10:49:55 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561538995.7882032,"msg":"tracing disabled for m3query; set `tracing.backend` to enable"}
Jun 26 10:49:55 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561538995.8114905,"msg":"successfully loaded cache from file","file":"/var/lib/m3kv/_kv_default_env_m3db_embedded.json"}
Jun 26 10:49:55 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561538995.8630068,"msg":"waiting for dynamic topology initialization, if this takes a long time, make sure that a topology/placement is configured"}
Jun 26 10:49:55 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561538995.8637817,"msg":"adding a watch","service":"m3db","env":"default_env","zone":"embedded","includeUnhealthy":true}
Jun 26 10:49:55 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561538995.8656628,"msg":"successfully loaded cache from file","file":"/var/lib/m3kv/m3db_embedded.json"}
Jun 26 10:49:55 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561538995.8790286,"msg":"initial topology / placement value received"}
Jun 26 10:49:56 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561538996.265582,"msg":"resolved cluster namespace","namespace":"metrics"}
Jun 26 10:49:56 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561538996.2656374,"msg":"resolved cluster namespace","namespace":"metricsagg"}
Jun 26 10:49:56 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561538996.4539897,"msg":"successfully updated topology","numHosts":1}
Jun 26 10:49:56 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561538996.5211146,"msg":"successfully updated topology","numHosts":1}
Jun 26 10:49:56 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561538996.860387,"msg":"configuring downsampler to use with aggregated cluster namespaces","numAggregatedClusterNamespaces":1}
Jun 26 10:49:56 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561538996.8606749,"msg":"successfully loaded cache from file","file":"/var/lib/m3kv/_kv_default_env_m3db_embedded.json"}
Jun 26 10:50:06 lgmadcengvic01v m3coordinator[8056]: {"level":"error","ts":1561539006.886191,"msg":"error initializing namespaces values, retrying in the background","key":"/namespaces","error":"initializing value error:init watch timeout"}
Jun 26 10:50:06 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561539006.9341915,"msg":"received kv update","version":1,"key":"/placement"}
Jun 26 10:50:06 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561539006.934315,"msg":"election manager opened successfully"}
Jun 26 10:50:07 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561539007.9357057,"msg":"election state changed from follower to leader"}
Jun 26 10:50:07 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561539007.944032,"msg":"no m3msg server configured"}
Jun 26 10:50:07 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561539007.9440768,"msg":"starting API server","address":"0.0.0.0:7201"}
Jun 26 10:50:07 lgmadcengvic01v m3coordinator[8056]: {"level":"info","ts":1561539007.9444742,"msg":"registered new interrupt handler"}

On the other hand, on m3dbnode1 with the debug option I can see the debug messages, but not on m3dbnode2, which seems to behave the same way as the m3coordinator.

@robskillington
Collaborator Author

@vtomasr5 are you sure you are sending queries to that coordinator?

I'm seeing it in my tests:

{"level":"debug","ts":1561541480.8201895,"msg":"query resolved cluster namespace, will use most granular per result","query":"","start":1561537575,"end":1561541490,"fanoutType":"coversAllQueryRange","namespace":"metrics_10s_48h","type":"aggregated","retention":"48h0m0s","resolution":"10s"}

Relevant build details:

$ docker-compose logs -f m3coordinator01
Attaching to m3_stack_m3coordinator01_1
m3coordinator01_1  | 2019/06/26 09:30:25 Go Runtime version: go1.12.6
m3coordinator01_1  | 2019/06/26 09:30:25 Build Version:      v0.10.2
m3coordinator01_1  | 2019/06/26 09:30:25 Build Revision:     531cc35f
m3coordinator01_1  | 2019/06/26 09:30:25 Build Branch:       r/debug-logs-resolve-cluster-namespaces
m3coordinator01_1  | 2019/06/26 09:30:25 Build Date:         2019-06-26-07:06:32
m3coordinator01_1  | 2019/06/26 09:30:25 Build TimeUnix:     1561532792

@vjtm

vjtm commented Jun 26, 2019

I'm sorry, Prometheus was not starting properly.

It works perfectly now.

@robskillington robskillington merged commit bc2c906 into master Jun 26, 2019
@robskillington robskillington deleted the r/debug-logs-resolve-cluster-namespaces branch June 26, 2019 16:10