Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include cluster name in log line. #4387

Merged
merged 1 commit into from
Sep 11, 2019
Merged

Include cluster name in log line. #4387

merged 1 commit into from
Sep 11, 2019

Conversation

evanchaoli
Copy link
Contributor

When multiple Concourse clusters send logs to a single log server, this will be super helpful to decide where a log line comes from.

Say cluster name is "dev", them sample log lines are as below: (notice "source" field in logs)

web_1     | {"timestamp":"2019-09-09T08:38:20.133553327Z","level":"info","source":"dev.tsa","message":"dev.tsa.connection.channel.command.register.done","data":{"command":"forward-worker","remote":"172.19.0.4:32946","session":"1.4.1.2","worker-address":"127.0.0.1:44903","worker-platform":"linux","worker-tags":""}}
web_1     | {"timestamp":"2019-09-09T08:38:21.008126381Z","level":"info","source":"dev.atc","message":"dev.atc.pipelines.radar.scan-resource-type.interval-runner.tick.running-worker.garden-connection.retryable-http-client.retrying","data":{"error":"dial tcp 127.0.0.1:37871: connect: connection refused","failed-attempts":3,"pipeline":"p","ran-for":"3.012469958s","resource-type":"rt-tiny","session":"16.1.1.1.1.4.1.1","team":"main"}}

Signed-off-by: Chao Li chaol@vmware.com

@evanchaoli evanchaoli requested review from a team September 9, 2019 08:46
@hstenzel
Copy link

hstenzel commented Sep 9, 2019

It'd also be helpful in other contexts, especially if the the cluster name is exposed to resources in the standard metadata fashion as say $CLUSER_NAME.

That would allow me to write reusable pipelines that interacted with resources in a standard way (assuming the resource allows metadata variable access. I could have a vNext concourse cluster running in parallel with my main concourse cluster, with the same pipelines, but producing artifacts going to the vNext places and updating PRs with unit-test.vNext instead of unit-test-main.

@evanchaoli
Copy link
Contributor Author

The test concourse-ci/watsjs failed but seems to not related to my code change. Can anybody take a look and trigger the test?

@vito
Copy link
Member

vito commented Sep 9, 2019

The change itself is innocuous but I want to ask if there's another way to do it since I've been thinking a lot about our logging UX recently.

Our logs right now are verbose, hard to read, and hard for contributors to understand and add their own (there are unusual patterns like "hyphenating-your-messages"). Basically, our logs are written by humans as if they're going to be read by machines, but in reality I'm pretty sure humans are reading them more.

I'm thinking our logging should instead focus on being human-friendly, while providing clarity to an operator and tinkerer as to how Concourse works, with a stronger focus on the 'happy path' so that tinkerers can verify their expectations against the logs.

To that end, I've been thinking about switching from lager to a more common logging package - likely sirupsen/logrus. One dramatic way I've been planning to go about this is to literally remove all of our info/debug logs (preserving error logs) and only add high-level logs that I would expect to see as an operator, initially at small scale. @xoebus and I started on a way to automate this (in my prohibition repo).

So, the reason I'm asking here is because this is adding a bit more information to all of the logs, and I'm actually leaning towards trimming the information down to the bare necessities. Is there some way to accomplish this without annotating every log line? For example, could this be done at the syslog level instead, where log entries can be associated to a system or program?

Thanks!

@evanchaoli
Copy link
Contributor Author

evanchaoli commented Sep 9, 2019

@vito We running ATCs on K8S and exporting logs to LogInsight, where LogInsight is a VMware production, not sure if you are familiar with that. To us, JSON formatted logs are quite suitable for LogInsight, easy to query, group, statistic log events. Though LogInsight accepts arbitrary format logs as well, JSON format is much preferred. So even if Concourse is moving to other logging package, we'd like to preserve the ability to output JSON formatted logs.

Current each log line has a source field whose value is "atc", "tsa" etc. I originally thought to add a "cluster" field, but then found lager supports only component as a common field. So I had to add cluster name as a prefix to source. If you know how to add a separate cluster field to log line, which would be merrier.

With LogInInsight I don't think we can auto inject some info to logs based on where logs coming from. Even if LogInsight could annotate logs with source IP, it wouldn't have idea with Concourse cluster name. So I still think this feature will be super helpful to us.

If you don't want to burden users who don't care about cluster name in log lines, maybe we add a command line option to switch the feature?

@vito
Copy link
Member

vito commented Sep 10, 2019

@evanchaoli Got it, that makes sense. Making it configurable feels like a good middle ground. We can leave it off by default but enable users to flip it on if they have their logs going to a common sink and can't automatically 'bucket' them based on the source.

logrus has support for JSON formatting out of the box, and has support for annotating log entries with arbitrary fields, so I think we can still meet your needs if we switch logging packages. We'll just need a flag to enable JSON logging, since I don't think it should be the default.

When we switch to logrus we won't be constructing log.messages.like-these anymore, so we'd put the cluster name in the log fields instead. So for future-compatibility, could we do something like this instead of prepending it to the 'component'?:

logger = logger.WithData(lager.Data{
  "cluster": clusterName,
})

This will add a "cluster-name" field to all logs emitted with the logger.

Thanks!

…name in log line.

Say cluster name is "dev", them sample log lines are as below: (notice "cluster" field in logs)

web_1     | {"timestamp":"2019-09-11T00:53:59.706488300Z","level":"info","source":"atc","message":"atc.cmd.finish","data":{"cluster":"dev","duration":378200,"session":"1"}}
web_1     | {"timestamp":"2019-09-11T00:53:59.707583200Z","level":"info","source":"tsa","message":"tsa.listening","data":{"cluster":"dev"}}

Signed-off-by: Chao Li <chaol@vmware.com>
@evanchaoli
Copy link
Contributor Author

@vito I have added a new cli option --log-cluster-name and added field cluster to logger, please review the code. Log messages now look like:

web_1     | {"timestamp":"2019-09-11T00:53:59.706488300Z","level":"info","source":"atc","message":"atc.cmd.finish","data":{"cluster":"dev","duration":378200,"session":"1"}}
web_1     | {"timestamp":"2019-09-11T00:53:59.707583200Z","level":"info","source":"tsa","message":"tsa.listening","data":{"cluster":"dev"}}

Copy link
Member

@vito vito left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 thanks!

@vito vito merged commit 66c450e into concourse:master Sep 11, 2019
@vito vito added this to the v5.6.0 milestone Sep 11, 2019
vito added a commit that referenced this pull request Sep 11, 2019
ref. #4311 and #4387

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
vito added a commit that referenced this pull request Sep 11, 2019
ref. #4311 and #4387

Signed-off-by: Alex Suraci <suraci.alex@gmail.com>
@jamieklassen jamieklassen added the release/documented Documentation and release notes have been updated. label Sep 11, 2019
@evanchaoli evanchaoli deleted the log branch September 24, 2019 13:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logging release/documented Documentation and release notes have been updated.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants