
Structured Logging For the operator #24

Closed
jlewi opened this issue Aug 21, 2017 · 12 comments

@jlewi
Contributor

jlewi commented Aug 21, 2017

I think it would be useful if the operator used structured logging.

For example, it would be nice if the operator output JSON-formatted records with various metadata tags. One tag could be the name of the job a log message pertains to. This would make it easy to filter the log messages by job.

https://github.com/sirupsen/logrus is a Go package for structured logging. The main reason I initially didn't use it and went with https://github.com/golang/glog instead was that logrus doesn't support outputting the file and line number of an error.

Ideally we'd like the best of both packages; i.e. structured logs with file and line number.
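
For illustration, a minimal sketch of what that could look like with logrus: JSON output plus a per-job tag (the field names and job/replica values below are placeholders, not an agreed convention):

```go
package main

import (
	log "github.com/sirupsen/logrus"
)

func main() {
	// Emit JSON records so log messages can be filtered on their fields.
	log.SetFormatter(&log.JSONFormatter{})

	// Attach metadata, such as the job name, to every message from this logger.
	jobLogger := log.WithFields(log.Fields{
		"job":     "example-tfjob", // placeholder job name
		"replica": "worker-0",      // placeholder replica name
	})

	jobLogger.Info("Reconciling TFJob")
}
```

What logrus does not give us out of the box is the glog-style file and line number, which is the gap discussed below.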

@jlewi
Contributor Author

jlewi commented Nov 5, 2017

Lots of workarounds are discussed in sirupsen/logrus#63. Prometheus implemented one of them, so we could potentially copy their approach.

@jlewi jlewi added this to the Kubecon Europe milestone Jan 10, 2018
@jlewi
Contributor Author

jlewi commented Jan 10, 2018

I'm adding this to our next milestone because I think structured logging will be critical to scaling. As we scale up to more and larger jobs, we will need to be able to easily filter logs by pod, job, etc. to get to the relevant logs.

@gaocegege
Member

Personally, I recommend glog since most of the repos in the Kubernetes community use glog.

@ScorpioCPH
Member

+1 for glog :)

@jlewi
Contributor Author

jlewi commented Jan 25, 2018

If we use glog, is there a way to output JSON logs with metadata such as the job and replica a log message is associated with?

@gaocegege
Member

I am afraid not 🤔, since there is no such function in the docs: https://godoc.org/github.com/golang/glog

@jlewi
Contributor Author

jlewi commented Feb 22, 2018

With glog, how do we make it really easy to filter the TFJob operator logs so we can see the log messages for a particular job?

I think this will be super useful for debugging and troubleshooting.

If we use structured logging, we can add a tag corresponding to the job name. Then it should be very easy to filter the logs to find all messages for a particular job.
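
For example, with JSON records a filter like `jq 'select(.job == "example-tfjob")'` over the operator's log stream would return only that job's messages (assuming a `job` field, as sketched earlier in this thread).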

@jlewi
Contributor Author

jlewi commented Feb 22, 2018

This solution looks promising:
sirupsen/logrus#63 (comment)
sirupsen/logrus#63 (comment)

I believe this solution just uses the filename hook from https://github.com/onrik/logrus. I think we can just define a logrus logger with that hook and it will work.

It would be great if someone could try it out using the example in that repo.
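
For reference, a minimal sketch along the lines of the onrik/logrus README at the time (the exact hook API may have changed since):

```go
package main

import (
	"github.com/onrik/logrus/filename"
	log "github.com/sirupsen/logrus"
)

func main() {
	log.SetFormatter(&log.JSONFormatter{})

	// The filename hook adds the caller's file and line number to every entry.
	filenameHook := filename.NewHook()
	filenameHook.Field = "filename" // name of the field carrying file:line
	log.AddHook(filenameHook)

	log.Info("structured log with file and line number")
}
```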

@ankushagarwal

I'm looking into this. I will try the filename hook from onrik/logrus and post the results here.

@gaocegege
Member

Now we use the flag package to support command-line flags, and glog registers its flags on it by default. So our binary has more flags than we expect, even though we use logrus instead of glog:

```
➜  tf-operator git:(416) ✗ ./tf-operator -h
Usage of ./tf-operator:
  -alsologtostderr
    	log to standard error as well as files
  -chaos-level int
    	DO NOT USE IN PRODUCTION - level of chaos injected into the TFJob created by the operator. (default -1)
  -controller-config-file string
    	Path to file containing the controller config.
  -gc-interval duration
    	GC interval (default 10m0s)
  -json-log-format
    	Set true to use json style log format. Set false to use plaintext style log format (default true)
  -log_backtrace_at value
    	when logging hits line file:N, emit a stack trace
  -log_dir string
    	If non-empty, write log files in this directory
  -logtostderr
    	log to standard error instead of files
  -stderrthreshold value
    	logs at or above this threshold go to stderr
  -v value
    	log level for V logs
  -version
    	Show version and quit
  -vmodule value
    	comma-separated list of pattern=N settings for file-filtered logging
```

There are some pros and cons:

  • We keep compatibility with vendored and client code that uses glog, since we still expose glog's flags
  • But users may be confused, since tf-operator writes logs regardless of the -logtostderr flag
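
The extra flags come from glog's init(), which registers its flags on the standard flag package's default FlagSet, so any binary that imports glog (directly or through vendored dependencies) and calls flag.Parse() lists them in -h. A minimal sketch of that effect; the chaos-level flag is just one of our own flags, shown for contrast:

```go
package main

import (
	"flag"

	_ "github.com/golang/glog" // imported only for its side effect: init() registers -logtostderr, -v, -vmodule, ...
)

// One of the operator's own flags, for contrast with the inherited glog flags.
var chaosLevel = flag.Int("chaos-level", -1, "DO NOT USE IN PRODUCTION - level of chaos injected into the TFJob created by the operator.")

func main() {
	flag.Parse() // running this binary with -h now lists glog's flags alongside -chaos-level
}
```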

@gaocegege
Member

I think we could close this issue after #416 is merged. I will file a new issue for the extra-flags problem, but it is not a big problem for now. We can refer to etcd/etcd-operator.

@gaocegege
Member

xref #424

gaocegege pushed a commit that referenced this issue Mar 1, 2018
* Move from glog to sirupsen/logrus for logging
* Add a new flag json_log_format
* Refactor all log statements to use logrus
* Use Info level for log level V(1) and below
* Use Debug level for log level V(2) and above
* Tested locally

Addresses #24

Sample logs
```
{"filename":"app/server.go:54","level":"info","msg":"EnvKubeflowNamespace not set, use default namespace","time":"2018-02-27T18:25:18-08:00"}
{"filename":"app/server.go:59","level":"info","msg":"[Version: 0.3.0+git Git SHA: Not provided. Go Version: go1.9.3 Go OS/Arch: darwin/amd64]","time":"2018-02-27T18:25:18-08:00"}
{"filename":"app/server.go:145","level":"info","msg":"No controller_config_file provided; using empty config.","time":"2018-02-27T18:25:18-08:00"}
{"filename":"controller/controller.go:110","level":"info","msg":"Setting up event handlers","time":"2018-02-27T18:25:18-08:00"}
```
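
A sketch of how such a json_log_format flag could drive the formatter choice; this is an illustration, not the actual patch:

```go
package main

import (
	"flag"

	log "github.com/sirupsen/logrus"
)

var jsonLogFormat = flag.Bool("json-log-format", true, "Set true to use json style log format. Set false to use plaintext style log format")

func main() {
	flag.Parse()

	if *jsonLogFormat {
		log.SetFormatter(&log.JSONFormatter{})
	} else {
		log.SetFormatter(&log.TextFormatter{})
	}

	// Roughly mirrors the V(1)/V(2) split described above: routine messages
	// go to Info, chattier ones to Debug.
	log.Info("routine message")
	log.Debug("verbose message, hidden unless the level is raised to Debug")
}
```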
jimexist pushed a commit to jimexist/tf-operator that referenced this issue Mar 7, 2018
jlewi added a commit to jlewi/k8s that referenced this issue Jul 30, 2018
k8s-ci-robot pushed a commit that referenced this issue Aug 1, 2018
* Improve meta information in log messages to make it easier to debug jobs (#765)

* Use <namespace>.<name> as opposed to <namespace>/<name>; the former
  is more consistent with K8s style.

* Add functions for constructing loggers for the pod and unstructured meta
  information. This will allow us to appropriately tag a number of log
  messages with meta information.

* Update a bunch of log messages which weren't logging info with
  appropriate meta information.

* Make json log formatting the default; this was the case for v1.
  json logging should be the default because otherwise we lose the meta
  information in the logs. With json logs its always possible to filter/reformat
  the log entries if you don't care about the metainformation.

Related to:
  #24 Use logrus
  #635

* Fix lint by running goimports.
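
A hypothetical helper in the spirit of that change; the real function names and field keys in the operator may differ:

```go
package main

import (
	"fmt"

	log "github.com/sirupsen/logrus"
)

// loggerForJob returns an entry tagged with the job's <namespace>.<name>,
// the naming style the commit message prefers over <namespace>/<name>.
func loggerForJob(namespace, name string) *log.Entry {
	return log.WithFields(log.Fields{
		"job": fmt.Sprintf("%s.%s", namespace, name),
	})
}

func main() {
	log.SetFormatter(&log.JSONFormatter{})
	loggerForJob("kubeflow", "example-tfjob").Info("Syncing TFJob")
}
```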
Syulin7 added a commit to Syulin7/training-operator that referenced this issue May 30, 2024