Structured Logging For the operator #24
Comments
Lots of workarounds are discussed in sirupsen/logrus#63. Prometheus implemented it this way, so we could potentially copy them.
I'm adding this to our next milestone because I think structured logging will be critical to scaling. As we scale up to more and larger jobs, we will need to be able to easily filter logs by pod, job, etc. to get to the relevant logs.
Personally, I recommend glog, since most repos in the Kubernetes community use glog.
+1 for
If we use glog, is there a way to output JSON logs with metadata such as the job and replica a log message is associated with?
I am afraid not 🤔, since there is no function for it in the docs: https://godoc.org/github.com/golang/glog
With glog, how do we make it really easy to filter the TFJob operator logs so we can see the log messages for a particular job? I think this will be super useful for debugging and troubleshooting. If we use structured logging, we can add a tag corresponding to the job name. Then it should be very easy to filter the logs to find all log messages for a particular job.
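To make the job-tag idea concrete, here is a minimal sketch using only the Go standard library. It is not the operator's code: the `jobLogger` type, the `job` field name, and the `filterByJob` helper are all hypothetical. With logrus the tag would typically be attached via `WithField` instead.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"time"
)

// jobLogger (hypothetical) writes one JSON object per line and stamps
// every record with the name of the job it belongs to.
type jobLogger struct {
	out io.Writer
	job string
}

// Info emits a single JSON record at level "info".
func (l jobLogger) Info(msg string) {
	rec := map[string]string{
		"level": "info",
		"job":   l.job, // assumed field name, not taken from the operator
		"msg":   msg,
		"time":  time.Now().Format(time.RFC3339),
	}
	line, _ := json.Marshal(rec) // a map[string]string always marshals
	fmt.Fprintln(l.out, string(line))
}

// filterByJob returns the messages of all records whose "job" field
// equals job. Lines that are not valid JSON objects are skipped.
func filterByJob(logs []byte, job string) []string {
	var msgs []string
	for _, line := range bytes.Split(logs, []byte("\n")) {
		var rec map[string]string
		if json.Unmarshal(line, &rec) != nil {
			continue
		}
		if rec["job"] == job {
			msgs = append(msgs, rec["msg"])
		}
	}
	return msgs
}

func main() {
	var buf bytes.Buffer
	jobLogger{out: &buf, job: "default.mnist"}.Info("created pod")
	jobLogger{out: &buf, job: "default.other"}.Info("created service")
	jobLogger{out: &buf, job: "default.mnist"}.Info("job succeeded")

	for _, m := range filterByJob(buf.Bytes(), "default.mnist") {
		fmt.Println(m)
	}
}
```

In practice the filtering would more likely be done by the log backend or a tool such as `jq`; the point is only that a per-record job field makes it a simple equality check.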
This solution looks promising. I believe it just uses the filename hook, so I think we can define a logrus logger with that hook and it will work. It would be great if someone could try it out using the example here:
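For anyone unfamiliar with how a filename hook can work, the core of it is a call to `runtime.Caller`. The sketch below is not the hook referenced above; `callerLocation` and `logInfo` are hypothetical helpers that only show how a `filename` field with file and line could be attached to a JSON record.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path"
	"runtime"
)

// callerLocation returns "dir/file.go:line" for the frame that is skip
// levels above the function calling callerLocation.
func callerLocation(skip int) string {
	// +1 skips callerLocation's own frame.
	_, file, line, ok := runtime.Caller(skip + 1)
	if !ok {
		return "unknown:0"
	}
	// Keep only the parent directory and file name. runtime reports
	// slash-separated paths, so package path (not filepath) is enough.
	short := path.Join(path.Base(path.Dir(file)), path.Base(file))
	return fmt.Sprintf("%s:%d", short, line)
}

// logInfo builds a JSON record whose "filename" field points at the
// code that called logInfo.
func logInfo(msg string) string {
	rec := map[string]string{
		"filename": callerLocation(1), // 1 = skip logInfo itself
		"level":    "info",
		"msg":      msg,
	}
	b, _ := json.Marshal(rec)
	return string(b)
}

func main() {
	fmt.Println(logInfo("Setting up event handlers"))
}
```

A real hook has to pick the skip depth so that it lands outside the logging library's own frames, which is the fiddly part that the workarounds deal with.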
I'm looking into this. I will try the
We now use the flag package to support command-line flags, and glog also uses it by default. As a result, our binary has more flags than we think, even though we use logrus instead of glog:
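The extra flags appear because glog registers its flags on the global `flag.CommandLine` set in an `init` function, so merely having it in the dependency tree is enough. One possible way around this, sketched below under the assumption that the operator controls its own flag parsing, is a private `flag.FlagSet`. The `init` here is only a stand-in for the dependency, and `newOperatorFlags` and `flagNames` are hypothetical helpers.

```go
package main

import (
	"flag"
	"fmt"
)

func init() {
	// Stand-in for a dependency such as glog, which registers flags on
	// the global flag set when its package is initialized.
	flag.Bool("logtostderr", false, "log to standard error instead of files")
}

// newOperatorFlags builds a private FlagSet that contains only the
// flags the operator defines itself.
func newOperatorFlags() (*flag.FlagSet, *bool) {
	fs := flag.NewFlagSet("operator", flag.ContinueOnError)
	jsonLog := fs.Bool("json_log_format", true, "emit logs as JSON")
	return fs, jsonLog
}

// flagNames lists the names of all flags registered on fs.
func flagNames(fs *flag.FlagSet) []string {
	var names []string
	fs.VisitAll(func(f *flag.Flag) { names = append(names, f.Name) })
	return names
}

func main() {
	fs, jsonLog := newOperatorFlags()
	if err := fs.Parse([]string{"-json_log_format=false"}); err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("global flags:", flagNames(flag.CommandLine))
	fmt.Println("operator flags:", flagNames(fs), "json:", *jsonLog)
}
```

The trade-off is that flags registered by dependencies are no longer settable from the command line unless they are forwarded explicitly.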
There are some pros and cons:
I think we can close this issue after #416 is merged, and I will file a new issue for the extra flag problem. But it is now a big problem. We can refer to etcd/etcd-operator.
xref #424
* Move from glog to sirupsen/logrus for logging
* Add a new flag json_log_format
* Refactor all log statements to use logrus
* Use Info level for log level V(1) and below
* Use Debug level for log level V(2) and above
* Tested locally

Addresses #24

Sample logs
```
{"filename":"app/server.go:54","level":"info","msg":"EnvKubeflowNamespace not set, use default namespace","time":"2018-02-27T18:25:18-08:00"}
{"filename":"app/server.go:59","level":"info","msg":"[Version: 0.3.0+git Git SHA: Not provided. Go Version: go1.9.3 Go OS/Arch: darwin/amd64]","time":"2018-02-27T18:25:18-08:00"}
{"filename":"app/server.go:145","level":"info","msg":"No controller_config_file provided; using empty config.","time":"2018-02-27T18:25:18-08:00"}
{"filename":"controller/controller.go:110","level":"info","msg":"Setting up event handlers","time":"2018-02-27T18:25:18-08:00"}
```
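The level mapping in the commit message boils down to a single threshold. A tiny sketch of that rule, where the `level` type and `levelForVerbosity` are hypothetical names rather than anything from the change itself:

```go
package main

import "fmt"

// level is a hypothetical stand-in for a logging library's level type.
type level string

const (
	infoLevel  level = "info"
	debugLevel level = "debug"
)

// levelForVerbosity maps a glog verbosity to a level, following the
// rule in the commit message: V(1) and below become Info, V(2) and
// above become Debug.
func levelForVerbosity(v int) level {
	if v <= 1 {
		return infoLevel
	}
	return debugLevel
}

func main() {
	for v := 0; v <= 3; v++ {
		fmt.Printf("V(%d) -> %s\n", v, levelForVerbosity(v))
	}
}
```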
…obs (#765)
* Improve meta information in log messages to make it easier to debug jobs.
* Use <namespace>.<name> as opposed to <namespace>/<name>; the former is more consistent with K8s style.
* Add functions for constructing loggers for the pod and unstructured meta information. This will allow us to appropriately tag a number of log messages with meta information.
* Update a bunch of log messages which weren't logging info with appropriate meta information.
* Make json log formatting the default; this was the case for v1. json logging should be the default because otherwise we lose the meta information in the logs. With json logs it's always possible to filter/reformat the log entries if you don't care about the meta information.

Related to: #24 Use logrus #635
* Fix lint by running goimports.
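As a rough illustration of the <namespace>.<name> convention and of a pod-level logger constructor, here is a standard-library sketch. The function names and the `job` and `pod` field names are assumptions made for the example, not the ones used in the change.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// objectKey joins namespace and name with a dot, i.e. the
// <namespace>.<name> form described in the commit message.
func objectKey(namespace, name string) string {
	return namespace + "." + name
}

// fieldsForPod returns the meta information to attach to every log
// record about a pod. Field names are hypothetical.
func fieldsForPod(namespace, jobName, podName string) map[string]string {
	return map[string]string{
		"job": objectKey(namespace, jobName),
		"pod": objectKey(namespace, podName),
	}
}

// logLine renders an info-level JSON record carrying the given fields.
func logLine(fields map[string]string, msg string) string {
	rec := map[string]string{"level": "info", "msg": msg}
	for k, v := range fields {
		rec[k] = v
	}
	b, _ := json.Marshal(rec) // map keys are emitted in sorted order
	return string(b)
}

func main() {
	fields := fieldsForPod("kubeflow", "mnist", "mnist-worker-0")
	fmt.Println(logLine(fields, "pod created"))
}
```

Building the fields once per pod and reusing them keeps individual log statements short while still tagging each record.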
kubeflow#24) Signed-off-by: Syulin7 <735122171@qq.com>
I think it would be useful if the operator used structured logging.
For example, it would be nice if the operator output JSON-formatted records with various metadata tags. One tag could be the name of the job a log message pertains to. This would make it easy to filter the log messages by job.
https://github.com/sirupsen/logrus is a Go package for structured logging. The main reason I initially didn't use that and went with https://github.com/golang/glog was because logrus doesn't support outputting the file and line number of an error.
Ideally we'd like the best of both packages; i.e. structured logs with file and line number.