Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go implementation log file exporter jira ticket log 1188 #979

Conversation

pmoogi-redhat
Copy link
Contributor

@pmoogi-redhat pmoogi-redhat commented Apr 13, 2021

Description

Summary
Implement an independent process to monitor CRIO container log file rotations that can be used to publish total_bytes_logged metrics in prometheus format. The motivation for moving away from a Ruby implementation is Ruby is not efficient in tracking file rotations. Also collectors can change over time so making metric computation outside of fluentd for logging total_bytes_logged is more preferred.

Exposes metric endpoint that is consumable by Prometheus at port :2112
Exposes metric of total_bytes_logged

/cc @alanconway @jcantrill
/assign @jcantrill

Links

  • Related PR(s): origin-aggregated-logging:Metric-for-inbound-log-data-loss-at-the-collector-JIRA-ticket-LOG-1032
    Got in_tail and prometheus plugin changes which enable total_bytes_collected metric computed and published in prometheus plugin
  • JIRA: https://issues.redhat.com/browse/LOG-1188

Copy link
Contributor

@jcantrill jcantrill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 13, 2021
//Get new watcher
w := &FileWatcher {
metrics: prometheus.NewCounterVec(prometheus.CounterOpts{
Name: "gofilewatcher_input_status_total_bytes_logged",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another one for @blockloop to weigh in. There are some specific promtail conventions about the order of bytes, total etc. The fluentd metric name also should be consistent wit hthis.

@@ -0,0 +1,17 @@
FROM openshift/origin-release:golang-1.15 AS build
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want a single image with two entrypoints, or two images? I was assuming two images, since we will deploy this as part of the fluentd DaemonSet not with the CLO.

internal/go-symlink-logfiles-watcher/Docker/go.mod Outdated Show resolved Hide resolved
internal/go-symlink-logfiles-watcher/Makefile Outdated Show resolved Hide resolved
@alanconway
Copy link
Contributor

alanconway commented Apr 15, 2021

Final re-naming! @pmoogi-redhat

Executable: log_exporter

Metrics:

  • log_logged_bytes_total: total number of bytes written to a single log file path, accounting for rotations.
  • log_collected_bytes_total: total number of bytes collected from a single log file.
  • Both metrics have the label "path" which is the absolute path to the log file.

@pmoogi-redhat pmoogi-redhat force-pushed the Go-implementation-log-file-exporter-JIRA-ticket-LOG-1188 branch from 12c3aae to 4cf01db Compare April 15, 2021 14:05
Copy link
Contributor

@alanconway alanconway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few minor renaming diffs now that we've decided on the metric names finally. Otherwise this is good to go.

Copy link
Contributor

@jcantrill jcantrill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make sure to format before pushing changes make fmt

internal/logs-files-metric-exporter/Docker/Dockerfile Outdated Show resolved Hide resolved
internal/pkg/symnotify/symnotify.go Outdated Show resolved Hide resolved
internal/pkg/symnotify/symnotify.go Outdated Show resolved Hide resolved
internal/pkg/symnotify/symnotify.go Outdated Show resolved Hide resolved
internal/pkg/symnotify/symnotify.go Outdated Show resolved Hide resolved

func TestWatchesRealFiles(t *testing.T) {
f := NewFixture(t)
assert, require := assert.New(t), require.New(t)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be moved into the fixture to reduce code dup

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alanconway seeking your advice here.

internal/pkg/symnotify/symnotify.go Outdated Show resolved Hide resolved
internal/pkg/symnotify/symnotify.go Outdated Show resolved Hide resolved
e, err := w.watcher.Event()
if err != nil {
log.Error(err, "Watcher.Event returning err")
os.Exit(1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know of any recoverable reason for an inotify FD to be closed by the kernel. The inotify object exists indepenently of any files or directories that it is watching.
We could try to re-establish the watch with timed backoff - we don't want to go into a hard CPU spin.

//Get new watcher
w := &FileWatcher {
metrics: prometheus.NewCounterVec(prometheus.CounterOpts{
Name: "logs_files_metric_exporter_input_status_bytes_logged_total",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Locked: log_logged_bytes_total

@openshift-ci-robot openshift-ci-robot added the midstream/Dockerfile A Dockerfile.in sync is needed with midstream label Apr 22, 2021
Copy link
Contributor

@jcantrill jcantrill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please run make fmt before commiting

internal/log-file-metric-exporter/Docker/Dockerfile Outdated Show resolved Hide resolved
internal/cmd/log-file-metric-exporter/main.go Outdated Show resolved Hide resolved
internal/cmd/log-file-metric-exporter/main.go Outdated Show resolved Hide resolved
internal/cmd/log-file-metric-exporter/main.go Outdated Show resolved Hide resolved
internal/pkg/symnotify/symnotify.go Outdated Show resolved Hide resolved
internal/pkg/symnotify/symnotify.go Outdated Show resolved Hide resolved
internal/pkg/symnotify/symnotify.go Outdated Show resolved Hide resolved
internal/pkg/symnotify/symnotify.go Outdated Show resolved Hide resolved
@pmoogi-redhat pmoogi-redhat force-pushed the Go-implementation-log-file-exporter-JIRA-ticket-LOG-1188 branch from 55db0b0 to 6908a87 Compare April 26, 2021 08:12
@alanconway
Copy link
Contributor

/approve
I'll let @jcantrill do the LGTM since he's raised a couple more points.

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 26, 2021
Copy link
Contributor

@jcantrill jcantrill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please run make fmt before sumitting

internal/cmd/log-file-metric-exporter/main.go Outdated Show resolved Hide resolved
internal/pkg/symnotify/symnotify.go Outdated Show resolved Hide resolved
internal/pkg/symnotify/symnotify.go Outdated Show resolved Hide resolved
Copy link
Contributor

@alanconway alanconway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made some suggestions about handling the regexp, otherwise looks good.

internal/cmd/log-file-metric-exporter/main.go Outdated Show resolved Hide resolved
internal/cmd/log-file-metric-exporter/main.go Outdated Show resolved Hide resolved
@openshift-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: pmoogi-redhat
To complete the pull request process, please ask for approval from jcantrill after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 28, 2021
Dockerfile Outdated Show resolved Hide resolved
internal/cmd/log-file-metric-exporter/main.go Outdated Show resolved Hide resolved
internal/cmd/log-file-metric-exporter/main.go Outdated Show resolved Hide resolved
internal/cmd/log-file-metric-exporter/main.go Outdated Show resolved Hide resolved
pkg/k8shandler/fluentd.go Show resolved Hide resolved
@@ -337,6 +354,26 @@ func (clusterRequest *ClusterLoggingRequest) createOrUpdateFluentdSecret() error
return nil
}

func newLogMetricExporterContainer() v1.Container {
resources := v1.ResourceRequirements{}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alanconway I left this blank when I provided the PR, but how should we pass in resources here. I'm not fond of opening up the API, but maybe we need to allow a backdoor for setting mem and cpu, maybe via annotations?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add a "MetricExporterSpec" to the API, I don't think that would be too bad. The API already has componet-specific requirements (FluentdSpec) so it's not making it any worse.
Alternatively we could split the Fluentd requirements - e.g. give 90% to fluentd and 10% to the exporter. We'd need to do some benchmarks to make sure that's a reasonable split but we should be doing CPU/Mem benehmarks anyway so it's useful work.
The backdoor approach means our collector is using resources that the user doesn't know about, which is not ideal.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would lean towards just adding to the API. This is something to clean up in the "API v2" effort when we get smarter about scaling and resource use, but it's not worse than whats already there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pmoogi-redhat this means we need to add soming like https://github.com/openshift/cluster-logging-operator/blob/master/pkg/apis/logging/v1/clusterlogging_types.go#L35 with the resource requirements. Make sure it additionally includes the comments as these are used for the code generation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jcantrill @alanconway added changes to support LogFileMetricExportedSpec - please review my changes done in
modified: apis/logging/v1/clusterlogging_types.go
modified: k8shandler/defaults.go
modified: k8shandler/fluentd.go
Hoping above is sufficient for specifying individually resources for LogFileMetricExporter.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vimalk78 please review as well

Copy link
Contributor Author

@pmoogi-redhat pmoogi-redhat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alanconway incorporated your regex based parsing k8 filename suggestion. please see.

@alanconway
Copy link
Contributor

@pmoogi-redhat This all looks good. Add unit tests for FileWatcher, including the file name parsing and I'll LGTM.
Go has very easy-to-use test coverage tools to check your coverage, see https://blog.golang.org/cover

@pmoogi-redhat pmoogi-redhat force-pushed the Go-implementation-log-file-exporter-JIRA-ticket-LOG-1188 branch from e75e742 to ac012ab Compare July 6, 2021 17:38
@pmoogi-redhat
Copy link
Contributor Author

/retest

@pmoogi-redhat
Copy link
Contributor Author

/test lint

1 similar comment
@pmoogi-redhat
Copy link
Contributor Author

/test lint

@pmoogi-redhat
Copy link
Contributor Author

/test e2e-gcp

1 similar comment
@pmoogi-redhat
Copy link
Contributor Author

/test e2e-gcp

@pmoogi-redhat pmoogi-redhat force-pushed the Go-implementation-log-file-exporter-JIRA-ticket-LOG-1188 branch from ac012ab to dc12c4f Compare July 7, 2021 16:55
@pmoogi-redhat
Copy link
Contributor Author

/test e2e-gcp

@pmoogi-redhat
Copy link
Contributor Author

/test functional

@pmoogi-redhat
Copy link
Contributor Author

/test e2e-gcp

1 similar comment
@pmoogi-redhat
Copy link
Contributor Author

/test e2e-gcp

@pmoogi-redhat pmoogi-redhat force-pushed the Go-implementation-log-file-exporter-JIRA-ticket-LOG-1188 branch from 92d982c to 029dc99 Compare July 12, 2021 16:36
@pmoogi-redhat
Copy link
Contributor Author

/test functional

@pmoogi-redhat
Copy link
Contributor Author

/test e2e-gcp

1 similar comment
@pmoogi-redhat
Copy link
Contributor Author

/test e2e-gcp

@pmoogi-redhat pmoogi-redhat force-pushed the Go-implementation-log-file-exporter-JIRA-ticket-LOG-1188 branch from 029dc99 to 5935ef3 Compare July 13, 2021 10:52
@pmoogi-redhat
Copy link
Contributor Author

/test e2e-gcp

…pec to expose metric log_logged_bytes_total

Delete Makefile.vimal.fix - clean up this Makefile
Reverted back to quay.io hosted fluentd image and added comment in fluentd.go
Changed back clo pull policy to IfNotPresent
merge issue clean up in bundle and manifest csv
with https scheme for log-file-metric-exporter + other review comments changes
fixed https support issues
fixed merge issue in fluentd.go
fix to be tested
Using fluentdContainer.SecurityContext
quay.io hosted ci image for log-file-metric-exporter added in bundle/manifests/cluster-logging.v5.2.0.clusterserviceversion.yaml  manifests/5.2/cluster-logging.v5.2.0.clusterserviceversion.yaml
taking out /bin/sh -c wrapper from cmd for log-file-metric-exporter
taken out dead code on exporterContainer
cleanup unnecessary Volume like /usr/local/bin - not required to be mounted
varlogcontainers volumemount fix
exporterContainer.Command fix - flat string of command line found not working so fixed by providing slice of strings for each argv
@pmoogi-redhat pmoogi-redhat force-pushed the Go-implementation-log-file-exporter-JIRA-ticket-LOG-1188 branch from 5935ef3 to e43f772 Compare July 14, 2021 09:13
@pmoogi-redhat
Copy link
Contributor Author

/test lint

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 14, 2021

@pmoogi-redhat: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-operator 8e6327563f4f6a09fb9729c52b6f0cee661ba2ec link /test e2e-operator

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@pmoogi-redhat
Copy link
Contributor Author

/test lint

@jcantrill
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 14, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 14, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alanconway, jcantrill, pmoogi-redhat

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [alanconway,jcantrill]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 28b7abf into openshift:master Jul 14, 2021
@pmoogi-redhat
Copy link
Contributor Author

#979 - PR merged on golang based file watcher named as log-file-metric-exporter. It watches log files registered into its watcher object. It publishes log_logged_bytes_total for these log files.

pmoogi-redhat pushed a commit to pmoogi-redhat/cluster-logging-operator that referenced this pull request Apr 26, 2022
…-log-file-exporter-JIRA-ticket-LOG-1188

Go implementation log file exporter jira ticket log 1188
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. midstream/Dockerfile A Dockerfile.in sync is needed with midstream
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants