`loki.source.kubernetes` should handle log rotation #5040

skl · 2023-08-31T09:56:08Z

Background

Over on the k8s monitoring helm repo, we found that loki.source.kubernetes stops tailing logs after log file rotation.

Proposal

Test again once the underlying issue in Kubernetes being tracked here is resolved:

kubelet: fix handle new log file in ReadLogs after log rotation kubernetes/kubernetes#118500

captncraig · 2023-08-31T19:17:24Z

Is there any potential for a workaround on this? My fear is that even if k8s makes a fix in a future version, this will still be a problem for years after that until all the cloud providers make that version GA, and everybody upgrades, which can be real slow.

rfratto · 2023-08-31T19:42:23Z

@slim-bean has a potential workaround involving detecting when logs drop off and force resetting the connection. It's nasty, but it works.

n888 · 2023-09-11T18:41:41Z

Nice to hear there is a potential workaround, @slim-bean would you be able to share what you have so far?

sharovmerk · 2023-10-05T12:47:39Z

any news here?

sharovmerk · 2023-10-05T12:48:22Z

btw. I see the same behaviour with loki.source.podlogs

rfratto · 2023-10-11T17:10:07Z

I don't have an exact timeline, but I've been told the code that was written to fix this should be shared within the next few weeks.

> btw. I see the same behaviour with loki.source.podlogs

loki.source.podlogs and loki.source.kubernetes share the same code for tailing logs, so the fix will resolve the behavior seen in both components.

rfratto · 2023-10-26T15:26:53Z

kubernetes/kubernetes#115702 got merged, which will fix the problem for us, but it's unclear how many versions of Kubernetes the fix will be backported to, and how quickly users will upgrade to get the fix.

In general, we'll still need a workaround for versions of Kubernetes that don't have the fix available.

Versions of Kubernetes that do not contain kubernetes/kubernetes#115702 will fail to detect rolled log files, causing the API to stop sending logs to the agent for processing. To work around this, this commit introduces a rolling average calculator to determine the average delta between log entries per target. If 3x the normal delta time has elapsed since the last entry, the tailer is restarted. False positives here are acceptable, but false negatives mean that log lines may not appear for an extended period of time until the rolling detection succeeds.

Closes grafana#5040

Co-authored-by: Edward Welch <edward.welch@grafana.com>

* component/prometheus: fix panic in interceptor when child isn't set

  This commit fixes a panic in prometheus.Interceptor where an interceptor which doesn't forward samples to another appendable panics when appending data.

  Co-authored-by: Edward Welch <edward.welch@grafana.com>

* loki.source.kubernetes: improve detection of rolled log files

  Versions of Kubernetes that do not contain kubernetes/kubernetes#115702 will fail to detect rolled log files, causing the API to stop sending logs to the agent for processing. To work around this, this commit introduces a rolling average calculator to determine the average delta between log entries per target. If 3x the normal delta time has elapsed since the last entry, the tailer is restarted. False positives here are acceptable, but false negatives mean that log lines may not appear for an extended period of time until the rolling detection succeeds.

  Closes #5040

  Co-authored-by: Edward Welch <edward.welch@grafana.com>

* loki.source.kubernetes: support clustering

  Add support for loki.source.kubernetes to distribute targets using clustering.

  Closes #4502

  Co-authored-by: Edward Welch <edward.welch@grafana.com>

* loki.source.podlogs: support clustering

  Add support for loki.source.podlogs to distribute targets using clustering.

* service/cluster: add common block for clustering arguments

* remove irrelevant TODO comment #5623 (comment)

---------

Co-authored-by: Edward Welch <edward.welch@grafana.com>

skl added the proposal Proposal or RFC label Aug 31, 2023

skl mentioned this issue Aug 31, 2023

Tracking issue for loki.source.kubernetes grafana/k8s-monitoring-helm#101

Closed

tpaschalis added the type/signals label Sep 5, 2023

rfratto mentioned this issue Oct 16, 2023

Grafana FLOW Logs tailing stopped and not coming back. Pods Unhealthy #5482

Closed

rfratto mentioned this issue Oct 26, 2023

Flow: Improve Kubernetes log collection #5623

Merged

rfratto closed this as completed in #5623 Oct 30, 2023

github-actions bot added the frozen-due-to-age Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed. label Feb 21, 2024

github-actions bot locked as resolved and limited conversation to collaborators Feb 21, 2024
