
Stern stops tailing logs after a while (possible log file rotation??) #287

Closed
rouilj opened this issue Dec 10, 2023 · 7 comments
Labels
kind/bug Something isn't working

Comments

@rouilj
Contributor

rouilj commented Dec 10, 2023

What happened:

I start stern and it starts tailing the log files for my pods. I am using:

 stern deployment/roundup-demo

After some period of time (a few hours), the logs stop being reported.

What you expected to happen:

Logs should continue until I stop stern or the pods are stopped.

How to reproduce it (as minimally and precisely as possible):

I'm not sure exactly. I run two pods that emit output; the k3s node uses the following config.yaml:

kubelet-arg:
  - container-log-max-files=4
  - container-log-max-size=500Ki
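
(For context, not part of the original report: k3s typically reads this file from /etc/rancher/k3s/config.yaml on the node, and the kubelet writes container logs under /var/log/pods/. A sketch of where the rotation shows up on disk, assuming the deployment is in the default namespace:)

# Assumed paths: /etc/rancher/k3s/config.yaml is the standard k3s config location,
# and container logs live under /var/log/pods/<namespace>_<pod>_<uid>/<container>/.
ls /var/log/pods/default_roundup-demo-*/roundup-demo/
# 0.log              <- file currently being written
# 0.log.<timestamp>  <- rotated file (timestamp suffix added on rotation)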

The deployment and related manifests can be obtained from: https://wiki.roundup-tracker.org/DeployWithK3s#preview

I am trying to determine if kubectl logs -f deployment/roundup-demo is stopping at the same point.

Anything else we need to know?:

When stern is properly tailing the pod logs, I have verified that it follows both the old and the new log files
when the deployment rollout is restarted.
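
(A sketch of that test, assuming the deployment is in the default namespace: keep stern running in one terminal and trigger the rollout from another.)

stern deployment/roundup-demo                      # terminal 1: keep tailing
kubectl rollout restart deployment/roundup-demo    # terminal 2: replace the pods
# stern prints "-" lines as the old pods' containers go away and "+" lines as
# the new pods come up, then continues with the new pods' output.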

Environment:

  • stern version (use stern --version): 1.27.0
  • OS (e.g: cat /etc/os-release): Ubuntu 20.04.6 LTS
  • Install tools (e.g: Homebrew): apt
  • Others:
    • kubernetes instance: k3s
    • hardware: odroid-C4
@rouilj rouilj added the kind/bug Something isn't working label Dec 10, 2023
@rouilj
Contributor Author

rouilj commented Dec 10, 2023

Just a follow-up: it does look related to log file rotation. My kubectl logs -f deployment/roundup-demo stopped
before stern did. However, stern was displaying logs from only one of the two pods when it stopped.

My guess is that kubectl was displaying the logs for the other pod, whose log was rotated 23 minutes earlier.

When stern stopped displaying logs, I looked at the files in the log directory. Stern's output ended about 4 lines before the end of the rotated log file. The log file had been rotated from
0.log (which I assume stern was following) to 0.log.20231210-211718.

This looks like stern doesn't have the equivalent of tail -F and is acting like tail -f.
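
(For illustration only, a minimal shell sketch of the difference; the file names are made up.)

echo start > app.log
tail -f app.log &              # follows the already-open file descriptor
mv app.log app.log.1           # simulate a log rotation
echo after-rotation > app.log  # new file at the same path
# tail -f keeps watching the renamed file and never prints "after-rotation";
# tail -F would notice the path now refers to a new file and reopen it.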

@tksm
Contributor

tksm commented Dec 14, 2023

Hi,

Thank you for reporting the issue.

I have confirmed that both kubectl and stern stop tailing logs when log file rotations occur in Kubernetes v1.28.0.

I believe the issue is caused by kubernetes/kubernetes#115701, which has been addressed in kubernetes/kubernetes#115702. I have verified that the issue has been resolved in Kubernetes v1.29.0.

The changelog entry for that fix reads: "Fixed the bug where kubelet couldn't output logs after log file rotated when kubectl logs POD_NAME -f is running."

Steps to reproduce

  1. Create a Kubernetes cluster using kind with the following config.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  image: kindest/node:v1.28.0@sha256:b7a4cad12c197af3ba43202d3efe03246b3f0793f162afb40a33c923952d5b31
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        "container-log-max-size": "1Ki"
  2. Deploy the following pod.
apiVersion: v1
kind: Pod
metadata:
  name: p
spec:
  terminationGracePeriodSeconds: 0
  containers:
  - name: p
    image: busybox
    command: ["/bin/sh", "-c", "while true; do date; sleep 0.01; done"]
  3. Confirm the issue by observing that both kubectl and stern stop tailing logs in about 10 seconds.
stern --timestamps pod/p
kubectl logs --timestamps -f p

When we create a Kubernetes v1.29.0 cluster with the same config, the issue is no longer reproduced. (A node image for Kubernetes v1.29.0 is not yet released, so we need to build a v1.29.0 image using kind build.)
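
(For completeness, the commands for the steps above, assuming the kind config and pod manifest are saved as kind-config.yaml and pod.yaml; those file names are my own.)

kind create cluster --config kind-config.yaml    # step 1: node with 1Ki log rotation
kubectl apply -f pod.yaml                        # step 2: the busybox pod "p"
stern --timestamps pod/p                         # step 3: output stops after ~10 s on v1.28.0
kubectl logs --timestamps -f p                   # same behaviour with kubectl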

@rouilj
Contributor Author

rouilj commented Dec 14, 2023

Thanks for the follow-up and the reproducer. I have reported this upstream, with your diagnosis info, at k3s-io/k3s#9060.

Hopefully they can use the same or a similar fix. I assume that once this is fixed, stern will work as expected.
Have a great weekend.

@rouilj rouilj closed this as completed Dec 14, 2023
@irons

irons commented Jan 12, 2024

Did this problem manifest as if the container had exited cleanly? For instance:

pod-12345678-9m4rc container-name merrily, merrily
pod-12345678-9m4rc container-name merrily, merrily
- pod-12345678-9m4rc › container-name

I interpreted this as the pod having finished successfully, except that there was no reason for it to have terminated there. When checking the logs in GCP, I found that it hadn't actually terminated and had continued to emit logs after this point. Restarting stern with the same query showed the logs above, plus what came after them, without the indication that the pod had terminated.

I'm currently on 1.25, and I'm not sure whether the linked Kubernetes bug predates 1.28. I'll open a new issue if this doesn't sound like the same behavior. Thanks for your time.

@tksm
Contributor

tksm commented Jan 14, 2024

@irons Hi,

As far as I could confirm, the logs just stopped without the - line in this case.

$ stern --timestamps pod/p
+ p › p
p p 2024-01-14T09:30:51.744125720+09:00 Sun Jan 14 00:30:51 UTC 2024
p p 2024-01-14T09:30:51.755554091+09:00 Sun Jan 14 00:30:51 UTC 2024
# ...
p p 2024-01-14T09:31:01.731528831+09:00 Sun Jan 14 00:31:01 UTC 2024
p p 2024-01-14T09:31:01.742724026+09:00 Sun Jan 14 00:31:01 UTC 2024
# The logs just stop without the `-` line.

The --verbosity=10 flag might help you find out why stern removes the container.

stern --verbosity=10 <query>

@rouilj
Contributor Author

rouilj commented Jan 14, 2024

I have updated k3s since I opened this issue, and I do see the trailing - line:

roundup-demo-66f56565d8-757qh roundup-demo 10.42.0.1 - - [14/Jan/2024 03:08:24] "GET /demo/ HTTP/1.1" 200 -
roundup-demo-66f56565d8-757qh roundup-demo 10.42.0.1 - - [14/Jan/2024 03:08:34] "GET /demo/ HTTP/1.1" 200 -
roundup-demo-66f56565d8-757qh roundup-demo 10.42.0.1 - - [14/Jan/2024 03:08:44] "GET /demo/ HTTP/1.1" 200 -
roundup-demo-66f56565d8-757qh roundup-demo 10.42.0.1 - - [14/Jan/2024 03:08:53] "GET /demo/ HTTP/1.1" 200 -
roundup-demo-66f56565d8-757qh roundup-demo 10.42.0.1 - - [14/Jan/2024 03:09:03] "GET /demo/ HTTP/1.1" 200 -
roundup-demo-66f56565d8-757qh roundup-demo 10.42.0.1 - - [14/Jan/2024 03:09:13] "GET /demo/ HTTP/1.1" 200 -
- roundup-demo-66f56565d8-nb478 › roundup-demo
- roundup-demo-66f56565d8-757qh › roundup-demo
^C
r2d2@r2d2:~$ kubectl get pods
NAME                            READY   STATUS    RESTARTS        AGE
roundup-demo-66f56565d8-nb478   1/1     Running   2 (4d13h ago)   18d
roundup-demo-66f56565d8-757qh   1/1     Running   2 (4d13h ago)   18d
r2d2@r2d2:~$ k3s -v
k3s version v1.28.5+k3s1 (5b2d1271)
go version go1.20.12

Restarting stern picked up from just after the last logged timestamp for both pods.

@irons

irons commented Jan 14, 2024

Thanks, and also yikes. I don't want to have to restart stern after each apparent pod exit to check whether the pod really terminated, and we won't get to 1.29 any time soon. I'll try again with high verbosity and see if I can get more information.
