Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kubectl logs should support log rotation for CRI container runtime. #59902

Closed
Random-Liu opened this issue Feb 15, 2018 · 15 comments
Closed

kubectl logs should support log rotation for CRI container runtime. #59902

Random-Liu opened this issue Feb 15, 2018 · 15 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@Random-Liu
Copy link
Member

Today no matter for docker or other CRI container runtime, kubectl logs is NOT able to:

  1. Trace back to back to rotated logs.
  2. Detect current log file is rotated and start monitoring new log file.

Previously we don't support it, because docker doesn't follow CRI log format, and kubectl logs doesn't understand how docker rotate container logs.

However, after #59898, kubelet will be responsible for rotating container logs for CRI container runtime. And kubectl logs can know potentially support log rotation, because kubelet has all the information needed.

We can do this after #59898 lands.

@kubernetes/sig-node-feature-requests
/kind feature

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. kind/feature Categorizes issue or PR as related to a new feature. labels Feb 15, 2018
k8s-github-robot pushed a commit that referenced this issue Feb 23, 2018
Automatic merge from submit-queue (batch tested with PRs 60214, 58762, 59898, 59897, 60204). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add CRI container log rotation support

Fixes #58823.

This PR:
1) Added `pkg/kubelet/logs/container_log_manager.go` which manages and rotates container logs.
2) Added a feature gate `CRIContainerLogRotation` to enable the alpha feature. And 2 kubelet flags `--container-log-max-size` and `--container-log-max-files` to configure the rotation behavior.
3) Added unit test and node e2e test for container log rotation.

Note that:
1) Container log manager only starts when the container runtime is `remote` (not docker), because we can't implement `ReopenContainerLog` for docker.
2) Rotated logs are compressed with `gzip`.
2) The latest rotated log is not compressed. Because fluentd may still be reading the file right after rotation.
3) `kubectl logs` still doesn't support log rotation. This is not a regression anyway, it doesn't support log rotation for docker log today. We'll probably fix this in the future. (Issue: #59902)

An example of container log directory with `--container-log-max-files=3`:
```console
$ ls -al /var/log/pods/57146449-11ec-11e8-90e1-42010af00002
total 592
drwxr-xr-x 2 root root   4096 Feb 15 01:07 .
drwxr-xr-x 3 root root  12288 Feb 15 01:06 ..
-rw-r----- 1 root root 176870 Feb 15 01:07 log-container_0.log
-rw-r--r-- 1 root root  40239 Feb 15 01:07 log-container_0.log.20180215-010737.gz
-rw-r----- 1 root root 365996 Feb 15 01:07 log-container_0.log.20180215-010747
```

/assign @mtaufen for the config change.
/assign @dashpole @crassirostris for the log change.
/assign @feiskyer for CRI related change.
/cc @yujuhong @feiskyer @abhi @mikebrow @mrunalp @runcom 
/cc @kubernetes/sig-node-pr-reviews @kubernetes/sig-instrumentation-pr-reviews 

**Release note**:

```release-note
[Alpha] Kubelet now supports container log rotation for container runtime which implements CRI(container runtime interface).
The feature can be enabled with feature gate `CRIContainerLogRotation`.
The flags `--container-log-max-size` and `--container-log-max-files` can be used to configure the rotation behavior.
```
@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 16, 2018
@yujuhong yujuhong added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 16, 2018
@ShahNewazKhan
Copy link

Any updates on this front?

@louygan
Copy link

louygan commented Aug 15, 2019

which release will include this change?

@ehashman
Copy link
Member

/help

Now that CRIContainerLogRotation is GA, we can potentially add this as a follow-on feature.

Due to the potential for version skew issues (kubernetes/enhancements#2418 (comment)) this will probably need a design doc or perhaps a small KEP.

@k8s-ci-robot
Copy link
Contributor

@ehashman:
This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-help command.

In response to this:

/help

Now that CRIContainerLogRotation is GA, we can potentially add this as a follow-on feature.

Due to the potential for version skew issues (kubernetes/enhancements#2418 (comment)) this will probably need a design doc or perhaps a small KEP.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@vaibhav2107
Copy link
Member

Hi @Random-Liu,
Is this issue active to work on.

@dharmit
Copy link

dharmit commented Jul 8, 2022

Hi @Random-Liu @ehashman @derekwaynecarr, I'd like to work on this. I haven't worked on k8s code base except when using it as a library in the other project I mostly work on. But I will be able to pick things up. Can I assign?

/assign

bors bot pushed a commit to spacemeshos/go-spacemesh that referenced this issue Nov 10, 2022
## Motivation
System tests in CI fail sporadically because `kubectl logs -f <systest pod>` exits prematurely (while tests are still running). An example of a CI run failed in this way: https://github.com/spacemeshos/go-spacemesh/actions/runs/3421691184/jobs/5698440170.

I suspect this is because of log rotation, which `kubectl` doesn't support properly (see kubernetes/kubernetes#59902 and kubernetes/kubernetes#28369).

## Changes
The fix is to wait for the systests to actually finish. It turns out that it's not possible when using a Pod because `kubectl wait --for=<>` doesn't provide a way to wait for Pod to "complete". Kubernetes has a different kind of object for this - a [Job](https://kubernetes.io/docs/concepts/workloads/controllers/job/).

Hence in this PR, I changed the deployment of systests to run in Jobs and I added a script to wait for the tests to finish.

`make run` now returns the exit code properly depending if the tests passed or failed. Thanks to it the CI step running tests is automatically marked as failed if tests failed - I could remove additional steps to check tests result.

Finally, in order to spawn a Job with restart policy = Never and additional labels, I had to create it via `kubectl apply -f` because `kubectl create job` is too limited. In order to create a Job deployment yaml file (to be used with `kubectl apply -f`), I use a template file (`systest_job.yml.tmpl`) which is rendered using [gomplate](https://gomplate.ca).

## Test Plan
CI passing

## DevOps Notes
- [x] This PR does not require configuration changes (e.g., environment variables, GitHub secrets, VM resources)
- [x] This PR does not affect public APIs
- [x] This PR does not rely on a new version of external services (PoET, elasticsearch, etc.)
- [x] This PR does not make changes to log messages (which monitoring infrastructure may rely on)
@dharmit
Copy link

dharmit commented Mar 5, 2023

Sincere apologies for not having looked into this yet. Unassigning so that someone who has bandwidth can pick it up.

Sorry folks. :(

/unassign

@SaberStrat
Copy link

SaberStrat commented May 28, 2023

I take it it's not (just) kubectl that can't handle rotated logs, it's the pods API?

@wu0407
Copy link

wu0407 commented Sep 11, 2023

@zmedico
Copy link

zmedico commented Jan 5, 2024

Fixed by #115702?

@xyz-li
Copy link
Contributor

xyz-li commented Aug 7, 2024

Fixed by #115702?

I think this has been fixed by #115702.

@vaibhav2107
Copy link
Member

As per the comment we can close this issue

@vaibhav2107
Copy link
Member

/close

@k8s-ci-robot
Copy link
Contributor

@vaibhav2107: Closing this issue.

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/feature Categorizes issue or PR as related to a new feature. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. sig/node Categorizes an issue or PR as relevant to SIG Node.
Projects
None yet
Development

Successfully merging a pull request may close this issue.