Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRI: Symlink docker logs to CRI defined log path. #34858

Merged
merged 2 commits into from
Oct 24, 2016

Conversation

Random-Liu
Copy link
Member

@Random-Liu Random-Liu commented Oct 14, 2016

This PR symlink docker logs to the CRI defined log path.

I manually test myself, and it works. We still need some more unit test and a node e2e test.

@resouer is working on a node e2e test #34661.
For the unit test, the current MakeFakePodSandbox and MakeFakeContainer is not enough for unit test, because I need to create multiple instances for one container and sandbox to test garbage collection.

I'll send a separate PR to refactor the unit test framework in kuberuntime and finish the unit test.

@yujuhong @feiskyer
/cc @kubernetes/sig-node


This change is Reviewable

@Random-Liu Random-Liu added area/kubelet area/kubelet-api sig/node Categorizes an issue or PR as relevant to SIG Node. and removed cla: yes labels Oct 14, 2016
@Random-Liu Random-Liu added the release-note-none Denotes a PR that doesn't merit a release note. label Oct 14, 2016
@k8s-github-robot k8s-github-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 14, 2016
@Random-Liu
Copy link
Member Author

@k8s-bot cri test this.

@resouer
Copy link
Contributor

resouer commented Oct 15, 2016

node e2e test is #34877

@Random-Liu
Copy link
Member Author

@resouer Thanks!

@Random-Liu
Copy link
Member Author

Add one more commit to add the legacy container log location support.

if err != nil {
return fmt.Errorf("failed to start container %q: %v", containerID, err)
}
// TODO: Should we stop the container if an error occurs during start.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. If not, dockershim/kubelet doesn't know there is an error happened.

@@ -269,7 +277,8 @@ func makeMounts(opts *kubecontainer.RunContainerOptions, container *api.Containe
// here we just add a random id to make the path unique for different instances
// of the same container.
cid := makeUID()
containerLogPath := path.Join(opts.PodContainerDir, cid)
containerLogPath := filepath.Join(opts.PodContainerDir, cid)
// TODO: We should try to use os interface here.
fs, err := os.Create(containerLogPath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace this with m.osInterface.Create?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small problem is that makeMounts is not a member function of m. We can easily change that, just don't know whether we should make this change just for the unit test.

@yujuhong
Copy link
Contributor

Reviewed 8 of 14 files at r1.
Review status: 8 of 15 files reviewed at latest revision, 3 unresolved discussions.


pkg/kubelet/dockershim/docker_container.go, line 240 at r1 (raw file):

      return "", "", fmt.Errorf("failed to inspect container %q: %v", containerID, err)
  }
  path, ok := info.Config.Labels[containerLogPathLabelKey]

If there is no label, maybe we can just return an empty path and don't error out?


Comments from Reviewable

@Random-Liu
Copy link
Member Author

pkg/kubelet/dockershim/docker_container.go, line 194 at r2 (raw file):

Previously, feiskyer (Pengfei Ni) wrote…

I think so. If not, dockershim/kubelet doesn't know there is an error happened.

It depends on how important we think the container log file is. For a container which is running but doesn't have corresponding container log file, do we consider it to be usable or not?

@yujuhong WDYT?


Comments from Reviewable

@Random-Liu
Copy link
Member Author

pkg/kubelet/dockershim/docker_container.go, line 240 at r1 (raw file):

Previously, yujuhong (Yu-Ju Hong) wrote…

If there is no label, maybe we can just return an empty path and don't error out?

It should be fine for `StartContainer`, because we just write the label in `CreateContainer`. However, for `RemoveContainer`, we should not error out, because we may want to remove old containers and old containers may not have labels. I'll update the PR to return empty path and deal with the empty path at the caller.

Comments from Reviewable

@yujuhong
Copy link
Contributor

pkg/kubelet/dockershim/docker_container.go, line 194 at r2 (raw file):

Previously, Random-Liu (Lantao Liu) wrote…

It depends on how important we think the container log file is.
For a container which is running but doesn't have corresponding container log file, do we consider it to be usable or not?

@yujuhong WDYT?

Let's do the best-effort cleanup and stop the container here.

Comments from Reviewable

@Random-Liu
Copy link
Member Author

pkg/kubelet/dockershim/docker_container.go, line 194 at r2 (raw file):

Previously, yujuhong (Yu-Ju Hong) wrote…

Let's do the best-effort cleanup and stop the container here.

ACK.

Comments from Reviewable

@Random-Liu
Copy link
Member Author

@feiskyer @yujuhong Addressed comments. PTAL~

@Random-Liu
Copy link
Member Author

@k8s-bot cri test this.

@k8s-github-robot k8s-github-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 20, 2016
@feiskyer
Copy link
Member

LGTM

}
// Create container log symlink.
if err := ds.createContainerLogSymlink(containerID); err != nil {
// Stop the container if the symlink creation fails.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Discussed with @Random-Liu offline. We'll not stop the container here as the shim has no context of how to stop a user container properly. Since not being able to create a symlink should be rare, the function will just return an error to let kubelet report an event.

@yujuhong yujuhong added this to the v1.5 milestone Oct 21, 2016
@yujuhong
Copy link
Contributor

@Random-Liu let's prioritize this one to unblock the unit test PR :-)

@Random-Liu
Copy link
Member Author

@yujuhong ACK. Will address this soon.

@Random-Liu
Copy link
Member Author

@yujuhong Addressed comments.

The change I made:

  1. Only gc pod log directories when all sources are ready.
  2. Do not stop container when fail to create container log symlink.

@Random-Liu
Copy link
Member Author

@k8s-bot cri test this.

@feiskyer
Copy link
Member

/lgtm

@k8s-github-robot k8s-github-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 22, 2016
@k8s-ci-robot
Copy link
Contributor

Jenkins verification failed for commit 4de3d4f2d7a83ca9369ec34754385816a1b6bf53. Full PR test history.

The magic incantation to run this job again is @k8s-bot verify test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 24, 2016
@Random-Liu
Copy link
Member Author

Rebase and sent again, apply LGTM based on #34858 (comment)

@Random-Liu Random-Liu added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 24, 2016
@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-ci-robot
Copy link
Contributor

Jenkins GCE etcd3 e2e failed for commit fad4672. Full PR test history.

The magic incantation to run this job again is @k8s-bot gce etcd3 e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 37dc74f into kubernetes:master Oct 24, 2016
@Random-Liu Random-Liu deleted the symlink-docker-log branch October 24, 2016 04:19
k8s-github-robot pushed a commit that referenced this pull request Oct 24, 2016
Automatic merge from submit-queue

CRI: Refactor kuberuntime unit test

Based on #34858

This PR:
1) Refactor the fake runtime service and some kuberuntime unit test.
2) Add better garbage collection unit test.
3) Fix init container unit test which isn't testing correctly. Some other unit tests may also need to be fixed.
4) Add pod log directory garbage collection unit test.

@feiskyer @yujuhong 
/cc @kubernetes/sig-node
k8s-github-robot pushed a commit that referenced this pull request Nov 2, 2016
Automatic merge from submit-queue

CRI: Add kuberuntime container logs

Based on #34858.

The first 2 commits are from #34858. And the last 2 commits are new.

This PR added kuberuntime container logs support and add unit test for it.

I've tested all the functions manually, and I'll send another PR to write a node e2e test for container log.

**_Notice: current implementation doesn't support log rotation**_, which means that:
- It will not retrieve logs in rotated log file.
- If log rotation happens when following the log:
  - If the rotation is using create mode, we'll still follow the old file.
  - If the rotation is using copytruncate, we'll be reading at the original position and get nothing.

To solve these issues, kubelet needs to rotate the log itself, or at least kubelet should be able to control the the behavior of log rotator. These are doable but out of the scope of 1.5 and will be addressed in future release.

@yujuhong @feiskyer @yifan-gu 
/cc @kubernetes/sig-node
@cyphar
Copy link

cyphar commented Apr 4, 2017

This is causing us issues with cri-o/cri-o#162. Has this problem been resolved, and if not where does the issue lie (the Docker or k8s side)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/kubelet area/kubelet-api lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants