
CRI: proposal for managing container stdout/stderr streams #34376

Merged 1 commit into kubernetes:master on Nov 22, 2016

Conversation

@yujuhong (Contributor) commented Oct 8, 2016

The scope of this proposal is limited to the stdout/stderr log streams of the
containers.

This addresses #30709



@yujuhong (Contributor, Author) commented Oct 8, 2016

/cc @kubernetes/sig-node

@yujuhong added the release-note-none and sig/node labels Oct 8, 2016
@k8s-github-robot added the kind/design, size/L, and retest-not-required-docs-only labels Oct 8, 2016
@k8s-ci-robot (Contributor) commented:

Jenkins Kubemark GCE e2e failed for commit 18e7a0d. Full PR test history.

The magic incantation to run this job again is @k8s-bot kubemark e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot (Contributor) commented:

Jenkins GCI GCE e2e failed for commit 18e7a0d. Full PR test history.

The magic incantation to run this job again is @k8s-bot gci gce e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

and ends with a newline.

```
2016-10-06T00:17:09.669794202Z The content of the log entry
```
Member:

Do we need to distinguish between stdout and stderr?

@euank (Contributor):

For the kubectl logs feature it doesn't matter, for log aggregators it definitely could

@yujuhong (Contributor, Author):

Yes, kubectl logs looks at combined output, but we can definitely distinguish the two streams for log aggregation. I'll modify the proposal to include that.

@lucab (Contributor) commented Oct 11, 2016:

How is this supposed to interact with the container-app being attached to a TTY? It is my understanding that you can't actually get split stdout/stderr and a TTY at the same time, see moby/moby#19696.

@yujuhong (Contributor, Author):

Added the stream type.

@lucab I think that's a limitation of the underlying implementation. For now, maybe we can do what docker does and report only stdout?
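
With the stream type included (as adopted later in this thread), a log entry would presumably look like the following; the line is illustrative only:

```
2016-10-06T00:17:09.669794202Z stdout The content of the log entry
```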



1. **Allow log collectors to integrate with kubernetes without knowledge of
the underlying container runtime.**
smarterclayton (Contributor):

Is this a requirement? It's certainly nice to have, but I would have phrased this much weaker:

"Allow log collectors to integrate with Kubernetes across different container run times while preserving efficient storage and retrieval"

@yujuhong (Contributor, Author) commented Oct 10, 2016:

I am OK with making it a weaker statement. I thought about categorizing some of them as nice-to-haves, but was worried that it'd complicate the proposal. It's an important factor nonetheless.

@yujuhong (Contributor, Author):

Changed.

2. **Allow kubelet to manage the lifecycle of the logs to pave the way for
better disk management in the future.** This implies that the lifecycle
of containers and their logs need to be decoupled.
3. **Provide ways for CRI-compliant runtimes to support all existing logging
Contributor:

This would probably be #1 requirement in an ordered list.

@yujuhong (Contributor, Author):

Sure, if we need to rank them. I'd still prefer the solution that addresses all three requirements/nice-to-haves.

@yujuhong (Contributor, Author):

Done.

mandate the logging format, assuming runtime already includes plugins for
various logging formats. Unfortunately, this is not true for existing runtimes
such as docker, which supports log retrieval only for a very limited number of
log drivers [2]. On the other hand, the downside is that we would be enforcing
Contributor:

Is this really a critical problem? If supporting more log drivers in docker is important, we can always support that.

@yujuhong (Contributor, Author) commented Oct 10, 2016:

This is just to point out that runtime support for log drivers is far from perfect today, and it's not good enough to simply say "the runtime already supports log drivers, why can't we reuse that?"

defining complex logging API.

There are also possibilities to implement the current proposal by imposing the
log file paths, while leveraging the runtime to access and/or manage logs. This
Contributor:

I'm not sure that this proposal makes a really strong argument against this (which I would consider the "default" path because it leverages the existing runtime support).

@euank (Contributor):

There are a few other things to consider:

  1. Log collection should not require going through a daemon - A process like fluentd should be able to collect logs without having to rely on another daemon being up. This implies that logs must be a set of files at a known location in a format known to log aggregators.
  2. Rotation policies might vary by container -- I don't think any runtime today supports log rotation at any reasonable granularity; each runtime would have to independently develop that feature.
  3. Logs persisting and being rotated after the container is removed - If a runtime is specifically only a container runtime, decoupling the lifetime of container+logs while having the runtime manage it will result in either the runtime needing a first-class concept of logs separate from containers (I don't think any have one), or the shim having to manage the logs itself... and if the shim has to do it, then it might as well be kubelet.

I originally was in favor of having the runtimes manage everything, but if we add all of the above requirements, I think it's a good argument against it. We can question how much we actually need most of those requirements still though, since several of them are quite forward-looking.

@yujuhong (Contributor, Author):

+1 to @euank's comment, which explains this more clearly than the statement in the proposal.

TBH, I am not completely against leveraging the runtime for various log operations. However,

  1. We are not sure what we want yet, since disk management is still in its early stages.
  2. Container runtimes such as docker also do not provide good support for logging yet. Even if we were to improve that (or hack docker internals in the shim), it'd take longer.

Instead of committing resources to do this right now, I think the current proposal gives us room to figure things out without having to develop and commit to a premature API.

@yujuhong (Contributor, Author) commented Oct 10, 2016:

> We can question how much we actually need most of those requirements still though, since several of them are quite forward-looking.

I think I did this on purpose. I wanted to see if we can try finding a solution that addresses all requirements first. If, instead, sig-node thinks that some requirements can be dropped, it's perfectly reasonable to question the requirements themselves, and convince people who are in favor of the requirements.

the underlying container runtime.**

2. **Allow kubelet to manage the lifecycle of the logs to pave the way for
better disk management in the future.** This implies that the lifecycle
Contributor:

This also implies the kubelet must be able to rotate logs, and thus must be able to work with the given format.

@yujuhong (Contributor, Author):

Another component can be set up to rotate the log if needed. I think the proposal discusses more about the logging formats later. Let me know if you find that insufficient.
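
As a rough sketch of what such an external rotation component might do, here is a copy-then-truncate rotation in Go; the log path, archive naming, and retention are assumptions for illustration only, not part of the proposal. Copy-then-truncate lets the runtime keep writing to the same open file descriptor.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"time"
)

// copyTruncate archives the current contents of a container log file and then
// truncates the live file in place, so the writer (the runtime) can keep its
// file descriptor open. Bytes written between the copy and the truncate may be
// lost, which is the usual caveat of this approach.
func copyTruncate(path string) error {
	src, err := os.Open(path)
	if err != nil {
		return err
	}
	defer src.Close()

	archive := fmt.Sprintf("%s.%s", path, time.Now().UTC().Format("20060102T150405Z"))
	dst, err := os.Create(archive)
	if err != nil {
		return err
	}
	defer dst.Close()

	if _, err := io.Copy(dst, src); err != nil {
		return err
	}
	return os.Truncate(path, 0)
}

func main() {
	// Hypothetical per-container log path; the real location is whatever
	// kubelet ends up dictating.
	if err := copyTruncate("/var/log/pods/<pod-uid>/<container>_0.log"); err != nil {
		fmt.Fprintln(os.Stderr, "rotate:", err)
	}
}
```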

Member:

I think this is an orthogonal concern... So long as it's not strict control.

@coffeepac (Contributor) commented:

Reviewed 1 of 1 files at r1.
Review status: all files reviewed at latest revision, 7 unresolved discussions, some commit checks failed.


docs/proposals/kubelet-cri-logging.md, line 35 at r1 (raw file):

## Goals and non-goals

Container Runtime Interfacej(CRI) is an ongoing project to allow container

small typo in "Interfacej(CRI)", there's a 'j' in there that probably shouldn't be.


docs/proposals/kubelet-cri-logging.md, line 116 at r1 (raw file):

Previously, smarterclayton (Clayton Coleman) wrote…

Is this a requirement? It's certainly nice to have, but I would have phrased this much weaker:

"Allow log collectors to integrate with Kubernetes across different container run times while preserving efficient storage and retrieval"

I am in favor of the wording of the original requirement. I'd like to be able to write a collector that interfaces with kubernetes, not 'kubernetes on docker' or 'kubernetes on rocket'. I see where you're coming from in that you want to lock down the most important parts of the interface but I'd like the interface to not leak.

docs/proposals/kubelet-cri-logging.md, line 175 at r1 (raw file):

Previously, euank (Euan Kemp) wrote…

For the kubectl logs feature it doesn't matter, for log aggregators it definitely could

For log aggregation it does matter: either separate files with stderr/stdout in the name, or an additional err/out field in the file. To make 'kubectl logs' work easily, it's probably easier to add a field.



Because kubelet determines where the logs are stored and can access them
directly, this meets requirement (1). As for requirement (2), the log collector
can easily extract all necessary metadata (e.g., pod UID, container name) from

Reviewer:

"...all necessary metadata..." — I would say this is misleading, given that for the OpenShift aggregated logging solution we consider the pod labels 'necessary' but are unable to 'access' them without a call to the API server.

Contributor:

Agreed. Could we have a unique directory per pod (each being symlinked into that known path), where that directory contains a metadata file with the pod's name, id, annotations, and the same for its containers?

@yujuhong (Contributor, Author):

I'll reword it as "basic pod metadata". I think it's reasonable to dump a metadata file in the directory. I was going to add it here, but forgot to do so earlier.
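
Purely as an illustration of what such a metadata file could carry (the field names and layout are hypothetical; the proposal does not define a schema), here is a Go sketch that emits one:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// PodLogMetadata is a hypothetical shape for a per-pod metadata file dropped
// alongside the logs; the actual fields would be whatever kubelet decides to
// record (pod name, UID, annotations, container names, ...).
type PodLogMetadata struct {
	PodName     string            `json:"podName"`
	PodUID      string            `json:"podUID"`
	Namespace   string            `json:"namespace"`
	Annotations map[string]string `json:"annotations,omitempty"`
	Containers  []string          `json:"containers,omitempty"`
}

func main() {
	md := PodLogMetadata{
		PodName:     "nginx-example",
		PodUID:      "pod-uid-placeholder",
		Namespace:   "default",
		Annotations: map[string]string{"example.io/log-policy": "default"},
		Containers:  []string{"nginx"},
	}
	out, _ := json.MarshalIndent(md, "", "  ")
	fmt.Println(string(out))
}
```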

@yujuhong (Contributor, Author):

Changed the wording, and added future work for writing a metadata file.

@euank (Contributor) commented Oct 10, 2016

cc @lucab, references your attach work as a path for rkt to integrate with this, and any insight you have would be appreciated!

I wouldn't be surprised if this comes up in sig-node tomorrow :)

In the short term, the ongoing docker-CRI integration [3] will support the
proposed solution only partially by (1) creating symbolic links for kubelet
to access, but not manage the logs, and (2) add support for json format in
kubelet. A more sophisticated solution that either involves using a custom plugin
Contributor:

Is this saying that, for the short term, we tolerate logging formats other than the one proposed above?

@yujuhong (Contributor, Author):

Yes. This helps bridge the transition.
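
Roughly, the interim docker-CRI symlink step could look like the sketch below; the source is docker's default json-file location, while the kubelet-visible path is a placeholder for whatever layout the proposal settles on.

```go
package main

import (
	"log"
	"os"
)

func main() {
	// Source: the file written by docker's default json-file log driver.
	dockerLog := "/var/lib/docker/containers/<container-id>/<container-id>-json.log"
	// Target: a hypothetical kubelet-visible location; kubelet can read the
	// logs through the link but does not manage the underlying file.
	kubeletPath := "/var/log/pods/<pod-uid>/<container-name>_0.log"

	if err := os.Symlink(dockerLog, kubeletPath); err != nil {
		log.Fatal(err)
	}
}
```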

@yifan-gu (Contributor) commented:

Thank you @yujuhong 👍
I am ok with sticking to one logging format defined by kubelet if that's doable.

@lucab @euank Does that mean we could make journald in rkt stage1 as optional?

@thockin (Member) commented Oct 11, 2016

I am OK with this proposal. I understand all the arguments, and I think dithering any further is a waste of energy. @yujuhong do you want to make more edits, or do you think it is good? @smarterclayton are your questions advisory or a request to halt a merge?

@thockin (Member) commented Oct 11, 2016

wrt streaming through the runtime:

a) Can we get away with that FOR NOW?
b) Is it going to help us define and stabilize the overall design in the short term?
c) How expensive is it to pay off later?

as a mid-term solution.

For rkt, there is a pending PR to enable systemd to redirect stdout/stderr to
user-provided streams [4].
@lucab (Contributor):

A bit more context on this:

For rkt, the implementation will rely on providing external file descriptors for stdout/stderr
to applications via systemd [4]. Those streams are currently managed by a journald
sidecar, which collects stream outputs and stores them in the journal file of the pod.
This will be replaced by a custom sidecar which can produce logs in the format expected
by this specification and can handle clients attaching as well.

@yujuhong (Contributor, Author):

Thanks for the explanation. Included this in the proposal.

@euank (Contributor) commented Oct 11, 2016

LGTM.

From the rkt side, we can implement this as @lucab commented, or by having some process act as a journald "remote" endpoint and have it transform logs to this format. The benefit of the latter is simply that the existing machined / journald integration would still work, but if you're using Kubernetes it's less important to have that and more important to have generic on-node tooling to work with pod logs (out of scope of this issue).

Until the implementation work is done rkt side, it's likely we'll want to have a runtime-specific workaround in kubelet.


## Goals and non-goals

Container Runtime Interfacej(CRI) is an ongoing project to allow container

Reviewer:

nit: typo: j

@feiskyer (Member) commented:

Please run hack/update-munge-docs.sh

@thockin added the lgtm label Oct 19, 2016

**Log format**

The runtime should decorate each log entry with a RFC 3339 timestamp prefix,
Member:

s/RFC 3339/RFC 3339Nano

Found this during coding.
@yujuhong

@yujuhong (Contributor, Author):

fixed
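
For reference, a minimal Go sketch of parsing an entry in the agreed format (RFC 3339Nano timestamp, stream type, then the content, newline-terminated), assuming single spaces separate the three fields:

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// LogEntry is one parsed container log line: timestamp, stream ("stdout" or
// "stderr"), and the raw content.
type LogEntry struct {
	Timestamp time.Time
	Stream    string
	Content   string
}

func parseLine(line string) (LogEntry, error) {
	parts := strings.SplitN(strings.TrimSuffix(line, "\n"), " ", 3)
	if len(parts) != 3 {
		return LogEntry{}, fmt.Errorf("malformed log line: %q", line)
	}
	ts, err := time.Parse(time.RFC3339Nano, parts[0])
	if err != nil {
		return LogEntry{}, err
	}
	return LogEntry{Timestamp: ts, Stream: parts[1], Content: parts[2]}, nil
}

func main() {
	e, err := parseLine("2016-10-06T00:17:09.669794202Z stdout The content of the log entry\n")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s [%s] %s\n", e.Timestamp.Format(time.RFC3339Nano), e.Stream, e.Content)
}
```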

**Log format**

The runtime should decorate each log entry with a RFC 3339 timestamp prefix,
the stream type (i.e., "stdout" or "stderr"), and ends with a newline.
Contributor:

What about when the container was created with a tty?

In that case, there's no way to distinguish stdout from stderr (per discussion in 32869).

If you look at docker's json-log file for a container started with -t, it will only have stdout items.

Should we specify that here, or assume a tty'd container just shouldn't produce any logs (they'll often have escape sequences and such regardless)?

cc @lucab

@Random-Liu (Member) commented Nov 3, 2016:

I think we may want to assume that container created with tty should only generate logs with stream type "stdout". (As you mentioned, docker behaves like this)

Contributor:

If we do, we should document this.

But I also think it could make sense to not support logs for that case, because a container with a tty is likely to emit terminal control characters or be used for reading/writing large chunks of data (per @smarterclayton's use case for attach).

Member:

> I think we may want to assume that container created with tty should only generate logs with stream type "stdout". (As you mentioned, docker behaves like this)

+1. Only one stream is supported because of the tty.

> But I also think it could make sense to not support logs for that because a container with a tty is likely to emit terminal control characters or be used for reading/writing large chunks of data.

It's better to escape those terminal control characters instead of not supporting logs.
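
For instance, a small Go sketch of escaping control characters before logging (illustrative only; not claiming this is what kubelet or any runtime actually does):

```go
package main

import (
	"fmt"
	"strconv"
)

func main() {
	// A line captured from a tty-backed container may carry terminal control
	// sequences; strconv.Quote escapes non-printable bytes into a readable form.
	raw := "\x1b[31mred text\x1b[0m\r\n"
	fmt.Println(strconv.Quote(raw))
}
```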

@euank (Contributor) commented Nov 3, 2016:

Alternate option would be a special tty type. This could have value for log collectors being able to decide to ingress it or not, or for knowing when to play back as text to a terminal vs raw for any tool akin to kubectl logs

@yujuhong (Contributor, Author):

FYI, I created a new issue for this: #37247

@k8s-ci-robot added the cncf-cla: yes label Nov 21, 2016
@yujuhong added this to the v1.5 milestone Nov 21, 2016
@k8s-github-robot removed the lgtm label Nov 21, 2016
@yujuhong (Contributor, Author) commented:

@k8s-bot bazel test this

@yujuhong added the lgtm label Nov 21, 2016
@k8s-ci-robot (Contributor) commented:

Jenkins Bazel Build failed for commit 398c82a4142e4ec7a533ed1352ce90f973fc1180. Full PR test history.

The magic incantation to run this job again is @k8s-bot bazel test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot (Contributor) commented:

Jenkins verification failed for commit 398c82a4142e4ec7a533ed1352ce90f973fc1180. Full PR test history.

The magic incantation to run this job again is @k8s-bot verify test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-ci-robot (Contributor) commented:

Jenkins kops AWS e2e failed for commit 398c82a4142e4ec7a533ed1352ce90f973fc1180. Full PR test history.

The magic incantation to run this job again is @k8s-bot kops aws e2e test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@k8s-github-robot removed the lgtm label Nov 21, 2016
@yujuhong added the lgtm label Nov 22, 2016
@yujuhong (Contributor, Author) commented:

@k8s-bot e2e test this

@k8s-github-robot commented:

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit 9322304 into kubernetes:master Nov 22, 2016
@yujuhong yujuhong deleted the cri_logs branch March 13, 2017 17:41