
Provide logging behavior policies applied by conmon to stdout/stderr #84

portante opened this issue Nov 14, 2019 · 18 comments

@portante
Today conmon performs a byte-for-byte capture of a container's stdout/stderr pipes, recording the data as a series of JSON documents in a log file. All bytes from both file descriptors are captured without discrimination, and conmon writes what it reads from both pipes to disk as fast as possible (though sequentially).

There is a natural back-pressure presented by conmon to the processes in the container writing to stdout/stderr, resulting from the rate at which the disk can accept write()s. It is usually the case that all conmon processes serving containers write to the same disk, which means it is possible for "noisy" containers to take up much of the available disk bandwidth, leading to unwanted impacts of one container on another.

Further, the rate at which containers write to disk is independent of the rate at which log file readers (or scrapers; e.g. fluentd, fluent-bit, rsyslog, syslog-ng, filebeat, etc.) can read. Because log file rotation is performed independently of the readers and of conmon on most platforms, it is possible for conmon to write more data than a reader can take in before a log file is deleted behind the reader's back, without the reader ever seeing that data.

These two problems suggest that logging behavior policies, applied by conmon to the log stream, would give administrators a way to address them.

There are three policy behaviors we can begin to consider, each applied to stdout/stderr independently:

  1. back-pressure (a rough sketch follows this list)
  • Given a rate specified in "bytes per interval", stop reading from the given pipe once the number of bytes read so far exceeds the limit for that interval, and resume reading at the beginning of the next interval
  • The interval could simply default to 1 second, and/or be configurable along with the number of bytes allowed per interval
  2. drop
  • Given a rate specified as for back-pressure, drop bytes read over the limit for the remainder of the interval, accepting them again at the start of the next interval
  3. ignore
  • Ignore all bytes read from a given pipe without writing them to disk.
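
Below is a minimal sketch, in C, of how the back-pressure and drop accounting might look in conmon's per-pipe read loop. The names (pipe_budget, POLICY_*, and the helper functions), the fixed one-second interval, and the overall structure are assumptions for illustration only, not existing conmon code or a committed design:

```c
/* Sketch only: per-pipe rate limiting with the three proposed policies. */
#include <stddef.h>
#include <stdint.h>
#include <time.h>

enum log_policy { POLICY_BACKPRESSURE, POLICY_DROP, POLICY_IGNORE };

struct pipe_budget {
    enum log_policy policy;
    uint64_t limit;          /* bytes allowed per 1-second interval, 0 = unlimited */
    uint64_t used;           /* bytes accounted to the current interval            */
    time_t   interval_start; /* start of the current interval                      */
};

/* Start a fresh interval once the wall clock moves to the next second. */
static void budget_tick(struct pipe_budget *b)
{
    time_t now = time(NULL);
    if (now != b->interval_start) {
        b->interval_start = now;
        b->used = 0;
    }
}

/* back-pressure: how many bytes may be read from the pipe right now.
 * Returning 0 means "do not read"; the kernel pipe buffer then fills
 * and the container's write()s block until the next interval. */
static size_t budget_may_read(struct pipe_budget *b, size_t want)
{
    budget_tick(b);
    if (b->policy != POLICY_BACKPRESSURE || b->limit == 0)
        return want;
    uint64_t left = b->limit > b->used ? b->limit - b->used : 0;
    return want < left ? want : (size_t)left;
}

/* drop / ignore: how many of the `n` bytes just read should be logged.
 * ignore logs nothing; drop logs only what still fits in the budget. */
static size_t budget_may_log(struct pipe_budget *b, size_t n)
{
    budget_tick(b);
    if (b->policy == POLICY_IGNORE)
        return 0;
    if (b->limit == 0) {
        b->used += n;
        return n;
    }
    uint64_t left = b->limit > b->used ? b->limit - b->used : 0;
    size_t logged = n < left ? n : (size_t)left;
    b->used += n;   /* all read bytes count against the interval budget */
    return logged;
}
```

The key difference between the two rate-limited policies is where the decision is made: back-pressure limits how much is read (so the writer eventually blocks on the full pipe), while drop reads everything and limits how much is logged.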

This will allow an administrator to apply a policy to all conmon processes. Since conmon processes don't "know" about each other, coordination of the setting of that policy would have to be managed externally to conmon.

It is possible that conmon processes could periodically poll for configuration changes, or accept some sort of signal to re-evaluate the change.
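
If the signal route were taken, a conventional shape for it would be a handler that only sets a flag which the main loop checks. SIGHUP, the flag name, and the idea of re-reading an externally managed policy are assumptions here, not existing conmon behavior:

```c
/* Sketch only: accept a signal as a request to re-evaluate the logging policy. */
#include <signal.h>

static volatile sig_atomic_t reload_requested = 0;

static void on_sighup(int sig)
{
    (void)sig;
    reload_requested = 1;   /* only set a flag; do the real work in the main loop */
}

static void install_reload_handler(void)
{
    struct sigaction sa = { 0 };
    sa.sa_handler = on_sighup;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGHUP, &sa, NULL);
}

/* In the main poll() loop:
 *   if (reload_requested) { reload_requested = 0; /* re-read the policy */ }
 * where the re-read step consults however the externally managed
 * settings are delivered (a file, an environment refresh, etc.). */
```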

@rhatdan
Member

rhatdan commented Nov 17, 2019

@giuseppe Could you see about adding these flags?

@giuseppe
Member

I've a few questions.

  • Is ignore the equivalent of drop with a bytes-per-interval=0?

  • Are these limits supposed to be changed at runtime? e.g. can I change bytes-per-interval while the container is running?

  • How will these settings work for Podman/CRI-O? e.g. for CRI-O: should it be done globally or in an annotation?

@mheon
Member

mheon commented Nov 18, 2019 via email

@rhatdan
Member

rhatdan commented Nov 18, 2019

I've a few questions.

* Is `ignore` the equivalent of `drop` with a bytes-per-interval=0?

@portante WDYT?

* Are these limits supposed to be changed at runtime?  e.g. can I change bytes-per-interval while the container is running?

No, these should be constant for the container.

* How will these settings work for Podman/CRI-O?  e.g. for CRI-O: should it be done globally or in an annotation?

CRI-O would need to use annotations until this gets support in upstream Kubernetes.
I agree with @mheon that Podman would use --log-opt.

@haircommander
Collaborator

* How will these settings work for Podman/CRI-O?  e.g. for CRI-O: should it be done globally or in an annotation?

CRI-O would need to use annotations until this gets support in upstream Kubernetes.
I agree with @mheon that Podman would use --log-opt.

While using annotations in CRI-O to set this would work, I'm not sure anyone would use it (us included) before it makes it upstream. conmon would still have to log in the old format to conform, and thus we'd have to maintain concurrent logging (the new way and the old way). It may make the most sense to implement this for Podman, and propose it in the CRI, before we do work in CRI-O that may not be used or really desired.

@portante
Author

Is ignore the equivalent of drop with a bytes-per-interval=0?

Certainly it could be implemented that way, but I was hoping we could make it so that, at the --log-opt interface level, drop with bytes-per-interval=0 is rejected. This way the intention is explicit instead of implicit: you want one of three behaviors, ignore (no logs at all), drop (logs if within bounds), or back-pressure (all logs, with a maximum bandwidth).

The other behavior we need is to be sure an SRE has the option to set the maximum bandwidth in a way the user cannot increase, while still allowing users to pick the behavior.

@mffiedler

An example of the behavior described by @portante is in https://bugzilla.redhat.com/show_bug.cgi?id=1741955: container logging rates and the container runtime's log rotation policy can cause the logs to outstrip the log scraper's ability to keep up.

@syedriko

syedriko commented Dec 9, 2019

I'm adding two new log options to podman CLI in my POC, passed as part of --log-opt:

policy=backpressure|drop|ignore|passthrough

Peter has covered the first three; passthrough is the current behavior with unrestricted logging.

rate-limit=RATE
Limit the transfer to a maximum of RATE bytes per second. A suffix of "K", "M", "G", or "T" can be
added to denote kibibytes (*1024), mebibytes, and so on.
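
To make the proposed interface concrete, invocations might look like the following sketch. It is based only on the option names above; the image name is a placeholder, and the exact syntax may change as the POC evolves:

```console
# Hypothetical invocations using the proposed options (subject to change):
podman run --log-opt policy=backpressure --log-opt rate-limit=1M myimage
podman run --log-opt policy=drop --log-opt rate-limit=512K myimage
podman run --log-opt policy=ignore myimage
podman run --log-opt policy=passthrough myimage   # current unrestricted behavior
```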

@rhatdan
Member

rhatdan commented Dec 9, 2019

@syedriko @portante Could one of you open a PR to add this feature?

@syedriko

syedriko commented Dec 9, 2019

@rhatdan A WIP PR: #92
Another PR in libpod for the CLI changes: containers/podman#4663

@rhatdan
Member

rhatdan commented Dec 9, 2019

@syedriko I mean a PR to containers/common to add these limits.

@bparees

bparees commented Dec 9, 2019

What happens in the back-pressure case? Do the logs continue to accumulate in the pipe/buffer indefinitely (assuming the log rate never drops)? Can this cause memory consumption issues? Whose memory budget does that come out of?

@haircommander
Collaborator

Whose memory budget does that come out of?

By default, CRI-O puts conmon in the system.slice, but OCP ships with it in the pod slice, so it'd be charged to the pod

@bparees

bparees commented Dec 9, 2019

so it'd be charged to the pod

that sounds like the right place for it, so that's good.

@syedriko

syedriko commented Dec 9, 2019

what happens in the back-pressure case?

When over output quota in a particular time period, container log writes will block on a pipe that conmon reads from until the end of that time period. There won't be log accumulation.
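
For anyone wondering where the un-read bytes live (per the memory question above): once conmon stops reading, data accumulates only up to the kernel pipe buffer's capacity, which defaults to 64 KiB on Linux, and any further write() from the container simply blocks. A standalone illustration of that blocking behavior, unrelated to conmon itself:

```c
/* Illustration only: with no reader draining the pipe, write() blocks
 * once the kernel pipe buffer (64 KiB by default on Linux) is full. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    char chunk[4096];
    size_t total = 0;

    if (pipe(fds) != 0)
        return 1;
    memset(chunk, 'x', sizeof(chunk));

    for (;;) {
        fprintf(stderr, "wrote %zu bytes so far\n", total);
        /* Blocks forever once roughly 64 KiB are queued in the pipe. */
        if (write(fds[1], chunk, sizeof(chunk)) < 0)
            return 1;
        total += sizeof(chunk);
    }
}
```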

@portante
Author

portante commented Dec 9, 2019

what happens in the back-pressure case?

When over output quota in a particular time period, container log writes will block on a pipe that conmon reads from until the end of that time period. There won't be log accumulation.

One complication is that I think the logging rate should be specified separately for stdout and stderr. We should not lump the two together.

@syedriko

syedriko commented Dec 9, 2019

what happens in the back-pressure case?

When over output quota in a particular time period, container log writes will block on a pipe that conmon reads from until the end of that time period. There won't be log accumulation.

One complication is that I think the logging rate should be specified separately for stdout and stderr. We should not lump the two together.

We can surely do that. With the PRs that are currently in flight, I'm aiming to build a minimal but end-to-end working implementation that we can play with, pick apart, and make sure makes sense.

@syedriko

@syedriko I mean a PR to containers/common to add these limits.

@rhatdan Could you give me a bit more detail? I'm new to the containers code base, and I'm not quite seeing how containers/common fits into what I'm doing.
