Getting logs from all containers into Logstash #139

Open
matrixik opened this issue Aug 2, 2017 · 8 comments
@matrixik
Member

matrixik commented Aug 2, 2017

Hi,

I'm investigating how to gather logs from all containers on the host machine.
Related: #121 @kornicameister

For Monasca output we will use: https://github.com/logstash-plugins/logstash-output-monasca_log_api

There is still one problem that I'm not sure how to handle: when multiple instances of the same service run on one machine, how do we distinguish them?
Docker probably emits some IDs with the logs.

Docker logging driver

This would be the fastest and easiest option.

Use a Docker logging driver:

  • Configure all services to log to STDOUT and STDERR.

Two ways:

  1. Use the Docker logging driver to send logs straight to Logstash (using the Graylog Extended Format, GELF, driver).
  2. Run a log agent in a container that gathers the logs from all other containers.
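
As a rough illustration of the first way, here is a minimal sketch that builds a `docker run` command using the GELF logging driver. The image name and Logstash host are placeholders (assumptions, not values from this repo), and 12201 is only the port conventionally used for GELF over UDP. The `tag` option uses Docker's log tag template so the receiver can tell containers apart.

```python
from typing import List


def docker_run_with_gelf(image: str, logstash_host: str, port: int = 12201) -> List[str]:
    """Build a `docker run` command that ships container logs via the GELF driver."""
    return [
        "docker", "run", "-d",
        "--log-driver", "gelf",
        "--log-opt", f"gelf-address=udp://{logstash_host}:{port}",
        # Tag template: container name and short ID, to tell instances apart.
        "--log-opt", "tag={{.Name}}/{{.ID}}",
        image,
    ]


if __name__ == "__main__":
    # "monasca/api" and "logstash.example.org" are hypothetical placeholders.
    print(" ".join(docker_run_with_gelf("monasca/api", "logstash.example.org")))
```

On the Logstash side this would presumably be paired with the gelf input plugin listening on the same port.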

Two problems with this approach:

  1. We would need to make sure all containers log to STDOUT and STDERR by default.
  2. By default the Docker logging driver emits logs line by line, so a multiline log is emitted as many single-line logs. The Docker devs don't plan or want to fix that: "log driver should support multiline" moby/moby#22920

There is also a Logstash issue, logstash-plugins/logstash-input-gelf#37, where someone requested an option to merge these logs into one message, but it doesn't look like anyone will fix it in the foreseeable future.

Some workarounds for this:

  1. Configure all services to emit logs in JSON format. I'm not sure whether Java applications can be configured that way.
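
For Python services, the JSON workaround could look roughly like the sketch below. It uses only the standard `logging` module (not oslo.log, which the Monasca services may actually use), and the field names are an assumption. The point is that a traceback ends up inside one JSON line, so the Docker logging driver cannot split it.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record, including any traceback, as one JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        event = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # The traceback contains newlines, but json.dumps escapes them,
            # so the emitted event still occupies a single line.
            event["traceback"] = self.formatException(record.exc_info)
        return json.dumps(event)


def build_logger(name: str, stream=sys.stdout) -> logging.Logger:
    """Return a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("demo")
    try:
        1 / 0
    except ZeroDivisionError:
        log.exception("division failed")
```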

It looks like monasca-transform writes its logs to a file, so we would need to change monasca-transform itself: https://github.com/openstack/monasca-transform/blob/master/monasca_transform/log_utils.py#L41

https://github.com/monasca/monasca-docker/blob/master/monasca-transform/Dockerfile#L7

Spark also logs to a file: https://github.com/monasca/monasca-docker/blob/master/spark/Dockerfile#L6

It looks like only the AWS CloudWatch driver supports multiline logs, and that won't be added to the STDOUT driver: moby/moby#30891

For the second way (a log agent in a container) we would need a way to find all containers on the host machine.
For this we could use https://github.com/gliderlabs/logspout, which attaches to all containers by default and, with plugins, sends all logs straight to Logstash (https://github.com/looplab/logspout-logstash).

Con: for now it only captures stdout and stderr.

Other approaches

We could use a Docker data volume.
Then we would need to configure all services to log to a file on a shared disk that the log agent would monitor.

Con: when several instances of the same service run on one host, each must be configured so that they don't all write to the same file.
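
One possible way around that con, sketched under assumptions: derive the file name from the container's hostname, which defaults to the short container ID unless it is overridden. The `/var/log/shared` mount point and the naming scheme are hypothetical.

```python
import os
import socket


def instance_log_path(service: str, log_dir: str = "/var/log/shared") -> str:
    """Return a log file path unique to this container instance."""
    # Inside a container the hostname defaults to the short container ID
    # unless it is overridden (for example with `docker run --hostname`).
    instance = socket.gethostname()
    return os.path.join(log_dir, f"{service}-{instance}.log")


if __name__ == "__main__":
    # "monasca-api" is only an example service name.
    print(instance_log_path("monasca-api"))
```

The log agent would then watch the whole directory and could recover the instance from the file name.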

You can read about other approaches here:
https://www.loggly.com/blog/top-5-docker-logging-methods-to-fit-your-container-deployment-strategy/

Do you have any thoughts?

@timothyb89 @mhoppal

@timothyb89
Member

So we've done a fair bit of work already to support centralized logging via docker logs to stdout and stderr. Just about every service now writes primarily to stdout/stderr except (as you've noted) monasca-transform.

The dirty secret there is that our spark and monasca-transform images are not being supported (by anyone from HPE at least) and would need to be improved a lot if we want to include them in our main docker-compose or helm environments. To be honest we'd probably be best off deleting them from the repo for now, especially since we've been using monasca-aggregator for our aggregation needs lately.

I think almost all of our components are capable of writing JSON logs via oslo.log or log4j2, but plaintext has been mostly sufficient for our needs. We'd welcome patches to add support but would prefer it to be toggleable.

@cheld

cheld commented Aug 3, 2017

So we've done a fair bit of work already to support centralized logging via docker logs to stdout and stderr

Perfect. All containers should log to stdout. Docker writes the logs to the host disk, and a log agent can collect them from that location.
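
A minimal sketch of what such an agent would read, assuming Docker's default json-file driver: each line of `<container-id>-json.log` under `/var/lib/docker/containers/<container-id>/` is a JSON object with `log`, `stream` and `time` keys. Taking the container ID from the directory name is also one way to distinguish several instances of the same service. The output field names are an assumption.

```python
import json
from typing import Dict, Iterable, Iterator


def read_json_file_log(lines: Iterable[str], container_id: str) -> Iterator[Dict[str, str]]:
    """Decode json-file driver entries and tag them with their container ID."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank lines
        entry = json.loads(raw)
        yield {
            "container_id": container_id,
            "stream": entry["stream"],
            "time": entry["time"],
            "message": entry["log"].rstrip("\n"),
        }


if __name__ == "__main__":
    # Hand-written sample entry in the json-file format; "abc123" is a made-up ID.
    sample = [
        '{"log":"starting api\\n","stream":"stdout","time":"2017-08-03T10:00:00.000000000Z"}',
    ]
    for event in read_json_file_log(sample, "abc123"):
        print(event)
```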

@matrixik
Member Author

matrixik commented Aug 3, 2017

@cheld In a perfect world, where every app sends only one event per line, yes. But I'm talking here about multiline logs (like Java or Python crash traces) that Docker splits, emitting one log entry per line of output.

@cheld

cheld commented Aug 3, 2017

@matrixik Docker does not interpret logs. It just writes them line by line to disk (AFAIK with some metadata). It is up to the log agent to combine multiple lines into a single Monasca API call, just as it would without Docker.
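
To make the "combine multiple lines" step concrete, here is a minimal sketch of how an agent could regroup lines into events. It assumes that every new event starts with a date such as `2017-08-03`; that pattern is an assumption and would differ per service.

```python
import re
from typing import Iterable, Iterator, List

# Assumption: a new event starts with an ISO-like date such as "2017-08-03".
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2}")


def merge_multiline(lines: Iterable[str], start: re.Pattern = EVENT_START) -> Iterator[str]:
    """Group continuation lines (e.g. tracebacks) with the event that precedes them."""
    buffer: List[str] = []
    for line in lines:
        line = line.rstrip("\n")
        if start.match(line) and buffer:
            # A new event begins, so flush the previous one.
            yield "\n".join(buffer)
            buffer = []
        buffer.append(line)
    if buffer:
        yield "\n".join(buffer)


if __name__ == "__main__":
    sample = [
        "2017-08-03 10:00:00 ERROR request failed",
        "Traceback (most recent call last):",
        "ZeroDivisionError: division by zero",
        "2017-08-03 10:00:01 INFO recovered",
    ]
    for event in merge_multiline(sample):
        print(repr(event))
```

A real agent tailing a live stream would also need a timeout to flush the last buffered event, since it cannot know whether more continuation lines are coming.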

@megastef

@matrixik please check Sematext Docker Agent. It parses multiline container logs out of the box, supports custom patterns to define the start of a new multiline message, has a library of log patterns for several popular images, and handles logs that are already structured as JSON. You can forward the logs to any Elasticsearch server or to hosted solutions like Sematext Cloud that support the Elasticsearch bulk indexing API. The agent can buffer logs to disk in case they can't be shipped to Elasticsearch.

@matrixik
Member Author

Thank you @megastef, I'll look into it when I find some time.

@cheld

cheld commented Sep 7, 2017

@matrixik one more comment: I assume the approach chosen should be compatible with Kubernetes deployments. In Kubernetes the container runtime can be exchanged. On the other hand, the logging location is most likely standardized by OCR. In GKE the logs are forwarded by parsing the Docker logs from disk (with fluentd; I don't know whether Google handles multiline logs).

@matrixik
Member Author

#228, which adds a working monasca log agent container, was merged. Multiline is not supported at this moment.
