Getting logs from all containers into Logstash #139
So we've done a fair bit of work already to support centralized logging via docker logs to stdout and stderr. Just about every service now writes primarily to stdout/stderr except (as you've noted) monasca-transform. The dirty secret there is that our spark and monasca-transform images are not being supported (by anyone from HPE at least) and would need to be improved a lot if we want to include them in our main docker-compose or helm environments. To be honest we'd probably be best off deleting them from the repo for now, especially since we've been using monasca-aggregator for our aggregation needs lately. I think almost all of our components are capable of writing JSON logs via oslo.log or log4j2, but plaintext has been mostly sufficient for our needs. We'd welcome patches to add support but would prefer it to be toggleable.
Perfect. All containers should log to stdout. Logs are written by Docker to the host disk. A log agent can be used to collect logs from this location.
@cheld In a perfect world where every app sends only one event per line, yes. But I'm talking here about multiline logs (like Java or Python crash tracebacks) that Docker splits up, emitting one log entry per line of output.
@matrixik Docker does not interpret logs. It just writes logs line by line to disk (AFAIK with some metadata). It is up to the log agent to combine multiple lines into a single Monasca API call, just as it would be without Docker.
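To illustrate the point, here is a minimal sketch (in Python) of how a log agent could re-join the line-by-line entries that Docker's json-file driver writes to disk. The continuation rule used below (indented lines and the `ValueError` line continue the previous event) is a hypothetical toy; real agents use configurable start-of-message patterns.

```python
import json

# Each line in /var/lib/docker/containers/<id>/<id>-json.log is one JSON
# object holding the raw "log" text, the "stream" (stdout/stderr), and a
# timestamp. A multiline traceback arrives as several such objects.
raw = [
    '{"log":"Traceback (most recent call last):\\n","stream":"stderr","time":"t0"}',
    '{"log":"  File \\"app.py\\", line 1\\n","stream":"stderr","time":"t1"}',
    '{"log":"ValueError: boom\\n","stream":"stderr","time":"t2"}',
    '{"log":"request handled\\n","stream":"stdout","time":"t3"}',
]

def merge_multiline(lines):
    """Join continuation lines onto the preceding event.

    The rule here (indentation or a known exception prefix marks a
    continuation) is a toy example, not a production pattern.
    """
    events = []
    for entry in map(json.loads, lines):
        text = entry["log"]
        is_continuation = events and (
            text.startswith((" ", "\t")) or text.startswith("ValueError")
        )
        if is_continuation:
            events[-1]["log"] += text
        else:
            events.append(entry)
    return events

events = merge_multiline(raw)
print(len(events))  # 2: the three traceback lines collapse into one event
```

The agent, not Docker, is responsible for this grouping, which is why the choice of agent (and its multiline support) matters.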
@matrixik please check Sematext Docker Agent: it parses multiline container logs out of the box, supports custom patterns to define the start of a new multiline message, has a library of log patterns for several popular images, and handles already-structured logs in JSON format. You could forward the logs to any Elasticsearch server, or to hosted solutions like Sematext Cloud that support the Elasticsearch bulk indexing API. The agent can buffer logs to disk in case they can't be shipped to Elasticsearch.
Thank you @megastef, I'll look into it when I find some time.
@matrixik one more comment: I assume the approach chosen should be compatible with Kubernetes deployments. In Kubernetes the container runtime can be exchanged; on the other hand, the logging location is most likely standardized by OCI. In GKE the logs are forwarded by parsing the Docker logs from disk with fluentd (I don't know if Google handles multiline logs).
#228, which adds a working monasca log agent container, was merged; multiline is not supported at the moment.
Hi,
I'm now investigating how to gather all logs from all containers on the host machine.
Related: #121 @kornicameister
For Monasca output we will use: https://github.com/logstash-plugins/logstash-output-monasca_log_api
There is still one problem I'm not sure how to handle: when we run multiple instances of the same service on one machine, how do we distinguish them? Docker probably emits some IDs with the logs.
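For reference, Docker's off-host logging drivers (gelf, syslog, fluentd) accept a `tag` option with template placeholders such as `{{.Name}}` and `{{.ID}}`, which should be enough to distinguish multiple instances of the same service. A docker-compose sketch under that assumption (service and image names are hypothetical):

```yaml
services:
  monasca-api:
    image: monasca/api:latest   # hypothetical image tag
    logging:
      driver: gelf
      options:
        gelf-address: "udp://logstash:12201"
        # {{.Name}} and {{.ID}} are expanded per container, so two
        # instances of the same service produce distinguishable tags.
        tag: "{{.Name}}/{{.ID}}"
```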
Docker logging driver
This would be the fastest and easiest option.
Use Docker logging driver:
Two ways:
Two problems with this one:
There is also an issue in Logstash (logstash-plugins/logstash-input-gelf#37) where someone requested an option to merge these logs into one message, but it doesn't look like anyone will fix it in the foreseeable future.
Some workarounds for this:
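One workaround I've seen suggested is re-assembling multiline messages on the Logstash side instead of in the input. A sketch assuming the gelf input and the (now-deprecated) multiline filter plugin, which can group events by a stream identity so lines from different containers are not mixed together (the pattern below, treating indented lines as continuations, is an assumption):

```
input {
  gelf {
    port => 12201
  }
}
filter {
  # Group continuation lines per container; the gelf driver attaches a
  # container_id field we can key on. Pattern is a toy example.
  multiline {
    stream_identity => "%{container_id}"
    pattern => "^\s"
    what => "previous"
  }
}
```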
It looks like monasca-transform emits logs to a file, so we would need to change monasca-transform itself: https://github.com/openstack/monasca-transform/blob/master/monasca_transform/log_utils.py#L41
https://github.com/monasca/monasca-docker/blob/master/monasca-transform/Dockerfile#L7
Spark also logs to a file: https://github.com/monasca/monasca-docker/blob/master/spark/Dockerfile#L6
It looks like only the AWS CloudWatch driver supports multiline logs, and multiline support won't be added to the stdout driver: moby/moby#30891
For this one we would need to have a way to get all containers on the host machine.
For this we could use https://github.com/gliderlabs/logspout, which attaches to all containers by default and, with plugins, can send all logs straight to Logstash (https://github.com/looplab/logspout-logstash).
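For the record, logspout is typically started as a container with the Docker socket mounted, roughly as below. The stock gliderlabs/logspout image does not include the Logstash adapter, so the `logstash+tcp` route assumes an image built with the logspout-logstash module; host and port are placeholders.

```
# Attach to all containers on the host and route their stdout/stderr
# to Logstash (requires an image built with logspout-logstash).
docker run -d --name logspout \
  -v /var/run/docker.sock:/var/run/docker.sock \
  gliderlabs/logspout \
  logstash+tcp://logstash:5000
```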
Con: For now it only captures stdout and stderr.
Other approaches
We could use a Docker data volume.
We would then need to configure all services to log to files on a shared disk that a log agent would monitor.
Con: multiple instances of the same service on one host must be configured not to write to the same file.
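A sketch of this shared-volume approach in docker-compose, with per-instance file names to address the con above (all service names, images, and paths are hypothetical):

```yaml
volumes:
  service-logs: {}

services:
  monasca-api:
    image: monasca/api:latest        # hypothetical
    volumes:
      - service-logs:/var/log/service
    # Each instance must write to a distinct file, e.g. by embedding the
    # container hostname in the name: /var/log/service/api-$HOSTNAME.log
  log-agent:
    image: some/log-agent            # hypothetical; e.g. a Logstash image
    volumes:
      - service-logs:/logs:ro        # the agent tails everything under /logs
```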
You can read about other approaches here:
https://www.loggly.com/blog/top-5-docker-logging-methods-to-fit-your-container-deployment-strategy/
Do you have any thoughts?
@timothyb89 @mhoppal