Bug 1740263: enable the parsing of docker log-driver=json-file continuation lines using fluent concat plugin #1723
Conversation
This commit allows fluentd to reconstruct partial lines written by the docker json-file and journald log drivers when logs exceed the 16K byte limit.

There are two new environment variables:

* `USE_MULTILINE_JSON` - false by default. If you run `oc set env ds/logging-fluentd USE_MULTILINE_JSON=true`, then fluentd will be able to reconstruct docker json-file partial logs.
* `USE_MULTILINE_JOURNAL` - false by default. If you run `oc set env ds/logging-fluentd USE_MULTILINE_JOURNAL=true`, then fluentd will be able to reconstruct docker journald partial logs.

For json-file logs, the `log` field ends in `\n` for the final part of the log, and does not end in `\n` for starting and continuation lines. For journald logs, the field `CONTAINER_PARTIAL_MESSAGE=true` is present for starting and continuation lines, but is omitted for final lines.

fluent-plugin-concat 2.4.0 was backported to work with ruby 2.0 and fluentd 0.12. The main feature was the ability to specify only `multiline_end_regexp` without `multiline_start_regexp`, which is required for docker json-file log support. The partial_key support for journald was already there for cri-o.

The wrinkle with journald is that _all_ records to be reconstructed must have the `CONTAINER_PARTIAL_MESSAGE` field, so a filter was added to set `CONTAINER_PARTIAL_MESSAGE=false` for container log records which did not already have the `CONTAINER_PARTIAL_MESSAGE` field, in order to make the concat filter work for partial_key.
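The json-file reassembly rule described above can be sketched in a few lines of Python. This is an illustration of the rule only, not the plugin code: chunks are buffered until a record arrives whose `log` field ends in `\n`.

```python
import json

def reconstruct(lines):
    """Join docker json-file partial records into complete log lines.

    Docker writes each chunk of an over-long line as a JSON object whose
    "log" field ends in "\n" only on the final chunk.
    """
    buffer = []
    for raw in lines:
        record = json.loads(raw)
        buffer.append(record["log"])
        if record["log"].endswith("\n"):  # final chunk of the message
            yield "".join(buffer)
            buffer = []
    if buffer:  # flush an unterminated tail, if any
        yield "".join(buffer)

# Simulated partial records, as the json-file driver would write them:
chunks = [
    '{"log": "part one, "}',
    '{"log": "part two, "}',
    '{"log": "the end\\n"}',
]
print(list(reconstruct(chunks)))  # ['part one, part two, the end\n']
```

This is the same condition the `multiline_end_regexp /\n$/` setting expresses in the concat filter configuration.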
If you want to try this out without building the image, you can follow these steps.

Hack fluent.conf like this:

```
#@include configs.d/dynamic/input-docker-*.conf
<source>
  @type tail
  @id docker-input
  @label @ingress
  path "/var/log/containers/*.log"
  pos_file "/var/log/es-containers.log.pos"
  time_format %Y-%m-%dT%H:%M:%S.%N%Z
  tag kubernetes.*
  format json
  keep_time_key true
  read_from_head "true"
  exclude_path []
  @label @concat
</source>
<label @concat>
  <filter kubernetes.**>
    @type concat
    key log
    multiline_end_regexp /\n$/
  </filter>
  <match kubernetes.**>
    @type relabel
    @label @ingress
  </match>
</label>
...
<label @ingress>
  ## filters
  @include configs.d/openshift/filter-pre-*.conf
  <filter journal>
    @type record_modifier
    <record>
      ignoreme ${if record.key?("CONTAINER_ID_FULL") && !record.key?("CONTAINER_PARTIAL_MESSAGE"); record["CONTAINER_PARTIAL_MESSAGE"] = "false"; end; "ignoreme"}
    </record>
    remove_keys ignoreme
  </filter>
  <filter journal>
    @type concat
    key MESSAGE
    separator ""
    stream_identity_key CONTAINER_ID_FULL
    partial_key CONTAINER_PARTIAL_MESSAGE
    partial_value true
  </filter>
```

Create a special configmap for the plugin code:

```
mkdir cm-fluentd-plugin
oc get pods -l component=fluentd
fpod=logging-fluentd-xxx
for file in $( oc exec $fpod -- ls /etc/fluent/plugin ) ; do
  oc exec $fpod -- cat /etc/fluent/plugin/$file > cm-fluentd-plugin/$file
done
cp cm-fluentd-plugin/filter_concat.rb cm-fluentd-plugin/filter_concat.rb.orig
cp /path/to/new/filter_concat.rb cm-fluentd-plugin/filter_concat.rb
oc create configmap fluentd-plugin --from-file=cm-fluentd-plugin/
```

Then, add the volume and volumeMount to the fluentd daemonset:

```
oc edit ds/logging-fluentd
```

Add to volumeMounts and volumes:

```
volumeMounts:
- mountPath: /etc/fluent/plugin
  name: fluentd-plugin
  readOnly: true
...
volumes:
- configMap:
    defaultMode: 420
    name: fluentd-plugin
  name: fluentd-plugin
```

Restart fluentd:

```
oc delete pods -l component=fluentd
```

You may see errors like this in the fluentd log:

```
/etc/fluent/plugin/viaq_docker_audit.rb:51: warning: already initialized constant Fluent::ViaqDockerAudit::ENV_HOSTNAME
```

You can ignore them.

add support for USE_MULTILINE_JOURNAL

If USE_MULTILINE_JOURNAL=true, then docker log-driver=journald logs that are spread over multiple records using CONTAINER_PARTIAL_MESSAGE will be concatenated together as a single record.

bug fixes

dump indices upon error
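As a plain-Python illustration of what the journald `partial_key` configuration above achieves (assumed behavior for illustration, not the plugin source): records whose `CONTAINER_PARTIAL_MESSAGE` is `"true"` are buffered per `CONTAINER_ID_FULL` (the `stream_identity_key`), and the first record without it, which the record_modifier filter rewrites to `"false"`, completes the message.

```python
def concat_journald(records):
    """Reassemble journald partial records into complete messages."""
    buffers = {}  # keyed by CONTAINER_ID_FULL (the stream identity)
    for rec in records:
        cid = rec["CONTAINER_ID_FULL"]
        if rec.get("CONTAINER_PARTIAL_MESSAGE") == "true":
            # starting/continuation line: buffer it
            buffers.setdefault(cid, []).append(rec["MESSAGE"])
        else:
            # final line: flush the buffer, joined with separator ""
            parts = buffers.pop(cid, [])
            parts.append(rec["MESSAGE"])
            yield cid, "".join(parts)

# Simulated journald records for one container:
records = [
    {"CONTAINER_ID_FULL": "abc", "MESSAGE": "first ",
     "CONTAINER_PARTIAL_MESSAGE": "true"},
    {"CONTAINER_ID_FULL": "abc", "MESSAGE": "second ",
     "CONTAINER_PARTIAL_MESSAGE": "true"},
    {"CONTAINER_ID_FULL": "abc", "MESSAGE": "last"},
]
print(list(concat_journald(records)))  # [('abc', 'first second last')]
```

This also shows why the record_modifier filter is needed: the concat filter's partial_key handling expects every record in the stream to carry the field, so records that never had it are given `CONTAINER_PARTIAL_MESSAGE=false`.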
@richm: This pull request references a valid Bugzilla bug. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/test json-file
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nhosoi, richm

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing
/retest Please review the full test history for this PR and help us cut down flakes.
@richm: All pull requests linked via external trackers have merged. The Bugzilla bug has been moved to the MODIFIED state. In response to this:
Hi, unfortunately this fails due to the require statement of the v2.4.0 filter_concat.rb, which is for fluentd v0.14+.

The fluentd pods are in CrashLoopBackOff with this log:
Best regards
Correct, which is why we had to create our own special version of filter_concat.rb, backported from 2.4.0 but working on fluentd 0.12: https://github.com/openshift/origin-aggregated-logging/blob/release-3.11/fluentd/lib/filter_concat/lib/filter_concat.rb
Thank you, Pods are running fine now. |
Tested with the following log lines:
==> working.
Regarding the fluent config, I think the first label to INGRESS is not needed since we always run through CONCAT?
Best regards
This is the CI test for multiline support: https://github.com/openshift/origin-aggregated-logging/blob/release-3.11/test/docker_multiline.sh

It creates a file with a length > 16384 bytes. It then creates a test namespace and runs a test pod that writes the contents of this file as the

If some output test is not working, please first confirm that you can see the logs using

re: the
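To see why the 16384-byte threshold matters in that CI test, here is a small Python sketch of the split the log driver performs on an over-long line (the actual chunking happens inside docker; this just illustrates the arithmetic):

```python
# docker's json-file and journald log drivers split any line longer than
# 16K into multiple partial records; only the final record terminates
# the logical line.
LIMIT = 16384  # docker's per-record limit in bytes

line = "x" * (LIMIT + 100)  # one logical log line, longer than the limit
chunks = [line[i:i + LIMIT] for i in range(0, len(line), LIMIT)]
print(len(chunks))  # 2: one full 16384-byte chunk plus a 100-byte tail
```

A test file just over the limit therefore produces exactly two partial records, which is the minimal case the concat filter must stitch back together.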