Bug 1554293 - logging-eventrouter event not formatted correctly in El… #1357
|
/test logging |
|
WIP - adding a test case... |
|
/test logging |
|
Unfortunately, the current fix is incomplete. It does not pass my to-be-added test cases. :( |
test/eventrouter.sh
Outdated
# Make sure there's no MUX
oc set env ds/logging-fluentd MUX_CLIENT_MODE- 2>&1 | artifact_out
oc label node --all logging-infra-fluentd-
oc label node --all logging-infra-fluentd=true
The oc set env has the effect of restarting fluentd. If that is sufficient in this case, you can get rid of the oc label node calls and add something like the following to wait for fluentd to be up and running. The usual idiom is: label the node with logging-infra-fluentd- to shut off fluentd, then wait for the daemonset numberReady to be 0. Next, edit the fluentd ds, configmap, or whatever else needs to change for fluentd. Finally, relabel the node and wait for a fluentd pod to be running:
# undeploy fluentd
oc label node --all logging-infra-fluentd- 2>&1 | artifact_out
os::cmd::try_until_text "oc get daemonset logging-fluentd -o jsonpath='{ .status.numberReady }'" "0" $FLUENTD_WAIT_TIME
# oc set env, edit ds, whatever
oc label node --all logging-infra-fluentd=true 2>&1 | artifact_out
os::cmd::try_until_text "oc get pods -l component=fluentd" "^logging-fluentd-.* Running "
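The "wait until the output matches" step in the idiom above can be sketched as a small standalone polling loop. This is only an illustration of the pattern, not the harness's os::cmd::try_until_text: the function name wait_for_text, its argument order, and its use of seconds for the timeout are all assumptions made for this sketch, and the real helper's time units and matching rules may differ.

```shell
#!/bin/bash
# Hypothetical stand-in for a "try until text" helper (NOT the real
# os::cmd::try_until_text). Polls a command until its output matches
# an extended regex or the timeout expires.
# Usage: wait_for_text CMD REGEX [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
wait_for_text() {
    local cmd="$1" pattern="$2" timeout="${3:-60}" interval="${4:-1}"
    local deadline=$(( SECONDS + timeout ))
    local output
    while true; do
        # Capture stdout and stderr; a failing command just means "not ready yet".
        output="$(eval "$cmd" 2>&1)" || true
        if grep -Eq -- "$pattern" <<<"$output"; then
            return 0
        fi
        # Give up once the deadline has passed.
        if (( SECONDS >= deadline )); then
            return 1
        fi
        sleep "$interval"
    done
}

# Example (assumes a cluster with the logging-fluentd daemonset):
# wait_for_text "oc get daemonset logging-fluentd -o jsonpath='{ .status.numberReady }'" '^0$' 120
```

The loop always runs the command at least once, so a condition that already holds returns immediately without sleeping.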
|
Finally, the eventrouter test is not running: |
|
/lgtm |
|
Thank you, @richm . I pushed a new patch based on your suggestion. (I just wanted to make sure Fluentd is restarted with a valid MUX_CLIENT_MODE...) Regarding the eventrouter test not being enabled, I thought "-e openshift_logging_install_eventrouter=True" was passed to ansible... Where can I enable it? |
https://github.com/openshift/aos-cd-jobs/tree/master/sjb/config/test_cases - but it is tricky - I think if you re-enable |
Indeed, it's tricky... I'd like to enable the eventrouter test in 3.9 and above. To do so, am I supposed to uncomment "# -e openshift_logging_install_eventrouter=True" and move it to the right place in all these files? (I don't see 39 or older yml files in the aos-cd-jobs/sjb/config/test_cases directory.) Thanks. |
yes |
|
@richm , eventrouter was disabled with this comment. I wonder whether the bug has already been fixed?
|
I think so - I think it was an ansible version problem |
Thank you, @richm !! I've submitted this PR. |
|
/retest |
|
bot, retest this please |
|
/test logging |
…asticsearch when using MUX
When fluentd is configured as a collector and MUX, event logs from the event
router need to be processed by MUX, not by the collector fluentd, in both the
MUX_CLIENT_MODE=maximal and minimal cases. This is because if an event log is
formatted in the collector (note: the event record is put under the kubernetes
key), the log is then forwarded to MUX and passed to the k8s-meta plugin, which
overwrites the existing kubernetes record.
To avoid this overwrite, if the log is from the event router, the tag is
rewritten to ${tag}.raw in input-post-forward-mux.conf, so that the log is
handled the MUX_CLIENT_MODE=minimal way.
There was another bug, in ansible: the environment variable TRANSFORM_EVENTS
was not set in MUX even if openshift_logging_install_eventrouter was set to
true. This PR fixes the issue:
openshift/openshift-ansible#10207
In the Fluentd run.sh, "process_kubernetes_events false" is set in the
filter-viaq-data-model plugin to suppress processing of the event logs. It is
_not_ set, i.e., event router log processing is enabled, when TRANSFORM_EVENTS
is true and the fluentd is either standalone (no MUX configured) or MUX itself.
Correct the order: shut down fluentd, reset MUX_CLIENT_MODE, then restart fluentd.
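The run.sh rule described in the commit message above (when "process_kubernetes_events false" is or is not emitted) can be sketched as a small shell function. This is a hedged illustration, not the code from run.sh: the function name viaq_events_setting and the role labels standalone, mux, and mux-client are invented for this sketch, and how the real script detects its role is not shown in this conversation.

```shell
#!/bin/bash
# Hypothetical sketch of the decision only (NOT the actual run.sh code).
# Args: $1 = TRANSFORM_EVENTS value ("true" or anything else)
#       $2 = role label, invented for this sketch:
#            standalone (no MUX configured) | mux | mux-client
# Prints the line that suppresses event processing, or nothing when
# event router log processing should stay enabled.
viaq_events_setting() {
    local transform_events="${1:-false}" role="${2:-standalone}"
    if [[ "$transform_events" == "true" ]] &&
       [[ "$role" == "standalone" || "$role" == "mux" ]]; then
        # Processing enabled: leave the plugin setting unset.
        :
    else
        # TRANSFORM_EVENTS is not true, or this fluentd is a MUX client.
        echo "process_kubernetes_events false"
    fi
}

viaq_events_setting true mux-client   # prints: process_kubernetes_events false
```

Keeping the "enabled" case as the only branch that emits nothing mirrors the commit message: suppression is the default, and processing is turned on only for standalone fluentd or MUX with TRANSFORM_EVENTS set to true.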
/lgtm
|
Thanks, @richm . Removed the hold label. |
|
/cherrypick release-3.11 |
|
@nhosoi: new pull request created: #1374 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/cherrypick release-3.10 |
|
@nhosoi: new pull request created: #1375 In response to this:
|
/cherrypick release-3.9 |
|
@nhosoi: new pull request created: #1376 In response to this:
…asticsearch when using MUX
Disable process_kubernetes_events if TRANSFORM_EVENTS is false or if fluentd is a MUX client.
Depends on openshift/openshift-ansible#10207