updates fluentd in fluentd-es-image to fluentd 1.1.0 #58525

monotek · 2018-01-19T16:27:53Z

What this PR does / why we need it: Updates fluentd in fluentd-es-image to fluentd 1.1.0.

Its also needed to be able to use the new fluentd-elasticsearch helm chart from: helm/charts#3379

Release note:

updates fluentd in fluentd-es-image to fluentd 1.1.0

added configmap changes raised fluentd-es-configmap version fixed missing version match

coffeepac · 2018-01-19T16:54:58Z

/ok-to-test

coffeepac · 2018-01-19T16:56:08Z

@monotek you need to format your relases note comment correctly. The release note should be in the grey box when previewing the PR description

coffeepac · 2018-01-19T16:57:58Z

@monotek keeping up with new versions is always good but between this and the linked chart PR you are implying there is a concrete reason for this upgrade. Can you share the reason?

monotek · 2018-01-19T17:16:32Z

@coffepac
With the new fluentd versions its possible to use the "@type record_transformer" in fluentd.conf without needing to install the "record reformer" plugin as before.

That the reason we like to use the newer fluentd version in our setup and therefore its used for this pull too.

coffeepac · 2018-01-19T18:03:10Z

cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml

      path /var/log/containers/*.log
-      pos_file /var/log/es-containers.log.pos
+      pos_file /var/log/fluentd-containers.log.pos


is there a reason to change this? could have adverse effects if people are upgrading.

coffeepac · 2018-01-19T18:04:46Z

cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml

      format syslog
      path /var/log/startupscript.log
-      pos_file /var/log/es-startupscript.log.pos
+      pos_file /var/log/startupscript.log.pos


same as above, is there a reason to change this file name?

coffeepac · 2018-01-19T18:05:01Z

cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml

      format /^time="(?<time>[^)]*)" level=(?<severity>[^ ]*) msg="(?<message>[^"]*)"( err="(?<error>[^"]*)")?( statusCode=($<status_code>\d+))?/
      path /var/log/docker.log
-      pos_file /var/log/es-docker.log.pos
+      pos_file /var/log/docker.log.pos


file name change, same as before

coffeepac · 2018-01-19T18:06:10Z

cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml

      filters [{ "_SYSTEMD_UNIT": "docker.service" }]
-      pos_file /var/log/gcp-journald-docker.pos
+      #pos_file /var/log/journald-docker.pos


do we not need the position file here? What happens if the fluentd container gets restarted

EDIT: I'm guessing the block takes care of it. if so, best to delete the line then leave it commented out

Sorry for barging in 😄 but since you have read_from_head true, I don't think it's wise to remove the pos_file, this combination will result in fluentd reading all the events in the journal when it is restarted as there's no reference where it last stopped, this could lead to huge bursts of re-ingested logs.

pos_file is deprecated in the new plugin version. see: https://github.com/reevoo/fluent-plugin-systemd/blob/1e14ace403891b9aa745dc8a2a920103e5fda21e/README.md

pos_file

This parameter is deprecated and will be removed in favour of storage in v1.0.

Path to pos file, stores the journald cursor. File is created if does not exist.

storage

Configuration for a storage plugin used to store the journald cursor.

Upgrading from pos_file

If pos_file is specified in addition to a storage plugin with persistent set to true, the cursor will be copied from the pos_file on startup, and the old pos_file removed.

@monotek you're absolutely right, but should you consider the "upgrade" scenario? keeping the exisiting pos_file configuration will ensure the cursor is copied to the new storage.

Also, shouldn't there be a path in the <storage> section?

@dannyk81
Not necessarily. I've used the root_dir option. See: https://docs.fluentd.org/v1.0/articles/storage_local

This is the reason for the new system.conf file: https://github.com/monotek/kubernetes/blob/0aff9bcf0402179c2dd5d11da31fccf3b228b5a1/cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml#L9

Beside that you'll get deprecation warnings when you still use the pos files. See:
https://github.com/reevoo/fluent-plugin-systemd/blob/dbd7359c6a8ce7e3b7ad9cfa447c35a5ec2fde04/lib/fluent/plugin/in_systemd.rb#L18

So i thought its better to have clean log when fluentd is started.

coffeepac · 2018-01-19T18:09:33Z

cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml

-       disable_retry_limit
-       # Use multiple threads for processing.
-       num_threads 2
+      @id elasticsearch


thanks for fixing this spacing issue.

coffeepac · 2018-01-19T18:17:05Z

/approve no-issue

k8s-ci-robot · 2018-01-19T18:17:12Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: coffeepac, monotek

Associated issue requirement bypassed by: coffeepac

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

~~cluster/addons/fluentd-elasticsearch/OWNERS~~ [coffeepac]

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

k8s-github-robot · 2018-01-23T13:01:45Z

/test all [submit-queue is verifying that this PR is safe to merge]

fejta-bot · 2018-01-23T15:59:24Z

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

k8s-ci-robot · 2018-01-23T16:33:23Z

@monotek: The following test failed, say /retest to rerun them all:

Test name	Commit	Details	Rerun command
pull-kubernetes-e2e-kubeadm-gce-canary	`ec187f7`	link	`/test pull-kubernetes-e2e-kubeadm-gce-canary`

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

fejta-bot · 2018-01-23T19:08:25Z

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

StevenACoffman · 2018-01-23T20:03:03Z

@fejta that was a lot of recurrent failures before it actually succeeded. You're probably aware, but, in case you weren't. whew.

k8s-github-robot · 2018-01-23T21:17:33Z

Automatic merge from submit-queue (batch tested with PRs 56206, 58525). If you want to cherry-pick this change to another branch, please follow the instructions here.

Random-Liu · 2018-01-25T23:32:42Z

I believe the this PR broke the elastic logging test https://k8s-testgrid.appspot.com/google-gce#gci-gce-es-logging.

@kubernetes/sig-instrumentation-bugs

StevenACoffman · 2018-01-26T00:30:32Z

For reference, the equivalent daemonset version is being updated in this branch.

monotek · 2018-01-26T08:19:08Z

@Random-Liu Pretty sure this is because the needed 2.0.4 image is not yet available in gcr.io. See: https://console.cloud.google.com/gcr/images/google-containers/GLOBAL/fluentd-elasticsearch

monotek · 2018-01-31T08:21:09Z

Could somebody please trigger the 2.0.4 container image build?
@coffeepac ?

StevenACoffman · 2018-01-31T11:54:33Z

@piosz or @crassirostris can one of you two build and push the new 2.0.4 version?

monotek · 2018-01-31T12:05:10Z

By the way... Why is this not build automatically?

coffeepac · 2018-02-02T18:34:46Z

needs to be triggered by a googler, I am not a googler.

I've got a sig-instrumentation task to make this not-google specific. but then we get a bit tied up with the general issue of getting k8s images built without requiring a google eng to push the button.

monotek · 2018-02-04T09:40:33Z

@Random-Liu

Can you trigger the build?

Automatic merge from submit-queue (batch tested with PRs 56206, 58525). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. updates fluentd in fluentd-es-image to fluentd 1.1.0 **What this PR does / why we need it**: Updates fluentd in fluentd-es-image to fluentd 1.1.0. Its also needed to be able to use the new fluentd-elasticsearch helm chart from: helm/charts#3379 **Release note**: ```release-note updates fluentd in fluentd-es-image to fluentd 1.1.0 ```

monotek · 2018-02-13T18:18:07Z

Maybe we also have to wait for: #59665

monotek · 2018-02-14T11:38:53Z

The dockerfile was fixed in #59665.
Could somebody trigger the build again please?

crassirostris · 2018-02-14T11:42:44Z

@monotek done already, 2.0.4 is built and pushed

monotek · 2018-02-14T11:43:22Z

Thanks! :-)

Random-Liu · 2018-03-28T01:21:45Z

cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml

      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag raw.kubernetes.*
+      format json
      read_from_head true


Why do we remove the multi_format? Without this elasticsearch-fluentd won't work with containerd or cri-o.

monotek · 2018-03-28T06:15:46Z

Sorry. Wasn't aware of that.
I'm also likely not able to review / fix / test before beginning / mid of April.

Random-Liu · 2018-03-28T08:14:50Z

@monotek I sent out a fix #61818.

Automatic merge from submit-queue (batch tested with PRs 61818, 61800). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>. Add CRI container log format support back for elastic search. The CRI container log format support was removed accidentally in #58525. This PR adds that back. I've tested it, and it works: ``` SSSSS ------------------------------ [sig-instrumentation] Cluster level logging using Elasticsearch [Feature:Elasticsearch] should check that logs from containers are ingested into Elasticsearch /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/instrumentation/logging/elasticsearch/basic.go:39 [BeforeEach] [sig-instrumentation] Cluster level logging using Elasticsearch [Feature:Elasticsearch] /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:141 STEP: Creating a kubernetes client Mar 28 08:09:01.724: INFO: >>> kubeConfig: /home/lantaol/.kube/config STEP: Building a namespace api object Mar 28 08:09:02.952: INFO: No PodSecurityPolicies found; assuming PodSecurityPolicy is disabled. STEP: Waiting for a default service account to be provisioned in namespace [BeforeEach] [sig-instrumentation] Cluster level logging using Elasticsearch [Feature:Elasticsearch] /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/instrumentation/logging/elasticsearch/basic.go:32 [It] should check that logs from containers are ingested into Elasticsearch /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/instrumentation/logging/elasticsearch/basic.go:39 Mar 28 08:09:02.988: INFO: Checking the Elasticsearch service exists. Mar 28 08:09:03.025: INFO: Checking to make sure the Elasticsearch pods are running Mar 28 08:09:03.066: INFO: Checking to make sure we are talking to an Elasticsearch service. Mar 28 08:09:03.176: INFO: Checking health of Elasticsearch service. Mar 28 08:09:03.299: INFO: Starting repeating logging pod synthlogger STEP: Waiting for logs to ingest Mar 28 08:09:17.420: INFO: Sending a search request to Elasticsearch with the following query: kubernetes.pod_name:synthlogger AND kubernetes.namespace_name:e2e-tests-es-logging-pqlx7 Mar 28 08:09:27.420: INFO: Sending a search request to Elasticsearch with the following query: kubernetes.pod_name:synthlogger AND kubernetes.namespace_name:e2e-tests-es-logging-pqlx7 Mar 28 08:09:37.420: INFO: Sending a search request to Elasticsearch with the following query: kubernetes.pod_name:synthlogger AND kubernetes.namespace_name:e2e-tests-es-logging-pqlx7 Mar 28 08:09:47.420: INFO: Sending a search request to Elasticsearch with the following query: kubernetes.pod_name:synthlogger AND kubernetes.namespace_name:e2e-tests-es-logging-pqlx7 Mar 28 08:09:57.420: INFO: Sending a search request to Elasticsearch with the following query: kubernetes.pod_name:synthlogger AND kubernetes.namespace_name:e2e-tests-es-logging-pqlx7 Mar 28 08:10:07.420: INFO: Sending a search request to Elasticsearch with the following query: kubernetes.pod_name:synthlogger AND kubernetes.namespace_name:e2e-tests-es-logging-pqlx7 [AfterEach] [sig-instrumentation] Cluster level logging using Elasticsearch [Feature:Elasticsearch] /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/framework/framework.go:142 Mar 28 08:10:07.607: INFO: Waiting up to 3m0s for all (but 0) nodes to be ready STEP: Destroying namespace "e2e-tests-es-logging-pqlx7" for this suite. Mar 28 08:10:57.758: INFO: Waiting up to 30s for server preferred namespaced resources to be successfully discovered Mar 28 08:11:00.046: INFO: namespace: e2e-tests-es-logging-pqlx7, resource: bindings, ignored listing per whitelist Mar 28 08:11:00.338: INFO: namespace e2e-tests-es-logging-pqlx7 deletion completed in 52.693713026s • [SLOW TEST:118.614 seconds] [sig-instrumentation] Cluster level logging using Elasticsearch [Feature:Elasticsearch] /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/instrumentation/common/framework.go:23 should check that logs from containers are ingested into Elasticsearch /go/src/k8s.io/kubernetes/_output/dockerized/go/src/k8s.io/kubernetes/test/e2e/instrumentation/logging/elasticsearch/basic.go:39 ------------------------------ SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSMar 28 08:11:00.346: INFO: Running AfterSuite actions on all node Mar 28 08:11:00.346: INFO: Running AfterSuite actions on node 1 Ran 1 of 845 Specs in 123.981 seconds SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 844 Skipped PASS Ginkgo ran 1 suite in 2m4.323020647s Test Suite Passed 2018/03/28 08:11:00 process.go:152: Step './hack/ginkgo-e2e.sh --ginkgo.focus=Cluster\slevel\slogging\susing\sElasticsearch' finished in 2m5.943972428s 2018/03/28 08:11:00 e2e.go:83: Done ``` Mark 1.10, because this is a regression for CRI container runtimes in 1.10. The original support was added in 1.9. #54777 **Release note**: ```release-note none ```

André Bauer added 2 commits January 10, 2018 15:20

added fluent-plugin-detect-exceptions plugin to fluentd-es-image

5080ef2

added configmap changes raised fluentd-es-configmap version fixed missing version match

updated fluentd-es-image to use fluentd 1.1.0

e457d6c

k8s-github-robot assigned piosz and coffeepac Jan 19, 2018

k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 19, 2018

monotek mentioned this pull request Jan 19, 2018

added fluentd-elasticsearch helm/charts#3379

Merged

Merge branch 'master' into fluentd-1.1.0

ec187f7

k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 19, 2018

k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 19, 2018

k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 19, 2018

updated fluentd configmap with 1.1.0 compatible version

9ab59be

k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 19, 2018

updated iamge & configmap versions

cbfcc3f

coffeepac reviewed Jan 19, 2018

View reviewed changes

k8s-github-robot merged commit 5b9ba2f into kubernetes:master Jan 23, 2018

This was referenced Jan 24, 2018

Upgrade to Fluent-bit 1.x fabric8io/docker-fluentd-kubernetes#28

Open

Consider updating to Fluentd 1.x heptio/fluentd-yaml#3

Open

piosz removed their assignment Jan 25, 2018

Random-Liu mentioned this pull request Jan 25, 2018

ElasticSearch e2e test is red. #58837

Closed

Random-Liu reviewed Mar 28, 2018

View reviewed changes

Random-Liu mentioned this pull request Mar 28, 2018

Add CRI container log format support back for elastic search. #61818

Merged

monotek mentioned this pull request Dec 5, 2018

REQUEST: New membership for monotek kubernetes/org#288

Closed

6 tasks

updates fluentd in fluentd-es-image to fluentd 1.1.0 #58525

updates fluentd in fluentd-es-image to fluentd 1.1.0 #58525

Conversation

monotek commented Jan 19, 2018 • edited Loading

coffeepac commented Jan 19, 2018

coffeepac commented Jan 19, 2018

coffeepac commented Jan 19, 2018

monotek commented Jan 19, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coffeepac Jan 19, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

monotek Jan 20, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

coffeepac commented Jan 19, 2018

k8s-ci-robot commented Jan 19, 2018

k8s-github-robot commented Jan 23, 2018

fejta-bot commented Jan 23, 2018

k8s-ci-robot commented Jan 23, 2018 • edited Loading

fejta-bot commented Jan 23, 2018

StevenACoffman commented Jan 23, 2018

k8s-github-robot commented Jan 23, 2018

Random-Liu commented Jan 25, 2018 • edited Loading

StevenACoffman commented Jan 26, 2018

monotek commented Jan 26, 2018

monotek commented Jan 31, 2018

StevenACoffman commented Jan 31, 2018

monotek commented Jan 31, 2018

coffeepac commented Feb 2, 2018

monotek commented Feb 4, 2018

monotek commented Feb 13, 2018 • edited Loading

monotek commented Feb 14, 2018

crassirostris commented Feb 14, 2018

monotek commented Feb 14, 2018

Choose a reason for hiding this comment

monotek commented Mar 28, 2018 • edited Loading

Random-Liu commented Mar 28, 2018

monotek commented Jan 19, 2018 •

edited

Loading

coffeepac Jan 19, 2018 •

edited

Loading

monotek Jan 20, 2018 •

edited

Loading

k8s-ci-robot commented Jan 23, 2018 •

edited

Loading

Random-Liu commented Jan 25, 2018 •

edited

Loading

monotek commented Feb 13, 2018 •

edited

Loading

monotek commented Mar 28, 2018 •

edited

Loading