Filebeat does not recover when it fails to connect with k8s API #14164

Open
GreenKnight15 opened this issue Oct 21, 2019 · 2 comments

Comments

GreenKnight15 commented Oct 21, 2019

I am using the Filebeat Elastic Helm chart under an Istio service mesh. Filebeat is configured to pull Kubernetes metadata using the add_kubernetes_metadata processor. As of Filebeat 7.4.0, with the new k8s client (#13630, https://github.com//pull/13051), Filebeat starts faster than the Istio sidecar, which blocks outbound requests to the k8s API. After that failure Filebeat never recovers the k8s connection, and I lose all k8s metadata on my log packets from that point forward. I would expect the k8s client or Filebeat to attempt to re-establish the connection, so that only a small number of logs would be missing their k8s metadata.

Looks related to #13081

Error logs:

2019-10-17T20:33:29.733Z ERROR kubernetes/util.go:85 kubernetes: Querying for pod failed with error: Get https://10.100.0.1:443/api/v1/namespaces/bootstrap/pods/bootstrap-filebeat-x-pj8n9: dial tcp 10.100.0.1:443: connect: connection refused
E1017 20:33:29.734464 1 reflector.go:125] github.com/elastic/beats/libbeat/common/kubernetes/watcher.go:235: Failed to list *v1.Pod: Get https://10.100.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Dlocalhost&limit=500&resourceVersion=0: dial tcp 10.100.0.1:443: connect: connection refused

I would share the Istio logs for this outbound request, but since Istio never started before Filebeat, the request was never captured.

Filebeat container configuration:

      - type: container
        enabled: true
        paths:
          - /var/lib/docker/containers/*/*.log
        stream: all
        processors:
        # This adds all of the Kubernetes properties to the log packets, i.e. pod_name, namespace, image, etc.
        - add_kubernetes_metadata: {}
        # This rename processor moves K8s pods that use the old label convention 'kubernetes.labels.app' to the new convention 'kubernetes.labels.app.kubernetes.io/name'.
        # This is necessary to avoid index pattern conflicts in Kibana, which result in missing logs.
        # If a pod ever has both the old and new conventions, 'app' will not be renamed and will appear in Kibana as 'kubernetes.labels.app.value'.
        - rename:
            when:
              and:
                - has_fields: ['kubernetes.labels.app']
                - not:
                    has_fields: ['kubernetes.labels.app.kubernetes.io/name']
            fields:
              - from: 'kubernetes.labels.app'
                to: 'kubernetes.labels.app_kubernetes_io/name'
            ignore_missing: true
            fail_on_error: false
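
To make the rename concrete, here is a hypothetical event (a pod carrying only the old-style 'app' label, with the made-up value 'myapp') before and after the processors above run:

# Before (only the old convention is present, so the rename condition matches):
kubernetes:
  labels:
    app: myapp

# After the rename:
kubernetes:
  labels:
    app_kubernetes_io/name: myapp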

For confirmed bugs, please report:

# Install Istio: https://istio.io/docs/setup/install/helm/
# Inject namespace
kubectl label namespace default istio-injection=enabled
# Add the Elastic Helm repo
helm repo add elastic https://helm.elastic.co
# Install Filebeat
helm install --name filebeat elastic/filebeat --namespace default

Filebeat will start before the Istio sidecar, which causes the initial outbound k8s requests to fail.
It is not an issue with the Istio configuration; it still happens even if mTLS and RBAC are non-restrictive.

ephill commented Dec 20, 2019

Has anyone come up with a solid workaround for this? The only options I can think of are really hacky modifications to the official Filebeat chart to prevent it from starting too early, or not using Istio on my Filebeat pods. Neither feels like a good option.

GreenKnight15 (Author) commented Dec 20, 2019

I have not found a solid solution. I'm still using a hacky modification to the Filebeat container command to sleep before startup, which gives the sidecar time to start. It works most of the time, but I'm not happy about it. Haven't heard anything from Elastic :(
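
For reference, a minimal sketch of that sleep workaround, assuming the DaemonSet's filebeat container command can be overridden; the 30-second delay is an arbitrary placeholder, not a tuned value:

containers:
  - name: filebeat
    image: docker.elastic.co/beats/filebeat:7.4.0
    command:
      - /bin/sh
      - -c
      # Arbitrary delay so the Istio sidecar can come up before Filebeat
      # makes its first request to the Kubernetes API.
      - "sleep 30 && exec filebeat -e"

This only papers over the startup race; the underlying problem of the k8s client never retrying the connection remains.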
