
EFK: Add logs aggregation layer based on fluentd #51

Closed
ricsanfre opened this issue Jun 20, 2022 · 5 comments · Fixed by #59
Labels: enhancement (New feature or request)
ricsanfre commented Jun 20, 2022

Enhancement Request

Add logs aggregation layer to logging architecture. From this layer logs can be aggregated, filtered and routed to different destinations to further processing (elasticsearch, kafka, s3, etc.)

[Figure: log aggregation architecture diagram]

source: Common architecture patterns with fluentd and fluentbit

Implementation details

The log aggregation layer can be based on fluentd or fluentbit. Both can be used as log forwarders and log aggregators (see the fluentbit documentation); the only difference is the number of available plugins (input, output, etc.).

Fluentbit does not support a Kafka input plugin (only output), while fluentd supports Kafka integration as both input and output. Fluentd should therefore be the right choice for the log aggregation layer, in case the logging architecture evolves in the future to include a Kafka cluster as a buffering mechanism between log forwarders and log aggregators.
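As a sketch of how the aggregator could publish logs to a Kafka topic, using the `kafka2` output from fluent-plugin-kafka (the broker address and topic name below are assumptions, not part of the current architecture):

```
<match **>
  # Requires fluent-plugin-kafka installed in the aggregator image
  @type kafka2
  # Hypothetical broker address and topic
  brokers kafka.picluster.svc:9092
  default_topic logs
  <format>
    @type json
  </format>
  <buffer topic>
    # File buffer so pending chunks survive output errors
    @type file
    path /fluentd/log/kafka-buffer
    flush_interval 3s
  </buffer>
</match>
```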

[Figure: Kafka as log buffering]

source: One Year of Log Management at Vinted

Changes to the current logging architecture:

  • Fluentbit collectors need to be reconfigured to forward logs to the fluentd aggregator.
  • Fluentd needs to be deployed in the kubernetes cluster, not as a daemonset but as a deployment. See an example of how to do it here.
  • The fluentd aggregator needs to be configured to forward logs to elasticsearch.
  • The fluentd aggregator can also be configured to forward logs to a Kafka topic.
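For the first change, a minimal fluentbit forward output on the collectors could look like this (the `fluentd` service name is an assumption about how the aggregator is exposed inside the cluster):

```
[OUTPUT]
    # Send all collected records to the fluentd aggregator
    Name   forward
    Match  *
    Host   fluentd
    Port   24224
```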
@ricsanfre ricsanfre added the enhancement New feature or request label Jun 20, 2022
@ricsanfre ricsanfre added this to the backlog milestone Jun 20, 2022
ricsanfre commented

Fluentd also needs to be configured to export Prometheus metrics. See the [fluentd documentation](https://docs.fluentd.org/monitoring-fluentd/monitoring-prometheus).
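Following the fluentd documentation, a sketch of the metrics configuration (requires fluent-plugin-prometheus; port 24231 matches the container port used in the deployment below):

```
# Expose a /metrics endpoint for Prometheus scraping
<source>
  @type prometheus
  bind 0.0.0.0
  port 24231
  metrics_path /metrics
</source>
# Collect fluentd internal metrics
<source>
  @type prometheus_monitor
</source>
# Collect metrics about output plugins (buffer sizes, retries, etc.)
<source>
  @type prometheus_output_monitor
</source>
```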

ricsanfre commented Jul 1, 2022

About exposing fluentd forwarder service to collect logs outside the cluster

Make the fluentd forwarder port available from outside the cluster for collecting logs coming from external hosts (i.e. the gateway) and remove the current exposure of the ES service.
Communications with the exposed fluentd service must be secured: TLS needs to be enabled on the exposed service to encrypt the communications, and an authentication mechanism needs to be activated.
Within the cluster, Linkerd is already encrypting inter-pod communications, but an authentication mechanism must still be provided between the fluentbit forwarders and the fluentd aggregator.

For receiving logs from outside the cluster, TLS needs to be enabled anyway. The TLS certificate can be automatically generated by cert-manager.

<source>
  @type forward
  port 24224
  bind 0.0.0.0
  <transport tls>
    cert_path /fluentd/certs/tls.crt
    private_key_path /fluentd/certs/tls.key
  </transport>
  <security>
    self_hostname fluentd-aggregator
    shared_key s1cret0
  </security>
</source>
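The matching client-side configuration on the fluentbit forwarders could look like the following sketch (the CA file path and `Self_Hostname` value are assumptions; the shared key must match the one configured on the aggregator):

```
[OUTPUT]
    Name           forward
    Match          *
    Host           fluentd.picluster.ricsanfre.com
    Port           24224
    Self_Hostname  fluentbit
    Shared_Key     s1cret0
    tls            on
    tls.verify     on
    # CA used to sign the aggregator certificate (path is an assumption)
    tls.ca_file    /fluent-bit/certs/ca.crt
```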

ricsanfre commented Jul 1, 2022

About TLS certificate generation and loading in the fluentd pod

  1. Generate the fluentd TLS certificate with cert-manager, using the custom cluster CA.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: fluentd-tls
  namespace: k3s-logging
spec:
  # Secret names are always required.
  secretName: fluentd-tls
  duration: 2160h # 90d
  renewBefore: 360h # 15d
  commonName: fluentd.picluster.ricsanfre.com
  isCA: false
  privateKey:
    algorithm: ECDSA
    size: 256
  usages:
    - server auth
    - client auth
  # At least one of a DNS Name, URI, or IP address is required.
  dnsNames:
    - fluentd.picluster.ricsanfre.com
  # ClusterIssuer: ca-issuer.
  issuerRef:
    name: ca-issuer
    kind: ClusterIssuer
    group: cert-manager.io

cert-manager will create a TLS Secret:

apiVersion: v1
kind: Secret
metadata:
  name: fluentd-tls
  namespace: k3s-logging
data:
  tls.crt: base64 encoded cert
  tls.key: base64 encoded key
type: kubernetes.io/tls
  2. That certificate can be mounted in the fluentd pod as a volume at /fluentd/certs

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        app: fluentd
      name: fluentd
      namespace: k3s-logging
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: fluentd
      template:
        metadata:
          labels:
            app: fluentd
        spec:
          containers:
          - image: "{{ efk_fluentd_aggregator_image }}"
            imagePullPolicy: Always
            name: fluentd
            env:
              # Elastic operator creates elastic service name with format cluster_name-es-http
              - name:  FLUENT_ELASTICSEARCH_HOST
                value: efk-es-http
                # Default elasticsearch default port
              - name:  FLUENT_ELASTICSEARCH_PORT
                value: "9200"
              # Elasticsearch user
              - name: FLUENT_ELASTICSEARCH_USER
                value: "elastic"
              # Elastic operator stores elastic user password in a secret
              - name: FLUENT_ELASTICSEARCH_PASSWORD
                valueFrom:
                  secretKeyRef:
                    name: "efk-es-elastic-user"
                    key: elastic
              # Setting a index-prefix for fluentd. By default index is logstash
              - name:  FLUENT_ELASTICSEARCH_INDEX_NAME
                value: fluentd
              - name: FLUENT_ELASTICSEARCH_LOG_ES_400_REASON
                value: "true"
            ports:
            - containerPort: 24224
              name: forward
              protocol: TCP
            - containerPort: 24231
              name: prometheus
              protocol: TCP
            volumeMounts:
            - mountPath: /fluentd/etc
              name: config
              readOnly: true
            - mountPath: "/fluentd/certs"
              name: fluentd-tls
              readOnly: true
          volumes:
          - configMap:
              defaultMode: 420
              name: fluentd-config
            name: config
          - name: fluentd-tls
            secret:
              secretName: fluentd-tls

@ricsanfre ricsanfre modified the milestones: backlog, release 1.4 Jul 19, 2022
ricsanfre commented

About production ready forwarder/aggregator configuration

  1. Fluentbit and fluentd filesystem buffering mechanisms should be enabled.

  2. The fluentd aggregator should be deployed in HA: a Kubernetes deployment with several replicas. Kubernetes HPA (Horizontal Pod Autoscaler) could be configured to automatically scale the number of replicas.

  3. Fluentd could be deployed as a Statefulset instead of a Deployment, with a dedicated PVC for the disk buffer. This way, if a pod is terminated, the buffered data is not lost.
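For point 1 on the fluentd side, a sketch of a file-based buffer section on the elasticsearch output (the buffer path, sizes and intervals are assumptions to be tuned):

```
<match **>
  @type elasticsearch
  host efk-es-http
  port 9200
  <buffer>
    # Persist chunks to disk so they survive a pod restart when backed by a PVC
    @type file
    path /fluentd/log/buffer
    flush_thread_count 2
    flush_interval 5s
    retry_type exponential_backoff
    retry_max_interval 30
    chunk_limit_size 8MB
    total_limit_size 512MB
    # Block input instead of dropping records when the buffer is full
    overflow_action block
  </buffer>
</match>
```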

ricsanfre commented

About the use of the official fluentd helm chart

The official fluentd helm chart also supports deploying fluentd as a deployment or statefulset instead of a daemonset. When deployed as a deployment, HPA is also supported.

values.yml could be something like this:

# Deploy fluentd as deployment
kind: "Deployment"
# Number of replicas
replicaCount: 1
# Enabling HPA
autoscaling:
  enabled: true
  minReplicas: 1
  maxReplicas: 100
  targetCPUUtilizationPercentage: 80

# Do not create serviceAccount, RBAC and podSecurityPolicy objects
serviceAccount:
  create: false
rbac:
  create: false
podSecurityPolicy:
  enabled: false

## Additional environment variables to set for fluentd pods
env:
  ...

# Volumes and VolumeMounts (only configuration files and certificates)
volumes:
- name: etcfluentd-main
  configMap:
    name: fluentd-main
    defaultMode: 0777
- name: etcfluentd-config
  configMap:
    name: fluentd-config
    defaultMode: 0777
- name: fluentd-tls
  secret:
    secretName: fluentd-tls

volumeMounts:
- name: etcfluentd-main
  mountPath: /etc/fluent
- name: etcfluentd-config
  mountPath: /etc/fluent/config.d/
- mountPath: /fluentd/certs
  name: fluentd-tls
  readOnly: true

service:
  type: "ClusterIP"
  annotations: {}
  # loadBalancerIP:
  # externalTrafficPolicy: Local
  ports:
  - name: "forwarder"
    protocol: TCP
    containerPort: 24224
  - name: prometheus
    containerPort: 24231
    protocol: TCP

## Fluentd list of plugins to install
##
plugins: []
# - fluent-plugin-out-http

## Add fluentd config files from K8s configMaps
##
configMapConfigs:
  - fluentd-prometheus-conf
# - fluentd-systemd-conf

## Fluentd configurations:
##
fileConfigs:
  01_sources.conf: |-
    ## logs from podman
    <source>
      @type tail
      @id in_tail_container_logs
      @label @KUBERNETES
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
      <parse>
        @type multi_format
        <pattern>
          format json
          time_key time
          time_type string
          time_format "%Y-%m-%dT%H:%M:%S.%NZ"
          keep_time_key false
        </pattern>
        <pattern>
          format regexp
          expression /^(?<time>.+) (?<stream>stdout|stderr)( (.))? (?<log>.*)$/
          time_format '%Y-%m-%dT%H:%M:%S.%NZ'
          keep_time_key false
        </pattern>
      </parse>
      emit_unmatched_lines true
    </source>
  02_filters.conf: |-
    <label @KUBERNETES>
      <match kubernetes.var.log.containers.fluentd**>
        @type relabel
        @label @FLUENT_LOG
      </match>
      # <match kubernetes.var.log.containers.**_kube-system_**>
      #   @type null
      #   @id ignore_kube_system_logs
      # </match>
      <filter kubernetes.**>
        @type kubernetes_metadata
        @id filter_kube_metadata
        skip_labels false
        skip_container_metadata false
        skip_namespace_metadata true
        skip_master_url true
      </filter>
      <match **>
        @type relabel
        @label @DISPATCH
      </match>
    </label>
  03_dispatch.conf: |-
    <label @DISPATCH>
      <filter **>
        @type prometheus
        <metric>
          name fluentd_input_status_num_records_total
          type counter
          desc The total number of incoming records
          <labels>
            tag ${tag}
            hostname ${hostname}
          </labels>
        </metric>
      </filter>
      <match **>
        @type relabel
        @label @OUTPUT
      </match>
    </label>
  04_outputs.conf: |-
    <label @OUTPUT>
      <match **>
        @type elasticsearch
        host "elasticsearch-master"
        port 9200
        path ""
        user elastic
        password changeme
      </match>
    </label>
