ERROR: get status request failed:failed to get audit status reply: no reply received #33258

Open
mdnfiras opened this issue Oct 5, 2022 · 3 comments
Labels
Auditbeat, Team:Security-Linux Platform

mdnfiras commented Oct 5, 2022

We are testing Auditbeat and have set it up as a DaemonSet in our GKE cluster. A few pods randomly print this error line:

ERROR: get status request failed:failed to get audit status reply: no reply received

As far as I know, it has nothing to do with the nodes. We can restart the pods as often as we want: new pods on the same nodes sometimes run without printing the error, and sometimes print it even though previous pods on the same nodes were working fine.

Configuration:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: auditbeat
  namespace: monitoring
  labels:
    app: auditbeat
spec:
  selector:
    matchLabels:
      app: auditbeat
  template:
    metadata:
      labels:
        app: auditbeat
    spec:
      tolerations:
      - effect: NoExecute
        operator: Exists
      - effect: NoSchedule
        operator: Exists
      serviceAccountName: auditbeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      hostPID: true  # Required by auditd module
      dnsPolicy: ClusterFirstWithHostNet
      # initContainer systemctl to unregister systemd-journald as audit process, otherwise we
      # will see the error "failed to set audit PID. An audit process is already running (PID n)"
      initContainers:
      - name: systemctl
        image: centos
        command:
        - /bin/sh
        - "-c"
        - |
          set -e
          systemctl stop systemd-journald-audit.socket
          systemctl mask systemd-journald-audit.socket
          systemctl restart systemd-journald
          set +e
          systemctl status systemd-journald-audit.socket
          systemctl status systemd-journald
        env:
          - name: SYSTEMD_IGNORE_CHROOT
            value: "1"
        securityContext:
          runAsUser: 0
          capabilities:
            add:
              - 'SYS_ADMIN'
        volumeMounts:
          - name: run
            mountPath: /run
      containers:
      - name: auditbeat
        image: docker.elastic.co/beats/auditbeat:7.16.1
        args:
        - "-c"
        - /etc/auditbeat.yml
        - "-e"
        env:
        - name: ELASTICSEARCH_HOST
          valueFrom: ***
        - name: ELASTICSEARCH_PORT
          valueFrom: ***
        - name: ELASTICSEARCH_USERNAME
          valueFrom: ***
        - name: ELASTICSEARCH_PASSWORD
          valueFrom: ***
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: TINI_SUBREAPER
          value: "1"
        securityContext:
          runAsUser: 0
          capabilities:
            add:
              # Capabilities needed for auditd module
              - 'AUDIT_READ'
              - 'AUDIT_WRITE'
              - 'AUDIT_CONTROL'
        volumeMounts:
        - name: config
          mountPath: /etc/auditbeat.yml
          readOnly: true
          subPath: auditbeat.yml
        - name: modules
          mountPath: /usr/share/auditbeat/modules.d
          readOnly: true
        - name: data
          mountPath: /usr/share/auditbeat/data
        - name: bin
          mountPath: /hostfs/bin
          readOnly: true
        - name: sbin
          mountPath: /hostfs/sbin
          readOnly: true
        - name: usrbin
          mountPath: /hostfs/usr/bin
          readOnly: true
        - name: usrsbin
          mountPath: /hostfs/usr/sbin
          readOnly: true
        - name: etc
          mountPath: /hostfs/etc
          readOnly: true
        # Directory with root filesystems of containers executed with containerd, this can be
        # different with other runtimes. This volume is needed to monitor the file integrity
        # of files in containers.
        - name: run-containerd
          mountPath: /run/containerd
          readOnly: true
      volumes:
      - name: bin
        hostPath:
          path: /bin
      - name: usrbin
        hostPath:
          path: /usr/bin
      - name: sbin
        hostPath:
          path: /sbin
      - name: usrsbin
        hostPath:
          path: /usr/sbin
      - name: etc
        hostPath:
          path: /etc
      - name: config
        configMap:
          defaultMode: 0640
          name: auditbeat-config
      - name: modules
        configMap:
          defaultMode: 0640
          name: auditbeat-daemonset-modules
      - name: data
        hostPath:
          # When auditbeat runs as non-root user, this directory needs to be writable by group (g+w).
          path: /var/lib/auditbeat-data
          type: DirectoryOrCreate
      - name: run-containerd
        hostPath:
          path: /run/containerd
          type: DirectoryOrCreate
      # the run volume for the initContainer systemctl
      - name: run
        hostPath:
          path: /run

Startup logs:

2022-10-05T14:40:16.784Z	INFO	instance/beat.go:686	Home path: [/usr/share/auditbeat] Config path: [/usr/share/auditbeat] Data path: [/usr/share/auditbeat/data] Logs path: [/usr/share/auditbeat/logs] Hostfs Path: [/]
2022-10-05T14:40:16.784Z	INFO	instance/beat.go:694	Beat ID: 65f16e32-9784-4c27-a7bd-b318839d4c59
2022-10-05T14:40:16.787Z	WARN	[add_cloud_metadata]	add_cloud_metadata/provider_aws_ec2.go:95	error when check request status for getting IMDSv2 token: http request status 405. No token in the metadata request will be used.
2022-10-05T14:40:16.978Z	INFO	[seccomp]	seccomp/seccomp.go:124	Syscall filter successfully installed
2022-10-05T14:40:16.978Z	INFO	[beat]	instance/beat.go:1040	Beat info	{"system_info": {"beat": {"path": {"config": "/usr/share/auditbeat", "data": "/usr/share/auditbeat/data", "home": "/usr/share/auditbeat", "logs": "/usr/share/auditbeat/logs"}, "type": "auditbeat", "uuid": "65f16e32-9784-4c27-a7bd-b318839d4c59"}}}
2022-10-05T14:40:16.978Z	INFO	[beat]	instance/beat.go:1049	Build info	{"system_info": {"build": {"commit": "d420ccdaf201e32a524632b5da729522e50257ae", "libbeat": "7.16.3", "time": "2022-01-07T00:30:56.000Z", "version": "7.16.3"}}}
2022-10-05T14:40:16.978Z	INFO	[beat]	instance/beat.go:1052	Go runtime info	{"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":8,"version":"go1.17.5"}}}
2022-10-05T14:40:17.079Z	INFO	[beat]	instance/beat.go:1056	Host info	{"system_info": {"host": {"architecture":"x86_64","boot_time":"2022-10-05T04:52:39Z","containerized":false,"name":"***","ip":[***],"os":{"type":"linux","family":"redhat","platform":"centos","name":"CentOS Linux","version":"7 (Core)","major":7,"minor":9,"patch":2009,"codename":"Core"},"timezone":"***","timezone_offset_sec":0,"id":"***"}}}
2022-10-05T14:40:17.079Z	INFO	[beat]	instance/beat.go:1085	Process info	{"system_info": {"process": ***}}
2022-10-05T14:40:17.178Z	INFO	instance/beat.go:328	Setup Beat: auditbeat; Version: 7.16.3
2022-10-05T14:40:17.178Z	INFO	[index-management]	idxmgmt/std.go:184	Set output.elasticsearch.index to 'auditbeat-7.16.3' as ILM is enabled.
2022-10-05T14:40:17.178Z	INFO	[esclientleg]	eslegclient/connection.go:102	elasticsearch url: http://elasticsearch.monitoring:***
2022-10-05T14:40:17.178Z	INFO	[publisher]	pipeline/module.go:113	Beat name: ***
2022-10-05T14:40:17.179Z	INFO	[monitoring]	log/log.go:142	Starting metrics logging every 30s
2022-10-05T14:40:17.179Z	INFO	instance/beat.go:492	auditbeat start running.
2022-10-05T14:40:17.179Z	INFO	[add_cloud_metadata]	add_cloud_metadata/add_cloud_metadata.go:105	add_cloud_metadata: hosting provider type detected as gcp, metadata={***}
2022-10-05T14:40:17.180Z	INFO	[auditd]	auditd/audit_linux.go:107	auditd module is running as euid=0 on kernel=5.4.188+
2022-10-05T14:40:17.277Z	INFO	[auditd]	auditd/audit_linux.go:134	socket_type=unicast will be used.
2022-10-05T14:40:17.277Z	INFO	cfgfile/reload.go:164	Config reloader started
2022-10-05T14:40:17.277Z	INFO	add_kubernetes_metadata/kubernetes.go:72	add_kubernetes_metadata: kubernetes env detected, with version: v1.21.14-gke.2700
2022-10-05T14:40:17.277Z	INFO	[kubernetes]	kubernetes/util.go:122	kubernetes: Using node *** provided in the config	{"libbeat.processor": "add_kubernetes_metadata"}
2022-10-05T14:40:17.278Z	INFO	[auditd]	auditd/audit_linux.go:107	auditd module is running as euid=0 on kernel=5.4.188+
2022-10-05T14:40:17.278Z	INFO	[auditd]	auditd/audit_linux.go:134	socket_type=unicast will be used.
2022-10-05T14:40:17.279Z	INFO	cfgfile/reload.go:224	Loading of config files completed.
2022-10-05T14:40:17.330Z	INFO	[auditd]	auditd/audit_linux.go:279	Deleted 4 pre-existing audit rules.
2022-10-05T14:40:17.330Z	INFO	[auditd]	auditd/audit_linux.go:298	Successfully added 4 of 4 audit rules.
2022-10-05T14:40:17.330Z	INFO	[auditd]	auditd/audit_linux.go:322	audit status from kernel at start	{"audit_status": {"Mask":0,"Enabled":1,"Failure":0,"PID":0,"RateLimit":0,"BacklogLimit":8192,"Lost":0,"Backlog":0,"FeatureBitmap":127,"BacklogWaitTime":0}}
2022-10-05T14:40:17.330Z	INFO	[auditd]	auditd/audit_linux.go:346	Setting kernel backlog wait time to prevent backpressure propagating to the kernel.
2022-10-05T14:40:17.378Z	INFO	[publisher_pipeline_output]	pipeline/output.go:143	Connecting to backoff(elasticsearch(http://elasticsearch.monitoring:***))
2022-10-05T14:40:17.379Z	INFO	[publisher]	pipeline/retry.go:219	retryer: send unwait signal to consumer
2022-10-05T14:40:17.379Z	INFO	[publisher]	pipeline/retry.go:223	  done
2022-10-05T14:40:17.585Z	INFO	[esclientleg]	eslegclient/connection.go:282	Attempting to connect to Elasticsearch version 7.16.1
2022-10-05T14:40:17.877Z	INFO	[esclientleg]	eslegclient/connection.go:282	Attempting to connect to Elasticsearch version 7.16.1
2022-10-05T14:40:17.879Z	INFO	[file_integrity]	file_integrity/eventreader_fsnotify.go:99	Started fsnotify watcher	{"file_path": ["/hostfs/bin", "/hostfs/etc", "/hostfs/sbin", "/hostfs/usr/bin", "/hostfs/usr/sbin"], "recursive": true}
2022-10-05T14:40:17.972Z	INFO	[index-management]	idxmgmt/std.go:261	Auto ILM enable success.
2022-10-05T14:40:17.977Z	INFO	[index-management.ilm]	ilm/std.go:170	ILM policy auditbeat exists already.
2022-10-05T14:40:17.977Z	INFO	[index-management]	idxmgmt/std.go:397	Set setup.template.name to '{auditbeat-7.16.3 {now/d}-000001}' as ILM is enabled.
2022-10-05T14:40:17.977Z	INFO	[index-management]	idxmgmt/std.go:402	Set setup.template.pattern to 'auditbeat-7.16.3-*' as ILM is enabled.
2022-10-05T14:40:17.977Z	INFO	[index-management]	idxmgmt/std.go:436	Set settings.index.lifecycle.rollover_alias in template to {auditbeat-7.16.3 {now/d}-000001} as ILM is enabled.
2022-10-05T14:40:17.977Z	INFO	[index-management]	idxmgmt/std.go:440	Set settings.index.lifecycle.name in template to {auditbeat {"policy":{"phases":{"hot":{"actions":{"rollover":{"max_age":"30d","max_size":"50gb"}}}}}}} as ILM is enabled.
2022-10-05T14:40:17.984Z	INFO	template/load.go:110	Template "auditbeat-7.16.3" already exists and will not be overwritten.
2022-10-05T14:40:17.984Z	INFO	[index-management]	idxmgmt/std.go:297	Loaded index template.
2022-10-05T14:40:17.987Z	INFO	[index-management.ilm]	ilm/std.go:126	Index Alias auditbeat-7.16.3 exists already.
2022-10-05T14:40:18.077Z	INFO	[publisher_pipeline_output]	pipeline/output.go:151	Connection to backoff(elasticsearch(http://elasticsearch.monitoring:***)) established
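
For readers checking the same condition on their own nodes, the "audit status from kernel at start" line in the startup logs carries a JSON payload that can be pulled out mechanically. Below is a minimal Python sketch, assuming the log line layout shown above (message text followed by a JSON object). `parse_audit_status` is a hypothetical helper name, not part of Auditbeat. In the kernel's audit status, `PID` is the process registered as the audit daemon, so `PID: 0` at start suggests that nothing else held the audit socket at that moment.

```python
import json

# Message text that precedes the JSON payload in the startup log above.
MARKER = "audit status from kernel at start"


def parse_audit_status(line):
    """Return the audit_status dict from a matching log line, else None."""
    if MARKER not in line:
        return None
    # The JSON object starts at the first '{' after the marker text.
    start = line.index("{", line.index(MARKER))
    return json.loads(line[start:])["audit_status"]


# The status line from the startup logs in this report.
sample = (
    "2022-10-05T14:40:17.330Z\tINFO\t[auditd]\tauditd/audit_linux.go:322\t"
    "audit status from kernel at start\t"
    '{"audit_status": {"Mask":0,"Enabled":1,"Failure":0,"PID":0,'
    '"RateLimit":0,"BacklogLimit":8192,"Lost":0,"Backlog":0,'
    '"FeatureBitmap":127,"BacklogWaitTime":0}}'
)

status = parse_audit_status(sample)
print(status["PID"], status["BacklogLimit"])  # prints: 0 8192
```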

Runtime logs: many lines like these, with the error line repeating every few minutes at random intervals (2-10 minutes):

...
2022-10-05T15:05:47.193Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": ***}
...
2022-10-05T15:07:48.223Z	ERROR	[auditd]	auditd/audit_linux.go:204	get status request failed:failed to get audit status reply: no reply received
...
2022-10-05T15:08:17.377Z	INFO	[monitoring]	log/log.go:184	Non-zero metrics in the last 30s	{"monitoring": ***}
...
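
To put a number on the 2-10 minute spacing, the gaps between consecutive occurrences of the error can be computed from a saved log file. Below is a rough Python sketch, assuming the timestamp is the first whitespace-separated field in the format shown in the logs above. `error_gaps` and the command-line handling are illustrative names, not an Auditbeat tool.

```python
import sys
from datetime import datetime

# Error text as it appears in the runtime logs of this report.
ERROR_TEXT = "failed to get audit status reply: no reply received"
# Assumed timestamp layout, e.g. 2022-10-05T15:07:48.223Z
STAMP_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def error_times(lines):
    """Collect the timestamps of every line containing the error text."""
    times = []
    for line in lines:
        if ERROR_TEXT not in line:
            continue
        stamp = line.split(None, 1)[0]
        try:
            times.append(datetime.strptime(stamp, STAMP_FORMAT))
        except ValueError:
            # Line has the error text but no leading timestamp; skip it.
            continue
    return times


def error_gaps(lines):
    """Return the timedelta between each pair of consecutive errors."""
    times = error_times(lines)
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Only runs when a log file path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        for gap in error_gaps(handle):
            print(f"{gap.total_seconds():.0f}s since previous error")
```

Saved under a name such as `error_gaps.py` (hypothetical), it could be run against a log captured with `kubectl logs`, for example `python error_gaps.py auditbeat.log`.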

For confirmed bugs, please report:

  • Version: tested with 7.16.1, 7.16.3 and 7.17.6
  • Operating System: GKE v1.21.14-gke.2700, tried with both docker and containerd cluster nodes
  • Steps to Reproduce: apply the above DaemonSet yaml file
@elasticmachine
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

botelastic bot removed the needs_team label on Oct 20, 2022

botelastic bot commented Oct 20, 2023

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

botelastic bot added the Stalled label on Oct 20, 2023
norrietaylor added the Team:Security-Linux Platform label and removed the Team:Security-External Integrations label on Jan 31, 2024
@elasticmachine
Collaborator

Pinging @elastic/sec-linux-platform (Team:Security-Linux Platform)

botelastic bot removed the Stalled label on Jan 31, 2024