
GKE - Falco Logs not appearing for up to 10 minutes #1291

Closed
danpop-chainguard opened this issue Jun 30, 2020 · 9 comments · Fixed by #1296
Labels
kind/bug · triage/duplicate (indicates an issue is a duplicate of another open issue)

Comments

@danpop-chainguard

danpop-chainguard commented Jun 30, 2020

Describe the bug

GKE - Falco Logs not appearing for up to 10 minutes

How to reproduce it

  1. Deploy a GKE cluster.
  2. Use the Helm chart to install Falco (helm install falco --set ebpf.enabled=true falcosecurity/falco).
  3. Tail one of the Falco pods' logs (kubectl logs -f falco-podname).
  4. In another terminal, try to exec into a pod, or trigger another Falco rule violation.
  5. Wait for the logs to show up (up to 10 minutes). A consolidated sketch of these steps follows below.
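A minimal consolidated sketch of the repro (pod names are placeholders; an exec would typically trigger a default rule such as "Terminal shell in container"):

# Install Falco with the eBPF driver via the Helm chart
helm install falco --set ebpf.enabled=true falcosecurity/falco

# Terminal 1: follow the logs of one Falco pod (pod name is a placeholder)
kubectl logs -f falco-abc12

# Terminal 2: trigger a rule violation, e.g. spawn a shell in a running pod
# (pod name is a placeholder), then time how long the alert takes to appear
kubectl exec -it some-pod -- /bin/sh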
Expected behaviour

Logs/Falco event notifications should appear immediately, or close to it, rather than after 9-10 minutes.

Screenshots

(screenshot attached in the original issue)

Environment

  • Falco version:
    Falco version: 0.23.0
    Driver version: 96bd9bc560f67742738eb7255aeb4d03046b8045
  • System info:

{
"machine": "x86_64",
"nodename": "gke-popfalco1-default-pool-880376cf-qj8n",
"release": "4.19.109+",
"sysname": "Linux",
"version": "#1 SMP Mon Mar 16 06:27:01 PDT 2020"
}

  • Cloud provider or hardware configuration:
    GKE. Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.8-gke.15"} (3 nodes)

Additional context

none

@danpop-chainguard danpop-chainguard added the kind/bug Something isn't working label Jun 30, 2020
@krisnova
Contributor

Thanks for the report.

Can you share the following?

The exact Helm chart and associated commands.

Recently the community ported the chart to this resource: https://github.com/falcosecurity/charts/tree/master/falco#installing-the-chart

Can you confirm you are using the most recent version here?

How are you viewing the logs?

Are you viewing the logs on all nodes in the cluster? Something like

kubectl logs -l app=falco -f

What is your burst set to?

If you look here you can see the following:

Note that 1000 seconds is between 16 and 17 minutes.

# A throttling mechanism implemented as a token bucket limits the
# rate of falco notifications. This throttling is controlled by the following configuration
# options:
#  - rate: the number of tokens (i.e. right to send a notification)
#    gained per second. Defaults to 1.
#  - max_burst: the maximum number of tokens outstanding. Defaults to 1000.
#
# With these defaults, falco could send up to 1000 notifications after
# an initial quiet period, and then up to 1 notification per second
# afterward. It would gain the full burst back after 1000 seconds of
# no activity.

outputs:
  rate: 1
  max_burst: 1000
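If it helps, one way to double-check which throttling values the running pods actually picked up (a sketch: it assumes the chart mounts the rendered config at the default /etc/falco/falco.yaml, and the pod name is a placeholder):

# Print the outputs throttling settings from inside a running Falco pod
kubectl exec falco-abc12 -- grep -A 2 '^outputs:' /etc/falco/falco.yaml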

Probably the quickest way to debug this is to send a quick batch of alerts consecutively and see if you start getting real-time alerts.

When I demo I usually like to violate 10+ rules as quickly as possible. Usually, I do something like this:

touch /usr/bin/1
touch /usr/bin/2
touch /usr/bin/3
cat /etc/shadow
touch /etc/1
touch /proc/1
cat ~/.bashrc
touch /root/1
touch /root/2
touch /root/3
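In a Kubernetes setup these are typically run inside a pod, so the events are attributed to a container; a minimal sketch (the pod name is a placeholder, and exactly which rules fire depends on the loaded ruleset):

# Trigger several rule violations in quick succession from inside an existing pod
kubectl exec -it some-pod -- sh -c '
  touch /usr/bin/1 /usr/bin/2 /usr/bin/3
  cat /etc/shadow
  touch /root/1 /root/2 /root/3
'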

Let me know if any of that doesn't help you out, and I can spin up a GKE cluster tomorrow and tell you exactly what is going on.

@krisnova
Contributor

So I looked into the Helm chart (it looks like that is what you are using), and I think you are trying to pull logs from the same pod you are trying to trigger a rule on.

That means changing the values here, or triggering more alerts, could help get you closer to the state you are looking for.

# The rate corresponds to one message every 30 seconds.
# A burst of 10 messages.
helm install falco --set ebpf.enabled=true,outputs.rate=.03333,outputs.maxRate=10 falcosecurity/falco
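For reference, the same overrides could also live in a values file instead of --set flags (a sketch; the keys simply mirror the flags above and may differ between chart versions):

# Equivalent install using a values file
cat > falco-values.yaml <<'EOF'
ebpf:
  enabled: true
outputs:
  rate: 0.03333
  maxRate: 10
EOF

helm install falco -f falco-values.yaml falcosecurity/falco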

@danpop-chainguard
Author

Seeing the same thing when attempting a simple write to bin/, after using the Helm values listed above:

helm install falco --set ebpf.enabled=true,outputs.rate=.03333,outputs.maxRate=10 falcosecurity/falco

(screenshot attached in the original issue)

I also attempted:

touch /usr/bin/1
touch /usr/bin/2
touch /usr/bin/3
cat /etc/shadow
touch /etc/1
touch /proc/1
cat ~/.bashrc
touch /root/1
touch /root/2
touch /root/3

@danpop-chainguard
Copy link
Author

danpop-chainguard commented Jun 30, 2020

Going to revert to using /usr/bin for my demo; it came up after about 90 seconds. /etc/shadow also worked repeatedly.

Still a bit sluggish between the initial deploy and logs appearing, even with those settings.

(screenshot attached in the original issue)

@leogr
Member

leogr commented Jul 1, 2020

It seems to me that this issue duplicates:

Could you confirm that's the same problem?

Basically, the output log is delayed when Falco runs in a container in non-interactive mode (i.e. no TTY) and activity is low (very few events per second). As soon as other events arrive, the previously buffered events appear immediately (so you won't notice this problem on a system that generates many events per second).
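This is not Falco-specific: the same stdout buffering can be seen with any slow writer whose output goes to a pipe rather than a TTY. A quick illustration (assumes python3 is available; not part of the thread's repro):

# Block-buffered: stdout is a pipe, so nothing appears until the program exits
python3 -c 'import time
for i in range(5): print("event", i); time.sleep(1)' | cat

# Unbuffered (-u): each line appears as soon as it is printed,
# similar in spirit to what Falco's -U/--unbuffered option does for its own output
python3 -u -c 'import time
for i in range(5): print("event", i); time.sleep(1)' | cat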

I encountered the same problem many times, but AFAIK we don't have a fix yet.
I'm still investigating.

@leogr leogr added triage/duplicate Indicates an issue is a duplicate of other open issue. and removed kind/bug Something isn't working labels Jul 1, 2020
@leogr leogr added the kind/bug label Jul 1, 2020
@poiana poiana removed the needs-kind label Jul 1, 2020
@fntlnz
Contributor

fntlnz commented Jul 1, 2020

@danpopSD Investigated this with @leodido

This happens because GKE distinguishes between a TTY and plain stdout while flushing logs.

I achieved real-time logs by doing this:

  1. Install Falco:

helm install falco falcosecurity/falco --set ebpf.enabled=true

  2. Wait for Falco to be installed.

  3. Add tty: true to the Falco pods' containers (you can quickly apply the patch below):

kubectl patch daemonset falco --patch '{"spec":{"template":{"spec":{"$setElementOrder/containers":[{"name":"falco"}],"containers":[{"name":"falco","tty":true}]}}}}'

  4. Enjoy real-time logs. (A verification sketch follows below.)
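To confirm the patch actually landed (a sketch; the daemonset and container names assume the default release name falco from the install above):

# Should print "true" once the patch is applied
kubectl get daemonset falco -o jsonpath='{.spec.template.spec.containers[?(@.name=="falco")].tty}'

# The pod template changed, so the daemonset rolls its pods; wait for that to finish
kubectl rollout status daemonset/falco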

Nota bene

Related issues on kubernetes: kubernetes/kubernetes#57414

So at this point we need to make a decision: we might want to enable tty in our manifests and Helm chart. However, before doing that we need to investigate the boundaries and side effects of doing so.

We need to answer some questions like:

  • Does having a tty affect any other Falco functionality? Is it better for Falco? Is it worse?
  • Does having a tty have any performance/security impact on Falco?
  • Does the tty behave differently on different cloud providers? Maybe this works better only on GKE while it makes things worse on other providers.

@leogr
Member

leogr commented Jul 1, 2020

This happens because GKE distinguishes between a TTY and plain stdout while flushing logs.

Note that this issue is not GKE-specific. I was able to reproduce it with Docker (as explained in this comment) and with other k8s installations as well.

If we decide to enable TTY by default, we will need to update the docs regarding Docker usage too.
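For the plain Docker case, the analogous workaround would be allocating a TTY with -t (a sketch; privileged mode, the mounts, and the image tag below are illustrative and should be taken from the official Falco Docker instructions):

# -t allocates a pseudo-TTY, so Falco's stdout is line-buffered and alerts stream immediately
docker run --rm -i -t \
  --privileged \
  -v /var/run/docker.sock:/host/var/run/docker.sock \
  -v /proc:/host/proc:ro \
  falcosecurity/falco:0.23.0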

@danpop-chainguard
Author

This #1291 (comment) addressed my issue.

@LucaGuerra
Contributor

Since this issue pops up from time to time, I'd like to add a quick tl;dr here:

  • add the -U/--unbuffered option to Falco
  • turn on the tty option in the helm chart

More information: https://falco.org/docs/alerts/#standard-output-buffering
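A sketch of both options (-U/--unbuffered is the Falco CLI flag mentioned above; "tty" as a chart value follows the comment above, but the exact key name should be verified against your chart version's values.yaml):

# Option 1: pass the unbuffered flag when invoking Falco directly,
# e.g. when running the binary or a container image yourself
falco -U          # same as: falco --unbuffered

# Option 2: with the Helm chart, run the container with a TTY
# ("tty" as a value name is an assumption here; check the chart's values.yaml)
helm upgrade --install falco falcosecurity/falco --set tty=true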
