Several restarts of a client makes Falco daemon unresponsive #1268

Closed
bgarber opened this issue Jun 19, 2020 · 7 comments

bgarber commented Jun 19, 2020

Describe the bug

I'm writing a simple client to capture and handle alerts from Falco, using the falcosecurity/client-go library and configuring the Falco daemon to open a gRPC port. I noticed that, after restarting my client several times, the Falco daemon starts denying the subscription with the error "transport: authentication handshake failed: context deadline exceeded". I am able to subscribe again only after restarting the Falco daemon.

For example, the following code:

ctx, cancel := context.WithCancel(context.Background())
defer cancel()

c, err := client.NewForConfig(ctx, &client.Config{
	Hostname:   falco.daemonHostname,
	Port:       falco.daemonPort,
	CertFile:   falco.daemonCertFile,
	KeyFile:    falco.daemonCertKey,
	CARootFile: falco.daemonCARoot,
})
if err != nil {
	log.Fatalf("client failed: %v", err)
}
defer c.Close()

outputClient, err := c.Output()
if err != nil {
	log.Fatalf("output failed: %v", err)
}

fcs, err := outputClient.Subscribe(ctx, &output.Request{Keepalive: true})
if err != nil {
	log.Fatalf("subscribe failed: %v", err)
}

for {
	// .... receive instructions ....
	res, err := fcs.Recv()
	// ....
}

will eventually hit this error (after restarting the client several times):

subscribe failed: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: context deadline exceeded"

Even if I try different contexts for client.NewForConfig(...) and outputClient.Subscribe(...), I end up with exactly the same behavior. I even tried to put some retries in my code, to no avail.

Looking further into the client-go library, I wasn't able to detect any misuse of the gRPC library, which leads me to believe the problem may actually be in the Falco daemon.

How to reproduce it

  1. Start Falco userspace daemon, with gRPC enabled.
  2. Start the minimal client, connecting through gRPC to daemon.
  3. Stop and restart the client.
  4. Eventually, the problem will arise.
  5. After that, I have to restart Falco daemon.

Expected behaviour

Client restarts should not affect new connections to the daemon.

Environment

  • Falco version: 0.23.0
  • System info: Linux 5.6.10-1.el7.elrepo.x86_64
  • Cloud provider or hardware configuration:
    • OS: CentOS Linux
    • Kernel: Linux knowhere 5.6.10-1.el7.elrepo.x86_64 #1 SMP Sat May 2 12:42:34 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux
    • Installation method: RPM
Author

bgarber commented Jun 24, 2020

Has anyone had the chance to look at this issue?

Contributor

fntlnz commented Jun 24, 2020

Hi @bgarber - thanks for taking the time to write such a detailed report. We have had several issues with the current gRPC implementation, and there's a branch scheduled to be merged for 0.24.0. Would you mind checking whether it solves your issue?
Please see #1241.

Author

bgarber commented Jun 25, 2020

Hi @fntlnz! Thank you for your reply. I will try the new branch and see if it fixes the problem. Thanks!

Author

bgarber commented Jul 1, 2020

Hi @fntlnz! After several tests using the branch from PR #1241 and master (after the PR was approved and merged), I no longer see this issue. Thank you very much!

Author

bgarber commented Jul 1, 2020

Question: when is Falco 0.24.0 scheduled to be released?

Contributor

fntlnz commented Jul 1, 2020

@bgarber glad to know this is working in your environment :) Thanks for taking the time to try it and report back! It is scheduled to be released on July 16th.

Author

bgarber commented Jul 1, 2020

@fntlnz Thank you!

bgarber closed this as completed Jul 1, 2020