Memory leak when keybase status times out #55

Closed
vladionescu opened this issue Feb 21, 2020 · 6 comments

Comments

@vladionescu

In my GCP Cloud Run logs I have a long run of the following:

2020-02-20 17:00:00.639 PST 2020/02/21 01:00:00 Listen: failed to auth: unable to run Keybase command
2020-02-20 17:00:29.641 PST 2020/02/21 01:00:29 Listen: failed to auth: unable to run Keybase command
2020-02-20 17:01:17.639 PST 2020/02/21 01:01:17 Listen: failed to auth: unable to run Keybase command
2020-02-20 17:02:27.939 PST 2020/02/21 01:02:27 Listen: failed to auth: unable to run Keybase command
2020-02-20 17:02:54.638 PST 2020/02/21 01:02:54 Listen: failed to auth: unable to run Keybase command
2020-02-20 17:03:18.541 PST 2020/02/21 01:03:18 Listen: failed to auth: unable to run Keybase command
2020-02-20 17:03:43.739 PST 2020/02/21 01:03:43 Listen: failed to auth: unable to run Keybase command
2020-02-20 17:04:03.739 PST 2020/02/21 01:04:03 Listen: failed to auth: unable to run Keybase command
2020-02-20 17:04:42.038 PST 2020/02/21 01:04:42 Listen: failed to auth: unable to run Keybase command
2020-02-20 17:05:09.139 PST 2020/02/21 01:05:09 Listen: failed to auth: unable to run Keybase command
2020-02-20 17:05:33.239 PST 2020/02/21 01:05:33 Listen: failed to auth: unable to run Keybase command
2020-02-20 17:06:11.238 PST 2020/02/21 01:06:11 Listen: failed to auth: unable to run Keybase command
2020-02-20 17:07:00.939 PST 2020/02/21 01:07:00 Listen: failed to auth: unable to run Keybase command

This finally ends in:

2020-02-20 17:18:31.241 PST Memory limit of 1024M exceeded with 1024M used. Consider increasing the memory limit, see https://cloud.google.com/run/docs/configuring/memory-limits

All of them are Listen: failed to auth: unable to run Keybase command, which comes from the timeout branch here:

	case <-time.After(5 * time.Second):
		return "", errors.New("unable to run Keybase command")
	}

Why getUsername() is failing is unclear (and probably unrelated), but memory usage shouldn't grow as it keeps retrying.

@vladionescu
Author

Adding time keybase status to the entrypoint shell script right before my bot gets executed shows that it returns somewhat slowly (about 1.3 seconds) but still well within the 5-second timeout. The 'failed to auth' error persists. Why this is happening is likely a separate issue, but I wanted to have this context here.

2020-02-21T01:59:47.814485Z real	0m1.311s A 
2020-02-21T01:59:47.814497Z user	0m0.150s A 
2020-02-21T01:59:47.814509Z sys	0m0.080s A 

@joshblum
Member

@vladionescu thanks for the report, we have a ticket internally to investigate this.

@vladionescu
Author

@malware-unicorn fixed the OOM by adding a <-doneCh in the timeout case:
https://github.com/malware-unicorn/go-keybase-chat-bot/blob/master/kbchat/kbchat.go#L64
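
For anyone following along, here is a minimal sketch of the kind of leak this looks like. It assumes the worker goroutine reports its result on an unbuffered channel, so that when the timeout branch wins, nobody ever receives and the goroutine stays blocked for good. The names slowWork, runWithTimeout and leakedGoroutines are made up for illustration and are not from the library. Note that the fix linked above receives from doneCh in the timeout branch; the sketch instead shows a different, common alternative (a channel with a buffer of one), which lets the worker finish its send and exit without anyone waiting on it.

```go
package main

import (
	"errors"
	"runtime"
	"time"
)

// slowWork is a hypothetical stand-in for the goroutine that reads the
// command output. It sleeps, then reports its result on doneCh.
func slowWork(doneCh chan<- error, d time.Duration) {
	time.Sleep(d)
	doneCh <- nil // blocks forever if doneCh is unbuffered and nobody receives
}

// runWithTimeout has the same select/time.After shape as the snippet quoted
// in this issue. bufSize 0 reproduces the leak; bufSize 1 lets the worker
// complete its send and exit even after the caller has given up.
func runWithTimeout(bufSize int, work, timeout time.Duration) error {
	doneCh := make(chan error, bufSize)
	go slowWork(doneCh, work)
	select {
	case err := <-doneCh:
		return err
	case <-time.After(timeout):
		return errors.New("unable to run Keybase command")
	}
}

// leakedGoroutines calls runWithTimeout n times with a worker that always
// misses the deadline, then returns how many extra goroutines are still alive.
func leakedGoroutines(bufSize, n int) int {
	before := runtime.NumGoroutine()
	for i := 0; i < n; i++ {
		_ = runWithTimeout(bufSize, 20*time.Millisecond, 5*time.Millisecond)
	}
	time.Sleep(100 * time.Millisecond) // let every worker reach its send
	return runtime.NumGoroutine() - before
}
```

Calling leakedGoroutines(0, 10) should leave ten goroutines parked on the send, while leakedGoroutines(1, 10) should leave none. In the real bot each stuck goroutine would presumably also hold on to its pipe and child process, which would explain memory climbing with every retry, but that part is a guess.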

@vladionescu
Author

On another note, I suspect the reason Command("status") fails on Cloud Run is that Cloud Run captures STDOUT and STDERR for logging.

https://cloud.google.com/run/docs/logging#container-logs

They probably neglected to duplicate those file descriptors. @malware-unicorn doesn't see this fmt.Errorf() printed, so something's going on with process output.

https://github.com/malware-unicorn/go-keybase-chat-bot/blob/master/kbchat/kbchat.go#L65

If that's the case, then p.StdoutPipe() wouldn't be getting the expected output from status.

The fix for this is one of:

  1. Send Command() output to an FD we control, and read from that.
  2. Use an API instead of shelling out to the keybase binary.
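
A rough sketch of what option 1 could look like, using plain os/exec as a stand-in for the library's Command() helper. The function name runCaptured and the constant statusTimeout are hypothetical. The idea is to point the child's stdout at a buffer we own and let the context kill the process on timeout, so nothing is left running if the deadline passes. Whether this actually sidesteps the Cloud Run behavior is untested.

```go
package main

import (
	"bytes"
	"context"
	"os/exec"
	"time"
)

// statusTimeout matches the 5-second limit quoted earlier in this issue.
const statusTimeout = 5 * time.Second

// runCaptured runs a command with its stdout written into a buffer we
// control, and kills the process if it does not finish within timeout.
func runCaptured(timeout time.Duration, name string, args ...string) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	var out bytes.Buffer
	cmd := exec.CommandContext(ctx, name, args...)
	cmd.Stdout = &out // os/exec sets up the pipe and copies from it for us

	// Run starts the process and waits for it, so the child is always reaped.
	if err := cmd.Run(); err != nil {
		return "", err
	}
	return out.String(), nil
}
```

Something like runCaptured(statusTimeout, "keybase", "status") would then return the full output for parsing, and because Run waits for the process, there is no separate goroutine or channel that can be left hanging.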

@joshblum
Member

@vladionescu #57 should address the memory leak. Let's open a new issue for running on Google Cloud.

@vladionescu
Author

Opened #58 for the Cloud Run shenanigans.
