Performance on Probot 9 #970

Closed
wilhelmklopp opened this issue Jul 18, 2019 · 9 comments

@wilhelmklopp (Contributor) commented Jul 18, 2019

We've been trying to upgrade the GitHub Slack integration from Probot 7.5.3 to Probot 9 so that we can make use of all the awesome new features, specifically the ability to cache installation tokens in Redis. The pull request is here: integrations/slack#877

Yesterday, we tried to deploy this pull request, but we saw a severe performance degradation and had to roll back the change. Memory usage tripled, requests started taking much longer, and some timed out.

I'm opening this issue because we're not super sure what the underlying cause is, but we did have some initial thoughts:

  • It seemed like we were holding on longer to incoming GitHub webhook requests. The Slack integration gets a lot of webhooks, and we rely on closing the connection as soon as possible. When we last upgraded Probot, we added this logic, which is used here, to help tell Probot to close connections quickly (see the sketch after this list). Maybe this is no longer working due to a change in how Probot works?
  • I saw that bottleneck was upgraded in Probot 9 and that there is a new API throttling setup. What's the best place to find out more about how that works and whether we're throttling too aggressively compared to before?
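
For context, here is a minimal sketch of the fire-and-forget pattern the first bullet refers to. This is not the Slack integration's actual code; `processActivity` is a hypothetical stand-in for something like lib/activity/processor.js.

```js
module.exports = app => {
  app.on('*', async context => {
    // Kick off the expensive processing without awaiting it, so Probot can
    // respond to GitHub and close the webhook connection right away.
    processActivity(context).catch(err => {
      app.log.error(err, 'background processing failed')
    })
  })
}

// Hypothetical stand-in for the real processor.
async function processActivity (context) {
  // ... deliver the event to Slack, call the GitHub API, etc.
}
```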

cc @gr2m @SGrondin in case you have any ideas/insight into what might be causing this

cc @scepticulous since you might be looking into this more 👀

cc @gimenete as the author of https://github.com/integrations/slack/blob/master/lib/activity/processor.js

@gimenete (Contributor) commented Jul 18, 2019

Hey,

In my opinion, the work done in processor.js should have no impact regardless of the Probot version. It just does some work in the background instead of awaiting it, which is completely independent of the Probot architecture.

But I'll keep an eye on this issue to see if there's something we can change to improve things.

@gr2m (Contributor) commented Jul 18, 2019

The new throttle plugin does more nuanced throttling than the previous flat one request per second. The Bottleneck settings are not configurable at this point, but we could change that (https://github.com/octokit/plugin-throttling.js/blob/b2ac75fde6b862c39710d758adc8cfa306871098/lib/index.js#L14-L38) so you could fine-tune them to your requirements.

You can also use Redis for throttling data persistence instead of memory by passing your own Bottleneck instance: https://github.com/octokit/plugin-throttling.js#clustering
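
For reference, wiring that up might look roughly like the following. This is a sketch based on the clustering section linked above; the auth token, Redis options, and limiter id are placeholders, and how these options get passed through Probot's own Octokit instance isn't shown here.

```js
const Octokit = require('@octokit/rest')
  .plugin(require('@octokit/plugin-throttling'))
const Bottleneck = require('bottleneck')

// Keep throttling state in Redis so it is shared across processes,
// instead of each process throttling independently in memory.
const connection = new Bottleneck.IORedisConnection({
  clientOptions: { host: '127.0.0.1', port: 6379 }
})
connection.on('error', err => console.error(err))

const octokit = new Octokit({
  auth: 'token <placeholder>',
  throttle: {
    onRateLimit: (retryAfter, options) => {
      console.warn(`Rate limit hit for ${options.method} ${options.url}`)
      return options.request.retryCount === 0 // retry once
    },
    onAbuseLimit: (retryAfter, options) => {
      console.warn(`Abuse limit hit for ${options.method} ${options.url}`)
    },
    Bottleneck,
    connection,
    id: 'github-slack-integration' // same id for every process sharing limits
  }
})
```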

Hope that helps 🙏

@SGrondin commented Jul 19, 2019

I'd be curious to know the length of the Bottleneck queue while memory usage is high. I recently refactored Bottleneck's core engine heavily to avoid closures and allocations by reusing prototype functions, and that reduced memory usage by 40%.
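
One way to sample that queue length, assuming you can get a handle on the limiter instance the throttling plugin uses, is Bottleneck's counts() method. This is a standalone sketch, not Probot-specific.

```js
const Bottleneck = require('bottleneck')

// Stand-in limiter; in practice you would inspect the instance used by the
// throttling plugin rather than creating a new one.
const limiter = new Bottleneck({ maxConcurrent: 1, minTime: 1000 })

// Log the number of jobs in each internal state every 10 seconds,
// e.g. { RECEIVED: 0, QUEUED: 123, RUNNING: 1, EXECUTING: 0 }
setInterval(() => {
  console.log('bottleneck counts:', limiter.counts())
}, 10000).unref()
```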

The throttling plugin is not a simple machine. It has to compose a lot of closures, and it's possible that they lead to high memory usage while the plugin is juggling a lot of requests.

Could you try disabling the throttling plugin to measure the difference that makes?

Afterwards, could you please do the same for the retry plugin, and then for both plugins together?

Neither of them is particularly inefficient, but the Octokit plugin system is like an onion: each layer compounds the memory usage of the layers underneath.
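
If it helps, the experiment could look roughly like this at the Octokit level, since both plugins accept an enabled flag. Whether and how Probot 9 forwards these options to its internal Octokit instance is an assumption here.

```js
const Octokit = require('@octokit/rest')
  .plugin(require('@octokit/plugin-throttling'))
  .plugin(require('@octokit/plugin-retry'))

const octokit = new Octokit({
  auth: 'token <placeholder>',
  // Step 1: disable throttling only; step 2: disable retries only;
  // step 3: disable both, measuring memory each time.
  throttle: { enabled: false },
  retry: { enabled: false }
})
```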

@scepticulous commented Jul 19, 2019

Thanks for sharing all these insights, @gr2m and @SGrondin.
I will soon(ish) add a commit that disables the plugins and retry the deploy, observing the error rate as well as the memory usage compared to before, as suggested. 🙏 🙇
I will post an update once that has happened.

@JamesMGreene referenced this issue on Jul 25, 2019
@stale commented Sep 17, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale added the wontfix label on Sep 17, 2019

@gr2m (Contributor) commented Sep 17, 2019

@wilhelmklopp Any update?

@stale removed the wontfix label on Sep 17, 2019

@wilhelmklopp (Contributor, Author) commented Sep 18, 2019

@gr2m yep, we ended up upgrading to Probot 9 a few weeks ago after disabling all the throttling we could. There was still an increase in memory usage and GC activity (presumably due to some of the built-in throttling we can't disable), but to a level that was acceptable to us 👍

cc @jnraine @jules2689 @srinjoym @zmoazeni for context

@gr2m (Contributor) commented Sep 18, 2019

There were also updates to the throttling plugin and the underlying Bottleneck library since then; make sure to keep Bottleneck up to date with the latest release.
