UCB Data 8 / Prob 140 Upcoming Kernel Usage #34

Closed
SamLau95 opened this issue Nov 21, 2017 · 11 comments

Comments

@SamLau95

SamLau95 commented Nov 21, 2017

We'd like to use BinderHub kernels to allow for widgets to display in the Data 8 and Prob 140 textbooks. Although this is still in the works, I wanted to give you a heads up so we can work out potential issues sooner rather than later.

Rough timeline (exact dates TBD):

  • Start of Dec: Discussion with Data 8 + Prob 140 staff about using widgets in textbook
  • Middle of Dec: Preliminary rollout of "Interact" button in Data 8 + Prob 140 textbook that starts BinderHub kernel. We expect ~10 pages from each textbook to use kernels.
  • Middle of Jan: Classes start, so traffic will be higher.

Back-of-the-envelope calculations on usage:

Data 8's textbook gets around 8k views on a peak day. If we assume that the views are evenly distributed through an 8-hour workday, that's 1k views per hour, or 17 views per minute.

Suppose each user's kernel lasts 10 minutes on average, and that each view creates a kernel. This means that there will be an extra 170 kernels running on average during peak hours.

(Prob 140's textbook gets significantly less traffic, so I believe Data 8's usage will dominate.)
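The estimate above is an application of Little's law (average concurrency = arrival rate × average lifetime). A minimal sketch of the arithmetic, using only the figures quoted in this comment; the function and parameter names are made up for illustration:

```python
def average_concurrent_kernels(views_per_day, active_hours, kernel_minutes):
    """Little's law: average concurrency = arrival rate * average lifetime."""
    # Assumes views are spread evenly over the active hours, as in the comment.
    arrivals_per_minute = views_per_day / (active_hours * 60)
    return arrivals_per_minute * kernel_minutes


# 8k views on a peak day, an 8-hour workday, 10-minute kernels.
estimate = average_concurrent_kernels(views_per_day=8000, active_hours=8, kernel_minutes=10)
print(round(estimate))
```

Without intermediate rounding this comes out to about 167; the figure of 170 comes from rounding to 17 views per minute first.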

How I'm going to help manage load:

  1. Only start a kernel when a user specifically clicks an interact button. This means that not every page view will create a kernel. In addition, only some pages will have the interact button. If 10% of views create a new kernel instead of 100%, this gives an average of 17 extra kernels during peak hours instead of 170.
  2. Reuse the same kernel for the same user moving between pages instead of starting a new one. This should lower the kernel per view ratio since there are 1k unique visitors on a peak day (versus 8k page views).
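As a rough check on both points, here is a small sketch of the arithmetic. The 10% launch fraction is the hypothetical from item 1, and the per-view ratio assumes each unique visitor starts at most one kernel per day, which is my assumption rather than a measured figure:

```python
def mitigated_kernels(baseline_kernels, launch_fraction):
    """Scale the baseline estimate by the fraction of views that start a kernel."""
    return baseline_kernels * launch_fraction


def max_kernels_per_view(unique_visitors, page_views):
    """Upper bound on kernels per view if each visitor starts at most one kernel."""
    return unique_visitors / page_views


# Item 1: 10% of views start a kernel, against the 170-kernel baseline.
with_button = mitigated_kernels(baseline_kernels=170, launch_fraction=0.10)

# Item 2: 1k unique visitors versus 8k page views on a peak day.
reuse_ratio = max_kernels_per_view(unique_visitors=1000, page_views=8000)

print(round(with_button), reuse_ratio)
```

The second number is only a bound on launches per view. A reused kernel probably also stays alive longer than 10 minutes, so it does not translate directly into a concurrency figure.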

One idea for managing load on the Binder team side: Set a quota for Data 8 textbook servers and deny requests when the quota is filled.
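To make the quota idea concrete, here is a minimal sketch of a per-repository cap that denies launches once it is full. `LaunchQuota` and its methods are hypothetical names for illustration and are not part of BinderHub:

```python
class LaunchQuota:
    """Hypothetical cap on concurrently running kernels for one repository."""

    def __init__(self, limit):
        self.limit = limit
        self.running = 0

    def try_launch(self):
        """Count the kernel and return True if there is room, else return False."""
        if self.running >= self.limit:
            return False
        self.running += 1
        return True

    def release(self):
        """Free one slot when a kernel shuts down."""
        if self.running > 0:
            self.running -= 1


quota = LaunchQuota(limit=2)
results = [quota.try_launch() for _ in range(3)]  # third request is denied
quota.release()                                   # one kernel shuts down
after_release = quota.try_launch()                # room again
```

A denied request could be surfaced in the textbook as a "try again later" message instead of a failed launch.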

Please let me know if you foresee any issues with this! Happy to work around any constraints that you have.

cc @yuvipanda @CalebS97 @ryanlovett

@betatim
Member

betatim commented Nov 21, 2017

Some (very) preliminary thoughts on the topic of rate limiting single repositories: jupyterhub/binderhub#242

@choldgraf
Member

Just a clarifying question here: you're planning to use mybinder.org, and not set up your own BinderHub server?

@SamLau95
Author

@choldgraf We are currently planning on using mybinder.org. Let me know if things have changed and mybinder.org is no longer okay to use.

@choldgraf
Member

It should work fine but may slow down if a ton of traffic hits it at once. Just noting that our goal is to have multiple BinderHubs in existence, because mybinder.org is definitely a limited resource :-)

@yuvipanda
Contributor

I suspect it'll be 15-20 instances max, and we should have no problems.

@choldgraf
Member

choldgraf commented Nov 22, 2017 via email

@minrk
Member

minrk commented Feb 7, 2018

BTW, we are now seeing 500 concurrent Binders from these sources, I think because Binders could be requested on pageview, rather than as needed. If the clients do either or both of:

  1. only request a Binder once execution is required (see thebelab for an example)
  2. remember binder info across pageviews via localStorage or something similar

then things should improve dramatically
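The two suggestions can be sketched together. This is a language-neutral illustration written in Python, not the textbook's actual client code: a plain dict stands in for the browser's localStorage, and `request_binder` is a hypothetical callable that launches a Binder and returns its connection info.

```python
class LazyBinderSession:
    """Request a Binder only on first use, and reuse stored info afterwards."""

    def __init__(self, storage, request_binder, key="binder-session"):
        self.storage = storage                # dict-like; stands in for localStorage
        self.request_binder = request_binder  # hypothetical launcher callable
        self.key = key

    def get(self):
        """Return stored connection info, launching a Binder only if none is stored."""
        info = self.storage.get(self.key)
        if info is None:
            info = self.request_binder()
            self.storage[self.key] = info
        return info


launches = []


def fake_request_binder():
    # Placeholder launcher that records each call; the values are invented.
    launches.append(1)
    return {"url": "http://example.invalid/user/abc", "token": "placeholder"}


shared_storage = {}
first_page = LazyBinderSession(shared_storage, fake_request_binder)
launches_before_execution = len(launches)  # nothing requested on page view
first_info = first_page.get()              # execution needed: one launch
second_page = LazyBinderSession(shared_storage, fake_request_binder)
second_info = second_page.get()            # later page view reuses stored info
```

A real client would presumably also need to check that a stored server is still alive before reusing it, and request a new one if it is not.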

@choldgraf
Member

@minrk do you think we're at a point where we can write a guide for people to plug into the Binder backend in this way?

@choldgraf
Member

@minrk think we can close this issue in lieu of more actionable ones as they pop up? The two action items in your comment are more for @SamLau95 to take care of on his end. On ours the only thing I can think of is documentation of this process

@betatim
Member

betatim commented Feb 11, 2018

Should we keep this open until we unban the repos? I think as a general strategy it would be good to open an issue on the mybinder-deploy repo for each ban to document why it was banned, contain any discussions related to the ban, and serve as a reminder that there is a ban (that should be lifted eventually). Which would suggest that we close this issue and create such an issue for this ban.

@choldgraf
Member

I just added a BAN label for mybinder.org so that we can see it more clearly, and opened this issue:

jupyterhub/mybinder.org-deploy#357

going to close this, but if somebody disagrees then feel free to re-open

5 participants