Stress test jupyterhub / k8s cluster for 50-100 concurrent users #275

Open
scottyhq opened this issue May 9, 2019 · 5 comments

@scottyhq
Member

scottyhq commented May 9, 2019

We're planning to use our NASA ACCESS clusters on AWS for upcoming hackweeks at the University of Washington. We expect 50-100 simultaneous users, and it would be great to have a way to simulate that load before the event. I anticipate JupyterHub itself will be fine, but depending on the K8s cluster we might run into AWS service limits, such as the number of instances of a given type that our node groups can launch (https://docs.aws.amazon.com/general/latest/gr/aws_service_limits.html).

There seem to be some tools out there for this, though aimed at bigger use cases, such as @yuvipanda's https://github.com/yuvipanda/jupyterhub-loadtest (see also jupyterhub/helm-chart#46).

We have the added component of 50 people potentially launching dask clusters, as discussed here: pangeo-data/pangeo#440
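To make the service-limit concern concrete, here is a rough back-of-the-envelope sketch of how many nodes such an event might need. Every number below (memory per user, Dask workers per cluster, node size, instance limit) is a hypothetical placeholder rather than a measurement from our clusters, and the estimate considers memory only, ignoring CPU and per-node system overhead.

```python
import math


def nodes_needed(users, user_mem_gb, dask_users, workers_per_cluster,
                 worker_mem_gb, node_mem_gb):
    """Memory-only estimate of the node count for notebooks plus Dask workers."""
    notebook_mem = users * user_mem_gb
    dask_mem = dask_users * workers_per_cluster * worker_mem_gb
    return math.ceil((notebook_mem + dask_mem) / node_mem_gb)


# Hypothetical inputs: 100 notebook users, 50 of whom each start a
# 10-worker Dask cluster, with 4 GB per pod and 64 GB nodes.
needed = nodes_needed(users=100, user_mem_gb=4, dask_users=50,
                      workers_per_cluster=10, worker_mem_gb=4, node_mem_gb=64)

instance_limit = 20  # hypothetical per-instance-type limit on the account

print(f"nodes needed: {needed}, instance limit: {instance_limit}")
if needed > instance_limit:
    print("Estimate exceeds the limit; a limit increase may be required.")
```

Swapping in real pod sizes and the account's actual limits gives a quick sanity check before running any load test.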

Has anyone already done something like this? Recommendations would be welcome!

@yuvipanda
Member

jupyterhub-loadtest is dead; use https://github.com/yuvipanda/hubtraf instead. It could be better documented, but there are some docs inside the code. You can also have it execute arbitrary Python code in the notebooks once they start, so theoretically you can use this to test other parts of your infrastructure too. Docs and PRs most welcome! <3

@yuvipanda
Member

For 50-100 users, you don't need the helm chart in hubtraf. Just run the Python scripts from another VM in the same network and you should be good.
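As an illustration of why a single client machine is enough at this scale, the sketch below drives 50 simulated users concurrently from one process with `asyncio`. It is not hubtraf's code: `fake_spawn` is a hypothetical stand-in for the real login and server-start requests, and the user names and delays are made up.

```python
import asyncio
import random
import time


async def simulate_user(name, start_delay, spawn):
    """Wait a staggered delay, then run one user's spawn and time it."""
    await asyncio.sleep(start_delay)
    started = time.monotonic()
    ok = await spawn(name)
    return name, ok, time.monotonic() - started


async def run_load_test(user_count, max_start_delay, spawn, seed=0):
    """Start user_count simulated users with random staggered start times."""
    rng = random.Random(seed)
    tasks = [
        simulate_user(f"loadtest-{i}", rng.uniform(0, max_start_delay), spawn)
        for i in range(user_count)
    ]
    return await asyncio.gather(*tasks)


async def fake_spawn(name):
    # Hypothetical stand-in: a real test would log in and start a server here.
    await asyncio.sleep(0.01)
    return True


results = asyncio.run(run_load_test(50, 0.1, fake_spawn))
succeeded = sum(1 for _, ok, _ in results if ok)
print(f"{succeeded} of {len(results)} simulated users started")
```

Because each simulated user spends most of its time waiting on I/O, one event loop can hold all of them at once.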

@muthmano-dev

hubtraf

@yuvipanda hubtraf is a great tool, I just can't seem to get it working. I have enabled dummyAuthentication and dockerspawner, and I'm able to log in as any "xyz" user with a global password.

Now, when I try to run the script, it gets stuck in a "server-start" loop.

2019-06-15T16:23:45.409690Z Server: Attempting to Starting action=server-start attempt=20 phase=attempt-start username=bwai005-aiteamserver-1
2019-06-15T16:23:45.434229Z Server: Retrying after response <ClientResponse(http://35.212.131.55:8000/hub/login?next=/hub/spawn) [200 OK]>
<CIMultiDictProxy('Server': 'TornadoServer/6.0.2', 'Content-Type': 'text/html', 'Date': 'Sat, 15 Jun 2019 16:23:45 GMT', 'x-jupyterhub-version': '1.0.0', 'Access-Control-Allow-Headers': 'accept, content-type, authorization', 'content-security-policy': "frame-ancestors 'self'; report-uri /hub/security/csp-report", 'Etag': '"b5d8cbdf8b89eb1a688fcf8dcd41eeeb857e6597"', 'Content-Length': '4741', 'Connection': 'close')>
action=server-start attempt=20 duration=267.2215327990707 phase=attempt-complete username=bwai005-aiteamserver-1

The debug log on the hub side:

[I 2019-06-15 16:24:02.690 JupyterHub log:174] 302 GET /hub/spawn -> /hub/login?next=%2Fhub%2Fspawn (@::ffff:35.212.131.55) 1.21ms
[I 2019-06-15 16:24:02.698 JupyterHub log:174] 200 GET /hub/login?next=/hub/spawn (@::ffff:35.212.131.55) 2.05ms

I am able to log in and start the server via the UI, but the load test script does not seem able to start it. My guess is that the script is waiting for a response whose URL ends like the home page URL, "/user//tree?", and since it only ever gets the login page back, it stays stuck in this loop.

Am I missing something here? Any help is much appreciated :)

@muthmano-dev

I figured out what my issue was: aiohttp doesn't maintain the session when I use the IP address. @yuvipanda Your comment here helped. Thanks :)
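This matches aiohttp's documented default: its `CookieJar` does not accept cookies from hosts that are bare IP addresses unless it is created with `unsafe=True`, so the login cookie is dropped and every request to /hub/spawn is redirected back to the login page. A minimal sketch of one way around it, assuming the client builds its own `aiohttp.ClientSession`; `host_is_ip` and `make_session` are hypothetical helper names, not part of hubtraf:

```python
import ipaddress
from urllib.parse import urlsplit


def host_is_ip(url: str) -> bool:
    """Return True if the URL's host is a literal IPv4 or IPv6 address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


async def make_session(hub_url: str):
    """Build a session whose cookie jar keeps cookies from IP-address hubs."""
    import aiohttp  # third-party; imported here so the check above needs only the stdlib

    # unsafe=True lets the jar store cookies set by an IP-address host.
    jar = aiohttp.CookieJar(unsafe=host_is_ip(hub_url))
    return aiohttp.ClientSession(cookie_jar=jar)


print(host_is_ip("http://35.212.131.55:8000"))  # hub URL from the log above
print(host_is_ip("http://hub.example.org"))     # hypothetical hostname
```

Pointing the test at a DNS name instead of the raw IP avoids the problem without touching the cookie jar.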

@yuvipanda
Member

@altairpearl glad you found it! hubtraf really needs some docs... Would <3 if you could add some 😄

How did your testing go?
