
Process starvation in concurrent kernel starts, and scaling out JEG for scalability. #732

Open
esevan opened this issue Sep 10, 2019 · 5 comments

Comments

@esevan
Contributor

esevan commented Sep 10, 2019

Description

Recently I observed that EG cannot handle 30 concurrent kernel start requests. Here's the itest code.

    # unittest and Pool are stdlib; GatewayClient, LOG, and the KERNELSPECS
    # fixtures come from the EG integration-test harness.
    import unittest
    from multiprocessing import Pool

    def scale_test(kernelspec, example_code, _):
        """test function for scalability test"""
        res = True
        gateway_client = GatewayClient()

        kernel = None
        try:
            # scalability test is a kind of stress test, so expand launch_timeout to our service request timeout.
            kernel = gateway_client.start_kernel(kernelspec)
            if example_code:
                res = kernel.execute(example_code)
        finally:
            if kernel is not None:
                gateway_client.shutdown_kernel(kernel)

        return res

    class TestScale(unittest.TestCase):
        def _scale_test(self, spec, test_max_scale):
            LOG.info('Spawn {} {} kernels'.format(test_max_scale, spec))
            example_code = []
            example_code.append('print("Hello World")')

            with Pool(processes=test_max_scale) as pool:
                children = []
                for i in range(test_max_scale):
                    children.append(pool.apply_async(scale_test,
                                                     (self.KERNELSPECS[spec],
                                                      example_code,
                                                      i)))

                test_results = [child.get() for child in children]

            for result in test_results:
                self.assertRegex(result, "Hello World")

        def test_python3_scale_test(self):
            test_max_scale = int(self.TEST_MAX_PYTHON_SCALE)
            self._scale_test('python3', test_max_scale)

        def test_spark_python_scale_test(self):
            test_max_scale = int(self.TEST_MAX_SPARK_SCALE)
            self._scale_test('spark_python', test_max_scale)

I set LAUNCH_TIMEOUT to 60 seconds and used kernelspecs whose images were already pulled onto the node. With the Spark kernel the situation got worse, because the spark-submit processes launched by EG starve both the EG process and one another.

During the test, CPU utilization rose above 90% (on a 4-core, 8 GiB memory instance).

I know there is HA work in progress, but it appears to be Active/Stand-by mode. That approach lets us scale EG up, not out, and "scale up" is always limited: we cannot grow the instance beyond the size of the node EG is running on.

For those reasons, I want to start working on EG's scalability, and I'd like your opinion on the following ideas. (Let me assume that EG is running on k8s.)

  • Process starvation

    • In order to resolve process starvation in the EG instance, I have two ideas:
      • Spawn a spark-submit pod and a launch-kubernetes pod instead of launching local processes. Using containers isolates the spark-submit process from the EG instance.
      • Create a separate submitter pod. The submitter pod queues launch requests from EG and launches them through a limited process pool. This submitter pod is scalable even while EG is not, because EG only ever passes the parameters needed to launch a process.
  • Session Persistence (duplicate of High Availability - session persistence on bare metal machines #562, Implementing HA Active/Active with distributed file system #594 )

    • This is actually a delicate issue with a high chance of side effects. My idea is to move all session objects into an external in-memory DB such as Redis, so that whenever EG needs to access a session it reads the session information from Redis. I can't yet estimate how much of the source needs to change, as I haven't looked into the code, but I'm guessing the Session Manager and Kernel Manager classes are involved. Could anybody give me feedback on this?
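The submitter-pod idea above can be sketched as a bounded worker pool that serializes launch requests, so only a fixed number of spark-submit processes ever run at once. This is an illustrative sketch, not EG code; `BoundedSubmitter` and its method names are hypothetical.

```python
import queue
import threading

class BoundedSubmitter:
    """Hypothetical sketch of the submitter idea: launch requests are
    queued and executed by a fixed number of workers, so no more than
    max_workers launches (e.g. spark-submit invocations) run at once."""

    def __init__(self, max_workers=4):
        self._requests = queue.Queue()
        self._results = {}
        for _ in range(max_workers):
            threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, job_id, launch_fn, *args):
        # EG would only pass the parameters needed to launch a process.
        self._requests.put((job_id, launch_fn, args))

    def _worker(self):
        while True:
            job_id, launch_fn, args = self._requests.get()
            try:
                self._results[job_id] = launch_fn(*args)
            finally:
                self._requests.task_done()

    def wait_all(self):
        # Block until every queued launch has completed.
        self._requests.join()
        return dict(self._results)
```

A real submitter pod would replace `launch_fn` with a spark-submit invocation and expose the queue over the network (e.g. via Celery), but the back-pressure mechanism is the same.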

With those two changes, I think we can scale out EG instances. Any advice would be appreciated.

Thanks.

Environment

Enterprise Gateway Version 2.x with Asynchronous kernel start feature (#580)
Platform: Kubernetes
Others: nb2kg latest

@kevin-bates
Member

Thanks Evan. I'm cross-referencing the discourse link here: https://discourse.jupyter.org/t/scalable-enterprise-gateway/2014

@esevan
Contributor Author

esevan commented Sep 19, 2019

I'm working on the Submitter idea using Celery as a PoC. This means the PoC introduces additional microservice dependencies, i.e. Redis and Celery (I prefer Redis to RabbitMQ as the Celery message broker because it can also serve as a persistent session store 😁).

My idea is to replace jupyter_client.launch_kernel with remote_launcher.launch_kernel, which returns a RemoteLauncherProcess supporting poll, kill, send_signal, and wait, just like the subprocess.Popen class. The remote_launcher can have many implementations.

So, to start, I'm implementing celery_launcher and CeleryLauncherProcess.
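The Popen-compatible surface described above might look like the following sketch. Only the method set (poll, wait, send_signal, kill) comes from subprocess.Popen; the class names and the trivial in-memory backend are hypothetical.

```python
import abc
import signal

class RemoteLauncherProcess(abc.ABC):
    """Sketch of a Popen-like handle for a remotely launched kernel."""

    @abc.abstractmethod
    def poll(self):
        """Return the exit code, or None if still running."""

    @abc.abstractmethod
    def wait(self, timeout=None):
        """Block until the process exits; return the exit code."""

    @abc.abstractmethod
    def send_signal(self, signum):
        """Deliver a signal to the remote process."""

    def kill(self):
        # POSIX-only convenience, mirroring Popen.kill()
        self.send_signal(signal.SIGKILL)

class InMemoryLauncherProcess(RemoteLauncherProcess):
    """Trivial stand-in backend, useful for unit tests; a real
    CeleryLauncherProcess would forward these calls to a worker."""

    def __init__(self):
        self._returncode = None

    def poll(self):
        return self._returncode

    def wait(self, timeout=None):
        return self._returncode

    def send_signal(self, signum):
        # Popen reports death-by-signal as a negative return code.
        self._returncode = -signum
```

Keeping the interface identical to Popen means the callers that currently consume jupyter_client.launch_kernel's return value would not need to change.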

It may be far off, or turn out to be a different issue entirely, but I'm guessing we could make "kernel start" lazy by returning a job id to nb2kg before the kernel is connected.

@kevin-bates Could you please comment on the lazy kernel start idea? nb2kg would trick the frontend into thinking the kernel is already connected while it keeps polling EG for the kernel's status. Once EG connects to the kernel, nb2kg lazily connects as well and reports the real kernel status ("idle") to the frontend.

@kevin-bates
Member

@esevan - thanks for looking into this. Got a couple comments.

  1. We can't modify nb2kg since that is tantamount to a protocol change - in this case, the job id. EG exposes a REST API and we must view any client-side interactions relative to that. There are many users of EG that do not use nb2kg, so features that trickle into the client are difficult to justify.
  2. Your process launch abstraction is exactly what the process proxies are all about. They just happen to use jupyter_client.launch_kernel. I would prefer we not introduce another abstract layer - but look at extending the process proxies.
  3. We need to keep in mind that there's a high likelihood we'll be moving to the kernel providers architecture at some point (see the kernel-providers in this org - they should look very familiar 😄). When that time comes, functionality in EG will be decoupled from kernel lifecycle management. Some of this could be done in the RemoteKernelProvider (which is essentially the RemoteKernelManager/RemoteProcessProxy code), but it must be completely self-contained. (In this env, things like Notebook could run remote kernels w/o EG.)
  4. I have no objections to celery, redis, etc. However, I want us to try to adopt third-party applications that might already have been adopted (and vetted) by other projects in the Jupyter community. I see celery also supports MongoDB, and Mongo is one I think has been used elsewhere. I know that hub uses SQLite as its default store and Notebook has SQLite code in it, so I don't know if that would be helpful. I'm just saying we should consider what existing Jupyter installations might already be using so we don't end up with a bunch of different NoSQL DBs, etc. in use by default.
  5. If at all possible, a thin abstraction layer over whatever third-party is in use might go a long way in allowing others to create a similar shim and bring their own application.

@esevan
Contributor Author

esevan commented Sep 20, 2019

@kevin-bates Thank you for the detail!

  1. I easily forget that nb2kg is just one of EG's clients. Yes, I agree that we cannot modify nb2kg :(
  2. Right, good point! I don't have to introduce new classes. I'll extend ProcessProxy 👍
  3. This is great. Do you have any videos or documents about these projects? I'll stay tuned 👍
  4. Understood. After the PoC (monitoring how much the process starvation is relieved), I'll start on a module without dependencies on third-party apps.
  5. I agree with the "thin abstraction layer" approach. I'll go ahead with a thin abstraction layer extending the existing ProcessProxy. Thanks!

@lresende
Member

As for the persistence layer, I would also recommend defining an abstraction layer so one can choose between Mongo or any other data store they use/need.
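A minimal version of that persistence abstraction might look like the sketch below. `SessionStore` and its method names are hypothetical, and the dict-backed implementation merely stands in for a Redis, Mongo, or SQLite backend.

```python
import abc
import json

class SessionStore(abc.ABC):
    """Hypothetical persistence abstraction: backends (Redis, MongoDB,
    SQLite, ...) subclass this so EG code never touches a concrete store."""

    @abc.abstractmethod
    def save(self, kernel_id, session):
        """Persist the session dict under kernel_id."""

    @abc.abstractmethod
    def load(self, kernel_id):
        """Return the session dict, or None if absent."""

    @abc.abstractmethod
    def delete(self, kernel_id):
        """Remove the session, if present."""

class InMemorySessionStore(SessionStore):
    """Dict-backed reference backend; JSON round-tripping mimics
    string-valued stores such as Redis."""

    def __init__(self):
        self._data = {}

    def save(self, kernel_id, session):
        self._data[kernel_id] = json.dumps(session)

    def load(self, kernel_id):
        raw = self._data.get(kernel_id)
        return json.loads(raw) if raw is not None else None

    def delete(self, kernel_id):
        self._data.pop(kernel_id, None)
```

Anyone bringing their own data store would only need to implement these three methods against their backend of choice.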
