Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue running nbclient with a gateway url #26

Closed
golf-player opened this issue Feb 21, 2020 · 16 comments
Closed

Issue running nbclient with a gateway url #26

golf-player opened this issue Feb 21, 2020 · 16 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@golf-player
Copy link
Contributor

MappingKernelManager cannot be used as kernel_manager_class and I only just realized it didn't inherit from KernelManager. Looking at the source, it seems they're very different things, but that's confusing given the names.

Since GatewayKernelManager doesn't inherit from KernelManager, it doesn't seem like there's a way to get this working.

Am I doing something dumb here?

@MSeal
Copy link
Contributor

MSeal commented Feb 21, 2020

You're not doing anything dumb, there were some independent development tracks for various efforts. It might be useful to ping on the of GatewayKernelManager devs into the thread. I remember having a conversation on this topic a long while ago but it's likely my knowledge on GatewayKernelManager is quite outdated.

@golf-player
Copy link
Contributor Author

golf-player commented Feb 21, 2020

Bringing over the discussion to this thread.

From what I see, there's some pretty vast differences between the 2 kernel managers; interfaces are very different, one's async with tornado, the other's not. And then there's this weird client class that nobody uses that I can see: https://github.com/jupyter/enterprise_gateway/blob/master/enterprise_gateway/client/gateway_client.py#L76 which is sync. Haven't tried using it yet though

It's going to be difficult to do this very cleanly without rewriting a lot of things across jupyter_client, nbclient, and enterprise_gateway. I'm going to start with a proof of concept anyway; will post here after.

It seems this is also a bit related: jupyter/nbconvert#809

@golf-player
Copy link
Contributor Author

@takluyver this seems to be something you're tackling with jupyter_kernel_mgmt Please correct me if I'm wrong

@golf-player
Copy link
Contributor Author

golf-player commented Feb 23, 2020

So it seems that in addition to porting over the jupyter kernel mgmt style over to nbclient, a separate gateway client needs to be created since the available clients are interacting with ZMQ directly. or perhaps just a messaging class

@MSeal
Copy link
Contributor

MSeal commented Feb 25, 2020

Hmm, well if we generalized the message getting a bit more it might be generalizable enough to branch between Gateway and normal Kernel managers. I saw @kevin-bates was involved over there, he's got a lot of parallel threads going but he may be able to chime in here on this more directly.

@kevin-bates
Copy link
Member

Thanks for the copy @MSeal. I apologize upfront for the long-winded content.

This is the first I've looked at nbclient (so many repos, so little time!) and I think I see the confusion here. Notebook's kernel_manager_class traitlet is really a reference to a subclass of jupyter_client's MultiKernelManager. The default value for kernel_manager_class is Notebook's MappingKernelManager from which GatewayKernelManager is derived.

MappingKernelManager is responsible for managing multiple KernelManager instances in a map indexed by kernel_id. MappingKernelManager also has a traitlet named kernel_manager_class - whose default value is KernelManager.

GatewayKernelManager replaces MappingKernelManager when --gateway-url is "in play". It does not manage KernelManager instances, but instead, proxies lifecycle actions to the configured Gateway server. GatewayClient is specific to the gateway integration and is essentially a singleton of static information that is constant during the Notebook process's execution.

Nbclient's kernel_manager_class traitlet, on the other hand, is a reference to a subclass of jupyter_client's KernelManager. It looks like nbclient uses the KernelManager instance directly and does not go through any kind of MultiKernelManager to manage the KernelManager instances.

You're right, the two kinds of kernel_manager_class values are quite different. I always look for the kernel_id parameter in the methods. If that parameter exists, then you're dealing with a form of MultiKernelManager, else you're likely dealing with a KernelManager directly.

What are you trying to achieve wrt Enterprise Gateway? You might be able to use EG's RemoteKernelManager in place of KernelManager to launch remote kernels, but that hasn't been done before, so I'm not certain of the outcome.

One of the unfortunate pitfalls of the current architecture is that you can only have one class of KernelManager. That limitation is what the jupyter_kernel_mgmt efforts are addressing. With "JKM", different kernel "providers" can be installed and the kernel discovery model will expose kernel specifications from each registered provider. Then, each provider implements its own kernel manager instance specific to the kind of kernel it's responsible for. The various Enterprise Gateway "process proxy" implementations have been converted to kernel providers in the Enterprise Gateway Experiements organization. The two prototyped providers are KubernetesKernelProvider and YarnKernelProvider from that org, both of which derive from RemoteKernelProvider.

With kernel providers in place, applications like Notebook can leverage things like remote kernels directly - without requiring an EG instance, etc. The plan is that jupyter_server will adopt JKM as its kernel management framework.

@MSeal
Copy link
Contributor

MSeal commented Feb 25, 2020

Thank you for the rich context @kevin-bates.

I'm ok with removing the traitlet and strict class inheritance present. It's only there to make the transition out of nbconvert easy. Sounds like maybe we need to support MultiKernelManager, RemoteKernelProvider, and KernelManager class types and write different code execution paths with as much shared code between them as possible. I'm going to mark this as an enhancement task, but I won't be able to get to such crafting for sometime. Probably going to help get the async changes up and down the stack merged first.

@golf-player if that description makes sense to you and you want to take a crack at a first-attempt I can provide reviews and suggestions. No pressure to do so, as it'll be a bit of work, but glad to have you contribute in significant ways if you want to try.

@MSeal MSeal added enhancement New feature or request help wanted Extra attention is needed labels Feb 25, 2020
@golf-player
Copy link
Contributor Author

@kevin-bates @MSeal appreciate the information here.

What are you trying to achieve wrt Enterprise Gateway?
I have a service running nbconvert on a remote kernel using a custom kernelclient. Since gateway exists and maintaining that kernel client is getting more and more expensive, we're looking to switch over. We need something to remotely execute notebooks.

WRT using RemoteKernelManager, I'm not sure that would make sense since that interacts with the processproxies directly rather than the gateway interface.

Right now, I'm just trying to redo the work previously done to get nbclient to use JKM since that's something that I assume y'all will want done eventually and half-solves the problem since blockingKernelClient should be easy to hook up to gateway (although I'm not 100% sure exactly how it muxes the various channels).

This is something I'll be working on in the near future, hopefully I can get some real amount of time allocated next week as this is an initiative my company seems to be leaning towards funding.

Appreciate the help! What's a good way to get in touch for one-off questions in case I get stuck somewhere? I see the gitter's a bit dead. Mailing list? Slack? Discuss forum?

@kevin-bates
Copy link
Member

RemoteKernelManager and the gateway interface are orthogonal. The gateway interface only proxies /api/kernels and /api/kernelspecs requests to another server. If that server is running Kernel Gateway, any kernels launched are local to that server. If that server is running Enterprise Gateway, kernels launched by that server may or may not be local to that server.

RemoteKernelManager is pretty much just like KernelManager, but it leverages the process proxy, which essentially abstracts the Popen subprocess mechanism, allowing us to launch, discover and terminate the kernel using a resource manager (Hadoop YARN, Kubernetes, etc.). The "remoting" is more a function of the resource manager than the kernel manager/process-proxy. That said, I agree that using RemoteKernelManager directly will probably create more work than its worth - especially given the existence of JKM.

What kinds of systems are you targetting to host your remote kernels? Just trying to understand if Kernel Gateway may be more what you want.

I think using the EG discussion forum or opening an issue on the EG repo are probably your two best locations for EG questions. If the issue is related to the remote kernel providers, please open an issue in those repos. If specific to JKM, let's just use the JKM repo. I watch all of these locations.

The evolution to JKM is a big change. I believe it will be adopted eventually but can't predict when.

@golf-player
Copy link
Contributor Author

golf-player commented Feb 26, 2020

I see what you're saying; or at least I hope I do: you're talking about not involving the gateway server at all and instead directly using RKM to interact with process proxies.

This could be a better approach than the one I was thinking of. I'll try this as well, although now I'm a bit confused about what you meant here: "using RemoteKernelManager directly will probably create more work than its worth - especially given the existence of JKM" why would RKM make things harder than anything else while using jupyter kernel mgmt?

We're hosting remote kernels on a couple kube clusters. Going by https://jupyter-enterprise-gateway.readthedocs.io/en/latest/getting-started.html#kernel-gateway-vs-enterprise-gateway, enterprise gateway is definitely what we'll need.

Keeping this on github seems like a good idea then.

@kevin-bates
Copy link
Member

Hi.

"using RemoteKernelManager directly will probably create more work than its worth - especially given the existence of JKM" why would RKM make things harder than anything else while using jupyter kernel mgmt?

This is because RemoteKernelManager is baked into EG and probably relies on some information that EG provides. It was not designed to be run "standalone". JKM, on the other hand, is designed to be its own isolated entity - although the remote kernel providers we do provide have some of the EG functionality, but are not necessarily dependent on EG (like I suspect RKM to be).

We're hosting remote kernels on a couple kube clusters. Going by https://jupyter-enterprise-gateway.readthedocs.io/en/latest/getting-started.html#kernel-gateway-vs-enterprise-gateway, enterprise gateway is definitely what we'll need.

I agree EG is what you want.

However, nbclient comes at things at a far lower level. I.e., it's not a web application nor is it designed to hit a web application (via REST, etc.). So it seems like JKM is probably your best shot. I would also like to point out that the KubernetesKernelProvider requires its caller (nbclient in this case), also be running in the k8s cluster. That's one of the advantages that EG brings, in that with EG, only it needs to be running in k8s, while the Notebook server that the user hits can be outside.

@golf-player
Copy link
Contributor Author

golf-player commented Feb 26, 2020

Appreciate the clarification. I hadn't looked too deeply into it, but it makes sense that the kuberneteskernelprovider would need to be run in the cluster. Currently the service I have running nbconvert is run outside the cluster, but that could be changed easily.

So it seems like the "correct" long term solution would be to implement jkm, but it's a massive effort. I'll see what baby steps I can find that might help.

@golf-player
Copy link
Contributor Author

golf-player commented Apr 22, 2020

Got pulled into another project. I finally got time to work on this again. @kevin-bates your suggestion to use RemoteKernelManager directly worked like a charm. There are some things I will contribute upstream since configuring it was challenging.

Here's the code I used:

import os
import nbformat
from nbclient import NotebookClient
from enterprise_gateway.services.kernels.remotemanager import RemoteKernelManager, RemoteMappingKernelManager
from enterprise_gateway.enterprisegatewayapp import EnterpriseGatewayApp

eg = EnterpriseGatewayApp()
rmkm = RemoteMappingKernelManager(parent=eg,
                                  log=eg.log,
                                  connection_dir=eg.runtime_dir,)


class RKM(RemoteKernelManager):
    def __init__(self, **kwargs):
        kernel_id = rmkm.new_kernel_id()
        kwargs.update({
            "parent": rmkm,
            "log": rmkm.log,
            "kernel_name": "python_kubernetes",
            "connection_file": os.path.join(rmkm.connection_dir, "kernel-%s.json" % kernel_id)
        })
        super(RKM, self).__init__(**kwargs)
        self.kernel_id = kernel_id


with open("/home/ish/notebooks/Untitled.ipynb") as fp:
    test_notebook = nbformat.read(fp, as_version=4)

client = NotebookClient(nb=test_notebook, kernel_manager_class=RKM)

Couple things I plan to address (and I'd appreciate your thoughts here):

  • I think RemoteKernelManager should make a kernel_id if one is not set (still not actually sure where kernel_id gets set normally; the factory calls don't appear to pass in kernel_id)
  • RemoteKernelManager gets its config from parent.parent (JEG app instance). Do y'all think it makes sense to give it the same config as JEG (not too familiar with traitlets, but I assume there's a way to share config as a mixin or something)? If not, I'll need to add a parent param to the NotebookClient constructor to receive parent and set it on the kernel_manager_class, but tbh I dislike that solution.

@kevin-bates
Copy link
Member

I'm glad to hear you've had some success, but, to be honest, I don't think this is the right approach. You really want a "kernel manager" that is not so strongly tied to a specific application (EG). Had I envisioned RemoteKernelManager being used outside EG, I'd probably have done some things differently, but between figuring out how things work in this ecosystem and the need to get something working, some corners were probably cut. I think the better solution would be to leverage the kernel provider architecture, since this one of the use cases that would address. However, that's not at a level of acceptance yet that would warrant any kind of changes and I'm not sure how the various applications are planning that adoption, if at all.

I would be happy to keep working with you, but I don't advocate making essentially one-off changes to this repository to try to force in something that wasn't designed for this use case.

Regarding kernel_id, the base framework has always treated kernel_id as something only the MultiKernelManager needs - since it represents the index into its hashmap of kernel managers. This was water under the bridge when RMK was developed. Since nbclient is entirely single-use, there's no need for a multi-kernel manager. I believe "ownership" of kernel_id lives in the kernel provider in the JKM architecture.

If you'd like to open an issue in EG about making RMK a bit more independent, that's probably the best approach for now. There might be a way we could massage things such that no nbclient changes would be required - which I view as the primary requirement to this exercise.

@golf-player
Copy link
Contributor Author

Agreed that repo isn't the best place for this. I dislike the solution that touches nbclient too. I've created jupyter-server/enterprise_gateway#803 and I think this one can be closed.

@MSeal
Copy link
Contributor

MSeal commented Apr 22, 2020

Glad there was a alignment in direction and clarity gained on this issue.

@MSeal MSeal closed this as completed Apr 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

3 participants