Add support for running with istio using mTLS #425

Open
wants to merge 1 commit into base: main

Conversation

harsimranmaan
Contributor

@harsimranmaan harsimranmaan commented Aug 31, 2020

Addresses jupyterhub/zero-to-jupyterhub-k8s#697

When Istio sidecar injection is enabled in a namespace running JupyterHub, the different components fail to talk to each other. Introducing a service to front each user pod helps solve this problem. Running the other JH components in a cluster with Istio enabled can also be troublesome, as the proxy can fail due to the service-mesh network setup (in my tests, configurable-http-proxy did not work and the Traefik-based one could not forward client info properly). https://github.com/splunk/jupyterhub-istio-proxy can be used as the proxy that configures Istio-based routing for user pods.

The overall setup:

  • Have Istio enabled in the namespace where JH is running.
  • Enable Istio sidecar injection so that it can intercept network requests and ensure mTLS for intra-namespace communication.
  • Replace the proxy with https://github.com/splunk/jupyterhub-istio-proxy#deployment. Ensure that the ports are set correctly on both JH and the proxy deployment so the two can communicate (see the configuration sketch after this list).
  • Create an Istio gateway to handle incoming traffic to JH. TLS termination can also be done here by setting the SSL certs (a Gateway sketch follows the traffic-flow description below).
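
For illustration, here is a minimal jupyterhub_config.py sketch for pointing the hub at an externally managed jupyterhub-istio-proxy; the service name proxy-api and port 8000 are placeholders for whatever your deployment exposes:

# jupyterhub-istio-proxy runs as its own Deployment, so the hub must not try
# to start or stop the proxy process itself.
c.ConfigurableHTTPProxy.should_start = False

# jupyterhub-istio-proxy implements the same REST API as configurable-http-proxy,
# so the hub's default proxy class can talk to it over the cluster network.
# "proxy-api" and 8000 are placeholders for the actual service name and port.
c.ConfigurableHTTPProxy.api_url = 'http://proxy-api:8000'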

The overall communication would now be:

Web traffic -> Istio gateway (TLS termination) -> Istio VirtualService (created by proxy-api, responsible for routing traffic) -> user's pod --mTLS--> JH

JH --mTLS--> user's pod (via user svc).
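
To make the gateway step concrete, here is a rough sketch (not part of this PR) of creating such a Gateway with the kubernetes Python client's CustomObjectsApi, assuming the networking.istio.io/v1beta1 API; the namespace, host, and TLS secret name are placeholders:

from kubernetes import client, config

config.load_kube_config()

# Gateway terminating TLS at the Istio ingress gateway before traffic reaches JH.
gateway = {
    "apiVersion": "networking.istio.io/v1beta1",
    "kind": "Gateway",
    "metadata": {"name": "jupyterhub-gateway", "namespace": "jhub"},
    "spec": {
        "selector": {"istio": "ingressgateway"},
        "servers": [{
            "port": {"number": 443, "name": "https", "protocol": "HTTPS"},
            "tls": {"mode": "SIMPLE", "credentialName": "jh-tls-cert"},
            "hosts": ["jupyter.example.com"],
        }],
    },
}

client.CustomObjectsApi().create_namespaced_custom_object(
    group="networking.istio.io", version="v1beta1",
    namespace="jhub", plural="gateways", body=gateway,
)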

A separate PR will be opened on the zero-to-jupyterhub-k8s project to add the Helm chart changes needed to support jupyterhub-istio-proxy.

For more info see: https://medium.com/@harsimran.maan/running-jupyterhub-with-istio-service-mesh-on-kubernetes-a-troubleshooting-journey-707039f36a7b

@shenghu

shenghu commented Sep 3, 2020

@harsimranmaan there is a conflict that needs to be resolved

@shenghu

shenghu commented Sep 3, 2020

@yuvipanda can you please review this PR? I'm waiting for this change, to be honest. :)

@yuvipanda
Collaborator

Thank you for working on it, @harsimranmaan! Excited to get this to land 👍

Have you seen the code in https://github.com/jupyterhub/kubespawner/blob/master/kubespawner/proxy.py#L45? Many moons ago, I set up a Proxy purely based on ingresses. That too needed services. But instead of using labels, I optimized by creating EndPoint objects directly - this saves a few loops from the k8s service controller, but maybe not worth it. I don't think that proxy class has seen a lot of work since, so I don't mind this becoming an option in the spawner directly.

I'm guessing we'll have one service per user server. Can we put a label like 'hub.jupyter.org/server-name' or some such and use that for service selection? This way the label is more generic, and can be arbitrarily used by others too (for networkpolicy, for example?).
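
(A rough sketch, not this PR's actual code, of what such a label-selected per-user Service could look like with the kubernetes Python client; the service name, label key usage, and port here are assumptions:)

from kubernetes import client

def make_user_service(server_name, port=8888):
    # Select the user pod via a generic label so other consumers
    # (e.g. NetworkPolicies) can reuse the same label.
    label = {"hub.jupyter.org/server-name": server_name}
    return client.V1Service(
        metadata=client.V1ObjectMeta(name=f"jupyter-{server_name}", labels=label),
        spec=client.V1ServiceSpec(
            type="ClusterIP",
            selector=label,
            ports=[client.V1ServicePort(name="http", port=port, target_port=port)],
        ),
    )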

I haven't managed to go through or test the code yet though :( Not sure when I'll have time for that, but happy to engage at a slightly higher level for now.

@harsimranmaan
Contributor Author

harsimranmaan commented Sep 3, 2020

I'm guessing we'll have one service per user server. Can we put a label like 'hub.jupyter.org/server-name' or some such and use that for service selection? This way the label is more generic, and can be arbitrarily used by others too (for networkpolicy, for example?).

Thanks @yuvipanda. That makes sense. I'll rebase and refactor it. I'll also look at the Endpoints object to see if something can be reused, but that might come in as a separate PR.

(Inline review comment on kubespawner/spawner.py - outdated, resolved)
@harsimranmaan
Contributor Author

@shenghu You might find this helpful https://medium.com/@harsimran.maan/running-jupyterhub-with-istio-service-mesh-on-kubernetes-a-troubleshooting-journey-707039f36a7b

@shenghu

shenghu commented Sep 15, 2020

@harsimranmaan thanks a lot. I'm able to enable Istio by following your article. One issue I'm still having is that the continuous-image-puller could not start its istio-proxy sidecar properly; it failed at mounting the cert. Currently I work around it by not injecting the Istio sidecar into it.

@yuvipanda can this PR be merged so that I can get it more easily, or is there a way to patch kubespawner? Thanks!

@shenghu

shenghu commented Sep 17, 2020

@harsimranmaan FYI, there is a conflict to be resolved

@bleggett

bleggett commented Oct 5, 2020

Thirding this: I have run into the same issue with JupyterHub not playing nicely with Istio; this fixes a big part of that and seems self-contained.

@harsimranmaan
Contributor Author

@harsimranmaan FYI, there is a conflict to be resolved

Sorry, I was away for the last few weeks. Will rebase this.

@harsimranmaan
Contributor Author

I rebased this. I haven't had a chance yet to test the change against the async/await and timeout changes that landed on master in the meantime. This can be reviewed, but I'll report back once I have some data after upgrading my JH clusters to the rebased version.

@shenghu

shenghu commented Oct 22, 2020

@yuvipanda @harsimranmaan can this be merged?

@harsimranmaan
Contributor Author

Validated the changes for creating and deleting user pods:

[I 2020-10-26 21:04:45.871 JupyterHub spawner:1847] Attempting to create pod jupyter-hmaan, with timeout 3
[I 2020-10-26 21:04:45.913 JupyterHub spawner:1917] Attempting to create svc jupyter-hmaan, with timeout 3

@jwclark

jwclark commented Jun 11, 2021

@harsimranmaan @yuvipanda we're also waiting for this to be resolved if it's possible to pick this PR back up again.

@will-ds

will-ds commented Jun 12, 2021

@harsimranmaan @yuvipanda we're also waiting for this to be resolved if it's possible to pick this PR back up again.

@harsimranmaan @yuvipanda we could really use this fix as well.

@harsimranmaan
Contributor Author

I'll try to rebase this later in the week. If someone else is interested in doing so, I'd be happy to merge the rebased code into my branch.

@reganmcdonald

reganmcdonald commented Jul 19, 2021

@harsimranmaan @yuvipanda thank you for your work on this. Any chance you can resolve the PR conflicts?

@consideRatio
Member

Is it correct that #409 and #522 provide the desired functionality, and that this PR can be closed?

@jwclark

jwclark commented Apr 22, 2022

Is it correct that #409 and #522 provide the desired functionality, and that this PR can be closed?

Looks good to me. For our particular deployment we need to be able to configure the V1ServicePort 'name' attribute in the make_service function added in #409. So we may still need to extend KubeSpawner unless that can be easily added in a different PR.
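
For reference, the change needed is just a caller-supplied port name on the generated service, e.g. so it can follow Istio's protocol-prefixed naming convention. A hypothetical illustration (the function, parameter names, and defaults are not the actual make_service signature):

from kubernetes.client import V1ServicePort

def make_named_port(port_name="http-notebook", port=8888):
    # Istio uses port names like "http-<suffix>" for protocol selection,
    # so the name should be configurable rather than hard-coded.
    return V1ServicePort(name=port_name, port=port, target_port=port)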
