-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
End-to-end SSL #2055
End-to-end SSL #2055
Conversation
Add Localhost to trusted alt names Update to match refactored certipy names Add the FQDN to cert alt names for hub Ensure notebooks do not trust each other Drop certs in user's home directory Refactor cert creation and movement Make alt names configurable Make attaching alt names more generic Setup ssl_context for the singleuser hub check
Import socket when needed Move pwd import since more than one thing uses it.
Setup general ssl request, not just to api Basic tests comprised of non-ssl test copies Create the context only when request is http Refactor ssl key, cert, ca names Configure the AsyncHTTPClient at app start Change tests to import existing ones with ssl on Override __new__ in MockHub to turn on SSL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great!
I need to figure out a bit how the cert-staging step is going to work for container-based deployments, but I suspect separating .create_certs
and .move_certs
is already most of what's needed.
We'll need to make the API for .move_certs
very clear, since most Spawners will need to reimplement it. As I understand it:
- key_pair is the dict of keyfile, certfile, cafile paths on the Hub filesystem
- return value is
(keyfile, certfile, cafile)
paths on the notebook server filesystem
e.g. on Docker this could mean creating a docker volume and staging the files into it. On Kubernetes, it could mean creating a Secret.
It would be good to document exactly what files are created, shared, etc. and when, so we know exactly what files are needed by which processes and when, and which are recreated. One important use case is the proxy being in a separate pod/container from the Hub, so we need to make sure that we can create the certs the proxy needs prior to hub start.
jupyterhub/spawner.py
Outdated
if self.internal_ssl: | ||
key, cert, ca = self.move_certs(self.create_certs()) | ||
|
||
args.append('--keyfile="%s"' % key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do this in env instead of args. It's easier to pass things in environment variables with certain deployments than CLI args, and makes things more flexible for alternate entrypoints.
jupyterhub/spawner.py
Outdated
except KeyError: | ||
self.log.debug("User {} not found on system, not moving certs".format(self.user.name)) | ||
|
||
return [key, cert, ca] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe return the dict instead of tuple, so it is clearer what's what. That will also make it easier to preserve backward-compatibility in the future.
|
||
|
||
def upgrade(): | ||
op.add_column('servers', sa.Column('certfile', sa.Unicode(4096))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only need to persist values in the database if these are not predictable per-server. If these are the same (or even deterministic) for all servers, as I think they are, we can leave them out of the database and have them live only in memory.
jupyterhub/app.py
Outdated
@@ -1641,6 +1709,52 @@ def write_pid_file(self): | |||
cfg = self.config.copy() | |||
cfg.JupyterHub.merge(cfg.JupyterHubApp) | |||
self.update_config(cfg) | |||
if self.internal_ssl: | |||
from certipy import Certipy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's put this in a .init_internal_ssl
method for slightly easier organization
jupyterhub/spawner.py
Outdated
home = user.pw_dir | ||
|
||
# Create dir for user's certs wherever we're starting | ||
out_dir = "{home}/.jupyter/certs".format(home=home) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's call this .jupyterhub/jupyterhub-certs
just to be clear
For the case where the proxy lives separately from the hub, would it be worthwhile to enable a command like As for your assessment of I'm working through these changes now. I think the only thing I'm curious about are the docs. I was thinking of expanding the docstrings for |
That's worth a try. I'm not 100% sure what will work best.
oooh, that makes things more difficult because it might be hard/impossible to know this before the server starts in some cases. I'm not quite sure how to deal with that.
Yes, absolutely! |
Ok, just as an update, here's what I'm adding/changing/pondering:
|
To better accommodate external certificate management as well as building of trust, Certipy was refactored. This included general improvements to file and record handling. In the process, some of Certipy's APIs changed slightly, but should be more stable now going forward.
This is used to be able to access JupyterHub's CA information and (manually) move it to components that need them (like externally managed proxies).
This reverts commit bcebf0e. Setting change-origin introduces CORS problems
can have consequences if args are re-used
avoids leaving lingering proxy if app fails to start
must be module-scoped, not session-scoped, or it will get reused inconsistently
avoids possible conflict e.g. if a user had the name 'hub-internal'
- trust subdomain_host by default - JupyterHub.trusted_alt_names is inherited by Spawners by default. Do we need Spawner.ssl_alt_names to be separately configurable?
to cover any protocol mismatches
Okay, after resolving the subdomain issues with trusted_alt_names, I believe this is totally working. I just need to update the tests with real-database backends, because |
avoids pollution from one test module to the next
in case it failed to fully start
Woo, it works! |
Wooooooooooooo wow this PR! Massive work done here! I've been keeping an eye on it, well done @tgmachina and @minrk !!! 🎉 |
Wow! Thanks so much for all your help @minrk! |
This is awesome! Thanks @tgmachina and @minrk! |
These changes are in service of turning on SSL internally for JupyterHub (issue #370).
To generate certs, I created a separate module Certipy that wraps pyOpenSsl. Certipy has also been uploaded to PyPI.
To test these changes, I added 3 separate test files that turn on internalSSL via monkey patching. Tests intermittently fail when run together, but pass individually. @minrk had initially suggested making all requests to the hub
public_url
, but I wasn't able to work out a good scheme for that. I set up the 3 separate test files to get these changes through review ASAP, but I'd be happy to iterate on changing these tests.These changes must be used in conjunction with changes made to CHP in order to work (likely automated github testing will fail because of this). I wasn't entirely sure how to specify a certain CHP version as a requirement.
Happy to iterate on whatever changes you see necessary. Thanks for your help and patience!