End-to-end SSL #2055

tgmachina · 2018-07-23T21:54:42Z

These changes are in service of turning on SSL internally for JupyterHub (issue #370).

To generate certs, I created a separate module, Certipy, that wraps pyOpenSSL. Certipy has also been uploaded to PyPI.

To test these changes, I added 3 separate test files that turn on internal_ssl via monkey patching. The tests intermittently fail when run together but pass individually. @minrk had initially suggested making all requests to the hub public_url, but I wasn't able to work out a good scheme for that. I set up the 3 separate test files to get these changes through review ASAP, but I'd be happy to iterate on changing these tests.

These changes must be used in conjunction with changes made to CHP (configurable-http-proxy) in order to work, so automated GitHub testing will likely fail. I wasn't entirely sure how to specify a certain CHP version as a requirement.

Happy to iterate on whatever changes you see necessary. Thanks for your help and patience!

- Add Localhost to trusted alt names
- Update to match refactored certipy names
- Add the FQDN to cert alt names for hub
- Ensure notebooks do not trust each other
- Drop certs in user's home directory
- Refactor cert creation and movement
- Make alt names configurable
- Make attaching alt names more generic
- Setup ssl_context for the singleuser hub check
- Import socket when needed. Move pwd import since more than one thing uses it.
- Setup general ssl request, not just to api
- Basic tests comprised of non-ssl test copies
- Create the context only when request is http
- Refactor ssl key, cert, ca names
- Configure the AsyncHTTPClient at app start
- Change tests to import existing ones with ssl on
- Override __new__ in MockHub to turn on SSL

minrk

This is great!

I need to figure out a bit how the cert-staging step is going to work for container-based deployments, but I suspect separating .create_certs and .move_certs is already most of what's needed.

We'll need to make the API for .move_certs very clear, since most Spawners will need to reimplement it. As I understand it:

- key_pair is the dict of keyfile, certfile, cafile paths on the Hub filesystem
- return value is (keyfile, certfile, cafile) paths on the notebook server filesystem

e.g. on Docker this could mean creating a docker volume and staging the files into it. On Kubernetes, it could mean creating a Secret.
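To make that contract concrete, here is a minimal sketch under stated assumptions. `stage_certs` and `dest_dir` are hypothetical names, not the Spawner API in this PR; the only parts taken from the description above are the input (a dict with `keyfile`, `certfile`, `cafile` paths on the Hub side) and the output (the three paths as the notebook server would see them). A Docker or Kubernetes spawner would presumably replace the local copy with staging into a volume or a Secret.

```python
import os
import shutil
import tempfile


def stage_certs(paths, dest_dir):
    """Copy the Hub-side cert files into dest_dir (hypothetical helper).

    paths: dict with 'keyfile', 'certfile', 'cafile' on the Hub filesystem.
    Returns (keyfile, certfile, cafile) paths under dest_dir.
    """
    os.makedirs(dest_dir, mode=0o700, exist_ok=True)
    staged = []
    for name in ("keyfile", "certfile", "cafile"):
        target = os.path.join(dest_dir, os.path.basename(paths[name]))
        shutil.copy(paths[name], target)
        os.chmod(target, 0o600)  # restrict the copy to its owner
        staged.append(target)
    return tuple(staged)


# Usage with throwaway files standing in for real certificates.
with tempfile.TemporaryDirectory() as tmp:
    hub_dir = os.path.join(tmp, "hub")
    os.makedirs(hub_dir)
    hub_paths = {}
    for name, filename in [
        ("keyfile", "user.key"),
        ("certfile", "user.crt"),
        ("cafile", "ca.crt"),
    ]:
        path = os.path.join(hub_dir, filename)
        with open(path, "w") as f:
            f.write("placeholder for " + name)
        hub_paths[name] = path

    key, cert, ca = stage_certs(hub_paths, os.path.join(tmp, "notebook"))
    staged_names = [os.path.basename(p) for p in (key, cert, ca)]
```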

It would be good to document exactly what files are created, shared, etc. and when, so we know exactly what files are needed by which processes and when, and which are recreated. One important use case is the proxy being in a separate pod/container from the Hub, so we need to make sure that we can create the certs the proxy needs prior to hub start.

minrk · 2018-07-26T08:54:32Z

jupyterhub/spawner.py

+        if self.internal_ssl:
+            key, cert, ca = self.move_certs(self.create_certs())
+
+            args.append('--keyfile="%s"' % key)


Let's do this in env instead of args. It's easier to pass things in environment variables with certain deployments than CLI args, and makes things more flexible for alternate entrypoints.
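A rough sketch of what passing the paths through the environment could look like. The variable names and both helper functions are illustrative assumptions, not taken from the diff under review; the point is only that any entrypoint can read the same variables without parsing CLI arguments.

```python
import os


def ssl_env(paths, base_env=None):
    """Return a copy of base_env with the cert paths added (illustrative names)."""
    env = dict(os.environ if base_env is None else base_env)
    env["JUPYTERHUB_SSL_KEYFILE"] = paths["keyfile"]
    env["JUPYTERHUB_SSL_CERTFILE"] = paths["certfile"]
    env["JUPYTERHUB_SSL_CLIENT_CA"] = paths["cafile"]
    return env


def ssl_paths_from_env(env):
    """What an alternate entrypoint could do on the other side."""
    return {
        "keyfile": env.get("JUPYTERHUB_SSL_KEYFILE", ""),
        "certfile": env.get("JUPYTERHUB_SSL_CERTFILE", ""),
        "cafile": env.get("JUPYTERHUB_SSL_CLIENT_CA", ""),
    }


# Made-up paths, only to show the round trip.
paths = {
    "keyfile": "/certs/user.key",
    "certfile": "/certs/user.crt",
    "cafile": "/certs/ca.crt",
}
env = ssl_env(paths, base_env={"PATH": "/usr/bin"})
round_trip = ssl_paths_from_env(env)
```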

minrk · 2018-07-26T08:55:47Z

jupyterhub/spawner.py

+        except KeyError:
+            self.log.debug("User {} not found on system, not moving certs".format(self.user.name))
+
+        return [key, cert, ca]


Maybe return a dict instead of a tuple, so it is clearer what's what. That will also make it easier to preserve backward-compatibility in the future.
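A small illustration of the backward-compatibility point, using made-up paths and hypothetical function names. Callers that look values up by key keep working when a new entry is added, whereas positional unpacking breaks as soon as the sequence grows.

```python
def move_certs_as_list():
    # Positional result: callers must know the order.
    return ["/certs/user.key", "/certs/user.crt", "/certs/ca.crt"]


def move_certs_as_dict():
    # Keyed result: each path is labelled.
    return {
        "keyfile": "/certs/user.key",
        "certfile": "/certs/user.crt",
        "cafile": "/certs/ca.crt",
    }


# Suppose a later version adds a fourth value (hypothetical).
extended_list = move_certs_as_list() + ["/certs/extra.pem"]
extended_dict = dict(move_certs_as_dict(), extra="/certs/extra.pem")

try:
    key, cert, ca = extended_list  # old caller, positional unpacking
    list_caller_ok = True
except ValueError:
    list_caller_ok = False

# An old caller that reads by key is unaffected by the new entry.
dict_caller_ok = extended_dict["keyfile"] == "/certs/user.key"
```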

minrk · 2018-07-26T08:56:59Z

jupyterhub/alembic/versions/26fc487e2b43_ssl_info_into_server.py

+
+
+def upgrade():
+    op.add_column('servers', sa.Column('certfile', sa.Unicode(4096)))


We only need to persist values in the database if these are not predictable per-server. If these are the same (or even deterministic) for all servers, as I think they are, we can leave them out of the database and have them live only in memory.
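As a sketch of what "deterministic" could mean here: if the paths are a pure function of values the Hub already knows, they can be recomputed on demand instead of stored. The directory layout, file names, and the `cert_paths_for` helper below are assumptions for illustration only.

```python
import posixpath


def cert_paths_for(base_dir, username, server_name=""):
    """Derive cert paths from names alone (hypothetical layout)."""
    component = "user-" + username
    if server_name:
        component += "-" + server_name
    out_dir = posixpath.join(base_dir, component)
    return {
        "keyfile": posixpath.join(out_dir, component + ".key"),
        "certfile": posixpath.join(out_dir, component + ".crt"),
        "cafile": posixpath.join(base_dir, "ca.crt"),
    }


# Same inputs always give the same paths, so nothing needs a database column.
first = cert_paths_for("/srv/certs", "alice")
second = cert_paths_for("/srv/certs", "alice")
```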

minrk · 2018-07-26T08:58:13Z

jupyterhub/app.py

@@ -1641,6 +1709,52 @@ def write_pid_file(self):
            cfg = self.config.copy()
            cfg.JupyterHub.merge(cfg.JupyterHubApp)
            self.update_config(cfg)
+        if self.internal_ssl:
+            from certipy import Certipy


Let's put this in a .init_internal_ssl method for slightly easier organization

minrk · 2018-07-26T09:02:38Z

jupyterhub/spawner.py

+            home = user.pw_dir
+
+            # Create dir for user's certs wherever we're starting
+            out_dir = "{home}/.jupyter/certs".format(home=home)


Let's call this .jupyterhub/jupyterhub-certs just to be clear

tgmachina · 2018-07-27T02:17:48Z

For the case where the proxy lives separately from the hub, would it be worthwhile to add a command similar to jupyterhub --generate-config that sets up the initial certificate authorities and certs used by internal_ssl, for example something like jupyterhub --generate-certs? That would allow users to get and move the certs they need manually.

As for your assessment of .move_certs, that is correct. I have to think about the container case a bit more too, because it's also important to get the appropriate domain/ip info into the cert in order for ssl verification to work. That capability exists with the use of alt_names at certificate creation, but I am unsure whether the info is available or known before container start. I'll think about that some more and work on clarifying this API. Good call.
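For illustration, a sketch of collecting the names a certificate would need to cover. The `DNS:` / `IP:` prefixes follow the usual subjectAltName notation; whether the cert tooling accepts exactly this format is an assumption, and `build_alt_names` is a hypothetical helper.

```python
import ipaddress
import socket


def build_alt_names(hosts):
    """Format hosts as 'DNS:...' or 'IP:...' entries, dropping duplicates."""
    names = []
    for host in hosts:
        if not host:
            continue  # skip empty values
        try:
            ipaddress.ip_address(host)
            entry = "IP:" + host
        except ValueError:
            # Not a valid IP literal, so treat it as a hostname.
            entry = "DNS:" + host
        if entry not in names:
            names.append(entry)
    return names


# localhost, the loopback address, and this machine's FQDN.
alt_names = build_alt_names(
    ["localhost", "127.0.0.1", socket.getfqdn(), "localhost"]
)
```

The difficulty raised above remains: this only works when the hostname or IP is known at certificate creation time.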

I'm working through these changes now. The only thing I'm curious about is the docs. I was thinking of expanding the docstrings for .create_certs and .move_certs as well as including more notes within the proxy.md, spawner.md and websecurity.md docs. Seem reasonable?

minrk · 2018-08-01T08:42:47Z

> something like: jupyterhub --generate-certs?

That's worth a try. I'm not 100% sure what will work best.

> because it's also important to get the appropriate domain/ip info into the cert in order for ssl verification to work

oooh, that makes things more difficult because it might be hard/impossible to know this before the server starts in some cases. I'm not quite sure how to deal with that.

> I was thinking of expanding the docstrings for .create_certs and .move_certs as well as including more notes within the proxy.md, spawner.md and websecurity.md docs. Seem reasonable?

Yes, absolutely!

tgmachina · 2018-08-09T21:18:39Z

Ok, just as an update, here's what I'm adding/changing/pondering:

- Nearly done with updating the docs, just pending the changes below.
- The Proxy class will have a reference to its own certs, either:
  - Created and signed on Proxy init using the Hub CA (basically how it's done now).
  - Created by some third-party mechanism (Let's Encrypt, Certipy, self-signed, etc.) for which the signing CA file is specified in the proxy config so that trust can be propagated appropriately.
- Docker/Kube cases w.r.t. internal_ssl:
  - Swarm/Kube managed deployments: as I understand them, the networking can be set up such that containers are connected with TLS. If containers/pods running the single-user notebooks can't reach each other, this seems like it would have about the same effect as using internal_ssl.
  - Docker does have a mechanism for specifying an IP address prior to launch. This would entail drawing a random, unused IP from Docker's reserved subnet for container IPs. I can investigate what enabling that would entail.
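A sketch of the "draw a random, unused IP" idea using only the standard library. The subnet and the list of used addresses are example values, and `pick_unused_ip` is a hypothetical helper; how to query Docker for the addresses already in use is left out.

```python
import ipaddress
import random


def pick_unused_ip(subnet, used, rng=random):
    """Return a random host address in subnet that is not in used."""
    network = ipaddress.ip_network(subnet)
    taken = {ipaddress.ip_address(ip) for ip in used}
    # Listing every host is fine for a small subnet; a large one would
    # call for sampling instead of building the whole list.
    candidates = [host for host in network.hosts() if host not in taken]
    if not candidates:
        raise RuntimeError("no free addresses in " + subnet)
    return str(rng.choice(candidates))


# Example values only: a /29 has six usable hosts, two of them taken.
chosen = pick_unused_ip("172.18.0.0/29", used=["172.18.0.1", "172.18.0.2"])
```

With the address fixed before launch, it could be added to the cert's alt names at creation time.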

To better accommodate external certificate management as well as building of trust, Certipy was refactored. This included general improvements to file and record handling. In the process, some of Certipy's APIs changed slightly, but should be more stable now going forward.

minrk · 2018-10-16T14:14:14Z

Okay, after resolving the subdomain issues with trusted_alt_names, I believe this is totally working. I just need to update the tests with real-database backends, because test_api can't actually be run multiple times against the same database.

minrk · 2018-10-17T11:27:07Z

Woo, it works!

consideRatio · 2018-10-17T12:13:26Z

Wooooooooooooo wow this PR! Massive work done here! I've been keeping an eye on it, well done @tgmachina and @minrk !!! 🎉

tgmachina · 2018-10-17T14:48:41Z

Wow! Thanks so much for all your help @minrk!

tgamblin · 2018-10-17T15:21:13Z

This is awesome! Thanks @tgmachina and @minrk!

tgmachina added 17 commits July 18, 2018 16:02

Build ssl_context as util, wait_up with context

f7f4759

Add config and wiring for enabling internal ssl in app

a69e906

Propagate certs to everything that needs them

c50cd1b

Use certipy to automate cert creation

c5faf2c

Remove unnecessary flag, forward-ssl

7c6972d

Import socket when needed Move pwd import since more than one thing uses it.

Server cert info into objects and orm

3c21e7d

Only internal_ssl kwargs if internal_ssl is enabled

25e6b31

Testing internal ssl modifications

a549edf

Allow option to specify ssl_context in wait_up

0304dd0

Only import certipy if internal_ssl is turned on

5c39325

Set http[s] as appropriate for the singleuser url

01b2764

Add db migration for ssl changes to servers

fa3437c

Remove vague try-catch

1fc7508

Fix docstring

5de870b

Add Certipy to requirements now that its in PyPI

d429433

minrk reviewed Jul 26, 2018

tgmachina added 2 commits July 26, 2018 14:29

Remove certs from the Server orm

6000a84

Pass certfile info via env instead of args

3adbfe3

tgmachina changed the title ~~End to end ssl~~ End-to-end SSL Jul 27, 2018

tgmachina added 3 commits July 27, 2018 16:41

Move internal_ssl init into an init function

dd4df87

Clarify output directory name for user certs

e082b92

Return a dict instead of a tuple from move_certs

9607edc

tgmachina added 3 commits September 4, 2018 15:08

Add the ability to generate JupyterHub's certificates

2a0e5d9

This is used to be able to access JupyterHub's CA information and (manually) move it to components that need them (like externally managed proxies).

Update doc strings for create_certs and move_certs

84deb1f

minrk added 6 commits October 12, 2018 16:24

Revert "Set change-origin so certs behind proxy work"

7c0e113

This reverts commit bcebf0e. Setting change-origin introduces CORS problems

ssl tests can use configproxy

67f21bb

avoid modifying headers in-place

28c6377

can have consequences if args are re-used

register cleanup before start

b72d887

avoids leaving lingering proxy if app fails to start

fix ssl tmpdir in tests

d64853a

must be module-scoped, not session-scoped, or it will get reused inconsistently

ensure AsyncIOMainLoop is registered in tests

f3c2a15

minrk force-pushed the end-to-end-ssl branch from c35e9e9 to f3c2a15 Compare October 15, 2018 14:29

minrk added 5 commits October 16, 2018 15:45

avoid unnecessarily recreating proxy certs

1f31658

add user- prefix to user cert dirs

9a45f4a

avoids possible conflict e.g. if a user had the name 'hub-internal'

consolidate trusted alt names

eb7648a

- trust subdomain_host by default - JupyterHub.trusted_alt_names is inherited by Spawners by default. Do we need Spawner.ssl_alt_names to be separately configurable?

run internal-ssl tests with external http

e921354

to cover any protocol mismatches

ensure user's own subdomain is in trusted alt names

15788be

minrk added 4 commits October 17, 2018 10:38

Delete users in MockHub

301fed3

avoids pollution from one test module to the next

avoid cleaning users when we are testing resume

b0116ee

empty groups, too

e385214

Catch and print errors stopping hub

7a055e6

in case it failed to fully start

minrk merged commit 2d94b29 into jupyterhub:master Oct 17, 2018

minrk mentioned this pull request Oct 17, 2018

Enable SSL/TLS for internal communication #370

Closed

tgmachina deleted the end-to-end-ssl branch October 17, 2018 15:35

aseishas mentioned this pull request Nov 30, 2018

Any way to not have public notebook servers? jupyterhub/batchspawner#31

Closed

rcthomas mentioned this pull request Apr 10, 2019

[idea] batchspawner sprint jupyterhub/batchspawner#138

Closed

yuvipanda mentioned this pull request Oct 5, 2019

Token based security for workers dask/distributed#1686

Closed

manics mentioned this pull request Dec 16, 2019

Support usage of JupyterHub's internal_ssl functionality jupyterhub/zero-to-jupyterhub-k8s#1520

Open

consideRatio mentioned this pull request Apr 3, 2020

support internal_ssl for kubespawner jupyterhub/kubespawner#386

Closed

abagali1 mentioned this pull request Sep 20, 2020

internal_ssl + SlurmSpawner leads to certificate verification error jupyterhub/batchspawner#192

Open
