diff --git a/README.md b/README.md
index 126af9af5d..ad11aeaac8 100644
--- a/README.md
+++ b/README.md
@@ -75,6 +75,10 @@ If you want multiple users to be able to sign into the server, you will need to
 The [wiki](https://github.com/jupyter/jupyterhub/wiki/Using-sudo-to-run-JupyterHub-without-root-privileges) describes how to run the server
 as a less privileged user, which requires more configuration of the system.
 
+## Getting started
+
+See the [getting started doc](docs/getting-started.md) for some of the basics of configuring your JupyterHub deployment.
+
 ### Some examples
 
 generate a default config file:
diff --git a/docs/getting-started.md b/docs/getting-started.md
new file mode 100644
index 0000000000..1304ef3506
--- /dev/null
+++ b/docs/getting-started.md
@@ -0,0 +1,389 @@
+# Getting started with JupyterHub
+
+This document describes some of the basics of configuring JupyterHub to do what you want.
+JupyterHub is highly customizable, so there's a lot to cover.
+
+
+## Installation
+
+See [the readme](../README.md) for help installing JupyterHub.
+
+
+## Overview
+
+JupyterHub is a set of processes that together provide a multiuser Jupyter Notebook server.
+There are three main categories of processes run by the `jupyterhub` command line program:
+
+- *Single User Server*: a dedicated, single-user, Jupyter Notebook server is started for each user on the system
+  when they log in. The object that starts these processes is called a *Spawner*.
+- *Proxy*: the public facing part of the server that uses a dynamic proxy to route HTTP requests
+  to the *Hub* and *Single User Servers*.
+- *Hub*: manages user accounts and authentication and coordinates *Single User Servers* using a *Spawner*.
+
+## JupyterHub's default behavior
+
+
+To start JupyterHub in its default configuration, type the following at the command line:
+
+    sudo jupyterhub
+
+The default Authenticator that ships with JupyterHub authenticates users
+with their system name and password (via [PAM][]).
+Any user on the system with a password will be allowed to start a single-user notebook server.
+
+The default Spawner starts servers locally as each user, one dedicated server per user.
+These servers listen on localhost, and start in the given user's home directory.
+
+By default, the *Proxy* listens on all public interfaces on port 8000.
+Thus you can reach JupyterHub through:
+
+    http://localhost:8000
+
+or any other public IP or domain pointing to your system.
+
+In their default configuration, the other services, the *Hub* and *Single-User Servers*,
+all communicate with each other on localhost only.
+
+**NOTE:** In its default configuration, JupyterHub runs without SSL encryption (HTTPS).
+You should not run JupyterHub without SSL encryption on a public network.
+See [below](#Security) for how to configure JupyterHub to use SSL.
+
+By default, starting JupyterHub will write two files to disk in the current working directory:
+
+- `jupyterhub.sqlite` is the sqlite database containing all of the state of the *Hub*.
+  This file allows the *Hub* to remember which users are running and where,
+  as well as other information enabling you to restart parts of JupyterHub separately.
+- `jupyterhub_cookie_secret` is the encryption key used for securing cookies.
+  This file needs to persist so that restarting the Hub server does not invalidate cookies.
+  Conversely, deleting this file and restarting the server effectively invalidates all login cookies.
+  The cookie secret file is discussed [below](#Security).
+
+The location of these files can be specified via configuration, discussed below.
+
+
+## How to configure JupyterHub
+
+JupyterHub is configured in two ways:
+
+1. Command-line arguments
+2. Configuration files
+
+Type the following for brief information about the command line arguments:
+
+    jupyterhub -h
+
+or:
+
+    jupyterhub --help-all
+
+for the full command line help.
+
+By default, JupyterHub will look for a configuration file (which may be missing)
+named `jupyterhub_config.py` in the current working directory.
+You can create an empty configuration file with:
+
+
+    jupyterhub --generate-config
+
+This empty configuration file has descriptions of all configuration variables and their default values.
+You can load a specific config file with:
+
+
+    jupyterhub -f /path/to/jupyterhub_config.py
+
+See also: [general docs](http://ipython.org/ipython-doc/dev/development/config.html)
+on the config system Jupyter uses.
+
+
+## Networking
+
+In most situations you will want to change the main IP address and port of the Proxy.
+This address determines where JupyterHub is available to your users.
+The default is all network interfaces (`''`) on port 8000.
+
+This can be done with the following command line arguments:
+
+    jupyterhub --ip=192.168.1.2 --port=443
+
+Or you can put the following lines in a configuration file:
+
+```python
+c.JupyterHub.ip = '192.168.1.2'
+c.JupyterHub.port = 443
+```
+
+Port 443 is used in these examples as it is the default port for SSL/HTTPS.
+
+Configuring only the main IP and port of JupyterHub should be sufficient for most deployments of JupyterHub.
+However, for more customized scenarios,
+you can configure the following additional networking details.
+
+The Hub service talks to the proxy via a REST API on a secondary port,
+whose network interface and port can be configured separately.
+By default, this REST API listens on port 8081 of localhost only.
+If you want to run the Proxy separate from the Hub,
+you may need to configure this IP and port with:
+
+```python
+# ideally a private network address
+c.JupyterHub.proxy_api_ip = '10.0.1.4'
+c.JupyterHub.proxy_api_port = 5432
+```
+
+The Hub service also listens only on localhost by default.
+The Hub needs to be accessible from both the proxy and all Spawners.
+When spawning local servers localhost is fine,
+but if *either* the Proxy or (more likely) the Spawners will be remote or isolated in containers,
+the Hub must listen on an IP that is accessible.
+
+```python
+c.JupyterHub.hub_ip = '10.0.1.4'
+c.JupyterHub.hub_port = 54321
+```
+
+## Security
+
+First of all, since JupyterHub includes authentication and allows arbitrary code execution,
+you should not run it without SSL (HTTPS).
+This will require you to obtain an official SSL certificate or create a self-signed certificate.
+Once you have obtained and installed a key and certificate
+you need to pass their locations to JupyterHub's configuration as follows:
+
+```python
+c.JupyterHub.ssl_key = '/path/to/my.key'
+c.JupyterHub.ssl_cert = '/path/to/my.cert'
+```
+
+Some cert files also contain the key, in which case only the cert is needed.
+It is important that these files be put in a secure location on your server.
+
+There are two other aspects of JupyterHub network security.
+
+The cookie secret is an encryption key, used to encrypt the cookies used for authentication.
+If this value changes for the Hub,
+all single-user servers must also be restarted.
+Normally, this value is stored in the file `jupyterhub_cookie_secret`,
+which can be specified with:
+
+```python
+c.JupyterHub.cookie_secret_file = '/path/to/jupyterhub_cookie_secret'
+```
+
+In most deployments of JupyterHub, you should point this to a secure location on the file system.
+If the cookie secret file doesn't exist when the Hub starts,
+a new cookie secret is generated and stored in the file.
+
+If you would like to avoid the need for files,
+the value can be loaded in the Hub process from the `JPY_COOKIE_SECRET` env variable:
+
+```bash
+export JPY_COOKIE_SECRET=`openssl rand -hex 1024`
+```
+
+For security reasons, this env variable should only be visible to the Hub.
+
+The Hub authenticates its requests to the Proxy via an environment variable, `CONFIGPROXY_AUTH_TOKEN`.
+If you want to be able to start or restart the proxy or Hub independently of each other (not always necessary),
+you must set this environment variable before starting the server (for both the Hub and Proxy):
+
+
+```bash
+export CONFIGPROXY_AUTH_TOKEN=`openssl rand -hex 32`
+```
+
+This env variable needs to be visible to the Hub and Proxy.
+If you don't set this, the Hub will generate a random key itself,
+which means that any time you restart the Hub you **must also restart the Proxy**.
+If the proxy is a subprocess of the Hub,
+this should happen automatically (this is the default configuration).
+
+
+
+## Configuring Authentication
+
+The default Authenticator uses [PAM][] to authenticate system users with their username and password.
+The default behavior of this Authenticator is to allow any user with an account and password on the system to log in.
+You can restrict which users are allowed to log in with `Authenticator.whitelist`:
+
+
+```python
+c.Authenticator.whitelist = {'mal', 'zoe', 'inara', 'kaylee'}
+```
+
+Admin users of JupyterHub have the ability to take actions on users' behalf,
+such as stopping and restarting their servers,
+and adding users to and removing users from the whitelist.
+Any users in the admin list are automatically added to the whitelist,
+if they are not already present.
+The set of initial Admin users can be configured as follows:
+
+```python
+c.Authenticator.admin_users = {'mal', 'zoe'}
+```
+
+If `JupyterHub.admin_access` is True (not default),
+then admin users have permission to log in *as other users* on their respective machines, for debugging.
+**You should make sure your users know if admin_access is enabled.**
+
+### Adding and removing users
+
+Users can be added to and removed from the Hub via the admin panel or REST API.
+These users will be added to the whitelist and database.
+Restarting the Hub will not require manually updating the whitelist in your config file,
+as the users will be loaded from the database.
+This means that after starting the Hub once,
+it is not sufficient to remove users from the whitelist in your config file.
+You must also remove them from the database, either by discarding the database file,
+or via the admin UI.
+
+The default PAMAuthenticator is one case of a special kind of authenticator,
+called a LocalAuthenticator,
+indicating that it manages users on the local system.
+When you add a user to the Hub, a LocalAuthenticator checks if that user already exists.
+Normally, if the user does not exist, there will be an error telling you so.
+If you set the configuration value
+
+
+```python
+c.LocalAuthenticator.create_system_users = True
+```
+
+however, adding a user to the Hub that doesn't already exist on the system
+will result in the Hub creating that user via the system `useradd` mechanism.
+This option is typically used on hosted deployments of JupyterHub,
+to avoid the need to manually create all your users before launching the service.
+It is not recommended when running JupyterHub in situations where JupyterHub users map directly onto UNIX users.
+
+
+## Configuring single-user servers
+
+Since the single-user server is an instance of `ipython notebook`,
+an entire separate multi-process application,
+there are many aspects of that server you can configure,
+and a lot of ways to express that configuration.
+
+At the JupyterHub level, you can set some values on the Spawner.
+The simplest of these is `Spawner.notebook_dir`,
+which lets you set the root directory for a user's server.
+This root notebook directory is the highest level directory users will be able to access in the notebook dashboard.
+In this example, the root notebook directory is set to `~/notebooks`,
+where `~` is expanded to the user's home directory.
+
+```python
+c.Spawner.notebook_dir = '~/notebooks'
+```
+
+You can also specify extra command-line arguments to the notebook server with:
+
+```python
+c.Spawner.args = ['--debug', '--profile=PHYS131']
+```
+
+This could be used to set the user's default page for the single user server:
+
+```python
+c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb']
+```
+
+Since the single-user server extends the notebook server application,
+it still loads configuration from the `ipython_notebook_config.py` config file.
+Each user may have one of these files in `$HOME/.ipython/profile_default/`.
+IPython also supports loading system-wide config files from `/etc/ipython/`,
+which is the place to put configuration that you want to affect all of your users.
+
+## External services
+
+JupyterHub has a REST API that can be used to run external services.
+More detail on this API will be added in the future.
+
+## File locations
+
+It is recommended to put all of the files used by JupyterHub into standard UNIX filesystem locations.
+
+* `/srv/jupyterhub` for all security and runtime files
+* `/etc/jupyterhub` for all configuration files
+* `/var/log` for log files
+
+## Example
+
+In the following example, we show the configuration file for a fairly standard JupyterHub deployment with the following assumptions:
+
+* JupyterHub is running on a single cloud server
+* Using SSL on the standard HTTPS port 443
+* You want to use [GitHub OAuth][oauthenticator] for login
+* You need the users to exist locally on the server
+* You want users' notebooks to be served from `~/assignments` to allow users to browse for notebooks within
+  other users' home directories
+* You want the landing page for each user to be a Welcome.ipynb notebook in their assignments directory.
+* All runtime files are put into `/srv/jupyterhub` and log files in `/var/log`.
+
+Let's start out with `jupyterhub_config.py`:
+
+```python
+# jupyterhub_config.py
+c = get_config()
+
+import os
+pjoin = os.path.join
+
+runtime_dir = os.path.join('/srv/jupyterhub')
+ssl_dir = pjoin(runtime_dir, 'ssl')
+if not os.path.exists(ssl_dir):
+    os.makedirs(ssl_dir)
+
+
+# https on :443
+c.JupyterHub.port = 443
+c.JupyterHub.ssl_key = pjoin(ssl_dir, 'ssl.key')
+c.JupyterHub.ssl_cert = pjoin(ssl_dir, 'ssl.cert')
+
+# put the JupyterHub cookie secret and state db
+# in /srv/jupyterhub
+c.JupyterHub.cookie_secret_file = pjoin(runtime_dir, 'cookie_secret')
+c.JupyterHub.db_url = pjoin(runtime_dir, 'jupyterhub.sqlite')
+# or `--db=/path/to/jupyterhub.sqlite` on the command-line
+
+# put the log file in /var/log
+c.JupyterHub.log_file = '/var/log/jupyterhub.log'
+
+# use GitHub OAuthenticator for local users
+
+c.JupyterHub.authenticator_class = 'oauthenticator.LocalGitHubOAuthenticator'
+c.GitHubOAuthenticator.oauth_callback_url = os.environ['OAUTH_CALLBACK_URL']
+# create system users that don't exist yet
+c.LocalAuthenticator.create_system_users = True
+
+# specify users and admin
+c.Authenticator.whitelist = {'rgbkrk', 'minrk', 'jhamrick'}
+c.Authenticator.admin_users = {'jhamrick', 'rgbkrk'}
+
+# start single-user notebook servers in ~/assignments,
+# with ~/assignments/Welcome.ipynb as the default landing page
+# this config could also be put in
+# /etc/ipython/ipython_notebook_config.py
+c.Spawner.notebook_dir = '~/assignments'
+c.Spawner.args = ['--NotebookApp.default_url=/notebooks/Welcome.ipynb']
+```
+
+Using the GitHub Authenticator [requires a few additional env variables][oauth-setup],
+which we will need to set when we launch the server:
+
+```bash
+export GITHUB_CLIENT_ID=github_id
+export GITHUB_CLIENT_SECRET=github_secret
+export OAUTH_CALLBACK_URL=https://example.com/hub/oauth_callback
+export CONFIGPROXY_AUTH_TOKEN=super-secret
+jupyterhub -f /path/to/aboveconfig.py
+```
+
+
+# Further reading
+
+- TODO: troubleshooting
+- [Custom Authenticators](authenticators.md)
+- [Custom Spawners](spawners.md)
+
+
+[oauth-setup]: https://github.com/jupyter/oauthenticator#setup
+[oauthenticator]: https://github.com/jupyter/oauthenticator
+[PAM]: http://en.wikipedia.org/wiki/Pluggable_authentication_module
diff --git a/jupyterhub/app.py b/jupyterhub/app.py
index 48f2ff375a..8c9d353741 100644
--- a/jupyterhub/app.py
+++ b/jupyterhub/app.py
@@ -351,11 +351,9 @@ def _db_url_changed(self, name, old, new):
         """
     )
     admin_users = Set(config=True,
-        help="""set of usernames of admin users
-
-        If unspecified, only the user that launches the server will be admin.
-        """
+        help="""DEPRECATED, use Authenticator.admin_users instead."""
     )
+
     tornado_settings = Dict(config=True)
 
     cleanup_servers = Bool(True, config=True,
@@ -580,16 +578,24 @@ def init_hub(self):
     def init_users(self):
         """Load users into and from the database"""
         db = self.db
-
-        if not self.admin_users:
+
+        if self.admin_users and not self.authenticator.admin_users:
+            self.log.warn(
+                "\nJupyterHub.admin_users is deprecated."
+                "\nUse Authenticator.admin_users instead."
+            )
+            self.authenticator.admin_users = self.admin_users
+        admin_users = self.authenticator.admin_users
+
+        if not admin_users:
             # add current user as admin if there aren't any others
             admins = db.query(orm.User).filter(orm.User.admin==True)
             if admins.first() is None:
-                self.admin_users.add(getuser())
+                admin_users.add(getuser())
 
         new_users = []
 
-        for name in self.admin_users:
+        for name in admin_users:
             # ensure anyone specified as admin in config is admin in db
             user = orm.User.find(db, name)
             if user is None:
@@ -810,7 +816,7 @@ def init_tornado_settings(self):
             db=self.db,
             proxy=self.proxy,
             hub=self.hub,
-            admin_users=self.admin_users,
+            admin_users=self.authenticator.admin_users,
             admin_access=self.admin_access,
             authenticator=self.authenticator,
             spawner_class=self.spawner_class,
diff --git a/jupyterhub/auth.py b/jupyterhub/auth.py
index 9c235e9b40..74427d1ed8 100644
--- a/jupyterhub/auth.py
+++ b/jupyterhub/auth.py
@@ -23,6 +23,12 @@ class Authenticator(LoggingConfigurable):
     """
 
     db = Any()
+    admin_users = Set(config=True,
+        help="""set of usernames of admin users
+
+        If unspecified, only the user that launches the server will be admin.
+        """
+    )
     whitelist = Set(config=True,
         help="""Username whitelist.
 
diff --git a/jupyterhub/tests/mocking.py b/jupyterhub/tests/mocking.py
index 02a38c1bda..fe8842e5a8 100644
--- a/jupyterhub/tests/mocking.py
+++ b/jupyterhub/tests/mocking.py
@@ -72,6 +72,9 @@ def start(self):
 
 
 class MockPAMAuthenticator(PAMAuthenticator):
+    def _admin_users_default(self):
+        return {'admin'}
+
     def system_user_exists(self, user):
         # skip the add-system-user bit
         return not user.name.startswith('dne')
@@ -94,9 +97,6 @@ def _authenticator_class_default(self):
     def _spawner_class_default(self):
         return MockSpawner
 
-    def _admin_users_default(self):
-        return {'admin'}
-
     def init_signal(self):
         pass
 