Cleanup and document recipes (#50)
- Cleanup ipython-kernel and jupyter-notebook recipes
- Add recipes documentation section
- Document ipython-kernel recipe

[skip ci]
jcrist committed Aug 5, 2018
1 parent 719773d commit 610aeb6
Showing 6 changed files with 242 additions and 15 deletions.
1 change: 1 addition & 0 deletions docs/source/index.rst
@@ -59,5 +59,6 @@ User Documentation
quickstart.rst
specification.rst
key-value-store.rst
recipes.rst
api.rst
cli.rst
5 changes: 5 additions & 0 deletions docs/source/quickstart.rst
@@ -9,6 +9,8 @@ using the commandline -- for more information please see the `API <api.html>`__
or `CLI <cli.html>`__ documentation.


.. _quickstart-kinit:

Kinit (optional)
----------------

@@ -20,6 +22,9 @@ make sure you have an active ticket-granting-ticket before continuing:
$ kinit
.. _quickstart-skein-daemon:


Start the Skein Daemon (optional)
---------------------------------

178 changes: 178 additions & 0 deletions docs/source/recipes-ipython-kernel.rst
@@ -0,0 +1,178 @@
Remote IPython Kernel
=====================

The :mod:`skein.recipes.ipython_kernel` module provides a command-line recipe
for starting a remote IPython kernel on a YARN container. The intended use is
to execute the module in a service, using the command:

.. code-block:: console

    $ python -m skein.recipes.ipython_kernel

The executing Python environment must contain the following dependencies to
work properly:

- ``skein``
- ``ipykernel``

After launching the service, the kernel connection information will be stored
in the `key-value store <key-value-store.html>`__ under the key
``'ipython.kernel.info'``. This key name is configurable with the command-line
flag ``--kernel-info-key``.
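
The round trip through the key-value store can be sketched in plain Python.
This is a minimal illustration rather than the recipe's implementation: a plain
``dict`` stands in for the application's key-value store, the helper names
``publish_kernel_info`` and ``fetch_kernel_info`` are hypothetical, and the
connection values are made up. It relies only on the fact that the value is
stored as JSON-encoded bytes.

```python
import json

DEFAULT_KEY = "ipython.kernel.info"  # default key used by the recipe


def publish_kernel_info(store, info, key=DEFAULT_KEY):
    """Store the connection info as JSON-encoded bytes under ``key``."""
    store[key] = json.dumps(info).encode()


def fetch_kernel_info(store, key=DEFAULT_KEY):
    """Decode the connection info stored under ``key``."""
    return json.loads(store[key].decode())


# A plain dict stands in for the key-value store (assumption for illustration).
store = {}

# Made-up connection values, shaped like part of a Jupyter connection file.
sample = {"ip": "10.0.0.5", "shell_port": 50001, "transport": "tcp"}

# Publish under the default key, and under a custom key as
# ``--kernel-info-key my.custom.key`` would select.
publish_kernel_info(store, sample)
publish_kernel_info(store, sample, key="my.custom.key")

roundtrip = fetch_kernel_info(store)
custom = fetch_kernel_info(store, key="my.custom.key")
```

Changing the key only changes where the information is stored; the payload
itself is the same.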


Example
-------

Here we provide a complete walkthrough of launching and connecting to a remote
IPython kernel. This example assumes you're logged into and running on an edge
node.


Kinit (optional)
^^^^^^^^^^^^^^^^

See :ref:`quickstart-kinit`.


Start the Skein Daemon (optional)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

See :ref:`quickstart-skein-daemon`.


Package the Python Environment
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To distribute Python environments we'll make use of `conda-pack
<https://conda.github.io/conda-pack/>`_, a tool for packaging and distributing
conda environments. As mentioned above, we need to make sure we have the
following packages installed in the remote environment:

- ``skein``
- ``ipykernel``

We'll also install ``numpy`` to have an example library for doing some
computation, and ``jupyter_console`` to have a way to connect to the remote
kernel (note that this is only needed on the client machine, but we'll install
it on both for simplicity).

.. code-block:: console

    # Create a new demo environment (output trimmed for brevity)
    $ conda create -n ipython-demo
    ...

    # Activate the environment
    $ conda activate ipython-demo

    # Install the needed packages (output trimmed for brevity)
    $ conda install conda-pack skein ipykernel numpy jupyter_console -c conda-forge
    ...

    # Package the environment into environment.tar.gz
    $ conda pack -o environment.tar.gz
    Collecting packages...
    Packing environment at '/home/jcrist/miniconda/envs/ipython-demo' to 'environment.tar.gz'
    [########################################] | 100% Completed | 35.3s

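
Before submitting, it can be worth checking that the archive contains the
``bin/activate`` script that the service later sources. The following is a
minimal sketch using only the standard library; the helper name
``archive_has_activate`` is hypothetical, and it assumes the script sits at
``bin/activate`` relative to the archive root.

```python
import tarfile


def archive_has_activate(path):
    """Return True if the archive at ``path`` contains ``bin/activate``."""
    with tarfile.open(path, "r:*") as archive:
        # Normalize names such as "./bin/activate" to "bin/activate".
        names = {
            name[2:] if name.startswith("./") else name
            for name in archive.getnames()
        }
    return "bin/activate" in names
```

Calling ``archive_has_activate("environment.tar.gz")`` should return ``True``
for an archive packaged as above.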
Write the Application Specification
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Next we need to write the application specification. For more information see
the `specification docs <specification.html>`__.

.. code-block:: yaml

    # stored in ipython-demo.yaml
    name: ipython-demo

    services:
      ipython:
        resources:
          memory: 1024
          vcores: 1
        files:
          # Distribute the bundled environment as part of the application.
          # This will be automatically extracted by YARN to the directory
          # ``environment`` during resource localization.
          environment: environment.tar.gz
        commands:
          # Activate our environment
          - source environment/bin/activate
          # Start the remote ipython kernel
          - python -m skein.recipes.ipython_kernel

Start the Remote IPython Kernel
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Now we have everything needed to start the remote IPython kernel. The following
bash command starts the application and stores the application id in the
environment variable ``APPID``.

.. code-block:: console

    $ APPID=`skein application submit ipython-demo.yaml`

Retrieve the Kernel Information
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To connect to a remote kernel, Jupyter requires information usually stored in a
``kernel.json`` file. As mentioned above, the recipe provided in
:mod:`skein.recipes.ipython_kernel` stores this information in the key
``'ipython.kernel.info'``. We can retrieve this information and store it in a
file using the following bash command:

.. code-block:: console

    $ skein kv get $APPID --key ipython.kernel.info --wait > kernel.json

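
If the connection later fails, one quick check is whether the retrieved file
has the fields Jupyter expects in a connection file. The sketch below is
illustrative only: the helper name ``missing_connection_fields`` is
hypothetical, the field list covers the standard connection-file entries, and
the sample values are made up.

```python
import json

# Entries found in a standard Jupyter kernel connection file.
REQUIRED_FIELDS = (
    "ip",
    "transport",
    "shell_port",
    "iopub_port",
    "stdin_port",
    "control_port",
    "hb_port",
    "key",
)


def missing_connection_fields(text):
    """Return the required fields absent from the JSON document ``text``."""
    info = json.loads(text)
    return [field for field in REQUIRED_FIELDS if field not in info]


# Made-up example content, for illustration only.
example = json.dumps({
    "ip": "10.0.0.5",
    "transport": "tcp",
    "shell_port": 50001,
    "iopub_port": 50002,
    "stdin_port": 50003,
    "control_port": 50004,
    "hb_port": 50005,
    "key": "not-a-real-key",
})

missing = missing_connection_fields(example)
```

To check the real file, pass its contents instead, for example
``missing_connection_fields(open("kernel.json").read())``.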
Connect to the Remote IPython Kernel
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Using ``jupyter console`` and the ``kernel.json`` file, we can connect to the
remote kernel.

.. code-block:: console

    $ jupyter console --existing kernel.json
    Jupyter console 5.2.0

    Python 3.6.6 | packaged by conda-forge | (default, Jul 26 2018, 09:53:17)
    Type 'copyright', 'credits' or 'license' for more information
    IPython 6.5.0 -- An enhanced Interactive Python. Type '?' for help.

    In [1]: import numpy as np  # can import distributed libraries

    In [2]: np.sum([1, 2, 3])
    Out[2]: 6

    In [3]: # ls shows the files on the remote container, not the local files

    In [4]: ls
    container_tokens                        environment@
    default_container_executor_session.sh*  launch_container.sh*
    default_container_executor.sh*          tmp/

    In [5]: # exit shuts down the application

    In [6]: exit
    Shutting down kernel

Confirm that the Application Completed
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

We can check that the application shut down properly using
``skein application status``:

.. code-block:: console

    $ skein application status $APPID
    APPLICATION_ID                    NAME            STATE       STATUS       CONTAINERS    VCORES    MEMORY    RUNTIME
    application_1533143063639_0017    ipython-demo    FINISHED    SUCCEEDED    0             0         0         2m

11 changes: 11 additions & 0 deletions docs/source/recipes.rst
@@ -0,0 +1,11 @@
Recipes
=======

The :mod:`skein.recipes` namespace includes recipes for deploying common tools
in the Python ecosystem. These also serve as examples of how to use Skein to
deploy real services.

.. toctree::
    :maxdepth: 1

    recipes-ipython-kernel.rst
29 changes: 22 additions & 7 deletions skein/recipes/ipython_kernel.py
@@ -3,12 +3,20 @@
 import os
 import json
 import socket
+from copy import copy

-from ipykernel.kernelapp import IPKernelApp
+from ipykernel.kernelapp import IPKernelApp, kernel_aliases
+from traitlets import Unicode, Dict

 from ..core import ApplicationClient


+skein_kernel_aliases = dict(kernel_aliases)
+skein_kernel_aliases.update({
+    'kernel-info-key': 'SkeinIPKernelApp.kernel_info_key'
+})
+
+
 class SkeinIPKernelApp(IPKernelApp):
     """A remote ipython kernel setup to run in a YARN container
     and publish its address to the skein application master.
@@ -33,17 +41,23 @@ class SkeinIPKernelApp(IPKernelApp):
     Get the connection information:
     >>> import json
-    >>> info = json.loads(app.kv.wait('ipykernel.info'))
+    >>> info = json.loads(app.kv.wait('ipython.kernel.info'))
     Use the connection info as you see fit for your application. When written
     to a file, this can be used with ``jupyter console --existing file.json``
     to connect to the remote kernel.
     """
-    ip = '0.0.0.0'
+    name = 'python -m skein.recipes.ipython_kernel'
+
+    aliases = Dict(skein_kernel_aliases)
+
+    ip = copy(IPKernelApp.ip)
+    ip.default_value = '0.0.0.0'

-    def initialize(self, argv=None):
-        self.skein_app_client = ApplicationClient.from_current()
-        super(SkeinIPKernelApp, self).initialize(argv=argv)
+    kernel_info_key = Unicode("ipython.kernel.info",
+                              config=True,
+                              help=("Skein key in which to store the "
+                                    "connection information"))

     def write_connection_file(self):
         super(SkeinIPKernelApp, self).write_connection_file()
@@ -54,7 +68,8 @@ def write_connection_file(self):
         if data['ip'] in ('', '0.0.0.0'):
             data['ip'] = socket.gethostbyname(socket.gethostname())

-        self.skein_app_client.kv['ipykernel.info'] = json.dumps(data).encode()
+        app = ApplicationClient.from_current()
+        app.kv[self.kernel_info_key] = json.dumps(data).encode()


 def start_ipython_kernel(argv=None):
33 changes: 25 additions & 8 deletions skein/recipes/jupyter_notebook.py
@@ -3,12 +3,20 @@
 import os
 import json
 import socket
+from copy import copy

-from notebook.notebookapp import NotebookApp
+from notebook.notebookapp import NotebookApp, aliases
+from traitlets import Unicode, Dict

 from ..core import ApplicationClient


+skein_notebook_aliases = dict(aliases)
+skein_notebook_aliases.update({
+    'notebook-info-key': 'SkeinNotebookApp.notebook_info_key'
+})
+
+
 class SkeinNotebookApp(NotebookApp):
     """A jupyter notebook server setup to run in a YARN container
     and publish its address to the skein application master.
@@ -33,7 +41,7 @@ class SkeinNotebookApp(NotebookApp):
     Get the connection information:
     >>> import json
-    >>> info = json.loads(app.kv.wait('notebook.info'))
+    >>> info = json.loads(app.kv.wait('jupyter.notebook.info'))
     Use the connection info as you see fit for your application. Information
     provided includes:
@@ -44,12 +52,20 @@ class SkeinNotebookApp(NotebookApp):
     - base_url
     - token
     """
-    ip = '0.0.0.0'
-    open_browser = False
+    name = 'python -m skein.recipes.jupyter_notebook'
+
+    aliases = Dict(skein_notebook_aliases)
+
+    ip = copy(NotebookApp.ip)
+    ip.default_value = '0.0.0.0'
+
+    open_browser = copy(NotebookApp.open_browser)
+    open_browser.default_value = False

-    def initialize(self, argv=None):
-        super(SkeinNotebookApp, self).initialize(argv=argv)
-        self.skein_app_client = ApplicationClient.from_current()
+    notebook_info_key = Unicode("jupyter.notebook.info",
+                                config=True,
+                                help=("Skein key in which to store the "
+                                      "server information"))

     def write_server_info_file(self):
         super(SkeinNotebookApp, self).write_server_info_file()
@@ -60,7 +76,8 @@ def write_server_info_file(self):
                 'base_url': self.base_url,
                 'token': self.token}

-        self.skein_app_client.kv['notebook.info'] = json.dumps(data).encode()
+        app = ApplicationClient.from_current()
+        app.kv[self.notebook_info_key] = json.dumps(data).encode()


 def start_notebook_application(argv=None):
