<center>
<h1>Adding interactivity by extending Jupyter Notebook</h1>
<h3>https://github.com/cben, cben@redhat.com</h3>
<h2>https://github.com/cben/ansible_jupyter_kernel</h2>
</center>

1. Extending/improving interactive tools is easy!
2. Notebooks
3. Jupyter, writing Jupyter Kernels
4. Ansible, from the inside
5. https://github.com/cben/ansible_jupyter_kernel

# Notebook Interface: A better kind of REPL

> Shout-out: AFAIK first introduced by Theodore Gray in Wolfram's Mathematica in 1988.

An *executable document*. Somewhat like `doctest`.

In [9]:
fingers = 2 + 2

In [10]:
print('I see', fingers, 'fingers')

I see 4 fingers


Consists of cells, which you may go back to, revise, and re-execute.
This lets you polish things up, smoothly spanning the range from throwaway code to something to come back to, iterate on, and share.

- Can execute *out of order* (note `In[n]` numbers on the left).  This risks inconsistent, irreproducible results, but has pragmatic benefits (think long computations)...
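
To make the out-of-order risk concrete, here is a small sketch that fakes a notebook with `exec` and a shared namespace. The cell contents and the `run_cells` helper are made up for illustration; a real kernel does much more than this.

```python
# Three pretend "cells" sharing one namespace, like a notebook session.
cells = {
    1: "fingers = 2 + 2",
    2: "fingers = fingers + 1",
    3: "report = 'I see {} fingers'.format(fingers)",
}

def run_cells(order):
    """Execute the cells in the given order and return the final report."""
    namespace = {}
    for n in order:
        exec(cells[n], namespace)
    return namespace["report"]

top_to_bottom = run_cells([1, 2, 3])     # 'I see 5 fingers'
out_of_order = run_cells([1, 3, 2])      # 'I see 4 fingers'
cell_run_twice = run_cells([1, 2, 2, 3]) # 'I see 6 fingers'
```

Same cells, three different answers: the saved document alone does not tell you which one the author saw.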

# Jupyter Notebook

Formerly IPython Notebook, rebranded to emphasize the multi-language ecosystem.

```
browser <-> jupyter-notebook server <-> language "kernel"  
 ^                                        ^
nbextensions                            IPython kernel/Metakernel extensions/magics
```

There are [multiple extension points](http://mindtrove.info/4-ways-to-extend-jupyter-notebook/).

This talk focuses on writing a "kernel", Jupyter's term for a backend executing a new language — or whatever *you decide* should be treated like a language!    
List of existing kernels (>80 as of 2017): https://github.com/jupyter/jupyter/wiki/Jupyter-kernels

## Roads to implement a Jupyter kernel

1. In any language, implement [connection](https://jupyter-client.readthedocs.io/en/latest/kernels.html) and the [wire protocol](https://jupyter-client.readthedocs.io/en/latest/messaging.html): JSON over ZMQ.  Example: https://github.com/dsblank/simple_kernel

2. **In Python, [reusing existing machinery](https://jupyter-client.readthedocs.io/en/latest/wrapperkernels.html)** — subclass `ipykernel.kernelbase.Kernel`, define a few methods `do_execute`, `do_complete`, `do_inspect`...

3. In Python, reusing even more ["Metakernel"](https://github.com/Calysto/metakernel) machinery.  Gives rich shared functionality, most notably [various `%magic` syntaxes](https://github.com/Calysto/metakernel/blob/master/metakernel/magics/README.md).

  - Lecture by Nicolas Kruchten on "magics", including implementing new ones: https://2015.pycon.ca/en/schedule/33/ 
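
To give a feel for what road 1 involves, here is a rough sketch of how one message is laid out and signed, using only the standard library. It skips the ZMQ sockets entirely; the key, session name, and abbreviated `content` are placeholders I made up, and the messaging docs linked above remain the authority on the exact fields.

```python
import hashlib
import hmac
import json
import uuid
from datetime import datetime, timezone

DELIMITER = b"<IDS|MSG>"              # separates routing identities from the message
KEY = b"secret-from-connection-file"  # placeholder; the real key comes from the connection file

def make_frames(msg_type, content, key, session="demo-session"):
    """Build the frames of one message: delimiter, signature, then four JSON dicts."""
    header = {
        "msg_id": uuid.uuid4().hex,
        "session": session,
        "username": "demo",
        "date": datetime.now(timezone.utc).isoformat(),
        "msg_type": msg_type,
        "version": "5.0",
    }
    # Order matters: header, parent_header, metadata, content.
    parts = [json.dumps(d).encode("utf-8") for d in (header, {}, {}, content)]
    # The signature is an HMAC-SHA256 hex digest over the four serialized dicts.
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for part in parts:
        mac.update(part)
    return [DELIMITER, mac.hexdigest().encode("ascii")] + parts

# Abbreviated content; a real execute_request carries a few more fields.
frames = make_frames("execute_request",
                     {"code": "GET http://localhost:631", "silent": False},
                     KEY)
```

Roads 2 and 3 exist precisely so that you never have to write this part yourself.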

# Simple example: HTTP client kernel

HTTP is not exactly a programming language, but have you ever done a long series of `curl` commands to explore some web API?

Let's see what it takes to build a very simple Jupyter kernel doing HTTP, and whether it results in something that's nice to use.

It's convenient to flesh out the behavior inside this notebook before actually running as a kernel.

In [50]:
import requests
import pprint

def do_execute(code):
    """
    GET http://google.com
    """
    http_method, url = code.split(None, 1)
    assert http_method == 'GET'
    response = requests.get(url)
    print(response.status_code)
    for h, v in sorted(response.headers.items()):
        print("{h}: {v}".format(h=h, v=v))
    print()
    print(response.text)

In [47]:
do_execute('GET http://localhost:631')

200
Accept-Encoding: gzip, deflate, identity
Connection: Keep-Alive
Content-Language: en_US
Content-Length: 2361
Content-Security-Policy: frame-ancestors 'none'
Content-Type: text/html; charset=utf-8
Date: Tue, 13 Jun 2017 09:49:42 GMT
Keep-Alive: timeout=10
Last-Modified: Thu, 27 Apr 2017 08:49:28 GMT
Server: CUPS/2.1 IPP/2.1
X-Frame-Options: DENY

<!DOCTYPE HTML>
<html>
  <head>
    <link rel="stylesheet" href="/cups.css" type="text/css">
    <link rel="shortcut icon" href="/apple-touch-icon.png" type="image/png">
    <meta charset="utf-8">
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=9">
    <meta name="viewport" content="width=device-width">
    <title>Home - CUPS 2.1.4</title>
  </head>
  <body>
    <div class="header">
      <ul>
	<li><a href="http://www.cups.org/" target="_blank">CUPS.org</a></li>
	<li><a class="active" href="/">Home</a></li>
	<li><a href="/admin">Administration</a></li>
	<li><a href="

In [49]:
do_execute('GET http://localhost:631/NO_SUCH_PAGE')

404
Accept-Encoding: gzip, deflate, identity
Connection: close
Content-Language: en_US
Content-Length: 342
Content-Security-Policy: frame-ancestors 'none'
Content-Type: text/html; charset=utf-8
Date: Tue, 13 Jun 2017 09:50:49 GMT
Server: CUPS/2.1 IPP/2.1
X-Frame-Options: DENY

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
	<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
	<TITLE>Not Found - CUPS v2.1.4</TITLE>
	<LINK REL="STYLESHEET" TYPE="text/css" HREF="/cups.css">
</HEAD>
<BODY>
<H1>Not Found</H1>
<P></P>
</BODY>
</HTML>



### Good, let's try it as a kernel

The quick and very dirty way (we'll do it better later)

In [51]:
!mkdir -p ~/.local/share/jupyter/kernels/http

In [52]:
%%writefile ~/.local/share/jupyter/kernels/http/kernel.json
{
    "argv": ["python", "/tmp/kernel1.py", "-f", "{connection_file}"],
    "display_name": "HTTP kernel1",
    "language": "http"
}

Overwriting /home/bpaskinc/.local/share/jupyter/kernels/http/kernel.json
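
A slightly less dirty variant, as a sketch: generate `kernel.json` from Python so that `argv` points at the interpreter that is actually running (`sys.executable`) rather than whatever `python` happens to be on the PATH. The `write_kernel_spec` helper and the temporary directory are my own illustration, not part of Jupyter.

```python
import json
import sys
import tempfile
from pathlib import Path

def write_kernel_spec(spec_dir, script_path, display_name="HTTP kernel1"):
    """Write a kernel.json into spec_dir and return its path."""
    spec = {
        # Jupyter substitutes {connection_file} when it launches the kernel.
        "argv": [sys.executable, str(script_path), "-f", "{connection_file}"],
        "display_name": display_name,
        "language": "http",
    }
    spec_dir = Path(spec_dir)
    spec_dir.mkdir(parents=True, exist_ok=True)
    path = spec_dir / "kernel.json"
    path.write_text(json.dumps(spec, indent=4))
    return path

# Written to a scratch directory here, to avoid touching the real kernels dir.
spec_path = write_kernel_spec(Path(tempfile.mkdtemp()) / "http", "/tmp/kernel1.py")
```

A directory like this can then be registered with `jupyter kernelspec install --user <dir>` instead of writing into `~/.local/share/jupyter/kernels` by hand.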


In [53]:
%%writefile /tmp/kernel1.py
import requests

from ipykernel.kernelbase import Kernel

class HTTPKernel(Kernel):
    implementation = 'http_kernel'
    implementation_version = '0.1'
    language = 'HTTP'
    language_version = '1.1'  # TODO: plug in Hyper to support HTTP/2.0
    language_info = dict(
        name = 'http',
        mimetype = 'text/plain',
        file_extension = '.url',
    )
    banner = "HTTP kernel - WIP"

    def stream(self, text, name='stdout'):
        # Pass the requested stream name through (it was hard-coded to 'stdout').
        self.send_response(self.iopub_socket, 'stream',
                           dict(name=name, text=text))
    
    def do_execute(self, code, silent, store_history=True, user_expressions=None, allow_stdin=False):
        """
        GET http://google.com
        """
        http_method, url = code.split(None, 1)
        assert http_method == 'GET'
        response = requests.get(url)

        if not silent:
            self.stream('{}\n'.format(response.status_code))
            headers = "".join("{h}: {v}\n".format(h=h, v=v) 
                              for h, v in sorted(response.headers.items()))
            self.stream(headers + '\n\n')
            self.stream(response.text)

        return dict(
            status='ok',
            # The base class increments the execution count
            execution_count=self.execution_count,
            payload=[],
            user_expressions={},
        )

if __name__ == '__main__':
    from ipykernel.kernelapp import IPKernelApp
    IPKernelApp.launch_instance(kernel_class=HTTPKernel)

Overwriting /tmp/kernel1.py
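
The kernel above only defines `do_execute`. As a sketch of what `do_complete(self, code, cursor_pos)` could return, here is the reply built by a plain function so it can be tried without a running kernel. The list of methods is for illustration only (the kernel above still handles just `GET`), and `complete_reply` is a hypothetical helper name.

```python
HTTP_METHODS = ["DELETE", "GET", "HEAD", "OPTIONS", "PATCH", "POST", "PUT"]

def complete_reply(code, cursor_pos):
    """Build a completion reply for the text before the cursor."""
    prefix = code[:cursor_pos]
    if any(ch.isspace() for ch in prefix):
        # Past the first word we are inside the URL; nothing to suggest.
        matches = []
    else:
        matches = [m for m in HTTP_METHODS if m.startswith(prefix.upper())]
    return dict(
        status='ok',
        matches=matches,
        # The span of text that a chosen match would replace.
        cursor_start=0,
        cursor_end=cursor_pos,
        metadata={},
    )

reply = complete_reply("p", 1)
```

Inside `HTTPKernel`, `do_complete` could simply return `complete_reply(code, cursor_pos)`.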


In [8]:
import re, requests
text = requests.get('http://localhost:631').text
url_re = r'https?://[^"\'<>]*'
re.findall(url_re, text)

['http://www.cups.org/',
 'http://www.apple.com/',
 'http://www.cups.org/lists.php?LIST=cups',
 'http://www.cups.org/lists.php?LIST=cups-devel',
 'http://www.apple.com']

----

# Playing with ansible API

- https://docs.ansible.com/ansible/dev_guide/developing_api.html

- https://www.ansible.com/blog/how-to-extend-ansible-through-plugins — excellent overview of ansible extension points

- <img alt="Extending Ansible cover" src="Extending_Ansible_cover.jpg" style="width: 20%; float: right">
  [*Extending Ansible* book][3] by Rishabh Das (2016; I think it describes pre-2.0?)  
  Free sample including API chapter at https://www.ansible.com/extending-ansible

- [`lib/ansible/cli/adhoc.py`][1] and [`lib/ansible/cli/playbook.py`][2] are simple usage examples.

[1]: https://github.com/ansible/ansible/blob/devel/lib/ansible/cli/adhoc.py
[2]: https://github.com/ansible/ansible/blob/devel/lib/ansible/cli/playbook.py
[3]: https://www.packtpub.com/networking-and-servers/extending-ansible

In [3]:
from ansible import constants as C
from ansible.cli import CLI
from ansible.errors import AnsibleError, AnsibleOptionsError, AnsibleParserError
from ansible.executor.task_queue_manager import TaskQueueManager
from ansible.inventory import Inventory
#from ansible.module_utils._text import to_text
from ansible.parsing.dataloader import DataLoader
#from ansible.parsing.splitter import parse_kv
from ansible.playbook.play import Play
#from ansible.plugins import get_all_plugin_loaders
#from ansible.utils.vars import load_extra_vars
#from ansible.utils.vars import load_options_vars
from ansible.vars import VariableManager


In [4]:
variable_manager = VariableManager()

In [5]:
loader = DataLoader()

In [6]:
inventory = Inventory(loader=loader, variable_manager=variable_manager)

In [7]:
passwords = {}

An `options` object is needed: many places in the code require specific attributes to exist.
We could build one by hand, but it is easier to let the CLI argument parser provide them.

In [8]:
#import argparse
#options = argparse.Namespace(module_path=None, forks=C.DEFAULT_FORKS, become=C.DEFAULT_BECOME)
parser = CLI.base_parser(module_opts=True, fork_opts=True, runas_opts=True, check_opts=True)
options, extra_args = parser.parse_args([])
options

<Values at 0x7fcd3c0bd400: {'forks': 5, 'syntax': None, 'check': False, 'ask_sudo_pass': False, 'ask_su_pass': False, 'module_path': None, 'sudo_user': None, 'su_user': None, 'become_user': None, 'verbosity': 0, 'su': False, 'become_method': 'sudo', 'become': False, 'become_ask_pass': False, 'diff': False, 'sudo': False}>

In [9]:
def task_queue_manager():
    return TaskQueueManager(
        inventory=inventory,
        variable_manager=variable_manager,
        loader=loader,
        options=options,
        passwords=passwords,
        #stdout_callback=cb,
        #run_additional_callbacks=C.DEFAULT_LOAD_CALLBACK_PLUGINS,
        #run_tree=run_tree,
    )
tqm = task_queue_manager()

## Getting a Play data structure

In [10]:
import yaml

In [11]:
play1 = Play.load(yaml.safe_load('''
hosts: localhost
tasks:
  - command: zenity --question --text="WORKS! Proceed?"
'''))
play1.tasks

[BLOCK(uuid=5ce0c5be-8647-ed5f-bbc2-000000000002)(id=140519452022992)(parent=None)]

In [12]:
tqm.run(play1)


PLAY [localhost] ***************************************************************

TASK [Gathering Facts] *********************************************************
ok: [localhost]

TASK [command] *****************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["zenity", "--question", "--text=WORKS! Proceed?"], "delta": "0:18:59.850413", "end": "2017-06-13 00:16:46.971777", "failed": true, "rc": 1, "start": "2017-06-12 23:57:47.121364", "stderr": "Gtk-Message: GtkDialog mapped without a transient parent. This is discouraged.", "stderr_lines": ["Gtk-Message: GtkDialog mapped without a transient parent. This is discouraged."], "stdout": "", "stdout_lines": []}


2

## TaskQueueManager is stateful
Look what happens after a play fails:

In [13]:
failing_play = Play.load(yaml.safe_load('''
hosts: localhost
tasks:
  - command: 'false'
'''))
tqm.run(failing_play)


PLAY [localhost] ***************************************************************


2

In [14]:
tqm.run(play1)


PLAY [localhost] ***************************************************************


2

 => The tqm will not run anything more; it just returns an error exit code :-(

In [15]:
TaskQueueManager.RUN_FAILED_HOSTS

2

### Two solutions
1. Create new TaskQueueManager every time.
2. `tqm.clear_failed_hosts()`.

### Make it easier to run

In [16]:
def run(code):
    play = Play.load(yaml.safe_load(code))
    task_queue_manager().run(play)

In [17]:
run("""
tasks:
  - command: echo foo
""")

AnsibleParserError: the field 'hosts' is required but was not set

Oops.  That was not convenient enough.  Also, what a huge stacktrace :-(

In [None]:
import sys
import traceback
def run(code):
    try:
        play = Play.load(yaml.safe_load(code))
        task_queue_manager().run(play)
    except (yaml.YAMLError, AnsibleParserError) as e:
        # Printing errors will look different in a Jupyter kernel, but for now stderr is fine.
        print(''.join(traceback.format_exception_only(type(e), e)), file=sys.stderr)

In [None]:
run("""
tasks
- syntax error: missing colon above after `tasks`
""")

In [None]:
run("""
tasks:
- command: echo foo
""")

Okay, much better errors!  Back to making it easy to write simple plays:

In [None]:
def play_from_code(code):
    """Support one task, list of tasks, or whole play without hosts."""
    data = orig_data = yaml.safe_load(code)
    if isinstance(data, dict) and 'tasks' not in data:
        data = [data]
    if isinstance(data, list):
        data = dict(tasks=data)
    if not isinstance(data, dict):
        raise AnsibleParserError("Expected task, list of tasks, or play, got {}".format(type(orig_data)))
    if 'hosts' not in data:
        data['hosts'] = 'localhost'
    return Play.load(data)

def run(code):
    try:
        task_queue_manager().run(play_from_code(code))
    except (yaml.YAMLError, AnsibleParserError) as e:
        # Printing errors will look different in a Jupyter kernel, but for now stderr is fine.
        print(''.join(traceback.format_exception_only(type(e), e)), file=sys.stderr)

In [None]:
run("""command: echo foo""")
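
To see the three accepted shapes without needing Ansible at all, here is the same normalization applied to already-parsed data. This is a sketch: `normalize_play_data` is a hypothetical stand-in that stops short of `Play.load`, raises a plain `TypeError`, and copies the dict instead of modifying it in place.

```python
def normalize_play_data(data):
    """Turn one task, a list of tasks, or a play without hosts into play data."""
    original = data
    if isinstance(data, dict) and 'tasks' not in data:
        data = [data]               # a single task becomes a one-item task list
    if isinstance(data, list):
        data = dict(tasks=data)     # a task list becomes a play
    if not isinstance(data, dict):
        raise TypeError("Expected task, list of tasks, or play, got {}".format(type(original)))
    data = dict(data)               # copy, so the caller's dict is left alone
    data.setdefault('hosts', 'localhost')
    return data

one_task = normalize_play_data({'command': 'echo foo'})
task_list = normalize_play_data([{'command': 'echo foo'}, {'command': 'pwd'}])
whole_play = normalize_play_data({'hosts': 'web', 'tasks': []})
```

All three come out as a dict with both `hosts` and `tasks`, and an explicit `hosts` is never overridden.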

### Have we solved stateful TQM the right way?
We're using a fresh `TaskQueueManager` every time, so do we still carry *any* state from cell to cell?
Can we set a variable and later use it?

In [None]:
run("""set_fact: var=1""")

In [None]:
run("""debug: msg={{var}}""")

# TODO UNSOLVED
Wait, what's that "Gathering Facts" from localhost every time?

In [None]:
def run(code):
    try:
        tqm.clear_failed_hosts()
        tqm.run(play_from_code(code))
    except (yaml.YAMLError, AnsibleParserError) as e:
        # Printing errors will look different in a Jupyter kernel, but for now stderr is fine.
        print(e, file=sys.stderr)


In [None]:
run("""debug: msg={{var}}""")

## How is a play processed before execution?
### Let's consider a more complex play.

In [None]:
play = play_from_code('''
vars:
  play_local_variable: 'abc'
  
tasks:

- debug: 'msg=The var equals {{play_local_variable}}'

- command: pwd
  register: pwd_out  # Sets a variable

- set_fact: global1=value1 global2=value2  # Some more vars

- with_items: [play_local_variable, pwd_out, global1, global2]  # A loop!
  debug: var={{item}}
''')
task_queue_manager().run(play)

*__Tip__: Ansible objects have `.serialize()`, handy for exploring.*

In [None]:
play.serialize()

In [None]:
play.get_vars()

In [None]:
variable_manager.get_vars(loader=loader, play=play)

In [None]:
play.compile()

^^ These are the pre_tasks, roles, tasks, and post_tasks

In [None]:
[list(block.block) for block in play.compile()]


In [None]:
play.compile()[1].serialize()

## Variables

In [None]:
from ansible.vars import preprocess_vars
preprocess_vars(yaml.safe_load('''
hosts: localhost
tasks:
  - command: 'false'
'''))

# Peek under the hood of IPyKernel running *this* notebook?

In [None]:
import sys
sorted(sys.modules)

In [None]:
import ipykernel.ipkernel

In [None]:
import gc
[obj for obj in gc.get_referrers(ipykernel.ipkernel.IPythonKernel) 
 if isinstance(obj, ipykernel.ipkernel.IPythonKernel)]

https://github.com/Calysto/metakernel/blob/master/metakernel/magics/README.md

In [None]:
%connect_info