<img src="files/images/cppnow.png" alt="C++ Now Logo">
<h1>Interactive C++ in a Jupyter Notebook Using Modules for Incremental Compilation</h1>

## Once Upon a Time...
* A colleague and I wanted to teach C++
* We wanted to make it easier to get started, so we used
  - Docker (a container environment that leverages the kernel)
  - Cling (an interpreted version of Clang)
  - Jupyter/JupyterHub
* We created a tool called C++Explorer for teaching C++ (https://github.com/stevenrbrandt/CxxExplorer). But this talk isn't about that...

## But first, a prologue...

## Notebooks!
Have you met Jupyter Notebooks?
* Notebooks are great for experimenting/playing with code.
* Each cell is a distinct evaluation with distinct results that build on each other.
* They persist the output of each cell action
* Great teaching tool!
* Usually, they are based on Python. They don't have to be.
* Through "magics," i.e. cells beginning with %%, a cell may run something other than Python code.
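
To make the idea of a cell magic concrete, here is a minimal sketch of how a `%%name` cell could be routed to a handler. It is not Jupyter's implementation: `MAGICS`, `cell_magic`, `shout`, and `run_cell` are hypothetical names used only for illustration. What it does share with IPython is the handler signature, which receives the rest of the first line and the remaining cell body as two strings.

```python
MAGICS = {}  # maps magic name -> handler function


def cell_magic(func):
    """Register func under its own name so that '%%name' can find it."""
    MAGICS[func.__name__] = func
    return func


@cell_magic
def shout(line, cell):
    # 'line' is the text after '%%shout'; 'cell' is everything below it.
    return cell.upper() + line


def run_cell(source):
    """Send the cell to a magic if it starts with '%%', else mark it as Python."""
    first, _, rest = source.partition("\n")
    if first.startswith("%%"):
        name, _, args = first[2:].partition(" ")
        return MAGICS[name](args.strip(), rest)
    return "python: " + source


print(run_cell("%%shout !\nhello"))  # HELLO!
print(run_cell("1 + 1"))             # python: 1 + 1
```

In a real notebook the registration is done with IPython's own decorators; the point here is only that the handler decides what the cell body means.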

## Docker!

Have you met Docker?

* Lightweight container that makes use of the Linux kernel
* Simple to use
* Encapsulates complex builds / installations
* This notebook environment is available on Docker Hub as stevenrbrandt/clangmi
    - docker pull stevenrbrandt/clangmi
* Usually run through a related tool called docker-compose
    - git clone https://github.com/stevenrbrandt/module-interactive.git
    - docker-compose up -d
    - docker-compose logs (shows the URL)

## Our story begins...

## Cling!

Have you met Cling?

* C++Explorer is based on Cling.
* Cling is great!
  - based on Clang (i.e. it's interpreted Clang)
  - provides incremental compilation and execution
  - works well most of the time
* Cling, like any complex tool, has a few problems...
  - LLVM bug means no std::async() https://bugs.llvm.org/show_bug.cgi?id=21431#c5
  - When segfaults kill a notebook kernel, it's a bummer
  - Syntax errors can sometimes kill a Cling kernel
  - Other funny problems (https://github.com/root-project/root/issues/7952)
  - Based on Clang 5.0.0 (Cling reports version 0.8~dev)
* Is there an alternative?

## Modules!
* Modules provide incremental compilation
* They can be chained, each one importing and re-exporting the previous one
* So we make two types of cells
  - **def_code** to define code, functions, variables
  - **run_code** to produce output
* Works well most of the time
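
As a rough illustration of the chaining idea, the sketch below builds the C++ source text for a sequence of modules, each re-exporting its predecessor. This is an assumption about how such a tool might generate code, not a description of what runcode actually emits: the function `module_source` and the module names `m0`, `m1`, ... are hypothetical.

```python
def module_source(index, body, includes=("string", "iostream")):
    """Return C++ source for module number `index` (hypothetical names m0, m1, ...)."""
    lines = ["module;"]  # global module fragment: ordinary #includes go here
    lines += [f"#include <{header}>" for header in includes]
    lines.append(f"export module m{index};")
    if index > 0:
        # Re-export the previous module so every earlier symbol stays visible.
        lines.append(f"export import m{index - 1};")
    lines.append("export {")
    lines.append(body)
    lines.append("}")
    return "\n".join(lines)


first = module_source(0, 'std::string hello = "Hello";')
second = module_source(1, 'std::string world = "world";')
print(second)
```

With this layout, code that imports only the newest module sees everything defined in earlier cells, and only the newest cell has to be compiled.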

In [None]:
import runcode

# verbosity levels are:
# 0: no debug output
# 1: show all compilations
# 2: show generated source code
runcode.verbosity = 2

Our first code module. Note that string and iostream are included by default.

In [None]:
%%def_code
std::string hello = "Hello";

Our second module. Note that it incorporates the first.

In [None]:
%%def_code
std::string world = "world";

Our first time running code. Note that we make use of the two previous modules.

In [None]:
%%run_code
std::cout << hello << ", " << world << "." << std::endl;

In [None]:
# We no longer need to see all this debug output
runcode.verbosity = 0

The runcode module has two functions, def_code (which saves the symbols you define) and run_code (which does not).

In [None]:
# define a compiler flag in a Python variable
dem = "-DM=1"

In [None]:
%%run_code {dem} -DV=2 
std::cout << "hello M=" << M << " V=" << V << "\n";
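
The effect of `{dem}` on the magic line can be pictured with the sketch below: placeholders are filled in from the notebook's Python variables and the result is split into individual compiler flags. The helper `expand_flags` and the dictionary `user_ns` are hypothetical stand-ins; in a real notebook the substitution is typically done before the magic sees the line.

```python
import shlex


def expand_flags(line, namespace):
    """Fill {name} placeholders from `namespace`, then split into argv-style flags."""
    return shlex.split(line.format(**namespace))


user_ns = {"dem": "-DM=1"}  # stands in for the notebook's Python variables
flags = expand_flags("{dem} -DV=2", user_ns)
print(flags)  # ['-DM=1', '-DV=2']
```

Both flags end up on the compiler command line, which is why `M` and `V` are usable as macros in the cell body.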

We use the most inefficient form of the Fibonacci function as a proxy for work that takes a lot of time and effort. If we're using a notebook to do calculations, this is a likely situation.

In [None]:
%%def_code
int fib(int n) {
    if(n < 2)
        return n;
    return fib(n-1)+fib(n-2);
}

The code below takes a while to run.

In [None]:
%%run_code
std::cout << fib(42) << std::endl;
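
To see why this takes a while, we can count how many calls the naive recursion makes. The helper `fib_calls` below is a hypothetical name; it evaluates the recurrence calls(n) = 1 + calls(n-1) + calls(n-2) iteratively, so the count itself is cheap to compute.

```python
def fib_calls(n):
    """Number of calls the naive recursive fib(n) makes, computed iteratively."""
    # calls(n) = 1 for n < 2, otherwise 1 + calls(n-1) + calls(n-2)
    prev, cur = 1, 1  # calls(0), calls(1)
    for _ in range(2, n + 1):
        prev, cur = cur, 1 + prev + cur
    return cur


print(fib_calls(10))  # 177
print(fib_calls(42))  # 866988873
```

Roughly 867 million function calls for `fib(42)` is enough to make the delay noticeable in a notebook.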

Naively, we try to save the result.

In [None]:
%%def_code
int n42=fib(42);

Of course, this results in fib(42) being called in every piece of code we run that uses n42.

In [None]:
%%run_code
std::cout << "n42=" << n42 << std::endl;

Even though the symbol n42 is still defined inside a module we are importing, this code is fast. Apparently, lazy optimization works well.

In [None]:
%%run_code
std::cout << "n10=" << fib(10) << std::endl;

We can, potentially, fix this problem by using constexpr.

In [None]:
%%def_code
constexpr int fibc(int n) {
    if(n < 2) {
        return n;
    } else {
        return fibc(n-1)+fibc(n-2);
    }
}

In [None]:
cexpr_flags = "-fconstexpr-steps=5000000 -fconstexpr-depth=40"

Unfortunately, defining a constexpr variable inside a module doesn't work at the moment, so we introduce %%lib to mark code that goes in the .o file but not in the module.

In [None]:
%%def_code {cexpr_flags}
extern int fc28;
%%lib
constexpr int fc28_ = fibc(28);
int fc28 = fc28_;

This retrieves the value quickly...

In [None]:
%%run_code

std::cout << "fc28=" << fc28 << std::endl;

Maybe someday this facility will be easier to use. We note, however, that this technique essentially copies our result to a file (the object file) and retrieves it for us when we use it in subsequent calculations. This is fine if we just have an int, but not fine if we have a gigantic array that would be expensive to copy.

Fortunately, there's another way to store values between cell evaluations: shared memory. Let's create a basic counter class to put in shared memory.

In [None]:
%%def_code
struct Counter {
    int n;
    Counter() : n(0) {}
    ~Counter() { std::cout << "reset counter" << std::endl; }
    void count() {
        std::cout << "n=" << (n++) << std::endl;
    }
};

The Boost shared memory headers don't compile at the moment. To work around this issue, I made a smaller, simpler class to access shared memory: Seg.

In [None]:
%%run_code
Seg seg("mem");
Counter *c = seg.allocate<Counter>("counter");
c->count();
if(c->n == 5)
   seg.remove(c);
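
The same pattern (attach to a named segment, update a value that outlives the code that touched it, remove the segment when done) can be sketched with Python's standard `multiprocessing.shared_memory` module. This is only an analogy to Seg, not its implementation: the segment name and the helpers `count` and `remove` are hypothetical, and the sketch assumes a POSIX system such as the Linux container used here.

```python
import os
import struct
from multiprocessing import shared_memory

NAME = f"demo_counter_{os.getpid()}"  # hypothetical segment name


def count(name):
    """Attach to (or create) a named segment, bump its int, return the old value."""
    try:
        shm = shared_memory.SharedMemory(name=name, create=True, size=4)
        struct.pack_into("i", shm.buf, 0, 0)  # first use: initialise to 0
    except FileExistsError:
        shm = shared_memory.SharedMemory(name=name)  # later uses: reattach
    (n,) = struct.unpack_from("i", shm.buf, 0)
    struct.pack_into("i", shm.buf, 0, n + 1)
    shm.close()  # detach; the segment itself stays alive
    return n


def remove(name):
    """Destroy the named segment, like seg.remove() in the cell above."""
    shm = shared_memory.SharedMemory(name=name)
    shm.close()
    shm.unlink()


values = [count(NAME) for _ in range(3)]
remove(NAME)
print(values)  # [0, 1, 2]
```

Each call to `count` attaches and detaches independently, which mirrors how each run of the C++ cell is a fresh program that finds the counter where the previous run left it.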

Seg has some special code for arrays.

In [None]:
%%run_code
Seg seg("mem");
Array<double>& arr=*seg.allocate_array<double>("data", 100);
if(arr.init()) std::cout << "init" << std::endl;
seg.remove(&arr);

This code simply creates two arrays and populates them with data.

In [None]:
%%run_code
#include <math.h>

Seg seg("mem");
const int N=100;
Array<double>& a = *seg.allocate_array<double>("data",N);
Array<double>& b = *seg.allocate_array<double>("data2",N);
double dx = 15.0/a.size();
for(int i=0;i<a.size();i++) {
    double x = i*dx;
    a[i] = x;
    b[i] = sin(x);
}

Because my toy class, Seg, exposes Python's buffer protocol through pybind11, we can load the data arrays above, reinterpret them as numpy arrays, and plot them with matplotlib. Plotting is simple and convenient and already supported inside notebooks, so loading the shared memory segment into Python is probably the easiest way to visualize our data.

In [None]:
import clangmi
import numpy as np
import matplotlib.pyplot as plt

abuf = clangmi.allocate_array("mem","data",100)
a = np.asarray(abuf)

bbuf = clangmi.allocate_array("mem","data2",100)
b = np.asarray(bbuf)

plt.plot(a,b)
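
What the buffer protocol buys us is a view onto existing memory rather than a copy. The sketch below shows that with standard-library types only, so it does not need clangmi or numpy; the `array` object is a stand-in for memory owned by the C++ side.

```python
from array import array

data = array("d", [0.0, 0.5, 1.0])  # stands in for a C++-owned block of doubles
view = memoryview(data)             # zero-copy view obtained via the buffer protocol

view[1] = 2.5                       # writing through the view...
print(data[1])                      # ...changes the original storage: 2.5
print(view.format, view.itemsize, len(view))  # d 8 3
```

The view carries the element format and size along with the pointer, which is what lets a consumer such as `np.asarray` interpret the memory without being told the type separately.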

Parallel execution is straightforward. We can just call std::async().

In [None]:
%%run_code
import <future>;
auto a = std::async(std::launch::async, [](){ return 42; });
std::cout << "a=" << a.get() << std::endl;

We can, however, still use HPX parallelism. This may be necessary if we want to demonstrate cutting-edge parallelism features that are implemented in HPX but not yet supported by the compiler. To do this, we need to add some default flags to our compilations.

In [None]:
import re
import os

def rmitem(item_pattern,items):
    new_items = []
    for item in items:
        if not re.match(item_pattern, item):
            new_items += [item]
    return new_items

os.environ["LD_LIBRARY_PATH"]="/usr/local/lib64"
os.environ["PKG_CONFIG_PATH"]="/usr/local/lib64/pkgconfig"
runcode.verbosity = 0
from subprocess import Popen, PIPE
p = Popen("pkg-config --cflags --libs hpx_application_release".split(" "),
          stdout=PIPE,stderr=PIPE,universal_newlines=True)
out, err = p.communicate()
# Set the application flags
runcode.appflags = rmitem(r'-std=.*',out.strip().split(' '))

p = Popen("pkg-config --cflags hpx_application_release".split(" "),
          stdout=PIPE,stderr=PIPE,universal_newlines=True)
out, err = p.communicate()
# Set the module flags
runcode.modflags = rmitem(r'-std=.*',out.strip().split(' '))
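
To show what the filtering step does, here is the same idea applied to a made-up flag string. The sample flags are hypothetical (the real pkg-config output depends on the HPX installation), and `rmitem` is rewritten as a list comprehension with the same behavior as the loop above. Presumably the `-std=` flag is dropped so that it does not override the language standard chosen by runcode.

```python
import re


def rmitem(item_pattern, items):
    """Return the items that do not match item_pattern at their start."""
    return [item for item in items if not re.match(item_pattern, item)]


# Hypothetical pkg-config output, for illustration only.
sample = "-std=c++17 -I/usr/local/include -L/usr/local/lib64 -lhpx"
print(rmitem(r'-std=.*', sample.split(' ')))
# ['-I/usr/local/include', '-L/usr/local/lib64', '-lhpx']
```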

In [None]:
%%run_code
#include <hpx/hpx.hpp>
#include <hpx/hpx_main.hpp>

auto a = hpx::async([](){ return 42; });
std::cout << a.get() << std::endl;

Running HPX requires the setup above: library and pkg-config search paths, plus the compile and link flags reported by pkg-config.

Some things don't work yet. As mentioned before, the Boost shared memory headers don't compile. Also, the HPX headers don't work in a module. The "cxxabi.h" header simply doesn't work, and that's one of the things HPX uses.

Conclusion: Using modules for interactive computing is not quite ready for production use, but once Boost and other headers compile, it could be a viable alternative to Cling in our teaching modules.

<h2>Thank you!</h2>