In [1]:
var Immutable = require('immutable')
var _ = require('lodash')

var commutable = require('commutable')

// Node built-ins used by the benchmark cells
var fs = require('fs')
var path = require('path')

# Revival

JSON.parse takes an extra argument called a reviver:

```
JSON.parse(text[, reviver])
```

The reviver accepts two parameters, `key` and `value`, and returns the intended `value`. The key is always a string: a property name when the value sits in an Object, or an index such as `"0"` when it sits in an Array. The root value is passed last, with an empty-string key.

Let's walk through some sample code to check this out.

In [2]:
// Classic JSON.parse
JSON.parse('{"a": 2, "b": { "name": "dave" }}')

{ a: 2, b: { name: 'dave' } }

In [3]:
function reviver(key, value) {
    if(key === 'name') {
        return value + " senior";
    }
    return value
}

JSON.parse('{"a": 2, "b": { "name": "dave" }}', reviver)

{ a: 2, b: { name: 'dave senior' } }

This lets you change values based on their key, though the reviver is never told the nested path to that key within the overall JSON document.
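
To see what a reviver actually receives, the sketch below records every key it is called with. It relies only on standard `JSON.parse`; the names `calls` and `recordingReviver` are just for illustration. Values are visited depth-first, children before their parents, and the last call carries the root value under an empty-string key.

```javascript
// Record every key the reviver is called with, in call order.
const calls = [];

function recordingReviver(key, value) {
    calls.push(key);
    return value; // leave the parsed value unchanged
}

JSON.parse('{"a": 2, "b": {"name": "dave"}, "c": [10]}', recordingReviver);

console.log(calls);
// → [ 'a', 'name', 'b', '0', 'c', '' ]
```

Note that `name` shows up with no hint that it lives under `b`, which is why key-based rewriting can't distinguish two identically named keys at different depths.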

Since the string is (expected to be) JSON, there are only two types which are not immutable: `Array` and `Object`. You can use this to your advantage to create frozen or Immutable.js objects while parsing.

In [4]:
JSON.parse('{"a": 2, "b": { "name": "dave" }}', (k, v) => Object.freeze(v))

{ a: 2, b: { name: 'dave' } }
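
The printed output looks identical to a plain parse, so a quick check with `Object.isFrozen` helps confirm the freeze reached every level. This is a small sketch on a slightly extended input; `frozen` is an illustrative name. Primitives pass through `Object.freeze` untouched (ES2015 behavior), so only the objects and arrays are affected.

```javascript
// Freeze every value as it is revived; children are frozen before their parents.
const frozen = JSON.parse(
    '{"a": 2, "b": { "name": "dave" }, "c": [1, 2]}',
    (k, v) => Object.freeze(v)
);

console.log(Object.isFrozen(frozen));   // → true
console.log(Object.isFrozen(frozen.b)); // → true
console.log(Object.isFrozen(frozen.c)); // → true
```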

In [5]:
function immutableReviver(key, value) {
    if (Array.isArray(value)) {
        return Immutable.List(value);
    }

    // typeof null is also 'object'; let JSON null fall through unchanged
    if (value !== null && typeof value === 'object') {
        return Immutable.Map(value)
    }
    return value;
}

Since it seemed handy enough, I put [`immutable-reviver`](https://github.com/rgbkrk/immutable-reviver) on npm. We'll just use the version written here for now though.

In [6]:
revived = JSON.parse('{"a": 2, "b": { "name": "dave" }}', immutableReviver)

Map { "a": 2, "b": Map { "name": "dave" } }

In [7]:
revived.getIn(['b', 'name'])

'dave'

I started looking into this because I wanted to see whether I could optimize the loading of notebooks in nteract. We currently rely on a strategy that goes like:

```
notebook = JSON.parse(rawNotebook)
immutableNotebook = Immutable.fromJS(notebook)

ourNotebook = immutableNotebook.map(...).map(...)... // A series of transformations to create our in-memory representation
```

These transformations are mostly to turn notebook cells from this:


```
{
  "metadata": {
    "collapsed": false,
    "outputExpanded": false
  },
  "cell_type": "markdown",
  "source": [
    "# Outputs you can update by name\n",
    "\n",
    "This notebook demonstrates the new name-based display functionality in the notebook. Previously, notebooks could only attach output to the cell that was currently being executed:\n",
    "\n"
  ]
}
```

into:

```
{
  "metadata": {
    "collapsed": false,
    "outputExpanded": false
  },
  "cell_type": "markdown",
  "source": "# Outputs you can update by name\n\nThis notebook demonstrates the new name-based display functionality in the notebook. Previously, notebooks could only attach output to the cell that was currently being executed:\n\n"
}
```

This multi-line string format, introduced by Jupyter, is meant to accommodate diffing of notebooks in tools like git and GitHub. It's applied to `source` on cells as well as to some output types.
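
As a minimal sketch of that conversion in both directions, using plain strings and arrays only: `joinMultiline` and `splitMultiline` are hypothetical helper names, not functions from nteract or commutable, and the split assumes each array element ends at a `\n`, as in the cell shown earlier.

```javascript
// Join an nbformat-style array of lines; leave plain strings untouched.
function joinMultiline(source) {
    return Array.isArray(source) ? source.join('') : source;
}

// Split back into lines, keeping each trailing "\n" so the round trip is lossless.
function splitMultiline(text) {
    return text.match(/[^\n]*\n|[^\n]+/g) || [];
}

const onDisk = ['# Title\n', '\n', 'Some text'];
const inMemory = joinMultiline(onDisk);

console.log(JSON.stringify(inMemory));
// → "# Title\n\nSome text"
console.log(splitMultiline(inMemory));
// → [ '# Title\n', '\n', 'Some text' ]
```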

We can set up a reviver that handles all the keys that are most likely to have [multi-line strings](https://github.com/jupyter/nbformat/blob/62d6eb8803616d198eaa2024604d1fe923f2a7b3/nbformat/v4/nbformat.v4.schema.json#L386). We'll start with the media types that we know end up encoded as an array of strings.

In [8]:
var multilineStringMimetypes = new Set([
    'application/javascript',
    'text/html',
    'text/markdown',
    'text/latex',
    'image/svg+xml',
    'image/gif',
    'image/png',
    'image/jpeg',
    'application/pdf',
    'text/plain',
]);

function immutableNBReviver(key, value) {
    if (Array.isArray(value)) {
        if(multilineStringMimetypes.has(key)) {
            return value.join('')
        }
        return Immutable.List(value);
    }

    // typeof null is also 'object'; let JSON null fall through unchanged
    if (value !== null && typeof value === 'object') {
        return Immutable.Map(value)
    }
    return value;
}

We can also set up a "greedy" reviver that converts `source` and `text` fields as well. The primary problem with this is that, because of how `JSON.parse` works, we have no idea whether a given key is in a cell where we expect it, part of someone else's JSON payload, or in metadata.

In [9]:
var specialKeys = new Set([
    'application/javascript',
    'text/html',
    'text/markdown',
    'text/latex',
    'image/svg+xml',
    'image/gif',
    'image/png',
    'image/jpeg',
    'application/pdf',
    'text/plain',
    'source',
    'text',
]);

function immutableGreedyReviver(key, value) {
    if (Array.isArray(value)) {
        if(specialKeys.has(key)) {
            return value.join('')
        }
        return Immutable.List(value);
    }

    // typeof null is also 'object'; let JSON null fall through unchanged
    if (value !== null && typeof value === 'object') {
        return Immutable.Map(value)
    }
    return value;
}
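
To make the ambiguity concrete, here is a sketch with plain objects (no Immutable.js) and a made-up cell. `greedyJoinReviver` is a simplified stand-in for the greedy reviver, and the `metadata.tags.text` field is purely hypothetical: it holds a genuine list, yet it gets joined just like `source`.

```javascript
// Simplified stand-in for the greedy reviver: join arrays stored under these keys.
const greedyKeys = new Set(['source', 'text']);

function greedyJoinReviver(key, value) {
    if (Array.isArray(value) && greedyKeys.has(key)) {
        return value.join('');
    }
    return value;
}

// Hypothetical cell: metadata.tags.text is a real list, not a multi-line string.
const rawCell = JSON.stringify({
    cell_type: 'markdown',
    source: ['# Hello\n', 'world'],
    metadata: { tags: { text: ['draft', 'todo'] } },
});

const cell = JSON.parse(rawCell, greedyJoinReviver);

console.log(JSON.stringify(cell.source));
// → "# Hello\nworld"
console.log(cell.metadata.tags.text);
// → drafttodo
```

The second result is the false positive: the reviver only saw the key `text`, never the path leading to it.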

# Our runtime harnesses

To evaluate the speed at which we can revive our objects, we'll set up a little testing harness.

In [10]:
// Some logger that uses process.hrtime that I ripped off Stack Overflow, since we want to use timing in a way that we can't with console.time

[ a, o, ms, s, log ] = ( function * () {
    yield * [
        ( process.hrtime )(),
        process.hrtime,
        ms => ( ( ms[ 0 ] * 1e9 + ms[ 1 ] ) / 1000000 ),
        s  => s / 1000,
        () => {
            const f = o( a ), msf = ms( f ), sf = s( msf );
            return { a, o: f, ms: msf, s: sf };
        }
    ];
} )();

{}
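
The destructured generator is compact but hard to read. A more explicit timer along the same lines could be sketched with `process.hrtime.bigint()`, assuming Node.js 10.7 or later. `elapsedMs` and `sample` are hypothetical names and are not used by the rest of this notebook.

```javascript
// Milliseconds taken by a single call to f, measured with the nanosecond clock.
function elapsedMs(f) {
    const start = process.hrtime.bigint();
    f();
    const end = process.hrtime.bigint();
    return Number(end - start) / 1e6; // nanoseconds -> milliseconds
}

const sample = elapsedMs(() => JSON.parse('{"a": 2, "b": { "name": "dave" }}'));
console.log(sample.toFixed(4), 'ms');
```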

In [11]:
// Calculate the milliseconds it takes to run f
function measure(f) {
  var start = log()
  f()
  var end = log()
  return end.ms - start.ms
}

// measure the function run n times, return the mean
function runTrials(f, n=1000) {
    var values = []
    for(var ii=0; ii < n; ii++) {
        values.push(measure(f))
    }
    return values.reduce((a, b) => a + b, 0)/n
}

With our harness all set up, we can run through all the notebooks we have locally to see how they perform with different revivers.

In [12]:
notebooks = require('glob').sync('./*.ipynb')

for(var notebookPath of notebooks) {
    console.log("\n ----- ", path.basename(notebookPath))
    // raw is a Buffer, so every JSON.parse(raw) below also pays to decode it to a string
    raw = fs.readFileSync(notebookPath)
    
    var tests = [
        { name: 'straight JSON.parse', f: () => { JSON.parse(raw) } },
        { name: 'Object.freeze', f: () => { JSON.parse(raw, (k, v) => Object.freeze(v)) } },
        { name: 'basic Immutable', f: () => { JSON.parse(raw, immutableReviver) } },
        { name: 'immutable notebook', f: () => { JSON.parse(raw, immutableNBReviver) } },
        { name: 'immutable greedy nb', f: () => { JSON.parse(raw, immutableGreedyReviver) } },
        // { name: 'fromJS', f: () => { JSON.parse(raw, (k, v) => Immutable.fromJS(v)) } },
        // { name: 'current commutable way', f: () => { commutable.fromJS(JSON.parse(raw)) } },
    ]
    
    for(var test of tests) {
        mean = runTrials(test.f, 100)
        console.log(_.padEnd(test.name, 30), mean)
    }
    

}




 -----  altair.ipynb
straight JSON.parse            1.0749021599999902
Object.freeze                  2.260570740000003
basic Immutable                7.50018614
immutable notebook             7.798826189999991
immutable greedy nb            7.281795899999938

 -----  display-updates.ipynb
straight JSON.parse            0.05507465000002412
Object.freeze                  0.2737798099999691
basic Immutable                0.4565474500000255
immutable notebook             0.39348404999997
immutable greedy nb            0.35731991000001473

 -----  download-stats.ipynb
straight JSON.parse            0.03955825000002733
Object.freeze                  0.12198090999999295
basic Immutable                0.15957000999999763
immutable notebook             0.14660479000001034
immutable greedy nb            0.1560780199999499

 -----  geojson.ipynb
straight JSON.parse            0.05292227999999341
Object.freeze                  0.14175530000001346
basic Immutable                0.2406329499999628

# Evaluating revivers for notebook loading

Within nteract we are inevitably going to end up creating an immutable structure. These measurements only make sense in the context of running the initial `JSON.parse` together with the transformations that follow it. As a rough estimate, I'll compare only the few approaches I can evaluate here.

In [20]:
notebooks = require('glob').sync('./*.ipynb')

for(var notebookPath of notebooks) {
    console.log("\n ----- ", path.basename(notebookPath))
    raw = fs.readFileSync(notebookPath)
    
    var tests = [
        { name: 'straight JSON.parse baseline', f: () => { JSON.parse(raw) } },
        { name: 'Object.freeze baseline', f: () => { JSON.parse(raw, (k,v) => Object.freeze(v)) } },
        { name: 'straight JSON.parse then commutable conversion', f: () => { commutable.fromJS(JSON.parse(raw)) } },
        { name: 'immutable greedy nb', f: () => { JSON.parse(raw, immutableGreedyReviver) } },
    ]
    
    for(var test of tests) {
        mean = runTrials(test.f, 100)
        console.log(_.padEnd(test.name, 50), mean.toString().slice(0,10), 'ms')
    }
}


 -----  altair.ipynb
straight JSON.parse baseline                       1.10996391 ms
Object.freeze baseline                             2.29745900 ms
straight JSON.parse then commutable conversion     6.84918417 ms
immutable greedy nb                                5.85418076 ms

 -----  display-updates.ipynb
straight JSON.parse baseline                       0.05840098 ms
Object.freeze baseline                             0.26966722 ms
straight JSON.parse then commutable conversion     0.84944353 ms
immutable greedy nb                                0.46853360 ms

 -----  download-stats.ipynb
straight JSON.parse baseline                       0.03572858 ms
Object.freeze baseline                             0.17462950 ms
straight JSON.parse then commutable conversion     0.31322497 ms
immutable greedy nb                                0.24238763 ms

 -----  geojson.ipynb
straight JSON.parse baseline                       0.05013291 ms
Object.freeze baseline                           

Since these timings are all in milliseconds and the gap between the commutable conversion and the greedy reviver is small (about 1 ms at most, on altair.ipynb), it seems like this may not need to be optimized. The altair notebook has a single, pretty big JSON structure inside; in a case like that, perhaps it would make sense for some of our structure to be frozen objects rather than forcing vega payloads to be Immutable Maps.

```
 -----  altair.ipynb
straight JSON.parse baseline                       1.10996391 ms
Object.freeze baseline                             2.29745900 ms
straight JSON.parse then commutable conversion     6.84918417 ms
immutable greedy nb                                5.85418076 ms
```