## Nodebooks: Introducing Node.js Data Science Notebooks

Notebooks are where data scientists process, analyse, and visualise data in an iterative, collaborative environment. They typically run environments for languages like Python, R, and Scala. For years, data science notebooks have served academics and research scientists as a scratchpad for writing code, refining algorithms, and sharing and proving their work. Today, it's a workflow that lends itself well to web developers experimenting with data sets in Node.js.

To that end, pixiedust_node is an add-on for Jupyter notebooks that allows Node.js/JavaScript to run inside notebook cells. Not only can web developers use the same workflow for collaborating in Node.js, but they can also use the same tools to work with existing data scientists coding in Python.

pixiedust_node is built on the popular PixieDust helper library. Let’s get started.

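> Note: Run one cell at a time and wait for its output before running the next. Node.js output arrives asynchronously, so running cells in quick succession can produce unexpected results.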


## Part 1: Variables, functions, and promises


### Installing
Install the [`pixiedust`](https://pypi.python.org/pypi/pixiedust) and [`pixiedust_node`](https://pypi.python.org/pypi/pixiedust-node) packages using `pip`, the Python package manager. 

In [10]:
# install or upgrade the packages
# restart the kernel to pick up the latest version
!pip3 install pixiedust --upgrade
!pip3 install pixiedust_node --upgrade

Collecting pixiedust
Collecting lxml (from pixiedust)
  Using cached https://files.pythonhosted.org/packages/dd/ba/a0e6866057fc0bbd17192925c1d63a3b85cf522965de9bc02364d08e5b84/lxml-4.5.0-cp36-cp36m-manylinux1_x86_64.whl
Collecting markdown (from pixiedust)
  Using cached https://files.pythonhosted.org/packages/c0/4e/fd492e91abdc2d2fcb70ef453064d980688762079397f779758e055f6575/Markdown-3.1.1-py2.py3-none-any.whl
Collecting mpld3 (from pixiedust)
Collecting requests (from pixiedust)
  Using cached https://files.pythonhosted.org/packages/51/bd/23c926cd341ea6b7dd0b2a00aba99ae0f828be89d72b2190f27c11d4b7fb/requests-2.22.0-py2.py3-none-any.whl
Collecting geojson (from pixiedust)
  Using cached https://files.pythonhosted.org/packages/e4/8d/9e28e9af95739e6d2d2f8d4bef0b3432da40b7c3588fbad4298c1be09e48/geojson-2.5.0-py2.py3-none-any.whl
Collecting astunparse (from pixiedust)
  Using cached https://files.pythonhosted.org/packages/2b/03/13dde6512ad7b4557eb792fbcf0c653af6076b81e5941d36ec61f7ce6028/a

Collecting idna<2.9,>=2.5 (from requests->pixiedust->pixiedust_node)
  Using cached https://files.pythonhosted.org/packages/14/2c/cd551d81dbe15200be1cf41cd03869a46fe7226e7450af7a6545bfc474c9/idna-2.8-py2.py3-none-any.whl
Collecting certifi>=2017.4.17 (from requests->pixiedust->pixiedust_node)
  Using cached https://files.pythonhosted.org/packages/b9/63/df50cac98ea0d5b006c55a399c3bf1db9da7b5a24de7890bc9cfd5dd9e99/certifi-2019.11.28-py2.py3-none-any.whl
Collecting ptyprocess>=0.5 (from pexpect; sys_platform != "win32"->ipython->pixiedust_node)
  Using cached https://files.pythonhosted.org/packages/d1/29/605c2cc68a9992d18dada28206eeada56ea4bd07a239669da41674648b6f/ptyprocess-0.6.0-py2.py3-none-any.whl
Collecting parso>=0.5.2 (from jedi>=0.10->ipython->pixiedust_node)
  Downloading https://files.pythonhosted.org/packages/ec/bb/3b6c9f604ac40e2a7833bc767bd084035f12febcbd2b62204c5bc30edf97/parso-0.6.1-py2.py3-none-any.whl (97kB)

### Using pixiedust_node
Now we can import `pixiedust_node` into our notebook:

In [1]:
import pixiedust_node

Pixiedust database opened successfully


pixiedust_node 0.2.5 started. Cells starting '%%node' may contain Node.js code.


And then we can write JavaScript code in cells whose first line is `%%node`:

In [2]:
%%node
// get the current date
var date = new Date();
console.log(date);

It’s that easy! We can have Python and Node.js in the same notebook. Cells are Python by default, but simply starting a cell with `%%node` indicates that the next lines will be JavaScript.

### Displaying HTML and images in notebook cells
We can use the `html` function to render HTML code in a cell:

In [3]:
%%node
var str = '<h2>Quote</h2><blockquote cite="https://www.quora.com/Albert-Einstein-reportedly-said-The-true-sign-of-intelligence-is-not-knowledge-but-imagination-What-did-he-mean">"Imagination is more important than knowledge"\nAlbert Einstein</blockquote>';
html(str)

If we have an image we want to render, we can do that with the `image` function:

In [4]:
%%node
var url = 'https://github.com/IBM/nodejs-in-notebooks/blob/master/notebooks/images/pixiedust_node_schematic.png?raw=true';
image(url);

### Printing JavaScript variables

Print variables using `console.log`.

In [5]:
%%node
var x = { a:1, b:'two', c: true };
console.log(x);

Calling the `print` function within your JavaScript code is the same as calling `print` in your Python code.

In [6]:
%%node
var y = { a:3, b:'four', c: false };
print(y);

### Visualizing data using PixieDust
You can also use PixieDust’s `display` function to render data graphically. The cell below builds 1,000 rows of sine, cosine, and tangent values and passes them to `display`; the result can then be configured as a line chart:

In [7]:
%%node
var data = [];
for (var i = 0; i < 1000; i++) {
    var x = 2*Math.PI * i/ 360;
    var obj = {
      x: x,
      i: i,
      sin: Math.sin(x),
      cos: Math.cos(x),
      tan: Math.tan(x)
    };
    data.push(obj);
}
// render data 

display(data);

PixieDust presents visualisations of DataFrames using Matplotlib, Bokeh, Brunel, d3, Google Maps, and MapBox. No code is required on your part because PixieDust presents simple pull-down menus and a friendly point-and-click interface, allowing you to configure how the data is presented:

<img src="https://github.com/IBM/nodejs-in-notebooks/blob/master/notebooks/images/pd_chart_types.png?raw=true"></img>

### Adding npm modules
npm, Node.js’s package manager, hosts thousands of libraries and tools, so it’s essential that we can install npm modules and use them in our notebook code.
Let’s say we want to make some HTTP calls to an external API service. We could use Node.js’s low-level HTTP library, but an easier option is the widely used `request` npm module.
Once we have pixiedust_node set up, installing an npm module is as simple as running `npm.install` in a Python cell:

In [None]:
npm.install('request');

Once installed, you may `require` the module in your JavaScript code:

In [9]:
%%node
var request = require('request');
var r = {
    method:'GET',
    url: 'http://api.open-notify.org/iss-now.json',
    json: true
};
request(r, function(err, res, body) {
    console.log(body);
});


As an HTTP request is an asynchronous action, the `request` library calls our callback function when the operation has completed. Inside that function, we can call `console.log` (or `print`) to render the data.
We can organise our code into functions to encapsulate complexity and make it easier to reuse code. We can create a function to get the current position of the International Space Station in one notebook cell:

In [10]:
%%node
var request = require('request');
var getPosition = function(callback) {
    var r = {
        method:'GET',
        url: 'http://api.open-notify.org/iss-now.json',
        json: true
    };
    request(r, function(err, res, body) {
        var obj = null;
        if (!err) {
            obj = body.iss_position;
            obj.latitude = parseFloat(obj.latitude);
            obj.longitude = parseFloat(obj.longitude);
            obj.time = new Date().getTime();       
        }
        callback(err, obj);
    });
};

And use it in another cell:

In [11]:
%%node
getPosition(function(err, data) {
    console.log(data);
});
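
The error-first callback convention that `getPosition` follows can be tried without network access. The sketch below is only an illustration: `fakeRequest`, `toPosition`, and `getPositionStub` are hypothetical names, and the canned response simply mirrors the shape of the ISS output recorded in this notebook. It can be run as a plain Node.js script or pasted into a `%%node` cell.

```javascript
// Hypothetical stand-in for the HTTP call: replies asynchronously with a
// canned body shaped like the ISS response recorded in this notebook.
function fakeRequest(options, callback) {
  setImmediate(function () {
    callback(null, { statusCode: 200 }, {
      message: 'success',
      iss_position: { longitude: '105.0944', latitude: '-21.3439' }
    });
  });
}

// Pure helper: convert the string coordinates to numbers and attach a timestamp.
function toPosition(body, now) {
  return {
    latitude: parseFloat(body.iss_position.latitude),
    longitude: parseFloat(body.iss_position.longitude),
    time: now
  };
}

// Error-first callback: the first argument is the error (or null),
// the second is the result.
function getPositionStub(callback) {
  fakeRequest({ json: true }, function (err, res, body) {
    if (err) {
      return callback(err, null);
    }
    callback(null, toPosition(body, Date.now()));
  });
}

getPositionStub(function (err, data) {
  if (err) {
    console.error('request failed:', err);
    return;
  }
  console.log(data);
});
```

Checking `err` before touching `data` matters because `data` is `null` whenever the request fails.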

### Promises
If you prefer to work with JavaScript Promises when writing asynchronous code, then that’s okay too. Let’s rewrite our `getPosition` function to return a Promise. First we're going to install the `request-promise` module from npm:

In [12]:
npm.install( ('request', 'request-promise') )

/usr/bin/npm install -s request request-promise
{ timestamp: 1580782820,
iss_position: { longitude: '105.0944', latitude: '-21.3439' },
message: 'success' }
{ longitude: 105.0944, latitude: -21.3439, time: 1580782820705 }
/home/jared/node
├── node-fetch@2.6.0
├── promise@8.0.3
├── request@2.88.0
└── request-promise@4.2.5


Notice how you can install multiple modules in a single call. Just pass in a Python `list` or `tuple`.
Then we can refactor our function a little:

In [13]:
%%node
var request = require('request-promise');
var getPosition = function() {
    var r = {
        method:'GET',
        url: 'http://api.open-notify.org/iss-now.json',
        json: true
    };
    // request-promise returns a Promise that resolves with the parsed body
    return request(r).then(function(body) {
        var obj = body.iss_position;
        obj.latitude = parseFloat(obj.latitude);
        obj.longitude = parseFloat(obj.longitude);
        obj.time = new Date().getTime();
        return obj;
    });
};

And call it in the Promises style:

In [14]:
%%node
getPosition().then(function(data) {
  console.log(data);
}).catch(function(err) {
  console.error(err);    
});

Or call it in a more compact form:

In [15]:
%%node
getPosition().then(console.log).catch(console.error);
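
If you already have a callback-style function and would rather not pull in another module, Node.js's built-in `util.promisify` offers one possible route. The sketch below is an assumption-laden illustration, not part of the original notebook: `getPositionCb` is a hypothetical offline stand-in that returns fixed coordinates instead of calling the ISS API.

```javascript
const util = require('util');

// Hypothetical error-first callback function, standing in for a network
// call so the sketch runs offline.
function getPositionCb(callback) {
  setImmediate(function () {
    callback(null, { latitude: -21.3439, longitude: 105.0944 });
  });
}

// util.promisify wraps an error-first callback function so that it
// returns a Promise instead of taking a callback.
const getPositionP = util.promisify(getPositionCb);

const pending = getPositionP();
pending.then(console.log).catch(console.error);
```

The wrapped function is called exactly like the Promise-based `getPosition` above, so both styles can coexist in the same notebook.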

***
# Part 3: Sharing data between Python and Node.js cells

You can share variables between Python and Node.js cells. Why would you want to do that? Read on.

The Node.js library ecosystem is extensive. Perhaps you need to fetch data from a database and prefer the syntax of a particular Node.js npm module. You can use Node.js to fetch the data, move it to the Python environment, and convert it into a Pandas or Spark DataFrame for aggregation, analysis and visualisation.

PixieDust and pixiedust_node give you the flexibility to mix and match Python and Node.js code to suit the workflow you are building and the skill sets you have in your team.

Mixing Node.js and Python code in the same notebook is a great way to integrate the work of your software development and data science teams to produce a collaborative report or dashboard.


### Sharing data

Define variables in a Python cell.

In [16]:
# define a couple variables in Python
a = 'Hello from Python!'
b = 2
c = False
d = {'x':1, 'y':2}
e = 3.142
f = [{'a':1}, {'a':2}, {'a':3}]

Access or modify their values in Node.js cells.

In [17]:
%%node
// print variable values
console.log(a, b, c, d, e, f);

// change variable value 
a = 'Hello from Node.js!';

// define a new variable
var g = 'Yes, it works both ways.';

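Inspect the manipulated data. Run this cell only after the Node.js cell above has finished; the recorded output below shows a `NameError` for `g`, which may be the kind of unexpected result the note at the top of this notebook warns about.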

In [18]:
# display modified variable and the new variable
print('{} {}'.format(a,g))

NameError: name 'g' is not defined

Hello from Python! 2 false { x: 1, y: 2 } 3.142 [ { a: 1 }, { a: 2 }, { a: 3 } ]
{ longitude: 105.4089, latitude: -20.9819, time: 1580782828008 }
{ longitude: 105.4089, latitude: -20.9819, time: 1580782828027 }


**Note:** PixieDust natively supports [data sharing between Python and Scala](https://ibm-watson-data-lab.github.io/pixiedust/scalabridge.html), extending the loop for some data types:
 ```
 %%scala
 println(a,b,c,d,e,f,g)
 
 (Hello from Node.js!,2,null,null,null,null,Yes, it works both ways.)
 ```

### Sharing data from an asynchronous callback

If you wish to transfer data from Node.js to Python from an asynchronous callback, make sure you write the data to a global variable.

Load a CSV file from a GitHub repository.

In [None]:
%%node

// global variable
var sample_csv_data = '';

// load csv file from GitHub and store data in the global variable
request.get('https://github.com/ibm-watson-data-lab/open-data/raw/master/cars/cars.csv').then(function(data) {
  sample_csv_data = data;
  console.log('Fetched sample data from GitHub.');
});

Create a Pandas DataFrame from the downloaded data.

In [None]:
import pandas as pd
import io
# create DataFrame from shared csv data
pandas_df = pd.read_csv(io.StringIO(sample_csv_data))
# display first five rows
pandas_df.head(5)

In [None]:
pandas_df.shape

**Note**: The example above is for illustrative purposes only. A much easier solution is to use [PixieDust's sampleData method](https://ibm-watson-data-lab.github.io/pixiedust/loaddata.html#load-a-csv-using-its-url) if you want to create a DataFrame from a URL.
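
The timing behind the global-variable advice can be seen in a small offline sketch. Everything here is hypothetical: `loadCsv` stands in for the HTTP request, `sample_data` for the shared global, and the two-row CSV text is made up for illustration. The point is that the global stays empty until the asynchronous callback has run, so the consuming cell should only be executed after the "fetched" message appears.

```javascript
// Global variable that the asynchronous callback will fill in later.
var sample_data = '';

// Hypothetical loader standing in for the HTTP request; the CSV text is
// invented sample data, not the contents of the real file.
function loadCsv() {
  return new Promise(function (resolve) {
    setTimeout(resolve, 100, 'name,value\nfirst,1\nsecond,2\n');
  });
}

loadCsv().then(function (text) {
  sample_data = text;
  // Count data rows (all lines except the header).
  console.log('fetched rows:', text.trim().split('\n').length - 1);
});

// This line runs before the callback above, so the global is still empty.
console.log('immediately after the call:', JSON.stringify(sample_data));
```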

#### References:
 * [Nodebooks: Introducing Node.js Data Science Notebooks](https://medium.com/ibm-watson-data-lab/nodebooks-node-js-data-science-notebooks-aa140bea21ba)
 * [Nodebooks: Sharing Data Between Node.js & Python](https://medium.com/ibm-watson-data-lab/nodebooks-sharing-data-between-node-js-python-3a4acae27a02)
 * [Sharing Variables Between Python & Node.js in Jupyter Notebooks](https://medium.com/ibm-watson-data-lab/sharing-variables-between-python-node-js-in-jupyter-notebooks-682a79d4bdd9)

# Main Part

## Callbacks

In [None]:
import pixiedust_node

In [None]:
%%node

posts = [
  { title: 'Post One', body: 'This is post one' },
  { title: 'Post Two', body: 'This is post two' }
];

In [None]:
%%node

function getPosts() {
  // print every post title after a one-second delay
  setTimeout(() => {
    let output = '';
    posts.forEach(post => {
      output += `${post.title}\n`;
    });
    console.log(output);
  }, 1000);
}

function createPost(post, callback) {
  setTimeout(() => {
    posts.push(post);
    callback();
  }, 2000);
}

createPost({ title: 'Post Three', body: 'This is post three' }, getPosts);
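
Why pass `getPosts` as a callback at all? The sketch below, using hypothetical names (`demoPosts`, `titles`, `addPostLater`) and shorter delays than the cells above, shows what happens when the list is read straight after the call instead of from inside the callback.

```javascript
const demoPosts = [{ title: 'Post One' }, { title: 'Post Two' }];

// Join the titles of a list of posts, one per line.
function titles(list) {
  return list.map(post => post.title).join('\n');
}

// Add a post after a short delay, then invoke the optional callback.
function addPostLater(post, callback) {
  setTimeout(() => {
    demoPosts.push(post);
    if (callback) {
      callback();
    }
  }, 20);
}

// Without a callback: this read happens before the delayed push.
addPostLater({ title: 'Post Three' });
console.log('read too early:\n' + titles(demoPosts));

// With a callback: the read happens only after the push has completed.
addPostLater({ title: 'Post Four' }, () => {
  console.log('read from callback:\n' + titles(demoPosts));
});
```

The first read misses the new post because the timer has not fired yet; the second sees it because it runs inside the callback.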

## Promises

In [None]:
npm.install('promise')

In [None]:
%%node
var Promise = require('promise');

function createPost(post) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      posts.push(post);

      const error = false;

      if (!error) {
        resolve();
      } else {
        reject('Error: Something went wrong');
      }
    }, 2000);
  });
}

In [None]:
%%node

createPost({ title: 'Post Four', body: 'This is post four' }).then(getPosts).catch(err => console.log(err));

In [None]:
npm.install('node-fetch')

In [None]:
%%node
const fetch = require('node-fetch');

// Promise.all
const promise1 = Promise.resolve('Hello World');
const promise2 = 10;
const promise3 = new Promise((resolve, reject) =>
  setTimeout(resolve, 2000, 'Goodbye')
);
const promise4 = fetch('https://jsonplaceholder.typicode.com/users').then(res =>
  res.json()
);

Promise.all([promise1, promise2, promise3, promise4]).then(values =>
  console.log(values)
);
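
One property of `Promise.all` worth seeing in isolation is that the resolved array follows the order of the inputs, not the order in which they finish. The following is a minimal offline sketch; `delay` is a hypothetical helper introduced only for this illustration.

```javascript
// Resolve with `value` after `ms` milliseconds.
function delay(ms, value) {
  return new Promise(resolve => setTimeout(resolve, ms, value));
}

// The slowest promise comes first; a plain value is allowed as well.
const inputs = [delay(30, 'slow'), 10, delay(5, 'fast')];
const combined = Promise.all(inputs);

// Results keep the input order: [ 'slow', 10, 'fast' ]
combined.then(values => console.log(values));
```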


## Async/Await

In [None]:
%%node

createPost({ title: 'Post Five', body: 'This is post Five' }).then(getPosts).catch(err => console.log(err));


The cell above calls `.then()` on the promise. The cell below uses the new `async/await` syntax to perform the same task.

In [None]:
%%node

async function init() {
  await createPost({ title: 'Post Six', body: 'This is post Six' });

  getPosts();
}

init();
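
Calling an `async` function such as `init()` returns a Promise, so a failure inside it surfaces as a rejection that still needs handling. The sketch below illustrates this with hypothetical names: `failingCreate` is a stand-in for `createPost` that always rejects, and `initSketch` mirrors the shape of `init`.

```javascript
// Hypothetical stand-in for createPost that always fails after a short delay.
function failingCreate() {
  return new Promise((resolve, reject) => {
    setTimeout(reject, 10, new Error('Something went wrong'));
  });
}

async function initSketch() {
  await failingCreate(); // the rejection is rethrown here
  return 'done';
}

// An async function always returns a Promise, so .catch can be attached.
const outcome = initSketch();
outcome.catch(err => console.log('caught:', err.message));
```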

In [None]:
%%node

// Async / Await / Fetch

async function fetchUsers() {
  const res = await fetch('https://jsonplaceholder.typicode.com/users');

  const data = await res.json();

  console.log(data);
}

fetchUsers();


# Homework

In [1]:
import pixiedust_node

Pixiedust database opened successfully


pixiedust_node 0.2.5 started. Cells starting '%%node' may contain Node.js code.


## Promise

### 1) Create a promise that resolves in 3 seconds and returns the string "success"




In [2]:
%%node

//1)

var Promise = require('promise');

// resolve with the string "success" after three seconds
function threeSecondPromise() {
    return new Promise((resolve, reject) => {
        setTimeout(() => {
            var error = false;
            if (!error) {
                resolve("success");
            } else {
                reject("failure");
            }
        }, 3000);
    });
}


### 2) Run the above promise and make it console.log "success"



In [3]:
%%node
//2)
threeSecondPromise().then(console.log).catch(console.log)

success


### 3) Use Promise.all to fetch all of these people from Star Wars (SWAPI) at the same time.
Use `console.log` to print the output and make sure it has a catch block as well.
```
const urls = [
  'https://swapi.co/api/people/1',
  'https://swapi.co/api/people/2',
  'https://swapi.co/api/people/3',
  'https://swapi.co/api/people/4'
]
```


In [4]:
%%node
var urls = [
]

...


In [5]:
%%node
var Promise = require('promise');
var request = require('request-promise');
urls = [
  'https://swapi.co/api/people/1',
  'https://swapi.co/api/people/2',
  'https://swapi.co/api/people/3',
  'https://swapi.co/api/people/4'
]

console.log("working");
for ( var i = 0 ; i < urls.length; i++){
    request.get(urls[i]).then(console.log).catch(function(err) { console.log("Something went wrong")})
    
}

working
{"name":"C-3PO","height":"167","mass":"75","hair_color":"n/a","skin_color":"gold","eye_color":"yellow","birth_year":"112BBY","gender":"n/a","homeworld":"https://swapi.co/api/planets/1/","films":["https://swapi.co/api/films/2/","https://swapi.co/api/films/5/","https://swapi.co/api/films/4/","https://swapi.co/api/films/6/","https://swapi.co/api/films/3/","https://swapi.co/api/films/1/"],"species":["https://swapi.co/api/species/2/"],"vehicles":[],"starships":[],"created":"2014-12-10T15:10:51.357000Z","edited":"2014-12-20T21:17:50.309000Z","url":"https://swapi.co/api/people/2/"}
{"name":"Darth Vader","height":"202","mass":"136","hair_color":"none","skin_color":"white","eye_color":"yellow","birth_year":"41.9BBY","gender":"male","homeworld":"https://swapi.co/api/planets/1/","films":["https://swapi.co/api/films/2/","https://swapi.co/api/films/6/","https://swapi.co/api/films/3/","https://swapi.co/api/films/1/"],"species":["https://swapi.co/api/species/1/


### 4) Change one of your urls above to make it incorrect and fail the promise
Does your catch block handle it?

In [6]:
%%node
urls = [
  'https://swapi.co/api/people/1/dne',
  'https://swapi.co/api/people/2',
  'https://swapi.co/api/people/3',
  'https://swapi.co/api/people/4'
]

console.log("working");
for ( var i = 0 ; i < urls.length; i++){
    request.get(urls[i]).then(console.log).catch(function(err) { console.log("Something went wrong")})
    
}

working
Something went wrong
{"name":"R2-D2","height":"96","mass":"32","hair_color":"n/a","skin_color":"white, blue","eye_color":"red","birth_year":"33BBY","gender":"n/a","homeworld":"https://swapi.co/api/planets/8/","films":["https://swapi.co/api/films/2/","https://swapi.co/api/films/5/","https://swapi.co/api/films/4/","https://swapi.co/api/films/6/","https://swapi.co/api/films/3/","https://swapi.co/api/films/1/","https://swapi.co/api/films/7/"],"species":["https://swapi.co/api/species/2/"],"vehicles":[],"starships":[],"created":"2014-12-10T15:11:50.376000Z","edited":"2014-12-20T21:17:50.311000Z","url":"https://swapi.co/api/people/3/"}
{"name":"Darth Vader","height":"202","mass":"136","hair_color":"none","skin_color":"white","eye_color":"yellow","birth_year":"41.9BBY","gender":"male","homeworld":"https://swapi.co/api/planets/1/","films":["https://swapi.co/api/films/2/","https://swapi.co/api/films/6/","https://swapi.co/api/films/3/","https://swapi.co/api
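
The two cells above issue one request per URL in a loop, each with its own `catch`. For comparison, here is a hedged sketch of the `Promise.all` form the exercise asks for. It runs offline: `fakeGet`, `knownPeople`, and `fetchAll` are hypothetical names, and the stub only knows the three characters that appear in the recorded output.

```javascript
// Hypothetical offline stand-in for an HTTP client.
const knownPeople = {
  '/people/2': 'C-3PO',
  '/people/3': 'R2-D2',
  '/people/4': 'Darth Vader'
};

// Resolve with a small record for known paths, reject for anything else.
function fakeGet(path) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      if (knownPeople[path]) {
        resolve({ name: knownPeople[path] });
      } else {
        reject(new Error('not found: ' + path));
      }
    }, 10);
  });
}

// Promise.all waits for every request; a single rejection skips the
// .then handler and goes straight to .catch.
function fetchAll(paths) {
  return Promise.all(paths.map(path => fakeGet(path)))
    .then(results => results.map(r => r.name))
    .catch(err => 'Something went wrong (' + err.message + ')');
}

fetchAll(['/people/2', '/people/3', '/people/4']).then(console.log);
fetchAll(['/people/2', '/people/3/dne', '/people/4']).then(console.log);
```

Unlike the loop, where the valid URLs still print their results when one fails, `Promise.all` is all-or-nothing: one bad path means none of the results are delivered to `.then`.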

## Async/Await

### 1) Convert the promise chain below to async/await
```
fetch('https://swapi.co/api/starships/9/')
  .then(response => response.json())
  .then(console.log)
```



In [2]:
%%node 
const fetch = require('node-fetch');

async function fetchStarship(){
    var ret = await fetch('https://swapi.co/api/starships/9/');
    var data = await ret.json();
    
    return data;
}

(async () => {
    let data = await fetchStarship();
    console.log(data);
})();

{ name: 'Death Star',
model: 'DS-1 Orbital Battle Station',
manufacturer: 'Imperial Department of Military Research, Sienar Fleet Systems',
cost_in_credits: '1000000000000',
length: '120000',
max_atmosphering_speed: 'n/a',
crew: '342953',
passengers: '843342',
cargo_capacity: '1000000000000',
consumables: '3 years',
hyperdrive_rating: '4.0',
MGLT: '10',
starship_class: 'Deep Space Mobile Battlestation',
pilots: [],
films: [ 'https://swapi.co/api/films/1/' ],
created: '2014-12-10T16:36:50.509000Z',
edited: '2014-12-22T17:35:44.452589Z',
url: 'https://swapi.co/api/starships/9/' }


### 2) Update the function below to use async/await for this line as well: `fetch(url).then(resp => resp.json())`
There shouldn't be any `.then()` calls left!
```
const urls = [
  'https://jsonplaceholder.typicode.com/users',
  'https://jsonplaceholder.typicode.com/posts',
  'https://jsonplaceholder.typicode.com/albums'
]

const getData = async function() {
  const [ users, posts, albums ] = await Promise.all(urls.map(url =>
      fetch(url).then(resp => resp.json())
  ));
  console.log('users', users);
  console.log('posts', posts);
  console.log('albums', albums);
}
```



In [12]:
%%node
var Promise = require('promise');
const fetch = require('node-fetch');
const urls = [
  'https://jsonplaceholder.typicode.com/users',
  'https://jsonplaceholder.typicode.com/posts',
  'https://jsonplaceholder.typicode.com/albums'
]

async function getData() {
    let response = await Promise.all(urls.map(url => fetch(url)));
    let [ users, posts, albums ] = await Promise.all(response.map(resp =>resp.json()));

    console.log('users', users);
    console.log('posts', posts);
    console.log('albums', albums);
}

getData();



SyntaxError: Identifier 'fetch' has already been declared
SyntaxError: Identifier 'urls' has already been declared
users [ { id: 1,
name: 'Leanne Graham',
username: 'Bret',
email: 'Sincere@april.biz',
address:
{ street: 'Kulas Light',
suite: 'Apt. 556',
city: 'Gwenborough',
zipcode: '92998-3874',
geo: [Object] },
phone: '1-770-736-8031 x56442',
website: 'hildegard.org',
company:
{ name: 'Romaguera-Crona',
catchPhrase: 'Multi-layered client-server neural-net',
bs: 'harness real-time e-markets' } },
{ id: 2,
name: 'Ervin Howell',
username: 'Antonette',
email: 'Shanna@melissa.tv',
address:
{ street: 'Victor Plains',
suite: 'Suite 879',
city: 'Wisokyburgh',
zipcode: '90566-7771',
geo: [Object] },
phone: '010-692-6593 x09125',
website: 'anastasia.net',
company:
{ name: 'Deckow-Crist',
catchPhrase: 'Proactive didactic contingency',
bs: 'synergize scalable supply-chains' } },
{ id: 3,
name: 'Clementine Bauch',
username: 'Samantha',
email
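
The `SyntaxError` lines in the output above are consistent with `fetch` and `urls` having already been declared with `const` earlier in the same Node.js session. Assuming that `%%node` cells share one long-lived session (an assumption suggested by the errors, not confirmed here), the behaviour can be reproduced with Node's built-in `vm` module. `session` and `runCell` are hypothetical names for this sketch.

```javascript
const vm = require('vm');

// One context plays the role of a session shared by several cells.
const session = vm.createContext({});

// Run a snippet in the shared session.
// Returns null on success, or the name of the error that was thrown.
function runCell(code) {
  try {
    vm.runInContext(code, session);
    return null;
  } catch (err) {
    return err.name;
  }
}

// Declaring the same const twice in one session fails the second time.
console.log(runCell('const urls = ["a"];'));
console.log(runCell('const urls = ["b"];'));

// Redeclaring with var is allowed.
console.log(runCell('var count = 1;'));
console.log(runCell('var count = 2;'));
```

Under that assumption, declaring notebook-wide names with `var`, or requiring a module only once, avoids the error when a cell is run again.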

### 3) Add a try/catch block to the solution for 2) to catch any errors.
Then use the array below, which contains an invalid URL, so that your `catch` block logs 'oooooops'.
```
const urls = [
  'https://jsonplaceholder.typicode.com/users',
  'https://jsonplaceholdeTYPO.typicode.com/posts',
  'https://jsonplaceholder.typicode.com/albums'
]
```

In [None]:
# thanks to https://stackoverflow.com/questions/50006595/using-promise-all-to-fetch-a-list-of-urls-with-await-statements
# and https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Asynchronous/Async_await

In [5]:
%%node
var Promise = require('promise');
const fetch = require('node-fetch');
let urls = [
  'https://jsonplaceholder.typicode.com/users',
  'https://jsonplaceholdeTYPO.typicode.com/posts',
  'https://jsonplaceholder.typicode.com/albums'
]

async function getData() {
    try{
        let response = await Promise.all(urls.map(url => fetch(url)));
        let [ users, posts, albums ] = await Promise.all(response.map(resp =>resp.json()));

        console.log('users', users);
        console.log('posts', posts);
        console.log('albums', albums);
    }catch(e){
        console.log("ooooooops");
    }
}

getData();


SyntaxError: Identifier 'fetch' has already been declared
SyntaxError: Identifier 'urls' has already been declared
ooooooops
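
A `catch` block that prints only a fixed string hides the reason for the failure. The offline sketch below shows one way to log the error message as well. It is an illustration under stated assumptions: `fakeFetch` is a hypothetical stand-in for `node-fetch` that rejects with a simulated error whenever the URL contains `TYPO`, and `getDataSketch` mirrors the structure of `getData` above.

```javascript
// Hypothetical fetch stand-in: rejects for URLs containing "TYPO",
// otherwise resolves with an object exposing json(), like a response.
function fakeFetch(url) {
  if (url.includes('TYPO')) {
    return Promise.reject(new Error('host not found (simulated): ' + url));
  }
  return Promise.resolve({ json: () => Promise.resolve([url]) });
}

async function getDataSketch(urls) {
  try {
    const responses = await Promise.all(urls.map(url => fakeFetch(url)));
    const bodies = await Promise.all(responses.map(resp => resp.json()));
    return bodies.length;
  } catch (err) {
    // Log the reason as well as the marker string.
    console.log('oooooops', err.message);
    return null;
  }
}

const attempt = getDataSketch([
  'https://jsonplaceholder.typicode.com/users',
  'https://jsonplaceholdeTYPO.typicode.com/posts',
  'https://jsonplaceholder.typicode.com/albums'
]);
attempt.then(count => console.log('result:', count));
```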
