# Notebook as a Service

This notebook demonstrates how to make a REST API service out of a notebook.

The example implements a geospatial data service for looking up airports with an IATA (International Air Transport Association) code. The service can provide the full list of airports, or queries can be restricted to a geographic region defined by a bounding box.

To run the notebook as a service, use the ``jupyter kernelgateway`` command. The ``.jupyter/jupyter_kernel_gateway_config.py`` configuration file needs to define:

```
c.KernelGatewayApp.api = 'kernel_gateway.notebook_http'
c.KernelGatewayApp.seed_uri = 'Airports.ipynb'
```

The ``KernelGatewayApp.api`` configuration is set to ``kernel_gateway.notebook_http`` to expose the notebook as a REST API service over HTTP. The ``KernelGatewayApp.seed_uri`` specifies the location of the notebook file.

## Required module imports

In [None]:
import os
import json
import pandas as pd
import numpy as np
import folium
import folium.plugins

## Flagging test functions

Some code, such as test functions, should not be executed when the notebook is loaded to be run as a service. To mark code that should only run when interacting with the notebook, define a decorator and apply it to a function. The decorated function is called automatically when its cell is run, so you don't need to add a separate call. Marked functions are not run when the ``KERNEL_GATEWAY`` environment variable is set. This environment variable is set automatically when the notebook is run as a service.

In [None]:
def execute_if_development(wrapped):
    if 'KERNEL_GATEWAY' not in os.environ:
        wrapped()
    return wrapped
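
The effect of the decorator can be checked in isolation. The following is a minimal sketch that repeats the decorator from the cell above and toggles ``KERNEL_GATEWAY`` by hand. The ``calls`` list and the two decorated functions are hypothetical names used only for illustration.

```python
import os

def execute_if_development(wrapped):
    # Same decorator as above: call the function immediately unless
    # the KERNEL_GATEWAY environment variable is present.
    if 'KERNEL_GATEWAY' not in os.environ:
        wrapped()
    return wrapped

calls = []  # hypothetical log of which functions actually ran

# Remember the real value so it can be restored afterwards.
saved = os.environ.pop('KERNEL_GATEWAY', None)

# Interactive use: the variable is absent, so the function runs at once.
@execute_if_development
def runs_in_development():
    calls.append('development')

# Service mode: the variable is present, so the function is skipped.
os.environ['KERNEL_GATEWAY'] = '1'

@execute_if_development
def skipped_as_service():
    calls.append('service')

# Restore the original environment.
if saved is None:
    os.environ.pop('KERNEL_GATEWAY', None)
else:
    os.environ['KERNEL_GATEWAY'] = saved

print(calls)
```

Because the decorator returns the function unchanged, a skipped function is still defined and can be called by hand if needed.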

## Preparing the airport data

The airport data is downloaded when the notebook is run.

In [None]:
def load_airport_data():
    return pd.read_csv('http://ourairports.com/data/airports.csv')

In [None]:
raw_airport_data = load_airport_data()

For the service, only the name and location of airports with an IATA code are required. Entries with no IATA code are dropped, as are data fields the service doesn't require. The latitude and longitude columns are also renamed.

In [None]:
def transform_airport_data(data=raw_airport_data):
    data = data[['name', 'latitude_deg', 'longitude_deg', 'iata_code']]
    data = data.rename(columns={'latitude_deg': 'latitude', 'longitude_deg': 'longitude'})
    data = data.dropna(axis=0, how='all', subset=['iata_code'])
    return data
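
To see the effect of the transformation without downloading the full file, the same steps can be applied to a small hand-made frame. This is only a sketch: the three rows below are made up for illustration, the ``type`` column stands in for the extra fields that get dropped, and the frame is passed in explicitly instead of relying on ``raw_airport_data``.

```python
import pandas as pd

def transform_airport_data(data):
    # Same steps as the cell above, with the frame passed in explicitly.
    data = data[['name', 'latitude_deg', 'longitude_deg', 'iata_code']]
    data = data.rename(columns={'latitude_deg': 'latitude', 'longitude_deg': 'longitude'})
    data = data.dropna(axis=0, how='all', subset=['iata_code'])
    return data

# Made-up sample rows; the middle entry has no IATA code.
sample = pd.DataFrame({
    'name': ['Example International', 'Example Airstrip', 'Example Regional'],
    'type': ['large_airport', 'small_airport', 'medium_airport'],
    'latitude_deg': [-33.9, -34.5, -32.8],
    'longitude_deg': [151.2, 150.1, 151.8],
    'iata_code': ['XXA', None, 'XXB'],
})

transformed = transform_airport_data(sample)
print(transformed)
```

The row without an IATA code is removed, the ``type`` column is gone, and the coordinate columns carry their shorter names.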

In [None]:
@execute_if_development
def dump_raw_airport_data():
    display(raw_airport_data.head())
    print('total rows =', len(raw_airport_data))

The list of airports is stored in ``airport_data`` and can be accessed directly if details on all airports are required.

In [None]:
airport_data = transform_airport_data()

In [None]:
@execute_if_development
def dump_airport_data():
    display(airport_data.head())
    print('total rows =', len(airport_data))

## Querying by bounding box

When visualizing a large data set, it is not always practical to load all of the data. Where each entry is associated with a location defined by latitude and longitude, we can perform a geospatial query that requests only the data whose location falls within a specific bounding box. The bounding box is defined by the latitude and longitude of its lower left and upper right corners.

The function implementing the bounding box query is ``airport_data_within_bbox()``.

In [None]:
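def airport_data_within_bbox(ll, ur, data=airport_data):
    # ll and ur are [latitude, longitude] pairs for the lower left and
    # upper right corners; points on the boundary are included.
    pts = data[['latitude','longitude']]
    inbox = np.all(np.logical_and(np.array(ll) <= pts, pts <= np.array(ur)), axis=1)
    return data[inbox]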
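
The vectorized comparison above is compact, so it can help to see it next to a more explicit form. The sketch below repeats the query function (with the frame passed in explicitly) and compares it against a hypothetical ``within_bbox_explicit()`` helper that tests latitude and longitude separately. The four sample points are made up for illustration.

```python
import numpy as np
import pandas as pd

def airport_data_within_bbox(ll, ur, data):
    # Same logic as the cell above: both coordinates must lie between
    # the lower left and upper right corners (inclusive).
    pts = data[['latitude', 'longitude']]
    inbox = np.all(np.logical_and(np.array(ll) <= pts, pts <= np.array(ur)), axis=1)
    return data[inbox]

def within_bbox_explicit(ll, ur, data):
    # Hypothetical equivalent that spells out the two range checks.
    lat_ok = data['latitude'].between(ll[0], ur[0])
    lon_ok = data['longitude'].between(ll[1], ur[1])
    return data[lat_ok & lon_ok]

# Made-up points: A is inside, B sits exactly on the upper right corner,
# C is too far south, D is too far east.
sample = pd.DataFrame({
    'name': ['A', 'B', 'C', 'D'],
    'latitude': [-34.0, -33.0, -36.0, -34.0],
    'longitude': [151.0, 152.0, 151.0, 153.0],
})

inside = airport_data_within_bbox([-35, 150], [-33, 152], sample)
explicit = within_bbox_explicit([-35, 150], [-33, 152], sample)
print(inside)
```

Note that neither form handles a bounding box that crosses the 180° meridian, where the lower left longitude is greater than the upper right longitude.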

In [None]:
@execute_if_development
def dump_airport_data_within_bbox(ll=[-35, 150], ur=[-33, 152]):
    data = airport_data_within_bbox(ll, ur)
    display(data.head())
    print('total rows =', len(data))

## Visualizing the airport data

Using the ``folium`` package for Python, we can visualize the data on an actual map, and then zoom and pan within the data set returned by the bounding box query. This capability is only available when interacting with the notebook. To expose the ability to query the data from other applications, we need to turn the notebook into a REST API service.

In [None]:
@execute_if_development
def map_airport_data_within_bbox(ll=[-35, 150], ur=[-33, 152]):
    data = airport_data_within_bbox(ll, ur)

    center = [data['latitude'].mean(), data['longitude'].mean()]
    # Named airport_map to avoid shadowing the built-in map().
    airport_map = folium.Map(location=center, zoom_start=10)

    locations = data[['latitude', 'longitude']].values.tolist()

    folium.plugins.FastMarkerCluster(locations).add_to(airport_map)

    display(airport_map)

## Defining the REST API

To expose the notebook as a REST API service, we need to define the URL endpoints the service should handle and the code that implements them. This includes declaring the type of response that is returned. For this service, all handlers return JSON. A special comment at the start of a code cell specifies which URL handler that cell implements.

To facilitate testing of the code for each handler, we first need to declare dummy request data. This is overridden by the actual request data when the notebook is run as a service.

In [None]:
REQUEST = json.dumps({
    'path': {},
    'args': {
        'lat1': -35,
        'lat2': -33,
        'lon1': 150,
        'lon2': 152
    }
})

The first URL handler is for ``/ws/info/``. It returns some information about the data returned by the service. The handler exists because the service must be compatible with an existing frontend application, which requires the information in this format.

In [None]:
service_info = {
    "id": "iataairports",
    "displayName": "IATA Airports",
    "type": "cluster",
    "center": {
        "latitude": "-33.946",
        "longitude": "151.17"
    },
    "zoom": 8
}

In [None]:
# GET /ws/info/
print(json.dumps(service_info))

In [None]:
# ResponseInfo GET /ws/info/
print(json.dumps({"headers":{"Content-Type":"application/json"}}))

The next URL handler is for ``/ws/data/all``. It returns data for all airports.

In [None]:
# GET /ws/data/all
result = airport_data[['name','latitude','longitude']]
print(result.to_json(orient='records'))
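
The shape of the response body follows from ``to_json(orient='records')``, which emits a JSON array with one object per row. The sketch below shows this on a single made-up row that uses the same three columns the handler selects.

```python
import json
import pandas as pd

# Made-up single row, using the same three columns the handler selects.
sample = pd.DataFrame({
    'name': ['Example International'],
    'latitude': [-33.9],
    'longitude': [151.2],
})

body = sample.to_json(orient='records')
print(body)

# The body parses back into a list with one dict per airport.
records = json.loads(body)
```

A client can therefore iterate over the parsed list and read ``name``, ``latitude`` and ``longitude`` from each entry.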

In [None]:
# ResponseInfo GET /ws/data/all
print(json.dumps({"headers":{"Content-Type":"application/json"}}))

The final URL handler is ``/ws/data/within``. It returns data on airports within a specific bounding box. Request arguments are supplied by query string parameters. The ``lat1`` and ``lon1`` values specify the lower left corner of the bounding box. The ``lat2`` and ``lon2`` values specify the upper right corner.

In [None]:
# GET /ws/data/within
request = json.loads(REQUEST)
ll = np.array([request['args']['lat1'], request['args']['lon1']])
ur = np.array([request['args']['lat2'], request['args']['lon2']])
result = airport_data_within_bbox(ll, ur)[['name','latitude','longitude']]
print(result.to_json(orient='records'))
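
The cell above indexes ``request['args']`` directly, which matches the dummy ``REQUEST`` defined earlier. When the notebook runs under the kernel gateway, query string values may instead arrive as lists of strings, since a parameter can be repeated in a URL, so the corner values may need converting before the comparison. The following is a hedged sketch of one way to accept both shapes. ``first_float()`` and ``bbox_from_request()`` are hypothetical helpers, not part of the notebook or of the kernel gateway API.

```python
import json

def first_float(value):
    # Accept either a scalar (as in the dummy REQUEST) or a sequence of
    # strings (one entry per occurrence of the parameter in the URL).
    if isinstance(value, (list, tuple)):
        value = value[0]
    return float(value)

def bbox_from_request(request_json):
    # Hypothetical helper: return (ll, ur) as [lat, lon] pairs of floats.
    args = json.loads(request_json)['args']
    ll = [first_float(args['lat1']), first_float(args['lon1'])]
    ur = [first_float(args['lat2']), first_float(args['lon2'])]
    return ll, ur

# The dummy request from this notebook, with plain numbers.
dummy = json.dumps({
    'path': {},
    'args': {'lat1': -35, 'lat2': -33, 'lon1': 150, 'lon2': 152},
})

# Assumed shape of a live request, with lists of strings.
as_lists = json.dumps({
    'path': {},
    'args': {'lat1': ['-35'], 'lat2': ['-33'], 'lon1': ['150'], 'lon2': ['152']},
})

print(bbox_from_request(dummy))
print(bbox_from_request(as_lists))
```

The returned pairs can be passed straight to ``airport_data_within_bbox()`` in place of the ``ll`` and ``ur`` arrays built in the handler.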

In [None]:
# ResponseInfo GET /ws/data/within
print(json.dumps({"headers":{"Content-Type":"application/json"}}))
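
Once the service is running, other applications can call the endpoints over HTTP. The following is a minimal client sketch using only the standard library. The base URL ``http://localhost:8888`` is an assumption about where the kernel gateway is listening, and ``within_url()`` and ``fetch_json()`` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = 'http://localhost:8888'  # assumption: adjust to your deployment

def within_url(lat1, lon1, lat2, lon2, base_url=BASE_URL):
    # Build the query string for the bounding box endpoint.
    query = urlencode({'lat1': lat1, 'lon1': lon1, 'lat2': lat2, 'lon2': lon2})
    return f'{base_url}/ws/data/within?{query}'

def fetch_json(url):
    # Perform the GET request and decode the JSON body.
    with urlopen(url) as response:
        return json.loads(response.read().decode('utf-8'))

url = within_url(-35, 150, -33, 152)
print(url)

# Uncomment once the service is running:
# airports = fetch_json(url)
```

The same ``fetch_json()`` helper can be pointed at ``/ws/info/`` or ``/ws/data/all``, since all three handlers return JSON.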