# Text Summarization Service

This notebook demonstrates how to turn an annotated notebook into an HTTP API using the Jupyter Kernel Gateway. We will expose a simple text summarization endpoint to the Resource Watch API Control Tower.

We need the following modules to process requests and handle JSON data:

In [None]:
import requests
import json

We will also use gensim, a high-level NLP library:

In [None]:
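# Note: the gensim.summarization module was removed in gensim 4.0,
# so these imports require an older release (gensim < 4.0).
from gensim.summarization import summarize
from gensim.summarization import keywords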

Our goals are modest for this example, so we will implement just two endpoints: `/summarize` and `/keywords`. Both expect a `text` field in the request body containing the text to be processed.
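From the client side, a call to either endpoint is a POST carrying a `text` form field. The following is a minimal sketch using only the standard library; the base URL is an assumption taken from the `Host` header of the sample request in this notebook, and `build_request` is a hypothetical helper name. The request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL (from the Host header of the sample request); adjust to your deployment.
BASE_URL = "http://192.168.1.124:8889"


def build_request(endpoint, text):
    """Build (but do not send) a form-encoded POST carrying a `text` field."""
    data = urlencode({"text": text}).encode("utf-8")
    return Request(
        f"{BASE_URL}/{endpoint}",
        data=data,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


req = build_request("summarize", "Some long text to condense.")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it once the gateway is running.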

## API

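First we declare a sample `REQUEST` so that we can develop and test the cells interactively. When the notebook runs under the kernel gateway, `REQUEST` is supplied as a JSON string for each incoming request: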

In [None]:
REQUEST = json.dumps(
    {
        'path': {},
        'headers': {
            'Cache-Control': 'no-cache',
            'Content-Length': '1834',
            'Postman-Token': '012295ac-0273-4994-a78e-112742f0468f',
            'Content-Type': 'multipart/form-data;boundary=--------------------------492240627376624783967489',
            'Accept': '*/*', 
            'User-Agent': 'PostmanRuntime/3.0.9',
            'Accept-Encoding': 'gzip, deflate',
            'Connection': 'keep-alive',
            'Host': '192.168.1.124:8889'
        },
        'body': {
            'text': ['President Trump signed an executive order on Friday that purports to bar for at least 90 days almost all permanent immigration from seven majority-Muslim countries, including Syria and Iraq, and asserts the power to extend the ban indefinitely. But the order is illegal. More than 50 years ago, Congress outlawed such discrimination against immigrants based on national origin. That decision came after a long and shameful history in this country of barring immigrants based on where they came from. Starting in the late 19th century, laws excluded all Chinese, almost all Japanese, then all Asians in the so-called Asiatic Barred Zone. Finally, in 1924, Congress created a comprehensive “national-origins system,” skewing immigration quotas to benefit Western Europeans and to exclude most Eastern Europeans, almost all Asians, and Africans. Mr. Trump appears to want to reinstate a new type of Asiatic Barred Zone by executive order, but there is just one problem: The Immigration and Nationality Act of 1965 banned all discrimination against immigrants on the basis of national origin, replacing the old prejudicial system and giving each country an equal shot at the quotas. In signing the new law, President Lyndon B. Johnson said that “the harsh injustice” of the national-origins quota system had been “abolished.” Protesters near the White House on Wednesday. Credit Al Drago/The New York Times Nonetheless, Mr. Trump asserts that he still has the power to discriminate, pointing to a 1952 law that allows the president the ability to “suspend the entry” of “any class of aliens” that he finds are detrimental to the interest of the United States.']
        },
        'args': {}
    })
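In the sample above, `REQUEST` is a JSON string and the `text` field arrives as a list of values. A small helper can make that unpacking explicit and tolerate a missing field. This is only a sketch: `extract_text` is a hypothetical name, not part of the kernel gateway or gensim.

```python
import json


def extract_text(raw_request):
    """Return the first `text` value from a REQUEST-style JSON string.

    Hypothetical helper. Returns None when the field is missing or empty.
    """
    request = json.loads(raw_request)
    body = request.get("body")
    if not isinstance(body, dict):
        return None
    value = body.get("text")
    # In the sample request the field is a list of values; also accept a plain string.
    if isinstance(value, list):
        value = value[0] if value else None
    if isinstance(value, str) and value.strip():
        return value
    return None


sample = json.dumps({"body": {"text": ["First sentence. Second sentence."]}, "args": {}})
print(extract_text(sample))
```

An endpoint cell could then call `extract_text(REQUEST)` and decide how to respond when it returns `None`.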

Let's move on to declaring endpoints. The Jupyter Kernel Gateway declares endpoints with a simple DSL: the first line of a cell is a comment naming the HTTP verb and the path, as in the following example. Whatever the cell prints with `print()` becomes the response body. Adhere to the JSON:API standard!

In [None]:
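# POST /summarize
request = json.loads(REQUEST)
# The `text` field arrives as a list, so take its first value.
response = summarize(request['body']['text'][0])
# summarize() returns the selected sentences separated by newlines.
print(json.dumps({
    "data": [{
        "summary": response.split("\n")
    }]
}))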
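The cell above covers the success case only. For failures, JSON:API uses a top-level `errors` array instead of `data`. A minimal sketch of such a payload follows; `error_document` is a hypothetical helper, and the wording of the message is illustrative.

```python
import json


def error_document(status, title, detail):
    """Build a JSON:API error document (hypothetical helper).

    JSON:API reports failures in a top-level `errors` array, and `status`
    is the HTTP status code expressed as a string.
    """
    return json.dumps({
        "errors": [{
            "status": str(status),
            "title": title,
            "detail": detail,
        }]
    })


print(error_document(400, "Missing field", "The request body has no `text` field."))
```

Because the HTTP status is set by the companion ResponseInfo cell, that cell would also need to report the matching status code.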

Headers and the status code go in a separate companion cell:

In [None]:
# ResponseInfo POST /summarize
print(
    json.dumps({
        "headers" : {
            "Content-Type" : "application/json"
        },
        "status" : 201
    })
)
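Every ResponseInfo cell in this notebook prints the same structure with a different status, so the payload could be produced by one small function defined in an ordinary (non-endpoint) cell. This is a sketch under that assumption; `response_info` is a hypothetical name.

```python
import json


def response_info(status=200, content_type="application/json"):
    """Return the JSON string a ResponseInfo cell prints.

    Hypothetical helper; the field names mirror the companion cells in this notebook.
    """
    return json.dumps({
        "headers": {"Content-Type": content_type},
        "status": status,
    })


print(response_info(201))
```

A companion cell would then consist of its `# ResponseInfo POST /summarize` comment followed by `print(response_info(201))`.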

The keywords endpoint is just as easy to implement with gensim:

In [None]:
# POST /keywords
request = json.loads(REQUEST)
# keywords() returns one keyword per line, so split on newlines.
response = keywords(request['body']['text'][0])
print(json.dumps({
    "data": [{
        "keywords": response.split("\n")
    }]
}))

In [None]:
# ResponseInfo POST /keywords
print(
    json.dumps({
        "headers" : {
            "Content-Type" : "application/json"
        },
        "status" : 201
    })
)

Next, some helper endpoints. This one mirrors the request back to the caller, which is useful for debugging:

In [None]:
# POST /mirror
request = json.loads(REQUEST)
print(json.dumps(request))

In [None]:
# ResponseInfo POST /mirror
print(
    json.dumps({
        "headers" : {
            "Content-Type" : "application/json"
        },
        "status" : 201
    })
)

We will also need to register the microservice with Control Tower, so we add a simple ping endpoint:

In [None]:
# GET /ping
pong = {"ping": "pong"}
print(json.dumps(pong))

In [None]:
# ResponseInfo GET /ping
print(
    json.dumps({
        "headers" : {
            "Content-Type" : "application/json"
        },
        "status" : 200
    })
)

Finally, an `/info` endpoint:

In [None]:
# GET /info
