# TAP notebook template using Python

This is a template for creating [Jupyter notebooks](http://jupyter-notebook.readthedocs.io/en/stable/) that connect to the [Text Analytics Pipeline (TAP)](https://github.com/uts-cic/tap).

Use this template to create your own Jupyter notebooks that use TAP. If you create something that may be useful to others, please consider contributing it to the [notebooks](https://github.com/uts-cic/tap/tree/master/jupyter_notebooks) in the TAP repo.

### Import necessary libraries

We need to make sure that all necessary libraries are imported before they are used.

In [38]:
from IPython.display import display, HTML       # Allows us to create annotated text using HTML and CSS
import json                                     # We need to be able to work with JSON, which is returned by TAP
from urllib import request, response            # To create requests to TAP and handle responses from TAP
import string                                   # To help with visualising analytics
import ipywidgets as widgets                    # Provides an interactive Textarea widget

### Check the connection to TAP

We need to ensure that we can actually connect to TAP before asking it to analyse our text. TAP provides a health endpoint which we can use to check whether the server is up.

**[1]** Firstly, we need to set the ```URL``` for TAP, and the endpoint that we want to request.

In [39]:
tapUrl = "http://tap-test.utscic.edu.au/"   # TAP URL
health = "health"                           # Health endpoint

**[2]** Now we need to create a ```Request``` to send to the TAP server. 
> For now, this request is to the health endpoint, but later we will want to send a request to a different endpoint. That is why ```tapUrl``` is kept separate from the ```health``` endpoint.

In [40]:
tapRequest = request.Request(tapUrl+health)   # The full URL is the URL for TAP + the health endpoint

**[3]** We send the request to TAP and capture the response. 

The response from TAP is going to include a ```status```, which should be ```200```, signalling that everything was OK on the server, and a ```body```, which is the actual web data that we are interested in.

In [41]:
tapResponse = request.urlopen(tapRequest)           # Send the request to TAP and capture the response
print(tapResponse.status)                           # Print out the status code - it should be 200
body = tapResponse.read().decode('utf8')            # If the status is OK, then read the body
print(body)                                         # Print out the body. Should be: {"message":"Ok"}
                                                    # If the request is unsuccessful, an error will be thrown


200
{"message":"Ok"}
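
As noted above, an unsuccessful request raises an error rather than returning a status. A minimal sketch of one way to handle that is shown below. The function name ```check_health``` and its ```opener``` and ```timeout``` parameters are our own assumptions for illustration and are not part of TAP.

```python
from urllib import request, error

def check_health(base_url, opener=request.urlopen, timeout=10):
    """Return (ok, detail) for the health endpoint instead of raising."""
    try:
        # opener defaults to urllib's urlopen; it is a parameter only so that
        # the function can be tried out without a live server
        with opener(request.Request(base_url + "health"), timeout=timeout) as resp:
            return resp.status == 200, resp.read().decode("utf8")
    except error.HTTPError as e:    # The server answered, but with an error status
        return False, "HTTP error %d" % e.code
    except error.URLError as e:     # The server could not be reached at all
        return False, "Connection failed: %s" % e.reason
```

Calling ```check_health(tapUrl)``` would then give a pair that can be tested in code, rather than a traceback, when the server is down.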


### Using the JSON data

Most of the time, when we get JSON data back from TAP we will want to process it in some way. So let's get the message into a variable so that we can use it in other parts of our code.

In [42]:
jsonData = json.loads(body)              # Parse the raw text into a Python dictionary
message = jsonData.get("message")        # Get the actual message
if message == 'Ok':                      # Use it in code
  print("All is good")
else:
  print("No good")

All is good
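
The check above assumes that the body is always valid JSON. As a hedged sketch, the same test could be wrapped in a small helper that also copes with a body that is not JSON at all, such as an HTML error page. The helper name ```is_healthy``` is hypothetical.

```python
import json

def is_healthy(body):
    """Return True only if body is a JSON object whose message is 'Ok'."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:    # The body was not valid JSON
        return False
    # Guard against valid JSON that is not an object, e.g. a list
    return isinstance(data, dict) and data.get("message") == "Ok"

print(is_healthy('{"message":"Ok"}'))   # True
print(is_healthy('<html>oops</html>'))  # False
```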


### Querying the GraphQL endpoint on TAP

The above shows how to connect to TAP and get basic data back, but to use TAP's analytics capability we need to send requests to the GraphQL endpoint that include the text that we want analysed.

To do this, we need more than just the new endpoint in the request. We also need the query itself, and a ```request header``` to tell TAP about the data format.

**[1]** Because we will want to send queries with different text, let's create a query function first. This one will identify modal expressions in the text.

In [43]:
query = None              # A global variable to put the created query - Need a functional way of doing this!!
def modalQuery(text):     # A function to create a modal expression query for TAP with given text
    queryEntry = "query Modal($input: String!){ expressions(text: $input) { analytics { modal { text }}}}"
    variableEntry = {'input':text}
    global query
    query = json.dumps({'query':queryEntry, 'variables':variableEntry})
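
The comment above asks for a more functional way of doing this. One possible sketch is a function that returns the query instead of writing to a global. The names ```MODAL_QUERY``` and ```build_modal_query``` are our own and are only an illustration.

```python
import json

# The same GraphQL query string as used in modalQuery above
MODAL_QUERY = "query Modal($input: String!){ expressions(text: $input) { analytics { modal { text }}}}"

def build_modal_query(text):
    """Return the JSON payload for a modal expression query on the given text."""
    return json.dumps({"query": MODAL_QUERY, "variables": {"input": text}})

payload = build_modal_query("I will continue in Python.")
print(payload)
```

With this version, nothing is hidden in shared state. The caller decides when to build the payload, for example from the current value of a widget just before sending the request.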

**[2]** We can pass this function to a text area widget which would allow us to auto-update the ```query``` variable when text is typed or pasted into the ```Input``` box.

In [51]:
# Create the widget
textArea = widgets.Textarea(                            
              placeholder='Paste or type your text here',
              description='Input:',
              disabled=False )
              
# Add some demo text so that the user doesn't have to make something up              
textArea.value = 'I would have prefered to code in Scala, but I will continue in Python for the benefit of others.'

# Make the widget interactive. Assigning the result to suppress just stops it being shown below the widget
suppress = widgets.interact(modalQuery,text=textArea)

In [52]:
print(query)                # print the json query that will be sent to TAP

{"query": "query Modal($input: String!){ expressions(text: $input) { analytics { modal { text }}}}", "variables": {"input": "I would have prefered to code in Scala, but I will continue in Python for the benefit of others."}}


**[3]** Create the request ready for sending to TAP.

In [53]:
endpoint = "graphql"                              # The query endpoint on TAP
completeUrl = tapUrl + endpoint                   # The complete url that the request is posted to
jsonHeader = {'Content-Type':'application/json'}  # A header to tell the server that we are using JSON
queryRequest = request.Request(completeUrl, data = query.encode("utf8"), headers = jsonHeader)
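
To make this step reusable for other queries, the request construction could be gathered into one function. This is only a sketch: the name ```build_query_request``` is hypothetical, and the placeholder payload stands in for a real query. Note that ```urllib``` treats a request that carries ```data``` as a POST.

```python
from urllib import request

def build_query_request(base_url, query_json, endpoint="graphql"):
    """Build a JSON POST request for the given TAP endpoint."""
    headers = {"Content-Type": "application/json"}  # Tell the server we are sending JSON
    return request.Request(base_url + endpoint,
                           data=query_json.encode("utf8"),
                           headers=headers)

# Nothing is sent here; we only inspect the request that would be sent
req = build_query_request("http://tap-test.utscic.edu.au/", '{"query": "..."}')
print(req.get_method())   # POST
print(req.full_url)       # http://tap-test.utscic.edu.au/graphql
```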

**[4]** Send the request to the server and process the result into a JSON object that we can use.

In [54]:
tapResponse = request.urlopen(queryRequest)        # Send the request to the server and capture the response
body = tapResponse.read().decode('utf8')           # Extract the body from the response
jsonData = json.loads(body)                        # Parse it into a Python dictionary
print(jsonData)

# Get the relevant data from the parsed JSON
modalResults = jsonData.get('data').get('expressions').get('analytics')[0].get('modal')
print(modalResults)
for entry in modalResults:                         # Print out the actual analytics
  print(entry.get('text'))

{'data': {'expressions': {'analytics': [{'modal': [{'text': 'I would'}, {'text': 'I will'}]}]}}}
[{'text': 'I would'}, {'text': 'I will'}]
I would
I will
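
The chain of ```get``` calls above fails with an unhelpful error if any level of the response is missing. A more defensive sketch follows. The function name ```extract_modal_texts``` is our own, and the check for a top-level ```errors``` key assumes that TAP follows the usual GraphQL response format, which we have not verified here. Unlike the code above, it also collects results from every entry in ```analytics``` rather than only the first.

```python
def extract_modal_texts(json_data):
    """Return a list of modal expression strings from a parsed TAP response."""
    # Assumption: errors are reported under a top-level "errors" key
    if json_data.get("errors"):
        raise ValueError("TAP returned errors: %s" % json_data["errors"])
    data = json_data.get("data") or {}
    expressions = data.get("expressions") or {}
    texts = []
    for item in expressions.get("analytics") or []:   # Each analysed item
        for modal in item.get("modal") or []:         # Each modal expression found
            texts.append(modal.get("text"))
    return texts

sample = {'data': {'expressions': {'analytics': [{'modal': [{'text': 'I would'}, {'text': 'I will'}]}]}}}
print(extract_modal_texts(sample))   # ['I would', 'I will']
```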


### Visualising the analytics in the original text

The above code has demonstrated how to query TAP and retrieve analytics. Now we can visualise it in a way that is more meaningful for the average user.

**[1]** First we need to create the HTML to be displayed.

In [55]:
outputText = textArea.value
for mr in modalResults:
  fs = mr.get('text')
  rs = '<span class="modexp">'+fs+'</span>'
  outputText = outputText.replace(fs,rs)

paragraph = '<p>'+outputText+'</p>'
html = HTML(paragraph)
print(paragraph)

<p><span class="modexp">I would</span> have prefered to code in Scala, but <span class="modexp">I will</span> continue in Python for the benefit of others.</p>
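
Repeated ```replace``` calls work for this demo text, but they can misbehave if a later phrase happens to match the markup inserted by an earlier one, or if the text contains characters such as ```<```. The sketch below is one alternative that marks up all phrases in a single pass and escapes the text. The function name ```highlight``` and its parameters are assumptions for illustration only.

```python
import html
import re

def highlight(text, phrases, css_class="modexp"):
    """Wrap each occurrence of the given phrases in a span, escaping the rest."""
    # Longest phrases first, so a longer phrase wins over a shorter prefix of it
    ordered = sorted({p for p in phrases if p}, key=len, reverse=True)
    if not ordered:
        return "<p>" + html.escape(text) + "</p>"
    pattern = re.compile("|".join(re.escape(p) for p in ordered))
    parts = []
    last = 0
    for match in pattern.finditer(text):
        parts.append(html.escape(text[last:match.start()]))   # Text before the match
        parts.append('<span class="%s">%s</span>' % (css_class, html.escape(match.group(0))))
        last = match.end()
    parts.append(html.escape(text[last:]))                    # Text after the last match
    return "<p>" + "".join(parts) + "</p>"

demo = 'I would have prefered to code in Scala, but I will continue in Python for the benefit of others.'
print(highlight(demo, ['I would', 'I will']))
```

For the demo text this produces the same paragraph as the loop above.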


**[2]** We need some ```CSS``` to set the style for our marked up analytics.

In [56]:
css = HTML("""
<style>
.modexp {
    color: blue;
    border-bottom: 1px red dashed;
}
</style>

""")

**[3]** And finally we can display the results.

In [57]:
display(css,html)