# TAP notebook template using Scala

This is a template for creating [Jupyter notebooks](http://jupyter-notebook.readthedocs.io/en/stable/) that connect to the [Text Analytics Pipeline (TAP)](https://github.com/uts-cic/tap).

Use this template to create your own Jupyter notebooks that use TAP. If you create something that may be useful to others, please consider contributing it to the [notebooks](https://github.com/uts-cic/tap/tree/master/jupyter_notebooks) in the TAP repo.

### Import necessary libraries

We need to make sure that all necessary libraries are imported before they are used.

In [34]:
%AddDeps org.scalaj scalaj-http_2.11 2.3.0
%AddDeps org.sangria-graphql sangria_2.11 1.3.0
%AddDeps org.sangria-graphql sangria-marshalling-api_2.11 1.0.0
%AddDeps org.json4s json4s-jackson_2.11 3.5.3

Marking org.scalaj:scalaj-http_2.11:2.3.0 for download
Preparing to fetch from:
-> file:/tmp/toree_add_deps6924190800534886023/
-> https://repo1.maven.org/maven2
-> New file at /tmp/toree_add_deps6924190800534886023/https/repo1.maven.org/maven2/org/scalaj/scalaj-http_2.11/2.3.0/scalaj-http_2.11-2.3.0.jar
Marking org.sangria-graphql:sangria_2.11:1.3.0 for download
Preparing to fetch from:
-> file:/tmp/toree_add_deps6924190800534886023/
-> https://repo1.maven.org/maven2
-> New file at /tmp/toree_add_deps6924190800534886023/https/repo1.maven.org/maven2/org/sangria-graphql/sangria_2.11/1.3.0/sangria_2.11-1.3.0.jar
Marking org.sangria-graphql:sangria-marshalling-api_2.11:1.0.0 for download
Preparing to fetch from:
-> file:/tmp/toree_add_deps6924190800534886023/
-> https://repo1.maven.org/maven2
-> New file at /tmp/toree_add_deps6924190800534886023/https/repo1.maven.org/maven2/org/sangria-graphql/sangria-marshalling-api_2.11/1.0.0/sangria-marshalling-api_2.11-1.0.0.jar
Marking org.json4s:jso

In [36]:
import scalaj.http._                           // Handle connecting to TAP
import org.json4s._                            // JSON types such as JValue
import org.json4s.jackson.JsonMethods._        // parse, render and compact

### Check the connection to TAP

We need to ensure that we can actually connect to TAP before asking it to analyse our text. TAP provides a health endpoint which we can use to check whether the server is up.

**[1]** Firstly, we need to set the ```URL``` for TAP, and the endpoint that we want to request.

In [37]:
val tapUrl = "http://tap-test.utscic.edu.au/"   // TAP URL
val health = "health"                           // Health endpoint

**[2]** Now we need to create a ```Request``` to send to the TAP server.
> For now, this request is to the health endpoint, but later we will want to send a request to a different endpoint. That is why the ```tapUrl``` is kept separate from the ```health``` endpoint.

In [38]:
val tapRequest = Http(tapUrl+health)    // The full URL is the URL for TAP + the health endpoint

**[3]** We send the request to TAP and capture the response. 

The response from TAP is going to include a ```status```, which should be ```200```, signalling that everything was OK on the server, and a ```body```, which is the web data that we are interested in. If the request is unsuccessful, an error page will be returned. If the URL does not exist, the status code will be ```404``` (Not found).

In [39]:
val tapResponse = tapRequest.asString   // Initiate the request and capture the response
println(tapResponse.code)               // Print out the status code - it should be 200
println(tapResponse.body)               // Print out the body. Should be: {"message":"Ok"}

200
{"message":"Ok"}
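The status handling described above can be sketched as a small helper. This is only an illustration: ```describeStatus``` is a hypothetical name, not part of TAP or scalaj-http, and it only inspects the integer status code (the value the notebook reads from ```tapResponse.code```).

```scala
// Hypothetical helper: turn an HTTP status code into a short description.
// It works on a plain Int, so it needs no HTTP library.
def describeStatus(code: Int): String = code match {
  case 200                      => "OK: TAP is up"
  case 404                      => "Not found: check the URL and endpoint"
  case c if c >= 500 && c < 600 => s"Server error ($c)"
  case c                        => s"Unexpected status ($c)"
}

println(describeStatus(200))
println(describeStatus(404))
```

In the notebook this could be called as ```describeStatus(tapResponse.code)``` before looking at the body.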


### Using the JSON data

Most of the time, when we get JSON data back from TAP we will want to process it in some way. So let's get the message into a variable so that we can use it in other parts of our code.

In [49]:
val jsonData = parse(tapResponse.body)                // Change the raw text into a JSON object
implicit val formats = DefaultFormats                 // An implicit allows extraction to a Scala object from a JValue
val message = (jsonData \ "message").extract[String]  // Get the actual message
message match {                                       // Use it in code
    case "Ok" => print("All good")
    case _ => print("No good")
}

All good
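If the body is not the expected JSON (for example, an HTML error page), ```parse``` and ```extract``` will throw. A more defensive variant, sketched here under the assumption that the json4s-jackson dependency loaded earlier is available, returns an ```Option``` instead. The name ```messageOf``` is hypothetical.

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods._

implicit val formats: Formats = DefaultFormats

// Hypothetical helper: return the "message" field if the body is valid JSON
// and the field is present as a string; otherwise return None.
def messageOf(body: String): Option[String] =
  parseOpt(body).flatMap(json => (json \ "message").extractOpt[String])

messageOf("""{"message":"Ok"}""") match {
  case Some("Ok") => println("All good")
  case _          => println("No good")
}
```

The pattern match at the end mirrors the ```message match``` above, but a missing field or an unparseable body falls through to the default case rather than raising an exception.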

### Querying the GraphQL endpoint on TAP

The above shows how to connect to TAP and get basic data back, but to use TAP's analytics capability we need to send requests to the GraphQL endpoint that include the text that we want analysed.

To do this, we need more than just the new endpoint in the request. We also need the query itself, and a ```request header``` to tell TAP about the data format.

**[1]** Because we will want to send queries with different text, we keep the text out of the query string and pass it in a separate ```variables``` object. This query will identify modal expressions in the text.

In [59]:
//query = None              # A global variable to put the created query - Need a functional way of doing this!!
//def modalQuery(text):     # A function to create a modal expression query for TAP with given text
//    queryEntry = "query Modal($input: String!){ expressions(text: $input) { analytics { modal { text }}}}"
//    variableEntry = {'input':text}
//    global query
//    query = json.dumps({'query':queryEntry, 'variables':variableEntry})

val myText = "I would have prefered to code in Scala, but I will continue in Python for the benefit of others."
val query = """
    query Modal($input: String!) {
      expressions(text: $input) {
        analytics {
          modal {
            text
          }
        }
      }
    }"""

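import org.json4s.JsonDSL._                                       // DSL for building JSON from tuples
val variables = ("input" -> myText)                               // The GraphQL variables: $input is our text
val fullQuery = ("query" -> query) ~ ("variables" -> variables)   // Combine the query and its variables
val jsonQuery = compact(render(fullQuery))                        // Serialise to a compact JSON string
println(jsonQuery)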

{"query":"\n    query Modal($input: String!) {\n      expressions(text: $input) {\n        analytics {\n          modal {\n            text\n          }\n        }\n      }\n    }","variables":{"input":"I would have prefered to code in Scala, but I will continue in Python for the benefit of others."}}
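To reuse the query with different text, the steps above can be wrapped in a function. This is a minimal sketch that assumes the json4s dependencies loaded earlier; ```modalQuery``` and ```modalQueryText``` are hypothetical names introduced for illustration.

```scala
import org.json4s._
import org.json4s.JsonDSL._
import org.json4s.jackson.JsonMethods._

// The GraphQL query text. There is no `s` prefix on the string,
// so $input is kept literally and not interpolated by Scala.
val modalQueryText =
  """query Modal($input: String!) {
    |  expressions(text: $input) { analytics { modal { text } } }
    |}""".stripMargin

// Hypothetical helper: build the JSON request body for any input text.
def modalQuery(text: String): String = {
  val variables = ("input" -> text)
  val fullQuery = ("query" -> modalQueryText) ~ ("variables" -> variables)
  compact(render(fullQuery))
}

println(modalQuery("I might go."))
```

Because the text travels in ```variables``` and is serialised by json4s, quotes and other special characters in the input are escaped without any manual string handling.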


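**[2]** A query function like this could be passed to a text area widget, which would allow us to auto-update the ```query``` variable when text is typed or pasted into the ```Input``` box. The two cells below are commented-out Python code for that step and do not run as Scala.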

In [None]:
//# Create the widget
//textArea = widgets.Textarea(                            
//              placeholder='Paste or type your text here',
//              description='Input:',
//              disabled=False )
//              
//# Add some demo text so that the user doesn't have to make something up              
//textArea.value = 'I would have prefered to code in Scala, but I will continue in Python for the benefit of others.'

//# Make the widget interactive. Note that assigning it to suppress, just stops the result being shown below the widget
//suppress = widgets.interact(modalQuery,text=textArea)

In [None]:
//print(query)                # print the json query that will be sent to TAP

**[3]** Create the request ready for sending to TAP.

In [60]:
val endpoint = "graphql"                              // The query endpoint on TAP
val completeUrl = tapUrl + endpoint                   // The complete url that the request is posted to
val queryRequest = Http(completeUrl).postData(jsonQuery).header("content-type", "application/json") // POST the JSON query and declare the body as JSON

**[4]** Send the request to the server and process the result into a JSON object that we can use.

In [70]:
val tapResponse = queryRequest.asString     // Send the request to the server and capture the response
val queryData = parse(tapResponse.body)                // Change the raw text into a JSON object
//println(queryData)

implicit val formats = DefaultFormats                 // An implicit allows extraction to a Scala object from a JValue
val modalResults = ((queryData \ "data" \ "expressions" \ "analytics")(0) \ "modal" \ "text").extract[List[String]]
println(modalResults)
modalResults.foreach(println(_))

List(I would, I will)
I would
I will


### Visualising the analytics in the original text

The above code has demonstrated how to query TAP and retrieve analytics. Now we can visualise it in a way that is more meaningful for the average user.

**[1]** First we need to create the HTML to be displayed.

In [73]:
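val outputText = modalResults.map(
    str => (str, s"""<span class="modexp">$str</span>""")   // Pair each expression with its marked-up form
).foldLeft(myText)((s, r) => s.replace(r._1, r._2))         // replace matches literally; replaceAll would treat r._1 as a regex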
//for mr in modalResults:
//  fs = mr.get('text')
//  rs = '<span class="modexp">'+fs+'</span>'
//  outputText = outputText.replace(fs,rs)

val paragraph = s"<p>$outputText</p>"
//html = HTML(paragraph)
println(paragraph)

<p><span class="modexp">I would</span> have prefered to code in Scala, but <span class="modexp">I will</span> continue in Python for the benefit of others.</p>
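The markup step can also be written as a reusable function. The sketch below uses only the Scala standard library; ```highlight``` and its ```cssClass``` parameter are hypothetical names, not part of TAP.

```scala
// Hypothetical helper: wrap each found expression in a <span> with the given CSS class.
// String.replace matches literally, so characters such as "." or "(" in an
// expression are not interpreted as regular-expression syntax.
def highlight(text: String, expressions: List[String], cssClass: String = "modexp"): String =
  expressions.foldLeft(text) { (acc, expr) =>
    acc.replace(expr, s"""<span class="$cssClass">$expr</span>""")
  }

val demo = highlight("I would stay, but I will go.", List("I would", "I will"))
println(demo)
// prints: <span class="modexp">I would</span> stay, but <span class="modexp">I will</span> go.
```

This sketch does not escape HTML in the input and does not guard against one expression being contained in another, which could lead to nested spans.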


**[2]** We need some ```CSS``` to set the style for our marked up analytics.

In [75]:
val css = """
<style>
.modexp {
    color: blue;
    border-bottom: 1px red dashed;
}
</style>
"""

**[3]** And finally we can display the results.

In [76]:
import org.apache.toree.magic.{CellMagicOutput, CellMagic}
import org.apache.toree.kernel.protocol.v5.{Data, MIMEType}

def display_html(html: String) = Left(CellMagicOutput(MIMEType.TextHtml -> html))
display_html(css+paragraph)

In [77]:
%%HTML
<urth-help/>