## Getting Started

### GitHub and Jupyter

Everything you need for this workshop is available on GitHub and released under an Apache 2.0 license. We encourage you to make improvements (or add resources) and contribute them back to the repos so that the workshop can be improved for future participants.

If you are unfamiliar with working with GitHub, then see this website for helpful tips.

You can run these notebooks on your own machine by following the installation steps in the [repo's README](https://github.com/uts-cic/tap-notebooks).

### Notebook basics

We are going to get straight into using this notebook. If you are not familiar with Jupyter notebooks, you can find some help under the `Help` menu, and more detailed tutorials at these websites:

   - [Jupyter documentation](http://jupyter-notebook.readthedocs.io/en/latest/examples/Notebook/examples_index.html)
   - [jupyter-scala website](https://github.com/alexarchambault/jupyter-scala)
    
To get started, let's try a basic hello world!

In [None]:
//Let's assign a value 
// Assign "LASI'17" to myWorld by replacing the ???

val myWorld = ???

Pressing **shift-return** will run the currently selected cell and move to the next cell. Do this in the cell above. The result should be:

```
Out[1]: myWorld: String = "LASI'17" 

```
Once you have this, run the cell below to show a sentence with the value included.

In [None]:
show(s"Hello $myWorld!")

The notebook takes code that has been run in previous cells and allows you to use it in subsequent cells. We will use this feature to step through various processes in Writing Analytics.

`show()` is a function that writes to the output of the cell. However, with Scala, the cell can show the output of any expression. The following takes a variable (`myWorld`) and embeds it in a string (`s"Hello $myWorld!"`).

In [None]:
s"Hello $myWorld!"

The result is a string. Because we didn't assign it to a name, the notebook assigned it to a numbered result variable (for example, `res5`). This is an actual value and we can work with it.

In [None]:
//Output the string above to the notebook by replacing the ???
show(???)
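
As a small illustration of the `s"..."` interpolator used in the cells above, the sketch below embeds both a plain variable and a braced expression in a string. The names `workshop` and `year` are made up for this example and are not part of the workshop notebooks.

```scala
// Hypothetical values, used only for illustration
val workshop = "LASI"
val year = 17

// $name embeds a variable; ${...} embeds any expression
val greeting = s"Hello $workshop'${year}!"
val summary = s"${workshop.toLowerCase} has ${workshop.length} letters"
```

Either value could then be passed to `show()` or left as the last expression in a cell, in the same way as `myWorld`.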

#### Coding

The code in the notebook is run by a 'kernel'. The name of the kernel is in the top right-hand corner of the window, under the logout button. For this workshop, we have Scala and Python notebooks.

You can pick one of a number of options to complete this workshop:

1. Follow along in Scala using the Scala notebooks
2. Follow along in Python using the Python notebooks
3. Work in Scala or Python on your local machine, using the notebooks as a guide
4. Work in another language that you prefer, adapting the notebook exercises to it

***Adopt the approach that is most useful for you and your work***

### I/O

For this workshop we're going to need some basic input and output. Rather than coding this every time we need it, we have two objects that provide access to (a) the file system and (b) the Text Analytics Pipeline (TAP).

#### Basic file access

Create an object that will provide some basic filesystem access:
- Hold common properties
- Provide common file access methods

In [None]:
object LocalIO {
    import java.io.File
    import scala.io.Source
    
    val IN_DIR_NAME = "/input_files"
    val OUT_DIR_NAME = "/output_files"
    
    val thisDir = new File(".").getCanonicalPath
    val inputFileDir = thisDir+IN_DIR_NAME
    val outputFileDir = thisDir+OUT_DIR_NAME
    
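    // Predicate: true for files that are not hidden
    val visibleFile = (file:File) => !file.isHidden
    // Predicate: true for files whose name ends in .txt (case-insensitive)
    val textFile = (file:File) => file.getName.toLowerCase.endsWith(".txt")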
    
    def directoryFromString(directory:String):Option[File] = {
         val thisDir = new File(directory)
         if (thisDir.exists && thisDir.isDirectory) Some(thisDir)
         else None
    }
    
    def listFiles(directory:String):List[File] = {
        directoryFromString(directory) match {
            // listFiles returns null on an I/O error, so guard with Option
            case Some(dir) => Option(dir.listFiles).map(_.toList).getOrElse(Nil)
            case None => List[File]()
        }
    }
    
    def listThisDir = listFiles(thisDir)
    
    def listThisDirVisible = listThisDir.filter(visibleFile)
    
    def listThisDirText = listThisDirVisible.filter(textFile)
    
    def readFile(file:File) = {
        val source = Source.fromFile(file.getCanonicalPath)
        try {
            source.getLines.mkString("\n\n")
        } finally {
            source.close
        }
    }
}

In [None]:
//Check that this object is working as expected

//What files are in the current directory?
show(LocalIO.listThisDir)

In [None]:
//What text files are available in the input directory?
val inputTextFiles = LocalIO.listFiles(LocalIO.inputFileDir).filter(LocalIO.textFile)

//Open and show the first file in the input directory
show(LocalIO.readFile(inputTextFiles.head))
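
The predicate-based filtering used by `LocalIO` can be tried in isolation. The following is a minimal sketch, assuming a throwaway temporary directory and made-up file names; `isVisible` and `isText` mirror the style of `visibleFile` and `textFile`, and the `endsWith` extension check is just one possible choice.

```scala
import java.io.File
import java.nio.file.Files

// Predicates in the same style as LocalIO.visibleFile and LocalIO.textFile
val isVisible = (file: File) => !file.isHidden
val isText = (file: File) => file.getName.toLowerCase.endsWith(".txt")

// Create a throwaway directory with three empty files (names are illustrative)
val tempDir = Files.createTempDirectory("localio-sketch").toFile
List("essay.txt", "notes.md", "draft.txt").foreach { name =>
  new File(tempDir, name).createNewFile()
}

// listFiles returns null on an I/O error, so wrap it in Option
val allFiles = Option(tempDir.listFiles).map(_.toList).getOrElse(Nil)

// Apply both predicates and sort the names for a stable order
val textNames = allFiles.filter(isVisible).filter(isText).map(_.getName).sorted
```

Because the predicates are ordinary function values, they can be combined or swapped without changing the listing code.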

#### TAP access

Create an object that will provide access to the Text Analytics Pipeline (TAP):
- Hold common properties
- Provide common TAP methods

In [1]:
//For TAP access we need a couple of libraries to handle the http connection and the deserialisation of json
import $ivy.`org.scalaj:scalaj-http_2.11:2.3.0`
import $ivy.`com.lihaoyi:upickle_2.11:0.4.4`

object Tap {
    import scalaj.http._
    
    //val API_URL = "https://b9yiddda69.execute-api.ap-southeast-2.amazonaws.com/initialtest/v1"
    val API_URL = "http://localhost:8080/v1"
    val HEALTH_URL = API_URL+"/health"
    val CLEAN_URL = API_URL+"/analyse/text/clean"
    
    case class Message(message:String)
    case class Results(message:String,results:List[String])

    def serverDetails = Http(API_URL).asString

    def getHealthMessage = {
        println(s"Connecting to $HEALTH_URL")
        val response = Http(HEALTH_URL).asString
        //println(response)
        upickle.default.read[Message](response.body)
    }

    def serverIsHealthy = {
        try { getHealthMessage.message=="ok" }
        catch { case e:Exception => {
                println(s"There was a problem with the server: $e")
                false }
        }
    }
    
    def cleanText(text:String) = {
        println(s"Cleaning text: $text")
        val response = Http(CLEAN_URL).postData(text).asString
        upickle.default.read[Results](response.body)
    }
}

import $ivy.$

import $ivy.$

defined object Tap

In [2]:
//Check that this object is working as expected

//Try connecting to the server to check that it is up and running
show(Tap.serverIsHealthy)

Connecting to http://localhost:8080/v1/health
true

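The health check above turns any exception into `false`. The same idea can be sketched with `scala.util.Try`, without needing a running server. This is only an illustration: `isHealthy` and its `fetchStatus` parameter are hypothetical stand-ins for the real HTTP call and are not part of the `Tap` object.

```scala
import scala.util.{Try, Success, Failure}

// `fetchStatus` stands in for the real HTTP call; it is a hypothetical parameter
def isHealthy(fetchStatus: () => String): Boolean =
  Try(fetchStatus()) match {
    case Success(status) => status == "ok"
    case Failure(e) =>
      println(s"There was a problem with the server: $e")
      false
  }

// A server that answers "ok", and one whose connection fails
val up = isHealthy(() => "ok")
val down = isHealthy(() => throw new RuntimeException("connection refused"))
```

Passing the call in as a function makes the failure path easy to exercise on a machine where TAP is not running.
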

### Workshop approach

The workshop is intended to guide you through some of the processes involved in Writing Analytics. It is not intended to be prescriptive, but rather to provide a lot of flexibility for you to explore the ideas in ways that are most relevant to your work.

If you're inexperienced with coding, then you may just wish to stick to the path provided by the Jupyter notebooks. If you're experienced, then you may wish to explore the ideas in your own way. **Choose the best path for you.**

#### Common ground

Regardless of which approach you take, all of us will explore some common ground through 3 questions that we will continually revisit over the course of the workshop:

1. What are the pedagogical aspects?
2. What are the computational aspects?
3. How do we connect the pedagogical and the computational?

We can think of each connection as a single Writing Analytics beam, with the aim of building an increasingly strong bridge over time from multiple beams connecting the pedagogical and the computational. 


### Thoughts on accuracy and precision

In [None]:
2+3

In [None]:
1.9999999999 + 3.0000000001
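
One way to explore the two cells above is to ask when two numbers count as "the same". The sketch below is an illustration under assumptions, not part of the workshop material: the helper `approxEqual` and its default threshold of `1e-9` are arbitrary choices made for this example.

```scala
// Double arithmetic is precise to many digits but not always exact
val sum = 0.1 + 0.2
val exactlyEqual = sum == 0.3

// Compare within a tolerance instead; the threshold 1e-9 is an arbitrary choice
def approxEqual(a: Double, b: Double, tolerance: Double = 1e-9): Boolean =
  math.abs(a - b) <= tolerance

val closeEnough = approxEqual(sum, 0.3)

// BigDecimal built from strings keeps decimal values exact
val exactSum = BigDecimal("1.9999999999") + BigDecimal("3.0000000001")
```

Here `exactlyEqual` is `false` while `closeEnough` is `true`, so the answer depends on where the threshold is drawn.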

**The bald man problem:**

**Learning Analytics:** When can we know that a student has actually learnt something?

### Thoughts on evaluation

So what does this mean for how we should evaluate our Writing Analytics?