## Reflecting, writing, and analytics: What can we learn from student text as data?
### HERN Workshop - 27 Nov 2017

What does a student's language say about their learning? When students put their personal thoughts into words, what does it reveal about them, their thinking, and their interactions with others? In this workshop we will explore some of the ways reflective writing can be used for learning, and take an introductory look at how computational analysis can uncover meaningful aspects of that writing. During the workshop, we will experiment with a couple of tools for analysing writing, examine some cases where these tools were used for learning, and establish some important principles for using writing analytics in a learning and teaching context.

### A couple of Reflective Writing Analytics (RWA) examples

#### Academic Reflective Writing

[Academic Writing Analytics (AWA)](http://awa.uts.edu.au/) 

- Login using AAF
- Try exemplar reflections
- Look at the theoretical framework for reflective writing
- Examine how the framework links through to the feedback

#### Reflection and Metacognition

[Towards the Discovery of Learner Metacognition From Reflective Writing](http://nlytx.io/2016/metacognition/index.html) 

- Try different examples and view the features
- Look at the theoretical link between metacognition and reflection
- Examine how the theory translates to the features

### DIY Reflective Writing Analytics

A basic Reflective Writing Analytics (RWA) task, step by step, using the [Text Analytics Pipeline (TAP)](http://tap-test.utscic.edu.au).

But first, we need to do some setup of the software.

In [1]:
from IPython.display import display, HTML       # Allows us to create annotated text using HTML and CSS
import json                                     # We need to be able to work with JSON which is returned by TAP
from urllib import request, response            # To create requests to TAP and handle responses from TAP
import string                                   # To help with visualising analytics
import ipywidgets as widgets                    # Provides an interactive Textarea widget

#### The Task - Group Efficacy

Consider a large cohort of students undertaking an assignment in small groups. Most work is undertaken outside of face-to-face time, and therefore monitoring group interaction is not practical.

Suppose we wish to identify which groups are functioning well and which groups are having problems so that we can intervene early.

**The Writing:** *Students use [GoingOK](http://goingok.org) to write short personal reflections about their group work after each group interaction (or at least once per week).*

#### Language features

What features are we likely to see in the students' writing when group work is going well? What about when the group is not functioning?

Take a look at a couple of examples...

In [12]:
# Load reflections from disk, and format to screen (can these be shown in a new window??)
with open("example_text/grp1-pers1.txt") as file:   # The file is closed automatically after reading
    g1p1Text = file.read()
print(g1p1Text)
g1p1Para = '<p>'+g1p1Text+'</p>'
g1p1Html = HTML(g1p1Para)
display(g1p1Html)                                    # Display the HTML object, not the file handle

I can't believe that Harry has done nothing for our project. Everyone has been working diligently on it except him. He hasn't even started. If he doesn't lift his game this week, I'm going to talk to our tutor about it. I don't want us all to get a bad mark because he can't make an effort.

#### Insignificant words that are significant

Often, when processing text computationally, we are interested in content. Words that don't contribute to the content, called stop words (a, the, this, then, me, I, us), are discarded and the algorithm works with the remaining content words.

*What do content words tell us about the effectiveness of the groups?*

*What about the stop words? Do they tell us anything?*
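
One way to start exploring these questions is to make the pronouns visible. The following is a minimal sketch, not part of TAP: the `PRONOUNS` set and the `highlightPronouns` function are illustrative assumptions, and the pronoun list is deliberately short.

```python
import html
import re

# Assumed pronoun list for illustration only; extend it to suit your data
PRONOUNS = {"i", "me", "my", "we", "us", "our", "he", "him", "his",
            "she", "her", "they", "them", "their"}

def highlightPronouns(text):
    """Return an HTML string with each listed pronoun wrapped in a <mark> tag."""
    def mark(match):
        word = match.group(0)
        if word.lower() in PRONOUNS:
            return "<mark>" + word + "</mark>"
        return word
    # Escape first so that any markup in the reflection is shown literally
    escaped = html.escape(text, quote=False)
    return re.sub(r"[A-Za-z]+", mark, escaped)
```

In the notebook, the result can be rendered with `display(HTML(highlightPronouns(g1p1Text)))`, reusing the `display` and `HTML` imports from the setup cell.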

In [None]:
# Highlight pronouns in example text

#### Formulating a hypothesis

When performing writing analytics, we use observations coupled with knowledge of the context and relevant theory to formulate a hypothesis about how the text might be analysed to yield insights that are of practical benefit.

**Hypothesis:** The use of singular or plural pronouns is an indicator of group efficacy.

We can then test this on the data.
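
A first, rough test could simply count the two kinds of pronoun in each reflection. The sketch below assumes two small hand-picked pronoun sets (`SINGULAR` and `PLURAL`) and a hypothetical `pronounCounts` function. Which pronouns belong in each set is a modelling choice, not something defined by TAP.

```python
import re
from collections import Counter

# Assumed pronoun sets for illustration; adjust them to match your hypothesis
SINGULAR = {"i", "me", "my", "he", "him", "his", "she", "her"}
PLURAL = {"we", "us", "our", "they", "them", "their"}

def pronounCounts(text):
    """Count singular and plural pronouns in a reflection."""
    counts = Counter()
    # Lower-case the text and split it into alphabetic words
    for word in re.findall(r"[a-z]+", text.lower()):
        if word in SINGULAR:
            counts["singular"] += 1
        elif word in PLURAL:
            counts["plural"] += 1
    return counts
```

Calling `pronounCounts(g1p1Text)` on each group member's reflection gives counts that can be compared across groups or over time.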

In [None]:
# Visualise singular versus plural pronouns

#### Refining the hypothesis

While testing our hypothesis we might notice other factors. We can then look for types of analysis that test whether these factors bring us closer to our goal of identifying group efficacy. Consider...

- The use of group members' names (NER)
- Positive or negative language (Sentiment)
- Consistency between group members' language (Antonym-Hyponym)
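
As a simple stand-in for the first idea, the sketch below counts mentions of names from a known group roster. This is a plain lookup, not true Named Entity Recognition, which would find names without being given a list. The `nameMentions` function and the roster are hypothetical.

```python
import re

def nameMentions(text, roster):
    """Count how often each roster name appears in text as a whole word (case-sensitive)."""
    mentions = {}
    for name in roster:
        pattern = r"\b" + re.escape(name) + r"\b"
        mentions[name] = len(re.findall(pattern, text))
    return mentions
```

For example, `nameMentions(g1p1Text, ["Harry", "Sally"])` reports how many times each of those (hypothetical roster) names is mentioned in the reflection.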

In [None]:
# Visualise NER and Sentiment

# See if classes of features are aligning and group into categories

# Categorise reflections

# Identify significant antonyms and hyponyms within reflections of a group

# Can we classify the group?

In [1]:
# Other software required...


### Check the connection to TAP

We need to ensure that we can actually connect to TAP before asking it to analyse our text. TAP provides a health endpoint which we can use to check if the server is up.

**[1]** Firstly, we need to set the ```URL``` for TAP, and the endpoint that we want to request.

In [2]:
tapUrl = "http://tap-test.utscic.edu.au/"   # TAP URL
health = "health"                           # Health endpoint

**[2]** Now we need to create a ```Request``` to send to the TAP server. 
> For now, this request goes to the health endpoint, but later we will want to send requests to a different endpoint. That is why ```tapUrl``` is kept separate from the ```health``` endpoint.

In [3]:
tapRequest = request.Request(tapUrl+health)   # The full URL is the URL for TAP + the health endpoint

**[3]** We send the request to TAP and capture the response. 

The response from TAP includes a ```status```, which should be ```200``` to signal that everything was OK on the server, and a ```body```, which is the web data that we are interested in.

In [4]:
tapResponse = request.urlopen(tapRequest)           # Send the request to TAP and capture the response
print(tapResponse.status)                           # Print out the status code - it should be 200
body = tapResponse.read().decode('utf8')            # If the status is OK, then read the body
print(body)                                         # Print out the body. Should be: {"message":"Ok"}
                                                    # If the request is unsuccessful, an error will be thrown


200
{"message":"Ok"}
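
Because an unsuccessful request raises an error, it can be convenient to wrap the check in a function that returns `True` or `False` instead. This is a sketch under assumptions: the `tapIsUp` name and the `timeout` default are illustrative, and the set of errors caught is not exhaustive.

```python
import json
from urllib import request

def tapIsUp(healthUrl, timeout=5):
    """Return True if the health endpoint answers 200 with {"message":"Ok"}."""
    try:
        with request.urlopen(healthUrl, timeout=timeout) as tapResponse:
            body = tapResponse.read().decode('utf8')
            data = json.loads(body)
            return (tapResponse.status == 200
                    and isinstance(data, dict)
                    and data.get("message") == "Ok")
    except (OSError, ValueError):
        # OSError covers urllib's URLError and HTTPError, refused connections and timeouts;
        # ValueError covers a malformed URL or a body that is not valid JSON
        return False
```

With the variables defined above, the call would be `tapIsUp(tapUrl + health)`.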


### Using the JSON data

Most of the time, when we get JSON data back from TAP we will want to process it in some way. So let's get the message into a variable so that we can use it in other parts of our code.

In [5]:
jsonData = json.loads(body)              # Change the raw text into a JSON object
message = jsonData.get("message")        # Get the actual message
if message == 'Ok':                      # Use it in code
  print("All is good")
else:
  print("No good")

All is good


### Querying the GraphQL endpoint on TAP

The above shows how to connect to TAP and get basic data back, but to use TAP's analytics capability we need to send requests to the GraphQL endpoint that include the text that we want analysed.

To do this, we need more than just the new endpoint in the request: we also need the query itself and a ```request header``` to tell TAP about the data format.

**[1]** Because we will want to send queries with different text, let's create a query function first. This one will identify modal expressions in the text.

In [6]:
query = None              # A global variable to put the created query - Need a functional way of doing this!!
def modalQuery(text):     # A function to create a modal expression query for TAP with given text
    queryEntry = "query Modal($input: String!){ expressions(text: $input) { analytics { modal { text }}}}"
    variableEntry = {'input':text}
    global query
    query = json.dumps({'query':queryEntry, 'variables':variableEntry})

**[2]** We can pass this function to a text area widget, which updates the ```query``` variable automatically whenever text is typed or pasted into the ```Input``` box.

In [7]:
# Create the widget
textArea = widgets.Textarea(                            
              placeholder='Paste or type your text here',
              description='Input:',
              disabled=False )
              
# Add some demo text so that the user doesn't have to make something up              
textArea.value = 'I would have preferred to code in Scala, but I will continue in Python for the benefit of others.'

# Make the widget interactive. Note that assigning the result to suppress just stops it being shown below the widget
suppress = widgets.interact(modalQuery,text=textArea)
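
With a query in hand, the request to TAP needs the query as its body and a header describing the data format. The sketch below shows one way this could look. It assumes the endpoint is named `graphql` (appended to the TAP URL in the same way as `health`), and the `modalQueryJson` and `graphqlRequest` functions are illustrative names. It also returns the query rather than storing it in a global. The request is built but not sent.

```python
import json
from urllib import request

tapUrl = "http://tap-test.utscic.edu.au/"   # TAP URL, as set earlier
graphql = "graphql"                         # Assumed name of the GraphQL endpoint

def modalQueryJson(text):
    """Return the modal expression query as a JSON string, without using a global."""
    queryEntry = "query Modal($input: String!){ expressions(text: $input) { analytics { modal { text }}}}"
    return json.dumps({'query': queryEntry, 'variables': {'input': text}})

def graphqlRequest(text):
    """Build (but do not send) a POST request carrying the query for the given text."""
    return request.Request(
        tapUrl + graphql,
        data=modalQueryJson(text).encode('utf8'),       # Supplying data makes this a POST
        headers={'Content-Type': 'application/json'})   # Tells TAP that the body is JSON
```

Sending it would follow the same pattern as the health check: pass `graphqlRequest(textArea.value)` to `request.urlopen`, read and decode the body, then parse it with `json.loads`. The exact shape of the returned JSON is not assumed here.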