# Recogito User Study: Evaluation

Hi! Here we evaluate the records collected during user testing sessions of the Recogito annotation tool, held with students from Chiara Palladino's Classics courses at Furman University, South Carolina, USA.

First, we're going to load all records from the database dump. All records were stored in real time in a CouchDB NoSQL database, so getting plain JSON output was a no-brainer.

In [33]:
const records = require('./records.json')

The root element has a `rows` property, which holds the actual records.

In [34]:
console.log('Object properties:', Object.keys(records))
console.log('Number of records:', records.rows.length)

Object properties: [ 'total_rows', 'offset', 'rows' ]
Number of records: 4763


## Exploring the Schema

First, let's discover what kinds of properties an entry has.

In [35]:
let firstRow = records.rows[0]
console.log('Row properties:', Object.keys(firstRow))
console.log(JSON.stringify(firstRow, null, 2))

Row properties: [ 'id', 'key', 'value', 'doc' ]
{
  "id": "000c6fd0-1530-11ea-8886-214517bdac3a",
  "key": "000c6fd0-1530-11ea-8886-214517bdac3a",
  "value": {
    "rev": "1-eb5a97672dfac35f3e569aaa9d085f92"
  },
  "doc": {
    "_id": "000c6fd0-1530-11ea-8886-214517bdac3a",
    "_rev": "1-eb5a97672dfac35f3e569aaa9d085f92",
    "type": "open",
    "userId": "Ovalsquare",
    "annotation": {
      "annotation_id": "5c136455-c514-472a-8d33-4756e23b70e9",
      "version_id": "6d693d8e-58a0-4c38-abe0-47a7523003c2",
      "annotates": {
        "document_id": "abj2fb4gjn04mg",
        "filepart_id": "29945459-c942-4b7a-8887-49757f964a75",
        "content_type": [
          "IMAGE",
          "IMAGE_UPLOAD"
        ]
      },
      "contributors": [
        "tvs2019"
      ],
      "anchor": "tbox:x=2840,y=746,a=0.19528139809489925,l=93,h=-23",
      "last_modified_by": "tvs2019",
      "last_modified_at": "2019-11-15T18:07:45+00:00",
      "bodies": [
        {
          "type": "TRANSCRIPT

Again, this object is nested: its root has `id`, `key`, `value`, and `doc`. The `doc` object was provided by the analytics microservice running during each session and, besides CouchDB's `_id` and `_rev`, has `type`, `userId`, `annotation`, and `timestamp`.

The `annotation` object is the annotation that Recogito operated on during a particular event. There are six event types:
* `init`: The page has reloaded (like pressing F5 in a browser).
* `create`: An annotation has been created.
* `open`: An existing annotation has been opened in the editor.
* `edit`: An existing annotation has been edited.
* `close`: The annotation editor has been closed.
* `delete`: An existing annotation has been deleted.
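
As a rough illustration of how these event types can be tallied, the following sketch counts events per `type`. The `sampleRows` array is a hypothetical stand-in for `records.rows` and only fills in the fields used here; `eventTypes` and `countsByType` are names introduced for this example.

```javascript
// Hypothetical sample rows; only the fields used here are filled in.
const sampleRows = [
  { doc: { type: 'init', userId: 'user1' } },
  { doc: { type: 'create', userId: 'user1' } },
  { doc: { type: 'open', userId: 'user2' } },
  { doc: { type: 'edit', userId: 'user2' } },
  { doc: { type: 'close', userId: 'user2' } },
  { doc: { type: 'open', userId: 'user1' } }
]

// Start every known type at zero so that unused types still show up.
const eventTypes = ['init', 'create', 'open', 'edit', 'close', 'delete']
const countsByType = Object.fromEntries(eventTypes.map(type => [type, 0]))

for (const { doc } of sampleRows) {
  // Types outside the six listed above would be counted as well.
  countsByType[doc.type] = (countsByType[doc.type] || 0) + 1
}

console.log(countsByType)
```

Swapping `sampleRows` for `records.rows` should give the same kind of overview for the full dump, assuming every row carries a `doc.type`.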

## Preparing the Data

We held two sessions with around 10-15 participants each. All participants consented to having their actions recorded digitally within the browser and processed for the purposes of this thesis. Let's prepare the data and make two batches, one for each session.
* Session 1: Nov 15, 2019, 18:30-19:30 CET
* Session 2: Dec 2, 2019, 18:30-19:30 CET

In [36]:
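// Session windows as epoch milliseconds. ISO 8601 strings with an explicit
// UTC offset (+01:00 = CET) are parsed consistently across JavaScript engines.
var firstDate = [
    Date.parse('2019-11-15T18:30:00+01:00'),
    Date.parse('2019-11-15T19:30:00+01:00')
]
var secondDate = [
    Date.parse('2019-12-02T18:30:00+01:00'),
    Date.parse('2019-12-02T19:30:00+01:00')
]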

console.log(firstDate)
console.log(secondDate)

var inbetween = (timestamp, [start, end]) => timestamp >= start && timestamp <= end

[ 1573839000000, 1573842600000 ]
[ 1575307800000, 1575311400000 ]
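
To double-check that these numbers match the intended session times, the epoch milliseconds can be converted back into readable dates. The sketch below does this for the two windows printed above; `firstWindow`, `secondWindow`, `toIso`, and `isWithin` are names introduced for this example only. It also shows that the range check includes both boundaries.

```javascript
// Epoch milliseconds copied from the output above.
const firstWindow = [1573839000000, 1573842600000]
const secondWindow = [1575307800000, 1575311400000]

// Render a window as UTC ISO strings for a quick visual check.
const toIso = ([start, end]) => [start, end].map(ms => new Date(ms).toISOString())

console.log(toIso(firstWindow))
console.log(toIso(secondWindow))

// Same inclusive range check as in the cell above.
const isWithin = (timestamp, [start, end]) => timestamp >= start && timestamp <= end
console.log(isWithin(firstWindow[0], firstWindow))      // start boundary is included
console.log(isWithin(firstWindow[1] + 1, firstWindow))  // one millisecond past the end is not
```

The ISO strings are in UTC, so both windows read 17:30 to 18:30, which corresponds to 18:30 to 19:30 CET.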


Let's split the events into two distinct sets, one per session, and check how large each set is and how many events fall outside both sessions.

In [38]:
var firstSession = []
var secondSession = []
var remainingEvents = []

for (const row of records.rows) {
    if (inbetween(row.doc.timestamp, firstDate)) {
        firstSession.push(row)
    } else if (inbetween(row.doc.timestamp, secondDate)) {
        secondSession.push(row)
    } else {
        remainingEvents.push(row)
    }
}

console.log('First session:', firstSession.length, 'events')
console.log('Second session:', secondSession.length, 'events')
console.log('Not considered:', remainingEvents.length, 'events')

First session: 1267 events
Second session: 1850 events
Not considered: 1646 events


That's looking quite solid already. Let's see which users produced the events that fall outside both sessions and were therefore not considered.

In [41]:
var uniqueArray = (array) => [...new Set(array)]
// Count the out-of-session events per user ID.
var userIds = {}
for (const {doc} of remainingEvents) {
    userIds[doc.userId] = (userIds[doc.userId] || 0) + 1
}

userIds

{
  '1234': 5,
  '1089111': 73,
  '3002202': 16,
  '3003938': 10,
  recogito: 632,
  tvs2019: 49,
  homer1337: 44,
  cpalladino: 241,
  hermes: 30,
  Lemur2001: 201,
  Heartbreaker: 27,
  null: 11,
  foobar: 227,
  Ovalsquare: 3,
  falafeljan: 30,
  '0801008': 23,
  patl72033: 5,
  ravenclaw99: 3,
  azhang1004: 5,
  Elsbert_test: 5,
  homer26: 6
}
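
A plain object like this is hard to scan for the most active accounts. As a small sketch, the counts can be turned into pairs and sorted by size. `sampleCounts` holds only a hand-copied subset of the output above, and `ranked` is a name introduced for this example.

```javascript
// A few of the counts from the output above, copied by hand.
const sampleCounts = { foobar: 227, homer26: 6, recogito: 632, Lemur2001: 201, cpalladino: 241 }

// Turn the object into [userId, count] pairs and sort by count, highest first.
const ranked = Object.entries(sampleCounts).sort(([, a], [, b]) => b - a)

for (const [userId, count] of ranked) {
  console.log(userId.padEnd(12), count)
}
```

Passing the full `userIds` object instead of `sampleCounts` should rank all accounts the same way.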

**TODO**: This should be of interest for later on, when adjusting the survey dates. 

## Data Analysis

### Number of Page Reloads

Get the number of page reloads 