# AfterAlice Data

<br>
<font size=3>

This [Jupyter notebook](https://jupyter.org/) uses the [R statistical language](https://en.wikipedia.org/wiki/R_(programming_language)) to extract data from the [AfterAlice website](http://theafteraliceproject.org/) through Omeka's [JSON API](https://omeka.org/classic/docs/Technical/Output_Formats/).

</font>


## Load packages and the data

Packages used below:

* jsonlite for parsing JSON
* purrr for functional programming over lists
* listviewer for doing recon on awkward lists
* tibble and dplyr for forming and manipulating data frames

In [2]:
library(jsonlite)

In [3]:
library(purrr)


Attaching package: ‘purrr’

The following object is masked from ‘package:jsonlite’:

    flatten



In [4]:
install.packages("listviewer")

also installing the dependency ‘htmlwidgets’

Updating HTML index of packages in '.Library'
Making 'packages.html' ... done


In [5]:
library(listviewer)

In [6]:
library(tibble)

In [7]:
library(dplyr)


Attaching package: ‘dplyr’

The following objects are masked from ‘package:purrr’:

    contains, order_by

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union



# Retrieving the data
<br>
<font size=3>

The JSON data at http://theafteraliceproject.org/api/collections is retrieved with the library function `jsonlite::fromJSON` and stored in ``collections_raw``. Because `simplifyVector = FALSE` is set, the result is a nested list rather than a data frame ...

</font>

In [8]:
collections_raw <- jsonlite::fromJSON("http://theafteraliceproject.org/api/collections", simplifyVector = FALSE)
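The effect of `simplifyVector = FALSE` can be seen on a small inline example. This is only a sketch: `json_txt` is a made-up two-record string standing in for the API response (the second record's values are invented), so it runs without network access.

```r
library(jsonlite)

# Hypothetical JSON string standing in for the API response
json_txt <- '[{"id": 1, "public": true, "items": {"count": 4044}},
              {"id": 2, "public": true, "items": {"count": 12}}]'

# simplifyVector = FALSE keeps every record as a plain nested list
as_list <- fromJSON(json_txt, simplifyVector = FALSE)

# The default (simplifyVector = TRUE) collapses the records into a data frame
as_df <- fromJSON(json_txt)

class(as_list)
class(as_df)

# Nested fields of a record are reached by position and name
as_list[[2]]$items$count
```

Keeping the nested list preserves the hierarchy exactly as the API returns it, at the cost of having to index into each record by hand.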

# Examining the data
<br>
<font size=3>
When you first import the JSON, the structure will be unfamiliar, so it is necessary to explore it.

A more readable format of the JSON can be viewed on the AfterAlice website at http://theafteraliceproject.org/api/collections?pretty_print .

To view the structure of the first collection in the JSON, we call `str` on the first element of the list `collections_raw` ...
</font>

In [29]:
str(collections_raw[[1]])

List of 10
 $ id                : int 1
 $ url               : chr "http://theafteraliceproject.org/api/collections/1"
 $ public            : logi TRUE
 $ featured          : logi TRUE
 $ added             : chr "2016-10-03T17:58:21+00:00"
 $ modified          : chr "2017-01-14T10:19:17+00:00"
 $ owner             :List of 3
  ..$ id      : int 1
  ..$ url     : chr "http://theafteraliceproject.org/api/users/1"
  ..$ resource: chr "users"
 $ items             :List of 3
  ..$ count   : int 4044
  ..$ url     : chr "http://theafteraliceproject.org/api/items?collection=1"
  ..$ resource: chr "items"
 $ element_texts     :List of 3
  ..$ :List of 3
  .. ..$ text       : chr "Hebden Bridge"
  .. ..$ element_set:List of 4
  .. .. ..$ id      : int 1
  .. .. ..$ url     : chr "http://theafteraliceproject.org/api/element_sets/1"
  .. .. ..$ name    : chr "Dublin Core"
  .. .. ..$ resource: chr "element_sets"
  .. ..$ element    :List of 4
  .. .. ..$ id      : int 50
  .. .. ..$ url     : chr
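When the full `str` output is too long to read, the `max.level` argument limits how deep it descends. The sketch below uses `record_demo`, a hypothetical cut-down stand-in for one parsed collection record, not the real API data.

```r
# Hypothetical stand-in for one parsed collection record
record_demo <- list(
  id = 1L,
  public = TRUE,
  owner = list(id = 1L, resource = "users"),
  element_texts = list(
    list(text = "Hebden Bridge", element = list(id = 50L, name = "Title"))
  )
)

# Show only the top level; nested lists are summarised rather than expanded
str(record_demo, max.level = 1)

# Descend a little further into a single branch of interest
str(record_demo$element_texts, max.level = 2)
```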

<br>
<font size=3>
This seems unwieldy at first, but on examination it reveals a hierarchical structure.

We can extract the names of the elements at the topmost level by calling `names` on the first record of the ``collections_raw`` list as follows ...

</font>

In [34]:
top_level_names <- names(collections_raw[[1]])
top_level_names
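Before looping over every collection it can help to confirm that all records share the field names of the first one. A minimal sketch, assuming a hypothetical `collections_demo` list in place of the real response:

```r
# Hypothetical stand-in for the parsed API response
collections_demo <- list(
  list(id = 1L, url = "u1", public = TRUE),
  list(id = 2L, url = "u2", public = TRUE)
)

reference_names <- names(collections_demo[[1]])

# One TRUE/FALSE per record: do its field names match the first record exactly?
same_names <- vapply(
  collections_demo,
  function(record) identical(names(record), reference_names),
  logical(1)
)

all(same_names)
```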

# Element text data
<br>
<font size=3>
If we pull out just the `element_texts` as `element_text_data`, we can see the 3 lists of metadata that make up the `element_texts` of a collection ...
</font>

In [36]:
element_text_data <- collections_raw[[1]][["element_texts"]]
element_text_data

In [38]:
# Print the element name of each of the three element texts
for (i in 1:3) {
    print(collections_raw[[1]]$element_texts[[i]]$element$name)
}

[1] "Title"
[1] "Description"
[1] "Subject"
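The same element names can be collected into a vector without hard-coding the number of entries. This is a sketch on `element_texts_demo`, a hypothetical list shaped like one collection's `element_texts`; the text values are placeholders.

```r
# Hypothetical stand-in for collections_raw[[1]]$element_texts
element_texts_demo <- list(
  list(text = "Hebden Bridge", element = list(name = "Title")),
  list(text = "A market town", element = list(name = "Description")),
  list(text = "A market town, longer text", element = list(name = "Subject"))
)

# vapply visits every entry and guarantees one character value per entry
element_names <- vapply(
  element_texts_demo,
  function(et) et$element$name,
  character(1)
)

element_names
```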


In [12]:
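# Description text of the first collection
a <- collections_raw[[1]][["element_texts"]][[2]][["text"]]
a


In [13]:
# Title text of the first collection
b <- collections_raw[[1]][["element_texts"]][[1]][["text"]]
b


# Name of the element that holds the title; called element_name rather than c
# so that it does not shadow base R's c()
element_name <- collections_raw[[1]][["element_texts"]][[1]][["element"]][["name"]]
element_name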

In [14]:
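# Pre-allocate a character vector, then fill it with the titles of collections 1 to 90
collection_title <- character(90)

for (i in 1:90) {
    collection_title[i] <- collections_raw[[i]][["element_texts"]][[1]][["text"]]
}
collection_title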

In [15]:
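# Pre-allocate a character vector, then fill it with the descriptions of collections 1 to 90
collection_description <- character(90)

for (i in 1:90) {
    collection_description[i] <- collections_raw[[i]][["element_texts"]][[2]][["text"]]
}
collection_description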
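Since purrr is already loaded, the two loops above can also be written as one call each. This is an alternative sketch, not the approach used in this notebook, and `collections_demo` is a hypothetical stand-in for `collections_raw`. Passing a list of names and positions to `map_chr` walks down the nesting, and `.default` supplies a value when a record has no such entry.

```r
library(purrr)

# Hypothetical stand-in for collections_raw; the third record has no description
collections_demo <- list(
  list(element_texts = list(list(text = "Hebden Bridge"), list(text = "A market town"))),
  list(element_texts = list(list(text = "Heptonstall"), list(text = "A small village"))),
  list(element_texts = list(list(text = "Lumbutts")))
)

# list("element_texts", 1, "text") means x[["element_texts"]][[1]][["text"]]
titles <- map_chr(collections_demo, list("element_texts", 1, "text"))

# .default fills in NA where the second element text is missing
descriptions <- map_chr(
  collections_demo,
  list("element_texts", 2, "text"),
  .default = NA_character_
)

titles
descriptions
```

Unlike a loop over a fixed range, this visits however many records the list contains.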

## Voyage of discovery

In [16]:
for (i in 1:3) {
    print(collections_raw[[1]][["element_texts"]][[i]]$element$name)
}


[1] "Title"
[1] "Description"
[1] "Subject"


In [17]:
# Pre-allocate a character vector and then assign each value by index

cnames <- character(3)

for (i in 1:3) {
    cnames[i] <- collections_raw[[1]][["element_texts"]][[i]]$element$name
}
cnames

In [18]:
for (i in 1:3) {
    print(collections_raw[[1]][["element_texts"]][[i]]$text)
}


[1] "Hebden Bridge"
[1] "<span>Hebden Bridge is a market town which forms part of Hebden Royd in West Yorkshire, England. It is in the Upper Calder Valley, 8 miles west of Halifax and 14 miles north-east of Rochdale, at the confluence of the River Calder and the Hebden Water.</span><span><a class=\"fl q _KCd _tWc\" href=\"http://en.wikipedia.org/wiki/Hebden_Bridge\">Wikipedia</a></span><br /><br />"
[1] "<div class=\"_cgc\">\n<div class=\"r-iv66X8zOlhmU\">\n<div class=\"kno-rdesc r-iQ5ocSYwNDPc\"><span>Hebden Bridge is a market town which forms part of Hebden Royd in West Yorkshire, England. It is in the Upper Calder Valley, 8 miles (13 km) west of Halifax and 14 miles (21 km) north-east of Rochdale, at the confluence of the River Calder and the Hebden Water. In 2004, the Calder Valley ward, covering Hebden Bridge, Old Town, and part of Todmorden, had a population of 11,549; the town itself has a population of approximately 4,500. Source: Wikipedia</span></div>\n</div>\n</div>"


In [19]:
# Collect the three text values of the second collection (Heptonstall)
ctext <- character(3)

for (i in 1:3) {
    ctext[i] <- collections_raw[[2]][["element_texts"]][[i]]$text
}
ctext

In [20]:
# Construct a one-row, three-column matrix from the text values
collection_matrix <- matrix(ctext, ncol = 3)
collection_matrix

0,1,2
Heptonstall,"<span>Heptonstall is a small village and civil parish within the Calderdale borough of West Yorkshire, England, historically part of the West Riding of Yorkshire.</span><span><a class=""fl q _KCd _tWc"" href=""http://en.wikipedia.org/wiki/Heptonstall"">Wikipedia</a></span><br /><br />","<div class=""_cgc""> <div class=""r-iUhkTHZShunI""> <div class=""kno-rdesc r-i5RQfDurzkj4""><span>Heptonstall is a small village and civil parish within the Calderdale borough of West Yorkshire, England, historically part of the West Riding of Yorkshire. The population of Heptonstall, including the hamlets of Colden and Slack Top, is 1,448, increasing to 1,470 at the 2011 Census. The town of Hebden Bridge lies directly to the south-east. Although Heptonstall is part of Hebden Bridge as a post town, it is not within the Hebden Royd town boundaries. The village is on the route of the Calderdale Way, a 50-mile (80 km) circular walk around the hills and valleys of Calderdale. <em>Source: Wikipedia</em></span></div> </div> </div>"


In [21]:
# Name the columns after the element names (Title, Description, Subject)
colnames(collection_matrix) <- cnames

In [22]:
collection_matrix

Title,Description,Subject
Heptonstall,"<span>Heptonstall is a small village and civil parish within the Calderdale borough of West Yorkshire, England, historically part of the West Riding of Yorkshire.</span><span><a class=""fl q _KCd _tWc"" href=""http://en.wikipedia.org/wiki/Heptonstall"">Wikipedia</a></span><br /><br />","<div class=""_cgc""> <div class=""r-iUhkTHZShunI""> <div class=""kno-rdesc r-i5RQfDurzkj4""><span>Heptonstall is a small village and civil parish within the Calderdale borough of West Yorkshire, England, historically part of the West Riding of Yorkshire. The population of Heptonstall, including the hamlets of Colden and Slack Top, is 1,448, increasing to 1,470 at the 2011 Census. The town of Hebden Bridge lies directly to the south-east. Although Heptonstall is part of Hebden Bridge as a post town, it is not within the Hebden Royd town boundaries. The village is on the route of the Calderdale Way, a 50-mile (80 km) circular walk around the hills and valleys of Calderdale. <em>Source: Wikipedia</em></span></div> </div> </div>"
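The column names can also be attached when the matrix is created, through the `dimnames` argument of `matrix`. A short sketch with placeholder values (`cnames_demo` and `ctext_demo` are hypothetical):

```r
# Hypothetical stand-ins for cnames and ctext
cnames_demo <- c("Title", "Description", "Subject")
ctext_demo <- c("Heptonstall", "A small village", "A small village, longer text")

# dimnames = list(row names, column names); NULL leaves the rows unnamed
demo_matrix <- matrix(
  ctext_demo,
  ncol = 3,
  byrow = TRUE,
  dimnames = list(NULL, cnames_demo)
)

# Columns can then be selected by name
demo_matrix[, "Title"]
```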


In [23]:
# Pre-allocate both vectors, then fill them in a single pass over collections 1 to 90
collection_title <- character(90)
collection_description <- character(90)

for (i in 1:90) {
    collection_title[i] <- collections_raw[[i]][["element_texts"]][[1]][["text"]]
    collection_description[i] <- collections_raw[[i]][["element_texts"]][[2]][["text"]]
}
collection_title
collection_description


In [24]:
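# Named column_names so the variable does not shadow the colnames() function
column_names <- c("Title", "Description")

# matrix() fills column by column, so titles form column 1 and descriptions column 2
z_matrix <- matrix(c(collection_title, collection_description), ncol = 2)

colnames(z_matrix) <- column_names
z_matrix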


Title,Description
Hebden Bridge,"<span>Hebden Bridge is a market town which forms part of Hebden Royd in West Yorkshire, England. It is in the Upper Calder Valley, 8 miles west of Halifax and 14 miles north-east of Rochdale, at the confluence of the River Calder and the Hebden Water.</span><span><a class=""fl q _KCd _tWc"" href=""http://en.wikipedia.org/wiki/Hebden_Bridge"">Wikipedia</a></span><br /><br />"
Heptonstall,"<span>Heptonstall is a small village and civil parish within the Calderdale borough of West Yorkshire, England, historically part of the West Riding of Yorkshire.</span><span><a class=""fl q _KCd _tWc"" href=""http://en.wikipedia.org/wiki/Heptonstall"">Wikipedia</a></span><br /><br />"
Todmorden,"<span>Todmorden is a market town and civil parish in the Upper Calder Valley in Calderdale, West Yorkshire, England. It is 17 miles from Manchester and in 2011 had a population of 15,481.</span><span><a class=""fl q _KCd _tWc"" href=""http://en.wikipedia.org/wiki/Todmorden"">Wikipedia</a></span><br /><br />"
Cornholme,"<span>Cornholme is a village within the Metropolitan Borough of Calderdale, in West Yorkshire, England. It lies at the edge of Calderdale, on the boundary with Lancashire, and in the narrow Calder Valley about 2.5 miles northwest of Todmorden.</span><span><a class=""fl q _KCd _tWc"" href=""http://en.wikipedia.org/wiki/Cornholme"">Wikipedia</a></span><br /><br />"
Sowerby Bridge,"Sowerby Bridge is a market town in the Upper Calder Valley in Calderdale in West Yorkshire, England. The Calderdale Council ward population at the 2011 census was 11,703. <em>Source: Wikipedia</em>"
Walsden,"<span>Walsden is a large village in the civil parish of Todmorden in the Metropolitan Borough of Calderdale, West Yorkshire, England, though historically in Lancashire and close to the modern boundary with Greater Manchester.</span><span><a class=""fl q _KCd _tWc"" href=""http://en.wikipedia.org/wiki/Walsden"">Wikipedia</a></span><br /><br />"
Mytholmroyd,"<span>Mytholmroyd is a large village in Hebden Bridge, West Yorkshire, England, 1 mile east of Hebden Bridge and 7 miles west of Halifax.</span><span><a class=""fl q _KCd _tWc"" href=""http://en.wikipedia.org/wiki/Mytholmroyd"">Wikipedia</a></span>"
Luddendenfoot,"<span>Luddendenfoot or Luddenden Foot is a community in Calderdale, West Yorkshire, England. The population of this Calderdale Ward at the 2011 Census was 10,653.</span><span><a class=""fl q _KCd _tWc"" href=""http://en.wikipedia.org/wiki/Luddendenfoot"">Wikipedia</a></span><br /><br />"
Lumbutts,Images taken in Lumbutts village and it's immediate surrounds.
Mankinholes,"<span>Mankinholes is a hamlet in the Metropolitan Borough of Calderdale, in West Yorkshire, England. It is situated in the Pennines and the nearest town is Todmorden in Yorkshire. The hamlet is part of Calder Ward in Calderdale Parish Council.</span><span><a class=""fl q _KCd _tWc"" href=""http://en.wikipedia.org/wiki/Mankinholes"">Wikipedia</a></span><br /><br />"


In [25]:
dim(z_matrix)
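The tibble and dplyr packages loaded at the start offer an alternative to the matrix: a data frame with one named column per field. This is a sketch only; `titles_demo` and `descriptions_demo` are hypothetical stand-ins for `collection_title` and `collection_description`.

```r
library(tibble)
library(dplyr)

# Hypothetical stand-ins for collection_title and collection_description
titles_demo <- c("Hebden Bridge", "Heptonstall")
descriptions_demo <- c("A market town", "A small village")

# One row per collection, one named column per field
collections_tbl <- tibble(Title = titles_demo, Description = descriptions_demo)

dim(collections_tbl)

# Rows can be selected by value instead of by position
heptonstall <- filter(collections_tbl, Title == "Heptonstall")
heptonstall
```

A tibble keeps each column's type and name together, which makes later filtering and joining simpler than with a character matrix.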