In this notebook we use the R package `SPARQL` to query the [Nomisma](http://nomisma.org/sparql) SPARQL endpoint. We store the endpoint URL in a variable called `endpoint` and the full SPARQL query in a variable called `query`.

To run the query, we call `SPARQL(endpoint, query)`. The `results` element of the returned object is then stored in a data frame (which you can think of as a table of the data).

In [None]:
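# based on https://www.r-bloggers.com/sparql-with-r-in-less-than-5-minutes/
# SPARQL is the querying package loaded in the next cell; if this call fails
# because the package is no longer on CRAN, install it from the CRAN archive
install.packages("SPARQL")
install.packages("ggplot2")

install.packages("remotes") # if remotes is not already installed
remotes::install_github("lvaudor/glitter") # optional; not used in the cells below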

In [None]:
library("SPARQL") # SPARQL querying package
library("ggplot2")

## Step 1 - Set up preliminaries and define query

In [None]:
# Define the endpoint
endpoint <- "http://nomisma.org/query"

In [None]:
# create query statement
# Here we are retrieving coins of RIC Augustus 1A and 1B
# Do you see where that part is specified?

query <- "PREFIX rdf:		<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms:		<http://purl.org/dc/terms/>
PREFIX nm:		<http://nomisma.org/id/>
PREFIX nmo:		<http://nomisma.org/ontology#>
PREFIX foaf:		<http://xmlns.com/foaf/0.1/>
PREFIX skos:	<http://www.w3.org/2004/02/skos/core#>

SELECT ?object ?identifier ?diameter ?weight ?axis ?collection
WHERE {
	{?object nmo:hasTypeSeriesItem <http://numismatics.org/ocre/id/ric.1(2).aug.1A> }
	UNION { ?object nmo:hasTypeSeriesItem <http://numismatics.org/ocre/id/ric.1(2).aug.1B> }
	?object rdf:type nmo:NumismaticObject .
	OPTIONAL { ?object nmo:hasWeight ?weight }
	OPTIONAL { ?object nmo:hasDiameter ?diameter }
	OPTIONAL { ?object nmo:hasAxis ?axis }
	OPTIONAL { ?object dcterms:identifier ?identifier }
	OPTIONAL { ?object nmo:hasCollection ?colUri .
		?colUri skos:prefLabel ?collection FILTER(langMatches(lang(?collection), 'EN'))}
	
}"
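The two type URIs are the only part of the query that changes from one coin type to another. As a minimal sketch (the names `type_uris`, `blocks`, and `type_pattern` are illustrative and not part of the SPARQL package), the UNION alternatives could be generated from a vector with base R's `sprintf()` and `paste()`:

```r
# Illustrative vector of type URIs (the same two used in the query above)
type_uris <- c(
  "http://numismatics.org/ocre/id/ric.1(2).aug.1A",
  "http://numismatics.org/ocre/id/ric.1(2).aug.1B"
)

# One "{ ?object nmo:hasTypeSeriesItem <uri> }" block per URI
blocks <- sprintf("{ ?object nmo:hasTypeSeriesItem <%s> }", type_uris)

# Join the blocks with UNION, mirroring the hand-written pattern
type_pattern <- paste(blocks, collapse = "\n\tUNION ")
cat(type_pattern, "\n")
```

The resulting string could then be spliced into the WHERE clause in place of the two hand-written lines, for instance with `paste0()`, which makes it easier to add further types later.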


## Step 2 - Use SPARQL package to submit query and save results to a data frame

In [None]:
qd <- SPARQL(endpoint,query)
df <- qd$results

In [None]:
# check the first few rows to see what we've got
head(df)

## Step 3 - Fix data class if necessary

In [None]:
# Numbers are sometimes returned as character strings.
# Check whether each column is character (chr) or numeric (num).
str(df)

In [None]:
# If any column was returned as character but we need it as numeric, we can
# select that column with $ and convert it in place.
# E.g. if 'weight' were chr, we would take the weight column, convert it to numeric, and assign it back:

#df$weight <- as.numeric(as.character(df$weight))
#str(df)
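The reason for going through `as.character()` first shows up when a column has arrived as a factor rather than plain text. The sketch below uses made-up values (`weight_raw` is a stand-in, not data from Nomisma) to show the difference:

```r
# Toy values standing in for a weight column that came back as a factor
weight_raw <- factor(c("3.85", "3.72", NA, "3.90"))

# Pitfall: as.numeric() on a factor returns the internal level codes,
# not the measurements themselves
codes <- as.numeric(weight_raw)
codes

# Converting to character first recovers the actual numbers
weight_num <- as.numeric(as.character(weight_raw))
weight_num
```

If `str(df)` reports a column as `chr`, the inner `as.character()` is harmless; if it reports `Factor`, it is what keeps the values correct.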

## Step 4 - See what we've got

In [None]:
# Now we can do some statistics. Cells that were empty now hold NA, so we need to ignore them when we calculate the mean
mean(df$weight, na.rm = TRUE)
mean(df$axis, na.rm = TRUE)
mean(df$diameter, na.rm = TRUE)
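To see what `na.rm = TRUE` changes, here is a small sketch with invented numbers (the `weights` vector is hypothetical): without the argument a single missing value makes the whole mean missing, and it is worth counting how many values actually contributed.

```r
# Invented weights with one missing measurement
weights <- c(3.8, NA, 4.0, 3.6)

# Without na.rm, one NA is enough to make the result NA
mean_with_na <- mean(weights)
mean_with_na

# With na.rm = TRUE, the mean is taken over the three recorded values
mean_no_na <- mean(weights, na.rm = TRUE)
mean_no_na

# How many values the mean is actually based on
n_used <- sum(!is.na(weights))
n_used
```

Reporting the count alongside the mean helps when many coins in a result set lack a given measurement.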

In [None]:
# how many examples do we have from each collection, with how many at what diameter?
summary_table <- table(df$collection, df$diameter)
summary_table
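Because diameter is a continuous measurement, a table of raw values can end up with one column per distinct diameter. One possible refinement, sketched here on made-up data (the `toy` data frame, its collection labels "A" and "B", and the 1-unit bin width are all assumptions), is to group diameters into classes with `cut()` before tabulating:

```r
# Made-up data: two hypothetical collections, one coin without a diameter
toy <- data.frame(
  collection = c("A", "A", "B", "B", "B"),
  diameter   = c(19.2, 20.1, 19.8, NA, 21.4)
)

# Bin diameters into classes of width 1;
# right = FALSE gives intervals [19,20), [20,21), [21,22)
toy$diameter_class <- cut(toy$diameter, breaks = 19:22, right = FALSE)

# useNA = "ifany" adds a column for coins with no recorded diameter
binned_table <- table(toy$collection, toy$diameter_class, useNA = "ifany")
binned_table
```

The break points would need adjusting to the range of diameters actually returned by the query.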

## Step 5 - Visualize some aspects of the data

In [None]:
# Let's imagine that it is meaningful to compare the weight of the coins against the axis.
# Inside aes() we refer to columns by name; ggplot looks them up in df.
ggplot(df, aes(x = axis, y = weight)) +
  geom_point() +
  stat_smooth() +
  xlab("Coin Axis") +
  ylab("Coin Weight")