<img src="http://www.organicdatacuration.org/linkedearth/images/5/51/EarthLinked_Banner_blue_NoShadow.jpg">

# Searching the LinkedEarth wiki: Guide to SPARQL queries

One of the main advantages of the LinkedEarth Ontology is its ability to search for datasets using very specific criteria. For instance, it is possible to search for all the sea surface temperature datasets covering the Holocene from a specific location.

Broken down into simpler pieces, this query would look for the following:
1. The archive is "marine sediment"
2. The types of inferred variables are "sea surface temperature" and "age" (to look for the Holocene)
3. The InferredVariableType "age" needs to have a HasMinValue of 0 and a HasMaxValue of 10 (kyr) or 10000 (yr).
4. The dataset needs to be between specified "lat" and "lon" coordinates.
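
As a rough sketch, the four criteria above can be written down with the search variables that this Notebook uses in its "Create your own query" section (`archiveType`, `infVarType`, `ageUnits`, `ageBound`, `lat`, `lon`). The `criteria` dictionary is only an illustrative container, and the coordinates are placeholders rather than values from a real study.

```python
# Sketch: the example search expressed with the Notebook's search variables.
# The dictionary itself and the coordinates are illustrative assumptions.
criteria = {
    "archiveType": ["marine sediment"],         # 1. the archive
    "infVarType": ["Sea Surface Temperature"],  # 2. the inferred variable
    "ageUnits": ["yr BP"],                      # 3. age expressed in years
    "ageBound": [0, 10000],                     # 3. Holocene: 0 to 10000 yr
    "lat": [-30, 30],                           # 4. latitude range (placeholder)
    "lon": [100, 160],                          # 4. longitude range (placeholder)
}

for name, value in criteria.items():
    print(name, "=", value)
```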

We've just thrown a lot of terms into this query that you may not be familiar with. If you want to know more about how the LinkedEarth database is organized and makes use of the LinkedEarth Ontology, visit [this page](http://wiki.linked.earth/LinkedEarth_Ontology).

The possibilities for querying the database are endless; however, doing so requires knowledge of the LinkedEarth Ontology and the SPARQL language. We understand that not every user needs to form highly complex queries. This Notebook is for you! We have identified several types of queries that are often performed and created a user-friendly (no knowledge of SPARQL required) way for you to enter specific search criteria. 

If you're new to LinkedEarth, read all the sections below. If you've already performed queries and are familiar with the database, skip to the last section.
If you don't know how to use a Jupyter Notebook, please look at [this example](https://github.com/nickmckay/LiPD-utilities/blob/master/Examples/Welcome%20Jupyter%20-%20Quickstart.ipynb).

Table of Contents:
* [How to use this Notebook?](#howto)
* [Introduction to the LinkedEarth Ontology](#ontology)
    - [The LiPD Ontology](#LiPD)
    - [The Proxy Archive Ontology](#proxyarchive)
    - [The Proxy Observation Ontology](#proxyobs)
    - [The Proxy Sensor Ontology](#proxysensor)
    - [The Instrument Ontology](#instrument)
    - [The Inferred Variable Ontology](#inferredvar)
    - [Units](#units)
* [Create your own query](#create)
* [Getting Help](#help)

## <a name=howto></a> How to use this Notebook?

This Notebook is intended for users with little to no knowledge of SPARQL and Python. It will help you create the text of the query and run it.

First, you need to know a little bit more about the LinkedEarth wiki and name standardization. Let's assume you want to look for "marine sediment" with the proxy "d18O". Two things need to happen: (1) the wiki needs to understand what a proxy is and (2) it needs to search for the exact character match (i.e., d18O rather than delta18O, delta-18O, ...).

We taught the wiki that a "climate proxy" is in fact composed of a Proxy Archive (e.g., marine sediment), a Proxy Sensor (e.g. Globigerinoides ruber), and a Proxy Observation (e.g. d18O) following the work of [Evans et al. (2013)](http://www.sciencedirect.com/science/article/pii/S0277379113002011).

As for standardizing the terms, we are in the process of doing so with input from the community. To see what observations, archives and sensors are already available on the wiki, we essentially created a query to ask it just that! The results are in the cells below (rerun them to have the most up-to-date snapshot of possibilities). All you have to do is then make sure that your query term follows the wiki nomenclature (i.e. query for D18O rather than d18O!).

Second, you will need to have [Python](https://www.python.org/) and [Jupyter Notebook](http://jupyter.org) installed on your computer. Both come standard with the [Anaconda release](https://www.anaconda.com/download/). This Notebook was written so as to minimize user input. In other words, you do not need extensive knowledge of Python to use it.

If you're interested in learning more about Jupyter Notebook, check out this [demo](https://github.com/nickmckay/LiPD-utilities/blob/master/Examples/Welcome%20Jupyter%20-%20Quickstart.ipynb). 

### Run a cell
The cell below is a Python cell that contains a simple function. In order to run its code, you must:
1. Click on the cell to select it.
2. Press SHIFT+ENTER on your keyboard or press the "run cell" button in the toolbar above.
3. Confirm that "Congrats! You ran your first code cell!" was printed below the cell.

In [1]:
print("Congrats! You ran your first code cell!")

Congrats! You ran your first code cell!


## <a name=ontology></a> Introduction to the LinkedEarth Ontology

At its most fundamental level, the LinkedEarth ontology not only defines terms commonly used to describe a paleoclimate dataset (e.g. variable, uncertainty, calibration) but also specifies the relationships among those terms (e.g. a variable has uncertainty). As such, it allows us to make inferences, support complex queries, and perform quality control on the data. To learn more about the LinkedEarth ontology, visit [this page](http://wiki.linked.earth/LinkedEarth_Ontology).

The LinkedEarth Ontology is divided into several components: the [LiPD ontology](#LiPD), the [Proxy Archive Ontology](#proxyarchive), the [Proxy Observation Ontology](#proxyobs), the [Proxy Sensor Ontology](#proxysensor), the [Instrument Ontology](#instrument), and the [Inferred Variable Ontology](#inferredvar).

### <a name=LiPD></a> The LiPD Ontology

This part of the ontology concerns itself with the formatting of a dataset and follows the [LiPD](http://wiki.linked.earth/Linked_Paleo_Data) format very closely. Although performing a query on the terms included in this part of the ontology would be extremely rare (but not impossible), the terms are used to navigate the hierarchical structure of the wiki. 

### <a name=proxyarchive></a> The Proxy Archive Ontology

This ontology defines the different categories of archive types used in paleoclimate studies (such as marine sediment, coral, ...) following the definition by [Evans et al. (2013)](http://www.sciencedirect.com/science/article/pii/S0277379113002011).

On the wiki, the type of archive can be specified at two levels:
1. As a top dataset property using the "archiveType" property from LiPD
2. As part of the proxy system which is formally mapped in the ontology

Because some datasets may specify the archive type in only one of these two ways, we strongly suggest that you enter all possible variants when performing the search. For instance, search for both "marine sediment" (the LiPD term) and "MarineSediment" (the LinkedEarth Ontology term).
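
A minimal sketch of how both spellings could be generated from a LiPD term, assuming the ontology label is simply the capitalized words joined together. The function names are hypothetical, and the rule does not hold for every archive (for example, the wiki lists "tree" next to "Wood"), so always check the result against the list printed by the cell below.

```python
def to_ontology_label(lipd_term):
    """Join the capitalized words of a term: 'marine sediment' -> 'MarineSediment'."""
    return "".join(word.capitalize() for word in lipd_term.split())


def archive_search_terms(lipd_term):
    """Return the LiPD spelling followed by the assumed ontology spelling."""
    candidates = [lipd_term, to_ontology_label(lipd_term)]
    # dict.fromkeys keeps the order and drops duplicates
    return list(dict.fromkeys(candidates))


print(archive_search_terms("marine sediment"))
print(archive_search_terms("glacier ice"))
```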

To know which archiveTypes are currently present on the wiki, run the cell below (SHIFT+ENTER).

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**NOTE**</p>
<p>The Ontology is organic, meaning that it is designed to "grow" as more and more records are added to the wiki and researchers need to define new terms or redefine existing ones. The [standardization effort](http://wiki.linked.earth/Paleoclimate_Data_Standards) may also result in a reduction of the number of properties. Therefore, rerun this cell to make sure that the properties you're searching for are still available or that new, equivalent properties have been added.</p>
</div> 

In [4]:
import json
import requests

url = "http://wiki.linked.earth/store/ds/query"

query = """PREFIX core: <http://linked.earth/ontology#>
PREFIX wiki: <http://wiki.linked.earth/Special:URIResolver/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT distinct ?a 
WHERE {
{
    ?dataset wiki:Property-3AArchiveType ?a.
}UNION
{
    ?w core:proxyArchiveType ?t.
    ?t rdfs:label ?a
}
}"""

response = requests.post(url, data = {'query': query})
res = json.loads(response.text)

print("The following archive types are available on the wiki:")    
for item in res['results']['bindings']:
    print ("*" + item['a']['value'])

The following archive types are available on the wiki:
*marine sediment
*coral
*lake sediment
*glacier ice
*tree
*documents
*speleothem
*sclerosponge
*borehole
*hybrid
*bivalve
*Rock
*Sclerosponge
*Speleothem
*Wood
*Coral
*MarineSediment
*LakeSediment
*GlacierIce
*Documents
*Hybrid
*MolluskShell
*Lake
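
The cells in this Notebook all read the response in the same way: each row of `res['results']['bindings']` maps a variable name to a dictionary holding a `value`. A small helper can make that pattern explicit. This is only a sketch; the function name and the hand-written `sample` response are assumptions made for illustration, so no connection to the wiki is needed to try it.

```python
def binding_values(result, variable):
    """Return the values bound to `variable` in a parsed SPARQL JSON response."""
    return [row[variable]["value"]
            for row in result["results"]["bindings"]
            if variable in row]  # skip rows where the variable is unbound


# Hand-written sample that mimics the structure of the endpoint's response
sample = {"results": {"bindings": [
    {"a": {"type": "literal", "value": "marine sediment"}},
    {"a": {"type": "literal", "value": "coral"}},
]}}

for archive in binding_values(sample, "a"):
    print("*" + archive)
```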


### <a name=proxyobs></a> The Proxy Observation Ontology

This part of the ontology defines the various observations made on archives, following the definition of [Evans et al. (2013)](http://www.sciencedirect.com/science/article/pii/S0277379113002011).

To know which Proxy Observation types are currently present on the wiki, run the cell below (SHIFT+ENTER).

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**NOTE**</p>
<p>The Ontology is organic, meaning that it is designed to "grow" as more and more records are added to the wiki and researchers need to define new terms or redefine existing ones. The [standardization effort](http://wiki.linked.earth/Paleoclimate_Data_Standards) may also result in a reduction of the number of properties. Therefore, rerun this cell to make sure that the properties you're searching for are still available or that new, equivalent properties have been added.</p>
</div> 

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**Warning**</p>
<p>Please remember to enter all possible search terms while our standardization process is underway (e.g., "calcium carbonate" and "CaCO3").</p>
</div>

In [5]:
import json
import requests

url = "http://wiki.linked.earth/store/ds/query"

query = """PREFIX core: <http://linked.earth/ontology#>
PREFIX wiki: <http://wiki.linked.earth/Special:URIResolver/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT distinct ?a 
WHERE 
{
    ?w core:proxyObservationType ?t.
    ?t rdfs:label ?a
}"""

response = requests.post(url, data = {'query': query})
res = json.loads(response.text)

print("The following proxy observation types are available on the wiki: ")    
for item in res['results']['bindings']:
    print ("*" + item['a']['value'])

The following proxy observation types are available on the wiki: 
*DiffuseSpectralReflectance
*JulianDay
*Al/Ca
*B/Ca
*Ba/Ca
*Mn/Ca
*Sr/Ca
*Zn/Ca
*Radiocarbon
*D18O
*Mg/Ca
*TEX86
*TRW
*Dust
*Chloride
*Sulfate
*Nitrate
*D13C
*Depth
*Age
*Mg
*Floral
*DD
*C
*N
*P
*Si
*Uk37
*Uk37Prime
*Density
*GhostMeasured
*Trsgi
*Mg Ca
*SampleCount
*Segment
*RingWidth
*Residual
*ARS
*Corrs
*RBar
*SD
*SE
*EPS
*Core
*Uk37prime
*Upper95
*Lower95
*Year old
*Thickness
*Na
*DeltaDensity
*Reflectance
*BlueIntensity
*VarveThickness
*Reconstructed
*AgeMin
*AgeMax
*SampleID
*Depth top
*Depth bottom
*R650 700
*R570 630
*R660 670
*RABD660 670
*WaterContent
*C N
*BSi
*MXD
*EffectiveMoisture
*Pollen
*Precipitation
*Unnamed
*Sr Ca
*Calcification1
*Calcification2
*Calcification3
*CalcificationRate
*Composite
*Calcification4
*Notes
*Notes1
*Calcification5
*Calcification
*Calcification6
*Calcification7
*Trsgi1
*Trsgi2
*Trsgi3
*Trsgi4
*IceAccumulation
*F
*Cl
*Ammonium
*K
*Ca
*Duration
*Hindex
*VarveProperty
*X radiograph 

### <a name=proxysensor></a> The Proxy Sensor Ontology

This part of the ontology defines the various types of proxy sensors following the definition of [Evans et al. (2013)](http://www.sciencedirect.com/science/article/pii/S0277379113002011). Proxy sensors are divided into two mutually exclusive categories: organic and inorganic sensors.

Since queries will most likely involve the genus and species of organic sensors, we limit the search to these. 

To know which SensorGenus/SensorSpecies are currently present on the wiki, run the cell below (SHIFT+ENTER).

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**NOTE**</p>
<p>The Ontology is organic, meaning that it is designed to "grow" as more and more records are added to the wiki and researchers need to define new terms or redefine existing ones. The [standardization effort](http://wiki.linked.earth/Paleoclimate_Data_Standards) may also result in a reduction of the number of properties. Therefore, rerun this cell to make sure that the properties you're searching for are still available or that new, equivalent properties have been added.</p>
</div> 

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**Warning**</p>
<p>Please remember to enter all possible search terms while our standardization process is underway.</p>
</div>

In [6]:
import json
import requests

url = "http://wiki.linked.earth/store/ds/query"

query = """PREFIX core: <http://linked.earth/ontology#>
PREFIX wiki: <http://wiki.linked.earth/Special:URIResolver/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT distinct ?a ?b
WHERE 
{
    ?w core:sensorGenus ?a.
    ?w core:sensorSpecies ?b .
    
}"""

response = requests.post(url, data = {'query': query})
res = json.loads(response.text)
    
print("The available sensor genus/species are: ")    
for item in res['results']['bindings']:
    print ("*" + 'Genus: '+item['a']['value']+' Species: ' +item['b']['value'])

The available sensor genus/species are: 
*Genus: Siderastrea Species: Siderastrea siderea
*Genus: Siderastrea Species: siderea
*Genus: Siderastrea Species: Siderastrea radians
*Genus: Globigerinoides Species: ruber
*Genus: Globigerinoides Species: sacculifer
*Genus: Ceratoporella Species: nicholsoni
*Genus: Ceratoporella Species: Ceratoporella nicholsoni
*Genus: Diploria Species: labyrinthiformis
*Genus: Diploria Species: Diploria labyrinthiformis
*Genus: Diploria Species: Diploria strigosa
*Genus: Porites Species: lutea
*Genus: Porites Species: Porites austraiensis
*Genus: Porites Species: NA
*Genus: Porites Species: Porites
*Genus: Porites Species: Porites sp.
*Genus: Porites Species: Porites lobata
*Genus: Porites Species: Porites lutea
*Genus: Porites Species: NaN
*Genus: Porites Species: P. australiensis, possibly P. lobata
*Genus: Porites Species: Porites australiensis
*Genus: Porites Species: lobata
*Genus: Cibicidoides Species: mundulus
*Genus: Cibicidoides Species: wuellerstor

### <a name=instrument></a> The Instrument Ontology

This part of the ontology concerns itself with the various types of instruments used to produce the proxy observations. Although searching the wiki by type of instrument is certainly possible, it is not a common request. 

### <a name=inferredvar></a> The Inferred Variable Ontology

This part of the ontology aims to provide a taxonomy of the various inferred variables.

To know which types of inferred variables are currently present on the wiki, run the cell below (SHIFT+ENTER).

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**NOTE**</p>
<p>The Ontology is organic, meaning that it is designed to "grow" as more and more records are added to the wiki and researchers need to define new terms or redefine existing ones. The [standardization effort](http://wiki.linked.earth/Paleoclimate_Data_Standards) may also result in a reduction of the number of properties. Therefore, rerun this cell to make sure that the properties you're searching for are still available or that new, equivalent properties have been added.</p>
</div> 

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**Warning**</p>
<p>Please remember to enter all possible search terms while our standardization process is underway.</p>
</div>

In [7]:
import json
import requests

url = "http://wiki.linked.earth/store/ds/query"

query = """PREFIX core: <http://linked.earth/ontology#>
PREFIX wiki: <http://wiki.linked.earth/Special:URIResolver/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT distinct ?a 
WHERE 
{
    ?w core:inferredVariableType ?t.
    ?t rdfs:label ?a
}"""

response = requests.post(url, data = {'query': query})
res = json.loads(response.text)

print("The following Inferred Variable types are available on the wiki: ")    
for item in res['results']['bindings']:
    print ("*" + item['a']['value'])

The following Inferred Variable types are available on the wiki: 
*Year
*Radiocarbon Age
*D18O
*Sea Surface Temperature
*Age
*Temperature
*Salinity
*Uncertainty temperature
*Temperature1
*Temperature2
*Temperature3
*Uncertainty temperature1
*Thermocline Temperature
*Sedimentation Rate
*Relative Sea Level
*Sea Surface Salinity
*Subsurface Temperature
*Accumulation rate
*Carbonate Ion Concentration
*Mean Accumulation Rate
*Accumulation rate, total organic carbon
*Accumulation rate, calcium carbonate


In addition, the wiki can be queried according to the interpretation field. 

To know which types of interpretation are currently present on the wiki, run the cell below (SHIFT+ENTER).

The Interpretation category has two important fields:
1. The name of the interpretation (e.g. temperature, salinity...)
2. And details about the interpretation (e.g. sea surface, air surface, thermocline)

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**NOTE**</p>
<p>The Ontology is organic, meaning that it is designed to "grow" as more and more records are added to the wiki and researchers need to define new terms or redefine existing ones. The [standardization effort](http://wiki.linked.earth/Paleoclimate_Data_Standards) may also result in a reduction of the number of properties. Therefore, rerun this cell to make sure that the properties you're searching for are still available or that new, equivalent properties have been added.</p>
</div> 

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**Warning**</p>
<p>Please remember to enter all possible search terms while our standardization process is underway.</p>
</div>

In [9]:
import json
import requests

url = "http://wiki.linked.earth/store/ds/query"

query = """PREFIX core: <http://linked.earth/ontology#>
PREFIX wiki: <http://wiki.linked.earth/Special:URIResolver/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT distinct ?a ?b
WHERE 
{
    ?w core:name ?a.
    ?w core:detail ?b .
    
}"""

response = requests.post(url, data = {'query': query})
res = json.loads(response.text)
    
print("The following interpretation are available on the wiki: ")    
for item in res['results']['bindings']:
    print ("*" + 'Name: '+item['a']['value']+' Detail: ' +item['b']['value'])

The following interpretation are available on the wiki: 
*Name: core Detail: middle of sample
*Name: Salinity Detail: sea surface
*Name: Salinity Detail: Sea Surface
*Name: Salinity Detail: Sea surface
*Name: Salinity Detail: surface water
*Name: Temperature Detail: sea surface
*Name: Temperature Detail: Sea Surface
*Name: Temperature Detail: bottom water
*Name: Temperature Detail: Sea surface
*Name: Temperature Detail: thermocline
*Name: Temperature Detail: surface water
*Name: Temperature Detail: subsurface
*Name: Calendar Detail: Age
*Name: Age Detail: Calendar
*Name: Age Detail: calendar
*Name: age Detail: calendar
*Name: D18O Detail: sea surface
*Name: d180w Detail: Sea Surface
*Name: d18O Detail: top of sample; signal believed to primarily reflect temperature, but influence of other factors cannot be excluded.
*Name: d18O Detail: deviation of oxygen isotope ratio 18O:16O in H2O sample compared to standard mean ocean water (V-SMOW)
*Name: d18O Detail: annual resolution
*Name: d18O
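
Because the search is case sensitive and the same detail appears above as "sea surface", "Sea Surface", and "Sea surface", it can help to generate the common capitalizations of a term before querying. The sketch below uses only built-in string methods; the function name is an assumption made for illustration.

```python
def case_variants(term):
    """Return the lower-case, Title Case, and Sentence case spellings of a term."""
    variants = [term.lower(), term.title(), term.capitalize()]
    # dict.fromkeys keeps the order and drops duplicates
    return list(dict.fromkeys(variants))


print(case_variants("sea surface"))
print(case_variants("Temperature"))
```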

### <a name=units></a> Units 

Searching for numerical quantities implies knowing the units in which the quantity is expressed. All LiPD files contain basic statistics about the numbers reported in the csv files: the mean, median, min and max values.

These statistics become important when performing an age query. However, the query is meaningless if the number doesn't have units. For instance, searching for all records covering the 0-10000 yr BP period is equivalent to searching for 0-10 kyr BP or 0-10 ka. 

Standardization of units is one of the goals of the LinkedEarth project. In the meantime, you need to perform the query with all possible unit combinations.

To see the possible units for age (or year), press SHIFT+ENTER to run the cell below. 

In [10]:
import json
import requests

url = "http://wiki.linked.earth/store/ds/query"

query = """PREFIX core: <http://linked.earth/ontology#>
PREFIX wiki: <http://wiki.linked.earth/Special:URIResolver/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT distinct ?b
WHERE 
{
    ?w core:inferredVariableType ?a.
    ?w core:hasUnits ?b .
    {
    ?a rdfs:label "Age" .
    }UNION
    {
    ?a rdfs:label "Year" .
    }
    
}"""

response = requests.post(url, data = {'query': query})
res = json.loads(response.text)
    
print("The following units are used to express age/year on the wiki: ")    
    
for item in res['results']['bindings']:
    print ("*" + item['b']['value'])

The following units are used to express age/year on the wiki: 
*AD
*yr B.P.
*yr BP
*kyr BP
*yr
*BP
*kyr
*yrs BP
*kaBP
*yr AD
*year
*yrs
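
Since the same period must be searched once per unit, the age bounds have to be rescaled for each group of spellings. The sketch below illustrates this for the 0-10000 yr BP example. The grouping in `UNIT_GROUPS` is an assumption based on the spellings printed above (ambiguous entries such as "BP" or "yr" are deliberately left out), and the function name is hypothetical.

```python
# Assumed grouping of unit spellings and the number of years per unit
UNIT_GROUPS = {
    "yr BP": (["yr B.P.", "yr BP", "yrs BP"], 1),
    "kyr BP": (["kyr BP", "kaBP"], 1000),
}


def age_queries(bounds_in_years):
    """Return one (ageUnits, ageBound) pair per group of equivalent units."""
    queries = []
    for spellings, years_per_unit in UNIT_GROUPS.values():
        scaled = [bound / years_per_unit for bound in bounds_in_years]
        queries.append((spellings, scaled))
    return queries


for units, bounds in age_queries([0, 10000]):
    print(units, bounds)
```

Each pair would then be run as a separate query.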


## <a name = create></a> Create your own query

Now that you have all the terms that you need to query the datasets stored on the wiki, you're ready to proceed. All you need to do is enter the query terms in the cell below. 

If you need to use more than one query term, include all of them separated by a comma following the example below (**note**: feature not available for all query terms). 

If you do not want to use this particular query, leave a blank in-between the brackets. **DO NOT comment out or delete the variable!**

Instructions:
* `archiveType`: The type of archive (enter all query terms, separated by a comma)
* `proxyObsType`: The type of proxy observation (enter all query terms, separated by a comma)
* `infVarType`: The type of inferred variable (enter all query terms, separated by a comma)
* `sensorGenus`: The Genus of the sensor (enter all query terms, separated by a comma)
* `sensorSpecies`: The Species of the sensor (enter all query terms, separated by a comma)
* `interpName`: The name of the interpretation (enter all query terms, separated by a comma)
* `interpDetail`: The detail of the interpretation (enter all query terms, separated by a comma)
* `ageUnits`: The units in which the age (year) is expressed.

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**Warning**</p>
<p>You need to run separate queries if the age units differ in meaning (e.g., yr B.P. vs kyr B.P.). If the units are spelled differently but mean the same thing (e.g., yr B.P. vs yr BP), enter all search terms separated by a comma.</p>
</div>

* `ageBound`: Enter the minimum and maximum age value to search for.

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**Warning**</p>
<p>You **MUST** enter a minimum **and** maximum value. If you wish to perform a query such as "all ages before 2000 A.D.", enter a minimum value of -99999 to cover all bases.</p>
<p>If ageBound is provided, ageUnits is **mandatory**.</p>
</div>

* `recordLength`: The minimum length the record needs to have while matching the ageBound criteria. For instance, "look for all records between 3000 and 6000 year BP with a record length of at least 1500 year".
* `resolution`: The maximum resolution of the record. Resolution has the same units as age/year. For instance, entering 100 looks for all records with a resolution finer than 100 years.

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**Warning**</p>
<p>Resolution applies to specific variables rather than an entire dataset. Imagine the case where some measurements are made every cm while others are made every 5cm. If you require a specific variable to have the needed resolution, make sure that either the proxyObservationType, inferredVariableType, and/or Interpretation fields are completed.</p>
</div>

* `lat`: The minimum and maximum latitude. South is expressed with negative numbers.

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**Warning**</p>
<p>You **MUST** enter a minimum **and** maximum value. If you wish to perform a query looking for records from the Northern Hemisphere, enter [0,90].</p>
</div>

* `lon`: The minimum and maximum longitude. West is expressed with negative numbers.

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**Warning**</p>
<p>You **MUST** enter a minimum **and** maximum value. If you wish to perform a query looking for records from the Western Hemisphere, enter [-180,0].</p>
</div>

* `alt`: The minimum and maximum altitude. Depth below sea level is expressed as negative numbers.

<div class="alert alert-warning" role="alert" style="margin: 10px">
<p>**Warning**</p>
<p>You **MUST** enter a minimum **and** maximum value. If you wish to perform a query looking for records below a certain depth (e.g., 500), enter [-99999,-500].</p>
</div>
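
To make the `ageBound` and `recordLength` criteria concrete, the sketch below restates in plain Python the test that the query cell applies to the minimum and maximum age of a record: the record must start at or before the lower bound, end at or after the upper bound, and span at least the requested length. The function name and the sample numbers are illustrative assumptions.

```python
def matches_age_criteria(min_age, max_age, age_bound, record_length=None):
    """Mirror the age filter: the record must cover the whole requested interval."""
    lower, upper = sorted(age_bound)
    covers = min_age <= lower and max_age >= upper
    if record_length is None:
        return covers
    return covers and abs(max_age - min_age) >= record_length


# A record spanning 0 to 10000 yr BP covers the 3000-6000 yr BP window
print(matches_age_criteria(0, 10000, [3000, 6000], record_length=1500))
# A record spanning 4000 to 5000 yr BP does not
print(matches_age_criteria(4000, 5000, [3000, 6000]))
```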

In [2]:
# By archive
archiveType = ["marine sediment","MarineSediment"]

# By variable
proxyObsType = ["Mg/Ca", "Mg Ca"]
infVarType = ["Sea Surface Temperature"]

# By sensor
sensorGenus=["Globigerinoides"]
sensorSpecies=["ruber"]

# By interpretation
interpName =["temperature", "Temperature"]
interpDetail =["sea surface"]

# By Age
ageUnits = ["yr BP"]
ageBound = [3000,6000] #Must enter minimum and maximum age search
recordLength = [1500]

# By resolution
#Make sure the resolution makes sense with the age units
# Will look for records with a max resolution of number entered
resolution = [100]
 
#By location
#Enter latitude boundaries below.
#If searching for entire latitude band, leave blank.
#Otherwise, enter both lower and upper bounds!
#Enter south latitude as negative numbers
lat = [-30, 30]

#Enter longitude boundaries below
# If searching for entire longitude band, leave blank
# Otherwise, enter both lower and upper bounds!
# Enter west longitude as negative numbers
lon = [100,160]

# Enter altitude boundaries below
# If not searching for specific altitude, leave blank
# Otherwise, enter both lower and upper bounds!
# Enter depth in the ocean as negative numbers
# All altitudes on the wiki are in m!
alt = [-10000,0]

The cell below checks for some of the requirements as specified in the instructions. If you get an error message, fix the query before continuing.

In [3]:
import sys

#Make sure that all conditions are met
if len(ageBound)==1:
    sys.exit("You need to provide a minimum and maximum age boundary.")

if ageBound and not ageUnits:
    sys.exit("When providing age limits, you must also enter the units")

if recordLength and not ageUnits:
    sys.exit("When providing a record length, you must also enter the units")    

if ageBound and ageBound[0]>ageBound[1]:
    ageBound = [ageBound[1],ageBound[0]]    

if recordLength and ageBound and recordLength[0] > (ageBound[1]-ageBound[0]):
    sys.exit("The required recordLength is greater than the provided age bounds")    

if len(resolution)>1:
    sys.exit("You can only search for a maximum resolution one at a time.")
    
if len(lat)==1:
    sys.exit("Please enter a lower AND upper boundary for the latitude search")

if lat and lat[1]<lat[0]:
    lat = [lat[1],lat[0]]
   
if len(lon)==1:
    sys.exit("Please enter a lower AND upper boundary for the longitude search")

if lon and lon[1]<lon[0]:
    lon = [lon[1],lon[0]]

if len(alt)==1:
    sys.exit("Please enter a lower AND upper boundary for the altitude search")

if alt and alt[1]<alt[0]:
    alt = [alt[1],alt[0]] 
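
The latitude, longitude, and altitude checks above all follow the same pattern: an empty list means "no constraint", a single value is an error, and two values are put in ascending order. A hedged sketch of that pattern as one reusable function is given below. The name `check_bounds` is hypothetical, and the function raises a `ValueError` instead of calling `sys.exit`, which is a design choice of this sketch rather than what the cell above does.

```python
def check_bounds(name, bounds):
    """Return bounds as [min, max]; an empty list means no constraint."""
    if len(bounds) == 0:
        return []
    if len(bounds) != 2:
        raise ValueError("Please enter a lower AND upper boundary for the "
                         + name + " search")
    return sorted(bounds)


print(check_bounds("latitude", [30, -30]))
print(check_bounds("longitude", []))
```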

Run the cell below to perform the query.

In [4]:
import json
import requests
import sys

url = "http://wiki.linked.earth/store/ds/query"

query = """PREFIX core: <http://linked.earth/ontology#>
PREFIX wiki: <http://wiki.linked.earth/Special:URIResolver/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT  distinct ?dataset
WHERE {
"""

### Look for data field
dataQ=""
if archiveType or proxyObsType or infVarType or sensorGenus or sensorSpecies or interpName or interpDetail or ageUnits or ageBound or recordLength or resolution:
    dataQ = "?dataset core:includesChronData|core:includesPaleoData ?data."

### Look for variable
## measuredVariable
measuredVarQ=""
if proxyObsType or archiveType or sensorGenus or sensorSpecies or interpName or interpDetail or resolution:
    measuredVarQ = "?data core:foundInMeasurementTable / core:includesVariable ?v."

## InferredVar
inferredVarQ=""
if infVarType or interpName or interpDetail or resolution:
    inferredVarQ = "?data core:foundInMeasurementTable / core:includesVariable ?v1."

### Archive Query
archiveTypeQ=""
if len(archiveType)>0:
    #add values for the archiveType
    query += "VALUES ?a {"
    for item in archiveType:
        query +="\""+item+"\" "
    query += "}\n"
    # Create the query    
    archiveTypeQ = """
#Archive Type query
{
    ?dataset wiki:Property-3AArchiveType ?a.
}UNION
{
    ?p core:proxyArchiveType / rdfs:label ?a.
}"""

### ProxyObservationQuery
proxyObsTypeQ=""
if len(proxyObsType)>0:
   #  add values for the proxyObservationType
   query+="VALUES ?b {"
   for item in proxyObsType:
       query += "\""+item+"\""
   query += "}\n"
   # Create the query
   proxyObsTypeQ="?v core:proxyObservationType/rdfs:label ?b." 

### InferredVariableQuery
infVarTypeQ=""
if len(infVarType)>0:
    query+="VALUES ?c {"
    for item in infVarType:
        query+="\""+item+"\""
    query+="}\n"
    # create the query
    infVarTypeQ="""
?v1 core:inferredVariableType ?t.
?t rdfs:label ?c.
"""  
### ProxySensorQuery
sensorQ=""
if len(sensorGenus)>0 or len(sensorSpecies)>0:
    sensorQ="""
?p core:proxySensorType ?sensor.    
"""    
## Genus query
genusQ=""
if len(sensorGenus)>0:
    query+="VALUES ?genus {"
    for item in sensorGenus:
        query+="\""+item+"\""
    query+="}\n"
    # create the query
    genusQ = "?sensor core:sensorGenus ?genus."
    
## Species query
speciesQ=""
if len(sensorSpecies)>0:
    query+=  "VALUES ?species {"
    for item in sensorSpecies:
        query+="\""+item+"\""
    query+="}\n"
    #Create the query
    speciesQ = "?sensor core:sensorSpecies ?species."  

### Proxy system query
proxySystemQ = ""
if len(archiveType)>0 or len(sensorGenus)>0 or len(sensorSpecies)>0:
    proxySystemQ="?v ?proxySystem ?p."
    
### Deal with interpretation 
## Make sure there is an interpretation to begin with
interpQ = ""
if len(interpName)>0 or len(interpDetail)>0:
    interpQ= """
{?v1 core:interpretedAs ?interpretation}
UNION
{?v core:interpretedAs ?interpretation}
"""
    
## Name
interpNameQ=""
if len(interpName)>0:
    query+= "VALUES ?intName {"
    for item in interpName:
        query+="\""+item+"\""
    query+=  "}\n"
    #Create the query
    interpNameQ = "?interpretation core:name ?intName."

## detail
interpDetailQ = ""
if len(interpDetail)>0:
    query+= "VALUES ?intDetail {"
    for item in interpDetail:
        query+="\""+item+"\""
    query+="}\n"
    #Create the query
    interpDetailQ = "?interpretation core:detail ?intDetail."  
    
### Age
## Units
ageUnitsQ = ""
if len(ageUnits)>0:
    query+= "VALUES ?units {"
    for item in ageUnits:
        query+="\""+item+"\""
    query+="}\n" 
    query+="""VALUES ?ageOrYear{"Age" "Year"}\n"""
    # create the query
    ageUnitsQ ="""    
?data core:foundInMeasurementTable / core:includesVariable ?v2.
?v2 core:inferredVariableType ?aoy.
?aoy rdfs:label ?ageOrYear.
?v2 core:hasUnits ?units .
"""  
## Minimum and maximum
ageQ = ""
if len(ageBound)>0 and len(recordLength)>0:
    ageQ="""
?v2 core:hasMinValue ?e1.
?v2 core:hasMaxValue ?e2.
filter(?e1<=""" +str(ageBound[0])+ """&& ?e2>="""+str(ageBound[1])+""" && abs(?e1-?e2)>="""+str(recordLength[0])+""").
"""  
elif len(ageBound)>0 and len(recordLength)==0:
    ageQ="""
?v2 core:hasMinValue ?e1.
?v2 core:hasMaxValue ?e2.
filter(?e1<=""" +str(ageBound[0])+ """&& ?e2>="""+str(ageBound[1])+""").
"""

### Resolution
resQ=""
if len(resolution)>0:
    resQ = """
{
?v core:hasResolution/(core:hasMeanValue |core:hasMedianValue) ?resValue.
filter (xsd:float(?resValue)<"""+str(resolution[0])+""")
}
UNION
{
?v1 core:hasResolution/(core:hasMeanValue |core:hasMedianValue) ?resValue1.
filter (xsd:float(?resValue1)<"""+str(resolution[0])+""")
}    
"""    

### Location
locQ=""
if lon or lat or alt:
       locQ = "?dataset core:collectedFrom ?z."
              
## Latitude
latQ=""
if len(lat)>0:
    latQ="""
?z <http://www.w3.org/2003/01/geo/wgs84_pos#lat> ?lat. 
filter(xsd:float(?lat)<"""+str(lat[1])+""" && xsd:float(?lat)>"""+str(lat[0])+""").     
"""

##Longitude
lonQ=""
if len(lon)>0:
    lonQ = """
?z <http://www.w3.org/2003/01/geo/wgs84_pos#long> ?long. 
filter(xsd:float(?long)<"""+str(lon[1])+""" && xsd:float(?long)>"""+str(lon[0])+""").   
"""

## Altitude
altQ=""
if len(alt)>0:
    altQ="""
?z <http://www.w3.org/2003/01/geo/wgs84_pos#alt> ?alt. 
filter(xsd:float(?alt)<"""+str(alt[1])+""" && xsd:float(?alt)>"""+str(alt[0])+""").       
"""    
        
query += """
?dataset a core:Dataset.  
"""+dataQ+"""
"""+measuredVarQ+"""
# By proxyObservationType
"""+proxyObsTypeQ+"""
"""+inferredVarQ+"""
# By InferredVariableType
"""+infVarTypeQ+"""
# Look for the proxy system model: needed for sensor and archive queries
"""+proxySystemQ+"""
# Sensor query
"""+sensorQ+"""
"""+genusQ+"""
"""+speciesQ+"""
# Archive query (looks in both places)
"""+archiveTypeQ+"""
# Interpretation query
"""+interpQ+"""
"""+interpNameQ+"""
"""+interpDetailQ+"""
# Age Query
"""+ageUnitsQ+"""
"""+ageQ+"""
# Location Query
"""+locQ+"""
#Latitude
"""+latQ+"""
#Longitude
"""+lonQ+"""
#Altitude
"""+altQ+"""
#Resolution Query
"""+resQ+"""
}"""

#print(query)
response = requests.post(url, data = {'query': query})
res = json.loads(response.text)

print("Click on the links below to access the datasets.")
            
for item in res['results']['bindings']:
    print (item['dataset']['value'])

Click on the links below to access the datasets.
http://wiki.linked.earth/Special:URIResolver/MD982181.Khider.2014
http://wiki.linked.earth/Special:URIResolver/BJ8-2D03-2D13GGC.Linsley.2010
http://wiki.linked.earth/Special:URIResolver/A7.Oppo.2005
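
The query cell above assembles each `VALUES` clause by string concatenation inside a loop. A minimal sketch of the same step written as a helper is shown below. The function name is hypothetical; unlike the loops above, it separates the terms with a space and escapes backslashes and double quotes so that a term containing a quote does not break the query.

```python
def values_clause(variable, terms):
    """Build a SPARQL VALUES clause binding `variable` to each quoted term."""
    quoted = []
    for term in terms:
        # Escape backslashes first, then double quotes
        escaped = term.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"' + escaped + '"')
    return "VALUES ?" + variable + " {" + " ".join(quoted) + "}\n"


print(values_clause("b", ["Mg/Ca", "Mg Ca"]))
```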


## <a name= help></a> Getting Help

If your query doesn't return any datasets (and you know it should), first check that all the query terms are spelled correctly (the search is case sensitive!).

If you're having trouble with this Notebook, [contact us](mailto:linkedearth@gmail.com). 