# Query results

Load libraries and SPARQL queries, and send them to RDFox to be answered against the pre-converted data:

In [1]:
from pathlib import Path
import pandas as pd
from probs_ontology.runner.probs_runner import probs_query_data

queries = {
    p.stem: p.read_text()
    for p in Path("queries").glob("*.rq")
}

results = probs_query_data("../data/probs_data.nt.gz", queries)
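
Later cells index `results` by names such as `"original_bgs"`, which match the stems of the files in `queries/`. As a minimal sketch of how the dictionary comprehension above produces those keys, the following uses a throwaway directory; the helper name `load_queries`, the file names, and the query text are all made up for illustration and are not part of the project.

```python
import tempfile
from pathlib import Path


def load_queries(directory):
    """Map each *.rq file's stem (file name without extension) to its text."""
    return {p.stem: p.read_text() for p in Path(directory).glob("*.rq")}


# Illustrative only: a temporary directory with one query file and one other file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "example_query.rq").write_text("SELECT * WHERE { ?s ?p ?o }")
    (Path(tmp) / "notes.txt").write_text("not a query, so the glob skips it")
    demo_queries = load_queries(tmp)

# Only the .rq file is picked up, keyed by its stem.
print(sorted(demo_queries))
```

Keying by stem means that adding a new `.rq` file to the directory makes it available under its file name without changing the loading code.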

In [2]:
obs_short_labels = {
    "https://ukfires.org/probs/ontology/data/bgs/Observation-29cc1ee823612f1307925b7c5b003feb9668a06cb991da0b6b9af30033fde2a0": "Obs 2",
    "https://ukfires.org/probs/ontology/ComposedInferredObservation--7af16db03ac2ac2a9a773645b069b7dacffa239f5b98b9887fa4a0323b787ce7": "Obs 3",
    "https://ukfires.org/probs/ontology/ComposedInferredObservation-prodcom/2017/Object-0cefb7bea0582e08d7878e4c3f684c2307edb305bfddd5e3ea6f3efb8f9b02c1-d92d8d1a049b5c171ed7dfde5057cddf9984bb5fdd104af98b622c1be88800a7": "Obs 1",
    "https://ukfires.org/probs/ontology/data/bgs/Observation-c2bb6910b2b19133e460750a4dd799924afd9c4aceae0736aea91635592cd1ff": "Obs 4",
    "https://ukfires.org/probs/ontology/ComposedInferredObservation-prodcom/2017/Object-00613791c18e3cf39874c66a176e7229189e5fed28a45ef7921e3f97e9143eab-e9398b5c9aa49bcd1b2f0dc8fa74b41a9faa021848620496ad682721f2cf9a27": "Obs 5",
    "https://ukfires.org/probs/ontology/ComposedInferredObservation-prodcom/2017/Object-00613791c18e3cf39874c66a176e7229189e5fed28a45ef7921e3f97e9143eab-e53dd7ea6154ae201c77e77a2b7260c5304cfd12b67be85965c4089720d9fa19": "Obs 6"
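}


def tidydf(results):
    """Convert query results to a DataFrame, shortening known observation IRIs."""
    df = pd.DataFrame(results)
    if "Observation" in df:
        # Swap long observation IRIs for short labels; unknown IRIs are left unchanged.
        df["Observation"] = [obs_short_labels.get(str(x), x) for x in df["Observation"]]
    return df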

## Retrieve original data points

First, let's check that we can retrieve the original data points in a consistent way. The following query retrieves those from the BGS Minerals Yearbook:

```{literalinclude} queries/original_bgs.rq
:language: sparql
```

In [3]:
tidydf(results["original_bgs"])

|   | Object | Year | Value |
|---|---|---|---|
| 0 | Crushed stone in BGS | 2014-01-01 | 99000000000.0 |
| 1 | Igneous rock in BGS | 2014-01-01 | 38000000000.0 |
| 2 | Limestone & Dolomite in BGS | 2014-01-01 | 53000000000.0 |
| 3 | Sand & Gravel in BGS | 2014-01-01 | 56000000000.0 |
| 4 | Sandstone in BGS | 2014-01-01 | 10000000000.0 |


We can also retrieve original data points from Prodcom that are linked to a classification code:

```{literalinclude} queries/original_prodcom.rq
:language: sparql
```

In [4]:
tidydf(results["original_prodcom"])

|   | Object | Year | Value |
|---|---|---|---|
| 0 | PRODCOM Object from Code 08121210 | 2014-01-01 | 24192579000 |
| 1 | PRODCOM Object from Code 08121210 | 2017-01-01 | 31355644000 |


## Inferred observations

Now let's query for all the observations that would be relevant to modelling production of two object types, {system:ref}`CrushedStone` and {system:ref}`SandAndGravel`, in all years that are available. Here is the SPARQL query:

```{literalinclude} queries/object_observations.rq
:language: sparql
```

And the results:

In [5]:
tidydf(results["object_observations"])

|   | Observation | Object | Year | Bound | Value |
|---|---|---|---|---|---|
| 0 | Obs 3 | Crushed stone | 2014-01-01 | https://ukfires.org/probs/ontology/ExactBound | 101000000000.0 |
| 1 | Obs 2 | Crushed stone | 2014-01-01 | https://ukfires.org/probs/ontology/ExactBound | 99000000000.0 |
| 2 | Obs 1 | Crushed stone | 2018-01-01 | https://ukfires.org/probs/ontology/ExactBound | 69722551000.0 |
| 3 | Obs 5 | Sand & Gravel | 2014-01-01 | https://ukfires.org/probs/ontology/ExactBound | 51010054000.0 |
| 4 | Obs 4 | Sand & Gravel | 2014-01-01 | https://ukfires.org/probs/ontology/ExactBound | 56000000000.0 |


We can also query for the original data points that each of these values was derived from:

```{literalinclude} queries/prov.rq
:language: sparql
```

This results in:

In [6]:
df = tidydf(results["prov"])
df

|   | Observation | WDFObject | WDFValue |
|---|---|---|---|
| 0 | Obs 3 | Igneous rock in BGS | 38000000000.0 |
| 1 | Obs 3 | Limestone & Dolomite in BGS | 53000000000.0 |
| 2 | Obs 3 | Sandstone in BGS | 10000000000.0 |
| 3 | Obs 5 | PRODCOM Object from Code 08121190 | 26817475000.0 |
| 4 | Obs 5 | PRODCOM Object from Code 08121210 | 24192579000.0 |
| 5 | Obs 1 | PRODCOM Object from Code 08121230 | 68391115000.0 |
| 6 | Obs 1 | PRODCOM Object from Code 08121250 | 0.0 |
| 7 | Obs 1 | PRODCOM Object from Code 08121290 | 1331436000.0 |
| 8 | Obs 2 | Crushed stone in BGS | 99000000000.0 |
| 9 | Obs 4 | Sand & Gravel in BGS | 56000000000.0 |


Summing the contributing values for each observation does indeed give the values shown above:

In [7]:
df.groupby("Observation", as_index=False)["WDFValue"].sum()

|   | Observation | WDFValue |
|---|---|---|
| 0 | Obs 1 | 69722551000.0 |
| 1 | Obs 2 | 99000000000.0 |
| 2 | Obs 3 | 101000000000.0 |
| 3 | Obs 4 | 56000000000.0 |
| 4 | Obs 5 | 51010054000.0 |
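
The comparison can also be made explicit rather than checked by eye. The sketch below is written in plain Python so that it runs outside the notebook; the values are copied by hand from the `object_observations` and `prov` outputs above, and the variable names are chosen here for illustration.

```python
from collections import defaultdict

# Observation values copied from the `object_observations` output.
observed = {
    "Obs 1": 69722551000.0,
    "Obs 2": 99000000000.0,
    "Obs 3": 101000000000.0,
    "Obs 4": 56000000000.0,
    "Obs 5": 51010054000.0,
}

# (Observation, WDFValue) pairs copied from the `prov` output.
contributions = [
    ("Obs 3", 38000000000.0),
    ("Obs 3", 53000000000.0),
    ("Obs 3", 10000000000.0),
    ("Obs 5", 26817475000.0),
    ("Obs 5", 24192579000.0),
    ("Obs 1", 68391115000.0),
    ("Obs 1", 0.0),
    ("Obs 1", 1331436000.0),
    ("Obs 2", 99000000000.0),
    ("Obs 4", 56000000000.0),
]

# Sum the contributing values per observation.
totals = defaultdict(float)
for obs, value in contributions:
    totals[obs] += value

# Collect any observation whose summed contributions differ from its value.
mismatches = {
    obs: (totals[obs], expected)
    for obs, expected in observed.items()
    if totals[obs] != expected
}
print(mismatches)
```

An empty `mismatches` dictionary means every observation's value equals the sum of its contributing data points. Exact equality is safe here because all the values are whole numbers well within the range that floats represent exactly.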


## Further aggregation

We can further query for the aggregated observations of {system:ref}`Aggregates`:

```{literalinclude} queries/object_observations_aggregates.rq
:language: sparql
```

This results in:

In [8]:
tidydf(results["object_observations_aggregates"])

|   | Year | Bound | Value |
|---|---|---|---|
| 0 | 2014-01-01 | https://ukfires.org/probs/ontology/ExactBound | 157000000000.0 |
| 1 | 2014-01-01 | https://ukfires.org/probs/ontology/ExactBound | 150010054000.0 |
| 2 | 2014-01-01 | https://ukfires.org/probs/ontology/ExactBound | 152010054000.0 |
| 3 | 2014-01-01 | https://ukfires.org/probs/ontology/ExactBound | 155000000000.0 |
| 4 | 2018-01-01 | https://ukfires.org/probs/ontology/LowerBound | 69722551000.0 |
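
The four 2014 values appear to correspond to every pairing of one crushed stone observation with one sand and gravel observation from the earlier `object_observations` output. The sketch below checks that arithmetic with values copied by hand from the outputs above. It describes the numbers only, not how the inference rules are implemented, and the variable names are chosen here for illustration.

```python
from itertools import product

# 2014 observations copied from the `object_observations` output.
crushed_stone_2014 = {"Obs 2": 99000000000.0, "Obs 3": 101000000000.0}
sand_and_gravel_2014 = {"Obs 4": 56000000000.0, "Obs 5": 51010054000.0}

# 2014 values copied from the `object_observations_aggregates` output.
reported_2014 = {
    157000000000.0,
    150010054000.0,
    152010054000.0,
    155000000000.0,
}

# Add each crushed stone observation to each sand and gravel observation.
combined = {
    (stone, sand): crushed_stone_2014[stone] + sand_and_gravel_2014[sand]
    for stone, sand in product(crushed_stone_2014, sand_and_gravel_2014)
}

for pair, total in sorted(combined.items()):
    print(pair, total)

# True when the pairwise sums are exactly the reported 2014 values.
print(set(combined.values()) == reported_2014)
```

The 2018 row carries the same value as the single crushed stone observation for that year (Obs 1), with no matching sand and gravel observation in the output above, which is consistent with it being reported as a `LowerBound` rather than an `ExactBound`.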
