# Run ground truth examples and get expected results
We have a set of ground truth Uniprot queries in `resources/ground-truth.yaml`. That file maps each natural language question to the equivalent SPARQL query to run on the Uniprot dataset. Here is an excerpt of that file:

```
- question: Show me all transmembrane regions
  SPARQL: |
    SELECT  ?protein ?begin ?end
    WHERE
    {
      ?protein  rdf:type     up:Protein ;
                up:annotation  ?annotation .
      ?annotation  rdf:type  up:Transmembrane_Annotation .
      ?annotation up:range ?range .
      ?range faldo:begin ?begin .
      ?range faldo:end ?end .
    }
```

In this notebook we simply run all the SPARQL queries to gather their results. Effectively we are retrieving "ground truth" results. In another notebook, we will ask the LLM to generate SPARQL queries from natural language queries. To measure how effectively the LLM generated the queries, we will compare the results of those queries to the ground truth results here. 
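
One way the later comparison could work is to reduce each result to a set of rows and test the sets for equality. The sketch below assumes results follow the W3C SPARQL 1.1 Query Results JSON format (`results.bindings`). Whether the helpers in `utilities` return exactly this shape is an assumption, and `rows_from_results` and `same_rows` are hypothetical names, not functions from this repository.

```python
def rows_from_results(res):
    """Turn a SPARQL JSON result into a set of hashable rows."""
    bindings = res.get("results", {}).get("bindings", [])
    return {
        # Each row becomes a frozenset of (variable, value) pairs.
        frozenset((var, cell["value"]) for var, cell in row.items())
        for row in bindings
    }


def same_rows(expected, generated):
    """True when both results contain the same rows, in any order."""
    return rows_from_results(expected) == rows_from_results(generated)


# Illustrative data only, not real query output.
expected = {
    "head": {"vars": ["protein"]},
    "results": {"bindings": [
        {"protein": {"type": "uri", "value": "http://example.org/protein/A"}},
        {"protein": {"type": "uri", "value": "http://example.org/protein/B"}},
    ]},
}
generated = {
    "head": {"vars": ["protein"]},
    "results": {"bindings": [
        {"protein": {"type": "uri", "value": "http://example.org/protein/B"}},
        {"protein": {"type": "uri", "value": "http://example.org/protein/A"}},
    ]},
}

print(same_rows(expected, generated))
```

This comparison ignores row order but is sensitive to variable names, so a generated query that names its variables differently would need a looser check.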

We run on two RDF databases:

1. The public Uniprot SPARQL endpoint: https://sparql.uniprot.org/sparql
2. Our own Neptune database cluster, loaded with Uniprot data. If you want to set this up in your account, follow the steps in README.md and run the notebook uniprot_loader.ipynb to load the files.

See README.md for instructions on how to set up the notebook, Neptune cluster, and Bedrock to run these examples.
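
For orientation, a query can be sent to the public endpoint as a standard SPARQL 1.1 Protocol POST request. The following is a minimal sketch using only the Python standard library. It is not how the helpers in `utilities` are implemented (that code is not shown here), and `build_sparql_request` is a hypothetical name. It does not cover Neptune, which may need additional authentication depending on how the cluster is configured.

```python
import urllib.parse
import urllib.request

ENDPOINT = "https://sparql.uniprot.org/sparql"


def build_sparql_request(sparql, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL Protocol POST request."""
    body = urllib.parse.urlencode({"query": sparql}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            # Ask for the standard JSON results format.
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


req = build_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 1")
print(req.get_method(), req.full_url)

# To actually send it (requires network access):
# import json
# with urllib.request.urlopen(req, timeout=60) as resp:
#     results = json.load(resp)
```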


## Define functions to run the tests

In [None]:
import yaml
import boto3
import utilities as u
import importlib
from pathlib import Path

# Load the ground truth examples and the shared SPARQL prefixes
resources = Path.cwd() / "resources"
pfx = (resources / "prefixes.txt").read_text()
ground_truth = yaml.safe_load((resources / "ground-truth.yaml").read_text())
# boto3 session passed to the Neptune helper below
session = boto3.Session()

# Build a SPARQL statement that consists of: 1/ a big set of prefixes, 2/ the ground-truth SPARQL query (which does not include prefixes), 3/ a limit
def form_sparql(q):
    return f"""
    {pfx}

    {q['SPARQL']}

    LIMIT 20
""".strip()
    
# Run a single ground truth test against the Neptune cluster.
# index is the zero-based position of one example in ground-truth.yaml.
# Results are written to the file expected/{index}.json, overwriting any previous result.
def run_one_test(index):

    q = ground_truth[index]
    nlq=q['question']
    sparql=form_sparql(q)

    try:
        res=u.execute_sparql(sparql, session) 
        u.write_sparql_res("expected", str(index), nlq, q['SPARQL'], sparql, res, "")
    except Exception as e:
        print(f"Error on {index}")
        print("Exception: {}".format(type(e).__name__))
        print("Exception message: {}".format(e))
        error_msg="Exception message: {}".format(e).replace("\n", " ")
        u.write_sparql_res("expected", str(index), nlq, q['SPARQL'], sparql, {}, error_msg)
        print(error_msg)

# Run all ground truth queries against the Neptune cluster. Results are written to expected/*.json.
def run_tests():
    for index, q in enumerate(ground_truth):
        run_one_test(index)

# Run a single ground truth test against the public Uniprot SPARQL endpoint.
# index is the zero-based position of one example in ground-truth.yaml.
# Results are written to the file up/{index}.json, overwriting any previous result.
def run_one_up_test(index):

    q = ground_truth[index]
    nlq=q['question']
    sparql=form_sparql(q)

    try:
        res=u.execute_sparql_uniprotref(sparql)
        u.write_sparql_res("up", str(index), nlq, q['SPARQL'], sparql, res, "")
    except Exception as e:
        print(f"Error on {index}")
        print("Exception: {}".format(type(e).__name__))
        print("Exception message: {}".format(e))
        error_msg="Exception message: {}".format(e).replace("\n", " ")
        u.write_sparql_res("up", str(index), nlq, q['SPARQL'], sparql, {}, error_msg)
        print(error_msg)

# Run all ground truth queries against the public Uniprot SPARQL endpoint. Results are written to up/*.json.
def run_up_tests():
    for index, q in enumerate(ground_truth):
        run_one_up_test(index)
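
To see what `form_sparql` produces, the sketch below repeats the function with stand-in values for `pfx` and for one ground truth entry. The single prefix line and the example query are illustrative assumptions, not the contents of `resources/prefixes.txt` or `ground-truth.yaml`.

```python
# Stand-in values; the notebook reads the real ones from the resources folder.
pfx = "PREFIX up: <http://purl.uniprot.org/core/>"
example = {
    "question": "Show me all proteins",  # hypothetical entry
    "SPARQL": "SELECT ?protein\nWHERE { ?protein a up:Protein . }",
}


def form_sparql(q):
    # Same shape as the notebook's function: prefixes, query, then a limit.
    return f"""
    {pfx}

    {q['SPARQL']}

    LIMIT 20
""".strip()


assembled = form_sparql(example)
print(assembled)
```

Because `LIMIT 20` is appended unconditionally, a ground truth query that already ends with its own `LIMIT` clause would produce invalid SPARQL and would need separate handling.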


In [None]:
importlib.reload(u)

## Make directories for expected and up results

In [None]:
%%bash
rm -rf expected up
mkdir -p expected up

## Run each of the ground truth queries on the Uniprot reference site
This will take several minutes.

In [None]:
run_up_tests()

## Optional: Run each of the ground truth queries on Neptune
This will take several minutes.

In [None]:
run_tests()

## Run just one test
A convenient way to re-run a failed test.
For example, re-run test 4 against Neptune or test 34 against the Uniprot reference endpoint.

Each index is the zero-based position of a question in `resources/ground-truth.yaml`.
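
To find the index of a particular question, one option is to search the loaded list by text. The sketch below uses a small stand-in list in place of the real `ground_truth`; `find_indices` is a hypothetical helper, and the second entry and both positions are made up for illustration.

```python
# Stand-in for the list loaded from resources/ground-truth.yaml.
sample_ground_truth = [
    {"question": "Show me all transmembrane regions", "SPARQL": "..."},
    {"question": "A second, hypothetical question", "SPARQL": "..."},
]


def find_indices(entries, text):
    """Return indices of entries whose question contains text (case-insensitive)."""
    needle = text.lower()
    return [i for i, q in enumerate(entries) if needle in q["question"].lower()]


matches = find_indices(sample_ground_truth, "transmembrane")
print(matches)
```

Any index returned this way can be passed directly to `run_one_test` or `run_one_up_test`.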

In [None]:
run_one_test(4)

In [None]:
run_one_up_test(34)