## Analyzing healthcare FHIR data with Amazon Neptune

This Jupyter notebook extends the walkthrough in the blog post [Analyzing healthcare FHIR data with Amazon Neptune](Link TBD). Complete setup steps 1-3 from the blog post before running the queries below.

## 1. Basic introduction to SPARQL

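SPARQL is the query language used to retrieve data from an RDF graph. An RDF graph is a set of triples. A triple is a statement consisting of a subject, a predicate, and an object. Subjects and predicates are typically identified by URIs; an object is either a URI or a literal value such as a string or a number. For further details, refer to the official [SPARQL specification](https://www.w3.org/TR/rdf-sparql-query/#construct).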

The following query retrieves ten triples from the Amazon Neptune graph database. Because the query has no ORDER BY clause, which ten triples are returned is arbitrary.

In [None]:
%%sparql --expand-all

SELECT *
WHERE
{ ?s ?p ?o . }
LIMIT 10

You can restrict which triples are retrieved by fixing the subject, predicate, and/or object. In the example below, only the subject is a variable. The pattern matches every subject that is related to the object <http://hl7.org/fhir/QuestionnaireResponse> via the predicate <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>, that is, every subject of type QuestionnaireResponse. Instead of returning all values, the query returns only the subjects: in this case, the identifiers of ten questionnaire responses.

In [None]:
%%sparql --expand-all

SELECT ?questionnaireResponse
WHERE
{ 
    ?questionnaireResponse <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://hl7.org/fhir/QuestionnaireResponse> .
}
LIMIT 10
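
To make the matching step concrete, here is a minimal Python sketch of how a single triple pattern selects triples. It is a toy in-memory model and says nothing about how Neptune evaluates queries; the `match` helper and the sample resource identifiers (`qr-1`, `qr-2`, `p-1`) are made up for illustration.

```python
# Toy in-memory graph: each triple is a (subject, predicate, object) tuple.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FHIR = "http://hl7.org/fhir/"

# Hypothetical sample data, not taken from the blog's dataset.
TRIPLES = [
    (FHIR + "QuestionnaireResponse/qr-1", RDF_TYPE, FHIR + "QuestionnaireResponse"),
    (FHIR + "QuestionnaireResponse/qr-2", RDF_TYPE, FHIR + "QuestionnaireResponse"),
    (FHIR + "Patient/p-1", RDF_TYPE, FHIR + "Patient"),
]


def match(pattern, triples):
    """Yield one binding dict per triple that matches the pattern.

    Terms starting with '?' are variables; every other term must match exactly.
    """
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice in one pattern must bind to the same value.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield bindings


# Counterpart of: SELECT * WHERE { ?s ?p ?o . }
all_rows = list(match(("?s", "?p", "?o"), TRIPLES))

# Counterpart of the query above: predicate and object are fixed, the subject is a variable.
responses = [
    row["?questionnaireResponse"]
    for row in match(("?questionnaireResponse", RDF_TYPE, FHIR + "QuestionnaireResponse"), TRIPLES)
]
print(responses)
```

Fixing more positions of the pattern narrows the result: the all-variable pattern returns every triple, while the typed pattern returns only the two questionnaire responses.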

For better readability, we introduce two prefixes, fhir and rdf, which abbreviate the full URIs in the WHERE clause.

In [None]:
%%sparql --expand-all

PREFIX fhir: <http://hl7.org/fhir/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

SELECT ?questionnaireResponse
WHERE
{ 
        ?questionnaireResponse rdf:type fhir:QuestionnaireResponse .
}
LIMIT 10
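
A prefixed name is only shorthand: the prefix is replaced by its namespace URI and the local part is appended. The Python sketch below illustrates that expansion under simplifying assumptions (no escaping, no default prefix); the `expand` helper is hypothetical and not part of any SPARQL library.

```python
# Prefix declarations copied from the query above.
PREFIXES = {
    "fhir": "http://hl7.org/fhir/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Replace the declared prefix with its namespace URI and append the local part."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"undeclared prefix in {prefixed_name!r}")
    return prefixes[prefix] + local


print(expand("rdf:type"))
print(expand("fhir:QuestionnaireResponse"))
```

The two expanded names are exactly the full URIs used in the earlier query without prefixes, so both queries describe the same pattern.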

Instead of SELECT, you can use CONSTRUCT to return a new RDF graph. The template in the CONSTRUCT clause defines the triples that make up this graph.

You can use slashes to chain multiple predicates into a path that the query follows in sequence.

The query below constructs a new graph from the mapping of patients to questionnaire responses. Navigate to the *Graph* tab to view the graph visualization.

In [None]:
%%sparql --expand-all

PREFIX fhir: <http://hl7.org/fhir/>
PREFIX qr: <http://hl7.org/fhir/QuestionnaireResponse.>

CONSTRUCT   { 
    ?questionnaireResponse fhir:value ?patient .
    
    ?questionnaireResponse a fhir:QuestionnaireResponse .
    ?patient a fhir:Patient .
}
WHERE       { 
    ?questionnaireResponse qr:subject/fhir:Reference.reference/fhir:value ?patient .
}
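
The slash-separated path in the WHERE clause can be read as "follow the first predicate, then the second, then the third". The following Python sketch mimics that traversal on a toy graph. The intermediate node names (`_:ref`, `_:val`) and the `follow_path` helper are assumptions for illustration, and the sketch returns a set, so it ignores how SPARQL treats duplicate solutions.

```python
QR = "http://hl7.org/fhir/QuestionnaireResponse."
FHIR = "http://hl7.org/fhir/"

# Hypothetical sample data: a questionnaire response that points to a patient
# through two intermediate nodes.
TRIPLES = [
    ("qr-1", QR + "subject", "_:ref"),
    ("_:ref", FHIR + "Reference.reference", "_:val"),
    ("_:val", FHIR + "value", "Patient/p-1"),
]


def follow_path(start, predicates, triples):
    """Return the set of nodes reached from `start` by following each predicate in order."""
    frontier = {start}
    for predicate in predicates:
        # Move one hop: keep the objects of triples whose subject is in the frontier.
        frontier = {o for s, p, o in triples if s in frontier and p == predicate}
    return frontier


# Counterpart of: qr:subject/fhir:Reference.reference/fhir:value
patients = follow_path(
    "qr-1",
    [QR + "subject", FHIR + "Reference.reference", FHIR + "value"],
    TRIPLES,
)
print(patients)
```

Without the path syntax, the same traversal would need one triple pattern and one helper variable per hop.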

## 2. Sample Queries

### Identify patients who work(ed) in the same industry

In [4]:
%%sparql --expand-all

PREFIX fhir: <http://hl7.org/fhir/>
PREFIX qr: <http://hl7.org/fhir/QuestionnaireResponse.>

CONSTRUCT {
    ?questionnaireResponse fhir:value ?patient ;
        fhir:value ?industryAnswer .
    
    ?questionnaireResponse a fhir:QuestionnaireResponse .
    ?patient a fhir:Patient .
}
WHERE {
    ?questionnaireResponse qr:subject/fhir:Reference.reference/fhir:value ?patient ;
        qr:item/qr:item.item ?item8_2 .
    ?item8_2 qr:item.item.answer/qr:item.item.answer.valueString/fhir:value ?industryAnswer ;
       qr:item.item.linkId/fhir:value "8.2" .
}

### Identify practitioners who authored questionnaires of patients working in the same industry

In [None]:
%%sparql --expand-all

PREFIX fhir: <http://hl7.org/fhir/>
PREFIX qr: <http://hl7.org/fhir/QuestionnaireResponse.>

CONSTRUCT {
    ?questionnaireResponse fhir:value ?patient ;
        fhir:value ?practitioner ;
        fhir:value ?industryAnswer .
    
    ?questionnaireResponse a fhir:QuestionnaireResponse .
    ?patient a fhir:Patient .
    ?practitioner a fhir:Practitioner .
}
WHERE {
    ?questionnaireResponse qr:subject/fhir:Reference.reference/fhir:value ?patient ;
                           qr:author/fhir:Reference.reference/fhir:value ?practitioner ;
                           qr:item/qr:item.item ?item8_2 .
    ?item8_2 qr:item.item.answer/qr:item.item.answer.valueString/fhir:value ?industryAnswer ;
       qr:item.item.linkId/fhir:value "8.2" .
}

### Identify industries with common hazards

In [5]:
%%sparql --expand-all

PREFIX fhir: <http://hl7.org/fhir/>
PREFIX qr: <http://hl7.org/fhir/QuestionnaireResponse.>


CONSTRUCT {
    ?parentItem8 fhir:value ?industryAnswer ;
                 fhir:value ?hazardAnswer .
    
   # ?parentItem8 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://hl7.org/fhir/QuestionnaireResponse> .
   # ?hazardAnswer <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://hl7.org/fhir/Other> .
   # ?industryAnswer <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://hl7.org/fhir/Other> .
}
WHERE {
    ?industryAnswer ^fhir:value/^qr:item.item.answer.valueString/^qr:item.item.answer ?item8_2 .
    ?item8_2 qr:item.item.linkId/fhir:value "8.2" ;
             ^qr:item.item ?parentItem8 .
    ?parentItem8 qr:item.item ?item8_3 .
    ?item8_3 qr:item.item.linkId/fhir:value "8.3" ;
             qr:item.item.answer/qr:item.item.answer.valueString/fhir:value ?hazardAnswer .
             
    FILTER('None' != ?hazardAnswer)
}
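
The first pattern in the WHERE clause uses `^` to walk predicates backwards, from an answer value up to the item that contains it. A minimal Python sketch of that inverse traversal on a toy graph follows; the node names, the sample answer "Construction", and the `follow_path` helper are hypothetical.

```python
QR = "http://hl7.org/fhir/QuestionnaireResponse."
FHIR = "http://hl7.org/fhir/"

# Hypothetical sample data: item 8.2 holds one string answer.
TRIPLES = [
    ("_:item8_2", QR + "item.item.answer", "_:answer"),
    ("_:answer", QR + "item.item.answer.valueString", "_:valueString"),
    ("_:valueString", FHIR + "value", "Construction"),
]


def follow_path(start, steps, triples):
    """Follow (predicate, inverse) steps in order and return the set of reached nodes.

    A step with inverse=True walks from object to subject, like '^' in SPARQL.
    """
    frontier = {start}
    for predicate, inverse in steps:
        if inverse:
            frontier = {s for s, p, o in triples if o in frontier and p == predicate}
        else:
            frontier = {o for s, p, o in triples if s in frontier and p == predicate}
    return frontier


# Counterpart of: ^fhir:value/^qr:item.item.answer.valueString/^qr:item.item.answer
items = follow_path(
    "Construction",
    [
        (FHIR + "value", True),
        (QR + "item.item.answer.valueString", True),
        (QR + "item.item.answer", True),
    ],
    TRIPLES,
)
print(items)
```

Walking the same three predicates forwards from `_:item8_2` leads back to the answer value, which is the direction the other sample queries use.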

### Get questionnaires whose answers to a question group are similar to those of a single questionnaire

In [None]:
%%sparql

PREFIX fhir: <http://hl7.org/fhir/>
PREFIX qr: <http://hl7.org/fhir/QuestionnaireResponse.>

SELECT ?similarQR  (count(?sameAnswerValue) as ?sameAnswerCount) 
WHERE {
    <http://hl7.org/fhir/QuestionnaireResponse/d2fe10a6-aba3-4cfa-9ed7-39ef2c67dfe3> qr:item ?parentItem2_a .
    ?parentItem2_a qr:item.linkId/fhir:value "2" ;
       qr:item.item ?subItem_a .
    ?subItem_a qr:item.item.answer/qr:item.item.answer.valueInteger/fhir:value ?sameAnswerValue ;
       qr:item.item.text/fhir:value ?question .
    
    ?similarQR qr:item ?parentItem2_b .
    ?parentItem2_b qr:item.linkId/fhir:value "2" ;
       qr:item.item ?subItem_b .
    ?subItem_b qr:item.item.answer/qr:item.item.answer.valueInteger/fhir:value ?sameAnswerValue ;
       qr:item.item.text/fhir:value ?question .
}
GROUP BY ?similarQR
HAVING (COUNT(?sameAnswerValue) > 3)
ORDER BY DESC(?sameAnswerCount)
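
The grouping logic of this query can be sketched in plain Python: count, per questionnaire, how many questions have the same answer as in the reference questionnaire, keep those with more than three matches, and sort by that count. The sketch assumes one integer answer per question and unique question texts; the `similar_questionnaires` helper and the sample data are hypothetical.

```python
def similar_questionnaires(reference_id, answers, min_shared=3):
    """Return (questionnaire_id, shared_count) pairs with more than `min_shared`
    answers identical to the reference, sorted by shared_count in descending order.

    `answers` maps questionnaire id -> {question text: integer answer}.
    """
    reference = answers[reference_id]
    counts = {}
    for qr_id, qr_answers in answers.items():
        # Same question and same value, like joining on ?question and ?sameAnswerValue.
        shared = sum(1 for question, value in qr_answers.items() if reference.get(question) == value)
        if shared > min_shared:  # mirrors the HAVING clause
            counts[qr_id] = shared
    # Mirrors ORDER BY DESC(?sameAnswerCount); ties keep their input order.
    return sorted(counts.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data for question group "2".
ANSWERS = {
    "ref": {"q1": 1, "q2": 2, "q3": 3, "q4": 4},
    "a": {"q1": 1, "q2": 2, "q3": 3, "q4": 4},
    "b": {"q1": 1, "q2": 2, "q3": 3, "q4": 0},
}

result = similar_questionnaires("ref", ANSWERS)
print(result)
```

As in the SPARQL query, the reference questionnaire matches itself, so it appears in the result alongside the questionnaires that are similar to it.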