# RDF-centered conversion
This notebook converts ZIB triples to CAPACITY codebook fields.

In [1]:
from rdflib import Graph, URIRef, Namespace, Literal
from rdflib.namespace import RDFS, SKOS, RDF, OWL
from pathlib import Path

## ZIB ontology
A small sample ontology for describing ZIB concepts.

It contains the following ZIB classes:

| Name          | Description  |
|---------------|--------------|
| [MedicationUse](https://zibs.nl/wiki/MedicationUse2-v1.1.1(2020EN)) |    |
| [PharmaceuticalProduct](https://zibs.nl/wiki/PharmaceuticalProduct-v2.1.2(2020EN)) | Partial information model used in MedicationUse |



In [2]:
ZIB = Namespace('http://example.org/ZIB#')
zib_ontology_path = '../../ZIB/zib.owl'
zib_ontology = Graph(identifier=ZIB).parse(zib_ontology_path)

print(zib_ontology.serialize(format='turtle').decode())

@prefix : <http://example.org/ZIB#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<http://example.org/ZIB> a owl:Ontology .

:MedicationUse a owl:Class ;
    rdfs:subClassOf :ZibConcept ;
    owl:disjointWith :PharmaceuticalProduct .

:hasZibRecord a owl:FunctionalProperty,
        owl:InverseFunctionalProperty,
        owl:ObjectProperty ;
    rdfs:subPropertyOf :zibProperty .

:medicationCode a owl:FunctionalProperty,
        owl:InverseFunctionalProperty,
        owl:ObjectProperty ;
    rdfs:subPropertyOf :zibProperty .

:PharmaceuticalProduct a owl:Class ;
    rdfs:subClassOf :ZibConcept .

:ZibConcept a owl:Class .

:zibProperty a owl:ObjectProperty ;
    rdfs:subPropertyOf owl:topObjectProperty .




# Sample records
Using the ZIB ontology, we create sample records for patient Bob, who uses two medications: *quinidine*, which has code *C01BA01* in the ATC code system, and *atenolol*, with code *C07AB03*.
This requires two *MedicationUse* records.

In [3]:
# Namespaces
ATC = Namespace('http://purl.bioontology.org/ontology/ATC/')
UATC = Namespace('http://purl.bioontology.org/ontology/UATC/')


patient = Namespace('http://example.org/patient/')
zib_record = Namespace('http://example.org/zib_record/')

# Sample medicationUse zib
patient_graph = Graph()
patient_graph.bind('ZIB', ZIB)
patient_graph.bind('ATC', ATC)
patient_graph.bind('UATC', UATC)
patient_graph.bind('RDFS', RDFS)
patient_graph.bind('SKOS', SKOS)
patient_graph.bind('OWL', OWL)

# Quinidine record
record_ref = zib_record.bobsmedication
patient_graph.add((patient.bob, ZIB.hasZibRecord, record_ref))
patient_graph.add((record_ref, RDF.type, ZIB.MedicationUse))
patient_graph.add((record_ref, ZIB.medicationCode, UATC.C01BA01))

# Atenolol record
record_ref = zib_record.bobsOtherMedication
patient_graph.add((patient.bob, ZIB.hasZibRecord, record_ref))
patient_graph.add((record_ref, RDF.type, ZIB.MedicationUse))
patient_graph.add((record_ref, ZIB.medicationCode, UATC.C07AB03))


print(patient_graph.serialize(format='turtle').decode())

@prefix UATC: <http://purl.bioontology.org/ontology/UATC/> .
@prefix ZIB: <http://example.org/ZIB#> .

<http://example.org/patient/bob> ZIB:hasZibRecord <http://example.org/zib_record/bobsOtherMedication>,
        <http://example.org/zib_record/bobsmedication> .

<http://example.org/zib_record/bobsOtherMedication> a ZIB:MedicationUse ;
    ZIB:medicationCode UATC:C07AB03 .

<http://example.org/zib_record/bobsmedication> a ZIB:MedicationUse ;
    ZIB:medicationCode UATC:C01BA01 .




# Adding ATC and ZIB ontology


In [4]:
from zib_uploader.tools import get

atc_path = 'atc.ttl'
atc_url = 'http://data.bioontology.org/ontologies/ATC/submissions/12/download?apikey=8b5b7825-538d-40e0-9e9e-5ab9274a9aeb'

atc_file = get(atc_url)

patient_graph.parse(str(atc_file), format='turtle')

patient_graph = patient_graph + zib_ontology

Now we need to check whether we can use the ATC ontology to infer additional information about Bob's medication use.

A *DeductiveClosure* is an expansion of the knowledge base with all knowledge that can logically be derived from the original knowledge.

In [5]:
# import owlrl

# owlrl.DeductiveClosure(owlrl.CombinedClosure.RDFS_OWLRL_Semantics).expand(patient_graph)

# query = \
# '''
# select ?medication ?superclass
# where {    
#    <http://example.org/patient/bob> ZIB:hasZibRecord ?zibRecord .
#     ?zibRecord ZIB:medicationCode ?medication .
#     ?medication RDFS:subClassOf ?superclass
# }
# '''

# result = patient_graph.query(query)

# for r in result:
#     print(f'Medication {r[0]} is subclass of {r[1]}')
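
To make the idea of a deductive closure concrete, here is a minimal pure-Python sketch. It is not the `owlrl` implementation: it only applies two RDFS rules (subclass transitivity and type inheritance) to a set of triples until nothing new can be derived. The string-based triples, the `closure` helper and the `facts` set are illustrative assumptions; the codes are the ones used for Bob's quinidine record.

```python
# Toy deductive closure: apply rules until a fixed point is reached.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def closure(triples):
    """Return the triples plus everything two RDFS rules can derive from them."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # Only subClassOf triples that start where (s, p, o) ends are relevant.
                if p2 != SUBCLASS or s2 != o:
                    continue
                if p == SUBCLASS:
                    # Transitivity: A subClassOf B, B subClassOf C => A subClassOf C
                    new.add((s, SUBCLASS, o2))
                elif p == TYPE:
                    # Inheritance: x type B, B subClassOf C => x type C
                    new.add((s, TYPE, o2))
        if new <= inferred:
            return inferred
        inferred |= new


# Assumed direct subclass links (illustrative, not loaded from the ATC file).
facts = {
    ("UATC:C01BA01", SUBCLASS, "UATC:C01BA"),
    ("UATC:C01BA", SUBCLASS, "UATC:C01B"),
    ("UATC:C01B", SUBCLASS, "UATC:C01"),
}
expanded = closure(facts)
derived = sorted(expanded - facts)
for triple in derived:
    print(triple)
```

A real reasoner materializes the derived triples in the graph in the same spirit, which is why the expanded graph can become much larger than the original.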

It turns out that transitive relations can also be followed without a reasoning engine: appending `*` to a property in a SPARQL query turns it into a property path that matches zero or more steps along that property.

If this is the most complicated reasoning we need, a reasoning engine is unnecessary.

In [6]:
query = \
'''
select ?medication ?superclass
where {    
   <http://example.org/patient/bob> ZIB:hasZibRecord ?zibRecord .
    ?zibRecord ZIB:medicationCode ?medication .
    ?medication RDFS:subClassOf* ?superclass
}
'''

result = patient_graph.query(query)

for r in result:
    print(f'Medication {r[0]} is subclass of {r[1]}')

Medication http://purl.bioontology.org/ontology/UATC/C01BA01 is subclass of http://purl.bioontology.org/ontology/UATC/C01BA01
Medication http://purl.bioontology.org/ontology/UATC/C01BA01 is subclass of http://purl.bioontology.org/ontology/UATC/C01BA
Medication http://purl.bioontology.org/ontology/UATC/C01BA01 is subclass of http://purl.bioontology.org/ontology/UATC/C01B
Medication http://purl.bioontology.org/ontology/UATC/C01BA01 is subclass of http://purl.bioontology.org/ontology/UATC/C01
Medication http://purl.bioontology.org/ontology/UATC/C01BA01 is subclass of http://purl.bioontology.org/ontology/UATC/C
Medication http://purl.bioontology.org/ontology/UATC/C01BA01 is subclass of http://www.w3.org/2002/07/owl#Thing
Medication http://purl.bioontology.org/ontology/UATC/C07AB03 is subclass of http://purl.bioontology.org/ontology/UATC/C07AB03
Medication http://purl.bioontology.org/ontology/UATC/C07AB03 is subclass of http://purl.bioontology.org/ontology/UATC/C07AB
Medication http://purl.
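
The first row for each medication pairs it with itself because `*` matches paths of zero or more steps; `+` would require at least one step. The pure-Python sketch below illustrates that difference. The `parent` dictionary and the `walk_up` helper are hypothetical, and the links are assumed to be the direct `subClassOf` relations behind the quinidine rows in the output above, with a single parent per class.

```python
# Assumed direct subClassOf links for quinidine (one parent per class).
parent = {
    "C01BA01": "C01BA",
    "C01BA": "C01B",
    "C01B": "C01",
    "C01": "C",
}


def walk_up(code, include_self):
    """Follow subClassOf links upwards.

    include_self=True mimics the `*` path operator (zero or more steps),
    include_self=False mimics `+` (one or more steps).
    """
    found = [code] if include_self else []
    while code in parent:
        code = parent[code]
        found.append(code)
    return found


star = walk_up("C01BA01", include_self=True)   # like RDFS:subClassOf*
plus = walk_up("C01BA01", include_self=False)  # like RDFS:subClassOf+
print(star)
print(plus)
```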

This information can be used to derive the values of several *carmed* fields in the CAPACITY codebook.
The mapping Excel sheet specifies how the multiple-choice values map to ATC codes.

In [7]:
import pandas as pd 

def str_to_tuples(s, sep):
    # Split a flat 'code|label|code|label|...' string into (code, label) pairs
    parts = s.split(sep)
    return [(parts[i], parts[i+1]) for i in range(0, len(parts), 2)]
    

carmed_to_atc = 'C07A|Beta blocking agents|C01B|ANTIARRHYTHMICS, CLASS I AND III|C01AA05|digoxin|C03|Diuretics|C08|CALCIUM CHANNEL BLOCKERS|C09A | ACE inhibitors, plain|C09C|ANGIOTENSIN II RECEPTOR BLOCKERS (ARBs), PLAIN|C03DA|Aldosterone antagonists|C09CA03| valsartan|C02KX|Antihypertensives for pulmonary hypertension(phosphodisesterase)|B01AC|Platelet aggregation inhibitors excl.heparin|B01AA|Vitamin K antagonists(coumarin)|B01AE|direct thrombin inhibitors (DOAC)|C10|Lipid modifying agents|A10A|Insulins and analogues|A10B|bloog glucose lowering drugs, excl. insulins(Oral antidiabetic agents'
carmed_to_atc = str_to_tuples(carmed_to_atc, '|')
carmed_to_atc = map(lambda x: x[0], carmed_to_atc)
carmed_to_atc = list(carmed_to_atc)
                

carmed_field_index = '0-None, 1-Betablocker, 3-Antiarrhytmic drugs, 4-Digoxine, 5- Diuretics, 6- Calcium channel blocker, 7- ACE inhibitor, 8- Angiotensin II receptor blocker, 15- Aldosterone antagonist, 16 -Sacubitrivil/valsartan(Entresto),17-Phospodiesterase inhibitors,9 -antiplatelet agents, 10- coumarin, 11- direct oral anticoagulants(DOAC), 12-Lipid lowering agents, 13-Insulin, 14- Oral antidiabetic agents, 99-other cardiovascular medication'
carmed_field_index = carmed_field_index.split(',')
carmed_field_index = map(lambda x: x.split('-'), carmed_field_index)
carmed_field_index = map(lambda x: tuple(x), carmed_field_index)
carmed_field_index = list(carmed_field_index)

display(f'carmed_field_index length: {len(carmed_field_index)}  carmed_to_atc: {len(carmed_to_atc)}')

# Pad with an empty ATC code for choice 0 ('None') so the lists line up by position.
# zip() below stops at the shorter list, so the last choice (99, no ATC code) is dropped.
carmed_to_atc = [''] + carmed_to_atc
display(carmed_to_atc)
display(carmed_field_index)

combined = list(zip(carmed_to_atc, carmed_field_index))
combined_df = pd.DataFrame(combined, columns=['ATC', 'choice'])

combined_df.ATC = combined_df.ATC.str.strip()
combined_df['number'] = combined_df['choice'].map(lambda x: int(x[0].strip()))
display(combined_df)

capacity_root_class = URIRef('http://example.org/capacity/carmed/capacityValue')

for row in combined_df.itertuples():
    if row.number == 0:
        continue
        
    capacity_uri =  URIRef(f'http://example.org/capacity/carmed/{row.number}')
    atc_uri = URIRef(UATC + row.ATC)
    
    print(f'CAPACITY uri: {capacity_uri} sameas atc uri: {atc_uri}')
               
    patient_graph.add((atc_uri, OWL.sameAs, capacity_uri))
    patient_graph.add((capacity_uri, SKOS.prefLabel, Literal(row.choice[1])))
    
    # Let's combine all capacity classes under one superclass capacityValue
    patient_graph.add((capacity_uri, RDFS.subClassOf, capacity_root_class))


'carmed_field_index length: 18  carmed_to_atc: 16'

['',
 'C07A',
 'C01B',
 'C01AA05',
 'C03',
 'C08',
 'C09A ',
 'C09C',
 'C03DA',
 'C09CA03',
 'C02KX',
 'B01AC',
 'B01AA',
 'B01AE',
 'C10',
 'A10A',
 'A10B']

[('0', 'None'),
 (' 1', 'Betablocker'),
 (' 3', 'Antiarrhytmic drugs'),
 (' 4', 'Digoxine'),
 (' 5', ' Diuretics'),
 (' 6', ' Calcium channel blocker'),
 (' 7', ' ACE inhibitor'),
 (' 8', ' Angiotensin II receptor blocker'),
 (' 15', ' Aldosterone antagonist'),
 (' 16 ', 'Sacubitrivil/valsartan(Entresto)'),
 ('17', 'Phospodiesterase inhibitors'),
 ('9 ', 'antiplatelet agents'),
 (' 10', ' coumarin'),
 (' 11', ' direct oral anticoagulants(DOAC)'),
 (' 12', 'Lipid lowering agents'),
 (' 13', 'Insulin'),
 (' 14', ' Oral antidiabetic agents'),
 (' 99', 'other cardiovascular medication')]

|   | ATC | choice | number |
|---|-----|--------|--------|
| 0 |  | (0, None) | 0 |
| 1 | C07A | ( 1, Betablocker) | 1 |
| 2 | C01B | ( 3, Antiarrhytmic drugs) | 3 |
| 3 | C01AA05 | ( 4, Digoxine) | 4 |
| 4 | C03 | ( 5, Diuretics) | 5 |
| 5 | C08 | ( 6, Calcium channel blocker) | 6 |
| 6 | C09A | ( 7, ACE inhibitor) | 7 |
| 7 | C09C | ( 8, Angiotensin II receptor blocker) | 8 |
| 8 | C03DA | ( 15, Aldosterone antagonist) | 15 |
| 9 | C09CA03 | ( 16 , Sacubitrivil/valsartan(Entresto)) | 16 |


CAPACITY uri: http://example.org/capacity/carmed/1 sameas atc uri: http://purl.bioontology.org/ontology/UATC/C07A
CAPACITY uri: http://example.org/capacity/carmed/3 sameas atc uri: http://purl.bioontology.org/ontology/UATC/C01B
CAPACITY uri: http://example.org/capacity/carmed/4 sameas atc uri: http://purl.bioontology.org/ontology/UATC/C01AA05
CAPACITY uri: http://example.org/capacity/carmed/5 sameas atc uri: http://purl.bioontology.org/ontology/UATC/C03
CAPACITY uri: http://example.org/capacity/carmed/6 sameas atc uri: http://purl.bioontology.org/ontology/UATC/C08
CAPACITY uri: http://example.org/capacity/carmed/7 sameas atc uri: http://purl.bioontology.org/ontology/UATC/C09A
CAPACITY uri: http://example.org/capacity/carmed/8 sameas atc uri: http://purl.bioontology.org/ontology/UATC/C09C
CAPACITY uri: http://example.org/capacity/carmed/15 sameas atc uri: http://purl.bioontology.org/ontology/UATC/C03DA
CAPACITY uri: http://example.org/capacity/carmed/16 sameas atc uri: http://purl.bioon
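
Splitting on `,` and `-` leaves stray whitespace in the numbers and labels, as the printed list above shows. One possible alternative is a regular expression that captures the number and the label directly. This is only a sketch: the `parse_choices` helper is hypothetical, and `excerpt` is a shortened part of the `carmed_field_index` string.

```python
import re


def parse_choices(text):
    """Map each choice number to its label, ignoring whitespace around '-' and ','."""
    pairs = re.findall(r"(\d+)\s*-\s*([^,]+)", text)
    return {int(number): label.strip() for number, label in pairs}


# Shortened excerpt of the carmed_field_index string used in the cell above.
excerpt = (
    "0-None, 1-Betablocker, 5- Diuretics, "
    "16 -Sacubitrivil/valsartan(Entresto),17-Phospodiesterase inhibitors"
)
choices = parse_choices(excerpt)
print(choices)
```

Keying the result by choice number would also make it possible to look up labels without relying on list positions.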

Now we have the triples to connect medication to CAPACITY fields. Let's check if our patient Bob uses any cardiovascular medication.

# Querying CAPACITY data
For our conversion, it would be convenient to have a list of all the CAPACITY values per patient.

In [22]:
query = \
'''
select ?capacity ?superclass ?capacity ?label
where {    
   <http://example.org/patient/bob> ZIB:hasZibRecord ?zibRecord .
    ?zibProperty RDFS:subPropertyOf ZIB:zibProperty .
    ?zibRecord ?zibProperty ?zibValue .
    ?zibValue RDFS:subClassOf* ?superclass .
    ?capacity RDFS:subClassOf  <http://example.org/capacity/carmed/capacityValue> .
    ?superclass OWL:sameAs ?capacity .
    ?capacity SKOS:prefLabel ?label .
}
'''

result = patient_graph.query(query)
result_df = pd.DataFrame(result)
display(result_df)

|   | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | http://example.org/capacity/carmed/3 | http://purl.bioontology.org/ontology/UATC/C01B | http://example.org/capacity/carmed/3 | Antiarrhytmic drugs |
| 1 | http://example.org/capacity/carmed/1 | http://purl.bioontology.org/ontology/UATC/C07A | http://example.org/capacity/carmed/1 | Betablocker |
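
As a sanity check on this result, the same lookup can be approximated without SPARQL. The sketch below assumes, as the subclass output earlier suggests, that every ancestor of an ATC code is one of its prefixes (lengths 1, 3, 4, 5 and 7). The `carmed_by_atc` dictionary and the `carmed_values` helper are hypothetical names, and only part of the mapping is included.

```python
# Excerpt of the ATC-to-carmed mapping built earlier (ATC code -> choice number).
carmed_by_atc = {
    "C07A": 1,
    "C01B": 3,
    "C01AA05": 4,
    "C03": 5,
    "C08": 6,
}

# Lengths of the five ATC levels, e.g. C, C07, C07A, C07AB, C07AB03.
ATC_LEVEL_LENGTHS = (1, 3, 4, 5, 7)


def carmed_values(atc_code):
    """Return the carmed choices mapped to the given code or to one of its ancestors."""
    prefixes = [atc_code[:n] for n in ATC_LEVEL_LENGTHS if n <= len(atc_code)]
    return sorted(carmed_by_atc[p] for p in prefixes if p in carmed_by_atc)


bobs_codes = ["C01BA01", "C07AB03"]
for code in bobs_codes:
    print(code, carmed_values(code))
```

For Bob's two codes this yields choices 3 and 1, matching the *Antiarrhytmic drugs* and *Betablocker* rows returned by the query.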
