In [3]:
import rdflib
import morph_kgc

# Querying the Knowledge Graph

This section tests the KG and the ontology with SPARQL queries derived from the requirements and competency questions listed below.


|Identifier|Requirement or Competency Question                                                                                           |
|----------|-----------------------------------------------------------------------------------------------------------------------------|
|R1        |Articles have title, year, journal, issue, abstract, volume and author                                                       |
|R2        |Each article may refer to one or more experiments                                                                            |
|R3        |Each experiment must have a catalyst                                                                                         |
|R4        |Each experiment may have additional inputs apart from the catalyst                                                           |
|R5        |Each experiment must specify surface area and band gap                                                                       |
|R6        |An experiment must indicate operation conditions, including temperature, pressure, reaction time, medium and operation mode  |
|R7        |Photocatalysis experiments may also indicate type of reactor                                                                 |
|R8        |Electrocatalysis and Photoelectrocatalysis experiments may also indicate electrochemical configuration                       |
|R9        |Photocatalysis and Photoelectrocatalysis experiments must indicate the light source used, including type of light, lamp, wavelength, irradiance and power|
|R10       |TiO2 catalysts must indicate crystal structure                                                                               |
|R11       |Reaction medium must be either gas or liquid                                                                                 |
|R12       |When reaction medium is liquid, the pH value must be indicated                                                               |
|R13       |Operation modes must be either batch or continuous                                                                           |
|R14       |When the operation mode is continuous, the spatial speed value must be indicated                                             |
|R15       |Each experiment may have one or more outputs                                                                                 |
|R16       |When the light source is not solar or solar simulation, the wavelength must be indicated                                     |
|R17       |An experiment may provide conversion metrics, quantum efficiency metrics and electrochemical metrics                         |
|CQ1       |How many experiments are reported per year?                                                                                  |
|CQ2       |How many experiments are reported per country?                                                                               |
|CQ3       |How many experiments are there per type of catalyst?                                                                         |
|CQ4       |Which articles have been published in the ACS NANO Journal in 2018 in volume 229?                                            |
|CQ5       |Which experiments use TiO2 as catalyst, liquid medium and produce H2 as output?                                              |
|CQ6       |Which experiments use TiO2 as catalyst, liquid medium and produce the most H2 output in µmol/gh?                             |


In [4]:
g = rdflib.Graph()
g.parse("result/solarchem-kg.nt")
print(len(g))

1763831
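Every query in this notebook repeats the same block of prefix declarations. One way to cut the repetition, sketched here as an assumption rather than something the notebook does, is to keep the prefixes in a dictionary and prepend them to each query body. The names `PREFIXES` and `with_prefixes` are hypothetical; the IRIs are copied from the cells below, plus the standard W3C PROV namespace for `prov:`.

```python
# Hypothetical helper: prefixes shared by the queries in this notebook.
PREFIXES = {
    "bibo": "http://purl.org/ontology/bibo/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "solar": "http://w3id.org/solar/o/core#",
    "solarpc": "http://w3id.org/solar/o/pc#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "prov": "http://www.w3.org/ns/prov#",
    "qudt": "http://qudt.org/2.1/schema/qudt",  # copied verbatim from the cells
    "obi": "http://purl.obolibrary.org/obo/OBI_",
}


def with_prefixes(body: str) -> str:
    """Return `body` preceded by one PREFIX line per entry in PREFIXES."""
    header = "\n".join(f"PREFIX {name}: <{iri}>" for name, iri in PREFIXES.items())
    return f"{header}\n{body}"


query = with_prefixes("""
SELECT (COUNT (DISTINCT ?article) AS ?count)
WHERE { ?article a bibo:Article . }
""")
# The resulting string can be passed to g.query(query) like any other query.
```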


## R1: Articles have title, year, journal, issue, abstract, volume and author
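Pending refinement with SemOpenAlex. The query below checks DOI, abstract, volume, title, issue, date and start page; journal and author are not yet covered.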

In [5]:
q_res = g.query("""    
    PREFIX bibo: <http://purl.org/ontology/bibo/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>

    SELECT (COUNT (DISTINCT ?article) AS ?count)
    WHERE {
        ?article a bibo:Article ;
            bibo:doi ?doi ;
            bibo:abstract ?abs ;
            bibo:volume ?volume ;
            dc:title ?title ;
            bibo:issue ?issue ;
            dc:date ?date ;
            bibo:pageStart ?page .
    }

""")

for row in q_res:
    print(f"Number of articles with all properties: {row['count'].value}")

Number of articles with all properties: 415


## R2: Each article may refer to one or more experiments

In [6]:
q_res = g.query("""    
    PREFIX bibo: <http://purl.org/ontology/bibo/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX solar: <http://w3id.org/solar/o/core#>

    SELECT (COUNT (DISTINCT ?article) AS ?count)
    WHERE {
        ?article a bibo:Article ;
            solar:hasExperimentExecution ?expexec .
    }

""")

for row in q_res:
    print(f"Number of articles with experiments associated: {row['count'].value}")

Number of articles with experiments associated: 1096


## R3: Each experiment must have a catalyst

In [30]:
q_res = g.query("""    
    PREFIX bibo: <http://purl.org/ontology/bibo/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX solar: <http://w3id.org/solar/o/core#>
    PREFIX solarpc: <http://w3id.org/solar/o/pc#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX qudt: <http://qudt.org/2.1/schema/qudt>
    PREFIX prov: <http://www.w3.org/ns/prov#>

    SELECT ?expexec ?catalyst
    WHERE {
        ?expexec prov:used ?input .
        ?input a solar:Input ;
            rdfs:label ?catalyst ;
            solar:hasRole solar:Catalyst .
    } LIMIT 10

""")

for row in q_res:
    print(f"{row.expexec} {row.catalyst}")

http://w3id.org/solar/i/ExpExec/6028 WO3
http://w3id.org/solar/i/ExpExec/4325 Ti-KIT-6
http://w3id.org/solar/i/ExpExec/515 TiO2
http://w3id.org/solar/i/ExpExec/4316 Ti-KIT-6
http://w3id.org/solar/i/ExpExec/2911 TiO2
http://w3id.org/solar/i/ExpExec/5587 g-C3N4
http://w3id.org/solar/i/ExpExec/3211 g-C3N4
http://w3id.org/solar/i/ExpExec/3848 BiOCl
http://w3id.org/solar/i/ExpExec/5895 SiC
http://w3id.org/solar/i/ExpExec/5488 TiO2


## R4: Each experiment may have additional inputs apart from the catalyst

In [31]:
q_res = g.query("""    
    PREFIX bibo: <http://purl.org/ontology/bibo/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX solar: <http://w3id.org/solar/o/core#>
    PREFIX solarpc: <http://w3id.org/solar/o/pc#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX qudt: <http://qudt.org/2.1/schema/qudt>
    PREFIX prov: <http://www.w3.org/ns/prov#>

    SELECT DISTINCT  ?expexec ?inputlabel ?role
    WHERE {
        ?expexec prov:used ?input .
        ?input a solar:Input ;
            rdfs:label ?inputlabel ;
            solar:hasRole ?role .
        FILTER(?role != solar:Catalyst)
    } LIMIT 10

""")

for row in q_res:
    print(f"{row.expexec} {row.inputlabel} {row.role}")

http://w3id.org/solar/i/ExpExec/5126 [Ru(bpy)3]Cl2·6H2O http://w3id.org/solar/o/pc#Dye
http://w3id.org/solar/i/ExpExec/2196 Ru(bpy) http://w3id.org/solar/o/pc#Dye
http://w3id.org/solar/i/ExpExec/5991 [Ru(bpy)3]Cl2·6H2O http://w3id.org/solar/o/pc#Dye
http://w3id.org/solar/i/ExpExec/2566 Ru-Ru(bypy) http://w3id.org/solar/o/pc#Dye
http://w3id.org/solar/i/ExpExec/5476 [Ru(bpy)3]Cl2 http://w3id.org/solar/o/pc#Dye
http://w3id.org/solar/i/ExpExec/5137 [Ru(bpy)3]Cl2·6H2O http://w3id.org/solar/o/pc#Dye
http://w3id.org/solar/i/ExpExec/5858 [Ru(bpy)3]Cl2·6H2O http://w3id.org/solar/o/pc#Dye
http://w3id.org/solar/i/ExpExec/6124 2,2-bipyridine http://w3id.org/solar/o/pc#Dye
http://w3id.org/solar/i/ExpExec/1789 Ru(bpy) http://w3id.org/solar/o/pc#Dye
http://w3id.org/solar/i/ExpExec/6286 RhB http://w3id.org/solar/o/pc#Dye


## R5: Each experiment must specify surface area and band gap

In [32]:
q_res = g.query("""    
    PREFIX bibo: <http://purl.org/ontology/bibo/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX solar: <http://w3id.org/solar/o/core#>
    PREFIX solarpc: <http://w3id.org/solar/o/pc#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX qudt: <http://qudt.org/2.1/schema/qudt>
    PREFIX prov: <http://www.w3.org/ns/prov#>

    SELECT DISTINCT ?expexec ?bgap ?bgapunit ?surfacearea ?saunit
    WHERE {
        ?expexec prov:used ?input .
        ?input a solar:Input ;
            solar:hasBandGap [ qudt:numericValue ?bgap; qudt:unit ?bgapunit ] ;
            solar:hasSurfaceArea [ qudt:numericValue ?surfacearea; qudt:unit ?saunit ] .
    } LIMIT 10

""")

for row in q_res:
    print(f"{row.expexec} Bandgap: {row.bgap} {row.bgapunit} | Surface area: {row.surfacearea} {row.saunit}")

http://w3id.org/solar/i/ExpExec/2939 Bandgap: 3.13 http://qudt.org/vocab/unit/EV | Surface area: 53.0 http://qudt.org/vocab/unit/M2-PER-GM
http://w3id.org/solar/i/ExpExec/5568 Bandgap: 2.7 http://qudt.org/vocab/unit/EV | Surface area: 82.8 http://qudt.org/vocab/unit/M2-PER-GM
http://w3id.org/solar/i/ExpExec/7715 Bandgap: 3.2 http://qudt.org/vocab/unit/EV | Surface area: 45.2 http://qudt.org/vocab/unit/M2-PER-GM
http://w3id.org/solar/i/ExpExec/2413 Bandgap: 2.7 http://qudt.org/vocab/unit/EV | Surface area: 1149.4 http://qudt.org/vocab/unit/M2-PER-GM
http://w3id.org/solar/i/ExpExec/2222 Bandgap: 3.4 http://qudt.org/vocab/unit/EV | Surface area: 240.0 http://qudt.org/vocab/unit/M2-PER-GM
http://w3id.org/solar/i/ExpExec/2158 Bandgap: 3.2 http://qudt.org/vocab/unit/EV | Surface area: 46.9 http://qudt.org/vocab/unit/M2-PER-GM
http://w3id.org/solar/i/ExpExec/2588 Bandgap: 1.5 http://qudt.org/vocab/unit/EV | Surface area: 92.71 http://qudt.org/vocab/unit/M2-PER-GM
http://w3id.org/solar/i/ExpEx

## R6: An experiment must indicate operation conditions, including temperature, pressure, reaction time, medium and operation mode
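Note: the medium and operation mode classes are defined in the ontology but are not present in this version of the KG, so the query below only counts experiments with at least one time, pressure or temperature condition.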

In [35]:
q_res = g.query("""    
    PREFIX bibo: <http://purl.org/ontology/bibo/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX solar: <http://w3id.org/solar/o/core#>
    PREFIX solarpc: <http://w3id.org/solar/o/pc#>
    PREFIX qudt: <http://qudt.org/2.1/schema/qudt>

    SELECT (COUNT (DISTINCT ?expexec) AS ?count)
    WHERE {
        ?expexec solar:hasCondition ?condition .
        ?condition a ?condition_type .
        VALUES ?condition_type {solar:TimeCondition solar:PressureCondition solar:TemperatureCondition}
    }
""")

for row in q_res:
    print(f"Number of experiments with conditions: {row['count'].value}")

Number of experiments with conditions: 6664
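The query above matches an experiment as soon as it has any one of the three condition types. A stricter reading of R6 requires all three at once. The sketch below shows one way to express that, reusing the class and property names from the cell above. The function name `count_with_all_conditions` is hypothetical, and `graph` is assumed to be the `rdflib.Graph` loaded earlier.

```python
ALL_CONDITIONS_QUERY = """
    PREFIX solar: <http://w3id.org/solar/o/core#>

    SELECT (COUNT (DISTINCT ?expexec) AS ?count)
    WHERE {
        ?expexec solar:hasCondition ?time ;
            solar:hasCondition ?pressure ;
            solar:hasCondition ?temperature .
        ?time a solar:TimeCondition .
        ?pressure a solar:PressureCondition .
        ?temperature a solar:TemperatureCondition .
    }
"""


def count_with_all_conditions(graph):
    """Count experiment executions that have a time, a pressure and a temperature condition."""
    for row in graph.query(ALL_CONDITIONS_QUERY):
        return row["count"].value
    return 0


# Usage, with the graph loaded earlier:
# print(count_with_all_conditions(g))
```

Comparing this count with the one above would show how many experiments report only some of the three conditions.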


## R7: Photocatalysis experiments may also indicate type of reactor

In [37]:
q_res = g.query("""    
    PREFIX bibo: <http://purl.org/ontology/bibo/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX solar: <http://w3id.org/solar/o/core#>
    PREFIX solarpc: <http://w3id.org/solar/o/pc#>
    PREFIX qudt: <http://qudt.org/2.1/schema/qudt>

    SELECT (COUNT (DISTINCT ?expexec) AS ?count)
    WHERE {
        ?expexec solar:hasCondition ?condition .
        ?condition a solarpc:ReactorCondition ; 
            solarpc:reactorVolume ?value .
    }
""")

for row in q_res:
    print(f"Number of experiments with reactor conditions: {row['count'].value}")

Number of experiments with reactor conditions: 3835


## R8: Electrocatalysis and Photoelectrocatalysis experiments may also indicate electrochemical configuration
No experiments with this condition so far; the KG currently contains only photocatalysis experiments.

In [38]:
q_res = g.query("""    
    PREFIX bibo: <http://purl.org/ontology/bibo/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX solar: <http://w3id.org/solar/o/core#>
    PREFIX solarpc: <http://w3id.org/solar/o/pc#>
    PREFIX qudt: <http://qudt.org/2.1/schema/qudt>

    SELECT (COUNT (DISTINCT ?expexec) AS ?count)
    WHERE {
        ?expexec solar:hasCondition ?condition .
        ?condition a solar:ElectrochemicalConfiguration .
    }
""")

for row in q_res:
    print(f"Number of experiments with electrochemical configuration: {row['count'].value}")

Number of experiments with electrochemical configuration: 0


## R9: Photocatalysis and Photoelectrocatalysis experiments must indicate the light source used, including type of light, lamp, wavelength, irradiance and power

In [46]:
q_res = g.query("""    
    PREFIX bibo: <http://purl.org/ontology/bibo/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX solar: <http://w3id.org/solar/o/core#>
    PREFIX solarpc: <http://w3id.org/solar/o/pc#>
    PREFIX qudt: <http://qudt.org/2.1/schema/qudt>
    PREFIX obi: <http://purl.obolibrary.org/obo/OBI_>

    SELECT DISTINCT ?expexec ?lamp ?wl ?irradiance ?power 
    WHERE {
        ?expexec solar:hasLightSource ?light_source.
        ?light_source a obi:0400065 ;
            solar:hasLamp ?lamp ;
            solar:hasWavelength ?wl ;
            solar:hasIrradiance [qudt:numericValue ?irradiance] ;
            solar:hasPower [qudt:numericValue ?power] .
            
    } LIMIT 5
""")

for row in q_res:
    print(f"{row.expexec}\n\tLamp: {row.lamp}\n\tWavelength: {row.wl}\n\tIrradiance: {row.irradiance}\n\tPower: {row.power}")

http://w3id.org/solar/i/ExpExec/7417
	Lamp: http://w3id.org/solar/o/core#XenonLamp
	Wavelength: http://w3id.org/solar/o/core#400-780WL
	Irradiance: 2100.0
	Power: 300.0
http://w3id.org/solar/i/ExpExec/5756
	Lamp: http://w3id.org/solar/o/core#XenonLamp
	Wavelength: http://w3id.org/solar/o/core#400-780WL
	Irradiance: 1500.0
	Power: 450.0
http://w3id.org/solar/i/ExpExec/7088
	Lamp: http://w3id.org/solar/o/core#XenonLamp
	Wavelength: http://w3id.org/solar/o/core#400-780WL
	Irradiance: 1000.0
	Power: 300.0
http://w3id.org/solar/i/ExpExec/5029
	Lamp: http://w3id.org/solar/o/core#XenonLamp
	Wavelength: http://w3id.org/solar/o/core#315-780WL
	Irradiance: 11000.0
	Power: 251.0
http://w3id.org/solar/i/ExpExec/549
	Lamp: http://w3id.org/solar/o/core#XenonLamp
	Wavelength: http://w3id.org/solar/o/core#192-280WL
	Irradiance: 196.0
	Power: 400.0


## R10: TiO2 catalysts must indicate crystal structure

In [47]:
q_res = g.query("""    
    PREFIX bibo: <http://purl.org/ontology/bibo/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX solar: <http://w3id.org/solar/o/core#>
    PREFIX solarpc: <http://w3id.org/solar/o/pc#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX qudt: <http://qudt.org/2.1/schema/qudt>

    SELECT ?input ?crystal_str
    WHERE {
        ?input a solar:Input ;
            rdfs:label "TiO2" ;
            solar:hasRole solar:Catalyst ;
            solar:crystalStructure ?crystal_str .
    } LIMIT 10

""")

for row in q_res:
    print(f"{row.input}  {row.crystal_str}")

http://w3id.org/solar/i/Input/Catalyst/2828-TiO2  Anatase
http://w3id.org/solar/i/Input/Catalyst/4612-TiO2  Anatase
http://w3id.org/solar/i/Input/Catalyst/2063-TiO2  Anatase
http://w3id.org/solar/i/Input/Catalyst/201-TiO2  P25
http://w3id.org/solar/i/Input/Catalyst/5111-TiO2  P25
http://w3id.org/solar/i/Input/Catalyst/315-TiO2  P25
http://w3id.org/solar/i/Input/Catalyst/319-TiO2  Anatase
http://w3id.org/solar/i/Input/Catalyst/6173-TiO2  Anatase
http://w3id.org/solar/i/Input/Catalyst/446-TiO2  Mix
http://w3id.org/solar/i/Input/Catalyst/5022-TiO2  Anatase


## R11: Reaction medium must be either gas or liquid
To be checked once the KG is merged with the ontology.
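In the meantime, a rough check is possible directly on the KG by counting experiments per medium. This is only a sketch: `solar:LiquidMedium` appears in the R12 query below, but `solar:GasMedium` is an assumed IRI that may be named differently in the ontology, and `count_by_medium` is a hypothetical helper taking the graph loaded earlier.

```python
# NOTE: solar:GasMedium is an assumption; adjust it to the ontology's actual IRI.
MEDIUM_QUERY = """
    PREFIX solar: <http://w3id.org/solar/o/core#>

    SELECT ?medium (COUNT (DISTINCT ?expexec) AS ?count)
    WHERE {
        VALUES ?medium { solar:GasMedium solar:LiquidMedium }
        ?expexec solar:hasCondition ?medium .
    }
    GROUP BY ?medium
"""


def count_by_medium(graph):
    """Map each medium IRI (as a string) to its number of experiment executions."""
    return {
        str(row["medium"]): row["count"].value
        for row in graph.query(MEDIUM_QUERY)
    }


# Usage, with the graph loaded earlier:
# print(count_by_medium(g))
```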

## R12: When reaction medium is liquid, the pH value must be indicated

In [52]:
q_res = g.query("""    
    PREFIX bibo: <http://purl.org/ontology/bibo/>
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX solar: <http://w3id.org/solar/o/core#>
    PREFIX solarpc: <http://w3id.org/solar/o/pc#>
    PREFIX qudt: <http://qudt.org/2.1/schema/qudt>

    SELECT ?expexec ?ph
    WHERE {
        ?expexec solar:hasCondition solar:LiquidMedium ;
            solar:hasCondition ?ph_cond .
        ?ph_cond a solar:pHCondition ;
            qudt:numericValue ?ph .
    } LIMIT 10
""")

for row in q_res:
    print(f"{row.expexec} pH: {row.ph}")

http://w3id.org/solar/i/ExpExec/3277 pH: 2.0
http://w3id.org/solar/i/ExpExec/5059 pH: 7.0
http://w3id.org/solar/i/ExpExec/6435 pH: 7.0
http://w3id.org/solar/i/ExpExec/5137 pH: 7.0
http://w3id.org/solar/i/ExpExec/3282 pH: 10.0
http://w3id.org/solar/i/ExpExec/4893 pH: 7.0
http://w3id.org/solar/i/ExpExec/6377 pH: 7.0
http://w3id.org/solar/i/ExpExec/6296 pH: 1.0
http://w3id.org/solar/i/ExpExec/5188 pH: 7.0
http://w3id.org/solar/i/ExpExec/5057 pH: 7.0


## R13: Operation modes must be either batch or continuous
To be checked once the KG is merged with the ontology.

## R14: When the operation mode is continuous, the spatial speed value must be indicated
Not found in the data so far; to be confirmed.