# SPARQL query to search AOPWiki for nanomaterial-containing AOPs

Maastricht University, Department of Bioinformatics - BiGCaT
___

Set up the environment by importing the required modules.

In [1]:
from SPARQLWrapper import SPARQLWrapper, JSON
import pandas

---

Define the SPARQL endpoint.

In [2]:
sparql = SPARQLWrapper("http://aopwiki-rdf.prod.openrisknet.org/sparql/")

Create an empty data frame to be filled with the SPARQL results.

In [3]:
Stressors = pandas.DataFrame(columns=['Stressor','StressorTitle','Part of','Chemical', 'AOP'])

---

Create a list of the strings to search for. A stressor is returned when its title contains one of these strings; the match ignores case.

In [4]:
# Named search_terms so the built-in `list` is not shadowed.
search_terms = ['Titanium', 'Zinc', 'Carbon', 'Silica', 'nano',
                'Silver', 'Gold', 'Graphene', 'JRCNM', 'NM-']

Create the SPARQL query that extracts each stressor, its title, the AOP or Key Event (KE) it is part of, and its chemical. The loop below runs the query once for each search string defined above.

In [5]:
for i in search_terms:
    i = i.lower()
    StressorQuery = '''  
    PREFIX dc: <http://purl.org/dc/elements/1.1/>
    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX aopo: <http://aopkb.org/aop_ontology#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX cas: <http://identifiers.org/cas/>
    PREFIX cheminf: <http://semanticscience.org/resource/>
    PREFIX ncit: <http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#>

    SELECT ?Stressor ?StressorTitle ?AOPorKE ?Chemical2 ?AOP2
    WHERE {
      ?stressorIRI a ncit:C54571 ;
          dc:title ?StressorTitle ;
          dc:identifier ?Stressor ;
          dcterms:isPartOf ?AOPorKE .
      OPTIONAL {?stressorIRI aopo:has_chemical_entity ?Chemical } 
      OPTIONAL {?AOPorKE dcterms:isPartOf ?AOP }
      filter contains( lcase(?StressorTitle), "''' + i + '''").
      BIND (IF(BOUND(?AOP), str(?AOP), "") AS ?AOP2)
      BIND (IF(BOUND(?Chemical), str(?Chemical), "") AS ?Chemical2)
    }
    '''
    sparql.setQuery(StressorQuery)
    sparql.setReturnFormat(JSON)
    # Run the query and store the parsed JSON response in "results".
    results = sparql.query().convert()

    # Add one row per result to the data frame created earlier.
    # pandas.concat is used because DataFrame.append was removed in pandas 2.0.
    rows = [{
        'Stressor': result["Stressor"]["value"],
        'StressorTitle': result["StressorTitle"]["value"],
        'Part of': result["AOPorKE"]["value"],
        'Chemical': result["Chemical2"]["value"],
        'AOP': result["AOP2"]["value"],
    } for result in results["results"]["bindings"]]
    Stressors = pandas.concat(
        [Stressors, pandas.DataFrame(rows, columns=Stressors.columns)],
        ignore_index=True)
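
The query above splices each search string directly into the SPARQL text. That works for the strings used here, but a string containing a double quote or a backslash would produce an invalid query. The following is a minimal sketch of one way to guard against that, not part of the original notebook: `escape_sparql_string` and `build_title_filter` are hypothetical helper names, and only backslashes and double quotes are handled.

```python
def escape_sparql_string(text):
    """Escape backslashes and double quotes for a double-quoted SPARQL literal."""
    # Backslashes are escaped first so the ones added for quotes are not doubled.
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_title_filter(term):
    """Return a FILTER clause matching stressor titles that contain `term`, ignoring case."""
    literal = escape_sparql_string(term.lower())
    return 'FILTER contains(lcase(?StressorTitle), "' + literal + '")'


# Example with one of the search strings from the list above.
print(build_title_filter("NM-"))
```

The returned clause could replace the hand-concatenated `filter` line inside the query string.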

Delete duplicate rows

In [6]:
Stressors = Stressors.drop_duplicates()
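
Duplicates arise because a stressor title can match more than one search string (for example, a title containing both "Titanium" and "nano"), so the same row is returned by several queries. As an illustration of what the deduplication does, here is a sketch using only the standard library. The `sample` dictionary is hand-written in the SPARQL JSON results layout, with values taken from the output shown in this notebook; `bindings_to_rows` and `unique_rows` are hypothetical helper names.

```python
VARIABLES = ["Stressor", "StressorTitle", "AOPorKE", "Chemical2", "AOP2"]


def bindings_to_rows(results):
    """Flatten SPARQL JSON bindings into tuples; missing variables become ''."""
    rows = []
    for binding in results["results"]["bindings"]:
        rows.append(tuple(binding.get(v, {}).get("value", "") for v in VARIABLES))
    return rows


def unique_rows(rows):
    """Drop repeated rows while keeping the order of first appearance."""
    return list(dict.fromkeys(rows))


# Hand-written sample (an assumption for illustration): the same stressor
# returned twice, as would happen when two search strings match its title.
binding = {
    "Stressor": {"type": "literal", "value": "http://identifiers.org/aop.stressor/253"},
    "StressorTitle": {"type": "literal", "value": "UV-activated Titanium dioxide nanoparticles"},
    "AOPorKE": {"type": "uri", "value": "http://identifiers.org/aop/208"},
    "Chemical2": {"type": "literal", "value": ""},
    "AOP2": {"type": "literal", "value": ""},
}
sample = {"results": {"bindings": [binding, dict(binding)]}}

rows = bindings_to_rows(sample)
print(len(rows), len(unique_rows(rows)))
```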

In [7]:
Stressors

| | Stressor | StressorTitle | Part of | Chemical | AOP |
|---|---|---|---|---|---|
| 0 | http://identifiers.org/aop.stressor/357 | Titanium oxide (TiO) | http://identifiers.org/aop.events/1508 | http://identifiers.org/cas/12137-20-1 | http://identifiers.org/aop/260 |
| 1 | http://identifiers.org/aop.stressor/357 | Titanium oxide (TiO) | http://identifiers.org/aop/260 | http://identifiers.org/cas/12137-20-1 | |
| 2 | http://identifiers.org/aop.stressor/253 | UV-activated Titanium dioxide nanoparticles | http://identifiers.org/aop/208 | | |
| 3 | http://identifiers.org/aop.stressor/13 | Carbon tetrachloride | http://identifiers.org/aop.events/244 | http://identifiers.org/cas/56-23-5 | http://identifiers.org/aop/258 |
| 4 | http://identifiers.org/aop.stressor/13 | Carbon tetrachloride | http://identifiers.org/aop.events/244 | http://identifiers.org/cas/56-23-5 | http://identifiers.org/aop/38 |
| 5 | http://identifiers.org/aop.stressor/347 | Polycyclic aromatic hydrocarbons | http://identifiers.org/aop.events/944 | NOCAS_44043 | http://identifiers.org/aop/150 |
| 6 | http://identifiers.org/aop.stressor/347 | Polycyclic aromatic hydrocarbons | http://identifiers.org/aop.events/944 | NOCAS_44043 | http://identifiers.org/aop/21 |
| 7 | http://identifiers.org/aop.stressor/13 | Carbon tetrachloride | http://identifiers.org/aop/258 | http://identifiers.org/cas/56-23-5 | |
| 8 | http://identifiers.org/aop.stressor/13 | Carbon tetrachloride | http://identifiers.org/aop/38 | http://identifiers.org/cas/56-23-5 | |
| 9 | http://identifiers.org/aop.stressor/347 | Polycyclic aromatic hydrocarbons | http://identifiers.org/aop/21 | NOCAS_44043 | |
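
Because the query matches substrings, "Polycyclic aromatic hydrocarbons" appears in the results: the search string "carbon" occurs inside "hydrocarbons". The sketch below compares plain substring matching with matching only at the start of a word, using the four titles from the output. It is a possible post-filtering step in Python, not something the notebook does; `substring_match` and `word_start_match` are hypothetical helper names.

```python
import re

# Stressor titles copied from the results above.
titles = [
    "Titanium oxide (TiO)",
    "UV-activated Titanium dioxide nanoparticles",
    "Carbon tetrachloride",
    "Polycyclic aromatic hydrocarbons",
]


def substring_match(title, term):
    """Case-insensitive substring test, mirroring contains(lcase(...)) in the query."""
    return term.lower() in title.lower()


def word_start_match(title, term):
    """True if `term` occurs at the start of a word in `title`, ignoring case."""
    pattern = r"\b" + re.escape(term)
    return re.search(pattern, title, flags=re.IGNORECASE) is not None


substring_hits = [t for t in titles if substring_match(t, "carbon")]
word_start_hits = [t for t in titles if word_start_match(t, "carbon")]

print(substring_hits)
print(word_start_hits)
```

Word-start matching drops the "hydrocarbons" hit but still keeps "Carbon tetrachloride", so the results need manual review either way.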


---

### Dependencies

In [8]:
%load_ext watermark
%watermark -v -m -p SPARQLWrapper,pandas

CPython 3.7.4
IPython 7.7.0

SPARQLWrapper 1.8.4
pandas 0.25.0

compiler   : MSC v.1915 64 bit (AMD64)
system     : Windows
release    : 10
machine    : AMD64
processor  : Intel64 Family 6 Model 142 Stepping 10, GenuineIntel
CPU cores  : 8
interpreter: 64bit


In [9]:
import datetime
now = datetime.datetime.now()
print(f"Date: {now.day}/{now.month}/{now.year}")

Date: 18/9/2019
