# Extracting nanomaterials in Adverse Outcome Pathways
by [Marvin Martens](https://www.bigcat.unimaas.nl/?person=marvin-martens-msc) and [Serena Bonaretti](https://sbonaretti.github.io), Maastricht University, Department of Bioinformatics - BiGCaT


Adverse Outcome Pathways (AOPs) are structured representations of biological information relevant to risk assessment. An AOP starts with a Molecular Initiating Event (MIE) triggered by a stressor, followed by a cascade of causally connected Key Events (KEs) that leads to the Adverse Outcome. The aim of this Jupyter notebook is to extract all AOPs relevant to nanomaterial toxicity from the AOP-Wiki. We do this in two ways: by examining the chemicals linked to AOPs and by applying textual filters.

We examine the chemicals by:
- Extracting all chemicals from the [AOP-Wiki](http://aopwiki.org) through the [SPARQL endpoint](http://aopwiki-rdf.prod.openrisknet.org/sparql/)
- Identifying nanomaterials among the chemicals using the [ChEBI](https://www.ebi.ac.uk/chebi/init.do) ontology

We apply textual filters by:
- Extracting all descriptions of AOPs, KEs and Stressors that contain 'nano' through the [SPARQL endpoint](http://aopwiki-rdf.prod.openrisknet.org/sparql/)
- Combining results to form a complete list of AOPs containing 'nano' in their descriptions or in their KEs and Stressors

Imports:

In [1]:
from SPARQLWrapper import SPARQLWrapper, JSON
import numpy as np
import pandas
import requests
import zeep
import re
from itertools import chain

Variables:

In [2]:
#Define the SPARQL endpoint
sparql = SPARQLWrapper("http://aopwiki-rdf.prod.openrisknet.org/sparql/")

# ChEBI wsdl
repository_wsdl = 'https://www.ebi.ac.uk/webservices/chebi/2.0/webservice?wsdl'

# BridgeDB
bridgedb = 'https://webservice.bridgedb.org/Human/xrefs/Ca/'
datasource = '?dataSource=Ce'

# ChEBI id of "nanostructure"
nanostructure = "CHEBI:50795"

# maximum number of rows to display
pandas.set_option("display.max_rows",20)

---
## Extracting all chemicals from the AOP-Wiki
We query the AOP-Wiki using SPARQL to extract:  
- Names of all chemicals [ChemicalName]  
- AOPs that they are linked to [LinkedAOP], with the AOP titles [AOPTitle]
- Chemical Abstracts Service (CAS) identifiers of the chemicals [CASRN]

We store this information in the table `chemicals`.
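
The SPARQL endpoint returns JSON in which each result row is a dictionary of bindings, and each binding keeps its content under the key `value`. As a minimal sketch, the helper below (`bindings_to_rows`, a hypothetical name that is not part of SPARQLWrapper) flattens such a response into a list of plain dictionaries. The sample response is made up for illustration; only its layout follows the SPARQL JSON results format.

```python
def bindings_to_rows(results, columns):
    """Flatten SPARQL JSON results into a list of {column: value} dictionaries.

    Variables that are unbound in a row (e.g. from OPTIONAL clauses) become None.
    """
    rows = []
    for binding in results["results"]["bindings"]:
        rows.append({col: binding[col]["value"] if col in binding else None
                     for col in columns})
    return rows


# Made-up response in the SPARQL JSON layout, for illustration only
sample_results = {
    "results": {
        "bindings": [
            {
                "ChemicalName": {"type": "literal", "value": "1-Chloro-4-nitrobenzene"},
                "LinkedAOP": {"type": "literal", "value": "AOP 28"},
                "CASRN": {"type": "literal", "value": "100-00-5"},
            },
            {
                # CASRN deliberately left unbound in this row
                "ChemicalName": {"type": "literal", "value": "Example chemical"},
                "LinkedAOP": {"type": "literal", "value": "AOP 1"},
            },
        ]
    }
}

rows = bindings_to_rows(sample_results, ["ChemicalName", "LinkedAOP", "CASRN"])
print(rows)
```

A list built this way can be passed to `pandas.DataFrame(rows)` in one call, which avoids growing the table one row at a time.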

In [3]:
#create the data frame
chemicals = pandas.DataFrame(columns=['ChemicalName','LinkedAOP','AOPTitle','CASRN'])

#extract chemicals from the AOP-Wiki RDF
pathwayQuery = '''
select distinct ?ChemicalName ?LinkedAOP ?LinkedAOPURI ?CASRN ?ChemicalURI ?AOPTitle where{
 ?cheLook a cheminf:CHEMINF_000000 ; dc:identifier ?ChemicalURI ; dc:title ?ChemicalName ; cheminf:CHEMINF_000446 ?CASRN ; dcterms:isPartOf ?LinkedStressor.
 ?LinkedStressor dcterms:isPartOf ?LinkedAOPURI .
 ?LinkedAOPURI a aopo:AdverseOutcomePathway ; rdfs:label ?LinkedAOP; dc:title ?AOPTitle.
} order by ?ChemicalName
'''
sparql.setQuery(pathwayQuery)
sparql.setReturnFormat(JSON)
# Run the query; the parsed JSON response is stored in "results"
results = sparql.query().convert()

#put the data in the "chemicals" table
for result in results["results"]["bindings"]:
        chemicals = chemicals.append({
            'ChemicalName': result["ChemicalName"]["value"],
            'LinkedAOP'   : result["LinkedAOP"]["value"],
            'AOPTitle'   : result["AOPTitle"]["value"],
            'CASRN'       : result["CASRN"]["value"],
        }, ignore_index=True)

In [4]:
# display
chemicals

Unnamed: 0,ChemicalName,LinkedAOP,AOPTitle,CASRN
0,(7S)-Hydroprene,AOP 201,Juvenile hormone receptor agonism leading to m...,65733-18-8
1,"1',2'-Dihydrorotenone",AOP 273,Mitochondrial complex inhibition leading to li...,6659-45-6
2,"1',2'-Dihydrorotenone",AOP 276,Inhibition of complex I of the electron transp...,6659-45-6
3,"1',2'-Dihydrorotenone",AOP 3,Inhibition of the mitochondrial complex I of n...,6659-45-6
4,1-Chloro-4-nitrobenzene,AOP 100,Cyclooxygenase inhibition leading to reproduct...,100-00-5
5,1-Chloro-4-nitrobenzene,AOP 101,Cyclooxygenase inhibition leading to reproduct...,100-00-5
6,1-Chloro-4-nitrobenzene,AOP 102,Cyclooxygenase inhibition leading to reproduct...,100-00-5
7,1-Chloro-4-nitrobenzene,AOP 103,Cyclooxygenase inhibition leading to reproduct...,100-00-5
8,1-Chloro-4-nitrobenzene,AOP 28,Cyclooxygenase inhibition leading reproductive...,100-00-5
9,1-Chloro-4-nitrobenzene,AOP 63,Cyclooxygenase inhibition leading to reproduct...,100-00-5


In [37]:
uniquechem = set(chemicals['ChemicalName'].to_list())
print('Total of unique chemicals in the AOP-Wiki: '+str(len(uniquechem)))

Total of unique chemicals in the AOP-Wiki: 275


Because the ChEBI ontology does not include CAS identifiers, we convert the CAS identifiers to ChEBI identifiers using the [BridgeDb API](http://bridgedb.prod.openrisknet.org/swagger/).
We add the ChEBI identifiers, which start with 'CHEBI:', to the table `chemicals` in the column 'ChEBI_id'. The conversion takes a few seconds.
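
The parsing step of this conversion can be sketched without network access. The function below (`extract_chebi_ids`, a hypothetical helper) splits a response body on tabs and newlines and keeps the fields that start with 'CHEBI:'. The sample response is an assumption about the layout (one "identifier, tab, data source" pair per line), not a recorded BridgeDb answer. Unlike the notebook cell, the sketch also drops duplicate identifiers.

```python
import re


def extract_chebi_ids(response_text):
    """Return the unique fields starting with 'CHEBI:' from a tab/newline separated text."""
    ids = []
    for field in re.split(r"\t|\n", response_text):
        if field.startswith("CHEBI:") and field not in ids:
            ids.append(field)
    return ids


# Hypothetical response body: one "identifier<TAB>data source" pair per line
sample_response = "CHEBI:34399\tChEBI\n34399\tChEBI\nCHEBI:34399\tChEBI\n"
print(extract_chebi_ids(sample_response))
```

An empty response yields an empty list, which corresponds to the chemicals without a ChEBI mapping.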

In [5]:
chebi = []

for cas in chemicals['CASRN']:
    allchebi = re.split('\t|\n', requests.get(bridgedb + cas + datasource).text)
    filteredchebi = []
    for item in allchebi:
        if 'CHEBI:' in item:
            filteredchebi.append(item)
    chebi.append(filteredchebi)


In [6]:
# add column with ChEBI_id to chemicals
chemicals['ChEBI_id'] = chebi
chemicals

Unnamed: 0,ChemicalName,LinkedAOP,AOPTitle,CASRN,ChEBI_id
0,(7S)-Hydroprene,AOP 201,Juvenile hormone receptor agonism leading to m...,65733-18-8,[CHEBI:32110]
1,"1',2'-Dihydrorotenone",AOP 273,Mitochondrial complex inhibition leading to li...,6659-45-6,[]
2,"1',2'-Dihydrorotenone",AOP 276,Inhibition of complex I of the electron transp...,6659-45-6,[]
3,"1',2'-Dihydrorotenone",AOP 3,Inhibition of the mitochondrial complex I of n...,6659-45-6,[]
4,1-Chloro-4-nitrobenzene,AOP 100,Cyclooxygenase inhibition leading to reproduct...,100-00-5,[CHEBI:34399]
5,1-Chloro-4-nitrobenzene,AOP 101,Cyclooxygenase inhibition leading to reproduct...,100-00-5,[CHEBI:34399]
6,1-Chloro-4-nitrobenzene,AOP 102,Cyclooxygenase inhibition leading to reproduct...,100-00-5,[CHEBI:34399]
7,1-Chloro-4-nitrobenzene,AOP 103,Cyclooxygenase inhibition leading to reproduct...,100-00-5,[CHEBI:34399]
8,1-Chloro-4-nitrobenzene,AOP 28,Cyclooxygenase inhibition leading reproductive...,100-00-5,[CHEBI:34399]
9,1-Chloro-4-nitrobenzene,AOP 63,Cyclooxygenase inhibition leading to reproduct...,100-00-5,[CHEBI:34399]


We remove the rows containing empty cells in the column [ChEBI_id].
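
As a small illustration of this filtering step, the sketch below separates rows with and without a ChEBI mapping using plain Python dictionaries instead of a data frame. The three rows are taken from the table above; `split_by_mapping` is a hypothetical helper name.

```python
rows = [
    {"ChemicalName": "(7S)-Hydroprene", "CASRN": "65733-18-8", "ChEBI_id": ["CHEBI:32110"]},
    {"ChemicalName": "1',2'-Dihydrorotenone", "CASRN": "6659-45-6", "ChEBI_id": []},
    {"ChemicalName": "1-Chloro-4-nitrobenzene", "CASRN": "100-00-5", "ChEBI_id": ["CHEBI:34399"]},
]


def split_by_mapping(rows, key="ChEBI_id"):
    """Split rows into those with a non-empty list under `key` and those without."""
    mapped = [row for row in rows if row[key]]
    unmapped = [row for row in rows if not row[key]]
    return mapped, unmapped


mapped, unmapped = split_by_mapping(rows)
print("kept   :", [row["ChemicalName"] for row in mapped])
print("removed:", [row["ChemicalName"] for row in unmapped])
```

Keeping the unmapped rows in a separate list makes it easy to report which chemicals were excluded from the ontology search.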

In [7]:
# delete rows where there is no ChEBI_id (e.g. row 1)
print ("removing rows: ")
for i in range (0, len(chebi)):
    if not chebi[i]:
        print (str(i))
        chemicals = chemicals.drop(i)
# re-assign the indices
chemicals.index = np.arange(0,len(chemicals))
chemicals

removing rows: 
1
2
3
15
16
17
18
19
20
25
80
116
139
160
228
244
245
246
295
336
337
338
339
368
369
403
404
405
406
484
504
505
506
507
508
529
559
601


Unnamed: 0,ChemicalName,LinkedAOP,AOPTitle,CASRN,ChEBI_id
0,(7S)-Hydroprene,AOP 201,Juvenile hormone receptor agonism leading to m...,65733-18-8,[CHEBI:32110]
1,1-Chloro-4-nitrobenzene,AOP 100,Cyclooxygenase inhibition leading to reproduct...,100-00-5,[CHEBI:34399]
2,1-Chloro-4-nitrobenzene,AOP 101,Cyclooxygenase inhibition leading to reproduct...,100-00-5,[CHEBI:34399]
3,1-Chloro-4-nitrobenzene,AOP 102,Cyclooxygenase inhibition leading to reproduct...,100-00-5,[CHEBI:34399]
4,1-Chloro-4-nitrobenzene,AOP 103,Cyclooxygenase inhibition leading to reproduct...,100-00-5,[CHEBI:34399]
5,1-Chloro-4-nitrobenzene,AOP 28,Cyclooxygenase inhibition leading reproductive...,100-00-5,[CHEBI:34399]
6,1-Chloro-4-nitrobenzene,AOP 63,Cyclooxygenase inhibition leading to reproduct...,100-00-5,[CHEBI:34399]
7,1-Ethyl-1-nitrosourea,AOP 139,Alkylation of DNA leading to cancer 1,759-73-9,[CHEBI:23995]
8,1-Ethyl-1-nitrosourea,AOP 141,Alkylation of DNA leading to cancer 2,759-73-9,[CHEBI:23995]
9,1-Ethyl-1-nitrosourea,AOP 15,Alkylation of DNA in male pre-meiotic germ cel...,759-73-9,[CHEBI:23995]


---

## Identifying nanomaterials among the chemicals
We retrieve the child terms of the ontology term 'nanostructure' using the ChEBI web service call [getOntologyChildren]. We walk down the ontology one level at a time; the while loop stops when no more child terms are found.
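
The level-by-level walk can be sketched as a breadth-first traversal. To keep the example runnable offline, the lookup of children is passed in as a function and a toy ontology with made-up identifiers stands in for ChEBI. `collect_descendants` and `toy_children` are hypothetical names. Unlike the notebook cell, this sketch skips terms it has already seen, so a term reachable through two parents is counted once.

```python
def collect_descendants(root_id, get_children):
    """Collect all descendant ids of root_id, one ontology level at a time."""
    seen = set()
    descendants = []
    frontier = [root_id]
    while frontier:                      # stops when a level yields no new terms
        next_frontier = []
        for term_id in frontier:
            for child_id in get_children(term_id):
                if child_id not in seen:
                    seen.add(child_id)
                    descendants.append(child_id)
                    next_frontier.append(child_id)
        frontier = next_frontier
    return descendants


# Toy ontology with made-up ids; "C" has two parents
toy_ontology = {
    "ROOT": ["A", "B"],
    "A": ["C"],
    "B": ["C", "D"],
    "C": [],
    "D": [],
}


def toy_children(term_id):
    """Return the child ids of a term, or an empty list for a leaf."""
    return toy_ontology.get(term_id, [])


print(collect_descendants("ROOT", toy_children))
```

To use the same traversal with ChEBI, `get_children` would need to wrap `client.service.getOntologyChildren`, returning an empty list where the service returns `None` and the `chebiId` of each child otherwise, as the notebook cell does.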

In [8]:
# get the children of the terms "nanostructure"

# initializations
parent_ids           = []
parent_ids.append(nanostructure)
current_children_ids = []
all_children_ids     = [] 
flag                 = 1 # flag to stop the while loop
children_level       = 0 # counter of children levels

# call the chebi ontology
client = zeep.Client(wsdl=repository_wsdl) 

# extract children
while flag == 1:
    
    print ("children level: " + str(children_level))
    print ("-> parent_ids             : " + str(len(parent_ids)))
        
    # used to break the loop
    n_of_none_children = 0
    nochild = 0
    # for each parent_id
    for i in range (0, len(parent_ids)):
        
        # variables used to stop the while loop
        n_of_parent_ids    = len(parent_ids)
        n_of_none_children = 0
        
        
        # get all the children information
        info_children = client.service.getOntologyChildren(parent_ids[i])
        
        # if the current term is not a leaf of the ontology
        if info_children is not None:
        
            # extract the ids of the new child 
            for j in range(0, len(info_children)):
                current_children_ids.append(info_children[j].chebiId)
        else:
            # add a count to n_of_none_children
            n_of_none_children = n_of_none_children + 1
            nochild += 1
           
        
    # break the loop when there are no more children ids
    if n_of_none_children == len(parent_ids):
        flag = 0
        break
    print("-> terms have no children : "+ str(nochild))
    
    # add the children found at this level to the full list all_children_ids
    all_children_ids.extend(current_children_ids)
        
    # for the next loop
    # current_children_ids becomes the new parent_ids
    parent_ids = []
    parent_ids.extend(current_children_ids)      
    # current_children_ids is reallocated for the next loop
    print ("-> current_children_ids   : " + str(len(current_children_ids)))
    current_children_ids = []
    # print all the children ids found so far
    print ("-> all_children_ids       : " + str(len(all_children_ids)))
    print (" ")
    # to the next children level
    children_level = children_level + 1
    

Forcing soap:address location to HTTPS


children level: 0
-> parent_ids             : 1
-> terms have no children : 0
-> current_children_ids   : 11
-> all_children_ids       : 11
 
children level: 1
-> parent_ids             : 11
-> terms have no children : 4
-> current_children_ids   : 51
-> all_children_ids       : 62
 
children level: 2
-> parent_ids             : 51
-> terms have no children : 45
-> current_children_ids   : 34
-> all_children_ids       : 96
 
children level: 3
-> parent_ids             : 34
-> terms have no children : 27
-> current_children_ids   : 19
-> all_children_ids       : 115
 
children level: 4
-> parent_ids             : 19
-> terms have no children : 17
-> current_children_ids   : 5
-> all_children_ids       : 120
 
children level: 5
-> parent_ids             : 5
-> terms have no children : 3
-> current_children_ids   : 4
-> all_children_ids       : 124
 
children level: 6
-> parent_ids             : 4
-> terms have no children : 4
-> current_children_ids   : 0
-> all_children_ids       : 124


---

We extract the parent ontology terms of each chemical's ChEBI identifier and check whether any of them is among the child terms of 'nanostructure'.

In [9]:
# ChEBI_id contains one list per row -> flatten into a single list
all_chebi_id = list(chain.from_iterable(chemicals['ChEBI_id']))
print('Total number of chemical IDs: ',len(all_chebi_id))

Total number of chemical IDs:  624


In [10]:
#print(all_chebi_id)
chemical_ontologies = []
for i in range (0, len(all_chebi_id)):
    # request each entity once and reuse the response
    entity = client.service.getCompleteEntity(all_chebi_id[i])
    if entity is not None:
        current_chemical_ontologies = entity.OntologyParents
        for j in range(0, len(current_chemical_ontologies)):
            chemical_ontologies.append(current_chemical_ontologies[j].chebiId)
    else:
        # print the ChEBI ids for which no entity was returned
        print (all_chebi_id[i])
print ('Total number of ontology terms: ',len(chemical_ontologies))

CHEBI:183515
CHEBI:438934
Total number of ontology terms:  3530


In [11]:
# look for intersections between the chebi list and the nanostructure children
intersection = list(set(chemical_ontologies) & set(all_children_ids))
print ("The intersecting terms are: " + str(intersection) )

The intersecting terms are: []


---

## Identifying nanomaterial-related descriptions
We perform text matching for the term 'nano' in AOP and KE textual descriptions using SPARQL queries.
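
Matching the substring 'nano' also picks up words that are unrelated to nanomaterials, such as concentration units or author names. As a hedged sketch of a stricter keyword parser, the function below (`nano_words`, a hypothetical helper) decodes HTML entities such as `&nbsp;`, keeps only words that start with 'nano', and optionally drops words from an exclusion list. The sample sentence is made up. Unlike the SPARQL `contains` filter, this sketch ignores case.

```python
import html
import re

# words that start with "nano", allowing hyphenated forms
NANO_WORD = re.compile(r"\bnano[\w-]*", re.IGNORECASE)


def nano_words(text, exclude=()):
    """Return the words starting with 'nano' in text, after decoding HTML entities."""
    decoded = html.unescape(text)
    excluded = {word.lower() for word in exclude}
    return [word for word in NANO_WORD.findall(decoded)
            if word.lower() not in excluded]


# Made-up description text, for illustration only
sample = "insoluble&nbsp;nanomaterials at nanomolar doses (Inano, 1999); see Nanotoxicology."
print(nano_words(sample))
print(nano_words(sample, exclude=["nanomolar"]))
```

Which words belong on the exclusion list is a judgment call; 'nanomolar' is used here only as an example.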

In [12]:
#create the data frame
AOPs = pandas.DataFrame(columns=['AOP','AOPTitle'])
pandas.options.display.max_colwidth = 250
terms = ['dcterms:abstract','nci:C48192','dc:description', 'nci:C25217','aopo:AopContext','aopo:has_evidence','edam:operation_3799'] #to look at various text sections
Descriptions = []
for x in terms:
    AOPQuery = '''
    SELECT distinct ?AOP ?AOPTitle ?Description
    WHERE {
     ?aop a aopo:AdverseOutcomePathway ;
     dc:title ?AOPTitle ;
     dc:identifier ?AOP ;
    '''+x+''' ?Description. 
    filter contains(?Description, "nano").
    } ORDER BY ?AOP
    '''
    sparql.setQuery(AOPQuery)
    sparql.setReturnFormat(JSON)
    # Run the query; the parsed JSON response is stored in "results"
    results = sparql.query().convert()

    
    #put the data in the "AOPs" table
    for result in results["results"]["bindings"]:
            AOPs = AOPs.append({
                'AOP': result["AOP"]["value"],
                'AOPTitle'   : result["AOPTitle"]["value"],
            }, ignore_index=True)
            Descriptions.append(result["Description"]["value"])
# Parse the words containing 'nano' from the Descriptions
keywords = []
for text in Descriptions:
    words = text.split()
    listofwords = []
    for word in words:
        if 'nano' in word:
            listofwords.append(word)
    keywords.append(listofwords)
AOPs["Mentioned words"] = keywords
AOPs.sort_values(by=['AOP'])

Unnamed: 0,AOP,AOPTitle,Mentioned words
6,http://identifiers.org/aop/106,Chemical binding to tubulin in oocytes leading to aneuploid offspring,[nanomolar]
7,http://identifiers.org/aop/144,Endocytic lysosomal uptake leading to liver fibrosis,"[nanomaterials, nano]"
8,http://identifiers.org/aop/144,Endocytic lysosomal uptake leading to liver fibrosis,"[nanoscale, nanoclusters]"
0,http://identifiers.org/aop/173,Substance interaction with the lung resident cell membrane components leading to lung fibrosis,"[nanotoxicology, nanomaterial-induced, nanomaterials]"
3,http://identifiers.org/aop/173,Substance interaction with the lung resident cell membrane components leading to lung fibrosis,[insoluble&nbsp;nanomaterials]
9,http://identifiers.org/aop/173,Substance interaction with the lung resident cell membrane components leading to lung fibrosis,"[(nanomaterials)&nbsp;exhibiting, nanotoxicology.]"
1,http://identifiers.org/aop/207,NADPH oxidase and P38 MAPK activation leading to reproductive failure in Caenorhabditis elegans,[nanoparticles]
2,http://identifiers.org/aop/208,Janus kinase (JAK)/Signal transducer and activator of transcription (STAT) and Transforming growth factor (TGF)-beta pathways activation leading to reproductive failure,"[nanoparticles, nanoparticles, nanoparticles]"
11,http://identifiers.org/aop/293,Increased DNA damage leading to increased risk of breast cancer,"[Inano,, Inano,]"
4,http://identifiers.org/aop/294,Increased reactive oxygen and nitrogen species (RONS) leading to increased risk of breast cancer,"[Inano,, Inano,]"


In [13]:
#create the data frame
KEs = pandas.DataFrame(columns=['KE','KETitle','AOP'])
KEQuery = '''
SELECT ?KE ?KETitle ?AOP ?Description
WHERE {
 ?ke a aopo:KeyEvent ;
 dc:title ?KETitle ;
 dc:identifier ?KE ;
 dcterms:isPartOf ?AOP ;
 dc:description ?Description. 
filter contains( ?Description, "nano").
}
'''
sparql.setQuery(KEQuery)
sparql.setReturnFormat(JSON)
# Run the query; the parsed JSON response is stored in "results"
results = sparql.query().convert()

Descriptions = []
#put the data in the "KEs" table
for result in results["results"]["bindings"]:
        KEs = KEs.append({
            'KE': result["KE"]["value"],
            'KETitle'   : result["KETitle"]["value"],
            'AOP': result["AOP"]["value"],
        }, ignore_index=True)
        Descriptions.append(result["Description"]["value"])
#Parse keywords from the Descriptions
keywords = []
for text in Descriptions:
    words = text.split()
    listofwords = []
    for word in words:
        if 'nano' in word:
            listofwords.append(word)
    keywords.append(listofwords)
KEs["Mentioned words"] = keywords
KEs.sort_values(by=['AOP'])


Unnamed: 0,KE,KETitle,AOP,Mentioned words
1,http://identifiers.org/aop.events/960,"Increased, Uptake of thyroxine into tissue",http://identifiers.org/aop/152,"[nanomolar, nanoparticles, nanoparticle,]"
2,http://identifiers.org/aop.events/961,"Increased, Clearance of thyroxine from tissues",http://identifiers.org/aop/152,[nano-]
0,http://identifiers.org/aop.events/1498,Loss of alveolar capillary membrane integrity,http://identifiers.org/aop/173,[nanoparticles]


## Direct matching of stressor names
We perform text matching for the term 'nano' on the stressor titles.

In [38]:
Stressors = pandas.DataFrame(columns=['Stressor','StressorTitle','Part of','Chemical'])


StressorQuery = '''  

SELECT ?StressorID ?StressorTitle ?AOP ?Chemical
WHERE {
 ?ke a nci:C54571 ;
 dc:title ?StressorTitle ;
 dc:identifier ?Stressor ;
 rdfs:label ?StressorID;
 dcterms:isPartOf ?AOP .
 OPTIONAL {?ke aopo:has_chemical_entity ?Chemical.} 
filter contains( ?StressorTitle, "nano").
}
'''
sparql.setQuery(StressorQuery)
sparql.setReturnFormat(JSON)
# Run the query; the parsed JSON response is stored in "results"
results = sparql.query().convert()

for result in results["results"]["bindings"]:
        Stressors = Stressors.append({
            'Stressor': result["StressorID"]["value"],
            'StressorTitle'   : result["StressorTitle"]["value"],
            'Part of': result["AOP"]["value"],
           # 'Chemical': result["Chemical"]["value"] if ('Chemical' in result'),
        }, ignore_index=True)
Stressors

Unnamed: 0,Stressor,StressorTitle,Part of,Chemical
0,Stressor 224,nanoparticles,http://identifiers.org/aop.events/1539,
1,Stressor 224,nanoparticles,http://identifiers.org/aop/144,
2,Stressor 252,Silver nanoparticles,http://identifiers.org/aop/207,
3,Stressor 253,UV-activated Titanium dioxide nanoparticles,http://identifiers.org/aop/208,
4,Stressor 254,Silica nanoparticles,http://identifiers.org/aop/209,
5,Stressor 255,Graphene oxide nanoparticles,http://identifiers.org/aop/210,
6,Stressor 255,Graphene oxide nanoparticles,http://identifiers.org/aop/237,
7,Stressor 318,Carbon nanotubes,http://identifiers.org/aop/237,
8,Stressor 318,Carbon nanotubes,http://identifiers.org/aop/241,
9,Stressor 338,"Carbon nanotubes, Multi-walled carbon nanotubes, single-walled carbon nanotubes, carbon nanofibres",http://identifiers.org/aop.events/1458,


In [35]:
a = Stressors['Stressor'].to_list()
print('Total number of Stressors with "nano" in their title: '+str(len(set(a))))
set(a)

Total number of Stressors with "nano" in their title: 8


{'http://identifiers.org/aop.stressor/224',
 'http://identifiers.org/aop.stressor/252',
 'http://identifiers.org/aop.stressor/253',
 'http://identifiers.org/aop.stressor/254',
 'http://identifiers.org/aop.stressor/255',
 'http://identifiers.org/aop.stressor/318',
 'http://identifiers.org/aop.stressor/338',
 'http://identifiers.org/aop.stressor/377'}

## Collecting and combining results from text-matching SPARQL queries
We parse the SPARQL results and combine them into lists of AOPs and KEs related to nanomaterials.
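
The combination step amounts to counting how often each URI occurs across the three result sets. A compact sketch using `collections.Counter` is shown below. `combine_hits` is a hypothetical helper, and the input lists are a small hand-picked subset of the URIs shown in the tables above, not the full results. Unlike the notebook cell, the sketch counts every KE hit rather than recording each KE once.

```python
from collections import Counter


def combine_hits(aop_hits, ke_hits, stressor_links):
    """Count how often each AOP and KE URI appears across the text-matching results.

    aop_hits       : AOP URIs from the AOP description matches
    ke_hits        : (KE URI, AOP URI) pairs from the KE description matches
    stressor_links : URIs that a matching stressor is part of (AOPs or KEs)
    """
    aop_counts = Counter(aop_hits)
    ke_counts = Counter()
    for ke_uri, aop_uri in ke_hits:
        ke_counts[ke_uri] += 1
        aop_counts[aop_uri] += 1
    for uri in stressor_links:
        # KE URIs contain 'aop.events'; every other URI is treated as an AOP
        if "aop.events" in uri:
            ke_counts[uri] += 1
        else:
            aop_counts[uri] += 1
    return aop_counts, ke_counts


# Hand-picked subset of the URIs shown above, for illustration only
aop_hits = [
    "http://identifiers.org/aop/173",
    "http://identifiers.org/aop/173",
    "http://identifiers.org/aop/207",
]
ke_hits = [("http://identifiers.org/aop.events/1498", "http://identifiers.org/aop/173")]
stressor_links = [
    "http://identifiers.org/aop.events/1539",
    "http://identifiers.org/aop/144",
]

aop_counts, ke_counts = combine_hits(aop_hits, ke_hits, stressor_links)
print(sorted(aop_counts))
print(sorted(ke_counts))
```

The counts show how many independent matches support each AOP, which can help to spot AOPs that were included because of a single incidental word.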

In [17]:
ListofAOPs={}
ListofKEs={}
for aop in AOPs["AOP"]:
    if not aop in ListofAOPs:
        ListofAOPs[aop]=1
    else:
        ListofAOPs[aop]+=1
for aop in KEs["AOP"]:
    if not aop in ListofAOPs:
        ListofAOPs[aop]=1
    else:
        ListofAOPs[aop]+=1
for ke in KEs["KE"]:
    if not ke in ListofKEs:
        ListofKEs[ke] = 1
for aop in Stressors["Part of"]:
    if not 'events' in aop:
        if not aop in ListofAOPs:
            ListofAOPs[aop]=1
        else:
            ListofAOPs[aop]+=1
    else:
        if not aop in ListofKEs:
            ListofKEs[aop]=1
        else:
            ListofKEs[aop]+=1
AOPlist = []
for item in ListofAOPs:
    AOPlist.append(item)
print(AOPlist)
print('Total of AOPs related to nanomaterials: ',len(AOPlist))

['http://identifiers.org/aop/173', 'http://identifiers.org/aop/207', 'http://identifiers.org/aop/208', 'http://identifiers.org/aop/294', 'http://identifiers.org/aop/296', 'http://identifiers.org/aop/106', 'http://identifiers.org/aop/144', 'http://identifiers.org/aop/303', 'http://identifiers.org/aop/293', 'http://identifiers.org/aop/152', 'http://identifiers.org/aop/209', 'http://identifiers.org/aop/210', 'http://identifiers.org/aop/237', 'http://identifiers.org/aop/241']
Total of AOPs related to nanomaterials:  14


In [36]:
#create the data frame
AOPs = pandas.DataFrame(columns=['AOPID','shortAOPTitle','AdverseOutcome','AOP'])
pandas.options.display.max_colwidth = 250
Descriptions = []
for x in AOPlist:
    AOPQuery = '''
    SELECT distinct ?AOPID ?AOPTitle ?aop ?aotitle
    WHERE {
     ?aop a aopo:AdverseOutcomePathway ;
     dcterms:alternative ?AOPTitle ;
     dc:identifier <'''+x+'''>;
     rdfs:label ?AOPID;
     aopo:has_adverse_outcome ?ao.
     ?ao dc:title ?aotitle
    } ORDER BY ?AOPID
    '''
    sparql.setQuery(AOPQuery)
    sparql.setReturnFormat(JSON)
    # Run the query; the parsed JSON response is stored in "results"
    results = sparql.query().convert()

    
    #put the data in the "AOPs" table
    for result in results["results"]["bindings"]:
            AOPs = AOPs.append({
                'AOPID': result["AOPID"]["value"],
                'shortAOPTitle'   : result["AOPTitle"]["value"],
                'AdverseOutcome': result["aotitle"]["value"],
                'AOP': result["aop"]["value"],
            }, ignore_index=True)

AOPs.sort_values(by=['AOPID']).reset_index(drop=True)


Unnamed: 0,AOPID,shortAOPTitle,AdverseOutcome,AOP
0,AOP 106,Tubulin binding and aneuploidy,"Increase, Aneuploid offspring",http://identifiers.org/aop/106
1,AOP 144,lysosomal uptake induced liver fibrosis,"N/A, Liver fibrosis",http://identifiers.org/aop/144
2,AOP 152,Transthyretin interference,"Cognitive Function, Decreased",http://identifiers.org/aop/152
3,AOP 173,Substance interaction with the lung cell membrane leading to lung fibrosis,Pulmonary fibrosis,http://identifiers.org/aop/173
4,AOP 207,NADPH oxidase activation leading to reproductive failure,Reproductive failure,http://identifiers.org/aop/207
5,AOP 208,JAK/STAT and TGF-beta pathways activation leading to reproductive failure,Reproductive failure,http://identifiers.org/aop/208
6,AOP 209,Cholesterol and glutathione leading to hepatotoxicity: Multi-OMICS approach,Hepatotoxicity,http://identifiers.org/aop/209
7,AOP 210,"JNK, FOXO and WNT alteration leading to reproductive failure: Multi-OMICS approach",Reproductive failure,http://identifiers.org/aop/210
8,AOP 237,Secretion of inflammatory cytokines leading to plaque progression,Plaque progression in arteries,http://identifiers.org/aop/237
9,AOP 241,Latent TGFbeta1 activation leads to pulmonary fibrosis,Pulmonary fibrosis,http://identifiers.org/aop/241


### Dependencies

In [23]:
%load_ext watermark
%watermark -v -m -p SPARQLWrapper,numpy,pandas,requests,zeep

The watermark extension is already loaded. To reload it, use:
  %reload_ext watermark
CPython 3.7.3
IPython 7.6.1

SPARQLWrapper 1.8.2
numpy 1.16.4
pandas 0.24.2
requests 2.22.0
zeep 3.4.0

compiler   : GCC 7.3.0
system     : Linux
release    : 5.4.0-42-generic
machine    : x86_64
processor  : x86_64
CPU cores  : 8
interpreter: 64bit


In [24]:
import datetime
now = datetime.datetime.now()
print ("Date: " + str(now.day) + "/" + str(now.month) + "/" + str(now.year))

Date: 22/9/2020
