<a href="https://colab.research.google.com/github/IUIDSL/neo-tools/blob/master/PD_analysis_D2D_IU_KG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Parkinson's disease analysis using the D2D-IU KG

* Neo4j instance at <http://cheminfov.informatics.indiana.edu:7474/>
* Accessed via the [Neo4j Python driver](https://neo4j.com/developer/python/) ([docs](https://neo4j.com/docs/driver-manual/))

### Imports

In [34]:
!pip install neo4j
import neo4j



In [0]:
import pandas

### Connect to db

In [0]:
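# Bolt endpoint of the D2D-IU Neo4j instance; no credentials are passed to the driver.
uri = "bolt://cheminfov.informatics.indiana.edu/"
db = neo4j.GraphDatabase.driver(uri)
# A single session is reused by all queries below.
session = db.session()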
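
The driver and session opened above stay open until they are closed explicitly. A minimal sketch of one way to guarantee cleanup is a small context manager. The names `open_session` and `driver_factory` are hypothetical and introduced here only for illustration; the factory hook exists so the helper can be exercised without a live server, and it is not part of the original notebook or of the Neo4j driver.

```python
from contextlib import contextmanager


@contextmanager
def open_session(uri, driver_factory=None):
    """Yield a session for `uri`, closing session and driver on exit.

    `driver_factory` is a hypothetical hook: any callable that takes a URI
    and returns an object with `.session()` and `.close()`.
    """
    if driver_factory is None:
        # Imported lazily; requires `pip install neo4j`.
        import neo4j
        driver_factory = neo4j.GraphDatabase.driver
    driver = driver_factory(uri)
    try:
        session = driver.session()
        try:
            yield session
        finally:
            session.close()
    finally:
        driver.close()


# Usage (needs network access to the server, so it is left commented out):
# with open_session("bolt://cheminfov.informatics.indiana.edu/") as session:
#     rows = session.run("MATCH (n) RETURN count(n) AS n").data()
```

With this pattern the connection is released even if a query raises, at the cost of reopening a session for each block of work.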

### Search for PD-related entities

In [43]:
# Find every node, of any type, whose label contains 'Parkinson'.
cql = '''MATCH (d)
        WHERE d.label CONTAINS 'Parkinson'
        RETURN d.label, d.nodeType
        ORDER BY d.nodeType'''
df = pandas.DataFrame(session.run(cql).data())
df

| | d.label | d.nodeType |
|---|---|---|
| 0 | Defective SLC6A3 causes Parkinsonism-dystonia ... | Pathway |
| 1 | Defective SLC6A3 causes Parkinsonism-dystonia ... | Pathway |
| 2 | Anticonvulsant and antiParkinsonian drug poiso... | Phenotype |
| 3 | Parkinsonism due to heredodegenerative disorder | Phenotype |
| 4 | Parkinson | Phenotype |
| 5 | Parkinsonism co-occurrent and due to acute inf... | Phenotype |
| 6 | Parkinsonism with dementia of Guadeloupe | Phenotype |
| 7 | Young onset Parkinson disease | Phenotype |
| 8 | Autosomal dominant late onset Parkinson disease | Phenotype |
| 9 | Atypical Parkinsonism | Phenotype |
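
The search term in the query above is written directly into the Cypher string. A hedged sketch of the same search with the term passed as a query parameter, which makes it easy to reuse for other terms; `find_nodes_by_label` is a hypothetical helper name, and `session` is assumed to be a Neo4j driver session like the one opened earlier.

```python
def find_nodes_by_label(session, term):
    """Return label and node type for every node whose label contains `term`."""
    cql = '''MATCH (d)
        WHERE d.label CONTAINS $term
        RETURN d.label, d.nodeType
        ORDER BY d.nodeType'''
    # Keyword arguments to session.run() are sent as Cypher parameters,
    # so `term` fills the $term placeholder.
    return session.run(cql, term=term).data()


# Usage (needs the open session from the cell above):
# df = pandas.DataFrame(find_nodes_by_label(session, "Parkinson"))
```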


### PD phenotypes: relationship types, neighbor node types, and counts

In [37]:
# Count relationships per phenotype, relationship type, and neighbor node type.
cql = '''MATCH (d:Phenotype)-[r]-(o)
        WHERE d.label CONTAINS 'Parkinson'
        RETURN d.label, d.nodeType, type(r), o.nodeType, count(*) AS degree
        ORDER BY degree DESC'''
df = pandas.DataFrame(session.run(cql).data())
df.head(10)

| | d.label | d.nodeType | degree | o.nodeType | type(r) |
|---|---|---|---|---|---|
| 0 | Parkinson Disease | Phenotype | 442 | Gene | geneDisease |
| 1 | Parkinson | Phenotype | 276 | Compound | compoundAdverseEffect |
| 2 | Parkinson | Phenotype | 93 | Gene | geneDisease |
| 3 | Parkinson | Phenotype | 36 | Compound | compoundIndication |
| 4 | Parkinson Disease | Phenotype | 30 | Compound | compoundIndication |
| 5 | Parkinson | Phenotype | 24 | Phenotype | isaPhenotype |
| 6 | Wolff-Parkinson-White Syndrome | Phenotype | 17 | Compound | compoundAdverseEffect |
| 7 | Anticonvulsant and antiParkinsonian drug poiso... | Phenotype | 15 | Phenotype | isaPhenotype |
| 8 | Parkinsonism | Phenotype | 15 | Phenotype | isaPhenotype |
| 9 | Wolff-Parkinson-White Syndrome | Phenotype | 15 | Gene | geneDisease |
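
The `CONTAINS 'Parkinson'` filter also matches Wolff-Parkinson-White Syndrome (rows 6 and 9 above), a cardiac conduction disorder that is unrelated to Parkinson's disease. A hedged sketch of one way to drop such rows on the client side before building the DataFrame; the `drop_excluded` helper and the exclusion list are assumptions made for illustration, and deciding which labels truly belong to PD remains a judgment call for the analyst.

```python
# Assumption: an illustrative exclusion list, not a curated one.
EXCLUDED_SUBSTRINGS = ("Wolff-Parkinson-White",)


def drop_excluded(records, key="d.label", excluded=EXCLUDED_SUBSTRINGS):
    """Keep only records whose `key` value contains none of the excluded substrings."""
    return [rec for rec in records
            if not any(sub in rec[key] for sub in excluded)]


# Three rows copied from the table above, in the list-of-dicts shape
# that session.run(cql).data() returns.
sample = [
    {"d.label": "Parkinson Disease", "type(r)": "geneDisease",
     "o.nodeType": "Gene", "degree": 442},
    {"d.label": "Wolff-Parkinson-White Syndrome", "type(r)": "compoundAdverseEffect",
     "o.nodeType": "Compound", "degree": 17},
    {"d.label": "Parkinsonism", "type(r)": "isaPhenotype",
     "o.nodeType": "Phenotype", "degree": 15},
]
kept = drop_excluded(sample)
print([rec["d.label"] for rec in kept])
# → ['Parkinson Disease', 'Parkinsonism']
```

Filtering after the query keeps the Cypher unchanged; the same exclusion could instead be expressed in the `WHERE` clause.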


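### PD-related compounds

Some rows in the output are duplicated (for example, RISPERIDONE appears twice). This is probably a consequence of the query rather than a Cypher error: `r` is a grouping key in the `WITH` clause, so each relationship forms its own group and `degree` is always 1. Repeated rows would then arise when a compound is linked to the same phenotype by more than one relationship of the same type, or when distinct nodes share the same label.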

In [38]:
cql = '''MATCH (d:Phenotype)-[r]-(o:Compound)
        WHERE d.label CONTAINS 'Parkinson'
        WITH d, o, r, count((d)-[r]-(o)) AS degree
        RETURN d.nodeType, d.label, type(r), o.nodeType, o.label, degree
        ORDER BY degree DESC'''
df = pandas.DataFrame(session.run(cql).data())
df.head(12)

| | d.label | d.nodeType | degree | o.label | o.nodeType | type(r) |
|---|---|---|---|---|---|---|
| 0 | Parkinson | Phenotype | 1 | PALIPERIDONE | Compound | compoundAdverseEffect |
| 1 | Parkinson | Phenotype | 1 | RISPERIDONE | Compound | compoundAdverseEffect |
| 2 | Parkinson | Phenotype | 1 | RISPERIDONE | Compound | compoundAdverseEffect |
| 3 | Parkinson | Phenotype | 1 | LURASIDONE | Compound | compoundAdverseEffect |
| 4 | Parkinson | Phenotype | 1 | (1S,2S,3R)-2-(1,3-benzodioxol-5-yl)-1-formyl-3... | Compound | compoundAdverseEffect |
| 5 | Parkinson | Phenotype | 1 | NITRAZEPAM | Compound | compoundAdverseEffect |
| 6 | Parkinson | Phenotype | 1 | PROCHLORPERAZINE | Compound | compoundAdverseEffect |
| 7 | Parkinson | Phenotype | 1 | EMTRICITABINE | Compound | compoundAdverseEffect |
| 8 | Parkinson | Phenotype | 1 | N-tert-butyl-2-[2-hydroxy-3-[(3-hydroxy-2-meth... | Compound | compoundAdverseEffect |
| 9 | Parkinson | Phenotype | 1 | 3-(1,3-benzodioxol-5-yloxymethyl)-4-(4-fluorop... | Compound | compoundAdverseEffect |
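
A sketch of how the repeated rows could be investigated, under stated assumptions. The Cypher string below is one possible rewrite that groups by `type(r)` instead of `r`, so that `degree` counts relationships per phenotype label, compound label, and relationship type; it has not been run against this database, and because it groups by label it would also merge distinct nodes that share a label. `AGGREGATED_CQL` and `duplicate_keys` are hypothetical names. The helper counts repeated (phenotype, relationship type, compound) combinations in rows that have already been fetched.

```python
from collections import Counter

# Assumption: a possible rewrite of the query above, not verified on the server.
# Non-aggregated RETURN expressions act as grouping keys for count(r).
AGGREGATED_CQL = '''MATCH (d:Phenotype)-[r]-(o:Compound)
        WHERE d.label CONTAINS 'Parkinson'
        RETURN d.nodeType, d.label, type(r), o.nodeType, o.label, count(r) AS degree
        ORDER BY degree DESC'''


def duplicate_keys(records):
    """Map each repeated (d.label, type(r), o.label) combination to its row count."""
    counts = Counter((rec["d.label"], rec["type(r)"], rec["o.label"])
                     for rec in records)
    return {key: n for key, n in counts.items() if n > 1}


# Rows 0-3 of the table above, in the shape returned by session.run(cql).data().
sample = [
    {"d.label": "Parkinson", "type(r)": "compoundAdverseEffect", "o.label": "PALIPERIDONE"},
    {"d.label": "Parkinson", "type(r)": "compoundAdverseEffect", "o.label": "RISPERIDONE"},
    {"d.label": "Parkinson", "type(r)": "compoundAdverseEffect", "o.label": "RISPERIDONE"},
    {"d.label": "Parkinson", "type(r)": "compoundAdverseEffect", "o.label": "LURASIDONE"},
]
print(duplicate_keys(sample))
# → {('Parkinson', 'compoundAdverseEffect', 'RISPERIDONE'): 2}
```

Running `duplicate_keys(session.run(cql).data())` on the full result would show how widespread the repeats are before deciding whether to change the query.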


In [0]:
# Uncomment to close the session and the driver when finished.
#session.close()
#db.close()