### COVID-19 Graph CVD ICD11 Corona Symptom Analysis

This notebook searches the publication graph for ICD11 symptom and descriptor terms associated with coronavirus and lists the publications that mention each term.

#### Authentication to access covidgraph.org graph

In [9]:
import pandas as pd
import json
from neo4j import GraphDatabase
# from neo4j import APOC

In [10]:
covid_browser = "https://covid.petesis.com:7473"
covid_url = "bolt://covid.petesis.com:7687"
user = "public"
password = "corona"

# Open a Bolt connection using the public credentials above.
driver = GraphDatabase.driver(covid_url, auth=(user, password))

#### The queries below search for the symptom and descriptor terms listed under each ICD11 code
- For each ICD11 code, a list of all its associated symptom names is created
- In a loop, each name is queried and the matching papers are stored in a dictionary with 5 main publication attributes (journal, publish time, source, title, and url)
- This dictionary is added to a larger dictionary that maps each name to all of its associated papers
- The resulting data is then written to a `json` file named after its ICD11 code
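The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's own code: `papers_by_entity`, `write_icd11_json`, and the `fetch_papers` callable are hypothetical names, and `fetch_papers` is assumed to return dict-like paper records (for example, the properties of `Paper` nodes returned by a Cypher query).

```python
import json
from pathlib import Path

# The five publication attributes named in the list above.
PAPER_FIELDS = ("journal", "publish_time", "source", "title", "url")


def papers_by_entity(entities, fetch_papers):
    """Map each entity name to the papers that mention it.

    `fetch_papers` is a hypothetical callable: it takes one entity name and
    returns an iterable of dict-like paper records.
    """
    mapping = {}
    for entity in entities:
        # Keep only the five attributes of interest for each paper.
        mapping[entity] = [
            {field: paper.get(field) for field in PAPER_FIELDS}
            for paper in fetch_papers(entity)
        ]
    return mapping


def write_icd11_json(icd11_code, mapping, out_dir="."):
    """Write the entity-to-papers mapping to `<icd11_code>.json`."""
    path = Path(out_dir) / f"{icd11_code}.json"
    with path.open("w", encoding="utf-8") as handle:
        json.dump(mapping, handle, indent=2, ensure_ascii=False)
    return path
```

Passing the query function in as an argument keeps the file-writing logic independent of the database connection, so it can be tried out with a stub before running it against the graph.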

**The search terms are coronavirus disease and symptom names from ICD11, matched against the text of 'BodyText' nodes in the graph.**

#### ICD11 Code: XN83D

In [11]:
# Sanity check: fetch one paper whose body text mentions 'coronavirus'.
query = """
MATCH (p:Paper)-[:PAPER_HAS_BODYTEXTCOLLECTION]-(:BodyTextCollection)
      -[:BODYTEXTCOLLECTION_HAS_BODYTEXT]-(a:BodyText)
WHERE LOWER(a.text) CONTAINS 'coronavirus'
RETURN p LIMIT 1
"""
with driver.session() as session:
    info = session.run(query)
    for item in info:
        print(item)

<Record p=<Node id=66685 labels={'Paper'} properties={'cord_uid': 'imbxofkp', 'cord19-fulltext_hash': '276d1d1c20336ca2a6f54c7a95507001917e4c44', 'journal': 'Emerg Infect Dis', 'publish_time': '2005-01-10', 'source': 'PMC', 'title': 'Tracing SARS-Coronavirus Variant with Large Genomic Deletion', '_hash_id': '9986e9e7e5fd88596118e63d8adb8233', 'url': 'https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3294368/'}>>


In [12]:
# Lower-cased entity names listed under ICD11 code XN83D.
entities_xn = ['human coronavirus 229e', 'human coronavirus hku1', 'human coronavirus oc43',
               'middle east respiratory syndrome coronavirus', 'pipistrellus bat coronavirus hku5',
               'rousettus bat coronavirus hku9', 'severe acute respiratory syndrome coronavirus',
               'tylonycteris bat coronavirus hku4']

In [15]:
result_xn = []      # one {entity: [cord_uid, ...]} dict per entity
result_xn_len = []  # one {entity: paper count} dict per entity

# Pass the entity as a query parameter instead of concatenating it into the
# Cypher text; this avoids quoting problems and lets the server reuse the plan.
query = """
MATCH (p:Paper)-[:PAPER_HAS_BODYTEXTCOLLECTION]-(:BodyTextCollection)
      -[:BODYTEXTCOLLECTION_HAS_BODYTEXT]-(a:BodyText)
WHERE LOWER(a.text) CONTAINS $entity
RETURN DISTINCT p.cord_uid
"""

with driver.session() as session:
    for entity in entities_xn:
        entity_result = [record[0] for record in session.run(query, entity=entity)]
        result_xn.append({entity: entity_result})
        result_xn_len.append({entity: len(entity_result)})

In [16]:
result_xn_len

[{'human coronavirus 229e': 491},
 {'human coronavirus hku1': 78},
 {'human coronavirus oc43': 277},
 {'middle east respiratory syndrome coronavirus': 1819},
 {'pipistrellus bat coronavirus hku5': 23},
 {'rousettus bat coronavirus hku9': 7},
 {'severe acute respiratory syndrome coronavirus': 2243},
 {'tylonycteris bat coronavirus hku4': 23}]
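
Because `result_xn_len` is a list of single-key dictionaries, it can be easier to read once merged and ranked. The sketch below is one possible way to do that; the counts are copied from the output above, and the names `counts` and `ranked` are introduced here only for illustration.

```python
# Counts copied from the `result_xn_len` output above.
result_xn_len = [
    {'human coronavirus 229e': 491},
    {'human coronavirus hku1': 78},
    {'human coronavirus oc43': 277},
    {'middle east respiratory syndrome coronavirus': 1819},
    {'pipistrellus bat coronavirus hku5': 23},
    {'rousettus bat coronavirus hku9': 7},
    {'severe acute respiratory syndrome coronavirus': 2243},
    {'tylonycteris bat coronavirus hku4': 23},
]

# Merge the single-key dictionaries into one {entity: count} dictionary.
counts = {entity: n for item in result_xn_len for entity, n in item.items()}

# Rank entities by paper count, largest first.
ranked = sorted(counts.items(), key=lambda pair: pair[1], reverse=True)

for entity, n in ranked:
    print(f"{n:>5}  {entity}")
```

Note that a paper mentioning several entities is counted once under each of them, so the counts overlap.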

In [17]:
result_xn

[{'human coronavirus 229e': ['oa1uoqu5',
   'zaaraiy7',
   'zojhdnlu',
   'tro7b4d0',
   '1ssh296a',
   '1wswi7us',
   'yy96yeu9',
   '5gsbtfag',
   'rrhh2alf',
   'jh9e85c0',
   'xtg0e142',
   '0fitbwuv',
   '3gd1w2kn',
   'sc1yzzsn',
   't0dg7y73',
   '20los4eg',
   '1c4m2fym',
   'cobrl3dl',
   'v5vfsejz',
   'gkm7i62s',
   'qw4fhdeo',
   '9r9ll83m',
   '6g55l35h',
   '6n4updwi',
   '3c4dttrt',
   'lakdi3x8',
   'ss1upwwo',
   '108refnh',
   'rn6rtu38',
   '4z7yfj5s',
   '9tumplp3',
   'i5korxut',
   'exvngzza',
   '9wflr4u3',
   '6c68pmem',
   'd9cna1g6',
   'wa6k2kv6',
   'n0rubc72',
   'x3b6j5d0',
   'fiizu59h',
   'nc8ktxeo',
   'v2wovcqd',
   'vexo81k5',
   'zwl0safn',
   'c57impca',
   '855h0e1k',
   '509p6use',
   'dmqfij6j',
   'zgka74sr',
   'e6401p1p',
   '6f0q661e',
   '3jaw7ymu',
   '7qk98wku',
   'mvpd977d',
   '48wao8a0',
   '6eptw6io',
   '8pkrg0mx',
   'f06msmmr',
   '7x1wwqs3',
   '2j4pjjwk',
   'yc8q62z8',
   'soxxnnk8',
   'vw07dywk',
   'pvoe8enz',
   '0mu4tkui',