# Disease pathways in ATOMICANets

This Jupyter notebook shows how to load an ATOMICANet and analyze the largest disease pathways in the network. The five processed ATOMICANets can be downloaded from [Harvard Dataverse](https://doi.org/10.7910/DVN/4DUBJX).

For each of the three diseases shown in the paper, the notebook prints the largest pathway component, that is, the largest connected component among the nodes annotated with that disease: asthma in ATOMICANet-Lipid, myeloid leukemia in ATOMICANet-Ion, and hypertrophic cardiomyopathy in ATOMICANet-Small-Molecule.

In [1]:
import networkx as nx

In [None]:
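# Disease label to look up; it must match an entry in a node's 'diseases' attribute exactly.
chosen_disease = "asthma"

# Load the lipid network (point this at your downloaded copy).
graph_path = "/path/to/atomica_net/graphs/ATOMICANet_lipid.gml"
graph = nx.read_gml(graph_path)

# Collect the nodes annotated with the chosen disease.
# 'diseases' is stored as one string with entries separated by ", ".
nodes_in_disease = []
for node in graph.nodes():
    diseases = graph.nodes[node]['diseases'].split(", ")
    if chosen_disease in diseases:
        nodes_in_disease.append(node)

# Largest connected component of the subgraph induced by the disease nodes.
largest_connected_component = max(nx.connected_components(graph.subgraph(nodes_in_disease)), key=len)

print("Number of nodes in ATOMICANet associated with disease: ", len(nodes_in_disease))
print("Number of nodes in largest connected component: ", len(largest_connected_component))
print("Relative size of largest connected component: ", len(largest_connected_component)/len(nodes_in_disease))
# Keep only the first space-separated token of each node's 'gene' attribute.
print("Proteins in largest connected component: ", ", ".join([graph.nodes[node]['gene'].split(" ")[0] for node in largest_connected_component]))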

Number of nodes in ATOMICANet associated with disease:  43
Number of nodes in largest connected component:  10
Relative size of largest connected component:  0.23255813953488372
Proteins in largest connected component:  SCN7A, SCN1A, SCN3A, SCN8A, SCN4A, SCN5A, SCN9A, SCN10A, SCN2A, SCN11A
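
To make the component step explicit, the following is a minimal standard-library sketch of what `max(nx.connected_components(graph.subgraph(...)), key=len)` computes: only edges whose two endpoints are both disease nodes count, and the relative size is the largest component divided by the number of disease nodes. The function name `largest_component`, the adjacency dictionary, and the node names `p1` to `p5` are hypothetical and are not taken from ATOMICANet.

```python
from collections import deque


def largest_component(adjacency, keep):
    """Largest connected component of the subgraph induced by `keep`."""
    keep = set(keep)
    seen = set()
    best = set()
    for start in keep:
        if start in seen:
            continue
        # Breadth-first search restricted to nodes in `keep`.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency.get(node, ()):
                if neighbor in keep and neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        if len(component) > len(best):
            best = component
    return best


# Toy undirected graph (made-up data): a chain p1-p2-p3-p4 plus an isolated p5.
adjacency = {
    "p1": ["p2"],
    "p2": ["p1", "p3"],
    "p3": ["p2", "p4"],
    "p4": ["p3"],
    "p5": [],
}
# p3 is not a disease node, so it cannot bridge p2 and p4.
disease_nodes = ["p1", "p2", "p4", "p5"]

component = largest_component(adjacency, disease_nodes)
relative_size = len(component) / len(disease_nodes)
print(sorted(component), relative_size)
```

In this toy case the disease nodes split into `{p1, p2}`, `{p4}`, and `{p5}`, so the relative size is 2/4. The same ratio in the asthma output above is 10/43.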


In [None]:
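# Disease label to look up; it must match an entry in a node's 'diseases' attribute exactly.
chosen_disease = "myeloid leukemia"

# Load the ion network (point this at your downloaded copy).
graph_path = "/path/to/atomica_net/graphs/ATOMICANet_ion.gml"
graph = nx.read_gml(graph_path)

# Collect the nodes annotated with the chosen disease.
# 'diseases' is stored as one string with entries separated by ", ".
nodes_in_disease = []
for node in graph.nodes():
    diseases = graph.nodes[node]['diseases'].split(", ")
    if chosen_disease in diseases:
        nodes_in_disease.append(node)

# Largest connected component of the subgraph induced by the disease nodes.
largest_connected_component = max(nx.connected_components(graph.subgraph(nodes_in_disease)), key=len)

print("Number of nodes in ATOMICANet associated with disease: ", len(nodes_in_disease))
print("Number of nodes in largest connected component: ", len(largest_connected_component))
print("Relative size of largest connected component: ", len(largest_connected_component)/len(nodes_in_disease))
# Keep only the first space-separated token of each node's 'gene' attribute.
print("Proteins in largest connected component: ", ", ".join([graph.nodes[node]['gene'].split(" ")[0] for node in largest_connected_component]))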

Number of nodes in ATOMICANet associated with disease:  53
Number of nodes in largest connected component:  12
Relative size of largest connected component:  0.22641509433962265
Proteins in largest connected component:  PRKCA, TET2, PRKD1, WT1, PRKCE, POLE, PRKD3, PRKCH, DNMT1, PRKCB, PHF6, PRKCG


In [None]:
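# Disease label to look up; it must match an entry in a node's 'diseases' attribute exactly.
chosen_disease = "hypertrophic cardiomyopathy"

# Load the small-molecule network (point this at your downloaded copy).
graph_path = "/path/to/atomica_net/graphs/ATOMICANet_small_molecule.gml"
graph = nx.read_gml(graph_path)

# Collect the nodes annotated with the chosen disease.
# 'diseases' is stored as one string with entries separated by ", ".
nodes_in_disease = []
for node in graph.nodes():
    diseases = graph.nodes[node]['diseases'].split(", ")
    if chosen_disease in diseases:
        nodes_in_disease.append(node)

# Largest connected component of the subgraph induced by the disease nodes.
largest_connected_component = max(nx.connected_components(graph.subgraph(nodes_in_disease)), key=len)

print("Number of nodes in ATOMICANet associated with disease: ", len(nodes_in_disease))
print("Number of nodes in largest connected component: ", len(largest_connected_component))
print("Relative size of largest connected component: ", len(largest_connected_component)/len(nodes_in_disease))
# Keep only the first space-separated token of each node's 'gene' attribute.
print("Proteins in largest connected component: ", ", ".join([graph.nodes[node]['gene'].split(" ")[0] for node in largest_connected_component]))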

Number of nodes in ATOMICANet associated with disease:  45
Number of nodes in largest connected component:  7
Relative size of largest connected component:  0.15555555555555556
Proteins in largest connected component:  HRAS, MYH6, MYH7B, ACTB, MARK3, ACTG1, ACTC1
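
The three cells above differ only in the disease name and the graph path, so the shared logic could be wrapped in a helper. The following is one possible sketch, not part of the original notebook: `largest_disease_component` is a hypothetical name, and the toy graph uses made-up nodes and labels so the example runs without the downloaded `.gml` files. Unlike the cells above, it assumes a node may lack a `diseases` attribute and returns an empty set when no node matches, rather than raising an error.

```python
import networkx as nx


def largest_disease_component(graph, disease):
    """Return (disease_nodes, largest_component) for one disease label.

    Assumes 'diseases' is a string with entries separated by ", ",
    as in the cells above. Nodes without the attribute are skipped.
    """
    disease_nodes = [
        node
        for node, data in graph.nodes(data=True)
        if disease in data.get("diseases", "").split(", ")
    ]
    if not disease_nodes:
        # max() on an empty sequence would raise ValueError.
        return disease_nodes, set()
    components = nx.connected_components(graph.subgraph(disease_nodes))
    return disease_nodes, max(components, key=len)


# Toy graph with made-up node names and labels (not ATOMICANet data).
toy = nx.Graph()
toy.add_node("a", diseases="asthma, eczema")
toy.add_node("b", diseases="asthma")
toy.add_node("c", diseases="eczema")
toy.add_node("d", diseases="asthma")
toy.add_edges_from([("a", "b"), ("b", "c"), ("c", "d")])

nodes, component = largest_disease_component(toy, "asthma")
print(len(nodes), sorted(component))
```

With a real network, the same call would take the graph returned by `nx.read_gml(graph_path)` and one of the disease names used above.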
