![CMRN](https://cmrn.systemsbiology.net/static/images/cancerMiRNARegulatoryNetwork.gif)
# Cancer miRNA regulatory network reconstruction

### Abstract
Genes regulated by the same miRNA can be discovered by virtue of their coexpression at the transcriptional level and the presence of a conserved miRNA-binding site in their 3' UTRs. Using this principle we have integrated the three best performing and complementary algorithms into a framework for inference of regulation by miRNAs (FIRM) from sets of coexpressed genes. We demonstrate the utility of FIRM by inferring a cancer-miRNA regulatory network through the analysis of 2,240 gene coexpression signatures from 46 cancers. By analyzing this network for functional enrichment of known hallmarks of cancer we have discovered a subset of 13 miRNAs that regulate oncogenic processes across diverse cancers. We have performed experiments to test predictions from this miRNA-regulatory network to demonstrate that miRNAs of the miR-29 family (miR-29a, miR-29b, and miR-29c) regulate specific genes associated with tissue invasion and metastasis in lung adenocarcinoma. Further, we highlight the specificity of using FIRM inferences to identify miRNA-regulated genes by experimentally validating that miR-767-5p, which partially shares the miR-29 seed sequence, regulates only a subset of miR-29 targets. By providing mechanistic linkage between miRNA dysregulation in cancer, their binding sites in the 3'UTRs of specific sets of coexpressed genes, and their associations with known hallmarks of cancer, FIRM, and the inferred cancer miRNA-regulatory network will serve as a powerful public resource for discovery of potential cancer biomarkers.

[Plaisier CL, Pan M, Baliga NS. A miRNA-regulatory network explains how dysregulated miRNAs perturb oncogenic processes across diverse cancers. Genome Res. 2012 Nov;22(11):2302-14. doi: 10.1101/gr.133991.111. Epub 2012 Jun 28. PMID: 22745231; PMCID: PMC3483559.](https://pubmed.ncbi.nlm.nih.gov/22745231/)

[Cancer miRNA Regulatory Network](https://cmrn.systemsbiology.net/)

### Goal of today's analyses
Build a network based on several different statistical analyses: 1) miRNA to target gene enrichment analysis across 2,240 gene co-expression signatures, 2) GO biological process functional enrichment analysis of the same 2,240 co-expression signatures, and 3) semantic similarity of the enriched GO terms with the hallmarks of cancer. All of these data are located in [Supplementary Table 8](https://genome.cshlp.org/content/suppl/2012/09/13/gr.133991.111.DC1/SuppTab8.xlsx) from [Plaisier, et al. 2012](https://pubmed.ncbi.nlm.nih.gov/22745231/).

![Network structure](https://genome.cshlp.org/content/22/11/2302/F6.large.jpg)
**Figure legend**: Summary of FIRM predictions for the miR-29a/b/c and miR-767-5p cancer–miRNA regulatory subnetwork. This subnetwork is included in both the metastatic- and cross-cancer–miRNA regulatory networks. The network is laid out hierarchically with (from the top down) cancers, miRNAs, coexpression signatures, genes that were experimentally validated through luciferase assays, significantly enriched GO biological process terms for the coexpression signature, and finally the GO terms associated with hallmarks of cancers. (Left) The FIRM integration strategy that is a flow of information through this hierarchy, where the red arrows indicate a FIRM prediction. The meanings of the FIRM predictions are described on the right side, where inference of a miRNA regulating a cancer coexpression signature predicts that the miRNA is dysregulated in that cancer. This same inference predicts that the miRNA regulates the genes in the signature, which can be tested experimentally. Functional enrichment of GO term annotations among the coregulated genes predicts the effect of regulating this set of genes, and association of the enriched GO terms with hallmarks of cancer predicts the oncogenic processes that might be affected. (FIRM = framework for inference of regulation by miRNAs)

### Files for the analysis
Download the data files and code for this lecture:

- [data_network.zip](https://asu.instructure.com/courses/166212/files/70614348?wrap=1)
- [networkReconstruction.ipynb](https://asu.instructure.com/courses/166212/files/70614350?wrap=1)
- [networkReconstruction.py](https://asu.instructure.com/courses/166212/files/70614351?wrap=1)

Then put them into a working directory and extract the data files so that you have the following directory structure:

```
| Working_Directory
     -| networkReconstruction.py
     -| networkReconstruction.ipynb
     -| data
         -| hsa_mature.csv
         -| SuppTab8.csv
```

**Descriptions of files:**
- **data/SuppTab8.csv**:  Supplementary Table 8 is distributed as an Excel spreadsheet; it has been converted into a CSV file, which is included in the data_network.zip file you downloaded.
- **data/hsa_mature.csv**:  A lookup table for translating miRBase mature sequence IDs (computer friendly) into miRNA names (human friendly).



### Packages to install
You will need to install both pyvis and networkx:

In [1]:
import sys
!{sys.executable} -m pip install pyvis
!{sys.executable} -m pip install networkx



### Load libraries

In [2]:
import pandas as pd
from pyvis.network import Network
import networkx as nx

### Load up data
Load SuppTab8, which has all the information needed:

In [3]:
st8 = pd.read_csv('data/SuppTab8.csv', header=0, index_col=0)
print(st8.shape)

(314, 13)


Split the cancer type out of each cluster name and add it to the DataFrame:

In [4]:
st8['cancer'] = [i.split(' ')[1] for i in st8.index]
st8

Co-expression Signature,GO Terms,Mature Seqeunce IDs,Hallmarks of Cancer,Self Sufficiency In Growth Signals,Insensitivity To Antigrowth Signals,Evading Apoptosis,Limitless Replicative Potential,Sustained Angiogenesis,Tissue Invasion And Metastasis,Genome Instability And Mutation,Tumor Promoting Inflammation,Reprogramming Energy Metabolism,Evading Immune Detection,cancer
COID Lung Bhattacharjee_24,GO:0051384|GO:0050896|GO:0030155|GO:0007596|GO...,MIMAT0004694,9,0.93,0.93,0.90,0.81,1.00,1.00,0.81,0.85,0.37,0.85,Lung
SMCL Lung Bhattacharjee_61,GO:0019226|GO:0007599|GO:0042127|GO:0007596|GO...,MIMAT0000760,9,1.00,1.00,1.00,0.81,1.00,1.00,0.85,1.00,0.79,1.00,Lung
AC Brain Sun_5,GO:0019730|GO:0050867|GO:0042110|GO:0006928|GO...,MIMAT0003307|MIMAT0002840,8,0.93,0.93,0.90,0.51,0.88,0.99,0.83,1.00,0.48,1.00,Brain
GCT Seminoma Korkola_4,GO:0030301|GO:0042127|GO:0006631|GO:0045907|GO...,MIMAT0000729|MIMAT0002172|MIMAT0000720,8,1.00,1.00,0.94,0.52,1.00,1.00,0.83,0.95,0.48,0.95,Seminoma
ME Melanoma Hoek_50,GO:0019226|GO:0008104|GO:0007599|GO:0019221|GO...,MIMAT0003285,8,1.00,1.00,0.87,0.53,0.92,1.00,0.83,0.95,0.79,0.95,Melanoma
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
SRS Ovarian Hendrix_1,GO:0006139|GO:0010467|GO:0045449,MIMAT0003311|MIMAT0001639,0,0.78,0.78,0.53,0.39,0.37,0.53,0.76,0.58,0.44,0.58,Ovarian
SRS Ovarian Hendrix_11,GO:0019058|GO:0019083|GO:0031018,MIMAT0004801|MIMAT0000703,0,0.32,0.32,0.07,0.04,0.27,0.31,0.40,0.14,0.05,0.14,Ovarian
SRS Ovarian Hendrix_26,GO:0008333,MIMAT0000771,0,0.14,0.14,0.00,0.00,0.00,0.26,0.00,0.00,0.00,0.00,Ovarian
TU Prostate Lapointe_1,GO:0018243|GO:0018242|GO:0009100|GO:0006486,MIMAT0004801|MIMAT0000083|MIMAT0000082|MIMAT00...,0,0.40,0.40,0.16,0.00,0.00,0.23,0.40,0.16,0.67,0.16,Prostate
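
The list comprehension above appears to rely on every signature name having the cancer type as its second space-delimited token. That pattern is inferred from the names shown in the table, not from a documented specification. A minimal check on three names copied from the table:

```python
# Names copied from the table above; the second token is taken as the cancer type.
names = ['COID Lung Bhattacharjee_24', 'AC Brain Sun_5', 'GCT Seminoma Korkola_4']
cancers = [name.split(' ')[1] for name in names]
print(cancers)
# ['Lung', 'Brain', 'Seminoma']
```

If a name ever contained a space inside its first token, this indexing would return the wrong piece, so it is worth spot checking the new `cancer` column.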


Split the pipe-delimited GO terms into a list:

In [5]:
st8['GO Terms'] = [i.split('|') for i in st8['GO Terms']]
st8

Co-expression Signature,GO Terms,Mature Seqeunce IDs,Hallmarks of Cancer,Self Sufficiency In Growth Signals,Insensitivity To Antigrowth Signals,Evading Apoptosis,Limitless Replicative Potential,Sustained Angiogenesis,Tissue Invasion And Metastasis,Genome Instability And Mutation,Tumor Promoting Inflammation,Reprogramming Energy Metabolism,Evading Immune Detection,cancer
COID Lung Bhattacharjee_24,"[GO:0051384, GO:0050896, GO:0030155, GO:000759...",MIMAT0004694,9,0.93,0.93,0.90,0.81,1.00,1.00,0.81,0.85,0.37,0.85,Lung
SMCL Lung Bhattacharjee_61,"[GO:0019226, GO:0007599, GO:0042127, GO:000759...",MIMAT0000760,9,1.00,1.00,1.00,0.81,1.00,1.00,0.85,1.00,0.79,1.00,Lung
AC Brain Sun_5,"[GO:0019730, GO:0050867, GO:0042110, GO:000692...",MIMAT0003307|MIMAT0002840,8,0.93,0.93,0.90,0.51,0.88,0.99,0.83,1.00,0.48,1.00,Brain
GCT Seminoma Korkola_4,"[GO:0030301, GO:0042127, GO:0006631, GO:004590...",MIMAT0000729|MIMAT0002172|MIMAT0000720,8,1.00,1.00,0.94,0.52,1.00,1.00,0.83,0.95,0.48,0.95,Seminoma
ME Melanoma Hoek_50,"[GO:0019226, GO:0008104, GO:0007599, GO:001922...",MIMAT0003285,8,1.00,1.00,0.87,0.53,0.92,1.00,0.83,0.95,0.79,0.95,Melanoma
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
SRS Ovarian Hendrix_1,"[GO:0006139, GO:0010467, GO:0045449]",MIMAT0003311|MIMAT0001639,0,0.78,0.78,0.53,0.39,0.37,0.53,0.76,0.58,0.44,0.58,Ovarian
SRS Ovarian Hendrix_11,"[GO:0019058, GO:0019083, GO:0031018]",MIMAT0004801|MIMAT0000703,0,0.32,0.32,0.07,0.04,0.27,0.31,0.40,0.14,0.05,0.14,Ovarian
SRS Ovarian Hendrix_26,[GO:0008333],MIMAT0000771,0,0.14,0.14,0.00,0.00,0.00,0.26,0.00,0.00,0.00,0.00,Ovarian
TU Prostate Lapointe_1,"[GO:0018243, GO:0018242, GO:0009100, GO:0006486]",MIMAT0004801|MIMAT0000083|MIMAT0000082|MIMAT00...,0,0.40,0.40,0.16,0.00,0.00,0.23,0.40,0.16,0.67,0.16,Prostate


Split the pipe-delimited miRNA IDs into a list:

In [6]:
st8['Mature Seqeunce IDs'] = [i.split('|') for i in st8['Mature Seqeunce IDs']]
st8

Co-expression Signature,GO Terms,Mature Seqeunce IDs,Hallmarks of Cancer,Self Sufficiency In Growth Signals,Insensitivity To Antigrowth Signals,Evading Apoptosis,Limitless Replicative Potential,Sustained Angiogenesis,Tissue Invasion And Metastasis,Genome Instability And Mutation,Tumor Promoting Inflammation,Reprogramming Energy Metabolism,Evading Immune Detection,cancer
COID Lung Bhattacharjee_24,"[GO:0051384, GO:0050896, GO:0030155, GO:000759...",[MIMAT0004694],9,0.93,0.93,0.90,0.81,1.00,1.00,0.81,0.85,0.37,0.85,Lung
SMCL Lung Bhattacharjee_61,"[GO:0019226, GO:0007599, GO:0042127, GO:000759...",[MIMAT0000760],9,1.00,1.00,1.00,0.81,1.00,1.00,0.85,1.00,0.79,1.00,Lung
AC Brain Sun_5,"[GO:0019730, GO:0050867, GO:0042110, GO:000692...","[MIMAT0003307, MIMAT0002840]",8,0.93,0.93,0.90,0.51,0.88,0.99,0.83,1.00,0.48,1.00,Brain
GCT Seminoma Korkola_4,"[GO:0030301, GO:0042127, GO:0006631, GO:004590...","[MIMAT0000729, MIMAT0002172, MIMAT0000720]",8,1.00,1.00,0.94,0.52,1.00,1.00,0.83,0.95,0.48,0.95,Seminoma
ME Melanoma Hoek_50,"[GO:0019226, GO:0008104, GO:0007599, GO:001922...",[MIMAT0003285],8,1.00,1.00,0.87,0.53,0.92,1.00,0.83,0.95,0.79,0.95,Melanoma
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
SRS Ovarian Hendrix_1,"[GO:0006139, GO:0010467, GO:0045449]","[MIMAT0003311, MIMAT0001639]",0,0.78,0.78,0.53,0.39,0.37,0.53,0.76,0.58,0.44,0.58,Ovarian
SRS Ovarian Hendrix_11,"[GO:0019058, GO:0019083, GO:0031018]","[MIMAT0004801, MIMAT0000703]",0,0.32,0.32,0.07,0.04,0.27,0.31,0.40,0.14,0.05,0.14,Ovarian
SRS Ovarian Hendrix_26,[GO:0008333],[MIMAT0000771],0,0.14,0.14,0.00,0.00,0.00,0.26,0.00,0.00,0.00,0.00,Ovarian
TU Prostate Lapointe_1,"[GO:0018243, GO:0018242, GO:0009100, GO:0006486]","[MIMAT0004801, MIMAT0000083, MIMAT0000082, MIM...",0,0.40,0.40,0.16,0.00,0.00,0.23,0.40,0.16,0.67,0.16,Prostate
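
Both splitting steps can also be written with the pandas string accessor instead of a list comprehension. The sketch below uses a two-row toy DataFrame (`toy` is a hypothetical name; the values are copied from the table above) and is an alternative, not the code used in this lecture:

```python
import pandas as pd

# Toy frame with the same pipe-delimited layout as SuppTab8
toy = pd.DataFrame(
    {'GO Terms': ['GO:0006139|GO:0010467|GO:0045449', 'GO:0008333'],
     'Mature Seqeunce IDs': ['MIMAT0003311|MIMAT0001639', 'MIMAT0000771']},
    index=['SRS Ovarian Hendrix_1', 'SRS Ovarian Hendrix_26'])

# A single-character separator is treated as a literal string, not a regex
for col in ['GO Terms', 'Mature Seqeunce IDs']:
    toy[col] = toy[col].str.split('|')

print(toy.loc['SRS Ovarian Hendrix_1', 'GO Terms'])
# ['GO:0006139', 'GO:0010467', 'GO:0045449']
```

Note that the column name `Mature Seqeunce IDs` keeps the spelling used in the CSV header, since the code must match it exactly.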


### Load up accessory data
Load the table that maps mature sequence IDs to miRNA names:

In [7]:
miRDb = pd.read_csv('data/hsa_mature.csv', header=0, index_col=1)
print(miRDb.shape)

(2656, 3)
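
Later cells look up names with `miRDb.loc[miRNA,'Name']`, which raises a `KeyError` if an ID is absent from the table. A minimal sketch of a more forgiving lookup follows. The helper `to_name` and the toy table are hypothetical, and the IDs and names in it are placeholders rather than real miRBase entries; only the `Name` column is taken from the code in this lecture.

```python
import pandas as pd

# Hypothetical stand-in for data/hsa_mature.csv, indexed by mature sequence ID
toy_miRDb = pd.DataFrame(
    {'Name': ['hsa-miR-example-1', 'hsa-miR-example-2']},
    index=['MIMAT_EXAMPLE_1', 'MIMAT_EXAMPLE_2'])

def to_name(mature_id, table):
    """Return the miRNA name for a mature sequence ID, or the ID itself if missing."""
    if mature_id in table.index:
        return table.loc[mature_id, 'Name']
    return mature_id

print(to_name('MIMAT_EXAMPLE_1', toy_miRDb))  # hsa-miR-example-1
print(to_name('MIMAT_MISSING', toy_miRDb))    # MIMAT_MISSING
```

Falling back to the raw ID keeps the node in the network and makes any unmapped miRNA easy to spot in the final visualization.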


## Filter network into cross-cancer subnetwork
![Networks](https://genome.cshlp.org/content/22/11/2302/F4.large.jpg)

Following the figure, we will first reduce the network to only those clusters associated with a hallmark of cancer.

### Filter 1:  associated with a hallmark of cancer

For each cluster, we pull out the significant hallmarks of cancer, defined as those with a semantic similarity score above 0.8. To facilitate this we will use a function:

In [8]:
hallmarks = ['Self Sufficiency In Growth Signals', 'Insensitivity To Antigrowth Signals', 'Evading Apoptosis', 'Limitless Replicative Potential', 'Sustained Angiogenesis', 'Tissue Invasion And Metastasis', 'Genome Instability And Mutation', 'Tumor Promoting Inflammation', 'Reprogramming Energy Metabolism', 'Evading Immune Detection']
def getHallmarks(scores, cutoff=0.8):
    return list((scores[scores>cutoff]).index)
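
As a worked example, the function can be applied to the scores of a single cluster. The block below repeats the definitions so it runs on its own and uses the ten scores for `COID Lung Bhattacharjee_24` copied from the table above:

```python
import pandas as pd

hallmarks = ['Self Sufficiency In Growth Signals', 'Insensitivity To Antigrowth Signals',
             'Evading Apoptosis', 'Limitless Replicative Potential', 'Sustained Angiogenesis',
             'Tissue Invasion And Metastasis', 'Genome Instability And Mutation',
             'Tumor Promoting Inflammation', 'Reprogramming Energy Metabolism',
             'Evading Immune Detection']

def getHallmarks(scores, cutoff=0.8):
    return list((scores[scores>cutoff]).index)

# Scores for 'COID Lung Bhattacharjee_24', in the same order as `hallmarks`
scores = pd.Series([0.93, 0.93, 0.90, 0.81, 1.00, 1.00, 0.81, 0.85, 0.37, 0.85],
                   index=hallmarks)
sig = getHallmarks(scores)
print(len(sig))                                  # 9
print('Reprogramming Energy Metabolism' in sig)  # False (score 0.37)
```

The count of 9 agrees with the `Hallmarks of Cancer` column for this cluster. Because the comparison is strictly greater than, a score of exactly 0.8 would not pass the cutoff.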

Let's run it against all the clusters to identify which hallmark(s) are significant for each cluster:

In [9]:
clusterHallmarks = []
for cluster in st8.index:
    scores = st8.loc[cluster, hallmarks]
    clusterHallmarks.append(getHallmarks(scores))

Add this back to the DataFrame as a new column 'sigHallmarks':

In [10]:
st8['sigHallmarks'] = clusterHallmarks
st8

Co-expression Signature,GO Terms,Mature Seqeunce IDs,Hallmarks of Cancer,Self Sufficiency In Growth Signals,Insensitivity To Antigrowth Signals,Evading Apoptosis,Limitless Replicative Potential,Sustained Angiogenesis,Tissue Invasion And Metastasis,Genome Instability And Mutation,Tumor Promoting Inflammation,Reprogramming Energy Metabolism,Evading Immune Detection,cancer,sigHallmarks
COID Lung Bhattacharjee_24,"[GO:0051384, GO:0050896, GO:0030155, GO:000759...",[MIMAT0004694],9,0.93,0.93,0.90,0.81,1.00,1.00,0.81,0.85,0.37,0.85,Lung,"[Self Sufficiency In Growth Signals, Insensiti..."
SMCL Lung Bhattacharjee_61,"[GO:0019226, GO:0007599, GO:0042127, GO:000759...",[MIMAT0000760],9,1.00,1.00,1.00,0.81,1.00,1.00,0.85,1.00,0.79,1.00,Lung,"[Self Sufficiency In Growth Signals, Insensiti..."
AC Brain Sun_5,"[GO:0019730, GO:0050867, GO:0042110, GO:000692...","[MIMAT0003307, MIMAT0002840]",8,0.93,0.93,0.90,0.51,0.88,0.99,0.83,1.00,0.48,1.00,Brain,"[Self Sufficiency In Growth Signals, Insensiti..."
GCT Seminoma Korkola_4,"[GO:0030301, GO:0042127, GO:0006631, GO:004590...","[MIMAT0000729, MIMAT0002172, MIMAT0000720]",8,1.00,1.00,0.94,0.52,1.00,1.00,0.83,0.95,0.48,0.95,Seminoma,"[Self Sufficiency In Growth Signals, Insensiti..."
ME Melanoma Hoek_50,"[GO:0019226, GO:0008104, GO:0007599, GO:001922...",[MIMAT0003285],8,1.00,1.00,0.87,0.53,0.92,1.00,0.83,0.95,0.79,0.95,Melanoma,"[Self Sufficiency In Growth Signals, Insensiti..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
SRS Ovarian Hendrix_1,"[GO:0006139, GO:0010467, GO:0045449]","[MIMAT0003311, MIMAT0001639]",0,0.78,0.78,0.53,0.39,0.37,0.53,0.76,0.58,0.44,0.58,Ovarian,[]
SRS Ovarian Hendrix_11,"[GO:0019058, GO:0019083, GO:0031018]","[MIMAT0004801, MIMAT0000703]",0,0.32,0.32,0.07,0.04,0.27,0.31,0.40,0.14,0.05,0.14,Ovarian,[]
SRS Ovarian Hendrix_26,[GO:0008333],[MIMAT0000771],0,0.14,0.14,0.00,0.00,0.00,0.26,0.00,0.00,0.00,0.00,Ovarian,[]
TU Prostate Lapointe_1,"[GO:0018243, GO:0018242, GO:0009100, GO:0006486]","[MIMAT0004801, MIMAT0000083, MIMAT0000082, MIM...",0,0.40,0.40,0.16,0.00,0.00,0.23,0.40,0.16,0.67,0.16,Prostate,[]


Now we can subset the network down to those with an association with a hallmark of cancer:

In [11]:
# Keep only clusters with at least one significant hallmark
st8_hm = st8.loc[st8['sigHallmarks'].map(len) > 0]
st8_hm

Co-expression Signature,GO Terms,Mature Seqeunce IDs,Hallmarks of Cancer,Self Sufficiency In Growth Signals,Insensitivity To Antigrowth Signals,Evading Apoptosis,Limitless Replicative Potential,Sustained Angiogenesis,Tissue Invasion And Metastasis,Genome Instability And Mutation,Tumor Promoting Inflammation,Reprogramming Energy Metabolism,Evading Immune Detection,cancer,sigHallmarks
COID Lung Bhattacharjee_24,"[GO:0051384, GO:0050896, GO:0030155, GO:000759...",[MIMAT0004694],9,0.93,0.93,0.90,0.81,1.00,1.00,0.81,0.85,0.37,0.85,Lung,"[Self Sufficiency In Growth Signals, Insensiti..."
SMCL Lung Bhattacharjee_61,"[GO:0019226, GO:0007599, GO:0042127, GO:000759...",[MIMAT0000760],9,1.00,1.00,1.00,0.81,1.00,1.00,0.85,1.00,0.79,1.00,Lung,"[Self Sufficiency In Growth Signals, Insensiti..."
AC Brain Sun_5,"[GO:0019730, GO:0050867, GO:0042110, GO:000692...","[MIMAT0003307, MIMAT0002840]",8,0.93,0.93,0.90,0.51,0.88,0.99,0.83,1.00,0.48,1.00,Brain,"[Self Sufficiency In Growth Signals, Insensiti..."
GCT Seminoma Korkola_4,"[GO:0030301, GO:0042127, GO:0006631, GO:004590...","[MIMAT0000729, MIMAT0002172, MIMAT0000720]",8,1.00,1.00,0.94,0.52,1.00,1.00,0.83,0.95,0.48,0.95,Seminoma,"[Self Sufficiency In Growth Signals, Insensiti..."
ME Melanoma Hoek_50,"[GO:0019226, GO:0008104, GO:0007599, GO:001922...",[MIMAT0003285],8,1.00,1.00,0.87,0.53,0.92,1.00,0.83,0.95,0.79,0.95,Melanoma,"[Self Sufficiency In Growth Signals, Insensiti..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
SQ Lung Bhattacharjee_44,"[GO:0009888, GO:0006928, GO:0030199, GO:003019...","[MIMAT0003882, MIMAT0016913]",1,0.49,0.49,0.24,0.27,0.46,0.99,0.55,0.29,0.06,0.29,Lung,[Tissue Invasion And Metastasis]
SQ Lung Bhattacharjee_45,[GO:0006334],[MIMAT0003330],1,0.30,0.30,0.06,0.34,0.00,0.25,0.81,0.06,0.00,0.06,Lung,[Genome Instability And Mutation]
SRS Ovarian Hendrix_72,[GO:0006334],[MIMAT0005930],1,0.30,0.30,0.06,0.34,0.00,0.25,0.81,0.06,0.00,0.06,Ovarian,[Genome Instability And Mutation]
TU Prostate Lapointe_37,[GO:0040012],"[MIMAT0004975, MIMAT0002881, MIMAT0004563, MIM...",1,0.53,0.53,0.29,0.00,0.15,0.84,0.15,0.36,0.00,0.36,Prostate,[Tissue Invasion And Metastasis]


### Filter 2:  Same oncogenic process regulated by same miRNA across cancers

In [12]:
# Invert the table: map each miRNA to the clusters it is predicted to regulate
miRNAs = {}
for cluster in st8_hm.index:
    for miRNA in st8_hm.loc[cluster,'Mature Seqeunce IDs']:
        if miRNA not in miRNAs:
            miRNAs[miRNA] = []
        miRNAs[miRNA].append(cluster)
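
The same cluster to miRNA inversion can be sketched with `collections.defaultdict`, which removes the need for the membership check. The input dictionary is a toy example: the names and IDs are borrowed from the tables above, with the truncated ID list shortened, and the variable names are hypothetical.

```python
from collections import defaultdict

# Toy cluster -> miRNA ID lists (illustrative subset, not the filtered table)
cluster_to_miRNAs = {
    'SRS Ovarian Hendrix_1': ['MIMAT0003311', 'MIMAT0001639'],
    'SRS Ovarian Hendrix_11': ['MIMAT0004801', 'MIMAT0000703'],
    'TU Prostate Lapointe_1': ['MIMAT0004801', 'MIMAT0000083'],
}

# Invert the mapping: miRNA ID -> list of clusters it is linked to
miRNA_to_clusters = defaultdict(list)
for cluster, ids in cluster_to_miRNAs.items():
    for miRNA in ids:
        miRNA_to_clusters[miRNA].append(cluster)

print(miRNA_to_clusters['MIMAT0004801'])
# ['SRS Ovarian Hendrix_11', 'TU Prostate Lapointe_1']
```

A miRNA that maps to more than one cluster, like `MIMAT0004801` here, is a candidate for the cross-cancer test in the next filter.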

### Filter 3:  miRNAs that regulate the same GO term & hallmark of cancer

In [13]:
cc_miRNAs = {}
cc_go_miRNAs = {}
for miRNA in miRNAs:
    # If have more than one cluster regulated by the miRNA
    if len(miRNAs[miRNA])>1:
        # If more than one distinct cancer is regulated by the miRNA
        if st8_hm.loc[miRNAs[miRNA],'cancer'].nunique()>1:
            # Identify which clusters have the same GO term -> hallmark relationships
            hm_dict = {}
            go_dict = {}
            # First, fill hm_dict with hallmarks and go_dict with go terms
            for cluster in miRNAs[miRNA]:
                # Build clusters so can test for common hallmarks
                for hm1 in st8_hm.loc[cluster,'sigHallmarks']:
                    if not hm1 in hm_dict:
                        hm_dict[hm1] = []
                    hm_dict[hm1].append(cluster)
                # Build clusters so can test for common GO terms
                for go1 in st8_hm.loc[cluster,'GO Terms']:
                    if not go1 in go_dict:
                        go_dict[go1] = []
                    go_dict[go1].append(cluster)
            # Second, find hallmarks in common across clusters, and refine subnetwork (cc_miRNAs)
            for hm1 in hm_dict:
                # If more than one cluster linked to same hallmark
                if len(hm_dict[hm1])>1:
                    if not miRNA in cc_miRNAs:
                        cc_miRNAs[miRNA] = {}
                    cc_miRNAs[miRNA][hm1] = miRNAs[miRNA]
            # Third, find GO terms in common across clusters, and refine subnetwork
            for go1 in go_dict:
                # If more than one cluster linked to same GO term
                if len(go_dict[go1])>1:
                    # If had more than one cluster linked to same hallmark
                    if miRNA in cc_miRNAs:
                        if not miRNA in cc_go_miRNAs:
                            cc_go_miRNAs[miRNA] = {'hallmarks':cc_miRNAs[miRNA].keys(), 'clusters':miRNAs[miRNA], 'GO_terms':[]}
                        cc_go_miRNAs[miRNA]['GO_terms'].append(go1)
print(cc_miRNAs)
print(cc_go_miRNAs)

{'MIMAT0004694': {'Self Sufficiency In Growth Signals': ['COID Lung Bhattacharjee_24', 'GL Brain Bredel_30'], 'Insensitivity To Antigrowth Signals': ['COID Lung Bhattacharjee_24', 'GL Brain Bredel_30']}, 'MIMAT0003285': {'Self Sufficiency In Growth Signals': ['ME Melanoma Hoek_50', 'AD Lung Beer_6', 'AD Lung Bhattacharjee_30', 'B-CLL Leukemia Haslinger_62', 'CA Breast Sorlie_12', 'MM Myeloma Zhan_3'], 'Insensitivity To Antigrowth Signals': ['ME Melanoma Hoek_50', 'AD Lung Beer_6', 'AD Lung Bhattacharjee_30', 'B-CLL Leukemia Haslinger_62', 'CA Breast Sorlie_12', 'MM Myeloma Zhan_3'], 'Sustained Angiogenesis': ['ME Melanoma Hoek_50', 'AD Lung Beer_6', 'AD Lung Bhattacharjee_30', 'B-CLL Leukemia Haslinger_62', 'CA Breast Sorlie_12', 'MM Myeloma Zhan_3'], 'Tissue Invasion And Metastasis': ['ME Melanoma Hoek_50', 'AD Lung Beer_6', 'AD Lung Bhattacharjee_30', 'B-CLL Leukemia Haslinger_62', 'CA Breast Sorlie_12', 'MM Myeloma Zhan_3'], 'Genome Instability And Mutation': ['ME Melanoma Hoek_50',
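
The cell above counts how many clusters share a hallmark. A stricter reading of "across cancers" would require those clusters to come from different cancers. The sketch below illustrates that variant on toy data; the function `shared_hallmarks`, the cluster names, and the input layout are all hypothetical, and this is not the filter used in the lecture or the paper.

```python
from collections import defaultdict

def shared_hallmarks(clusters):
    """clusters: dict of cluster name -> (cancer, list of significant hallmarks).
    Returns the hallmarks linked to clusters from at least two different cancers."""
    cancers_by_hallmark = defaultdict(set)
    for cancer, sig in clusters.values():
        for hm in sig:
            cancers_by_hallmark[hm].add(cancer)
    return sorted(hm for hm, cancers in cancers_by_hallmark.items() if len(cancers) > 1)

# Toy clusters regulated by one hypothetical miRNA
toy = {
    'cluster_A': ('Lung', ['Tissue Invasion And Metastasis', 'Sustained Angiogenesis']),
    'cluster_B': ('Lung', ['Sustained Angiogenesis']),
    'cluster_C': ('Brain', ['Tissue Invasion And Metastasis']),
}
result = shared_hallmarks(toy)
print(result)
# ['Tissue Invasion And Metastasis']
```

Here `Sustained Angiogenesis` is shared by two clusters but both are lung signatures, so it is dropped under the stricter rule.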

### Build the network #1: cancer &rarr; miRNA &rarr; cluster &rarr; GO term &rarr; hallmark of cancer
First we need to initialize a networkx directed graph:

In [14]:
crossCancer = nx.DiGraph()

Add each cancer &rarr; miRNA &rarr; cluster &rarr; GO term &rarr; hallmark of cancer relationship:

In [15]:
for miRNA in cc_go_miRNAs:
    for cluster in cc_go_miRNAs[miRNA]['clusters']:
        # Add cancer -> miRNA
        crossCancer.add_edge(st8_hm.loc[cluster,'cancer'],miRDb.loc[miRNA,'Name'])
        crossCancer.nodes[st8_hm.loc[cluster,'cancer']]['group'] = 'cancer'
        crossCancer.nodes[miRDb.loc[miRNA,'Name']]['group'] = 'miRNA'
        # Add miRNA -> cluster
        crossCancer.add_edge(miRDb.loc[miRNA,'Name'],cluster)
        crossCancer.nodes[cluster]['group'] = 'cluster'
        # Add cluster -> GO term -> HM
        for go1 in cc_go_miRNAs[miRNA]['GO_terms']:
            crossCancer.add_edge(cluster,go1)
            crossCancer.nodes[go1]['group'] = 'go_term'
            for hm1 in cc_go_miRNAs[miRNA]['hallmarks']:
                crossCancer.add_edge(go1,hm1)
                crossCancer.nodes[hm1]['group'] = 'hallmark'
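
The loop above relies on two networkx behaviors: `add_edge` creates any node that does not yet exist, and node attributes can be set afterwards through `G.nodes[node]`. A minimal sketch on one toy path, where the node names `miRNA_X` and `cluster_1` are placeholders:

```python
import networkx as nx
from collections import Counter

g = nx.DiGraph()
# One path through the hierarchy: (node name, group label)
path = [('Lung', 'cancer'), ('miRNA_X', 'miRNA'), ('cluster_1', 'cluster'),
        ('GO:0006334', 'go_term'), ('Genome Instability And Mutation', 'hallmark')]

# Link consecutive layers; add_edge creates missing nodes implicitly
for (src, _), (dst, _) in zip(path, path[1:]):
    g.add_edge(src, dst)
# Tag each node with its layer so the layers can be told apart later
for node, group in path:
    g.nodes[node]['group'] = group

group_counts = Counter(nx.get_node_attributes(g, 'group').values())
print(g.number_of_nodes(), g.number_of_edges())  # 5 4
```

Counting nodes per `group` in this way is a quick sanity check on the full `crossCancer` graph before it is drawn.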

Write out the network as an interactive HTML page:

In [16]:
cc_nt = Network('800px','800px', directed=True, notebook=True)
#cc_nt.show_buttons(filter_=['physics'])
cc_nt.hrepulsion()
cc_nt.from_nx(crossCancer)
cc_nt.show('crossCancer_wGOTerms_EL.html')

### Build the network #2: cancer &rarr; miRNA &rarr; cluster &rarr; hallmark of cancer
Now let's see what happens if we remove the GO terms from the network.

First we need to initialize a networkx directed graph:

In [17]:
crossCancer = nx.DiGraph()

Add each cancer &rarr; miRNA &rarr; cluster &rarr; hallmark of cancer relationship:

In [18]:
for miRNA in cc_go_miRNAs:
    for cluster in cc_go_miRNAs[miRNA]['clusters']:
        # Add cancer -> miRNA
        crossCancer.add_edge(st8_hm.loc[cluster,'cancer'],miRDb.loc[miRNA,'Name'])
        crossCancer.nodes[st8_hm.loc[cluster,'cancer']]['group'] = 'cancer'
        crossCancer.nodes[miRDb.loc[miRNA,'Name']]['group'] = 'miRNA'
        # Add miRNA -> cluster
        crossCancer.add_edge(miRDb.loc[miRNA,'Name'],cluster)
        crossCancer.nodes[cluster]['group'] = 'cluster'
        # Add cluster -> HM
        for hm1 in cc_go_miRNAs[miRNA]['hallmarks']:
            crossCancer.add_edge(cluster,hm1)
            crossCancer.nodes[hm1]['group'] = 'hallmark'

Write out the network as an interactive HTML page:

In [19]:
cc_nt = Network('800px','800px', directed=True, notebook=True)
#cc_nt.show_buttons(filter_=['physics'])
cc_nt.hrepulsion()
cc_nt.from_nx(crossCancer)
cc_nt.show('crossCancer_no_GOTerms_EL.html')

### Build the network #3: cancer &rarr; miRNA &rarr; hallmark of cancer
Now let's see what happens if we remove the clusters and GO terms from the network.

First we need to initialize a networkx directed graph:

In [20]:
crossCancer = nx.DiGraph()

Add each cancer &rarr; miRNA &rarr; hallmark of cancer relationship:

In [21]:
for miRNA in cc_go_miRNAs:
    for cluster in cc_go_miRNAs[miRNA]['clusters']:
        # Add cancer -> miRNA
        crossCancer.add_edge(st8_hm.loc[cluster,'cancer'],miRDb.loc[miRNA,'Name'])
        crossCancer.nodes[st8_hm.loc[cluster,'cancer']]['group'] = 'cancer'
        crossCancer.nodes[miRDb.loc[miRNA,'Name']]['group'] = 'miRNA'
        # Add miRNA -> HM
        for hm1 in cc_go_miRNAs[miRNA]['hallmarks']:
            crossCancer.add_edge(miRDb.loc[miRNA,'Name'],hm1)
            crossCancer.nodes[hm1]['group'] = 'hallmark'

Write out the network as an interactive HTML page:

In [22]:
cc_nt = Network('800px','800px', directed=True, notebook=True)
#cc_nt.show_buttons(filter_=['physics'])
cc_nt.from_nx(crossCancer)
cc_nt.show('crossCancer_no_Cluster_no_GOTerms_EL.html')
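
The three networks differ only in which layers they include, so the construction could be factored into one function that takes an ordered list of layers. This is a sketch under simplifying assumptions: `build_network` and `records` are hypothetical names, and each record is a single pre-flattened path, whereas the cells above link every cluster of a miRNA to every hallmark of that miRNA.

```python
import networkx as nx

def build_network(records, layers):
    """records: iterable of dicts holding one node name per layer.
    layers: ordered layer names; consecutive layers are joined by directed edges."""
    g = nx.DiGraph()
    for rec in records:
        for src_layer, dst_layer in zip(layers, layers[1:]):
            g.add_edge(rec[src_layer], rec[dst_layer])
        for layer in layers:
            g.nodes[rec[layer]]['group'] = layer
    return g

# Toy records with placeholder miRNA and cluster names
records = [
    {'cancer': 'Lung', 'miRNA': 'miRNA_X', 'cluster': 'cluster_1', 'hallmark': 'Evading Apoptosis'},
    {'cancer': 'Brain', 'miRNA': 'miRNA_X', 'cluster': 'cluster_2', 'hallmark': 'Evading Apoptosis'},
]
full = build_network(records, ['cancer', 'miRNA', 'cluster', 'hallmark'])
collapsed = build_network(records, ['cancer', 'miRNA', 'hallmark'])
print(full.number_of_nodes(), collapsed.number_of_nodes())  # 6 4
```

Dropping a layer reconnects its neighbors directly, which is how network #3 links miRNAs straight to hallmarks.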