# DMD Drug Repurposing

### Interesting Identifiers:

    'MONDO:0010679', # Duchenne Muscular Dystrophy
    'MONDO:0010311', # Becker Muscular Dystrophy
    'MONDO:0010542', # Dilated Cardiomyopathy 3B
    'MONDO:0016097', # Asymptomatic Female Carrier
    'HGNC:2928', # DMD human gene
    'MGI:94909' # DMD mouse gene
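
The identifiers above are CURIEs: a source prefix, a colon, and a local ID. As a minimal sketch (not part of the `monarch` module), the snippet below splits each identifier and lists the prefixes in use, which can catch a mistyped ID before a long-running query is started. The helper name `split_curie` and the regular expression are assumptions made for illustration only.

```python
import re

# Assumed CURIE shape: a prefix starting with a letter, a colon, then a non-empty local ID.
CURIE_PATTERN = re.compile(r"^(?P<prefix>[A-Za-z][A-Za-z0-9_.]*):(?P<local_id>\S+)$")


def split_curie(curie):
    """Return (prefix, local_id); raise ValueError if the string is not CURIE-shaped.

    Hypothetical helper for illustration, not a function of the monarch module.
    """
    match = CURIE_PATTERN.match(curie)
    if match is None:
        raise ValueError(f"not a CURIE: {curie!r}")
    return match.group("prefix"), match.group("local_id")


identifiers = [
    'MONDO:0010679',
    'MONDO:0010311',
    'MONDO:0010542',
    'MONDO:0016097',
    'HGNC:2928',
    'MGI:94909',
]

# Collect the distinct source prefixes used by the identifiers.
prefixes = sorted({split_curie(curie)[0] for curie in identifiers})
print(prefixes)  # ['HGNC', 'MGI', 'MONDO']
```

A string without a colon, such as `'MONDO0010679'`, raises `ValueError` instead of being passed on silently.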


## Import

In [2]:
import monarch

## Monarch



Add the IDs you are interested in to the seed list, then run the cell. The output is two .csv files containing the node and edge information.

In [3]:
%%time
# prepare data to graph schema
# seed nodes

seedList = [ 
    'MONDO:0010679', # Duchenne Muscular Dystrophy
#     'MONDO:0010311', # Becker Muscular Dystrophy
#     'MONDO:0010542', # Dilated Cardiomyopathy 3B
#     'MONDO:0016097', # Asymptomatic Female Carrier
    'HGNC:2928', # DMD human gene
#     'MGI:94909' # DMD mouse gene
] 

# get first shell of neighbours
neighboursList = monarch.get_neighbours_list(seedList)
print(len(neighboursList))

# introduce animal model ortho-phenotypes for seed and 1st shell neighbors
## For seed nodes:
seed_orthophenoList = monarch.get_orthopheno_list(seedList)
print(len(seed_orthophenoList))
## For 1st shell nodes:
neighbours_orthophenoList = monarch.get_orthopheno_list(neighboursList)
print(len(neighbours_orthophenoList))

# network nodes: seed + 1st shell + ortholog-phenotype
geneList = sum([seedList,
                neighboursList,
                seed_orthophenoList,
                neighbours_orthophenoList], 
               [])
print('genelist: ',len(geneList))

# get Monarch network
monarch_network = monarch.extract_edges(geneList)
print('network: ',len(monarch_network))

# save edges
monarch.print_network(monarch_network, 'monarch_connections')

# build network with graph schema 
monarch_edges = monarch.build_edges(monarch_network)
monarch_nodes = monarch.build_nodes(monarch_network)

  0%|                                                     | 0/2 [00:00<?, ?it/s]


The function "get_neighbours_list()" is running. Its runtime may take some minutes. If you interrupt the process, you will lose all the nodes retrieved and you should start over the execution of this function.


100%|█████████████████████████████████████████████| 2/2 [00:17<00:00,  8.72s/it]
  0%|                                                     | 0/2 [00:00<?, ?it/s]


Finished get_neighbours_list().

1759

The function "get_orthopheno_list()" is running. Its runtime may take some hours. If you interrupt the process, you will lose all the nodes retrieved and you should start over the execution of this function.


100%|█████████████████████████████████████████████| 2/2 [00:18<00:00,  9.05s/it]
100%|███████████████████████████████████████████| 13/13 [00:39<00:00,  2.57s/it]
  0%|                                                  | 0/1759 [00:00<?, ?it/s]


Finished get_orthopheno_list().

272

The function "get_orthopheno_list()" is running. Its runtime may take some hours. If you interrupt the process, you will lose all the nodes retrieved and you should start over the execution of this function.


100%|█████████████████████████████████████| 1759/1759 [1:02:55<00:00,  1.51s/it]
100%|█████████████████████████████████████| 3309/3309 [2:16:48<00:00,  2.74s/it]
  0%|                                                 | 0/10458 [00:00<?, ?it/s]


Finished get_orthopheno_list().

8694
genelist:  10727

The function "extract_edges()" is running. Its runtime may take some hours. If you interrupt the process, you will lose all the edges retrieved and you should start over the execution of this function.


100%|███████████████████████████████████| 10458/10458 [9:30:27<00:00,  2.94s/it]



Finished extract_edges(). To save the retrieved Monarch edges use the function "print_network()".

network:  93310

Saving Monarch edges at: '/home/s2959607/homedir/bioknowledge-reviewer/bioknowledge_reviewer/monarch/monarch_connections_v2021-12-02.csv'...


The function "build_edges()" is running...
Detected a new reference source in Monarch not yet implemented in this module. The new source should be added to the dictionary of sources.Otherwise, the source CURIE cannot be translated to the corresponding URI.
In the build_edges() method, update 'uriPrefixes_dct' dictionary with 'NCBIBSgene'
The edge that includes this new reference source is Pandas(Index=10945, object_id='HP:0002885', object_label='Medulloblastoma', reference_id_list='PMID:8929955|PMID:20223039|PMID:20685668|PMID:9101302|PMID:17411426|PMID:12173026|PMID:17026565|PMID:15300853|PMID:9585611|PMID:23159591|PMID:24664542|PMID:11852337|PMID:9890479|PMID:8162022|PMID:25980754|PMID:28533537|PMID:12007223|MONDO:0021055|PMID:1


Adding BioThings annotation: gene name, synonyms, description...
symbols: 3424
querying 1-1000...done.
querying 1001-2000...done.
querying 2001-3000...done.
querying 3001-3424...done.
Finished.
454 input query terms found dup hits:
	[('Pak3', 2), ('Cdc42', 3), ('Rnf11', 2), ('GIT2', 3), ('CHRNA1', 3), ('Dtna', 2), ('dtna', 2), ('VC
1744 input query terms found no hit:
	['ENSEMBL:ENSGALG00000009050', 'ENSEMBL:ENSMODG00000017909', 'ENSEMBL:ENSACAG00000007410', 'ENSEMBL:
Pass "returnall=True" to return complete lists of duplicate or missing query terms.

* This is the size of the nodes file data structure: (10458, 6)
* These are the nodes attributes: Index(['description', 'id', 'name', 'preflabel', 'semantic_groups',
       'synonyms'],
      dtype='object')
* This is the first record:
  description                          id name                   preflabel  \
0          NA  ENSEMBL:ENSGALG00000009050  NaN  ENSEMBL:ENSGALG00000009050   

  semantic_groups synonyms  
0            GENE  
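
The cell above reports 10727 entries in `geneList`, while the `extract_edges()` progress bar and the nodes table both count 10458. One plausible explanation, which the output does not confirm, is that the same identifier appears in more than one of the concatenated lists. The sketch below shows one way to flatten several ID lists and drop repeats while keeping first-seen order. The helper `flatten_unique` and the toy lists are hypothetical and are not part of the `monarch` module.

```python
from itertools import chain


def flatten_unique(*id_lists):
    """Concatenate the given lists and drop repeated IDs, keeping first-seen order.

    Hypothetical helper for illustration, not a function of the monarch module.
    """
    # dict.fromkeys keeps insertion order, so the first occurrence of each ID wins.
    return list(dict.fromkeys(chain.from_iterable(id_lists)))


# Toy stand-ins for seedList, neighboursList and the ortho-phenotype lists.
seeds = ['MONDO:0010679', 'HGNC:2928']
neighbours = ['HGNC:2928', 'MGI:94909']
orthophenos = ['MGI:94909', 'HP:0002885']

# Same concatenation pattern as the notebook cell: repeats are kept.
concatenated = sum([seeds, neighbours, orthophenos], [])
unique_ids = flatten_unique(seeds, neighbours, orthophenos)

print(len(concatenated))  # 6
print(len(unique_ids))    # 4
```

`chain.from_iterable` also avoids building a new intermediate list at every step, which `sum(..., [])` does; for lists of a few thousand IDs the difference is small.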