# Random Walk with Restart using MultiXrank

This code runs MultiXrank (https://github.com/anthbapt/multixrank ; https://multixrank-doc.readthedocs.io/en/latest/) on a network composed of 2 layers:
- RARE-X disease-patient-symptom associations network
- ORPHANET disease-phenotype associations network

The mapping of Rare-X diseases and Orphanet diseases is used as a bipartite network connecting the Rare-X and Orphanet layers.

We iteratively use each of the **27 diseases from the RARE-X layer as a seed** for the Random Walk with Restart (RWR) analysis. Each run **scores all nodes** of the multilayer network. These scores represent the **proximity** of each node to the seed node.

The output scores for each seed are stored in folder `results_MultiXrank_RARE_X_diseases/output_DiseaseDisease_PhenotypeOntology_Weighted/`.
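
To make the proximity scores described above more concrete, here is a minimal pure-Python sketch of a Random Walk with Restart on a single-layer toy graph. It is only an illustration: the function name `random_walk_with_restart` and the toy graph are hypothetical, and MultiXrank itself works on the full multilayer network with additional parameters (`delta`, `lamb`, `tau`, `eta`) that this sketch ignores.

```python
def random_walk_with_restart(adjacency, seed, r=0.7, tol=1e-10, max_iter=1000):
    """Return RWR proximity scores for a small undirected, unweighted graph.

    adjacency: dict mapping each node to the list of its neighbours
               (assumption: every node has at least one neighbour)
    seed: node where the walker restarts
    r: restart probability (0.7 is the value used in this notebook)
    """
    nodes = sorted(adjacency)
    restart = {n: 1.0 if n == seed else 0.0 for n in nodes}
    scores = dict(restart)
    for _ in range(max_iter):
        # With probability r the walker jumps back to the seed...
        updated = {n: r * restart[n] for n in nodes}
        # ...otherwise it moves to a uniformly chosen neighbour.
        for node in nodes:
            neighbours = adjacency[node]
            share = (1 - r) * scores[node] / len(neighbours)
            for neighbour in neighbours:
                updated[neighbour] += share
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


# Hypothetical toy graph: a path A - B - C - D, with A as the seed
toy_graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
toy_scores = random_walk_with_restart(toy_graph, seed="A")
ranking = sorted(toy_scores, key=toy_scores.get, reverse=True)
print(ranking)
```

In this toy example the scores sum to 1 and decrease with distance from the seed, which is the sense in which an RWR score measures proximity.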

In [1]:
import multixrank
import pandas as pd
import glob
import os

In [2]:
# Location of files containing the Rare-X and Orphanet layers
layer_1 = ['../network/multiplex/RARE_X/RARE_X_layer.tsv']
layer_2 = ['../network/multiplex/Orpha/DiseaseDisease_PhenotypeOntology.tsv']
# Location of file containing the bipartite network
bipartite = ['../network/bipartite/bipartite_RARE_X_orpha_diseases.tsv']

# Configuration and seed files location
outconfig = 'config_files_DiseaseDisease_PhenotypeOntology_Weighted'
outseed = 'seed_files_DiseaseDisease_PhenotypeOntology_Weighted'
# Results location
outdir = 'output_DiseaseDisease_PhenotypeOntology_Weighted'

To run MultiXrank, we need **seeds**. In our case, the seeds are the 27 diseases contained in the Rare-X layer, and MultiXrank is run once for each of them. Below, we create the seed files that MultiXrank will use. They are stored in folder `results_MultiXrank_RARE_X_diseases/seed_files_DiseaseDisease_PhenotypeOntology_Weighted`.

In [3]:
def create_seeds_file(diseases_names_files: str, path_to_seeds_files: str) -> int:
    """Generate the seed files used by MultiXrank (one file per disease).

    Args:
        diseases_names_files (str): path to the file listing the diseases
            found in the RARE-X network
        path_to_seeds_files (str): path to the folder where seed files are written

    Returns:
        int: number of seed files written, plus one (the exclusive upper
            bound to use in `range(1, ...)` when looping over seed indices)
    """
    # create the directory to store the seed files
    os.makedirs(path_to_seeds_files, exist_ok=True)
    # extract diseases used as seeds = the 27 diseases in the RARE-X network
    diseases = pd.read_csv(diseases_names_files, sep=";", header=0)
    j = 1
    for _, row in diseases.iterrows():
        seeds_file = os.path.join(path_to_seeds_files, f"seeds_{j}.txt")
        # use the second column as seed name, falling back to the first column
        # when the second is "None" (pandas may also parse "None" as missing)
        first, second = row.iloc[0], row.iloc[1]
        seed = first if pd.isna(second) or second == "None" else second
        # one seed per file, written without header or index
        pd.DataFrame({"seed": [seed]}).to_csv(seeds_file, sep="\t", header=False, index=False)
        j += 1
    return j

seedNb = create_seeds_file(diseases_names_files="../data/Diseases_Rx_orpha_corres.csv",
                           path_to_seeds_files=f"../results_MultiXrank_RARE_X_diseases/{outseed}/")

We also need configuration files, which contain the parameters to use when running MultiXrank. We store these in folder `results_MultiXrank_RARE_X_diseases/config_files_DiseaseDisease_PhenotypeOntology_Weighted/`.

In [4]:
def create_config_files(path_to_config_files: str, path_to_seed_files: str, layers_1: list, layers_2: list) -> None:
    """Generate the configuration files required to run MultiXrank (one per seed).

    Relies on two notebook-level variables defined in earlier cells:
    `seedNb` (number of seeds + 1) and `bipartite` (path to the bipartite network).

    Args:
        path_to_config_files (str): path to the folder where configuration files will be stored
        path_to_seed_files (str): path to the folder where seed files are stored
        layers_1 (list): path to the network layer 1
        layers_2 (list): path to the network layer 2
    """
    os.makedirs(path_to_config_files, exist_ok=True)
    r = 0.7        # global restart probability
    delta = 0      # probability of switching layer within a multiplex
    eta = [1, 0]   # restart only in the first multiplex (Rare_X_layer)
    tau = [1]      # restart distribution over the layers of a multiplex

    for i in range(1, seedNb):
        # graph_type: 00 = undirected and unweighted, 01 = undirected and weighted
        config = (
            f"seed: {path_to_seed_files}/seeds_{i}.txt\n"
            "self_loops: 0\n"
            f"r: {r}\n"
            f"eta: {eta}\n"
            "lamb:\n"
            "    - [0.5,0.5]\n"
            "    - [0.5,0.5]\n"
            "multiplex:\n"
            "    Rare_X_layer:\n"
            "        layers:\n"
            f"            - {layers_1[0]}\n"
            f"        delta: {delta}\n"
            "        graph_type: [00]\n"
            f"        tau: {tau}\n"
            "    Orpha_layer:\n"
            "        layers:\n"
            f"            - {layers_2[0]}\n"
            f"        delta: {delta}\n"
            "        graph_type: [01]\n"
            f"        tau: {tau}\n"
            "bipartite:\n"
            f"    {bipartite[0]}: {{'source': 'Rare_X_layer', 'target': 'Orpha_layer', graph_type: 00}}\n"
        )
        # the context manager guarantees the file is flushed and closed
        with open(os.path.join(path_to_config_files, f"config_{i}.yml"), "w") as file:
            file.write(config)

create_config_files(path_to_config_files=f"../results_MultiXrank_RARE_X_diseases/{outconfig}",
                    path_to_seed_files=f"{outseed}",
                    layers_1=layer_1, layers_2=layer_2)
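
Before launching the runs, it can help to confirm that every configuration file has its matching seed file. The helper below is a hypothetical sketch (the name `missing_run_inputs` is not part of MultiXrank); it assumes the `config_{i}.yml` and `seeds_{i}.txt` naming used in this notebook.

```python
import os


def missing_run_inputs(config_dir, seed_dir, seed_count):
    """List the config/seed files expected for seeds 1..seed_count that do not exist."""
    missing = []
    for i in range(1, seed_count + 1):
        expected = (
            os.path.join(config_dir, f"config_{i}.yml"),
            os.path.join(seed_dir, f"seeds_{i}.txt"),
        )
        for path in expected:
            if not os.path.isfile(path):
                missing.append(path)
    return missing
```

With the variables defined above, `missing_run_inputs(f"../results_MultiXrank_RARE_X_diseases/{outconfig}", f"../results_MultiXrank_RARE_X_diseases/{outseed}", seedNb - 1)` should return an empty list if both generation steps succeeded.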

Now that we have seed files and configuration files, we can run MultiXrank. The results are stored in folder **`results_MultiXrank_RARE_X_diseases/output_DiseaseDisease_PhenotypeOntology_Weighted`**.

In [5]:
for i in range(1, seedNb):
    multixrank_obj = multixrank.Multixrank(config=f"../results_MultiXrank_RARE_X_diseases/{outconfig}/config_{i}.yml",
                                           wdir="../results_MultiXrank_RARE_X_diseases/")
    ranking_df = multixrank_obj.random_walk_rank()
    os.makedirs(f"../results_MultiXrank_RARE_X_diseases/{outdir}/output_{i}", exist_ok=True)
    multixrank_obj.write_ranking(ranking_df, path=f"../results_MultiXrank_RARE_X_diseases/{outdir}/output_{i}/")
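
Once the runs have finished, the rankings can be inspected per seed. The sketch below reads a tab-separated ranking file and returns its highest-scoring nodes. It is a hypothetical helper, not part of MultiXrank, and it assumes the ranking file has a header row with `node` and `score` columns; adjust the column names if your output differs.

```python
import csv


def top_ranked_nodes(ranking_file, n=5, score_column="score", node_column="node"):
    """Return the n highest-scoring (node, score) pairs from a tab-separated ranking file."""
    with open(ranking_file, newline="") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    # scores are read as text, so convert them before sorting
    rows.sort(key=lambda row: float(row[score_column]), reverse=True)
    return [(row[node_column], float(row[score_column])) for row in rows[:n]]
```

For example, assuming the first run wrote a file named `multiplex_Orpha_layer.tsv` (the exact file name is an assumption here), `top_ranked_nodes(f"../results_MultiXrank_RARE_X_diseases/{outdir}/output_1/multiplex_Orpha_layer.tsv")` would list the five Orphanet nodes closest to the first seed.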