alessio-gamba/Similarity_Inter_PS

Similarity Inter Phenotypic Series

The datasets in the enclosed files were generated and analyzed as described in the manuscript "The Similarity of the Inherited Diseases (II): Clinical and Biological Similarity between the Phenotypic Series" by Gamba A, Salmona M, Cantù L and Bazzoni G (BMC Medical Genomics, 2020, submitted).

Outputs folder:

File '01_PS_id_names.txt' lists the OMIM-derived Phenotypic Series (PS) analyzed in this study. It provides the ID and name of each PS.

File '02_similarity_coefficients_CSN.txt' reports the clinical similarity coefficient (Sim_HPO) for each pair of PS (PSi_PSj), based on the shared disease phenotypes (i.e., the annotations of the diseases in Human Phenotype Ontology). The PS are the nodes of the Clinical Similarity Network (CSN) reported in Figure 2 of the manuscript, while the similarity coefficients represent the weight w of the edge connecting two PS in the CSN.
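As a minimal sketch of how such pairwise coefficients could be turned into a weighted network, the snippet below builds an undirected adjacency map from edge records. The record layout is an assumption, not something this README specifies: each line is taken to hold a `PSi_PSj` pair key and one coefficient separated by a tab, with no header row. The function name `build_network` and the sample IDs and values are hypothetical.

```python
from collections import defaultdict


def build_network(lines, sep="\t"):
    """Build an undirected weighted adjacency map from 'PSi_PSj<sep>w' records.

    Assumed layout (not confirmed by the repository): one pair per line,
    the two PS ids joined by a single underscore, followed by the weight.
    """
    graph = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        pair, weight = line.split(sep)
        ps_i, ps_j = pair.split("_")
        w = float(weight)
        # The similarity is symmetric, so store the edge in both directions.
        graph[ps_i][ps_j] = w
        graph[ps_j][ps_i] = w
    return graph


# Hypothetical records, for illustration only.
sample = ["PS100070_PS104300\t0.35", "PS100070_PS105400\t0.10"]
csn = build_network(sample)
```

With a real file, the same function could be fed an open file handle in place of `sample`, provided the file actually follows the assumed layout.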

File '03_similarity_coefficients_BSNs.txt' reports the biological similarity coefficients (Sim_BP, Sim_CC, Sim_MF) for each pair of PS (PSi_PSj), based on the shared biological features (i.e., the annotations of the disease gene products in each of the three sub-ontologies of Gene Ontology, namely Biological Process, BP; Cellular Component, CC; Molecular Function, MF). The PS are the nodes of the Biological Similarity Networks (BSNs) reported in Figures S2 (BSN-BP), S3 (BSN-CC) and S4 (BSN-MF) of the manuscript, while the similarity coefficients represent the weight w of the edge connecting the two PS in the relevant BSN. In the general BSN (reported in Figure 3), the edge linking a given pair of PS is the one with the highest w among the three edges that link the same PS pair in the three sub-ontology BSNs (Sim_GO_max).
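The Sim_GO_max rule described above can be sketched as follows. This is an illustration, not the code in 'Similarity_go_hpo.py': the function name `sim_go_max`, the handling of missing coefficients as `None`, and the sample pairs and values are all assumptions.

```python
def sim_go_max(sim_bp, sim_cc, sim_mf):
    """Return (highest coefficient, sub-ontology it came from) for one PS pair.

    Missing coefficients are passed as None and ignored (an assumption).
    On a tie, the first sub-ontology in BP, CC, MF order is reported.
    """
    scores = {"BP": sim_bp, "CC": sim_cc, "MF": sim_mf}
    present = {name: w for name, w in scores.items() if w is not None}
    if not present:
        return None, None  # no edge in any sub-ontology BSN
    best = max(present, key=present.get)
    return present[best], best


# Hypothetical coefficients (Sim_BP, Sim_CC, Sim_MF) for two PS pairs.
pairs = {
    ("PS_A", "PS_B"): (0.12, 0.40, 0.25),
    ("PS_A", "PS_C"): (0.30, None, 0.10),
}

# Edge weights of the general BSN: one (weight, source) entry per PS pair.
general_bsn = {pair: sim_go_max(*sims) for pair, sims in pairs.items()}
```

Keeping the source sub-ontology next to the weight makes it possible to see which of the three BSNs contributed each edge of the general network.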

Python code:

'Build_mm_and_PS.py' is the Python script that builds the 'morbid map' and 'PS' data in the structure required for the analysis.

'Similarity_go_hpo.py' is the Python script that calculates the similarity among PS. It requires the complete ontologies, as well as the gene and disease annotations, from both GO and HPO.

'Omim_tools' (1 and 2) are helper modules that the two scripts import directly and use for the computation.
