Inquiry about metapaths from 2017 Paper "Systematic Integration of Biomedical Knowledge Prioritizes Drugs for Repurposing" #58
Comments
Quoting from the manuscript:
We have an interactive table of the metapaths here, but it doesn't look like it has the FDR-adjusted p-values. I think the dataset you want is fdr_delta_auroc = p.adjust(p = pval_delta_auroc, method = 'fdr') I think we also saved the FDR-adjusted p-values in
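For readers working outside R, here is a minimal Python sketch of the adjustment that `p.adjust(..., method = 'fdr')` performs (in R, `'fdr'` is an alias for the Benjamini-Hochberg method), followed by a filter at the 5% cutoff mentioned in the paper. The function names `bh_adjust` and `significant` and the example p-values are made up for illustration; they are not part of the project's code or data.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Visit p-values from largest to smallest so a running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for rank_from_top, i in enumerate(order):
        rank = m - rank_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant(names, pvalues, cutoff=0.05):
    """Names whose adjusted p-value falls below the cutoff."""
    return [n for n, q in zip(names, bh_adjust(pvalues)) if q < cutoff]


if __name__ == "__main__":
    # Hypothetical p-values, for illustration only.
    metapaths = ["CbGaD", "CtDrD", "CrCtD"]
    pvals = [0.001, 0.5, 0.9]
    print(significant(metapaths, pvals))
```

Applied to the real `pval_delta_auroc` column, a filter like this should yield the set of metapaths passing the FDR cutoff, assuming the column holds one p-value per metapath.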
Thank you very much for your prompt reply. I have successfully identified 1069 metapaths that meet the above criteria. My next objective is to find the instance paths for these metapaths. From my understanding, and based on the information you've provided, it seems you generated a query for each metapath to measure its effectiveness as a feature. Do you have the instance paths that were generated when running those queries? If these data are not available, would I need to execute the queries individually on Neo4j to retrieve the paths for all metapaths? Given that the Neo4j instance at https://neo4j.het.io/ often times out, this approach seems impractical. Could you advise on the best way to obtain these data? Any suggestions or alternative methods would be immensely helpful.
We do not store the actual paths corresponding to each combination of source node, target node, and metapath. Instead, we generate them on the fly via Cypher queries to Neo4j. When the path count is large, i.e. over 10,000, I don't suggest trying to generate all paths. I don't see a valid use case for generating such a large number of paths anyway: when the path count is that large, any individual path tends to be pretty meaningless. Also note our recent publication, "Hetnet connectivity search provides rapid insights into how two biomedical entities are related."
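As a rough illustration of generating paths on the fly with a cap on the result size, the Python sketch below builds a Cypher query string for one metapath between a given source and target node. The helper name `metapath_query` is hypothetical, and the node labels, relationship type names (such as `BINDS_CbG`), and the `identifier` property are assumptions about the Neo4j schema that should be checked against the actual database. It is not the project's own query generator: it matches every edge in either orientation and does not exclude paths with repeated nodes.

```python
def metapath_query(source_label, target_label, rel_types, limit=1000):
    """Build a Cypher query returning up to `limit` paths for one metapath.

    The source and target nodes are supplied at run time through the
    $source and $target query parameters.
    """
    if not rel_types:
        raise ValueError("a metapath needs at least one metaedge")
    parts = [f"(n0:{source_label} {{identifier: $source}})"]
    last = len(rel_types)
    for i, rel in enumerate(rel_types, start=1):
        if i == last:
            node = f"(n{i}:{target_label} {{identifier: $target}})"
        else:
            node = f"(n{i})"
        # No arrowhead: the pattern matches the edge in either orientation.
        parts.append(f"-[:{rel}]-{node}")
    return f"MATCH path = {''.join(parts)} RETURN path LIMIT {int(limit)}"


if __name__ == "__main__":
    # Assumed relationship type names, for illustration only.
    print(metapath_query("Compound", "Disease",
                         ["BINDS_CbG", "ASSOCIATES_DaG"], limit=10))
```

The `LIMIT` clause is the relevant design choice here: it keeps a single query from trying to enumerate tens of thousands of paths, which is the situation described above as both slow and uninformative.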
Thanks a lot! I also have a question about the undirected metaedges in the paper. You mention them in the last sentence of the first paragraph on page 7: "Note that all metaedges besides Gene->regulates->Gene are undirected." Take Anatomy–upregulates–Gene as an example: we cannot say Gene->upregulates->Anatomy, only that the Gene is upregulated in the Anatomy. Isn't this just a directed edge?
See related issue #23. Whether a metaedge/edge is directional or symmetric is a distinction that is most relevant when the source and target metanode are the same. When there are different source and target metanodes, we encoded "directionality" as different metaedges like:
You mean that you use different edges between the same node pair to express directionality. However, the Compound and Disease pair does not have different edges to represent direction. Instead, the same edge is used in the reverse direction within a metapath, e.g., Compound–palliates–Disease–palliates–Compound–treats–Disease. So I'm confused about how to distinguish the direction, or whether all edges in the meta-knowledge graph are bidirectional, even Anatomy–upregulates–Gene and Anatomy–downregulates–Gene.
Compound–palliates–Disease and Disease–palliates–Compound are the same edge type, just with different orientations. There is no difference in the semantic meaning between the two, which is why we consider the bipartite edges in Hetionet as bidirectional.
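One way to picture "same edge, different orientation" is to store each undirected edge under an orientation-independent key, so that reading it from either end gives the same record. The Python sketch below does this with a hypothetical `canonical` helper and made-up node identifiers. It is only an illustration of the idea, not how Hetionet is stored, and it deliberately does not cover the one directed metaedge, Gene->regulates->Gene.

```python
def canonical(edge):
    """Return an orientation-independent key for an undirected edge.

    `edge` is a (source, kind, target) triple; the two endpoints are
    sorted so both orientations map to the same key.
    """
    source, kind, target = edge
    a, b = sorted([source, target])
    return (a, kind, b)


if __name__ == "__main__":
    # Hypothetical node identifiers, for illustration only.
    forward = ("Compound::c1", "palliates", "Disease::d1")
    reverse = ("Disease::d1", "palliates", "Compound::c1")
    print(canonical(forward) == canonical(reverse))
```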
Hi, I am a postgraduate student studying domain adaptation of pre-trained language models. I've been following your work on biomedical data integration.
I was particularly intrigued by your 2017 paper titled "Systematic Integration of Biomedical Knowledge Prioritizes Drugs for Repurposing." In it, you mentioned that 709 of the 1206 metapaths exhibited a statistically significant AUROC at a false discovery rate cutoff of 5%. However, while trying to replicate some of the results and delve deeper into the open-source data, I was unable to locate these 709 metapaths. Would it be possible for you to provide the specific metapaths and their instance paths? I am keen on further exploring these paths and your assistance would be of great help as I continue my journey in the biomedical domain.