header info in input files #1
Comments
The S6 Data caption from the associated PLOS Computational Biology paper is slightly more helpful:
The caption for Table 3 is:
So hopefully that answers your questions. Note that the files available at http://het.io/disease-genes/downloads/ are from our 2015 study to predict disease-associated genes. In general, most users will be interested in Hetionet v1.0, which is available at https://neo4j.het.io (is down right now, will fix) and at https://github.com/dhimmel/hetionet. This hetnet is described in our 2017 eLife study called Project Rephetio. That project has much more detailed supplementary methods, since we discussed all code and data on Thinklab while performing the project. For example, see this discussion for how we processed the GWAS Catalog to get gene-disease associations in Project Rephetio. We used a very similar method in the predecessor study.
More generally, @jcbarret correctly points out that the table columns are not very well documented for the files at http://het.io/disease-genes/downloads/. At this point, I don't have any immediate plans to fix this, but I encourage users to post GitHub issues with any questions. At some point in the future, I'd like to revamp the het.io website and may address some of these issues then.
We're moving the downloads page for the disease-genes study to GitHub from https://het.io/disease-genes/downloads/. The README (pinned version) now shows the first two rows of each table for convenience. While the columns are still not fully documented, I will close this for now. Happy to elaborate on column meanings as requested. As I note above, most users will probably be interested in the newer Hetionet data instead.
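For anyone who wants to preview a table's header and first two rows locally, here is a minimal sketch. It assumes the download files are tab-delimited text, which is not confirmed in this thread. The helper `preview_table`, the `disease` column name, and all sample values are hypothetical and only for illustration. The column names HC-P, HC-S, LC-P, and LC-S are the ones mentioned in the question.

```python
import csv
import io


def preview_table(text, n_rows=2, delimiter="\t"):
    """Return the header and the first n_rows data rows of delimited text.

    Hypothetical helper: assumes the first line is a header row and that
    fields are separated by `delimiter` (tab by default).
    """
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    # zip stops after n_rows items, so the rest of the file is never read.
    rows = [row for _, row in zip(range(n_rows), reader)]
    return header, rows


# Made-up sample standing in for a downloaded file such as diseases.txt.
# The values below are placeholders, not real data.
sample = (
    "disease\tHC-P\tHC-S\tLC-P\tLC-S\n"
    "example disease A\t1\t2\t3\t4\n"
    "example disease B\t5\t6\t7\t8\n"
    "example disease C\t9\t10\t11\t12\n"
)

header, rows = preview_table(sample)
print("\t".join(header))
for row in rows:
    print("\t".join(row))
```

To use it on a real download, read the file's contents into a string and pass that to `preview_table` in place of `sample`.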
I'm looking at files at http://het.io/disease-genes/downloads/ and am wondering if there's a key to the headers of the different input files? For example, https://raw.githubusercontent.com/dhimmel/het.io-dag-data/d8028c8820322ae4ad7642998bccc3ee7318ff16/downloads/diseases.txt has columns HC-P, HC-S, LC-P, LC-S but I'm not sure what they are. Sorry if this is obvious somewhere, but I couldn't find it after some searching.