missing phenotypes in icdinfo #4
Here is a list of the sources of missing phenotypes:
The above comprise the vast majority of the missing phenotypes. There are a few (~80) traits that need to be updated in the pipeline. These are listed below in order of priority, but most of them cannot yet be incorporated into the pipeline proper.
There are also a couple of errors in the existing pipeline that affect many more traits, but which have to be resolved with the current versioning, namely:
Family History phenotypes have been added to the resource, and duplicate entries have been fixed in
The cardiac phenotypes below are still error-prone in icdinfo:
It appears that this is due to an error in how the
It seems that the phenotype definition with the info file (above) has no non-missing data, so the pipeline is working properly:
But the one we actually want to have logged has not been annotated (see above):
https://github.com/rivas-lab/ukbb-tools/blob/master/02_phenotyping/tables/ukb_20181109.tsv
Phenotypes in this file are missing
I found the issue behind the past two comments. As Matt and I discovered yesterday, there were some spacing issues in https://github.com/rivas-lab/ukbb-tools/blob/master/02_phenotyping/tables/ukb_20181109.tsv. We have since re-saved a proper version in its place. This is probably why Also:
There are a large number of RH and MED phenotypes with 0 counts in the icdinfo file. Ignoring those, though, all phenotypes that have the weird bug of having 0s in our
BIN_FC10010844's phe file isn't empty; it contains people who are not in GBE. Perhaps these people don't make it into GBE because they are not white British unrelated (can someone check this?). I'm not sure where to find the HC phes, to be honest. Interestingly, none of these phes that have 0 in them are in
Long story short, a good number of problems will be resolved by rerunning gbe.sh on ukb_20181109.tsv. I strongly suggest we do a
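A quick check for the kind of spacing problem mentioned above can be sketched as follows. The thread does not say exactly what the spacing issues in ukb_20181109.tsv were, so this assumes two common cases: rows whose field count differs from the header (for example, a space typed where a tab belongs) and fields padded with stray whitespace. The function name `find_spacing_issues` and the sample rows are hypothetical.

```python
import io


def find_spacing_issues(lines, delimiter="\t"):
    """Return (line_number, reason) pairs for rows with suspicious whitespace."""
    issues = []
    expected_fields = None
    for number, raw in enumerate(lines, start=1):
        row = raw.rstrip("\n")
        if not row:
            continue  # skip blank lines
        fields = row.split(delimiter)
        if expected_fields is None:
            # Assumption: the first non-empty row is the header and sets the width.
            expected_fields = len(fields)
        elif len(fields) != expected_fields:
            issues.append(
                (number, f"expected {expected_fields} fields, found {len(fields)}")
            )
        for field in fields:
            if field != field.strip():
                issues.append((number, f"padded field {field!r}"))
    return issues


# Hypothetical sample: row 2 has a padded field, row 3 uses a space instead of a tab.
sample = io.StringIO("a\tb\tc\n1\t2 \t3\n4 5\t6\n")
for number, reason in find_spacing_issues(sample):
    print(number, reason)
```

To run it on the real table, pass an open file handle for a local copy of ukb_20181109.tsv instead of `sample`.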
all_phes.txt
I highly encourage: 1) removing duplicates, 2) moving all non-duplicates to
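As a rough sketch of step 1, duplicates could be separated from non-duplicates like this. It assumes all_phes.txt holds one phe file path per line and that a "duplicate" means the same file name appearing under more than one path; neither assumption is confirmed in this thread, and `split_duplicates` and the example paths are hypothetical.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def split_duplicates(paths):
    """Split paths into those sharing a file name and those with a unique name."""
    by_name = defaultdict(list)
    for raw in paths:
        path = raw.strip()
        if not path:
            continue  # skip blank lines
        name = PurePosixPath(path).name
        if path not in by_name[name]:  # ignore exact repeats of the same path
            by_name[name].append(path)
    duplicates = {name: found for name, found in by_name.items() if len(found) > 1}
    unique = sorted(found[0] for found in by_name.values() if len(found) == 1)
    return duplicates, unique


# Hypothetical entries; the real all_phes.txt layout may differ.
example = ["dirA/HC65.phe\n", "dirB/HC65.phe\n", "dirA/INI50.phe\n", "dirA/INI50.phe\n"]
duplicates, unique = split_duplicates(example)
print(duplicates)
print(unique)
```

The `unique` list is what would be moved in step 2; the `duplicates` mapping shows which copies need a decision before anything is deleted.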
Looks like HC65.phe also has a very low number of cases, which may be excluded from analyses when only white British unrelated individuals are analyzed. It is also not in master.phe, probably for this very reason. (@maguirre1?)
Also, biomarker phenotypes are not in It might be nice to assign new GBE_IDs for adjusted traits and include both unadjusted and adjusted traits in the master phe & icdinfo.
Long overdue, but a new icdinfo file has been generated. This fixes the issues with the QRS traits.
diff icdinfos
v1:
/oak/stanford/groups/mrivas/users/$USER/repos/rivas-lab/wiki/ukbb/icdinfo/icdinfo.txt
v2:
https://github.com/rivas-lab/ukbb-tools/blob/master/02_phenotyping/icdinfo.txt
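For a comparison that is easier to read than a raw line diff, the two versions can be compared by phenotype ID. This is only a sketch: it assumes both icdinfo files are tab-delimited with the phenotype ID in the first column, which is not stated in this thread, and the helper names `load_rows` and `diff_tables` are hypothetical.

```python
def load_rows(lines, key_column=0, delimiter="\t"):
    """Index rows by the value in key_column (assumed to be the phenotype ID)."""
    rows = {}
    for raw in lines:
        row = raw.rstrip("\n")
        if not row or row.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = row.split(delimiter)
        rows[fields[key_column]] = fields
    return rows


def diff_tables(old_lines, new_lines):
    """Return IDs only in the old table, only in the new one, and in both but changed."""
    old, new = load_rows(old_lines), load_rows(new_lines)
    removed = sorted(old.keys() - new.keys())
    added = sorted(new.keys() - old.keys())
    changed = sorted(key for key in old.keys() & new.keys() if old[key] != new[key])
    return removed, added, changed


# Hypothetical rows standing in for v1 and v2.
v1 = ["A\t10\n", "B\t0\n", "C\t5\n"]
v2 = ["A\t10\n", "B\t7\n", "D\t3\n"]
removed, added, changed = diff_tables(v1, v2)
print("removed:", removed)
print("added:", added)
print("changed:", changed)
```

With local copies of v1 and v2, open each file and pass the handles in place of the example lists.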