qval vs pval_beta in *.egenes.txt.gz and *.signif_variant_gene_pairs.txt.gz #2

Closed
char4816 opened this issue Oct 2, 2020 · 2 comments


char4816 commented Oct 2, 2020

Hi,

First off, thank you for all of the awesome work with GTEx v8. I am trying to compare my eQTL results to the v8 GTEx database (see which eQTLs are replicated vs novel to my cell type).

I think the best approach would be to compare my eQTLs with q-value < 0.05 to the *.signif_variant_gene_pairs.txt.gz files. However, I am struggling to find (or derive) q-values for each of these variant-gene pairs in GTEx:
https://gtexportal.org/home/datasets

As I understand the documentation from:
https://storage.googleapis.com/gtex_analysis_v8/single_tissue_qtl_data/README_eQTL_v8.txt
the order of p-value adjustment follows the path of:
pval_nominal -> pval_perm -> pval_beta -> qval

In the files I see:

  • The *.allpairs.txt.gz files only contain pval_nominal.
  • The *.signif_variant_gene_pairs.txt.gz files contain adjustments through pval_beta. Since pval_beta tends to exceed 0.05 in these files, I assume that this list of "significant variant-gene pairs" is the subset of *.allpairs.txt.gz with qval ≤ 0.05. However, there is no qval column.
  • The *.egenes.txt.gz files contain adjustments through qval.

I am confused as to why qval isn't included in all 3 file types. Given that the "eGenes are the rows with qval ≤ 0.05," I assume I should be using qval to identify which eQTLs replicate between my analysis and GTEx. It would be really helpful if the *.allpairs.txt.gz files contained a qval column for this comparison (or if there were code with which I could derive the qval column). At first I tried to do qvalue(pval_beta) for the subset of significant variant-gene pairs, but this fails to replicate the qval values, since it is computed on only a subset of the variant-gene pairs.

Thanks in advance for any help you might be able to provide,
Chris

@francois-a
Collaborator

Hi,

FDR was computed at the gene level (to identify eGenes), and therefore the q-values are only present in the *.egenes.txt.gz files. For details, please see Section 4.2 of the supplementary materials.
The *.signif_variant_gene_pairs.txt.gz files contain all pairs that pass the significance threshold for each eGene.
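
As a rough illustration of the two-step logic described above (gene-level FDR first, then a per-gene threshold on nominal p-values), here is a minimal sketch in plain Python. It rests on several assumptions: the toy gene and variant names are made up, the per-gene cutoff is assumed to be available as a `pval_nominal_threshold` value alongside `pval_beta`, and Benjamini-Hochberg adjustment is swapped in as a simple stand-in for the Storey q-value procedure, so the numbers will not match the published `qval` column.

```python
def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (a stand-in for Storey q-values)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy data: one row per gene, in the spirit of *.egenes.txt.gz.
genes = {
    "geneA": {"pval_beta": 1e-6, "pval_nominal_threshold": 1e-4},
    "geneB": {"pval_beta": 0.03, "pval_nominal_threshold": 5e-5},
    "geneC": {"pval_beta": 0.60, "pval_nominal_threshold": 2e-5},
}

# Hypothetical variant-gene pairs with nominal p-values, in the spirit of
# *.allpairs.txt.gz: (gene_id, variant_id, pval_nominal).
pairs = [
    ("geneA", "var1", 1e-8),
    ("geneA", "var2", 1e-3),
    ("geneB", "var3", 1e-5),
    ("geneC", "var4", 1e-6),
]

# Step 1: FDR across genes, using each gene's pval_beta.
names = list(genes)
qvals = dict(zip(names, bh_adjust([genes[g]["pval_beta"] for g in names])))
egenes = {g for g in names if qvals[g] <= 0.05}

# Step 2: for eGenes only, keep pairs that pass that gene's nominal threshold.
signif_pairs = [
    (g, v, p)
    for g, v, p in pairs
    if g in egenes and p <= genes[g]["pval_nominal_threshold"]
]
```

Note that the adjustment in step 1 depends on the full set of genes tested, which is why recomputing it on a filtered subset would give different values.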

To test replication, consider computing the pi1 statistic (Section 5 of the supplementary materials). For that, you only need the nominal p-values (from the *.allpairs.txt.gz files) in GTEx corresponding to the variant-gene pairs that you're testing for replication.
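
For readers who want a starting point, the pi1 idea can be sketched as follows. This is a simplified version under stated assumptions: it uses Storey's pi0 estimator at a single fixed lambda (0.5) rather than the smoothed estimate over a range of lambdas that the qvalue R package uses by default, and the lookup table and pair names are hypothetical toy data standing in for nominal p-values pulled from the *.allpairs.txt.gz files.

```python
def pi1_estimate(pvals, lam=0.5):
    """Estimate pi1 = 1 - pi0 with Storey's estimator at one fixed lambda."""
    if not pvals:
        raise ValueError("need at least one p-value")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1)")
    m = len(pvals)
    # Null p-values are uniform, so the fraction above lam, rescaled by
    # (1 - lam), estimates the proportion of true nulls (pi0).
    pi0 = sum(p > lam for p in pvals) / (m * (1.0 - lam))
    pi0 = min(pi0, 1.0)  # the raw estimate can exceed 1 on small samples
    return 1.0 - pi0


# Hypothetical lookup of GTEx nominal p-values keyed by (gene_id, variant_id).
gtex_nominal = {
    ("geneA", "var1"): 1e-8,
    ("geneA", "var2"): 0.002,
    ("geneB", "var3"): 0.04,
    ("geneC", "var4"): 0.71,
    ("geneD", "var5"): 0.0003,
}

# Hypothetical significant pairs from the study being tested for replication.
my_signif_pairs = [
    ("geneA", "var1"),
    ("geneA", "var2"),
    ("geneB", "var3"),
    ("geneC", "var4"),
    ("geneD", "var5"),
    ("geneE", "var6"),  # not tested in GTEx, so it is skipped below
]

# Keep only the pairs that were also tested in GTEx, then estimate pi1.
tested = [gtex_nominal[k] for k in my_signif_pairs if k in gtex_nominal]
pi1 = pi1_estimate(tested)
```

With real data the estimate is only meaningful for a reasonably large number of overlapping pairs; a handful of p-values, as in this toy example, is far too few.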

@char4816
Author

char4816 commented Oct 6, 2020

Francois,

Thank you very much for your helpful reply. This clarifies many of the things I was confused about. I will give this a try and let you know if I get stuck.

Chris
