Does PolyFun Impute Summary Statistics Based on the Annotation File? #215

@ChrisK1988

Hi, thank you for such a great tool. I think I have hit an odd scenario in my data. When running genome-wide fine-mapping for my trait of interest, I found 131 "phantom" variants in my credible sets that are not in my input summary statistics. Working backwards by process of elimination, I found that all of these variants are present in the baselineLF2.2.UKB chromosome 8 annotations. They all have very low PIPs (generally below 0.001), but it seems odd that these variants would be inserted as candidate variants unless the software is imputing the summary statistics automatically. Could someone confirm whether this is expected behaviour? I am working with ~22 different traits and have detected this issue in two of them so far, but I have only dug deeply enough to confirm what is going on in one of them. I assume the issue is the same in the other traits, however.

I do not see any fatal errors or warnings that would suggest something is wrong, nor anything mentioning imputation in the log files when my data runs, so I thought I would ask.

Thank you so much for your help,

Chris

My code is here:

#L2-regularized S-LDSC for sumstats
python polyfun.py --compute-h2-L2 --output-prefix Outputs/ADHD/Results/ADHD --sumstats Outputs/ADHD/SumStats/ADHD_Sumstats_munged.parquet --ref-ld-chr Annotations/baselineLF2.2.UKB/baselineLF2.2.UKB. --w-ld-chr Annotations/baselineLF2.2.UKB/weights.UKB. --allow-missing --no-partitions

#Create jobs
for i in {1..22}; do python create_finemapper_jobs.py --sumstats Outputs/ADHD/Results/ADHD.${i}.snpvar_ridge_constrained.gz --method finemap --finemap-exe finemap_v1.4.2_x86_64/finemap_v1.4.2_x86_64 --max-num-causal 1 --allow-missing --n 225534 --out-prefix Outputs/ADHD/Results/ADHD_All --jobs-file Outputs/ADHD/Results/ADHD_All_Jobs.Chr${i}.txt; done

#Submit jobs and run script
for chr in {1..22}; do while read i; do ${i}; done < Outputs/ADHD/Results/ADHD_All_Jobs.Chr${chr}.txt; done

The log files are below.

Munging step:

[INFO] Reading sumstats file...
[INFO] Done in 3.87 seconds
[INFO] 6774224 SNPs are in the sumstats file
[INFO] Removing 34039 HLA SNPs
[INFO] 6740185 SNPs with sumstats remained after all filtering stages
[INFO] Saving munged sumstats of 6740185 SNPs to Outputs/ADHD/Sumstats/ADHD_Sumstats_munged.parquet
[INFO] Done
ADHD_Sumstats_munged.parquet.log

L2-regularization step:

[INFO] Reading summary statistics from Outputs/ADHD/Sumstats/ADHD_Sumstats_munged.parquet ...
[INFO] Read summary statistics for 6740185 SNPs.
[INFO] Reading reference panel LD Score from Annotations/baselineLF2.2.UKB/baselineLF2.2.UKB.[1-22] ...
[INFO] Read reference panel LD Scores for 19386297 SNPs.
[INFO] Reading regression weight LD Score from Annotations/baselineLF2.2.UKB/weights.UKB.[1-22] ...
[INFO] Read regression weight LD Scores for 18275613 SNPs.
[INFO] After merging with reference panel LD, 6725531 SNPs remain.
[INFO] After merging with regression SNP LD, 6725478 SNPs remain.
[INFO] iterating over chromosomes to compute XTX, XTy...
[INFO] Evaluating Ridge lambdas...
[INFO] Selected ridge lambda: 9.3260e-02 (70/100) score: 7.3274e-02 score lstsq: 7.3026e-02
[INFO] Estimating annotation coefficients for each chromosomes set
[INFO] Computing per-SNP h^2 for each chromosome...
[INFO] Saving constrained SNP variances to disk
[WARNING] not all SNPs in the sumstats file and/or in the LD reference files are also in the annotations file. Keeping 6740165/6740185 SNPs
[INFO] Saving SNP variances to disk
[WARNING] not all SNPs in the sumstats file and/or in the LD reference files are also in the annotations file. Keeping 6740165/6740185 SNPs

Fine-mapping step for an example region where one of the phantom variants was discovered (8_rs1984568_140657292):

[INFO] Loading sumstats file...
[INFO] Loaded sumstats for 385938 SNPs in 0.87 seconds
[INFO] Starting functionally-informed FINEMAP fine-mapping for chromosome 8 BP 140000001-143000001 (7609 SNPs)
[INFO] Running FINEMAP...
[INFO] done in 1.85 seconds
[INFO] Done in 1.86 seconds
[INFO] Writing fine-mapping results to Outputs/ADHD/Results/ADHD_All.Chr8.chr8.140000001_143000001.gz
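
For anyone who wants to reproduce the process-of-elimination check described above, here is a minimal sketch of the idea. It assumes that both the fine-mapping output and the munged summary statistics can be read as rows with CHR, BP, A1 and A2 columns; those column names, the helper names `variant_key` and `find_phantom_variants`, and the toy rows are all illustrative assumptions, not something taken from the PolyFun code.

```python
from typing import Dict, Iterable, List, Tuple

# (chromosome, base-pair position, allele 1, allele 2)
VariantKey = Tuple[str, int, str, str]


def variant_key(row: Dict[str, str]) -> VariantKey:
    """Build a lookup key from one row.

    Assumes columns named CHR, BP, A1 and A2 (an assumption about the
    file layout). Alleles are upper-cased so that case differences do
    not produce spurious mismatches.
    """
    return (str(row["CHR"]), int(row["BP"]), row["A1"].upper(), row["A2"].upper())


def find_phantom_variants(
    finemap_rows: Iterable[Dict[str, str]],
    sumstats_rows: Iterable[Dict[str, str]],
) -> List[Dict[str, str]]:
    """Return fine-mapping rows whose key is absent from the input sumstats."""
    known = {variant_key(r) for r in sumstats_rows}
    return [r for r in finemap_rows if variant_key(r) not in known]


# Toy rows for illustration only (not real data).
toy_sumstats = [
    {"SNP": "rs_toy_1", "CHR": "8", "BP": "1000", "A1": "A", "A2": "G"},
]
toy_finemap = [
    {"SNP": "rs_toy_1", "CHR": "8", "BP": "1000", "A1": "A", "A2": "G"},
    {"SNP": "rs_toy_2", "CHR": "8", "BP": "2000", "A1": "C", "A2": "T"},
]

phantoms = find_phantom_variants(toy_finemap, toy_sumstats)
for row in phantoms:
    print(row["SNP"], row["CHR"], row["BP"])
```

Note that this strict key treats a variant with swapped A1/A2 as a different variant, so allele flips between files would also show up as "phantoms" and may need to be ruled out separately.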
