gnomAD v4 #161

Open
equinne5 opened this issue Dec 6, 2023 · 8 comments

Comments

@equinne5

equinne5 commented Dec 6, 2023

Hi Brent,

Thanks so much for SLIVAR and all of your wonderful tools!
I was just wondering if there are any plans to generate gnotate files for gnomAD v4. I understand it's a huge undertaking, so I just wanted to see if it's in the works or if we should look into generating one ourselves.

Emma

@wwgordon

Hi Emma,

I am not involved with slivar development, but my lab is currently putting together a gnomAD v4 gnotate file. Assuming it works as planned, I would be happy to share the file and/or the script we used to generate it. To keep the file size manageable, we are only including a small subset of INFO fields, which may not match what you require:

##INFO=<ID=fafmax_faf95_max_joint,Number=A,Type=Float,Description="Maximum filtering allele frequency (using Poisson 95% CI) across genetic_ancestry groups in joint subset">
##INFO=<ID=fafmax_faf95_max_gen_anc_joint,Number=A,Type=String,Description="Genetic ancestry group with maximum filtering allele frequency (using Poisson 95% CI) in joint subset">
##INFO=<ID=faf95_joint,Number=A,Type=Float,Description="Filtering allele frequency (using Poisson 95% CI) in joint subset">
##INFO=<ID=nhomalt_joint,Number=A,Type=Integer,Description="Count of homozygous individuals in joint subset">
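Since a String field is dropped later in this thread because it can't be gnotated, it can help to sort candidate INFO fields by type up front. The following is only a sketch: it parses the four header lines quoted above (descriptions shortened) from a scratch file, whereas on a real file the header would come from something like `bcftools view -h`. The variable names are made up for illustration.

```shell
# Header lines as quoted in this thread (descriptions shortened for brevity).
header_file=$(mktemp)
cat > "$header_file" <<'EOF'
##INFO=<ID=fafmax_faf95_max_joint,Number=A,Type=Float,Description="Maximum filtering allele frequency">
##INFO=<ID=fafmax_faf95_max_gen_anc_joint,Number=A,Type=String,Description="Genetic ancestry group with maximum filtering allele frequency">
##INFO=<ID=faf95_joint,Number=A,Type=Float,Description="Filtering allele frequency">
##INFO=<ID=nhomalt_joint,Number=A,Type=Integer,Description="Count of homozygous individuals">
EOF

# Pull the ID out of every INFO line whose Type is Float or Integer.
numeric_fields=$(grep -E '^##INFO=<ID=[^,]+,Number=[^,]+,Type=(Float|Integer),' "$header_file" \
  | sed -E 's/^##INFO=<ID=([^,]+),.*/\1/')

# Same for String fields, which would have to be left out.
string_fields=$(grep -E '^##INFO=<ID=[^,]+,Number=[^,]+,Type=String,' "$header_file" \
  | sed -E 's/^##INFO=<ID=([^,]+),.*/\1/')

echo "numeric: $(echo $numeric_fields)"
echo "string:  $(echo $string_fields)"
rm -f "$header_file"
```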

I should have this tested later this week.

Cheers,
William

@equinne5
Author

Hi William, apologies for the delayed response, and thank you so much. That would be brilliant; if you don't mind, I'd love to take you up on that! It would be great to have both the file and the script if that's alright. You can let me know the best way to share once it's ready. Thanks so much again for your kind offer!
All the best,
Emma

@wwgordon

Hi Emma,

I started with the full gnomAD v4 release, one bgz per chromosome. As mentioned, we only needed 3 annotations: we dropped fafmax_faf95_max_gen_anc_joint because it is a string and therefore can't be gnotated. So first I pulled these 3 annotations using bcftools annotate -x (I use nohup because my connection is shaky):

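nohup bash -c '
module load bcftools/1.17
mkdir -p temp_for_gnotate

# The leading ^ inverts -x: keep only the listed INFO tags and drop the rest.
for file in gnomad*bgz; do
  bcftools annotate -x "^INFO/fafmax_faf95_max_joint,INFO/faf95_joint,INFO/nhomalt_joint" \
    --output-type z \
    --output "temp_for_gnotate/toConcat_$file.bgz" "$file" &
done
wait  # block until every per-chromosome job has finished
' &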
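For anyone tailoring the field list, the `-x` argument can be built from an array so the same names are reused in later steps. This is just a sketch; the `fields` and `keep_list` names are illustrative and not part of the original script.

```shell
# INFO fields to keep (the three used in this thread).
fields=(fafmax_faf95_max_joint faf95_joint nhomalt_joint)

# printf repeats its format once per array element: "INFO/a,INFO/b,INFO/c,"
keep_list=$(printf 'INFO/%s,' "${fields[@]}")

# Drop the trailing comma and prefix ^ so that -x keeps these tags instead of removing them.
keep_list="^${keep_list%,}"

echo "$keep_list"
# Would then be passed as: bcftools annotate -x "$keep_list" ...
```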

Then I just concatenated these chroms into a single bcf:

ls -v temp_for_gnotate/toConcat* | \
bcftools concat \
  --file-list /dev/stdin \
  --output temp_for_gnotate/gnomad_v4_faf95joint_allChroms.bcf \
  --output-type b
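The `-v` flag matters here because bcftools concat joins the files in the order they are listed, and a plain alphabetical listing would put chr10 before chr2. A small sketch of the effect with GNU ls, using empty placeholder files with hypothetical names:

```shell
# Create empty placeholder files named like per-chromosome outputs (names are hypothetical).
demo_dir=$(mktemp -d)
for chrom in chr1 chr10 chr2 chr21 chrX; do
  touch "$demo_dir/toConcat_gnomad.$chrom.vcf.bgz"
done

# ls -v sorts embedded numbers numerically (GNU version sort).
natural_order=$(ls -v "$demo_dir" \
  | sed -E 's/.*\.(chr[^.]+)\.vcf\.bgz$/\1/' \
  | paste -sd ' ' -)

echo "ls -v order: $natural_order"
rm -rf "$demo_dir"
```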

Index the single bcf:

bcftools index temp_for_gnotate/gnomad_v4_faf95joint_allChroms.bcf

And create the gnotate:

${SLIVAR} make-gnotate \
  --field fafmax_faf95_max_joint:gnomadV4joint_maxFAF95 \
  --field faf95_joint:gnomadV4joint_FAF95_all \
  --field nhomalt_joint:gnomadV4joint_nHomAlt \
  --prefix gnotates/gnomadV4joint \
  temp_for_gnotate/gnomad_v4_faf95joint_allChroms.bcf
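Once built, the gnotate would typically be used with slivar expr. The sketch below assumes make-gnotate wrote gnotates/gnomadV4joint.zip (the prefix plus .zip); the input and output VCF names and the thresholds are hypothetical placeholders, not values from this thread. The command is only executed if slivar and the files are actually present.

```shell
gnotate_zip="gnotates/gnomadV4joint.zip"   # assumed output name: <prefix>.zip
input_vcf="cohort.vcf.gz"                  # hypothetical input
output_vcf="cohort.rare.vcf"               # hypothetical output

# The INFO.* names are the labels given after the colon in each --field argument.
cmd=(slivar expr
  --vcf "$input_vcf"
  --gnotate "$gnotate_zip"
  --info 'INFO.gnomadV4joint_maxFAF95 < 0.01 && INFO.gnomadV4joint_nHomAlt < 10'
  --out-vcf "$output_vcf")

# Show the command that would run.
printf '%q ' "${cmd[@]}"; echo

# Only run it when slivar and both input files are available.
if command -v slivar >/dev/null 2>&1 && [ -f "$gnotate_zip" ] && [ -f "$input_vcf" ]; then
  "${cmd[@]}"
fi
```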

Clean up your temp files and that's all there is to it. I'm happy to send our gnotate file to you, though I suspect you may want to tailor it to the fields you require. Just let me know!

Cheers,
William

@equinne5
Author

Thanks so much for this, William. Honestly, you're so good for sending all of this! It's really generous.
If you don't mind, I'm going to be cheeky and ask for your gnotate file too (at the moment the maxFAF95 and nHomAlt fields are plenty to work with). It would let me play around a bit with things before the end of the year if you have a gnotate file ready to go, but only if it's not too much trouble for you.
Thanks again for being so helpful!

Emma

@wwgordon

Sure thing, here it is (plus index):

https://storage.googleapis.com/anhinga/gnomad_v4_faf95joint_allChroms.bcf
https://storage.googleapis.com/anhinga/gnomad_v4_faf95joint_allChroms.bcf.csi

Let me know if you have any problems!
William

@equinne5
Author

Thank you so much!! Really appreciate it! Take care!

@wwgordon

wwgordon commented Dec 22, 2023 via email

@wwgordon

@equinne5 just so you're aware, there is an issue with gnomAD v4.0 AN and AF values:

https://docs.google.com/document/d/1Xm5ZIhmkh7hv2qEfCDS6J2T0IUZYiXP8pNClTlNvCGQ/edit
