PRS Task List #313

Closed
5 of 6 tasks
cristinaetrv opened this issue Oct 23, 2023 · 3 comments

cristinaetrv commented Oct 23, 2023

PRS module

  • Documentation: Find/select 1 PRS module to implement/port
    Sprint 7:
  • Stand up worker for PRS
  • Preprocess genotype data (VCF or from annotation) into a dosage matrix for PRS
  • Ingest association scores for PRS using the GIANT UMich format; we plan to enforce this format for all incoming GWAS summary stats
  • Implement calculation of PRS (C+T)
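
For reference, a minimal sketch of what a C+T (clumping and thresholding) score calculation could look like. This is not the Bystro implementation: the function names, the input layout (a list of variant dicts plus a per-variant dosage list), and the default p-value, r² and window settings are all assumptions made for illustration. LD is approximated here by the squared Pearson correlation of dosages in the same samples.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length dosage vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic variant: treat as uncorrelated
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def clump_and_threshold(variants, dosages, p_max=5e-8, r2_max=0.1,
                        window_bp=250_000):
    """Keep variants passing the p-value threshold, then greedily clump.

    variants: list of dicts with keys id, chrom, pos, beta, p (hypothetical layout)
    dosages:  dict mapping variant id -> per-sample effect-allele dosages
    The defaults are illustrative assumptions, not values from the issue.
    """
    # Thresholding: drop variants above p_max or missing from the dosage matrix.
    candidates = sorted(
        (v for v in variants if v["p"] <= p_max and v["id"] in dosages),
        key=lambda v: v["p"],
    )
    kept = []
    for v in candidates:
        # Clumping: skip a variant if a more significant kept variant is
        # nearby on the same chromosome and in LD with it.
        in_ld = any(
            k["chrom"] == v["chrom"]
            and abs(k["pos"] - v["pos"]) <= window_bp
            and pearson_r2(dosages[k["id"]], dosages[v["id"]]) > r2_max
            for k in kept
        )
        if not in_ld:
            kept.append(v)
    return kept


def polygenic_scores(kept, dosages, n_samples):
    """Per-sample score: sum over kept variants of beta * dosage."""
    scores = [0.0] * n_samples
    for v in kept:
        for i, dose in enumerate(dosages[v["id"]]):
            scores[i] += v["beta"] * dose
    return scores


# Toy data (made up): rs2 is in perfect LD with rs1, rs4 fails the threshold.
variants = [
    {"id": "rs1", "chrom": "1", "pos": 1_000, "beta": 0.5, "p": 1e-9},
    {"id": "rs2", "chrom": "1", "pos": 1_500, "beta": 0.4, "p": 1e-8},
    {"id": "rs3", "chrom": "1", "pos": 900_000, "beta": -0.2, "p": 1e-3},
    {"id": "rs4", "chrom": "2", "pos": 500, "beta": 0.3, "p": 0.2},
]
dosages = {
    "rs1": [0, 1, 2, 1],
    "rs2": [0, 1, 2, 1],
    "rs3": [2, 0, 1, 1],
    "rs4": [1, 1, 0, 2],
}
kept = clump_and_threshold(variants, dosages, p_max=0.05)
scores = polygenic_scores(kept, dosages, n_samples=4)
print([v["id"] for v in kept], scores)
```

In practice the LD estimates would more likely come from a reference panel than from the target samples themselves, and the loops would be vectorized over the dosage matrix; the plain-Python version is only meant to make the two steps explicit.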

Future goals:

  • Ingest covariate information for PRS/GWAS
    • People may want to include PCs, either from precalculated ancestry or from the dataset alone
@cristinaetrv cristinaetrv added .task list A checklist of smaller tasks prs Polygenic Risk Score methods labels Oct 23, 2023
@cristinaetrv cristinaetrv self-assigned this Oct 23, 2023
@cristinaetrv cristinaetrv added this to the Sprint 3 milestone Oct 23, 2023
@cristinaetrv cristinaetrv modified the milestones: Sprint 3, Sprint 5 Nov 17, 2023
@cristinaetrv cristinaetrv changed the title from PRS Sprint 3/4 Task List to PRS Task List Nov 17, 2023
@cristinaetrv cristinaetrv modified the milestones: Sprint 5, Sprint 6 Jan 12, 2024
@cristinaetrv cristinaetrv modified the milestones: Sprint 6, Sprint 7 Feb 1, 2024
@cristinaetrv
Collaborator Author

Update 2/9/24

  • Started testing with GIANT-format files and converted all inputs to use that header
  • Since GIANT files are in hg19, preprocessed 1kGP in hg19, which will also be useful for training/re-training Ancestry on hg19. Also tried LD pruning with multiple parameter sets and re-processed 1kGP in hg38 with those parameters for training Ancestry; added a plink sh script on the workstation in case this needs to be adjusted or rerun
  • Dennis helped me get bystro-vcf running, and there are a couple of important updates needed in the README (I can PR this next week):
    go get github.com/akotlar/bystro-vcf && go install $_;
    needs to change to
    go get github.com/bystrogenomics/bystro-vcf && go install $_;
    This script also doesn't build, so I suggest adding 'go build' as well
  • Successfully generated a feather dosage matrix file from a 1kGP hg19 subset using bystro-vcf
  • Up next: converting the pandas workflow to use arrow throughout, or possibly replacing pandas completely (including the incoming ancestry file for genetic maps and the GWAS summary stats files)
  • During the meeting, we discussed adding a liftover step for the GWAS summary stats, since the summary stats will often be older (hg19) while user datasets are newer (hg38), or vice versa; will add this to the Sprint 8 goals
  • Still need to write tests for the PRS code once everything is in the correct format; will try to do that next week and get the current workflow PR'd

wingolab commented Feb 16, 2024

@cristinaetrv - For bystro-vcf, I got it to install with go install github.com/bystrogenomics/bystro-vcf@latest, which is the more idiomatic way of installing Go tools. Can you give it a try? If it works for you, we can update the bystro-vcf repo README.

@cristinaetrv
Collaborator Author

See the continuation in the Sprint 9 Task List; created an issue for the install instructions
