Build PRG from MSA and more sparsely from VCF #130

Open
ffranr opened this issue Jun 7, 2018 · 4 comments

Comments

@ffranr
Contributor

ffranr commented Jun 7, 2018

No description provided.

@ffranr ffranr added this to To Do in Release 1.9 via automation Jun 7, 2018
@bricoletc bricoletc changed the title Build PRG from MSA Build PRG more flexibly and more inclusively Jun 14, 2019
@bricoletc
Member

bricoletc commented Jun 14, 2019

Targets here:

  1. We want to be able to build a PRG from an MSA right now

  2. Long alleles might need to be collapsed down. For example, a record with TCAGA (ref) and TTACA (alt) will 'overlap' any other variation within the same region. This leads either to combinatorial explosion (vcf_clusterer module) or to records being ignored outright (perl script). One solution is to build a graph from the VCF and parse that into a PRG (this will collapse the SNPs in the long record); another (non-exclusive) solution is to allow nesting in gramtools, so that overlapping records are no longer flattened into one.
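
To make the overlap problem in target 2 concrete, here is a minimal Python sketch that groups VCF-like records whose reference spans overlap. It is not the vcf_clusterer code; the `Record` type, the `cluster_overlapping` helper and the positions are hypothetical and only illustrate how one long record pulls nearby SNPs into the same cluster.

```python
from typing import List, NamedTuple


class Record(NamedTuple):
    """Hypothetical, simplified VCF record (one alt allele only)."""
    pos: int  # 1-based start position on the reference, as in VCF
    ref: str
    alt: str


def cluster_overlapping(records: List[Record]) -> List[List[Record]]:
    """Group records whose reference spans share at least one position."""
    clusters: List[List[Record]] = []
    cluster_end = 0  # last reference position covered by the current cluster
    for rec in sorted(records, key=lambda r: r.pos):
        rec_end = rec.pos + len(rec.ref) - 1
        if clusters and rec.pos <= cluster_end:
            # Starts inside the current cluster: it overlaps, so merge it in.
            clusters[-1].append(rec)
            cluster_end = max(cluster_end, rec_end)
        else:
            # Starts after the current cluster ends: open a new cluster.
            clusters.append([rec])
            cluster_end = rec_end
    return clusters


# The long record from the example above, plus made-up SNPs around it.
records = [
    Record(100, "TCAGA", "TTACA"),  # spans positions 100-104
    Record(102, "A", "G"),          # falls inside the long record
    Record(104, "A", "C"),          # falls inside the long record
    Record(200, "G", "T"),          # unrelated site
]
clusters = cluster_overlapping(records)
for cluster in clusters:
    print([f"{r.pos}:{r.ref}>{r.alt}" for r in cluster])
```

With these inputs the long record and the two SNPs inside it land in one cluster, which is the situation where flattening into a single site has to enumerate allele combinations, and where nesting or graph-based collapsing would avoid that.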

@bricoletc bricoletc removed this from To Do in Release 1.9 Jul 16, 2019
@bricoletc bricoletc added this to To Do in Release 1.7 via automation Jul 16, 2019
@bricoletc bricoletc changed the title Build PRG more flexibly and more inclusively Build PRG from MSA and more sparsely from VCF Jul 16, 2019
@bricoletc bricoletc added this to To Do in Release 1.9 via automation Jul 16, 2019
@bricoletc bricoletc removed this from To Do in Release 1.7 Jul 16, 2019
@bricoletc
Member

Now that gramtools supports prgs made with make_prg, we need a streamlined way to build a whole-genome graph from:

  • a ref genome
  • a set of MSAs (e.g. as a fofn, a file of filenames)
  • a BED file (or equivalent) describing the MSA coordinates

From these inputs, gramtools (or make_prg?) would run make_prg on each MSA and combine the resulting PRGs with the rest of the genome.
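
As a rough illustration of the "combine with the rest of the genome" step, here is a Python sketch under several assumptions: a single reference sequence, BED lines with a name in the fourth column, non-overlapping regions, and one PRG already built per region. The `parse_bed` and `stitch` helpers are hypothetical and are not part of gramtools or make_prg. Each region's PRG is treated as an opaque placeholder string, so any site-marker renumbering that a real PRG encoding needs is deliberately left out.

```python
from typing import Dict, List, Tuple

# (start, end, name): 0-based, half-open coordinates, as in BED
Region = Tuple[int, int, str]


def parse_bed(lines: List[str]) -> List[Region]:
    """Read tab-separated BED lines (chrom, start, end, name), sorted by start."""
    regions: List[Region] = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        _chrom, start, end, name = line.rstrip("\n").split("\t")[:4]
        regions.append((int(start), int(end), name))
    return sorted(regions)


def stitch(ref: str, regions: List[Region], region_prgs: Dict[str, str]) -> List[str]:
    """Interleave invariant reference sequence with the per-region PRGs."""
    pieces: List[str] = []
    cursor = 0  # first reference position not yet emitted
    for start, end, name in regions:
        if start < cursor:
            raise ValueError(f"region {name} overlaps the previous one")
        if start > cursor:
            pieces.append(ref[cursor:start])  # invariant sequence between regions
        pieces.append(region_prgs[name])      # PRG built from this region's MSA
        cursor = end
    if cursor < len(ref):
        pieces.append(ref[cursor:])           # trailing invariant sequence
    return pieces


ref = "AAAACCCCGGGGTTTT"
bed_lines = ["chr1\t4\t8\tgeneA", "chr1\t12\t14\tgeneB"]
# Placeholders standing in for the output of make_prg on each MSA.
region_prgs = {"geneA": "<PRG:geneA>", "geneB": "<PRG:geneB>"}

pieces = stitch(ref, parse_bed(bed_lines), region_prgs)
print(pieces)
```

Keeping the output as a list of pieces rather than one joined string leaves room for a later pass that rewrites each region's PRG before final concatenation.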

@kdm9

kdm9 commented Feb 1, 2022

Hello folks,

I'm in the situation of needing exactly what @bricoletc describes above. Is there an approach that exists today which implements this functionality? Is this a serious plan with someone working on this functionality? If not, can I be of any assistance in making this a reality?

Best,
Kevin

@bricoletc
Member

Hi @kdm9 , this is a timely question!
The feature is not currently implemented in a simple way at all; I've done it via a snakemake workflow.
I will aim to implement this in gramtools. It should not be too complicated and is essential for tool usability. I estimate 2 weeks.

However, could you give me a sense of what you're trying to do? This would help make sure we're on the same page and give me a sense of your timeline. Feel free to drop me an email at bletcher@ebi.ac.uk (also let me know if #163 works for you)

bricoletc added a commit that referenced this issue Feb 15, 2022