Genomics Pipeline #1724

Closed
Rokshan2016 opened this Issue Sep 14, 2017 · 2 comments

Rokshan2016 commented Sep 14, 2017

Hi,
I am currently working on a project to build a genomics pipeline. The requirements are as follows.

I will receive the input as a .fastq file and need to run these steps:

  1. Alignment to the reference genome
  2. Conversion of .fastq to .adam
  3. Sorting
  4. Duplicate removal
  5. Base recalibration
  6. Analysis-ready reads
  7. Variant calling (SNP, INDEL)
  8. Variant filtering
  9. Variant annotation

My question: is there a way to run an automated script so that, when I supply the .fastq data, the whole process runs automatically?

Thanks!

Member

heuermh commented Sep 18, 2017

Sorry for not replying sooner.

ADAM is more a framework or library on which to build workflows rather than something that runs an entire pipeline. For a full variant calling workflow as you describe above, I would suggest using BWA via Cannoli for alignment, ADAM for preprocessing, freebayes via Cannoli or avocado for variant calling, ADAM for variant filtering, and SnpEff via Cannoli for variant annotation.
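
As a rough illustration, the suggestion above can be written down as a step-to-tool table. This is only a sketch: the step names are paraphrased from the question, and grouping conversion, sorting, duplicate removal, and base recalibration under "ADAM preprocessing" is an assumed reading of the reply, not something it states step by step. The helper `steps_by_tool` is a hypothetical name introduced here.

```python
from collections import OrderedDict

# Requested steps paired with the tools suggested in the reply.
# ASSUMPTION: steps 2-6 are all treated as "ADAM preprocessing".
SUGGESTED_TOOLS = [
    ("alignment", "BWA via Cannoli"),
    ("fastq to adam conversion", "ADAM"),
    ("sorting", "ADAM"),
    ("duplicate removal", "ADAM"),
    ("base recalibration", "ADAM"),
    ("analysis-ready reads", "ADAM"),
    ("variant calling", "freebayes via Cannoli, or avocado"),
    ("variant filtering", "ADAM"),
    ("variant annotation", "SnpEff via Cannoli"),
]


def steps_by_tool(pairs):
    """Group step names by tool, keeping the order in which tools first appear."""
    grouped = OrderedDict()
    for step, tool in pairs:
        grouped.setdefault(tool, []).append(step)
    return grouped


if __name__ == "__main__":
    for tool, steps in steps_by_tool(SUGGESTED_TOOLS).items():
        print(f"{tool}: {', '.join(steps)}")
```

Grouping by tool shows that most of the preprocessing would sit in ADAM itself, with Cannoli wrapping the external tools at either end.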

https://github.com/bigdatagenomics/cannoli
https://github.com/bigdatagenomics/avocado

For running such a workflow, you'd want something like Toil or Nextflow.

https://github.com/BD2KGenomics/toil
https://www.nextflow.io

We're building out such Toil-based workflows in this repository

https://github.com/bigdatagenomics/workflows
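
For a sense of what "automatic" could look like before adopting Toil or Nextflow, here is a minimal sketch of a driver that chains stages so each one reads the previous stage's output. It uses only the Python standard library and is not how the Toil-based workflows in the repository above are implemented. Everything named here is an assumption for illustration: the `Stage` class, the stage names, the file suffixes, and the `echo` placeholder commands, which would need to be replaced with the real Cannoli, ADAM, or avocado invocations for your installation.

```python
import shlex
import subprocess
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Stage:
    name: str     # short label, also used in the output file name
    command: str  # template containing {input} and {output}
    suffix: str   # extension of the file this stage writes (assumed)


# HYPOTHETICAL placeholders: each "echo ..." stands in for a real tool call.
STAGES = [
    Stage("align", "echo align {input} {output}", ".sam"),
    Stage("preprocess", "echo preprocess {input} {output}", ".adam"),
    Stage("call", "echo call {input} {output}", ".vcf"),
    Stage("filter", "echo filter {input} {output}", ".vcf"),
    Stage("annotate", "echo annotate {input} {output}", ".vcf"),
]


def plan(stages, fastq, workdir):
    """Return [(stage name, argv)] where each stage consumes the previous output."""
    current = Path(fastq)
    sample = current.name.split(".")[0]
    steps = []
    for stage in stages:
        output = Path(workdir) / f"{sample}.{stage.name}{stage.suffix}"
        # Split first, then substitute, so paths with spaces stay one argument.
        argv = [part.format(input=current, output=output)
                for part in shlex.split(stage.command)]
        steps.append((stage.name, argv))
        current = output
    return steps


def run(stages, fastq, workdir, dry_run=True):
    """Print every command; execute them in order unless dry_run is set."""
    steps = plan(stages, fastq, workdir)
    if not dry_run:
        Path(workdir).mkdir(parents=True, exist_ok=True)
    for name, argv in steps:
        print(f"[{name}] {' '.join(argv)}")
        if not dry_run:
            subprocess.run(argv, check=True)  # raises on the first failing stage
    return steps


if __name__ == "__main__":
    run(STAGES, "sample.fastq", "work", dry_run=True)
```

A script like this has no retries, no resumption after failure, and no cluster scheduling, which is the kind of functionality a workflow engine such as Toil or Nextflow is meant to provide.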

Rokshan2016 commented Sep 18, 2017

ok, will check that

Thanks

@heuermh heuermh added this to the 0.23.0 milestone Dec 7, 2017

@heuermh heuermh added this to Completed in Release 0.23.0 Jan 4, 2018
