Week 1 #9

Closed
thejmazz opened this Issue May 22, 2016 · 3 comments

thejmazz commented May 22, 2016

Goal
Become comfortable working with variant calling (VC) pipelines across various workflow systems.

Deliverables

  • #2 run simple bionode-ncbi + tools bash script (no filtering)
  • #3 collate notes from 4 papers into summary, questions
  • #4 VC with bionode-ncbi+tools (bash) with filtering
  • #5 VC with makefile (less important - similar to bash)
  • #6 VC with snakemake
  • #7 VC with nextflow
  • #8 Blog post with compare/contrast on workflow systems, introduction to VC, identify opportunities for JS (application of streams)

Metrics and Targets

  • understand variety of tools used in VC pipelines (when/where/why)
  • understand the underlying algorithms, and which data each is best suited to
  • understand similarities/differences b/w snakemake/nextflow to go forward and implement in JS

@thejmazz thejmazz added the Epic label May 22, 2016

@thejmazz thejmazz added this to the Week 1 milestone May 22, 2016

maxogden commented May 25, 2016

nice looks good to me, I like the blog post idea for sharing what you learn as you go

maxogden commented May 26, 2016

also we should consult @andrewosh on workflow/pipeline systems in general at some point!

thejmazz commented May 30, 2016

@maxogden @bmpvieira @mafintosh @yannickwurm

My end of week 1 recap:

  • Unfortunately, I did not complete all user stories on time. Going forward, I will budget more time for setting up workflows (i.e. learning new tools), as that was the bottleneck. khmer in particular took a while: it's not just running the code, but also reading up on the tool and trying to understand the algorithm it uses, its pros and cons, etc. Having done that, I think I can get makefile + snakemake done tomorrow for sure, maybe nextflow too, and then take a day to write a complete post. I had hoped to have a blog post comparing a simple variant calling workflow out tonight.
  • I have made two issues for week 2: "finish week 1" at 2 days, and "VC workflow with Bionode/JS" at 3 days, the latter being the first basic prototype. I think we should chat after my blog post is live and before I start the first JS streaming pipeline prototype, and also make a checklist for that issue so it is not as broad.
  • I believe I have started to develop an understanding of the ecosystem of algorithms and tools surrounding SNP/variant calling. Not that I know all of them, or even a bunch, but I know the common ones (bwa, bowtie2, soap3, k-mer based methods) and the different workflows (de novo assembly, read mapping, rare variants vs. GWAS). I still need to improve my understanding of which algorithms are best in which cases, but I think this can really only come with time. (And also with a workflow engine that facilitates easy swapping of tools and comparing of results, vs. a bash script ;) )
  • I am starting to have ideas on where JS and streams can be used. It will be fun to get a prototype out and collectively decide where specifically to take this.