new workflow to deal with multiple basecalls from same flowcell #325
Map[String, String] ref_map = read_map(ref_map_file)

String outdir = sub(gcs_out_root_dir, "/$", "") + "/ONTFlowcell/~{flowcell}"
Are you sure output is meant to go into the usual ONTFlowcell directory? I'm fine with that, but just wanted to check that this is what you intended.
Yes, that's intended. The multiple basecalls themselves are output to a different directory:
long-read-pipelines/wdl/ONTBasecall.wdl
Line 15 in a7527a5
String outdir = sub(gcs_out_root_dir, "/$", "") + "/ONTBasecall/~{prefix}"
It's done this way so that flowcell-level data are all stored in ONTFlowcell.
One disparity between single-basecall flowcells and multi-basecall flowcells is that the latter have many metrics stored in the basecalls directory. How those metrics should be aggregated to the flowcell level is left to the planned overhaul.
Thoughts?
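For readers less familiar with the WDL `sub(..., "/$", "")` idiom in the two `outdir` lines above, here is a rough Python equivalent of the directory layout being described. It is only a sketch: the function names and the `gs://my-bucket/out` root are hypothetical, and only the `ONTFlowcell` and `ONTBasecall` path segments come from the workflows themselves.

```python
import re


def flowcell_outdir(gcs_out_root_dir: str, flowcell: str) -> str:
    """Flowcell-level output directory, mirroring the ONTFlowcell outdir line."""
    # Strip a single trailing slash so the joined path never contains "//".
    root = re.sub(r"/$", "", gcs_out_root_dir)
    return f"{root}/ONTFlowcell/{flowcell}"


def basecall_outdir(gcs_out_root_dir: str, prefix: str) -> str:
    """Per-basecall output directory, mirroring the ONTBasecall outdir line."""
    root = re.sub(r"/$", "", gcs_out_root_dir)
    return f"{root}/ONTBasecall/{prefix}"


# Hypothetical bucket and identifiers, used for illustration only.
with_slash = flowcell_outdir("gs://my-bucket/out/", "FC1")
without_slash = flowcell_outdir("gs://my-bucket/out", "FC1")
basecall_dir = basecall_outdir("gs://my-bucket/out/", "FC1_run2")
```

Both spellings of the root resolve to the same flowcell directory, while each basecall lands in its own directory under `ONTBasecall`.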
fc48022 to a26c00b
54ae623 to 9b3c700
9b3c700 to 11157fc
5acec5d to ae2a722
ae2a722 to 1b3ee20
version 1.0

import "../../../tasks/Utility/Utils.wdl" as Utils
import "../../../tasks/Utility/GeneralUtils.wdl" as GU
What are Utils vs GeneralUtils?
Map[String, Float] raw_reads_stats = nanoplot_map

Map[String, Float] aligned_reads_stats = NanoPlotFromBam.stats_map
This is fine for now, but at some point I really would like us to rethink a few things. This output is not particularly conducive to easy analysis in Jupyter notebooks. The answer could be to have some official boilerplate code that presents the Terra table as a pandas dataframe or an R tibble, with the maps here exploded out as columns.
Yep. The little library that I'm going to present tomorrow can support this easily.
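To make the "maps exploded out as columns" idea concrete, here is a minimal sketch of the flattening step. It is not the library mentioned above. It assumes each table row arrives as a Python dict and that the `Map[String, Float]` cells are either dicts or JSON-encoded strings; the helper name `flatten_row`, the sample flowcell IDs, and the stat values are all made up for illustration.

```python
import json


def flatten_row(row: dict, map_columns: set) -> dict:
    """Return a copy of `row` with each map column replaced by one entry per key."""
    flat = {}
    for column, value in row.items():
        if column in map_columns:
            # Assumption: a map cell is a dict or a JSON string encoding one.
            mapping = json.loads(value) if isinstance(value, str) else value
            for stat_name, stat_value in mapping.items():
                flat[f"{column}.{stat_name}"] = stat_value
        else:
            flat[column] = value
    return flat


# Hypothetical rows; the column names follow the workflow outputs above.
rows = [
    {
        "flowcell": "FC1",
        "raw_reads_stats": {"n50": 1000.0, "mean_qual": 10.5},
        "aligned_reads_stats": '{"n50": 900.0}',
    },
    {
        "flowcell": "FC2",
        "raw_reads_stats": {"n50": 2000.0, "mean_qual": 11.0},
        "aligned_reads_stats": {"n50": 1800.0},
    },
]
map_columns = {"raw_reads_stats", "aligned_reads_stats"}
flat_rows = [flatten_row(row, map_columns) for row in rows]
```

The resulting list of flat dicts can be handed to `pandas.DataFrame(flat_rows)` to get one column per stat, which is the shape that is convenient in a notebook.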
…directories
* one workflow that essentially copies current ONTFlowcell, but aims at a basecall directory
* one workflow that plays the role of ONTFlowcell, but actually aggregates results from the WF above
1b3ee20 to 0be3cca
No description provided.