Updated revert WDL to split RGs up front for scalability #99

Open, 2 commits to merge

Collaborator

vdauwera commented Mar 6, 2017

Splitting by RG up front makes RevertSam far more scalable (in part by avoiding a huge sort step) and allows us to scatter the reversion.
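
For illustration, the split-then-scatter shape described above might look roughly like the following draft-2 WDL sketch. This is not the PR's actual code: the workflow and task names, the use of `samtools split` as the splitter, the `basename` call, and the `/path/to/picard.jar` placeholder are all assumptions, and runtime blocks are left out for brevity.

```wdl
# Sketch only: names, tool choices, and paths are assumptions, not the PR's code.
workflow RevertBamByReadGroup {
  File input_bam

  # Split the aggregated BAM into one BAM per read group (RG).
  call SplitByReadGroup {
    input:
      input_bam = input_bam
  }

  # Revert each per-RG BAM independently, so the reversion runs in parallel.
  scatter (rg_bam in SplitByReadGroup.rg_bams) {
    call RevertSam {
      input:
        input_bam = rg_bam,
        output_name = basename(rg_bam, ".bam") + ".unmapped.bam"
    }
  }

  output {
    Array[File] unmapped_bams = RevertSam.unmapped_bam
  }
}

task SplitByReadGroup {
  File input_bam

  command {
    # Assumed splitter: samtools split writes one file per @RG;
    # %! in the format string expands to the read group ID.
    samtools split -f '%!.bam' ${input_bam}
  }
  output {
    Array[File] rg_bams = glob("*.bam")
  }
}

task RevertSam {
  File input_bam
  String output_name

  command {
    # Placeholder jar path; only the required INPUT/OUTPUT arguments are shown.
    java -jar /path/to/picard.jar RevertSam \
      INPUT=${input_bam} \
      OUTPUT=${output_name}
  }
  output {
    File unmapped_bam = "${output_name}"
  }
}
```

Because each `RevertSam` call only sees the reads of a single read group, any sorting it does is over a much smaller input than the whole aggregated BAM.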

Collaborator

vdauwera commented Mar 6, 2017 edited

SplitByRG works great, but a weird error occurs when running RevertSam on the per-RG output BAMs:

Exception in thread "main" htsjdk.samtools.SAMException: Cannot determine candidate qualities: no qualities found.
	at htsjdk.samtools.util.QualityEncodingDetector.generateCandidateQualities(QualityEncodingDetector.java:228)
	at htsjdk.samtools.util.QualityEncodingDetector.generateBestGuess(QualityEncodingDetector.java:379)
	at htsjdk.samtools.util.QualityEncodingDetector.detect(QualityEncodingDetector.java:332)
	at picard.sam.RevertSam.createReadGroupFormatMap(RevertSam.java:528)
	at picard.sam.RevertSam.doWork(RevertSam.java:265)
	at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:205)
	at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:94)
	at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:104)

Validating the RG files now at b7b927df-2d05-4d4f-95c3-957208925734.

disks: "local-disk " + disk_size + " HDD"
memory: mem_size
}
output {
- Array[File] unmapped_bams = glob("*.bam")
+ File unmapped_bam = "${output_name}"

cjllanwarne Apr 3, 2017 edited

Contributor

This could (probably should) just be File unmapped_bam = output_name
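
A minimal sketch of what that suggestion amounts to, assuming `output_name` is declared as a `String` input of the task (the declaration is not visible in the diff). The task name and the `touch` command are hypothetical stand-ins for the real RevertSam task and command:

```wdl
# Hypothetical task, used only to show the output declaration.
task OutputNameExample {
  String output_name   # assumed declaration; not shown in the diff

  command {
    # Stand-in for the real command: create a file with the requested name.
    touch ${output_name}
  }
  output {
    # The String value is coerced to File directly,
    # so wrapping it in "${...}" interpolation is unnecessary.
    File unmapped_bam = output_name
  }
}
```

Both forms should resolve to the same file; the bare variable just avoids a redundant string interpolation.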
