Add gatkToWdlWrapper, testWDLTasks.py, all auto-generated WDL tasks for GATK 3.6, and a README to explain use.
Showing 5,166 additions and 1 deletion.
- +8 −1 scripts/README.md
- +60 −0 scripts/wrappers/gatk/README.md
- +68 −0 scripts/wrappers/gatk/WDLTasks_3.6/ASEReadCounter_3.6.wdl
- +56 −0 scripts/wrappers/gatk/WDLTasks_3.6/AnalyzeCovariates_3.6.wdl
- +76 −0 scripts/wrappers/gatk/WDLTasks_3.6/ApplyRecalibration_3.6.wdl
- +109 −0 scripts/wrappers/gatk/WDLTasks_3.6/BaseRecalibrator_3.6.wdl
- +74 −0 scripts/wrappers/gatk/WDLTasks_3.6/CalculateGenotypePosteriors_3.6.wdl
- +74 −0 scripts/wrappers/gatk/WDLTasks_3.6/CallableLoci_3.6.wdl
- +68 −0 scripts/wrappers/gatk/WDLTasks_3.6/CatVariants_3.6.wdl
- +55 −0 scripts/wrappers/gatk/WDLTasks_3.6/CheckPileup_3.6.wdl
- +65 −0 scripts/wrappers/gatk/WDLTasks_3.6/ClipReads_3.6.wdl
- +62 −0 scripts/wrappers/gatk/WDLTasks_3.6/CombineGVCFs_3.6.wdl
- +85 −0 scripts/wrappers/gatk/WDLTasks_3.6/CombineVariants_3.6.wdl
- +53 −0 scripts/wrappers/gatk/WDLTasks_3.6/CompareCallableLoci_3.6.wdl
- +95 −0 scripts/wrappers/gatk/WDLTasks_3.6/ContEst_3.6.wdl
- +44 −0 scripts/wrappers/gatk/WDLTasks_3.6/CountBases_3.6.wdl
- +50 −0 scripts/wrappers/gatk/WDLTasks_3.6/CountIntervals_3.6.wdl
- +51 −0 scripts/wrappers/gatk/WDLTasks_3.6/CountLoci_3.6.wdl
- +47 −0 scripts/wrappers/gatk/WDLTasks_3.6/CountMales_3.6.wdl
- +50 −0 scripts/wrappers/gatk/WDLTasks_3.6/CountRODsByRef_3.6.wdl
- +57 −0 scripts/wrappers/gatk/WDLTasks_3.6/CountRODs_3.6.wdl
- +47 −0 scripts/wrappers/gatk/WDLTasks_3.6/CountReadEvents_3.6.wdl
- +46 −0 scripts/wrappers/gatk/WDLTasks_3.6/CountReads_3.6.wdl
- +47 −0 scripts/wrappers/gatk/WDLTasks_3.6/CountTerminusEvent_3.6.wdl
- +112 −0 scripts/wrappers/gatk/WDLTasks_3.6/DepthOfCoverage_3.6.wdl
- +80 −0 scripts/wrappers/gatk/WDLTasks_3.6/DiagnoseTargets_3.6.wdl
- +71 −0 scripts/wrappers/gatk/WDLTasks_3.6/DiffObjects_3.6.wdl
- +53 −0 scripts/wrappers/gatk/WDLTasks_3.6/ErrorRatePerCycle_3.6.wdl
- +44 −0 scripts/wrappers/gatk/WDLTasks_3.6/FastaStats_3.6.wdl
- +83 −0 scripts/wrappers/gatk/WDLTasks_3.6/FindCoveredIntervals_3.6.wdl
- +49 −0 scripts/wrappers/gatk/WDLTasks_3.6/FlagStat_3.6.wdl
- +52 −0 scripts/wrappers/gatk/WDLTasks_3.6/GATKPaperGenotyper_3.6.wdl
- +44 −0 scripts/wrappers/gatk/WDLTasks_3.6/GCContentByInterval_3.6.wdl
- +65 −0 scripts/wrappers/gatk/WDLTasks_3.6/GenotypeConcordance_3.6.wdl
- +88 −0 scripts/wrappers/gatk/WDLTasks_3.6/GenotypeGVCFs_3.6.wdl
- +223 −0 scripts/wrappers/gatk/WDLTasks_3.6/HaplotypeCaller_3.6.wdl
- +53 −0 scripts/wrappers/gatk/WDLTasks_3.6/HaplotypeResolver_3.6.wdl
- +86 −0 scripts/wrappers/gatk/WDLTasks_3.6/IndelRealigner_3.6.wdl
- +56 −0 scripts/wrappers/gatk/WDLTasks_3.6/LeftAlignAndTrimVariants_3.6.wdl
- +47 −0 scripts/wrappers/gatk/WDLTasks_3.6/LeftAlignIndels_3.6.wdl
- +244 −0 scripts/wrappers/gatk/WDLTasks_3.6/MuTect2_3.6.wdl
- +59 −0 scripts/wrappers/gatk/WDLTasks_3.6/PhaseByTransmission_3.6.wdl
- +57 −0 scripts/wrappers/gatk/WDLTasks_3.6/Pileup_3.6.wdl
- +44 −0 scripts/wrappers/gatk/WDLTasks_3.6/PrintRODs_3.6.wdl
- +70 −0 scripts/wrappers/gatk/WDLTasks_3.6/PrintReads_3.6.wdl
- +70 −0 scripts/wrappers/gatk/WDLTasks_3.6/QualifyMissingIntervals_3.6.wdl
- +62 −0 scripts/wrappers/gatk/WDLTasks_3.6/RandomlySplitVariants_3.6.wdl
- +77 −0 scripts/wrappers/gatk/WDLTasks_3.6/ReadBackedPhasing_3.6.wdl
- +53 −0 scripts/wrappers/gatk/WDLTasks_3.6/ReadClippingStats_3.6.wdl
- +47 −0 scripts/wrappers/gatk/WDLTasks_3.6/ReadGroupProperties_3.6.wdl
- +47 −0 scripts/wrappers/gatk/WDLTasks_3.6/ReadLengthDistribution_3.6.wdl
- +64 −0 scripts/wrappers/gatk/WDLTasks_3.6/RealignerTargetCreator_3.6.wdl
- +49 −0 scripts/wrappers/gatk/WDLTasks_3.6/RegenotypeVariants_3.6.wdl
- +61 −0 scripts/wrappers/gatk/WDLTasks_3.6/SelectHeaders_3.6.wdl
- +157 −0 scripts/wrappers/gatk/WDLTasks_3.6/SelectVariants_3.6.wdl
- +62 −0 scripts/wrappers/gatk/WDLTasks_3.6/SimulateReadsForVariants_3.6.wdl
- +62 −0 scripts/wrappers/gatk/WDLTasks_3.6/SplitNCigarReads_3.6.wdl
- +47 −0 scripts/wrappers/gatk/WDLTasks_3.6/SplitSamFile_3.6.wdl
- +147 −0 scripts/wrappers/gatk/WDLTasks_3.6/UnifiedGenotyper_3.6.wdl
- +59 −0 scripts/wrappers/gatk/WDLTasks_3.6/ValidateVariants_3.6.wdl
- +80 −0 scripts/wrappers/gatk/WDLTasks_3.6/ValidationSiteSelector_3.6.wdl
- +91 −0 scripts/wrappers/gatk/WDLTasks_3.6/VariantAnnotator_3.6.wdl
- +112 −0 scripts/wrappers/gatk/WDLTasks_3.6/VariantEval_3.6.wdl
- +92 −0 scripts/wrappers/gatk/WDLTasks_3.6/VariantFiltration_3.6.wdl
- +130 −0 scripts/wrappers/gatk/WDLTasks_3.6/VariantRecalibrator_3.6.wdl
- +47 −0 scripts/wrappers/gatk/WDLTasks_3.6/VariantsToAllelicPrimitives_3.6.wdl
- +71 −0 scripts/wrappers/gatk/WDLTasks_3.6/VariantsToBinaryPed_3.6.wdl
- +68 −0 scripts/wrappers/gatk/WDLTasks_3.6/VariantsToTable_3.6.wdl
- +53 −0 scripts/wrappers/gatk/WDLTasks_3.6/VariantsToVCF_3.6.wdl
- +210 −0 scripts/wrappers/gatk/gatkToWdlWrapper.py
- +21 −0 scripts/wrappers/gatk/testWDLTasks.py
| @@ -0,0 +1,60 @@ | ||
| +###Introduction | ||
| +In this directory, you will find 3 different types of files. | ||
| +- WDL task files (named using the format `ToolName_GATKVersion.wdl`) located under a folder named for their GATK version | ||
| +- A python script for generating these WDL task files (`gatkToWdlWrapper.py`) | ||
| +- A python script for validating the WDL task files (`testWDLTasks.py`) | ||
| + | ||
| +###Using WDL task files | ||
| +These task files are workflow-less: each contains only the task for the GATK tool it is named for. You can either copy a task into your workflow WDL, as shown in our [WDL tutorials](https://software.broadinstitute.org/wdl/userguide/topic?name=wdl-tutorials), or import it into your workflow file. To import a task, download the task script you would like to use and save it in the same directory as your workflow WDL. Then, at the top of your workflow file, import it using the following syntax: | ||
| +``` | ||
| +import "ToolName_GATKVersion.wdl" as toolName | ||
| +``` | ||
| + | ||
| +To call an imported tool in your workflow, you can then use the following syntax: | ||
| +``` | ||
| +call toolName.taskName{ inputs: ... } | ||
| +``` | ||
| + | ||
| +For example, let's say you want to use VariantEval. You can use the task like so: | ||
| +``` | ||
| +import "VariantEval_3.6.wdl" as variantEval | ||
| +workflow wf { | ||
| + call variantEval.VariantEval{ inputs: ... } | ||
| +} | ||
| +``` | ||
| + | ||
| +These task files were generated to expose every parameter available to the tool itself. A `?` after a parameter's type marks it as optional, and you do not need to supply optional parameters you do not wish to use. If you leave an optional parameter unset, the task passes the GATK tool's own recommended default value. If the GATK tool has no default (for example, where you may or may not pass in a file), the parameter is omitted from the final command entirely. | ||
| + | ||
| +If a parameter you would like to use is not exposed in the task's command, you can pass it through the variable called `userString`. For example, many of the [CommandLineGATK options](https://software.broadinstitute.org/gatk/documentation/tooldocs/org_broadinstitute_gatk_engine_CommandLineGATK.php) are available to all GATK tools but do not make sense to use in a majority of cases. To use one anyway, pass it in as a simple string, exactly as you would type it into the command when running a GATK tool from the terminal. | ||
| + | ||
| +These options are all designed so that you never need to edit the task file itself; simply import it and use it immediately in your workflow. If you find any errors in using these WDL tasks, please report them to us on the [WDL forum](http://gatkforums.broadinstitute.org/wdl/categories/ask-the-wdl-team). You are also welcome to ask us any and all questions related to running GATK with WDL on that forum. | ||
| + | ||
| +###Generating WDL task files | ||
| +These instructions are intended primarily for internal use. In order to run this python script, you will need: | ||
| +- `gatkToWdlWrapper.py` | ||
| +- Python installed on your computer | ||
| +- a folder of JSON files generated from GATK tools, including the JSON for CommandLineGATK | ||
| +- a file titled `engine_args_per_tool.json` specific to the version of GATK you are generating the JSON files for (You can find one in the WDLTasks_3.6 subfolder) | ||
| + | ||
| +Once you have all of the above requirements on your local machine, go into the directory containing all your JSON files and create a new folder titled `WDLTasks`. To run the script, open up your terminal and execute the command with the following syntax: | ||
| + | ||
| +``` | ||
| +python gatkToWdlWrapper.py /absolute/path/to/directoryofJsons GATKVersion | ||
| +``` | ||
| + | ||
| +The resulting WDL tasks will be written to the `WDLTasks` folder you just created. Before uploading them, you must check that these tasks are valid using the test script described in the next section. | ||
| + | ||
| +###Testing WDL task files | ||
| +This test script simply validates that the WDL tasks follow all WDL rules. It does not guarantee that the commands themselves are valid, but it does offer a simple sanity check. To run this python script, you will need: | ||
| +- `testWDLTasks.py` | ||
| +- Python installed on your computer | ||
| +- wdltool.jar | ||
| +- the resulting WDL tasks from the previous section | ||
| + | ||
| +Again, once you have all the above requirements on your local machine, run the script using the following syntax: | ||
| + | ||
| +``` | ||
| +python testWDLTasks.py /absolute/path/to/WDLTasks /absolute/path/to/wdltool.jar | ||
| +``` | ||
| + | ||
| +The script prints the name of each WDL task as it checks it. If it finds an error in a task, it prints the error message below the name of the file it is associated with. If it finds no error, it simply prints a blank line. All errors must be fixed prior to uploading the WDL tasks to this repository. |
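The validation loop described in the README can be sketched in Python roughly as follows. This is a minimal illustration, not the contents of `testWDLTasks.py` (whose source is not shown here): it assumes the check is performed with wdltool's `validate` subcommand, and the function names `validate_tasks` and `format_report` as well as the injectable `run` parameter are hypothetical, introduced only so the loop can be exercised without Java installed.

```python
import subprocess
from pathlib import Path


def validate_tasks(task_dir, wdltool_jar, run=subprocess.run):
    """Validate every .wdl file in task_dir.

    Returns a dict mapping each file name to the validator's output,
    which is an empty string when no error was reported.
    Assumption: `java -jar wdltool.jar validate <file>` is the check used.
    """
    results = {}
    for wdl in sorted(Path(task_dir).glob("*.wdl")):
        proc = run(
            ["java", "-jar", str(wdltool_jar), "validate", str(wdl)],
            capture_output=True,
            text=True,
        )
        results[wdl.name] = (proc.stdout + proc.stderr).strip()
    return results


def format_report(results):
    """Render results as the README describes: the task file name,
    then its error message (or a blank line if there was none)."""
    lines = []
    for name, errors in results.items():
        lines.append(name)
        lines.append(errors)
    return "\n".join(lines)
```

Calling `print(format_report(validate_tasks("/absolute/path/to/WDLTasks", "/absolute/path/to/wdltool.jar")))` would produce output in the shape described above, assuming Java and wdltool are available locally.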
| @@ -0,0 +1,68 @@ | ||
| +# -------------------------------------------------------------------------------------------- | ||
| +# This ASEReadCounter WDL task was generated on 09/09/16 for use with GATK version 3.6 | ||
| +# For more information on using this wrapper, please see the WDL repository at | ||
| +# https://github.com/broadinstitute/wdl/tree/develop/scripts/wrappers/gatk/README.md | ||
| +# Task Summary: Calculate read counts per allele for allele-specific expression analysis | ||
| +# -------------------------------------------------------------------------------------------- | ||
| + | ||
| +task ASEReadCounter { | ||
| + File gatk | ||
| + File ref | ||
| + File refIndex | ||
| + File refDict | ||
| + String ? userString #If a parameter you'd like to use is missing from this task, use this term to add your own string | ||
| + Array[String] input_file | ||
| + Array[String] ? intervals | ||
| + String unsafe | ||
| + String ? countOverlapReadsType | ||
| + String ? minBaseQuality | ||
| + Int ? minDepthOfNonFilteredBase | ||
| + Int ? minMappingQuality | ||
| + String ? out | ||
| + String ? outputFormat | ||
| + String sitesVCFFile | ||
| + | ||
| + command { | ||
| + java -jar ${gatk} \ | ||
| + -T ASEReadCounter \ | ||
| + -R ${ref} \ | ||
| + --input_file ${input_file} \ | ||
| + ${default="" "--intervals " + intervals} \ | ||
| + --unsafe ${unsafe} \ | ||
| + -overlap ${default="COUNT_FRAGMENTS_REQUIRE_SAME_BASE" countOverlapReadsType} \ | ||
| + -mbq ${default="0" minBaseQuality} \ | ||
| + -minDepth ${default="-1" minDepthOfNonFilteredBase} \ | ||
| + -mmq ${default="0" minMappingQuality} \ | ||
| + -o ${default="stdout" out} \ | ||
| + -outputFormat ${default="RTABLE" outputFormat} \ | ||
| + -sites ${sitesVCFFile} \ | ||
| + ${default="\n" userString} | ||
| + } | ||
| + | ||
| + output { | ||
| + #To track additional outputs from your task, please manually add them below | ||
| + String taskOut = "${out}" | ||
| + } | ||
| + | ||
| + runtime { | ||
| + docker: "broadinstitute/genomes-in-the-cloud:2.2.2-1466113830" | ||
| + } | ||
| + | ||
| + parameter_meta { | ||
| + gatk: "Executable jar for the GenomeAnalysisTK" | ||
| + ref: "fasta file of reference genome" | ||
| + refIndex: "Index file of reference genome" | ||
| + refDict: "dict file of reference genome" | ||
| + userString: "An optional parameter which allows the user to specify additions to the command line at run time" | ||
| + countOverlapReadsType: "Handling of overlapping reads from the same fragment" | ||
| + minBaseQuality: "Minimum base quality" | ||
| + minDepthOfNonFilteredBase: "Minimum number of bases that pass filters" | ||
| + minMappingQuality: "Minimum read mapping quality" | ||
| + out: "An output file created by the walker. Will overwrite contents if file exists" | ||
| + outputFormat: "Format of the output file, can be CSV, TABLE, RTABLE" | ||
| + sitesVCFFile: "Undocumented option" | ||
| + input_file: "Input file containing sequence data (BAM or CRAM)" | ||
| + intervals: "One or more genomic intervals over which to operate" | ||
| + unsafe: "Enable unsafe operations: nothing will be checked at runtime" | ||
| + } | ||
| +} |
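The `command` block of this task mixes the two placeholder patterns the README describes: `${default="X" name}` always emits the flag and falls back to `X`, while `${default="" "--flag " + name}` drops the flag entirely when the input is unset. The Python sketch below models that behavior for a small subset of the ASEReadCounter inputs. It is only an illustration of the intended semantics, not how a WDL engine evaluates the task; the helper names (`with_default`, `optional_flag`, `render_ase_read_counter`) are hypothetical, and the `userString` fallback is simplified to an empty string instead of the task's `"\n"`.

```python
from typing import Optional


def with_default(value: Optional[object], default: str) -> str:
    """Model of ${default="X" name}: the value if set, otherwise the default."""
    return default if value is None else str(value)


def optional_flag(flag: str, value: Optional[object]) -> str:
    """Model of ${default="" "--flag " + name}: nothing at all if unset."""
    return "" if value is None else f"{flag} {value}"


def render_ase_read_counter(gatk, ref, sites_vcf,
                            min_base_quality=None,
                            intervals=None,
                            user_string=None):
    """Build a simplified ASEReadCounter command line from task inputs."""
    parts = [
        f"java -jar {gatk}",
        "-T ASEReadCounter",
        f"-R {ref}",
        optional_flag("--intervals", intervals),         # omitted when unset
        "-mbq " + with_default(min_base_quality, "0"),   # tool default when unset
        f"-sites {sites_vcf}",
        with_default(user_string, ""),                   # free-form extra arguments
    ]
    # Skip empty fragments so unset optionals leave no stray spaces.
    return " ".join(p for p in parts if p)
```

For instance, `render_ase_read_counter("gatk.jar", "ref.fasta", "sites.vcf", user_string="--logging_level ERROR")` appends the engine-level option verbatim at the end of the command, which is the role `userString` plays in the generated tasks.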
| @@ -0,0 +1,56 @@ | ||
| +# -------------------------------------------------------------------------------------------- | ||
| +# This AnalyzeCovariates WDL task was generated on 09/09/16 for use with GATK version 3.6 | ||
| +# For more information on using this wrapper, please see the WDL repository at | ||
| +# https://github.com/broadinstitute/wdl/tree/develop/scripts/wrappers/gatk/README.md | ||
| +# Task Summary: Create plots to visualize base recalibration results | ||
| +# -------------------------------------------------------------------------------------------- | ||
| + | ||
| +task AnalyzeCovariates { | ||
| + File gatk | ||
| + File ref | ||
| + File refIndex | ||
| + File refDict | ||
| + String ? userString #If a parameter you'd like to use is missing from this task, use this term to add your own string | ||
| + File ? BQSR | ||
| + File ? afterReportFile | ||
| + File ? beforeReportFile | ||
| + Boolean ? ignoreLastModificationTimes | ||
| + File ? intermediateCsvFile | ||
| + File ? plotsReportFile | ||
| + | ||
| + command { | ||
| + java -jar ${gatk} \ | ||
| + -T AnalyzeCovariates \ | ||
| + -R ${ref} \ | ||
| + ${default="" "--BQSR " + BQSR} \ | ||
| + ${default="" "-after " + afterReportFile} \ | ||
| + ${default="" "-before " + beforeReportFile} \ | ||
| + -ignoreLMT ${default="false" ignoreLastModificationTimes} \ | ||
| + ${default="" "-csv " + intermediateCsvFile} \ | ||
| + ${default="" "-plots " + plotsReportFile} \ | ||
| + ${default="\n" userString} | ||
| + } | ||
| + | ||
| + output { | ||
| + #To track additional outputs from your task, please manually add them below | ||
| + String taskOut = "${plotsReportFile}" | ||
| + } | ||
| + | ||
| + runtime { | ||
| + docker: "broadinstitute/genomes-in-the-cloud:2.2.2-1466113830" | ||
| + } | ||
| + | ||
| + parameter_meta { | ||
| + gatk: "Executable jar for the GenomeAnalysisTK" | ||
| + ref: "fasta file of reference genome" | ||
| + refIndex: "Index file of reference genome" | ||
| + refDict: "dict file of reference genome" | ||
| + userString: "An optional parameter which allows the user to specify additions to the command line at run time" | ||
| + afterReportFile: "file containing the BQSR second-pass report file" | ||
| + beforeReportFile: "file containing the BQSR first-pass report file" | ||
| + ignoreLastModificationTimes: "do not emit warning messages related to suspicious last modification time order of inputs" | ||
| + intermediateCsvFile: "location of the csv intermediate file" | ||
| + plotsReportFile: "location of the output report" | ||
| + BQSR: "Input covariates table file for on-the-fly base quality score recalibration" | ||
| + } | ||
| +} |
| @@ -0,0 +1,76 @@ | ||
| +# -------------------------------------------------------------------------------------------- | ||
| +# This ApplyRecalibration WDL task was generated on 09/09/16 for use with GATK version 3.6 | ||
| +# For more information on using this wrapper, please see the WDL repository at | ||
| +# https://github.com/broadinstitute/wdl/tree/develop/scripts/wrappers/gatk/README.md | ||
| +# Task Summary: Apply a score cutoff to filter variants based on a recalibration table | ||
| +# -------------------------------------------------------------------------------------------- | ||
| + | ||
| +task ApplyRecalibration { | ||
| + File gatk | ||
| + File ref | ||
| + File refIndex | ||
| + File refDict | ||
| + String ? userString #If a parameter you'd like to use is missing from this task, use this term to add your own string | ||
| + Array[String] ? intervals | ||
| + Int ? ntVal | ||
| + Boolean ? excludeFiltered | ||
| + Boolean ? ignore_all_filters | ||
| + String ? ignore_filter | ||
| + Array[String] task_input | ||
| + Float ? lodCutoff | ||
| + String ? mode | ||
| + String ? out | ||
| + String recal_file | ||
| + File ? tranches_file | ||
| + Float ? ts_filter_level | ||
| + Boolean ? useAlleleSpecificAnnotations | ||
| + | ||
| + command { | ||
| + java -jar ${gatk} \ | ||
| + -T ApplyRecalibration \ | ||
| + -R ${ref} \ | ||
| + ${default="" "--intervals " + intervals} \ | ||
| + ${default="" "-nt " + ntVal} \ | ||
| + -ef ${default="false" excludeFiltered} \ | ||
| + -ignoreAllFilters ${default="false" ignore_all_filters} \ | ||
| + ${default="" "-ignoreFilter " + ignore_filter} \ | ||
| + -input ${task_input} \ | ||
| + ${default="" "-lodCutoff " + lodCutoff} \ | ||
| + -mode ${default="SNP" mode} \ | ||
| + -o ${default="stdout" out} \ | ||
| + -recalFile ${recal_file} \ | ||
| + ${default="" "-tranchesFile " + tranches_file} \ | ||
| + ${default="" "-ts_filter_level " + ts_filter_level} \ | ||
| + -AS ${default="false" useAlleleSpecificAnnotations} \ | ||
| + ${default="\n" userString} | ||
| + } | ||
| + | ||
| + output { | ||
| + #To track additional outputs from your task, please manually add them below | ||
| + String taskOut = "${out}" | ||
| + } | ||
| + | ||
| + runtime { | ||
| + docker: "broadinstitute/genomes-in-the-cloud:2.2.2-1466113830" | ||
| + } | ||
| + | ||
| + parameter_meta { | ||
| + gatk: "Executable jar for the GenomeAnalysisTK" | ||
| + ref: "fasta file of reference genome" | ||
| + refIndex: "Index file of reference genome" | ||
| + refDict: "dict file of reference genome" | ||
| + userString: "An optional parameter which allows the user to specify additions to the command line at run time" | ||
| + excludeFiltered: "Don't output filtered loci after applying the recalibration" | ||
| + ignore_all_filters: "If specified, the variant recalibrator will ignore all input filters. Useful to rerun the VQSR from a filtered output file." | ||
| + ignore_filter: "If specified, the recalibration will be applied to variants marked as filtered by the specified filter name in the input VCF file" | ||
| + task_input: "The raw input variants to be recalibrated" | ||
| + lodCutoff: "The VQSLOD score below which to start filtering" | ||
| + mode: "Recalibration mode to employ: 1.) SNP for recalibrating only SNPs (emitting indels untouched in the output VCF); 2.) INDEL for indels; and 3.) BOTH for recalibrating both SNPs and indels simultaneously." | ||
| + out: "The output filtered and recalibrated VCF file in which each variant is annotated with its VQSLOD value" | ||
| + recal_file: "The input recal file used by ApplyRecalibration" | ||
| + tranches_file: "The input tranches file describing where to cut the data" | ||
| + ts_filter_level: "The truth sensitivity level at which to start filtering" | ||
| + useAlleleSpecificAnnotations: "If specified, the tool will attempt to apply a filter to each allele based on the input tranches and allele-specific .recal file." | ||
| + intervals: "One or more genomic intervals over which to operate" | ||
| + } | ||
| +} |
| @@ -0,0 +1,109 @@ | ||
| +# -------------------------------------------------------------------------------------------- | ||
| +# This BaseRecalibrator WDL task was generated on 09/09/16 for use with GATK version 3.6 | ||
| +# For more information on using this wrapper, please see the WDL repository at | ||
| +# https://github.com/broadinstitute/wdl/tree/develop/scripts/wrappers/gatk/README.md | ||
| +# Task Summary: Detect systematic errors in base quality scores | ||
| +# -------------------------------------------------------------------------------------------- | ||
| + | ||
| +task BaseRecalibrator { | ||
| + File gatk | ||
| + File ref | ||
| + File refIndex | ||
| + File refDict | ||
| + String ? userString #If a parameter you'd like to use is missing from this task, use this term to add your own string | ||
| + Array[String] input_file | ||
| + Array[String] ? intervals | ||
| + File ? BQSR | ||
| + Int ? nctVal | ||
| + String ? binary_tag_name | ||
| + Float ? bqsrBAQGapOpenPenalty | ||
| + String ? covariate | ||
| + String ? deletions_default_quality | ||
| + Int ? indels_context_size | ||
| + String ? insertions_default_quality | ||
| + Array[String] ? knownSites | ||
| + Boolean ? list | ||
| + String ? low_quality_tail | ||
| + Boolean ? lowMemoryMode | ||
| + Int ? maximum_cycle_value | ||
| + Int ? mismatches_context_size | ||
| + String ? mismatches_default_quality | ||
| + Boolean ? no_standard_covs | ||
| + File out | ||
| + Int ? quantizing_levels | ||
| + Boolean ? run_without_dbsnp_potentially_ruining_quality | ||
| + String ? solid_nocall_strategy | ||
| + String ? solid_recal_mode | ||
| + Boolean ? sort_by_all_columns | ||
| + | ||
| + command { | ||
| + java -jar ${gatk} \ | ||
| + -T BaseRecalibrator \ | ||
| + -R ${ref} \ | ||
| + --input_file ${input_file} \ | ||
| + ${default="" "--intervals " + intervals} \ | ||
| + ${default="" "--BQSR " + BQSR} \ | ||
| + ${default="" "-nct " + nctVal} \ | ||
| + ${default="" "-bintag " + binary_tag_name} \ | ||
| + -bqsrBAQGOP ${default="40.0" bqsrBAQGapOpenPenalty} \ | ||
| + ${default="" "-cov " + covariate} \ | ||
| + -ddq ${default="45" deletions_default_quality} \ | ||
| + -ics ${default="3" indels_context_size} \ | ||
| + -idq ${default="45" insertions_default_quality} \ | ||
| + -knownSites ${default="[]" knownSites} \ | ||
| + -ls ${default="false" list} \ | ||
| + -lqt ${default="2" low_quality_tail} \ | ||
| + -lowMemoryMode ${default="false" lowMemoryMode} \ | ||
| + -maxCycle ${default="500" maximum_cycle_value} \ | ||
| + -mcs ${default="2" mismatches_context_size} \ | ||
| + -mdq ${default="-1" mismatches_default_quality} \ | ||
| + -noStandard ${default="false" no_standard_covs} \ | ||
| + -o ${out} \ | ||
| + -ql ${default="16" quantizing_levels} \ | ||
| + -run_without_dbsnp_potentially_ruining_quality ${default="false" run_without_dbsnp_potentially_ruining_quality} \ | ||
| + -solid_nocall_strategy ${default="THROW_EXCEPTION" solid_nocall_strategy} \ | ||
| + -sMode ${default="SET_Q_ZERO" solid_recal_mode} \ | ||
| + -sortAllCols ${default="false" sort_by_all_columns} \ | ||
| + ${default="\n" userString} | ||
| + } | ||
| + | ||
| + output { | ||
| + #To track additional outputs from your task, please manually add them below | ||
| + String taskOut = "${out}" | ||
| + } | ||
| + | ||
| + runtime { | ||
| + docker: "broadinstitute/genomes-in-the-cloud:2.2.2-1466113830" | ||
| + } | ||
| + | ||
| + parameter_meta { | ||
| + gatk: "Executable jar for the GenomeAnalysisTK" | ||
| + ref: "fasta file of reference genome" | ||
| + refIndex: "Index file of reference genome" | ||
| + refDict: "dict file of reference genome" | ||
| + userString: "An optional parameter which allows the user to specify additions to the command line at run time" | ||
| + binary_tag_name: "the binary tag covariate name if using it" | ||
| + bqsrBAQGapOpenPenalty: "BQSR BAQ gap open penalty (Phred Scaled). Default value is 40. 30 is perhaps better for whole genome call sets" | ||
| + covariate: "One or more covariates to be used in the recalibration. Can be specified multiple times" | ||
| + deletions_default_quality: "default quality for the base deletions covariate" | ||
| + indels_context_size: "Size of the k-mer context to be used for base insertions and deletions" | ||
| + insertions_default_quality: "default quality for the base insertions covariate" | ||
| + knownSites: "A database of known polymorphic sites" | ||
| + list: "List the available covariates and exit" | ||
| + low_quality_tail: "minimum quality for the bases in the tail of the reads to be considered" | ||
| + lowMemoryMode: "Reduce memory usage in multi-threaded code at the expense of threading efficiency" | ||
| + maximum_cycle_value: "The maximum cycle value permitted for the Cycle covariate" | ||
| + mismatches_context_size: "Size of the k-mer context to be used for base mismatches" | ||
| + mismatches_default_quality: "default quality for the base mismatches covariate" | ||
| + no_standard_covs: "Do not use the standard set of covariates, but rather just the ones listed using the -cov argument" | ||
| + out: "The output recalibration table file to create" | ||
| + quantizing_levels: "number of distinct quality scores in the quantized output" | ||
| + run_without_dbsnp_potentially_ruining_quality: "If specified, allows the recalibrator to be used without a dbsnp rod. Very unsafe and for expert users only." | ||
| + solid_nocall_strategy: "Defines the behavior of the recalibrator when it encounters no calls in the color space. Options = THROW_EXCEPTION, LEAVE_READ_UNRECALIBRATED, or PURGE_READ" | ||
| + solid_recal_mode: "How should we recalibrate solid bases in which the reference was inserted? Options = DO_NOTHING, SET_Q_ZERO, SET_Q_ZERO_BASE_N, or REMOVE_REF_BIAS" | ||
| + sort_by_all_columns: "Sort the rows in the tables of reports" | ||
| + input_file: "Input file containing sequence data (BAM or CRAM)" | ||
| + intervals: "One or more genomic intervals over which to operate" | ||
| + BQSR: "Input covariates table file for on-the-fly base quality score recalibration" | ||
| + } | ||
| +} |
0 comments on commit
2f69cf4