Fixing critical typo in HaplotypeCaller disk spec. #450

jonn-smith · 2024-04-24T15:11:17Z

No description provided.

shadizaheri · 2024-05-02T20:19:11Z

Thank you for the great work on these updates. I tested this branch on 3,000 malaria samples with the SRFlowcell workflow on Terra, and it worked perfectly.
Based on my observations, I’ve written a summary of the changes in this PR for both users and developers. Please correct any part that is inaccurate.

Review Summary of Key Improvements

Refactoring the Output Calculation Logic:
- This PR simplifies complex calculation expressions by breaking them into smaller, clearly defined variables. It also prevents errors by incorporating conditional checks to handle cases like division by zero.
- Metrics such as estimated fold coverage and aligned fraction of bases are now calculated upfront and stored in descriptive variables, which are then used in the output block. This approach improves the readability of the code and centralizes the logic for easier future adjustments.
Enhanced Error Handling in QC Tasks:
- The PR also improves error handling in the FastQC task. It now checks that base quality data is present before running calculations, which makes the workflow more robust against incomplete data that could otherwise cause runtime errors.
Optimization of Java Options in Utility Tasks:
HaplotypeCaller WDL Change in Source for gVCF Files:
- The output_gvcf and output_gvcf_index are now sourced from ReblockHcGVCF.output_gvcf and ReblockHcGVCF.output_gvcf_index, respectively.
- Previously, these files were sourced from MergeGVCFs.output_vcf and MergeGVCFs.output_vcf_index, indicating that the gVCF files were directly obtained from the merging of multiple GVCF files without additional processing.
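
The guarded calculations described in the summary above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the WDL from this PR. The function names, the metric formulas (aligned bases over genome length, aligned bases over total bases), and the fallback values are all assumptions.

```python
from typing import Optional, Sequence


def safe_ratio(numerator: float, denominator: float, default: float = 0.0) -> float:
    """Return numerator / denominator, or `default` when the denominator is zero."""
    if denominator == 0:
        return default
    return numerator / denominator


def mean_base_quality(qualities: Sequence[float]) -> Optional[float]:
    """Return the mean quality, or None when no quality values are present."""
    if not qualities:
        return None
    return sum(qualities) / len(qualities)


def summarize_metrics(aligned_bases, total_bases, genome_length, qualities):
    # Compute each metric once, up front, in a descriptively named variable
    # (hypothetical formulas; the PR's actual expressions may differ).
    estimated_fold_coverage = safe_ratio(aligned_bases, genome_length)
    aligned_fraction_of_bases = safe_ratio(aligned_bases, total_bases)
    quality = mean_base_quality(qualities)

    # The "output block" only references the precomputed variables.
    return {
        "estimated_fold_coverage": estimated_fold_coverage,
        "aligned_fraction_of_bases": aligned_fraction_of_bases,
        "mean_base_quality": quality,
    }
```

Keeping the zero check in one helper means every ratio in the output block is guarded the same way, and an empty quality list yields an explicit missing value instead of a runtime error.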

shadizaheri

Please check the conversation for the details.

- Removed `HAPCOMP`, `HAPDOM`, and `HEC` from default annotations for SNP and INDEL VETS filtration. Need to do more testing / debugging to include these in joint calling.
- Fixed name of outdir in `ConvertToZarrStore` to be correct for this workflow.
- Updated the zarr conversion to use parallel Dask processes and to log to stdout.

jonn-smith added 9 commits April 24, 2024 11:04

Fixing critical typo in HaplotypeCaller disk spec.

7a0f0c0

Fixing wdl-computed divide by zero error in SRFlowcell outputs.

4379f75

Updated MergeVCFs to have an option to name an output as a GVCF.

9a49064

Removed deprecated GC logging flags from GATK commands.

2c10229

Fixed issue in FastQC if no read qualities are in the file.

16160f9

Adding some debugging code.

9066a19

Fixed problem caused by pipefail flag in set

f3c4ede

Added missing annotation groups to ReblockGVCFs.

a207846

HaplotypeCaller WDL now returns the reblocked GVCF.

fa142b6

SHuang-Broad requested a review from shadizaheri May 2, 2024 17:59

shadizaheri approved these changes May 2, 2024

jonn-smith added 8 commits May 3, 2024 15:33

Joint genotyping updates (added gnarly, added het inputs).

fda0456

- added option to use gnarly genotyper
- added het inputs to joint genotyping
- fixed java memory allocation in joint genotyping to be based on memory of the VM, not hard-coded
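
The memory fix in this commit (Java heap derived from the VM's memory rather than a hard-coded value) can be sketched roughly as follows. This is a Python illustration, not the WDL from the commit. The function name, the 2 GB overhead reserved for non-heap use, and the 1 GB floor are assumptions; `-Xmx` is the standard JVM maximum heap flag.

```python
def java_heap_option(vm_memory_gb: int, overhead_gb: int = 2, min_heap_gb: int = 1) -> str:
    """Build a JVM -Xmx option from the VM's total memory.

    Reserves `overhead_gb` for the OS and off-heap use (an assumed value),
    and never goes below `min_heap_gb`.
    """
    heap_gb = max(vm_memory_gb - overhead_gb, min_heap_gb)
    return f"-Xmx{heap_gb}g"
```

With this approach, raising the VM memory in the task's runtime settings raises the Java heap automatically, so the two values cannot drift apart.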

Added stacktrace logging to VETS tasks.

e949005

- Added stack trace logging for errors in `ExtractVariantAnnotations`, `TrainVariantAnnotationsModel`, and `ScoreVariantAnnotations`.

Fixed the name of the workflow in ExpandedDrugResistanceMarkerAggregation to reflect the actual name.

0b8d218

Updates to zarr conversion for logging.

e2f9466

More logging updates to zarr store.

6045f13

More fixes to ConvertToZarr

cd31bf9

Upped default zarr conversion memory to 32gb.

50c7179

jonn-smith merged commit fe32d91 into main May 8, 2024
5 checks passed

jonn-smith deleted the jts_quick_hc_bugfix branch May 8, 2024 18:56
