Sfitz input vcfs #274

Merged
merged 34 commits into from
May 30, 2024

@sorelfitzgibbon (Contributor) commented Feb 28, 2024

Description

This PR allows the user to supply an input YAML listing 2-4 VCF files for intersection, which required some restructuring.
I will update the README in a separate PR.
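
As a rough illustration of the 2-4 VCF constraint described above, here is a minimal Python sketch of a count check. The function name `check_vcf_count` and its parameters are hypothetical; the pipeline's actual validation is written in Groovy and is not shown in this conversation.

```python
def check_vcf_count(vcf_paths, minimum=2, maximum=4):
    """Return the number of VCFs, or raise if it is outside [minimum, maximum].

    Hypothetical helper: the 2-4 range comes from the PR description,
    everything else is an assumption made for illustration.
    """
    count = len(vcf_paths)
    if not minimum <= count <= maximum:
        raise ValueError(
            f"Expected {minimum}-{maximum} VCF files for intersection, got {count}."
        )
    return count
```

An intersection needs at least two call sets, and the pipeline lists four callers, which is presumably where the upper bound comes from.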

Testing Results

VCF input

  • Tumor/Normal Paired Sample:
    • sample: DTB-003
    • input YAML: /hot/user/jieunoh/su2c/snv_intersect/input_yaml/DTB-003-T.yaml
    • config: /hot/user/jieunoh/su2c/snv_intersect/snv_intersect.config
    • output: /hot/user/jieunoh/su2c/output/intersected_call_ssnv_vcfs/call-sSNV-8.0.0/DTB-003-T/Intersect-BCFtools-1.17/output

NFtest BAM input

  • a_mini-all-tools-std-input
  • output: /hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-input-vcfs/a_mini-all-tools-std-input
  • log: /hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-input-vcfs/log-nftest-20240515T190437Z.log

NFtest VCF input

  • a_mini-all-tools-vcf-input
  • output: /hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-input-vcfs/a_mini-all-tools-vcf-input
  • log: /hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-input-vcfs/log-nftest-20240515T175646Z.log

For some reason the VCF input tests are yielding many of these warnings which seem to resolve themselves:
WARN: Failed to publish file:..

Checklist

  • I have read the code review guidelines and the code review best practice on GitHub check-list.

  • I have reviewed the Nextflow pipeline standards.

  • The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].

  • I have set up or verified the branch protection rule following the github standards before opening this pull request.

  • I have added my name to the contributors listings in the manifest block in the nextflow.config as part of this pull request; I am listed already, or do not wish to be listed. (This acknowledgement is optional.)

  • I have added the changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.

  • I have updated the version number in the metadata.yaml and manifest block of the nextflow.config file following semver, or the version number has already been updated. (Leave it unchecked if you are unsure about new version number and discuss it with the infrastructure team in this PR.)

  • I have tested the pipeline on at least one A-mini sample.

Comment on lines +96 to +104
check_valid_algorithms = {
    valid_algorithms = params.single_NT_paired ? ['somaticsniper', 'strelka2', 'mutect2', 'muse'] : ['mutect2']
    for (algo in params.algorithm) {
        if (!(algo in valid_algorithms)) {
            throw new Exception("ERROR: params.algorithm ${params.algorithm} contains an invalid value. Valid algorithms for given inputs: ${valid_algorithms}")
        }
    }
}

Contributor Author

moved from below
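
For readers less familiar with Groovy closures, the check in the snippet above can be sketched in Python as follows. This is an illustrative translation, not code from the PR; the `single_nt_paired` argument stands in for `params.single_NT_paired`, and unlike the original message, this version reports only the offending values.

```python
PAIRED_ALGORITHMS = ["somaticsniper", "strelka2", "mutect2", "muse"]
UNPAIRED_ALGORITHMS = ["mutect2"]


def check_valid_algorithms(algorithms, single_nt_paired):
    """Raise ValueError if any requested algorithm is not valid for the input."""
    # Paired tumor/normal input allows all four callers; otherwise only Mutect2.
    valid = PAIRED_ALGORITHMS if single_nt_paired else UNPAIRED_ALGORITHMS
    invalid = [algo for algo in algorithms if algo not in valid]
    if invalid:
        raise ValueError(
            f"Invalid algorithm(s) {invalid}. "
            f"Valid algorithms for given inputs: {valid}"
        )
```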

config/methods.config (resolved)
main.nf (resolved)

Bleep bloop, I am a robot.

Alas, some of the Nextflow configuration tests failed!

test/configtest-F16.json

@ ["params","input_type"]
+ "bam"
@ ["params","keep_input_prefix"]
+ false

test/configtest-F32.json

@ ["params","input_type"]
+ "bam"
@ ["params","keep_input_prefix"]
+ false

If the above changes are surprising, stop and determine what happened.

If the above changes are expected, there are two ways to fix this:

  1. Automatically: Post a comment starting with "/fix-tests" (without the quotes) and I will update the tests for you (you must review my work afterwards).
  2. Manually: Follow these steps on Confluence.

@sorelfitzgibbon
Contributor Author

/fix-tests

Bleep bloop, I am a robot.

I have updated all of the failing tests for you with 17159d7. You must review my work before merging this pull request!

Bleep bloop, I am a robot.

Alas, some of the Nextflow configuration tests failed!

test/configtest-F16.json

@ ["params","proc_resource_params","call_sIndel_Manta"]
- {"cpus":"6","memory":"6 GB","retry_strategy":{"memory":{"operand":"3 GB","strategy":"add"}}}
@ ["params","proc_resource_params","call_sSNV_MuSE"]
- {"cpus":"6","memory":"24 GB","retry_strategy":{"memory":{"operand":"8 GB","strategy":"add"}}}
@ ["params","proc_resource_params","call_sSNV_Mutect2"]
- {"cpus":"1","memory":"3 GB","retry_strategy":{"memory":{"operand":"2 GB","strategy":"add"}}}
@ ["params","proc_resource_params","call_sSNV_SomaticSniper"]
- {"cpus":"1","memory":"1 GB","retry_strategy":{"memory":{"operand":"3 GB","strategy":"add"}}}
@ ["params","proc_resource_params","call_sSNV_Strelka2"]
- {"cpus":"6","ext":{"retry_codes":[]},"memory":"2 GB","retry_strategy":{"memory":{"operand":"12 GB","strategy":"add"}}}
@ ["params","proc_resource_params","convert_BAM2Pileup_SAMtools"]
- {"cpus":"1","memory":"1 GB","retry_strategy":{"memory":{"operand":"3 GB","strategy":"add"}}}
@ ["params","proc_resource_params","create_IndelCandidate_SAMtools"]
- {"cpus":"1","memory":"1 GB","retry_strategy":{"memory":{"operand":"3 GB","strategy":"add"}}}
@ ["params","proc_resource_params","run_LearnReadOrientationModel_GATK"]
- {"cpus":"1","memory":"8 GB","retry_strategy":{"memory":{"operand":"2","strategy":"exponential"}}}
@ ["params","proc_resource_params","run_sump_MuSE"]
- {"cpus":"8","memory":"24 GB","retry_strategy":{"memory":{"operand":"8 GB","strategy":"add"}}}
@ ["process","withName:call_sIndel_Manta"]
- {"cpus":"6","memory":{"1":"6 GB","2":"9 GB","3":"12 GB","closure":"retry_updater(6 GB, add, 3 GB, $task.attempt, memory)"}}
@ ["process","withName:call_sSNV_MuSE"]
- {"cpus":"6","memory":{"1":"24 GB","2":"31 GB","3":"31 GB","closure":"retry_updater(24 GB, add, 8 GB, $task.attempt, memory)"}}
@ ["process","withName:call_sSNV_Mutect2"]
- {"cpus":"1","memory":{"1":"3 GB","2":"5 GB","3":"7 GB","closure":"retry_updater(3 GB, add, 2 GB, $task.attempt, memory)"}}
@ ["process","withName:call_sSNV_SomaticSniper"]
- {"cpus":"1","memory":{"1":"1 GB","2":"4 GB","3":"7 GB","closure":"retry_updater(1 GB, add, 3 GB, $task.attempt, memory)"}}
@ ["process","withName:call_sSNV_Strelka2"]
- {"cpus":"6","ext":{"retry_codes":[]},"memory":{"1":"2 GB","2":"14 GB","3":"26 GB","closure":"retry_updater(2 GB, add, 12 GB, $task.attempt, memory)"}}
@ ["process","withName:convert_BAM2Pileup_SAMtools"]
- {"cpus":"1","memory":{"1":"1 GB","2":"4 GB","3":"7 GB","closure":"retry_updater(1 GB, add, 3 GB, $task.attempt, memory)"}}
@ ["process","withName:create_IndelCandidate_SAMtools"]
- {"cpus":"1","memory":{"1":"1 GB","2":"4 GB","3":"7 GB","closure":"retry_updater(1 GB, add, 3 GB, $task.attempt, memory)"}}
@ ["process","withName:run_LearnReadOrientationModel_GATK"]
- {"cpus":"1","memory":{"1":"8 GB","2":"16 GB","3":"31 GB","closure":"retry_updater(8 GB, exponential, 2, $task.attempt, memory)"}}
@ ["process","withName:run_sump_MuSE"]
- {"cpus":"8","memory":{"1":"24 GB","2":"31 GB","3":"31 GB","closure":"retry_updater(24 GB, add, 8 GB, $task.attempt, memory)"}}

test/configtest-F32.json

@ ["params","proc_resource_params","call_sSNV_Strelka2","ext"]
- {"retry_codes":[]}
@ ["process","withName:call_sSNV_Strelka2","ext"]
- {"retry_codes":[]}

If the above changes are surprising, stop and determine what happened.

If the above changes are expected, there are two ways to fix this:

  1. Automatically: Post a comment starting with "/fix-tests" (without the quotes) and I will update the tests for you (you must review my work afterwards).
  2. Manually: Follow these steps on Confluence.
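
The per-attempt memory values in the removed `withName` entries above (for example 6, 9, 12 GB for `add` with a 3 GB operand, and 8, 16, 31 GB for `exponential` with operand 2) are consistent with the arithmetic sketched below. This is a Python reconstruction from the numbers shown, not the pipeline's `retry_updater`; the function name `retry_memory_gb` and the 31 GB cap (inferred from the F16 values topping out at 31 GB) are assumptions.

```python
def retry_memory_gb(base_gb, strategy, operand, attempt, max_gb=31):
    """Memory request in GB for a given attempt number (1-based).

    Assumed behaviour, inferred from the config-test output:
      add:         base + operand * (attempt - 1)
      exponential: base * operand ** (attempt - 1)
    The result is capped at max_gb (assumed node limit).
    """
    if strategy == "add":
        requested = base_gb + operand * (attempt - 1)
    elif strategy == "exponential":
        requested = base_gb * operand ** (attempt - 1)
    else:
        raise ValueError(f"Unknown retry strategy: {strategy}")
    return min(requested, max_gb)


# Values for attempts 1-3, matching call_sIndel_Manta in the diff above.
manta = [retry_memory_gb(6, "add", 3, attempt) for attempt in (1, 2, 3)]
```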

@sorelfitzgibbon
Contributor Author

/fix-tests

Bleep bloop, I am a robot.

I have updated all of the failing tests for you with 7808bfd. You must review my work before merging this pull request!

@sorelfitzgibbon
Contributor Author

@yashpatel6 this PR is ready for review

@sorelfitzgibbon
Contributor Author

@yashpatel6 , new main branch merged in and retested. Ready again for review.

@yashpatel6 yashpatel6 self-assigned this May 22, 2024
@yashpatel6 (Contributor) left a comment

Added some comments:

config/F16.config (resolved)
config/F16.config (resolved)
config/custom_schema_types.config (outdated, resolved)
config/mutect-only.config (outdated, resolved)
config/default.config (resolved)
config/mutect-only.config (outdated, resolved)
input/call-sSNV-template-VCF.yaml (outdated, resolved)
module/mutect2-processes.nf (outdated, resolved)
main.nf (resolved)
module/vcf-input.nf (outdated, resolved)

Bleep bloop, I am a robot.

Alas, some of the Nextflow configuration tests failed!

test/configtest-F16.json

@ ["params","log_output_dir"]
- "/tmp/outputs/call-sSNV-8.0.0/0192847/log-call-sSNV-8.0.0-19970704T165655Z"
+ "/tmp/outputs/call-sSNV-8.1.0/0192847/log-call-sSNV-8.1.0-19970704T165655Z"
@ ["params","save_intermediate_files"]
- true
+ false

test/configtest-F32.json

@ ["params","log_output_dir"]
- "/tmp/outputs/call-sSNV-8.0.0/0192847/log-call-sSNV-8.0.0-19970704T165655Z"
+ "/tmp/outputs/call-sSNV-8.1.0/0192847/log-call-sSNV-8.1.0-19970704T165655Z"
@ ["params","save_intermediate_files"]
- true
+ false

If the above changes are surprising, stop and determine what happened.

If the above changes are expected, there are two ways to fix this:

  1. Automatically: Post a comment starting with "/fix-tests" (without the quotes) and I will update the tests for you (you must review my work afterwards).
  2. Manually: Follow these steps on Confluence.

@nwiltsie
Member

@sorelfitzgibbon hang on, let me fix up the tests with the latest version-ignoring changes!

@nwiltsie
Member

@sorelfitzgibbon okay, I fixed the version numbering problems with a1a92d6, and that shouldn't be a problem anymore (see uclahs-cds/tool-Nextflow-action#40).

The tests are still failing because params.save_intermediate_files changed - I didn't touch that part.

@sorelfitzgibbon
Contributor Author

Thanks @nwiltsie !

@sorelfitzgibbon
Contributor Author

/fix-tests

Bleep bloop, I am a robot.

I have updated all of the failing tests for you with 5480446. You must review my work before merging this pull request!

@sorelfitzgibbon
Contributor Author

@yashpatel6 I believe I have addressed all of your concerns.

@sorelfitzgibbon
Contributor Author

Retested after merging in submodule updates:
vcf: /hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-input-vcfs/log-nftest-20240529T230920Z.log
std: /hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-input-vcfs/log-nftest-20240529T231143Z.log

@yashpatel6 (Contributor) left a comment

A couple of re-organizing comments, but otherwise this looks good! Once the given parts are moved/removed and the test cases succeed, we can merge.

Comment on lines 158 to 186
/**
 * Check if proper VCF entry list
 */
check_vcf_list = { Map options, String name, Map properties ->
    custom_schema_types.check_if_list(options[name], name)
    for (item in options[name]) {
        custom_schema_types.check_if_namespace(item, name)
        properties.elements.each { key, val ->
            schema.validate_parameter(item, key, val)
        }
    }
}

/**
 * Check that at least one kind of input is given and only one of BAM or VCF is given
 */
check_input_presence_and_exclusivity = { Map options ->
    def bam_given = options.containsKey('bam')
    def vcf_given = options.containsKey('vcf')

    if (!bam_given && !vcf_given) {
        throw new Exception("At least one input type (BAM or VCF) must be provided.")
    }

    if (bam_given && vcf_given) {
        throw new Exception("Only one input type (either BAM or VCF) should be provided.")
    }
}

Contributor

I think these two functions aren't being used anywhere?

Contributor Author

Yes, check_input_presence_and_exclusivity was accidentally left behind.

I guess check_vcf_list can't be used and isn't necessary because of the way the VCFs are input. I removed the function and the VCFEntryList that referred to it.
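
For reference, the presence-and-exclusivity rule that the removed `check_input_presence_and_exclusivity` closure expressed can be sketched in Python as below. This only illustrates the logic under discussion; the returned input-type string is an addition for the example, and nothing here is code from the pipeline.

```python
def check_input_presence_and_exclusivity(options):
    """Require exactly one of the 'bam' or 'vcf' keys; return which one was given."""
    bam_given = "bam" in options
    vcf_given = "vcf" in options

    if not bam_given and not vcf_given:
        raise ValueError("At least one input type (BAM or VCF) must be provided.")
    if bam_given and vcf_given:
        raise ValueError("Only one input type (either BAM or VCF) should be provided.")

    # Returning the detected type is an assumption added for illustration.
    return "bam" if bam_given else "vcf"
```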

@@ -134,6 +198,7 @@ custom_schema_types {
types = [
'InputNamespace': custom_schema_types.check_input_namespace,
'BAMEntryList': custom_schema_types.check_bam_list,
'VCFEntryList': custom_schema_types.check_vcf_list,
Contributor

I think VCFEntryList is also not being used in the schema.

Comment on lines 13 to 17
rename_id_ch = Channel.value(['orig_id': params.input_tumor_id, 'id': params.tumor_id, 'sample_type': 'tumor'])
    .mix(Channel.value(['orig_id': params.input_normal_id, 'id': params.normal_id, 'sample_type': 'normal']))
    .mix(Channel.value(['orig_id': 'TUMOR', 'id': params.tumor_id, 'sample_type': 'tumor']))
    .mix(Channel.value(['orig_id': 'NORMAL', 'id': params.normal_id, 'sample_type': 'normal']))
    .collect()
Contributor

I suggest moving this into the workflow below: the generated channel is used in the workflow, and leaving the statements outside the workflow could create unexpected channels when the module is imported.
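
The concern here is specific to Nextflow module imports, but the same idea can be shown by analogy in Python: building the rename entries inside a function means nothing is created merely by importing the module. The function name `build_rename_entries` and the dictionary-based `params` are hypothetical; the four entries mirror the snippet above.

```python
def build_rename_entries(params):
    """Build the ID rename entries on demand instead of at import time."""
    return [
        {"orig_id": params["input_tumor_id"], "id": params["tumor_id"], "sample_type": "tumor"},
        {"orig_id": params["input_normal_id"], "id": params["normal_id"], "sample_type": "normal"},
        # Generic placeholder sample names are mapped to the same IDs.
        {"orig_id": "TUMOR", "id": params["tumor_id"], "sample_type": "tumor"},
        {"orig_id": "NORMAL", "id": params["normal_id"], "sample_type": "normal"},
    ]
```

In the Nextflow case, the equivalent change is to create `rename_id_ch` inside the `workflow` block that consumes it.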

@sorelfitzgibbon
Contributor Author

Changes made and successfully re-tested: /hot/software/pipeline/pipeline-call-sSNV/Nextflow/development/unreleased/sfitz-input-vcfs. Will push changes and merge.

@sorelfitzgibbon sorelfitzgibbon merged commit e27b4e7 into main May 30, 2024
7 checks passed
@sorelfitzgibbon sorelfitzgibbon deleted the sfitz-input-vcfs branch May 30, 2024 20:54