@FedericaBrando FedericaBrando commented Jan 20, 2025

This pull request includes:

  • enabling the Nextflow preview feature for channel topics, to change how software versions are collected
  • reorganizing the main workflows

Things still missing:

  • Reformat publishDir following the latest Nextflow standards
  • Add comments on the logic of certain params
  • Add a param schema
  • Solve other linting problems (e.g. variables not used in the scripts of .nf processes)

TOPICS - caveats and implementation

While testing topics to collect software versions, I ran into the following caveat: the pipeline may hang forever if a process reads from a topic, directly or indirectly, and also emits to that same topic.

For example, this happens if CUSTOM_DUMPSOFTWARE emits to the versions topic while its input comes from that same channel.topic. The same behaviour occurs in the MULTIQC process.
To fix the problem, those two processes cannot emit their versions to the topic, which means the versions of CUSTOM_DUMPSOFTWARE and MultiQC are omitted from the collected versions.
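
A minimal sketch of the pattern described above, not the pipeline's actual code: TOOL_A and DUMP_VERSIONS are hypothetical process names and the version string is made up. The point it illustrates is that the process consuming channel.topic('versions') declares a plain output and does not send anything back to that topic.

```nextflow
// Sketch only: process names and version strings are hypothetical.
nextflow.preview.topic = true   // topics are a preview feature and must be enabled

process TOOL_A {
    output:
    path 'versions.yml', topic: versions   // sent to the 'versions' topic

    script:
    """
    echo 'TOOL_A: 1.0.0' > versions.yml
    """
}

process DUMP_VERSIONS {
    input:
    path collated

    output:
    path 'software_versions.yml'   // plain output, NOT sent to the 'versions' topic

    script:
    """
    cp ${collated} software_versions.yml
    """
}

workflow {
    TOOL_A()
    // Gather everything sent to the topic by any process and merge it into one file
    ch_collated = channel.topic('versions').collectFile(name: 'collated_versions.yml')
    DUMP_VERSIONS(ch_collated)
}
```

If DUMP_VERSIONS also declared `topic: versions` on its output, it would be waiting on a channel that it feeds itself, which matches the hang reported here.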

// // Print parameter summary log to screen
// log.info logo + paramsSummaryLog(workflow) + citation

WorkflowDeepcsa.initialise(params, log)

Member Author

I've commented this out because we do it in the main.nf script - is there any difference between the two?

Collaborator

I honestly have no clue whether there is any difference. If things seem to work, we can leave it only there!
I am not sure what it does, but it is probably reporting the parameters or something like that.

Member Author

Action: remove unnecessary Java libs and use utils_nfcore_* for pipeline initialization

Comment on lines +121 to +144
// // Input channel definitions
features_table = Channel.fromPath( params.features_table ?: params.input, checkIfExists: true)

wgs_trinucs = params.wgs_trinuc_counts
    ? Channel.fromPath( params.wgs_trinuc_counts, checkIfExists: true).first()
    : Channel.empty()
cosmic_ref = params.cosmic_ref_signatures
    ? Channel.fromPath( params.cosmic_ref_signatures, checkIfExists: true).first()
    : Channel.empty()
datasets3d = params.datasets3d
    ? Channel.fromPath( params.datasets3d, checkIfExists: true).first()
    : Channel.empty()
annotations3d = params.annotations3d
    ? Channel.fromPath( params.annotations3d, checkIfExists: true).first()
    : Channel.empty()
seqinfo_df = params.datasets3d
    ? Channel.fromPath( "${params.datasets3d}/seq_for_mut_prob.tsv", checkIfExists: true).first()
    : Channel.empty()
cadd_scores = params.cadd_scores
    ? Channel.of([
        file(params.cadd_scores, checkIfExists : true),
        file(params.cadd_scores_ind, checkIfExists : true)
    ]).first()
    : Channel.empty()

Member Author

I don't fully grasp why we are redefining these variables. Is it so that we always have a channel, even if the parameters are null? Also, checkIfExists is already done automatically during parameter validation, so it may be worth thinking of a way to avoid the redundancy.

Member Author

Also, wgs_trinucs is never used in the script.

Collaborator

Then we could remove all the checkIfExists mentions, although in some cases it may make more sense to exclude the file from parameter validation and keep the check here, so that it is not a problem when no file is provided and that file is not needed.

Anyway, all these initializations of specific channels are far from optimal, but for now they allowed me to use the channels in ways that did not crash things. Removing some of them would probably still be fine.

Member Author

Action: all of these need to be value channels, so that the processes that need them start even when they are empty
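
One possible shape for this action, as a hedged sketch rather than the agreed implementation: EXAMPLE_PROCESS and its command line are hypothetical, and only the params names come from the diff above. It relies on the common convention of passing an empty list `[]` to a path input to mean "no file provided". Unlike Channel.empty(), which prevents a process from ever running, a value channel holding `[]` still lets the process start.

```nextflow
// Sketch only: EXAMPLE_PROCESS is hypothetical; params names are taken from the diff above.
process EXAMPLE_PROCESS {
    input:
    path features
    path cosmic   // receives an empty list when the parameter is not set

    script:
    // Build the optional argument only when a file was actually provided
    def cosmic_arg = cosmic ? "--cosmic ${cosmic}" : ''
    """
    echo "processing ${features} ${cosmic_arg}"
    """
}

workflow {
    features_table = Channel.fromPath(params.features_table ?: params.input, checkIfExists: true)

    // Optional file: always a value channel, with `[]` standing for "not provided"
    cosmic_ref = params.cosmic_ref_signatures
        ? Channel.value( file(params.cosmic_ref_signatures, checkIfExists: true) )
        : Channel.value( [] )

    EXAMPLE_PROCESS(features_table, cosmic_ref)
}
```

Keeping the assignments inside the workflow block also avoids statements outside top-level declarations, which is one of the lint issues this PR targets.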

Comment on lines +147 to +163
// if the user wants to use custom gene groups, import the gene groups table
// otherwise I am using the input csv as a dummy value channel
custom_groups_table = params.custom_groups_file
    ? Channel.fromPath( params.custom_groups_file, checkIfExists: true).first()
    : Channel.fromPath(params.input)

// if the user wants to use custom BED file for computing the depths, import the BED file
// otherwise I am using the input csv as a dummy value channel
custom_bed_file = params.custom_bedfile
    ? Channel.fromPath( params.custom_bedfile, checkIfExists: true).first()
    : Channel.fromPath(params.input)

// if the user wants to define hotspots for omega, import the hotspots definition BED file
// otherwise I am using the input csv as a dummy value channel
hotspots_bed_file = params.omega_hotspots_bedfile
    ? Channel.fromPath( params.omega_hotspots_bedfile, checkIfExists: true).first()
    : Channel.fromPath(params.input)

Member Author

I would rethink this logic. I personally don't understand it, and I think whatever the user can do from outside should stay outside of the main script. But maybe there is some underlying reason that I don't know and this is why it's here 😅

Collaborator

I did all this to ensure that the files are "pulled", but if we mount all the disks then it is fine.
Part of me still thinks it is worth having, but I am happy to adapt to whatever is more common sense.

Member Author

@FedericaBrando FedericaBrando Jan 22, 2025

Check whether the else branch is valid or is only used to fill the channel so that the process runs. If it is only a filler, remove it and make the value channel empty.

(The first one definitely needs to be checked.)

@FedericaBrando FedericaBrando added the "code-review 👩‍💻" label (Tasks associated with the code-review) Jan 20, 2025
@FedericaBrando FedericaBrando linked an issue Jan 20, 2025 that may be closed by this pull request

if (params.help) {
    def logo = NfcoreTemplate.logo(workflow, params.monochrome_logs)
    def citation = '\n' + WorkflowMain.citation(workflow) + '\n'
    def String command = "nextflow run ${workflow.manifest.name} --input samplesheet.csv --genome GRCh37 -profile docker"

Collaborator

@FerriolCalvet FerriolCalvet Jan 20, 2025

I don't know if this is ever used, but we should make sure that it is correct. I am not sure it would work right now.
We can leave it for later since it is not crucial.

Member Author

This actually works: if you run nextflow run main.nf --help, it runs exactly this code.

Member Author

Fix the typical pipeline command in the help:

def String command = "nextflow run ${workflow.manifest.name} --input samplesheet.csv --genome GRCh37 -profile docker"

Collaborator

@FerriolCalvet FerriolCalvet left a comment

Looks good. I left a couple of comments.

@FedericaBrando FedericaBrando changed the title from "[DRAFT] CODE-REVIEW | NF lint : Implement topics, fix statement outside top-level declaration" to "CODE-REVIEW | NF lint : Implement topics, fix statement outside top-level declaration" Jan 22, 2025
@FedericaBrando FedericaBrando merged commit 1612083 into dev Jan 22, 2025
Labels

code-review 👩‍💻 Tasks associated with the code-review

Development

Successfully merging this pull request may close these issues.

code-review | Nextflow standards

3 participants