Add GMM-Demux #5641

mari-ga · 2024-05-21T13:08:29Z

PR checklist

Closes #XXX

jfy133

See below,

modules/nf-core/gmmdemux/main.nf

modules/nf-core/gmmdemux/meta.yml

jfy133 · 2024-05-23T12:12:19Z

modules/nf-core/gmmdemux/meta.yml

+        e.g. `[ id:'sample1', single_end:false ]`
+  - cell_hashing_matrix:
+      type: file
+      description: path to file containing matrix from cell hashing data, the tool receives either CSV files or TSV, type must be specified using parameters


If the type MUST be specified using parameters, then this is OK to have as an input channel. Everything else goes via ext.args (as I've said a few times above 😬).

Yep, in this case these are the input files. Originally the tool takes only one path, to the directory where these 3 files are stored.
nf-core cannot receive the path to a directory stored in test-datasets, which is why we need the 3 paths to create an intermediate folder that is later given as input.

If the tool is normally meant to accept a directory:

1. Package the three test files into a directory and gzip it
2. Upload to test-datasets
3. Add a 'setup' block to the test where you specify the gzip archive and use the GUNZIP module to extract it
4. Pass GUNZIP.out.archive (or whatever it is) as the (single) input to the module
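
A minimal sketch of the packaging step, assuming a tar archive is used: gzip on its own compresses a single file, so a directory is normally bundled with tar first (in which case an untar-style module, rather than GUNZIP, would be the matching extraction step). The directory name `hashing_input` and the three file names are hypothetical placeholders, not the real test data.

```shell
set -eu

# Hypothetical names: replace with the real test files.
workdir="$(mktemp -d)"
mkdir "${workdir}/hashing_input"
touch "${workdir}/hashing_input/barcodes.tsv.gz" \
      "${workdir}/hashing_input/features.tsv.gz" \
      "${workdir}/hashing_input/matrix.mtx.gz"

# Bundle the directory into a single gzip-compressed tar archive.
tar -czf "${workdir}/hashing_input.tar.gz" -C "${workdir}" hashing_input

# List the archive contents to confirm all three files are inside.
archive_listing="$(tar -tzf "${workdir}/hashing_input.tar.gz")"
echo "${archive_listing}"
```

The resulting archive is the single file that would be uploaded to test-datasets and extracted in the test's setup block.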

modules/nf-core/gmmdemux/tests/main.nf.test


tstoeriko · 2024-05-29T14:23:58Z

modules/nf-core/cellranger/count/templates/cellranger_count.py

One of the commits includes changes in a bunch of files from other modules (cellranger, custom, mygene, rnatranscripts). Unless I'm missing something, I don't think these changes belong in this PR?

Yes, this isn't good...

Looks like an autoformatting thing?

Very possible, I didn't touch any of those files 🤔 Is there anything I can do to revert it?

I think this is what you are looking for: https://stackoverflow.com/questions/215718/how-can-i-reset-or-revert-a-file-to-a-specific-revision
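
As a sketch of what that answer boils down to: `git checkout <revision> -- <path>` restores a file to its state at a given revision. The example below builds a throwaway repository so it can be run safely. The file name `other_module.py`, the tag `base`, and the commit messages are made up for illustration; in this PR the revision would presumably be the upstream master branch and the paths the unrelated module files.

```shell
set -eu

# Throwaway repository so nothing real is touched.
repo="$(mktemp -d)"
cd "${repo}"
git init -q .
# Local identity so commits work in a clean environment.
git config user.email "dev@example.com"
git config user.name "Example Dev"

echo "original" > other_module.py
git add other_module.py
git commit -q -m "base state"
git tag base

# Simulate the unintended autoformatting change.
echo "accidentally reformatted" > other_module.py
git commit -q -am "unintended change"

# Restore the file to how it looked at the 'base' revision and commit that.
git checkout base -- other_module.py
git commit -q -m "revert unrelated change to other_module.py"

restored_content="$(cat other_module.py)"
echo "${restored_content}"
```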

modules/nf-core/gmmdemux/main.nf

tstoeriko · 2024-05-29T14:56:24Z

modules/nf-core/gmmdemux/main.nf

+    tuple val(meta), path('test/barcodes.tsv.gz'), emit: barcodes
+    tuple val(meta), path('test/*.mtx.gz'       ), emit: matrix
+    tuple val(meta), path('test/features.tsv.gz'), emit: features


Where does the `test` in the path come from? It looks like all output files are currently being published in a directory called `test` within the actual output directory. Does the `--output`/`-o` flag only work for that one file? If you cannot specify the output directory properly, you might want to move the files to the right place after generating them (and delete unnecessary directories).
I think I would either
(1) rename all output files to contain the ${prefix} and publish them in the current directory, OR
(2) put the .tsv.gz/.mtx.gz files into a directory named ${prefix} (which kallisto quant or salmon quant seem to be doing).
Not sure which one would generally be preferred, though.
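
Option (2) could look roughly like the sketch below. It assumes the tool has written its files to a default folder (simulated here with empty placeholder files in `SSD_mtx`, the default folder name given in the tool's documentation) and that `prefix` holds the sample ID; the value `sample1` is a stand-in. In the module itself `${prefix}` would be filled in by Nextflow rather than set as a shell variable.

```shell
set -eu

prefix="sample1"            # stand-in for the module's ${prefix}
workdir="$(mktemp -d)"
cd "${workdir}"

# Simulate the tool's default output folder with empty placeholder files.
mkdir SSD_mtx
touch SSD_mtx/barcodes.tsv.gz SSD_mtx/features.tsv.gz SSD_mtx/matrix.mtx.gz

# Move everything into a directory named after the prefix, then drop the
# now-empty default folder so it is not published by accident.
mkdir -p "${prefix}"
mv SSD_mtx/* "${prefix}/"
rmdir SSD_mtx

ls "${prefix}"
```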

Suggested change

-    tuple val(meta), path('test/barcodes.tsv.gz'), emit: barcodes
-    tuple val(meta), path('test/*.mtx.gz'       ), emit: matrix
-    tuple val(meta), path('test/features.tsv.gz'), emit: features
+    tuple val(meta), path('barcodes.tsv.gz'), emit: barcodes
+    tuple val(meta), path('*.mtx.gz'       ), emit: matrix
+    tuple val(meta), path('features.tsv.gz'), emit: features

Are all of these outputs also generated when the -x flag is used? If not, they need to be optional.

Nope, but the tool generates these inside a default folder even if no other parameters are added, as the author says:

Output The default content in the output folder are the non-MSM droplets (SSDs), stored in MTX format. The output shares the same format with CellRanger 3.0. By default, the output is stored in SSD_mtx folder. The output location can be overwritten with the -o flag.

modules/nf-core/gmmdemux/main.nf

tstoeriko · 2024-05-29T15:46:04Z

modules/nf-core/gmmdemux/main.nf

+    val examine
+    val ambigous
+    val extract
+    val skip


Shouldn't some of these (e.g. skip) be paths?

modules/nf-core/gmmdemux/main.nf

tstoeriko

Sorry for all of my comments, this module seems to be a bit tedious to implement. I mainly left comments on main.nf for now, cause I think it makes sense to get this sorted out before I look at the tests in more detail.

modules/nf-core/gmmdemux/main.nf

modules/nf-core/gmmdemux/nextflow.config

modules/nf-core/gmmdemux/main.nf

tstoeriko · 2024-05-29T16:23:03Z

modules/nf-core/gmmdemux/main.nf

+    mkdir test
+    touch test/barcodes.tsv.gz
+    touch test/features.tsv.gz
+    touch test/matrix.mtx.gz


Suggested change

-    mkdir test
-    touch test/barcodes.tsv.gz
-    touch test/features.tsv.gz
-    touch test/matrix.mtx.gz
+    touch barcodes.tsv.gz
+    touch features.tsv.gz
+    touch matrix.mtx.gz

I tried the suggested lines, but they crash the stub test.

I think the paths just need to be updated to match the new output patterns, then it should hopefully work.

Suggested change

-    mkdir test
-    touch test/barcodes.tsv.gz
-    touch test/features.tsv.gz
-    touch test/matrix.mtx.gz
+    mkdir "${prefix}"
+    touch "${prefix}/barcodes.tsv.gz"
+    touch "${prefix}/features.tsv.gz"
+    touch "${prefix}/matrix.mtx.gz"
+    touch "${prefix}/classification_report_${prefix}"
+    touch "summary_report_${prefix}.txt"
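
One way to sanity-check a stub against the expected outputs before running the module tests is to create the stub files in a scratch directory and test each path. This is only a sketch: `sample1` is a stand-in for `${prefix}` (which Nextflow would normally fill in), and the list of expected paths is an assumption that the output block uses the same `${prefix}`-based names as the stub.

```shell
set -eu

prefix="sample1"            # stand-in for the module's ${prefix}
cd "$(mktemp -d)"

# Same commands as the suggested stub block.
mkdir "${prefix}"
touch "${prefix}/barcodes.tsv.gz"
touch "${prefix}/features.tsv.gz"
touch "${prefix}/matrix.mtx.gz"
touch "${prefix}/classification_report_${prefix}"
touch "summary_report_${prefix}.txt"

# Paths the output block is assumed to declare.
missing=0
for expected in \
    "${prefix}/barcodes.tsv.gz" \
    "${prefix}/features.tsv.gz" \
    "${prefix}/matrix.mtx.gz" \
    "${prefix}/classification_report_${prefix}" \
    "summary_report_${prefix}.txt"
do
    if [ ! -e "${expected}" ]; then
        echo "missing: ${expected}"
        missing=$((missing + 1))
    fi
done
echo "missing outputs: ${missing}"
```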

modules/nf-core/gmmdemux/meta.yml


tstoeriko · 2024-06-05T13:33:34Z

modules/nf-core/gmmdemux/main.nf

+    tuple val(meta), path("${meta.id}/barcodes.tsv.gz"                          ), emit: barcodes
+    tuple val(meta), path("${meta.id}/*.mtx.gz"                                 ), emit: matrix
+    tuple val(meta), path("${meta.id}/features.tsv.gz"                          ), emit: features
+    tuple val(meta), path("${meta.id}/classification_report_${meta.id}"     ), emit: classification_report
+    tuple val(meta), path("summary_report_${meta.id}.txt"                       ), emit: summary_report, optional: true
+    path "versions.yml"           , emit: versions


Suggested change

-    tuple val(meta), path("${meta.id}/barcodes.tsv.gz"                          ), emit: barcodes
-    tuple val(meta), path("${meta.id}/*.mtx.gz"                                 ), emit: matrix
-    tuple val(meta), path("${meta.id}/features.tsv.gz"                          ), emit: features
-    tuple val(meta), path("${meta.id}/classification_report_${meta.id}"     ), emit: classification_report
-    tuple val(meta), path("summary_report_${meta.id}.txt"                       ), emit: summary_report, optional: true
-    path "versions.yml"           , emit: versions
+    tuple val(meta), path("${prefix}/barcodes.tsv.gz")                , emit: barcodes
+    tuple val(meta), path("${prefix}/*.mtx.gz")                       , emit: matrix
+    tuple val(meta), path("${prefix}/features.tsv.gz")                , emit: features
+    tuple val(meta), path("${prefix}/classification_report_${prefix}"), emit: classification_report
+    tuple val(meta), path("summary_report_${prefix}.txt")             , emit: summary_report, optional: true
+    path "versions.yml"           , emit: versions

tstoeriko · 2024-06-05T13:39:05Z

modules/nf-core/gmmdemux/main.nf

+    if [[ ${summary_report} == true ]]; then
+        cat /dev/null >  summary_report_${prefix}.txt
+        echo "summary report file created"
+    fi 


Is this necessary because the tool itself does not create the summary report file?

tstoeriko · 2024-06-05T13:40:28Z

modules/nf-core/gmmdemux/main.nf

+    script:
+    def args           = task.ext.args ?: ''
+    def prefix         = task.ext.prefix ?: "${meta.id}"
+    def type_report    = type_report ? "-f test/classification_report_${prefix}" : "-s test/classification_report_${prefix}"


Suggested change

-    def type_report    = type_report ? "-f test/classification_report_${prefix}" : "-s test/classification_report_${prefix}"
+    def type_report    = type_report ? "-f ${prefix}/classification_report_${prefix}" : "-s ${prefix}/classification_report_${prefix}"

jfy133 · 2024-06-24T09:15:29Z

@mari-ga I have a feeling it would be better to start this PR from scratch, given the large number of conflicts and comments... would this be OK?

gmm-demux test and other fixes

712b453

mari-ga requested a review from a team as a code owner May 21, 2024 13:08

mari-ga requested review from leoisl and removed request for a team May 21, 2024 13:08

mari-ga and others added 4 commits May 21, 2024 15:43

Trailing whitespace - nextflow config

b6162c8

whitespaces added

bbb3a0e

trailing white spaces out

aec1de5

Merge branch 'master' into gmm-demux

ff5bfc6

jfy133 requested changes May 23, 2024


mari-ga added 4 commits May 23, 2024 15:16

fixes 1

247f35b

Merge remote-tracking branch 'refs/remotes/origin/gmm-demux' into gmm…

58dad30

…-demux merge

some parameters out, tests passed

f737e9e

ruff passed

1b64cf7

mari-ga requested review from drpatelh, grst, klkeys, ggabernet and edmundmiller as code owners May 23, 2024 18:28

mari-ga and others added 4 commits May 23, 2024 21:01

white tracespace fixed

61d5eff

white trace space fixed

e130644

Merge branch 'master' into gmm-demux

8810e23

Merge branch 'master' into gmm-demux

5275c6b

tstoeriko reviewed May 29, 2024


tstoeriko requested changes May 29, 2024


jfy133 self-requested a review May 30, 2024 06:48

mari-ga added 3 commits May 30, 2024 14:49

suggested fixes working

bc25eaa

Merge remote-tracking branch 'refs/remotes/origin/gmm-demux' into …

06f3541

…gmm-demux

more fixes for comments

d66abd1

mari-ga mentioned this pull request Jun 3, 2024

Update - Hashing Demultiplexing data nf-core/test-datasets#1223

Merged

more fixes from comments

ec6d71a

Merge branch 'master' into gmm-demux

ebc7286

tstoeriko reviewed Jun 5, 2024


mari-ga closed this Jun 24, 2024


