-
Notifications
You must be signed in to change notification settings - Fork 717
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GMM-Demux #5641
Add GMM-Demux #5641
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See below,
e.g. `[ id:'sample1', single_end:false ]` | ||
- cell_hashing_matrix: | ||
type: file | ||
description: path to file containing matrix from cell hashing data, the tool receives either CSV files or TSV, type must be specified using parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the typue MUST be specified using parameters, then this is OK to have as an input cahnnel. Everythin else via ext.args
(as I've said a few times above 😬 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, in this case this are the input files, originally the tool gets only one path to the directory where these 3 files are stored.
nf-core cannot receive the path to the directory stored in test-datasets, that's why we need the 3 paths to create an intermediate folder which is later given as input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the tool is noramlly meant to accept a directory:
- Package the three test files into a directory and gzip it
- Upload to test-datasets
- Add to the test a 'setup' block where you specify the gzip archive and use the GUNZIP module to extract it
- Pass GUNZIP.out.archive (or w/e it is) as the (single) input to the module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the commits includes changes in a bunch of files from other modules (cellranger, custom, mygene, rnatranscripts). Unless I'm missing something, I don't think these changes belong in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this isn't good...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a autoformatting thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very possible, I didn't touch any of those files 🤔 anything I can do to revert it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is what you are looking for: https://stackoverflow.com/questions/215718/how-can-i-reset-or-revert-a-file-to-a-specific-revision
modules/nf-core/gmmdemux/main.nf
Outdated
tuple val(meta), path('test/barcodes.tsv.gz'), emit: barcodes | ||
tuple val(meta), path('test/*.mtx.gz' ), emit: matrix | ||
tuple val(meta), path('test/features.tsv.gz'), emit: features |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does the test
in the path come from? It looks like all output files are currently being published in a directory called test
within the actual output directory. Does the --output
/-o
flag only work for that one file? If you cannot specify the output directory properly, you might want to move the files to the right place after generating them (and delete unnecessary directories).
I think I would either
(1) rename all output files to contain the ${prefix}
and publish them in the current directory OR
(2) put the .tsv.gz
/mtx.gz
files into a directory named ${prefix}
(which kallisto quant or salmon quant seem to be doing)
Not sure which one would generally be preferred though.
tuple val(meta), path('test/barcodes.tsv.gz'), emit: barcodes | |
tuple val(meta), path('test/*.mtx.gz' ), emit: matrix | |
tuple val(meta), path('test/features.tsv.gz'), emit: features | |
tuple val(meta), path('barcodes.tsv.gz'), emit: barcodes | |
tuple val(meta), path('*.mtx.gz' ), emit: matrix | |
tuple val(meta), path('features.tsv.gz'), emit: features |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all of these outputs also generated, when the -x
flag is used? If not, they need to be optional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, but these are generated by the tool, even if no other parameters are added inside a default folder like the author says:
Output
The default content in the output folder are the non-MSM droplets (SSDs), stored in MTX format. The output shares the same format with CellRanger 3.0. By default, the output is stored in SSD_mtx folder. The output location can be overwritten with the -o flag.
modules/nf-core/gmmdemux/main.nf
Outdated
val examine | ||
val ambigous | ||
val extract | ||
val skip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't some of these (e.g. skip
) be paths?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for all of my comments, this module seems to be a bit tedious to implement. I mainly left comments on main.nf
for now, cause I think it makes sense to get this sorted out before I look at the tests in more detail.
mkdir test | ||
touch test/barcodes.tsv.gz | ||
touch test/features.tsv.gz | ||
touch test/matrix.mtx.gz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mkdir test | |
touch test/barcodes.tsv.gz | |
touch test/features.tsv.gz | |
touch test/matrix.mtx.gz | |
touch barcodes.tsv.gz | |
touch features.tsv.gz | |
touch matrix.mtx.gz |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried with the suggested lines, but it crashes the test for stub
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the paths just need to be updated to match the new output patterns, then it should hopefully work.
mkdir test | |
touch test/barcodes.tsv.gz | |
touch test/features.tsv.gz | |
touch test/matrix.mtx.gz | |
mkdir "${prefix}" | |
touch "${prefix}/barcodes.tsv.gz" | |
touch "${prefix}/features.tsv.gz" | |
touch "${prefix}/matrix.mtx.gz" | |
touch "${prefix}/classification_report_${prefix}" | |
touch "summary_report_${prefix}.txt" |
tuple val(meta), path("${meta.id}/barcodes.tsv.gz" ), emit: barcodes | ||
tuple val(meta), path("${meta.id}/*.mtx.gz" ), emit: matrix | ||
tuple val(meta), path("${meta.id}/features.tsv.gz" ), emit: features | ||
tuple val(meta), path("${meta.id}/classification_report_${meta.id}" ), emit: classification_report | ||
tuple val(meta), path("summary_report_${meta.id}.txt" ), emit: summary_report, optional: true | ||
path "versions.yml" , emit: versions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tuple val(meta), path("${meta.id}/barcodes.tsv.gz" ), emit: barcodes | |
tuple val(meta), path("${meta.id}/*.mtx.gz" ), emit: matrix | |
tuple val(meta), path("${meta.id}/features.tsv.gz" ), emit: features | |
tuple val(meta), path("${meta.id}/classification_report_${meta.id}" ), emit: classification_report | |
tuple val(meta), path("summary_report_${meta.id}.txt" ), emit: summary_report, optional: true | |
path "versions.yml" , emit: versions | |
tuple val(meta), path("${prefix}/barcodes.tsv.gz") , emit: barcodes | |
tuple val(meta), path("${prefix}/*.mtx.gz") , emit: matrix | |
tuple val(meta), path("${prefix}/features.tsv.gz") , emit: features | |
tuple val(meta), path("${prefix}/classification_report_${prefix}"), emit: classification_report | |
tuple val(meta), path("summary_report_${prefix}.txt") , emit: summary_report, optional: true | |
path "versions.yml" , emit: versions |
if [[ ${summary_report} == true ]]; then | ||
cat /dev/null > summary_report_${prefix}.txt | ||
echo "summary report file created" | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary because the tool itself does not create the summary report file?
script: | ||
def args = task.ext.args ?: '' | ||
def prefix = task.ext.prefix ?: "${meta.id}" | ||
def type_report = type_report ? "-f test/classification_report_${prefix}" : "-s test/classification_report_${prefix}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def type_report = type_report ? "-f test/classification_report_${prefix}" : "-s test/classification_report_${prefix}" | |
def type_report = type_report ? "-f ${prefix}/classification_report_${prefix}" : "-s ${prefix}/classification_report_${prefix}" |
@mari-ga I have a feeling this PR would be better to be started from scratch due to the large number of conflicts comments... would this be OK? |
PR checklist
Closes #XXX
versions.yml
file.label
nf-core modules test <MODULE> --profile docker
nf-core modules test <MODULE> --profile singularity
nf-core modules test <MODULE> --profile conda
nf-core subworkflows test <SUBWORKFLOW> --profile docker
nf-core subworkflows test <SUBWORKFLOW> --profile singularity
nf-core subworkflows test <SUBWORKFLOW> --profile conda