Add Universc (nf-core#1706)
* initialise template for new module: "universc"

* update UniverSC module
  updates metadata, docker container source, licensing, and citations

* update UniverSC module
  unit tests and documentation

* update UniverSC module
  add inputs, outputs and example calls for UniverSC and Cell Ranger v3.0.2
  calls versions for UniverSC and Cell Ranger

* initialise template for new module: "universc"

* update UniverSC module
  updates metadata, docker container source, licensing, and citations

* update UniverSC module
  unit tests and documentation

* update UniverSC module
  add inputs, outputs and example calls for UniverSC and Cell Ranger v3.0.2
  calls versions for UniverSC and Cell Ranger

* resolve formatting issues for UniverSC module

* resolve linting errors for UniverSC module

* fix test jobs to call UniverSC version without errors

* correct configuration for UniverSC test jobs

* correct linting errors for UniverSC module

* correct docker build files for UniverSC

* correct syntax errors in cellranger version call

* prettier docs for UniverSC

* add output to test data for UniverSC module

* update UniverSC module to restructured repo
nf-core#2141

remove files for restructured UniverSC module (avoids duplicate tests)

* define separate outputs for Cell Ranger and UniverSC tests

* remove TODO statements and update UniverSC meta.yml
  defines input and output variables and triggers automated tests

* resolve minor linting issues with UniverSC

* update paths to UniverSC module in test config

* update container versions and tags for UniverSC module

* simplify container configurations for UniverSC

* update configuration for UniverSC tests (build Cell Ranger transcriptome reference first)

* test UniverSC module with Cell Ranger references

* update tests for UniverSC for restructured repository

* update reference inputs for UniverSC module

* set up references for cellranger OS test

* resolve permissions errors for starting UniverSC

* update input arguments for UniverSC and Cell Ranger OS (tests passing)

* correct versions and checksums for UniverSC tests

* resolves linting issues for UniverSC

* resolves linting issues for UniverSC

* update expected test outputs for UniverSC tests

* migrate UniverSC tests to calling open source Cell Ranger
uses Cell Ranger 3.0.2 OS implementation (MIT License)
tests passing locally

* migrate changes to source code to updated UniverSC container

* update unit tests for UniverSC to correct output (using new container)

* update output criteria for UniverSC unit tests

* change output directory for UniverSC unit tests

* test adding podman to GitHub actions (will revert if reviewers object to it)

* correct test errors for Cell Ranger OS tests (UniverSC module)

* array format for test checks (UniverSC and Cell Ranger OS)

* remove unnecessary files from UniverSC module

* remove podman from automated testing

* remove mentions of nf-core/universc container

* call executable script from PATH in UniverSC container

* migrate UniverSC module Cell Ranger OS count own directory

* reorganise UniverSC submodules

* update process names in UniverSC module for consistency

* update formatting for UniverSC tests

* update unit tests for UniverSC submodules running Cell Ranger OS 3.0.2

* reorganise UniverSC submodules to fit naming conventions

* remove stub from UniverSC for testing

* add universc/mkfastq to unit tests

* correct syntax in cellranger module files

* update expected output for universc and universc/count
 tests now consistent with cellranger/count module

* correct syntax for universc/count meta.yaml (pass linting)

* add stub to universc and universc/count

* update unit tests for universc module

* update unit tests for universc module

* update unit tests for universc module

* update unit tests for universc module

* update universc/mkfastq test to run stub

* update universc/mkfastq expected outputs when running stub

* remove trailing whitespace (linting error)

* restructure UniverSC main module

* update configuration to run each UniverSC test once only

* correct UniverSC unit test configuration

* update name of tools in universc tests configuration

* update expected output for Cell Ranger and UniverSC tests

* updates unit tests for Cell Ranger and UniverSC
 uses contain for web summary HTML
 as suggested by @apeltzer nf-core#1706

* updates unit tests for Cell Ranger for web summary HTML

* updates unit tests for Cell Ranger for web summary HTML

* updates unit tests for Cell Ranger for web summary HTML to use description

* correct path to Cell Ranger test output

* update container options for run UniverSC with singularity
  runs without root privileges in writable container

* migrate UniverSC container to mirrored image at nfcore/universc:1.2.4
  adds documentation for image build configuration
  discussed in nf-core#1706

* remove redundant submodules from UniverSC
with functions already supported by Cell Ranger module
nf-core#1706

* migrate UniverSC references to generate by Cell Ranger submodules

* update test configuration for universc/launch

* update expected outputs for UniverSC to use Cell Ranger references

* update expected outputs for UniverSC to use Cell Ranger references

* remove UniverSC submodules for mkref and mkgtf
(already implemented in Cell Ranger module)
discussed in nf-core#1706

* move universc/launch submodule to universc module

* remove tests for UniverSC submodules for mkref and mkgtf
(already implemented in Cell Ranger module)
discussed in nf-core#1706

* move tests for universc/launch submodule to universc module

* migrate universc/launch submodule to universc module

* update paths in unit tests from universc/launch to universc

* update documentation for UniverSC module

* update paths in test config from universc/launch to universc

* restore cellranger module (remove changes from PR 1706)

* restore cellranger module (remove changes from PR 1706)

* restore cellranger module (remove changes from PR 1706)

* update style of documentation to pass linting

* add podman to settings and docs (passes local test)

* test podman configuration

* test podman configuration

* restore changes to testing (removes podman discussed in nf-core#2675)

* restore changes to other modules (removes cellranger discussed in nf-core#2646)

* update podman settings in UniverSC docs

* update podman parameters

* update container version for universc to stable release 1.2.5

* remove conda tests for universc (not supported)

* update container version for universc to latest release 1.2.5.1
(run tests on pushed version on personal account)

* update container version for universc to use nfcore/universc:1.2.5.1 mirror

* exit logic for universc module that doesn't support conda
 consistent with the exit logic of other modules that don't support conda
 nf-core#2657

* trigger GitHub Actions test for tomkellygenetics/universc:1.2.5.1

* add log files to universc output directory (confirm running subroutines as expected)

* correct UniverSC test configuration

 addresses singularity test issue https://github.com/nf-core/modules/actions/runs/3955706571/jobs/6774566021

* update configuration for singularity in universc tests

* test running universc with singularity --fakeroot
 requires shadow-uidmap::newuidmap installed

* update configuration for singularity in universc tests

* debug GH Actions configuration for singularity in universc tests

* test running singularity with --fakeroot write permissions

* test singularity—

* revert changes to singularity tests
   disables singularity for universc (image too large)

* update container settings for universc
  allows running rootless podman or singularity
  using --runtime crun or --writable-tmpfs
  apptainer/singularity#3220

* test universc with singularity --writable-tmpfs

* revert changes to singularity tests (--writable-tmpfs not supported on GH Actions)

* update container settings for universc to call nfcore/universc:1.2.5.1
(pending mirrored version available)

* update version in UniverSC citation

---------

Co-authored-by: Simon Thomas Kelly <simonthomas.kelly@hugp.com>
Co-authored-by: Gisela Gabernet <gisela.gabernet@gmail.com>
Co-authored-by: TomKellyGenetics <tomkellygenetics@gmail>
Co-authored-by: Alexander Peltzer <apeltzer@users.noreply.github.com>
5 people authored and samfulcrum committed Feb 27, 2023
1 parent e460148 commit 76ac8cf
Showing 11 changed files with 427 additions and 0 deletions.
4 changes: 4 additions & 0 deletions .github/workflows/pytest-workflow.yml
@@ -81,6 +81,10 @@ jobs:
            tags: merquryfk/merquryfk
          - profile: "conda"
            tags: merquryfk/ploidyplot
          - profile: "conda"
            tags: universc
          - profile: "singularity"
            tags: universc
          - profile: "conda"
            tags: subworkflows/vcf_annotate_ensemblvep
    env:
51 changes: 51 additions & 0 deletions modules/nf-core/universc/CITATION.cff
@@ -0,0 +1,51 @@
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - given-names: "S. Thomas"
    family-names: "Kelly"
    email: "tom.kelly@riken.jp"
    affiliation: "Center for Integrative Medical Sciences, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
    orcid: "https://orcid.org/0000-0003-3904-6690"
  - family-names: "Battenberg"
    given-names: "Kai"
    email: "kai.battenberg@riken.jp"
    affiliation: "Center for Sustainable Resource Science, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
    orcid: "http://orcid.org/0000-0001-7517-2657"
version: 1.2.5.1
doi: 10.1101/2021.01.19.427209
date-released: 2021-02-14
url: "https://github.com/minoda-lab/universc"
preferred-citation:
  type: article
  authors:
    - given-names: "S. Thomas"
      family-names: "Kelly"
      email: "tom.kelly@riken.jp"
      affiliation: "Center for Integrative Medical Sciences, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
      orcid: "https://orcid.org/0000-0003-3904-6690"
    - family-names: "Battenberg"
      given-names: "Kai"
      email: "kai.battenberg@riken.jp"
      affiliation: "Center for Sustainable Resource Science, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
      orcid: "https://orcid.org/0000-0001-7517-2657"
    - family-names: "Hetherington"
      given-names: "Nicola A."
      affiliation: "Center for Integrative Medical Sciences, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
      orcid: "http://orcid.org/0000-0001-8802-2906"
    - family-names: "Hayashi"
      given-names: "Makoto"
      affiliation: "Center for Sustainable Resource Science, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
      orcid: "http://orcid.org/0000-0001-6389-4265"
    - given-names: "Aki"
      family-names: "Minoda"
      email: "akiko.minoda@riken.jp"
      affiliation: "Center for Integrative Medical Sciences, RIKEN, Suehiro-cho-1-7-22, Tsurumi Ward, Yokohama, Japan"
      orcid: "http://orcid.org/0000-0002-2927-5791"
  doi: "10.1101/2021.01.19.427209"
  title: "UniverSC: a flexible cross-platform single-cell data processing pipeline"
  year: "2021"
  journal: "bioRxiv"
  start: 2021.01.19.427209
  volume:
  issue:
  month: 1
37 changes: 37 additions & 0 deletions modules/nf-core/universc/CITATION.md
@@ -0,0 +1,37 @@
### Citation <span id="Citation"></span>

A submission to a journal and bioRxiv is in progress. Please cite these when
they are available. Currently, the package can be cited
as follows:

Kelly, S.T., Battenberg, K., Hetherington, N.A., Hayashi, M., and Minoda, A. (2021)
UniverSC: a flexible cross-platform single-cell data processing pipeline.
bioRxiv 2021.01.19.427209; doi: [https://doi.org/10.1101/2021.01.19.427209](https://doi.org/10.1101/2021.01.19.427209)
package version 1.2.5.1. [https://github.com/minoda-lab/universc](https://github.com/minoda-lab/universc)

```
@article {Kelly2021.01.19.427209,
author = {Kelly, S. Thomas and Battenberg, Kai and Hetherington, Nicola A. and Hayashi, Makoto and Minoda, Aki},
title = {{UniverSC}: a flexible cross-platform single-cell data processing pipeline},
elocation-id = {2021.01.19.427209},
year = {2021},
doi = {10.1101/2021.01.19.427209},
publisher = {Cold Spring Harbor Laboratory},
abstract = {Single-cell RNA-sequencing analysis to quantify RNA molecules in individual cells has become popular owing to the large amount of information one can obtain from each experiment. We have developed UniverSC (https://github.com/minoda-lab/universc), a universal single-cell processing tool that supports any UMI-based platform. Our command-line tool enables consistent and comprehensive integration, comparison, and evaluation across data generated from a wide range of platforms. Competing Interest Statement: The authors have declared no competing interest.},
eprint = {https://www.biorxiv.org/content/early/2021/01/19/2021.01.19.427209.full.pdf},
journal = {{bioRxiv}},
note = {package version 1.2.5.1},
URL = {https://github.com/minoda-lab/universc},
}
```

```
@Manual{,
title = {{UniverSC}: a flexible cross-platform single-cell data processing pipeline},
author = {S. Thomas Kelly and Kai Battenberg and Nicola A. Hetherington and Makoto Hayashi and Aki Minoda},
year = {2021},
note = {package version 1.2.5.1},
url = {https://github.com/minoda-lab/universc},
}
```
116 changes: 116 additions & 0 deletions modules/nf-core/universc/README.md
@@ -0,0 +1,116 @@
# UniverSC

## Single-cell processing across technologies

UniverSC is an open-source single-cell pipeline that runs across platforms on various technologies.

## Maintainers

Tom Kelly (RIKEN, IMS)

Kai Battenberg (RIKEN CSRS/IMS)

Contact: <first name>.<family name>[at]riken.jp

## Implementation

This container runs Cell Ranger v3.0.2, installed from the MIT-licensed source code on GitHub with
modifications for compatibility with updated dependencies. All software is installed from
open-source repositories and is available for reuse.

It is _not_ subject to the 10X Genomics End User License Agreement (EULA).
This version allows running Cell Ranger v3.0.2 on data generated from any experimental platform
without restrictions. However, updating to newer versions of Cell Ranger, which are subject to the
10X EULA, is not possible without the agreement of 10X Genomics.

To comply with licensing and respect 10X Genomics Trademarks, the 10X Genomics logo
has been removed from HTML reports, the tool has been renamed, and proprietary
closed-source tools to build Cloupe files are disabled.

It is still sufficient to generate summary reports and count matrices compatible with
single-cell analysis tools available for 10X Genomics and Cell Ranger output format
in Python and R packages.

## Usage

### Generating References

The Cell Ranger modules can be used to generate reference indexes to run UniverSC.
Note that UniverSC requires the open-source version v3.0.2 of Cell Ranger included
in the nfcore/universc Docker image. The same module parameters can be used, provided
that the container is changed in the process configuration (modify nextflow.config).

```
process {
    ...
    withName: CELLRANGER_MKGTF {
        container = "nfcore/universc:1.2.5.1"
    }
    withName: CELLRANGER_MKREF {
        container = "nfcore/universc:1.2.5.1"
    }
    ...
}
```

This will generate a compatible index for UniverSC using the same version of the
STAR aligner and a permissive software license without an EULA.

### Container settings

The cellranger install directory must be writable to run UniverSC.
With docker or podman this can be achieved with the `--user root` option in the container parameters,
and with singularity with the `--writable` or `--writable-tmpfs` parameter.

The following defaults are set in universc/main.nf:

```
container "nfcore/universc:1.2.5.1"
if (workflow.containerEngine == 'docker'){
    containerOptions = "--privileged"
}
if (workflow.containerEngine == 'podman'){
    containerOptions = "--runtime crun --userns=keep-id --systemd=always"
}
if (workflow.containerEngine == 'singularity'){
    containerOptions = "-B /var/tmp --writable-tmpfs"
    params.singularity_autoMounts = true
}
```

Select the container engine with `nextflow run ... -profile docker` (Nextflow options take a single dash),
or set the `PROFILE` environment variable to one of the following before running nextflow.

```
export PROFILE="docker"
export PROFILE="podman"
export PROFILE="singularity"
```

Note that due to dependencies installed in a docker image, it is not possible to use conda environments.
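
The module enforces this by exiting when the active profile list contains `conda` or `mamba`.
A minimal shell sketch of that check, assuming the profiles arrive as a comma-separated string
(the helper name `uses_conda_profile` is hypothetical and not part of the module):

```shell
# Hypothetical helper: succeeds (status 0) when a comma-separated
# profile list contains "conda" or "mamba" as a whole entry.
uses_conda_profile() {
    local profiles="$1" profile
    local IFS=','
    # Unquoted on purpose so the list is split on commas.
    for profile in $profiles; do
        case "$profile" in
            conda|mamba) return 0 ;;
        esac
    done
    return 1
}

# Example: a run started with the profiles "test,conda" would be rejected.
if uses_conda_profile "test,conda"; then
    echo "UNIVERSC module does not support Conda. Please use Docker / Singularity / Podman instead."
fi
```

Entries are compared whole, so a profile named `miniconda` would not trigger the check.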

## Disclaimer

We are third party developers not affiliated with 10X Genomics or any other vendor of
single-cell technologies. We are releasing this code on an open-source license which calls Cell Ranger
as an external dependency.

## Licensing

This package is provided open-source on a GPL-3 license. This means that you are free to use and
modify this code provided that derivative works also carry this license.

## Updating the package

The tomkellygenetics/universc:<VERSION> container is automatically updated with tomkellygenetics/universc:latest.

A stable release is mirrored at nfcore/universc:1.2.5.1 and will be updated as needed.

To build an updated container use the Dockerfile provided here:

[https://github.com/minoda-lab/universc/blob/master/Dockerfile](https://github.com/minoda-lab/universc/blob/master/Dockerfile)

Note that this uses a custom base image which is built with an open-source implementation of
Cell Ranger v3.0.2 on MIT License and relies on Python 2. The build file can be found here:

[https://github.com/TomKellyGenetics/cellranger_clean/blob/master/Dockerfile](https://github.com/TomKellyGenetics/cellranger_clean/blob/master/Dockerfile)
76 changes: 76 additions & 0 deletions modules/nf-core/universc/main.nf
@@ -0,0 +1,76 @@
process UNIVERSC {
    tag "$meta.id"
    label 'process_medium'

    // Exit if running this module with -profile conda / -profile mamba
    if (workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1) {
        exit 1, "UNIVERSC module does not support Conda. Please use Docker / Singularity / Podman instead."
    }
    container "nfcore/universc:1.2.5.1"
    if (workflow.containerEngine == 'docker'){
        containerOptions = "--privileged"
    }
    if ( workflow.containerEngine == 'podman'){
        containerOptions = "--runtime crun --userns=keep-id --systemd=always"
    }
    if (workflow.containerEngine == 'singularity'){
        containerOptions = "-B /var/tmp --writable-tmpfs"
        params.singularity_autoMounts = true
    }

    input:
    tuple val(meta), path(reads)
    path reference


    output:
    tuple val(meta), path("sample-${meta.id}/outs/*"), emit: outs
    path "versions.yml"                              , emit: versions

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def sample_arg = meta.samples.unique().join(",")
    def reference_name = reference.name
    def input_reads = meta.single_end ? "--file $reads" : "-R1 ${reads[0]} -R2 ${reads[1]}"
    """
    universc \\
        --id 'sample-${meta.id}' \\
        ${input_reads} \\
        --technology '${meta.technology}' \\
        --chemistry '${meta.chemistry}' \\
        --reference ${reference_name} \\
        --description ${sample_arg} \\
        --jobmode "local" \\
        --localcores ${task.cpus} \\
        --localmem ${task.memory.toGiga()} \\
        --per-cell-data \\
        $args 1> _log 2> _err
    # save log files
    echo !! > sample-${meta.id}/outs/_invocation
    cp _log sample-${meta.id}/outs/_log
    cp _err sample-${meta.id}/outs/_err
    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        cellranger: \$(echo \$(cellranger count --version 2>&1 | head -n 2 | tail -n 1 | sed 's/^.* //g' | sed 's/(//g' | sed 's/)//g' ))
        universc: \$(echo \$(bash /universc/launch_universc.sh --version | grep version | grep universc | sed 's/^.* //g' ))
    END_VERSIONS
    """


    stub:
    """
    mkdir -p "sample-${meta.id}/outs/"
    touch sample-${meta.id}/outs/fake_file.txt
    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        cellranger: \$(echo \$(cellranger count --version 2>&1 | head -n 2 | tail -n 1 | sed 's/^.* //g' | sed 's/(//g' | sed 's/)//g' ))
        universc: \$(echo \$(bash /universc/launch_universc.sh --version | grep version | grep universc | sed 's/^.* //g' ))
    END_VERSIONS
    """
}
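
Both the `script` and `stub` blocks record the Cell Ranger version by trimming the text printed by `cellranger count --version`. The sketch below replays that `head`/`tail`/`sed` chain on a made-up sample so the effect of each stage is visible. The sample text and the function name `extract_cellranger_version` are assumptions for illustration, not captured tool output.

```shell
# Hypothetical wrapper around the same filter chain used in main.nf.
extract_cellranger_version() {
    # Keep the second line, drop everything up to the last space,
    # then strip the surrounding parentheses.
    head -n 2 | tail -n 1 | sed 's/^.* //g' | sed 's/(//g' | sed 's/)//g'
}

# Assumed shape of the version banner (illustrative only).
sample_output='/opt/cellranger/bin
cellranger count (3.0.2)
Copyright notice line'

version=$(printf '%s\n' "$sample_output" | extract_cellranger_version)
echo "$version"   # prints 3.0.2
```

The chain depends on the version appearing as the last space-separated token on the second line, so a change in the banner layout would silently yield a different string.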
42 changes: 42 additions & 0 deletions modules/nf-core/universc/meta.yml
@@ -0,0 +1,42 @@
name: "universc"
description: Module to run UniverSC, an open-source pipeline to demultiplex and process single-cell RNA-Seq data
keywords:
  - demultiplex
  - align
  - single-cell
  - scRNA-Seq
  - count
  - umi
tools:
  - "universc":
      description: "UniverSC: a flexible cross-platform single-cell data processing pipeline"
      homepage: "https://hub.docker.com/r/tomkellygenetics/universc"
      documentation: "https://raw.githubusercontent.com/minoda-lab/universc/master/man/launch_universc.sh"
      tool_dev_url: "https://github.com/minoda-lab/universc"
      doi: "https://doi.org/10.1101/2021.01.19.427209"
      licence: ["GPL-3.0-or-later"]

input:
  - meta:
      type: map
      description: |
        Groovy Map containing sample information
        e.g. [ id:'test', single_end:false ]
  - reads:
      type: file
      description: FASTQ or FASTQ.GZ file, list of 2 files for paired-end data
      pattern: "*.{fastq,fq,fastq.gz,fq.gz}"

output:
  - outs:
      type: file
      description: Files containing the outputs of Cell Ranger
      pattern: "sample-${meta.id}/outs/*"
  - versions:
      type: file
      description: File containing software version
      pattern: "versions.yml"

authors:
  - "@kbattenb"
  - "@tomkellygenetics"
4 changes: 4 additions & 0 deletions tests/config/nextflow.config
@@ -19,6 +19,10 @@ if ("$PROFILE" == "singularity") {
} else if ("$PROFILE" == "mamba") {
    conda.enabled = true
    conda.useMamba = true
} else if ("$PROFILE" == "podman") {
    podman.enabled = true
    podman.userEmulation = true
    podman.runOptions = "--runtime crun --platform linux/x86_64 --systemd=always"
} else {
    docker.enabled = true
    docker.userEmulation = true
4 changes: 4 additions & 0 deletions tests/config/pytest_modules.yml
@@ -3240,6 +3240,10 @@ unicycler:
  - modules/nf-core/unicycler/**
  - tests/modules/nf-core/unicycler/**

universc:
  - modules/nf-core/universc/**
  - tests/modules/nf-core/universc/**

untar:
  - modules/nf-core/untar/**
  - tests/modules/nf-core/untar/**
33 changes: 33 additions & 0 deletions tests/modules/nf-core/universc/main.nf
@@ -0,0 +1,33 @@
#!/usr/bin/env nextflow

nextflow.enable.dsl = 2

include { CELLRANGER_MKGTF } from '../../../../modules/nf-core/cellranger/mkgtf/main.nf'
include { CELLRANGER_MKREF } from '../../../../modules/nf-core/cellranger/mkref/main.nf'
include { UNIVERSC } from '../../../../modules/nf-core/universc/main.nf'

workflow test_universc_10x {

    input = [ [ id:'123', technology:'10x', chemistry:'SC3Pv3', single_end:false, strandedness:'forward', samples: ["test_10x"] ], // meta map
        [ file(params.test_data['homo_sapiens']['illumina']['test_10x_1_fastq_gz'], checkIfExists: true),
          file(params.test_data['homo_sapiens']['illumina']['test_10x_2_fastq_gz'], checkIfExists: true)
        ]
    ]

    fasta = file(params.test_data['homo_sapiens']['genome']['genome_fasta'], checkIfExists: true)
    gtf = file(params.test_data['homo_sapiens']['genome']['genome_gtf'], checkIfExists: true)
    reference_name = "homo_sapiens_chr22_reference"

    CELLRANGER_MKGTF ( gtf )

    CELLRANGER_MKREF (
        fasta,
        CELLRANGER_MKGTF.out.gtf,
        reference_name
    )

    UNIVERSC (
        input,
        CELLRANGER_MKREF.out.reference
    )
}