Cellranger vdj #289

Merged
merged 38 commits into from
Mar 7, 2024
Changes from 30 commits
385364c
split bulk/sc raw
mapo9 Dec 5, 2023
5c3f677
added cellranger reference, cellranger vdj
mapo9 Dec 5, 2023
e53237d
added cellranger vdj
mapo9 Dec 7, 2023
f35e3d7
added cellranger vdj
mapo9 Dec 7, 2023
cf9090c
Merge branch 'dev' into cellranger_vdj
ggabernet Feb 8, 2024
f8b6a0f
Update nextflow_schema.json
mapo9 Feb 8, 2024
747d4ba
Update nextflow_schema.json
mapo9 Feb 8, 2024
d052124
Update subworkflows/local/repertoire_analysis_reporting.nf
mapo9 Feb 8, 2024
1adb1a7
Update subworkflows/local/sc_raw_input.nf
mapo9 Feb 8, 2024
d8cee70
Update workflows/airrflow.nf
mapo9 Feb 8, 2024
40a53d6
single cell based on parameter
mapo9 Feb 9, 2024
75d05f7
fixed samplesheet_val
mapo9 Feb 9, 2024
90a2ef4
linting
mapo9 Feb 9, 2024
f048861
lint
mapo9 Feb 9, 2024
9d8598d
lint
mapo9 Feb 9, 2024
42ccb1e
pre-commit
mapo9 Feb 9, 2024
12bdef9
pre-commit
mapo9 Feb 9, 2024
62f4643
test ready
mapo9 Feb 12, 2024
2917dc2
more stuff for test
mapo9 Feb 12, 2024
1e44e33
paths to test data
mapo9 Feb 12, 2024
293ba60
prettier
mapo9 Feb 12, 2024
3e52ef3
broken link
mapo9 Feb 12, 2024
ebbd620
Merge branch 'dev' into cellranger_vdj
ggabernet Feb 16, 2024
bc1e316
avoid creating extra param
ggabernet Feb 16, 2024
2bb6952
merge fastqs with multiple lanes
ggabernet Feb 19, 2024
f1caa7b
fix text
ggabernet Feb 19, 2024
7b8b8c9
fix lint
ggabernet Feb 19, 2024
68a7329
fix metadata merge
ggabernet Feb 19, 2024
afa4c1d
add collect
ggabernet Feb 20, 2024
a0ad699
Merge pull request #305 from ggabernet/cellranger_vdj
ggabernet Feb 27, 2024
ed44d9a
delete local mkvdjref
mapo9 Feb 28, 2024
637dd94
readme update 01
mapo9 Feb 28, 2024
541904d
docs update
mapo9 Feb 29, 2024
cdef84f
Update README.md
mapo9 Mar 6, 2024
ec048d9
Update CHANGELOG.md
mapo9 Mar 6, 2024
6f0844b
Update README.md
mapo9 Mar 6, 2024
575b41d
Update README.md
mapo9 Mar 6, 2024
184d559
prettier
mapo9 Mar 6, 2024
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -56,6 +56,7 @@ jobs:
           "test_fetchimgt",
           "test_assembled_hs",
           "test_assembled_mm",
+          "test_10x_sc",
           "test_clontech_umi",
           "test_nebnext_umi",
         ]
1 change: 1 addition & 0 deletions .gitignore
@@ -11,3 +11,4 @@ package-lock.json
 .idea/
 nf-params.json
 .vscode/
+tests/
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.

 - [#294](https://github.com/nf-core/airrflow/pull/294) Merge template updates nf-core/tools v2.11.1
 - [#299](https://github.com/nf-core/airrflow/pull/299) Add profile for common NEB and TAKARA protocols
+- Add possibility to merge multi-lane samples when starting from fastq files

 ### `Fixed`

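The multi-lane merge noted in the changelog entry can be pictured as grouping samplesheet rows by sample ID and concatenating each group's gzipped FASTQ files. The following is a minimal Python sketch under that assumption, not the pipeline's implementation (the PR installs the nf-core `cat/fastq` module for this step). The helper names `group_by_sample` and `concat_gzip` and the file names are hypothetical. It relies on the fact that concatenated gzip members form a valid gzip stream.

```python
import gzip
import os
import tempfile
from collections import defaultdict


def group_by_sample(rows):
    """Group (sample_id, fastq_path) pairs, preserving lane order."""
    groups = defaultdict(list)
    for sample_id, path in rows:
        groups[sample_id].append(path)
    return dict(groups)


def concat_gzip(paths, out_path):
    """Byte-wise concatenation; gzip readers decode all members in sequence."""
    with open(out_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                out.write(src.read())


# Illustrative data: two lanes for one hypothetical sample "S1".
workdir = tempfile.mkdtemp()
lanes = []
for lane, read in (("L001", "@r1\nACGT\n+\nIIII\n"), ("L002", "@r2\nTTGG\n+\nIIII\n")):
    path = os.path.join(workdir, f"S1_{lane}_R1.fastq.gz")
    with gzip.open(path, "wt") as fh:
        fh.write(read)
    lanes.append(("S1", path))

groups = group_by_sample(lanes)
merged_path = os.path.join(workdir, "S1_merged_R1.fastq.gz")
concat_gzip(groups["S1"], merged_path)

# Reading the merged file back yields the reads of both lanes, in order.
with gzip.open(merged_path, "rt") as fh:
    merged_text = fh.read()
```

Concatenating the compressed files directly avoids decompressing and recompressing each lane.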
14 changes: 6 additions & 8 deletions bin/check_samplesheet.py
@@ -124,11 +124,6 @@ def check_samplesheet(file_in, assembled):
                 )
             )
     else:
-        if any(tab["single_cell"].tolist()):
-            print_error(
-                "Some single cell column values are TRUE. The raw mode only accepts bulk samples. If processing single cell samples, please set the `--mode assembled` flag, and provide an AIRR rearrangement as input."
-            )
-
         for col in required_columns_raw:
             if col not in header:
                 print("ERROR: Please check samplesheet header: {} ".format(",".join(header)))
@@ -165,9 +160,12 @@ def check_samplesheet(file_in, assembled):

     ## Check that sample ids are unique
     if len(tab["sample_id"]) != len(set(tab["sample_id"])):
-        print_error(
-            "Sample IDs are not unique! The sample IDs in the input samplesheet should be unique for each sample."
-        )
+        if assembled:
+            print_error(
+                "Sample IDs are not unique! The sample IDs in the input samplesheet should be unique for each sample."
+            )
+        else:
+            print("WARNING: Sample IDs are not unique! FastQs with the same sample ID will be merged.")

     ## Check that pcr_target_locus is IG or TR
     for val in tab["pcr_target_locus"]:
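The change to the sample ID check can be summarised as: duplicate IDs remain an error for assembled input but only produce a warning for raw FASTQ input, where they now signal lanes to merge. The sketch below restates that rule in standalone Python. It is an illustration only: the function name `check_unique_sample_ids` is hypothetical, and it substitutes an exception and the `warnings` module for the script's `print_error` and `print` calls.

```python
import warnings


def check_unique_sample_ids(sample_ids, assembled):
    """Return True when IDs are unique; otherwise raise (assembled) or warn (raw)."""
    if len(sample_ids) == len(set(sample_ids)):
        return True
    if assembled:
        # Assembled mode: every row must describe a distinct sample.
        raise ValueError("Sample IDs are not unique!")
    # Raw mode: repeated IDs are treated as lanes of the same sample.
    warnings.warn("Sample IDs are not unique! FastQs with the same sample ID will be merged.")
    return False


# Raw mode with a repeated ID: a warning is recorded and processing continues.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    raw_ok = check_unique_sample_ids(["S1", "S1", "S2"], assembled=False)
```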
11 changes: 6 additions & 5 deletions bin/reveal_add_metadata.R
@@ -61,8 +61,12 @@ if (!("INPUTID" %in% names(opt))) {
 # Read metadata file
 metadata <- read.csv(opt$METADATA, sep = "\t", header = TRUE, stringsAsFactors = F)

+# Merging samples over multiple lanes introduces multi-rows per sample
+# We expect only one row per sample
 metadata <- metadata %>%
-    filter(sample_id == opt$INPUTID)
+    dplyr::filter(sample_id == opt$INPUTID) %>%
+    dplyr::select(!starts_with("filename_")) %>%
+    dplyr::distinct()

 if (nrow(metadata) != 1) {
     stop("Expecting nrow(metadata) == 1; nrow(metadata) == ", nrow(metadata), " found")
@@ -81,10 +85,7 @@ internal_fields <-
         "id",
         "filetype",
         "valid_single_cell",
-        "valid_pcr_target_locus",
-        "filename_R1",
-        "filename_R2",
-        "filename_I1"
+        "valid_pcr_target_locus"
     )
 metadata <- metadata[, !colnames(metadata) %in% internal_fields]

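The metadata change above reduces several per-lane rows to a single row per sample by filtering on the sample ID, dropping the `filename_*` columns, and removing duplicates. A minimal Python sketch of the same three steps follows. The function name `metadata_for_sample` and the example column `subject_id` are assumptions made for illustration, and plain dictionaries stand in for the R data frame.

```python
def metadata_for_sample(rows, sample_id):
    """Keep rows for one sample, drop filename_* columns, and de-duplicate."""
    seen = []
    for row in rows:
        if row["sample_id"] != sample_id:
            continue
        # Lane-specific file names are the only columns expected to differ.
        trimmed = {k: v for k, v in row.items() if not k.startswith("filename_")}
        if trimmed not in seen:
            seen.append(trimmed)
    if len(seen) != 1:
        raise ValueError(f"Expecting 1 metadata row; {len(seen)} found")
    return seen[0]


# Hypothetical samplesheet: sample S1 was sequenced over two lanes.
rows = [
    {"sample_id": "S1", "subject_id": "P1", "filename_R1": "S1_L001_R1.fastq.gz"},
    {"sample_id": "S1", "subject_id": "P1", "filename_R1": "S1_L002_R1.fastq.gz"},
    {"sample_id": "S2", "subject_id": "P2", "filename_R1": "S2_L001_R1.fastq.gz"},
]
s1 = metadata_for_sample(rows, "S1")
```

If two lanes of the same sample disagree in any non-filename column, de-duplication leaves more than one row and the row-count check still fails, which surfaces inconsistent metadata instead of silently picking one row.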
28 changes: 28 additions & 0 deletions conf/test_10x_sc.config
@@ -0,0 +1,28 @@
/*
* -------------------------------------------------
* Nextflow config file for running tests
* -------------------------------------------------
* Defines bundled input files and everything required
* to run a fast and simple test. Use as follows:
* nextflow run nf-core/airrflow -profile test_10x_sc,<docker/singularity>
*/

params {
config_profile_name = 'Test 10xGenomics single cell data'
config_profile_description = 'Minimal test dataset to check pipeline function with raw single cell data from 10xGenomics'

// Limit resources so that this can run on GitHub Actions
max_cpus = 2
max_memory = 6.GB
max_time = 48.h

// params
mode = 'fastq'
library_generation_method = 'sc_10x_genomics'
clonal_threshold = 0


// Input data
input = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-sc/10x_sc_raw.tsv'
reference_10x = 'https://raw.githubusercontent.com/nf-core/test-datasets/airrflow/testdata-sc/refdata-cellranger-vdj-GRCh38-alts-ensembl-5.0.0.tar.gz'
}
15 changes: 15 additions & 0 deletions modules.json
@@ -5,6 +5,21 @@
     "https://github.com/nf-core/modules.git": {
       "modules": {
         "nf-core": {
+          "cat/fastq": {
+            "branch": "master",
+            "git_sha": "02fd5bd7275abad27aad32d5c852e0a9b1b98882",
+            "installed_by": ["modules"]
+          },
+          "cellranger/mkvdjref": {
+            "branch": "master",
+            "git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5",
+            "installed_by": ["modules"]
+          },
+          "cellranger/vdj": {
+            "branch": "master",
+            "git_sha": "3f5420aa22e00bd030a2556dfdffc9e164ec0ec5",
+            "installed_by": ["modules"]
+          },
           "custom/dumpsoftwareversions": {
             "branch": "master",
             "git_sha": "8ec825f465b9c17f9d83000022995b4f7de6fe93",
37 changes: 37 additions & 0 deletions modules/local/cellranger/mkvdjref.nf
@@ -0,0 +1,37 @@
process CELLRANGER_MKVDJREF {
tag "$reference_name"
label 'process_high'

container "nf-core/cellranger:7.1.0"

input:
path fasta
// path gtf // Could be used to decide whether --seqs or --genes should be used, if it's possible to make gtf an optional input.
val reference_name

output:
path "${reference_name}", emit: reference
path "versions.yml" , emit: versions

when:
task.ext.when == null || task.ext.when

script:
// Exit if running this module with -profile conda / -profile mamba
if (workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1) {
error "CELLRANGER_MKVDJREF module does not support Conda. Please use Docker / Singularity / Podman instead."
}
def args = task.ext.args ?: ''
"""
cellranger \\
mkvdjref \\
--genome=$reference_name \\
--seqs=$fasta \\
$args

cat <<-END_VERSIONS > versions.yml
"${task.process}":
cellranger: \$(echo \$( cellranger --version 2>&1) | sed 's/^.*[^0-9]\\([0-9]*\\.[0-9]*\\.[0-9]*\\).*\$/\\1/' )
END_VERSIONS
"""
}
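The Conda guard in the script block splits the comma-separated profile string and checks whether it shares any entry with `conda` or `mamba`. A small Python sketch of that check follows, using set intersection in place of Groovy's `tokenize(',').intersect(...)`. The function name `uses_conda` is hypothetical.

```python
def uses_conda(profile):
    """True if a comma-separated profile string names conda or mamba."""
    return bool(set(profile.split(",")) & {"conda", "mamba"})


# Profiles are matched as whole entries, not as substrings.
docker_run = uses_conda("test_10x_sc,docker")
conda_run = uses_conda("test_10x_sc,conda")
```

Matching whole entries means a profile that merely contains the text, such as a hypothetical `miniconda`, does not trigger the guard.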
29 changes: 29 additions & 0 deletions modules/local/unzip_cellrangerdb.nf
@@ -0,0 +1,29 @@
process UNZIP_CELLRANGERDB {
tag "unzip_cellrangerdb"
label 'process_single'

conda "${moduleDir}/environment.yml"
container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
'https://depot.galaxyproject.org/singularity/ubuntu:20.04' :
'nf-core/ubuntu:20.04' }"

input:
path(archive)

output:
path("$unzipped") , emit: unzipped
path "versions.yml", emit: versions

script:
unzipped = archive.toString() - '.tar.gz'
"""
echo "${unzipped}"

tar -xzvf ${archive}

cat <<-END_VERSIONS > versions.yml
"${task.process}":
unzip_cellrangerdb: \$(echo \$(tar --version 2>&1) | sed 's/^.*(GNU tar) //; s/ Copyright.*\$//')
END_VERSIONS
"""
}
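The process above derives its output name by removing `.tar.gz` from the archive name and then extracts the archive, so it implicitly assumes the tarball's top-level directory has the same name as the archive minus its extension. The Python sketch below illustrates that assumption with the standard `tarfile` module. The function names `unpacked_name` and `extract` and the demo archive are hypothetical, and the sketch strips only a trailing suffix, whereas Groovy's string subtraction removes the first occurrence of the substring.

```python
import os
import tarfile
import tempfile


def unpacked_name(archive_name):
    """Strip a trailing '.tar.gz' to get the expected directory name."""
    suffix = ".tar.gz"
    return archive_name[: -len(suffix)] if archive_name.endswith(suffix) else archive_name


def extract(archive_path, dest):
    """Extract the archive into dest and return the expected output path."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest)
    return os.path.join(dest, unpacked_name(os.path.basename(archive_path)))


# Build a small demo archive whose top-level directory matches its file name.
src = tempfile.mkdtemp()
os.makedirs(os.path.join(src, "refdata-demo"))
with open(os.path.join(src, "refdata-demo", "regions.fa"), "w") as fh:
    fh.write(">demo\nACGT\n")
archive = os.path.join(src, "refdata-demo.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    tar.add(os.path.join(src, "refdata-demo"), arcname="refdata-demo")

out_dir = extract(archive, tempfile.mkdtemp())
```

If the archive's top-level directory is named differently from the archive, the derived path does not exist after extraction, which in the Nextflow process would show up as a missing output.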
7 changes: 7 additions & 0 deletions modules/nf-core/cat/fastq/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

80 changes: 80 additions & 0 deletions modules/nf-core/cat/fastq/main.nf


42 changes: 42 additions & 0 deletions modules/nf-core/cat/fastq/meta.yml


5 changes: 5 additions & 0 deletions modules/nf-core/cellranger/mkvdjref/environment.yml


38 changes: 38 additions & 0 deletions modules/nf-core/cellranger/mkvdjref/main.nf
