Add multiome support (scATAC + scRNA) #174

heylf · 2022-11-02T12:23:16Z

Description of feature

Just wanted to put it down and mention that I am currently working on the implementation of cellranger-arc (modules + subworkflow). It may be worth discussing whether this is fine to integrate into scrnaseq or whether this should be its own pipeline, as it requires a different samplesheet format and different input checks.

grst · 2022-11-04T23:50:45Z

Could you elaborate on what the samplesheet would need to look like? If it's just about having additional columns, I think it would be fine.

More generally, we should think about which modalities (ATAC, CITE, VDJ, spatial, ...) we want to support in the future and which should be processed by the same workflow.

heylf · 2022-11-07T11:33:35Z

I will implement cellranger-arc in scrnaseq and then see how it goes. But yes, it would be nice to discuss the modalities.

apeltzer · 2022-11-08T23:09:50Z

Would still be great to discuss here before merging an entire subworkflow first 😉

heylf · 2022-11-09T09:33:12Z

Technically, cellranger-arc needs a samplesheet (lib.csv) as input, which looks like this:

fastqs,sample,library_type
/home/jdoe/runs/HNGEXSQXXX/outs/fastq_path,example,Gene Expression
/home/jdoe/runs/HNATACSQXX/outs/fastq_path,example,Chromatin Accessibility

Thus, lib.csv defines the folder locations of the scRNA and scATAC data for the sample.

My thinking was the following:

1. Keep the definitions, subworkflows, and modules for the samplesheet as they are currently defined for scrnaseq.
2. Add two new optional columns to the samplesheet: folder_GEX and folder_ATAC.
3. Write a separate input_check_multiome.nf to create an input channel for scrnaseq.nf that emits [meta, [folders]] instead of [meta, [reads]].
4. Write a module and script to generate the lib.csv for cellranger based on the input channel from step 3.
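
The last step, generating lib.csv from the collected folders, could look roughly like the sketch below. It is only an illustration: the function name `build_libraries_csv` and the dict layout for the folders are hypothetical, and only the column header and the two library type labels are taken from the lib.csv example above.

```python
import csv
import io

# Library type labels taken from the lib.csv example in this thread.
LIBRARY_TYPES = {"gex": "Gene Expression", "atac": "Chromatin Accessibility"}


def build_libraries_csv(sample, folders):
    """Return lib.csv content for one sample.

    `folders` maps a modality key ("gex" or "atac") to a FASTQ folder.
    The function name and the dict layout are hypothetical.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["fastqs", "sample", "library_type"])
    for modality, folder in folders.items():
        # An unknown modality raises KeyError instead of writing a bad row.
        writer.writerow([folder, sample, LIBRARY_TYPES[modality]])
    return buffer.getvalue()


example = build_libraries_csv(
    "example",
    {
        "gex": "/home/jdoe/runs/HNGEXSQXXX/outs/fastq_path",
        "atac": "/home/jdoe/runs/HNATACSQXX/outs/fastq_path",
    },
)
print(example)
```

Supporting a further modality in this sketch would only need one more entry in `LIBRARY_TYPES`.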

This approach has two advantages:

It can be easily adapted for future data modalities (e.g., additional methylation data) if they arrive at some point.
The user does not have to generate the lib.csv and stays with the definition of the samplesheet for scrnaseq.

What are your thoughts about this? @grst and @apeltzer

grst · 2022-11-09T09:39:19Z

Personally I like to have all files explicitly listed in the samplesheet, also for consistency with other aligners.

Possible alternative:
Add an additional column sample_type to the samplesheet. The column is optional and if nothing is specified it assumes "gex":

sample,fastq_1,fastq_2,expected_cells,sample_type
pbmc8k,s3://nf-core-awsmegatests/scrnaseq/input_data/pbmc8k_S1_L007_R1_001.fastq.gz,s3://nf-core-awsmegatests/scrnaseq/input_data/pbmc8k_S1_L007_R2_001.fastq.gz,"10000",gex
pbmc8k,s3://nf-core-awsmegatests/scrnaseq/input_data/pbmc8k_S1_L008_R1_001.fastq.gz,s3://nf-core-awsmegatests/scrnaseq/input_data/pbmc8k_S1_L008_R2_001.fastq.gz,"10000",gex
pbmc8k,s3://nf-core-awsmegatests/scrnaseq/input_data/pbmc8k_S1_L007_R1_001.fastq.gz,s3://nf-core-awsmegatests/scrnaseq/input_data/pbmc8k_S1_L007_R2_001.fastq.gz,"10000",atac
pbmc8k,s3://nf-core-awsmegatests/scrnaseq/input_data/pbmc8k_S1_L008_R1_001.fastq.gz,s3://nf-core-awsmegatests/scrnaseq/input_data/pbmc8k_S1_L008_R2_001.fastq.gz,"10000",atac
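
To illustrate how an optional sample_type column could be consumed, here is a minimal sketch that groups FASTQ pairs by sample and type and falls back to "gex" when the column is missing or empty. The function name `group_reads` and the shortened file names are hypothetical placeholders, and this is not the pipeline's actual input check.

```python
import csv
import io
from collections import defaultdict

DEFAULT_SAMPLE_TYPE = "gex"  # assumed default, as proposed above


def group_reads(samplesheet_text):
    """Group FASTQ pairs by (sample, sample_type).

    A missing or empty sample_type falls back to DEFAULT_SAMPLE_TYPE.
    """
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(samplesheet_text)):
        sample_type = (row.get("sample_type") or DEFAULT_SAMPLE_TYPE).strip().lower()
        grouped[(row["sample"], sample_type)].append((row["fastq_1"], row["fastq_2"]))
    return dict(grouped)


# Placeholder file names; the second row leaves sample_type empty on purpose.
sheet = """sample,fastq_1,fastq_2,expected_cells,sample_type
pbmc8k,gex_L007_R1.fastq.gz,gex_L007_R2.fastq.gz,10000,gex
pbmc8k,gex_L008_R1.fastq.gz,gex_L008_R2.fastq.gz,10000,
pbmc8k,atac_L007_R1.fastq.gz,atac_L007_R2.fastq.gz,10000,atac
"""
groups = group_reads(sheet)
for key, pairs in groups.items():
    print(key, len(pairs))
```

Because the default applies when the column is absent, existing scrnaseq samplesheets would be read as pure "gex" input without changes.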

heylf · 2022-11-09T09:49:14Z

Works for me.

apeltzer · 2022-11-09T09:49:22Z

Tendency to go for the option proposed by @grst as it will not break existing solutions / setup 👍🏻

heylf · 2022-11-09T09:49:43Z

Perfect. Then I am on it!

apeltzer · 2022-11-09T09:50:40Z

One question remains for me: how should we generally handle these multi-ome analysis types? scrnaseq (as the name suggests ;-)) is for sc-rna analysis; if we continue to add more types of analyses, we might overlap with other nf-core pipelines (atacseq, ...). Maybe I will put this up for discussion on the general nf-core Slack to work out how to deal with these sorts of things in the future 👍🏻

heylf · 2022-11-09T10:02:59Z

Thanks @apeltzer. Indeed, also for findability, because users might not immediately realize that you could use scrnaseq for scATAC and multiome.

apeltzer · 2022-11-09T10:26:23Z

Link to discussion on Slack also added here to do proper x-ref: https://nfcore.slack.com/archives/CE4K7FEHE/p1667987775811819

apeltzer · 2022-11-09T10:26:36Z

Please chime in there too - there are some opinions out there.

grst · 2023-07-13T12:33:40Z

The corresponding module has been merged:
nf-core/modules#3229

heylf added the enhancement New feature or request label Nov 2, 2022

grst added this to the 2.3.0 milestone Feb 21, 2023

grst mentioned this issue Mar 28, 2023

added cellranger vdj/mkvdjref modules, with module test for vdj nf-core/modules#3033

Merged

10 tasks

grst mentioned this issue Jul 13, 2023

Support for 10x FFPE scRNA #247

Closed

This was referenced Oct 26, 2023

Add ATAC-Seq data support #129

Closed

Multi-omics support (meta-issue) #272

Open

grst removed this from the 2.3.0 milestone Jan 18, 2024
