nf-pseudobulk

nf-pseudobulk is a Nextflow pipeline used to perform Gene Set Enrichment Analysis (GSEA) on pseudobulk data.

Runs pseudobulk aggregation on scRNA-seq h5ad files by summing expression values per patient and per cell type.
Runs GSEA on the sum across cell types and on each resulting pseudobulk sample.
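
As a rough illustration of the aggregation step, the sketch below sums per-gene counts over all cells that share a patient and a cell type. It is not the pipeline's own code: the `pseudobulk` function, the tuple-based input, and the toy gene counts are hypothetical stand-ins for the rows of an h5ad counts layer.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum per-gene counts over all cells sharing a (sample, cell type) pair.

    `cells` is an iterable of (sample_id, cell_type, {gene: count}) tuples,
    a hypothetical stand-in for the rows of an h5ad counts layer.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample_id, cell_type, counts in cells:
        for gene, count in counts.items():
            totals[(sample_id, cell_type)][gene] += count
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(genes) for key, genes in totals.items()}


# Toy data (invented for illustration): four cells from two patients.
cells = [
    ("patient1", "T cell", {"CD3E": 4, "GAPDH": 10}),
    ("patient1", "T cell", {"CD3E": 2, "GAPDH": 7}),
    ("patient1", "B cell", {"CD19": 5, "GAPDH": 3}),
    ("patient2", "T cell", {"CD3E": 1}),
]
profiles = pseudobulk(cells)
```

Each key of `profiles` is one pseudobulk sample, that is, one patient and cell type combination.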

h5ad Preprocessing

The h5ad file for use as input in this workflow should have the following characteristics:

Counts data, with the preprocessing of your choosing, stored as a layer. The name of this layer should be provided in the input samplesheet (more details below). Please note: converting count values to z-scores is not advised, as subsequent processing steps do not allow negative values.
Genes identified by their Gene Symbol.
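
To see why z-scores conflict with the non-negativity requirement, consider the small sketch below. It z-scores a toy count vector using only the Python standard library; the `z_scores` helper and the example values are hypothetical and are not part of the pipeline.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


# Toy counts for one gene across four cells (invented for illustration).
counts = [0, 3, 12, 1]
scaled = z_scores(counts)

# Any value below the mean becomes negative after z-scoring,
# which is what the downstream steps cannot handle.
has_negative = any(v < 0 for v in scaled)
```

Because z-scoring centers the data on zero, every value below the mean turns negative, so a z-scored layer will not satisfy the requirement above.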

Usage

Before executing the workflow, create a Nextflow secret called SYNAPSE_AUTH_TOKEN using a Synapse Personal Access Token.

To run the pipeline with Docker, use:

nextflow run CRI-iAtlas/nf-pseudobulk --input <path/to/input.csv> -profile docker

Input Samplesheet

The input to this pipeline is a CSV samplesheet specified with the --input parameter.

An example input sheet can be found at data/test_samplesheet.csv.

Samplesheet requirements:

dataset: Name of dataset
h5ad: Synapse ID of input h5ad file to process
counts_layer: Name of layer in h5ad with raw counts (default: counts)
sample_id: Name of column in h5ad containing Sample IDs
cell_type_id: Name of column in h5ad containing Cell Type ID
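
A samplesheet with these columns could be generated along the following lines. This is a minimal sketch, not the pipeline's own tooling: the `build_samplesheet` helper, the column order, and all row values (including the placeholder Synapse ID `syn00000000`) are assumptions made for illustration. Refer to data/test_samplesheet.csv for the authoritative layout.

```python
import csv
import io

# Column names taken from the requirements above; the order is an assumption.
COLUMNS = ["dataset", "h5ad", "counts_layer", "sample_id", "cell_type_id"]


def build_samplesheet(rows):
    """Return CSV text for the given rows, filling the documented default layer."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # counts_layer falls back to "counts", the default stated above.
        full = {"counts_layer": "counts", **row}
        missing = [c for c in COLUMNS if not full.get(c)]
        if missing:
            raise ValueError(f"missing samplesheet fields: {missing}")
        writer.writerow(full)
    return buffer.getvalue()


# Hypothetical row: every value here is a placeholder, not a real dataset.
rows = [
    {
        "dataset": "example_dataset",
        "h5ad": "syn00000000",
        "sample_id": "patient_id",
        "cell_type_id": "cell_type",
    }
]
sheet = build_samplesheet(rows)
```

Writing `sheet` to a file yields a CSV that can be passed to the --input parameter, provided the placeholder values are replaced with real ones.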

Outputs

Output files:

gsea_pvals.csv: p-values for the enrichment test
gsea_scores.csv: enrichment scores
gsea_norm.csv: normalized enrichment scores
