Rexploration

flowchart TD
  A["Input tables"] --> B["DESeq2 model"]
  B --> C["DE tables: step2a"]
  B --> D["Scaled matrix (VST + z-score)"]
  D --> E["PCA"]
  D --> F["Heatmap of top DE genes"]
  B --> G["Volcano plots"]
  B --> H["MA plots"]
  D --> I["Boxplots: top 10 DE genes"]
  B --> J["Top 10 up/down lists"]
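
The "Scaled matrix (VST + z-score)" node can be illustrated for its z-score half. The following is a minimal Python sketch, not the pipeline's own R code: it assumes each gene's row of variance-stabilized values is centered and scaled across samples using the sample standard deviation. The function name and the example values are hypothetical.

```python
from statistics import mean, stdev

def zscore_rows(matrix):
    """Center and scale each row (gene) across its columns (samples)."""
    scaled = []
    for row in matrix:
        mu = mean(row)
        sd = stdev(row)  # sample standard deviation (n - 1 denominator)
        if sd == 0:
            # Constant row: z-scores are undefined, so this sketch uses zeros
            scaled.append([0.0 for _ in row])
        else:
            scaled.append([(x - mu) / sd for x in row])
    return scaled

# Hypothetical VST values for two genes across three samples
vst = [[5.0, 6.0, 7.0], [2.0, 2.0, 2.0]]
scaled = zscore_rows(vst)
```

How constant rows are handled is an assumption of this sketch; the pipeline may drop such genes instead.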

TL;DR

Input expected under input/: em.csv (tab-delimited counts, first column ID), sample_sheet.csv (SAMPLE, SAMPLE_GROUP), annotations.csv (Gene ID, Associated Gene Name).
Flags:
- --pcol : which significance column to use (pvalue or padj)
- --pthresh : significance threshold (default 0.01)
- --lfc : absolute log2 fold-change cutoff (default 1)
Run with p-values:
- nextflow run main.nf --pcol pvalue --pthresh 0.01 --lfc 1
Run with adjusted p-values:
- nextflow run main.nf --pcol padj --pthresh 0.05 --lfc 1
Output:
- Tables in output_step2a
- Plots in output_step2b (pvalue) or output_step2c (padj)


Sample Outputs

Overview

This pipeline runs DESeq2 on the provided inputs, filters out non-finite values (NaN/Inf), and produces DE tables, a PCA plot, a heatmap, volcano and MA plots, boxplots, and top-10 up/down tables. Sample groups are always ordered gut -> duct -> node.
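
The fixed group order can be pictured as a stable sort on a rank lookup. This is a hedged Python sketch of the idea only; the pipeline itself is written in R, and the function name, the error handling, and the sample IDs are assumptions made for illustration.

```python
GROUP_ORDER = ["gut", "duct", "node"]  # order stated in this README

def order_samples(sample_sheet):
    """Sort (sample, group) pairs by the fixed group order.

    Python's sort is stable, so samples keep their original
    relative order within each group.
    """
    rank = {group: i for i, group in enumerate(GROUP_ORDER)}
    unknown = sorted({g for _, g in sample_sheet if g not in rank})
    if unknown:
        raise ValueError(f"Unexpected SAMPLE_GROUP values: {unknown}")
    return sorted(sample_sheet, key=lambda pair: rank[pair[1]])

# Hypothetical sample sheet rows: (SAMPLE, SAMPLE_GROUP)
sheet = [("s3", "node"), ("s1", "gut"), ("s2", "duct"), ("s4", "gut")]
ordered = order_samples(sheet)
```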

Inputs

Expected under input/ (all three files are tab-delimited despite the .csv extension):

em.csv (tab-delimited expression matrix, first column ID)
sample_sheet.csv (tab-delimited: SAMPLE, SAMPLE_GROUP)
annotations.csv (tab-delimited: Gene ID, Associated Gene Name)
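
Because the files carry a .csv extension but are tab-delimited, a quick header check before a run can catch a wrongly exported file. The sketch below is a hypothetical pre-flight helper in Python and is not part of the pipeline; the column names come from the list above, and the function name is an assumption.

```python
import csv
import io

# Required columns per file, as listed in this README
REQUIRED = {
    "em.csv": ["ID"],
    "sample_sheet.csv": ["SAMPLE", "SAMPLE_GROUP"],
    "annotations.csv": ["Gene ID", "Associated Gene Name"],
}

def missing_columns(handle, required):
    """Return the required column names absent from a tab-delimited header."""
    header = next(csv.reader(handle, delimiter="\t"), [])
    return [name for name in required if name not in header]

# In-memory stand-in for input/sample_sheet.csv (hypothetical rows)
demo = io.StringIO("SAMPLE\tSAMPLE_GROUP\ns1\tgut\ns2\tduct\n")
problems = missing_columns(demo, REQUIRED["sample_sheet.csv"])
```

With a real file, the same function can be called on `open("input/sample_sheet.csv", newline="")`. A comma-delimited file would report every required column as missing, because its whole header parses as a single field.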

Outputs

Always:

output_step2a/de_gut_duct.tsv
output_step2a/de_duct_node.tsv
output_step2a/de_node_gut.tsv
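
The three file names follow a cyclic pairing of the group order: each group is paired with the next one, and the last wraps around to the first. The Python sketch below only reproduces that naming pattern as inferred from the file names; it does not say which group serves as the numerator of the fold change, which this README does not state. The function name is hypothetical.

```python
GROUP_ORDER = ["gut", "duct", "node"]  # order stated in this README

def contrast_files(groups):
    """Pair each group with the next one, wrapping around at the end."""
    names = []
    for i, first in enumerate(groups):
        second = groups[(i + 1) % len(groups)]
        names.append(f"output_step2a/de_{first}_{second}.tsv")
    return names

files = contrast_files(GROUP_ORDER)
```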

When --pcol pvalue:

output_step2b/ with PCA, heatmap, volcano, MA, boxplots, and top-10 tables

When --pcol padj:

output_step2c/ with the same plots/tables based on adjusted p-values
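
The choice of plot directory is a simple lookup on --pcol. A minimal Python sketch of that mapping follows, assuming the step directories are created directly under --outdir; the function name is hypothetical and the pipeline's actual logic lives in main.nf and the R scripts.

```python
from pathlib import PurePosixPath

# Mapping described in this README
PLOT_DIRS = {"pvalue": "output_step2b", "padj": "output_step2c"}

def plot_dir(pcol, outdir="."):
    """Return the directory that receives plots for the chosen p-value column."""
    if pcol not in PLOT_DIRS:
        raise ValueError(f"--pcol must be one of {sorted(PLOT_DIRS)}, got {pcol!r}")
    return str(PurePosixPath(outdir) / PLOT_DIRS[pcol])

default_dir = plot_dir("pvalue")
custom_dir = plot_dir("padj", outdir="results")
```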

Parameters

--pcol : pvalue or padj
--pthresh : significance threshold (default 0.01)
--lfc : absolute log2 fold-change threshold (default 1)
--input_dir : input directory (default input)
--outdir : output directory (default .)
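
To show how --pcol, --pthresh, and --lfc interact, here is a hedged Python sketch of a significance call for one gene. It assumes strict inequalities (p below the threshold, absolute log2 fold change above the cutoff), which this README does not specify. The keys pvalue, padj, and log2FoldChange match the column names of a DESeq2 results table; the function name and the example values are hypothetical.

```python
import math

def classify(row, pcol="pvalue", pthresh=0.01, lfc=1.0):
    """Label one DE result row as 'up', 'down', or 'ns' (not significant)."""
    p = row[pcol]
    fold = row["log2FoldChange"]
    # Non-finite values (NaN/Inf) can never pass the cutoffs
    if not (math.isfinite(p) and math.isfinite(fold)):
        return "ns"
    if p < pthresh and abs(fold) > lfc:
        return "up" if fold > 0 else "down"
    return "ns"

# Hypothetical gene: passes on the raw p-value but not on the adjusted one
gene = {"pvalue": 0.004, "padj": 0.08, "log2FoldChange": -1.7}
by_pvalue = classify(gene, pcol="pvalue", pthresh=0.01)
by_padj = classify(gene, pcol="padj", pthresh=0.05)
```

The example gene illustrates why the two run modes can give different plots: the same fold change is called "down" under the raw p-value and "ns" under the adjusted one.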

Run locally

nextflow run main.nf --pcol pvalue --pthresh 0.01 --lfc 1

Docker

If you prefer Docker, create a minimal container with R and required packages. Then run:

nextflow run main.nf -profile docker --pcol pvalue --pthresh 0.01 --lfc 1

You will need a Docker-enabled nextflow.config profile that sets process.container to an image containing:

R (>= 4.2 recommended)
Bioconductor DESeq2
CRAN: ggplot2, ggrepel, pheatmap

Conda

Create a conda environment with R and the required packages, then run with the conda profile defined in nextflow.config:

nextflow run main.nf -profile conda --pcol pvalue --pthresh 0.01 --lfc 1

Example nextflow.config snippet defining both the conda and docker profiles:

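profiles {
  conda {
    // Recent Nextflow releases require conda support to be enabled explicitly
    conda.enabled = true
    process.conda = 'conda/renv.yml'
  }
  docker {
    // Enable the Docker engine so that process.container is used
    docker.enabled = true
    process.container = 'your-docker-image:tag'
  }
}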

Notes

The pipeline filters out NaN/Inf values before writing outputs.
Sample group order is enforced as gut, duct, node.
Labels on volcano/MA plots are drawn with geom_label_repel from ggrepel.
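
The non-finite filter mentioned in the first note can be sketched as follows. This Python version is illustrative only: it assumes whole rows are dropped when any checked column is NaN or infinite, and the choice of columns is an assumption. The pipeline's R code may differ in both respects.

```python
import math

def drop_non_finite(rows, columns):
    """Keep only rows whose values in the given columns are all finite."""
    kept = []
    for row in rows:
        if all(math.isfinite(row[col]) for col in columns):
            kept.append(row)
    return kept

# Hypothetical DE rows; g2 has a NaN and g3 an infinite fold change
rows = [
    {"ID": "g1", "log2FoldChange": 1.2, "pvalue": 0.01},
    {"ID": "g2", "log2FoldChange": float("nan"), "pvalue": 0.5},
    {"ID": "g3", "log2FoldChange": float("inf"), "pvalue": 0.0},
]
clean = drop_non_finite(rows, ["log2FoldChange", "pvalue"])
```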
