SEtools

Pierre-Luc Germain, 14.01.2020

D-HEST Institute for Neurosciences, ETH Zürich & Laboratory of Statistical Bioinformatics, University Zürich

The SEtools package is a set of convenience functions for the Bioconductor class SummarizedExperiment. It facilitates merging, melting, and plotting SummarizedExperiment objects.

NOTE that the heatmap-related functions habe been moved to a standalone package, sechm, and have been deprecated from this package.

Getting started

Package installation

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("SEtools")

Or, to install the latest development version:

BiocManager::install("plger/SEtools")

Example data

To showcase the main functions, we will use an example object which contains (a subset of) whole-hippocampus RNAseq of mice after different stressors:

suppressPackageStartupMessages({
  library(SummarizedExperiment)
  library(SEtools)
})
data("SE", package="SEtools")
SE

## class: SummarizedExperiment 
## dim: 100 20 
## metadata(0):
## assays(2): counts logcpm
## rownames(100): Egr1 Nr4a1 ... CH36-200G6.4 Bhlhe22
## rowData names(2): meanCPM meanTPM
## colnames(20): HC.Homecage.1 HC.Homecage.2 ... HC.Swim.4 HC.Swim.5
## colData names(2): Region Condition

This is taken from Floriou-Servou et al., Biol Psychiatry 2018.

Merging and aggregating SEs

se1 <- SE[,1:10]
se2 <- SE[,11:20]
se3 <- mergeSEs( list(se1=se1, se2=se2) )
se3

## class: SummarizedExperiment 
## dim: 100 20 
## metadata(3): se1 se2 anno_colors
## assays(2): counts logcpm
## rownames(100): AC139063.2 Actr6 ... Zfp667 Zfp930
## rowData names(3): meanCPM meanTPM cluster
## colnames(20): se1.HC.Homecage.1 se1.HC.Homecage.2 ...
##   se2.HC.Swim.4 se2.HC.Swim.5
## colData names(3): Dataset Region Condition

All assays were merged, along with rowData and colData slots.

By default, row z-scores are calculated for each object when merging. This can be prevented with:

se3 <- mergeSEs( list(se1=se1, se2=se2), do.scale=FALSE)

If more than one assay is present, one can specify a different scaling behavior for each assay:

se3 <- mergeSEs( list(se1=se1, se2=se2), use.assays=c("counts", "logcpm"), do.scale=c(FALSE, TRUE))

Merging by rowData columns

It is also possible to merge by rowData columns, which are specified through the mergeBy argument. In this case, one can have one-to-many and many-to-many mappings, in which case two behaviors are possible:

By default, all combinations will be reported, which means that the same feature of one object might appear multiple times in the output because it matches multiple features of another object.
If a function is passed through aggFun, the features of each object will by aggregated by mergeBy using this function before merging.

rowData(se1)$metafeature <- sample(LETTERS,nrow(se1),replace = TRUE)
rowData(se2)$metafeature <- sample(LETTERS,nrow(se2),replace = TRUE)
se3 <- mergeSEs( list(se1=se1, se2=se2), do.scale=FALSE, mergeBy="metafeature", aggFun=median)

## Aggregating the objects by metafeature
## Merging...

sehm(se3)

Aggregating a SE

A single SE can also be aggregated by using the aggSE function:

se1b <- aggSE(se1, by = "metafeature")

## Aggregation methods for each assay:
## counts: sum; logcpm: expsum

se1b

## class: SummarizedExperiment 
## dim: 26 10 
## metadata(0):
## assays(2): counts logcpm
## rownames(26): A B ... Y Z
## rowData names(4): meanCPM meanTPM cluster metafeature
## colnames(10): HC.Homecage.1 HC.Homecage.2 ... HC.Handling.4
##   HC.Handling.5
## colData names(2): Region Condition

If the aggregation function(s) are not specified, aggSE will try to guess decent aggregation functions from the assay names.

Melting SE

To facilitate plotting features with ggplot2, the meltSE function combines assay values along with row/column data:

d <- meltSE(SE, genes=g[1:4])
head(d)

##   feature        sample Region Condition counts    logcpm
## 1    Egr1 HC.Homecage.1     HC  Homecage 1581.0 4.4284969
## 2   Nr4a1 HC.Homecage.1     HC  Homecage  750.0 3.6958917
## 3     Fos HC.Homecage.1     HC  Homecage   91.4 1.7556317
## 4    Egr2 HC.Homecage.1     HC  Homecage   15.1 0.5826999
## 5    Egr1 HC.Homecage.2     HC  Homecage 1423.0 4.4415828
## 6   Nr4a1 HC.Homecage.2     HC  Homecage  841.0 3.9237691

suppressPackageStartupMessages(library(ggplot2))
ggplot(d, aes(Condition, counts, fill=Condition)) + geom_violin() + 
    facet_wrap(~feature, scale="free")

Name		Name	Last commit message	Last commit date
Latest commit History 182 Commits
.github/workflows		.github/workflows
R		R
README_files/figure-html		README_files/figure-html
data		data
inst		inst
man		man
vignettes		vignettes
.gitignore		.gitignore
DESCRIPTION		DESCRIPTION
NAMESPACE		NAMESPACE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SEtools

Getting started

Package installation

Example data

Merging and aggregating SEs

Merging by rowData columns

Aggregating a SE

Melting SE

About

Releases

Packages

Contributors 4

Languages

plger/SEtools

Folders and files

Latest commit

History

Repository files navigation

SEtools

Getting started

Package installation

Example data

Merging and aggregating SEs

Merging by rowData columns

Aggregating a SE

Melting SE

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Languages

Packages