PePPer - Personalized Perturbation Profiler

PePPeR - PErsonalized Perturbation ProfilER

PePPeR (Personalized Perturbation ProfileR) is an R package providing methods to fetch expression data sets from the GEO database, identify per-individual or group-wise differentially expressed (DE) genes and construct individual perturbation profiles.



  • R (>= 3.0.2)


  • Biobase
  • GEOquery


  • limma
  • affy
  • samr
  • preprocessCore
  • makecdfenv


Directly from GitHub

You can use the install_github() function from the devtools package.


From an archive file

Create a tarball containing the files in the repository

tar cvzf pepper.tgz --exclude .git pepper/

Install it using R

R CMD INSTALL pepper.tgz


> library(PEPPER)

See the documentation on the following functions for their use (?

  • for fetching information from the NCBI GEO database
  • for getting group-wise DE genes
  • get.z.matrix, get.z.score and get.peeps.from.z.matrix: for getting per-individual DE genes

See the classify.simple.R function for an example of how to use PeePs for sample classification.

Case study: Getting DE genes in a reprocessed GEO Parkinson data set (GSE7621)

  • Fetch the Parkinson data set available at GEO, reprocess it using the affy package and map probes to gene ids
# Data set specific parameters
 <- "GSE7621" # GEO id of the data set
probe.conversion <- "ENTREZ_GENE_ID" # column name for gene id mapping
 <- NULL # probe to gene mapping, if NULL uses the mapping in the data set
conversion.mapping.function <- NULL # modify probe names using this function
sample.mapping.column <- "characteristics_ch1" # column to use for sample mapping
 <- NULL # the platform to use if there are multiple platform annotations
reprocess <- "affy" # reprocessing type for raw data
output.dir <- "./"

# Get the expression and sample mapping info from the reprocessed data set
d <-, sample.mapping.column = sample.mapping.column, do.log2 = NULL, probe.conversion = probe.conversion, =, conversion.mapping.function = conversion.mapping.function, output.dir = output.dir, =, reprocess = reprocess)
expr <- d$expr
sample.mapping <- d$sample.mapping
  • Get group-wise DE
 <- c("Parkinson's Disease")
states.control <- c("Old Control") 
out.file <- "case_mapping.dat"
sample.mapping <-,, states.control, out.file = out.file)
adjust.method <- 'BH'
fdr.cutoff <- 0.05
out.file <- "de.dat"
de <-, sample.mapping, c("case", "control"), method="limma", out.file, adjust.method=adjust.method, cutoff=fdr.cutoff, functional.enrichment="kegg") 
# Keep only genes with an absolute log fold change of at least 1
de <- de[abs(de$logFC) >= 1, ]
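
The group-wise step combines a multiple-testing correction with a fold-change threshold. As a rough illustration in base R on invented numbers, the sketch below applies both filters. The data frame toy and its column names (P.Value, adj.P.Val) are assumptions chosen to resemble limma-style output; they are not the package's actual return value.

```r
# Toy table of per-gene statistics (values are invented for illustration)
toy <- data.frame(
  gene    = c("g1", "g2", "g3", "g4"),
  logFC   = c(1.8, -1.2, 0.4, -2.5),
  P.Value = c(0.001, 0.04, 0.002, 0.0005)
)

# Benjamini-Hochberg adjustment, the method selected by adjust.method <- 'BH'
toy$adj.P.Val <- p.adjust(toy$P.Value, method = "BH")

# Keep genes that pass the FDR cutoff and have |logFC| of at least 1
fdr.cutoff <- 0.05
toy.de <- toy[toy$adj.P.Val <= fdr.cutoff & abs(toy$logFC) >= 1, ]
```

In this toy table every gene passes the FDR cutoff, so the fold-change threshold alone decides which rows are dropped.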
  • Get per-individual DE
# Get z scores
out.file <- "z.dat"
cutoff <- 2.5
z <- get.z.matrix(expr, sample.mapping, method="mean", out.file=out.file)
indices <- apply(abs(z), 2, function(x) { which(x >= cutoff)})
geneids <- lapply(indices, function(x) { rownames(z)[x] })
# Alternatively you can use get.peeps.from.z.matrix
peeps <- get.peeps.from.z.matrix(z, cutoff=2, 
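
For intuition, the per-individual step can be sketched in base R without the package. This assumes the general idea is to z-score each case sample against the mean and standard deviation of the control group and then threshold the absolute value; it is not a reproduction of get.z.matrix or get.peeps.from.z.matrix. The toy matrix and all variable names below are hypothetical.

```r
# Toy expression matrix: 3 genes x 6 samples (values invented for illustration)
expr.toy <- matrix(
  c(5.0, 5.2, 4.8, 5.1, 9.0, 5.0,
    7.0, 7.1, 6.9, 7.0, 7.0, 3.0,
    2.0, 2.1, 1.9, 2.0, 2.0, 2.1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("ctrl1", "ctrl2", "ctrl3", "ctrl4", "case1", "case2"))
)
controls <- c("ctrl1", "ctrl2", "ctrl3", "ctrl4")
cases <- c("case1", "case2")

# Per-gene mean and standard deviation over the control samples
mu <- rowMeans(expr.toy[, controls])
sigma <- apply(expr.toy[, controls], 1, sd)

# z score of every case sample relative to the control distribution
# (mu and sigma are recycled down the rows, i.e. per gene)
z.toy <- (expr.toy[, cases] - mu) / sigma

# Genes whose |z| reaches the cutoff, listed separately for each individual
cutoff <- 2.5
peeps.toy <- lapply(cases, function(s) rownames(z.toy)[abs(z.toy[, s]) >= cutoff])
names(peeps.toy) <- cases
```

Each element of peeps.toy holds the genes perturbed in one individual, so two cases from the same group can end up with entirely different gene sets.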


Menche J et al. Integrating personalized gene expression profiles into predictive disease-associated gene pools. npj Systems Biology and Applications 2017;3:10.