geneLists is an R package containing a collection of gene lists for simple, reproducible Gene Set Enrichment Analysis (GSEA).
A major focus of this repository is the collection of synaptic proteome genes as well as genes that are implicated in human brain disorders.
Gene lists are stored in the Broad Institute's GMT
format. These can be downloaded directly from the
datasets/
directory, or accessed in R with the data()
command. For
example, load the SFARI autism candidate gene dataset
with data(sfariGene)
.
Gene lists are are collected from the literature or online databases. Gene
identifiers are mapped to stable, unique Entrez IDs. Often, it is
necessary to map human genes to their homologous mouse genes. This is done using
the HomoloGene database and the getHomologs
function.
Insure you have installed AnnotationDbi
before installing geneLists
.
install.packages("BiocManager")
BiocManager::install("AnnotationDbi")
To install the geneLists
package in R, use the devtools
package:
# Install from github
devtools::install_github("soderling-lab/geneLists")
The gene mapping function getIDs
uses organism specific mapping data. Insure
you have downloaded the required packages, e.g. for mouse data you should have
installed org.Mm.eg.db
with BiocManager:
BiocManager::install("org.Mm.eg.db")
library(geneLists)
# See all available datasets
geneLists()
# Load a dataset
data(iPSD) # Uezu2016 iPSD genes
# converting between identifiers
gphn_proteome <- iPSD[["Gphn"]]
uniprot <- getIDs(gphn_proteome, from="entrez", to="uniprot", species="mouse")
# mapping genes using a given gene map
data(uniprot_map)
mapIDs(uniprot, from="Accession", to="Entrez", gene_map=uniprot_map)
# NOTE: be careful to not confuse getIDs (uses org.##.eg.db) and
# mapIDs (you must provide a gene_map; the arguments from and to specify columns
# in the gene_map).
For additional details about each dataset, see the
README in the datasets/
directory. Otherwise, the
source code used to compile each gene list can be found in inst/analysis
.
# to see all scripts in inst/analysis/2_build-lists:
list.files(system.file("analysis/2_build-lists", package="geneLists"))
The scripts in inst/analysis
record how each gene list was created. These
can be used as examples to show you can download a dataset and save it as a GMT
formatted file and gene_list R object. See the tutorials/064_E3-Ligases.R
script for a recent example on how to create a gene list.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.