LIGER (Linked Inference of Genomic Experimental Relationships)
LIGER (installed as rliger) is a package for integrating and analyzing multiple single-cell datasets, developed by the Macosko lab and maintained/extended by the Welch lab. It relies on integrative non-negative matrix factorization to identify shared and dataset-specific factors.
LIGER can be used to compare and contrast experimental datasets in a variety of contexts, for instance:
- Across experimental batches
- Across individuals
- Across sex
- Across tissues
- Across species (e.g., mouse and human)
- Across modalities (e.g., scRNA-seq and spatial transcriptomics data, scMethylation, or scATAC-seq)
Once multiple datasets are integrated, the package provides functionality for further data exploration, analysis, and visualization. Users can:
- Identify clusters
- Find significant shared (and dataset-specific) gene markers
- Compare clusters with previously identified cell types
- Visualize clusters and gene expression using t-SNE and UMAP
We have also designed LIGER to interface with existing single-cell analysis packages, including Seurat.
Consider filling out our feedback form to help us improve the functionality and accessibility of LIGER.
For usage examples and guided walkthroughs, check the vignettes directory of the repo.
The rliger package requires only a standard computer with enough RAM to support the in-memory operations. At a minimum, the computer should have about 2 GB of RAM. For optimal performance, we recommend a computer with the following specs:
- RAM: 16+ GB
- CPU: 4+ cores, 2.3 GHz/core
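As a rough way to compare a machine against the recommended specs above, the following shell sketch reads the core count and total RAM and reports whether both thresholds (16 GB, 4 cores) are met. The function names are hypothetical, clock speed is not checked, and RAM detection assumes either Linux (/proc/meminfo) or macOS (sysctl).

```shell
#!/bin/sh
# Sketch: compare this machine with the recommended specs (16+ GB RAM, 4+ cores).
# All function names here are illustrative and not part of rliger.

# Succeed when RAM (whole GB, $1) and core count ($2) meet the recommendation.
meets_recommended() {
  [ "$1" -ge 16 ] && [ "$2" -ge 4 ]
}

# Print the number of online CPU cores (falls back to 1 if unknown).
detect_cores() {
  getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# Print total RAM rounded to the nearest GB; prints 0 if it cannot be detected.
detect_ram_gb() {
  if [ -r /proc/meminfo ]; then
    # MemTotal is reported in kB; round because the kernel reserves some memory.
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 / 1024 + 0.5 }' /proc/meminfo
  elif command -v sysctl >/dev/null 2>&1; then
    # macOS reports bytes under hw.memsize.
    echo $(( $(sysctl -n hw.memsize) / 1024 / 1024 / 1024 ))
  else
    echo 0
  fi
}

ram_gb=$(detect_ram_gb)
cores=$(detect_cores)
if meets_recommended "$ram_gb" "$cores"; then
  echo "Meets the recommended specs: ${ram_gb} GB RAM, ${cores} cores"
else
  echo "Below the recommended specs: ${ram_gb} GB RAM, ${cores} cores"
fi
```

A machine below these values can still run the package as long as the datasets fit in memory; the check only flags the recommended configuration.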
The package development version is tested on Linux operating systems and Mac OSX.
- Linux: CentOS 7, Manjaro 5.3.18
- Mac OSX: Mojave (10.14.1), Catalina (10.15.2)
The rliger package should be compatible with Windows, Mac, and Linux operating systems.
Before setting up the rliger package, users should have R version 3.4.0 or higher and several packages installed from CRAN and other repositories. The user can check the dependencies in
LIGER is written in R and is also available on the Comprehensive R Archive Network (CRAN). Note that the package name is rliger to avoid a naming conflict with an unrelated package. To install the version on CRAN, follow these instructions:
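The numbered steps themselves are not reproduced here. As a sketch only, the shell snippet below checks the installed R version against the 3.4.0 minimum and then prints, without running it, a CRAN install command for rliger. The helpers version_ge and cran_install_expr are hypothetical, and the mirror URL is an assumption; any CRAN mirror should work.

```shell
#!/bin/sh
# Sketch: confirm R >= 3.4.0, then show a CRAN install command for rliger.
# version_ge and cran_install_expr are illustrative helpers, not part of rliger.

# Succeed when dotted version $1 is at least dotted version $2.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    na = split(a, x, "."); nb = split(b, y, ".")
    n = (na > nb) ? na : nb
    for (i = 1; i <= n; i++) {
      # Missing components compare as 0, so 3.4 equals 3.4.0.
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}

# Build the R expression that installs one CRAN package (mirror is an assumption).
cran_install_expr() {
  printf 'install.packages("%s", repos = "https://cloud.r-project.org")' "$1"
}

if command -v Rscript >/dev/null 2>&1; then
  r_version=$(Rscript -e 'cat(as.character(getRversion()))' 2>/dev/null) || r_version=0
  if version_ge "$r_version" 3.4.0; then
    echo "R $r_version found; to install rliger, run:"
    echo "Rscript -e '$(cran_install_expr rliger)'"
  else
    echo "R $r_version is older than 3.4.0; upgrade R first"
  fi
else
  echo "Rscript not found on PATH; install R 3.4.0 or higher first"
fi
```

Printing the command rather than executing it leaves the choice of library location and mirror to the user.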
To install the latest development version directly from GitHub, type the following commands instead of step 3:
    install.packages('devtools')
    library(devtools)
    install_github('welch-lab/liger')
Note that the GitHub version requires installing from source, which may involve additional installation steps on MacOS (see below).
Additional Steps for Installing LIGER from Source on MacOS (recommended before step 3)
Installation from CRAN is easy because pre-compiled binaries are available for Windows and MacOS. However, a few additional steps are required to install from source on MacOS. Installing RcppArmadillo on R>=3.4 requires Clang >= 4 and gfortran-6.1. For newer versions of R (R>=3.5), it's recommended to follow the instructions in this post. Follow the instructions below if you have R version 3.4.0-3.4.4.
- Install gfortran as suggested here
- Download clang4 from this page
- Uncompress the resulting zip file and type into Terminal:
    mv /path/to/clang4/ /usr/local/
- Create a .R/Makevars file containing the following:
    # The following statements are required to use the clang4 binary
    CC=/usr/local/clang4/bin/clang
    CXX=/usr/local/clang4/bin/clang++
    CXX11=/usr/local/clang4/bin/clang++
    CXX14=/usr/local/clang4/bin/clang++
    CXX17=/usr/local/clang4/bin/clang++
    CXX1X=/usr/local/clang4/bin/clang++
    LDFLAGS=-L/usr/local/clang4/lib
For example, use the following Terminal commands:
    cd ~
    mkdir .R
    cd .R
    nano Makevars
Paste in the required text above and save with
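For a non-interactive alternative to editing the file in nano, the sketch below writes the same Makevars settings with a here-document. The function name write_makevars is hypothetical, and the demo writes into a temporary directory so that an existing ~/.R/Makevars is not overwritten.

```shell
#!/bin/sh
# Sketch: write the clang4 Makevars settings without opening an editor.
# write_makevars is an illustrative helper; it overwrites any existing
# Makevars file in the target directory.

write_makevars() {
  target_dir=$1
  mkdir -p "$target_dir"
  cat > "$target_dir/Makevars" <<'EOF'
# The following statements are required to use the clang4 binary
CC=/usr/local/clang4/bin/clang
CXX=/usr/local/clang4/bin/clang++
CXX11=/usr/local/clang4/bin/clang++
CXX14=/usr/local/clang4/bin/clang++
CXX17=/usr/local/clang4/bin/clang++
CXX1X=/usr/local/clang4/bin/clang++
LDFLAGS=-L/usr/local/clang4/lib
EOF
}

# Demo: write into a scratch directory and show the result.
demo_dir=$(mktemp -d)
write_makevars "$demo_dir"
cat "$demo_dir/Makevars"
```

To apply the settings for real, call write_makevars "$HOME/.R" after backing up any existing file there.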
Additional Installation Steps for Online Learning using LIGER
The HDF5 library is required for implementing online learning in LIGER on data files in HDF5 format. It can be installed via one of the following commands:
|OS X (using Homebrew or Conda)||
|Debian-based systems (including Ubuntu)||
|Systems supporting yum and RPMs||
For Windows, the latest HDF5 1.12.0 is available at https://www.hdfgroup.org/downloads/hdf5/.
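The command column of the table above is empty in this copy. As a hedged sketch, the snippet below maps each package manager to a commonly used HDF5 install command and prints the one matching the first manager found on PATH. The package names (hdf5, libhdf5-dev, hdf5-devel) and the conda channel are assumptions based on typical distributions, not taken from this document, and hdf5_install_command is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: suggest an HDF5 install command for the local package manager.
# Package names and the conda channel are assumptions; verify them for your system.

# Print a likely install command for the package manager named in $1.
hdf5_install_command() {
  case "$1" in
    brew)    echo "brew install hdf5" ;;
    conda)   echo "conda install -c anaconda hdf5" ;;
    apt-get) echo "sudo apt-get install libhdf5-dev" ;;
    yum)     echo "sudo yum install hdf5-devel" ;;
    *)       return 1 ;;
  esac
}

# Print (not run) the command for the first supported manager on PATH.
found=no
for manager in brew conda apt-get yum; do
  if command -v "$manager" >/dev/null 2>&1; then
    hdf5_install_command "$manager"
    found=yes
    break
  fi
done
if [ "$found" = no ]; then
  echo "No supported package manager found"
fi
```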
Detailed Instructions for FIt-SNE Installation
Note that the runUMAP function (which calls the uwot package) also scales to large datasets and does not require additional installation steps.
However, using FIt-SNE is recommended for computational efficiency if you want to perform t-SNE on very large datasets.
Installing and compiling the necessary software requires the use of git, FIt-SNE, and FFTW. For a basic overview of installation, visit this page.
Basic installation for most Unix machines can be achieved with the following commands after downloading the latest version of FFTW from here. In the fftw directory, run:
    ./configure
    make
    make install
(Additional instructions if necessary). Then in desired directory:
    git clone https://github.com/KlugerLab/FIt-SNE.git
    cd FIt-SNE
    g++ -std=c++11 -O3 src/sptree.cpp src/tsne.cpp src/nbodyfft.cpp -o bin/fast_tsne -pthread -lfftw3 -lm
    pwd
Use the output of pwd as the fitsne.path parameter in runTSNE.
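Before passing a directory as fitsne.path, it can help to confirm that the compile step actually produced the binary. The sketch below prints the absolute path of a FIt-SNE checkout only if bin/fast_tsne exists and is executable. The function name fitsne_path and the FITSNE_DIR environment variable are hypothetical conveniences for this example.

```shell
#!/bin/sh
# Sketch: verify a FIt-SNE checkout before using it as fitsne.path.
# fitsne_path and FITSNE_DIR are illustrative names, not part of FIt-SNE or rliger.

# Print the absolute path of directory $1 if it contains an executable
# bin/fast_tsne; otherwise report the problem on stderr and fail.
fitsne_path() {
  dir=$1
  if [ -x "$dir/bin/fast_tsne" ]; then
    (cd "$dir" && pwd)
  else
    echo "No executable bin/fast_tsne under $dir" >&2
    return 1
  fi
}

# Check FITSNE_DIR if set, otherwise the current directory.
fitsne_path "${FITSNE_DIR:-$PWD}" || echo "Compile FIt-SNE first, then rerun this check."
```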
Note that the above instructions require root access. To install into a specified folder (such as your home directory) on a server, use the following commands instead:
    ./configure --prefix=<install_dir>
    make
    make install
    git clone https://github.com/KlugerLab/FIt-SNE.git
    cd FIt-SNE
    g++ -std=c++11 -O3 src/sptree.cpp src/tsne.cpp src/nbodyfft.cpp -I<install_dir>/include/ -L<install_dir>/lib/ -o bin/fast_tsne -pthread -lfftw3 -lm
    pwd
Install Time and Expected Run Time
The installation process of rliger should take less than 30 minutes.
The expected run time is 1 to 4 hours, depending on dataset size and the user’s choice of downstream analysis.
The rliger package provides a small sample dataset for basic demos of the functions. You can find it in folder
We also provide a set of scRNA-seq and scATAC-seq datasets for real-world style demos. These datasets are as follows:
- scATAC and scRNA data provided by 10X Genomics; access the pre-processed data from here. The data sources are:
- scRNA data composed of two datasets of interneurons and oligodendrocytes from the mouse frontal cortex, two distinct cell types that should not align if integrated. Provided by Saunders, A. et al., 2018; you can access the pre-processed data from here:
- scRNA data from control and interferon-stimulated PBMCs. Raw data provided by Kang et al., 2017. The datasets were downsampled by applying the sample function without replacement to yield 3,000 cells for each matrix. You can download the downsampled data from here:
Corresponding tutorials can be found in section Usage above.
This project is covered under the GNU General Public License 3.0.