MAESTRO (Model-based AnalysEs of Single-cell Transcriptome and RegulOme) is a comprehensive single-cell RNA-seq and ATAC-seq analysis suite built using snakemake. MAESTRO combines several dozen tools and packages into an integrative pipeline that takes scRNA-seq and scATAC-seq data from raw sequencing reads (fastq files) through alignment, quality control, cell filtering, normalization, unsupervised clustering, differential expression and peak calling, cell-type annotation, and transcription regulation analysis. Currently, MAESTRO supports Smart-seq2, 10x-genomics, Drop-seq, and SPLiT-seq for scRNA-seq protocols, and microfluidics-based, 10x-genomics, and sci-ATAC-seq for scATAC-seq protocols.
- Release MAESTRO.
- Provide a docker image for easy installation. Note that the docker image does not include cellranger/cellranger ATAC or the corresponding genome index. Please install cellranger/cellranger ATAC following the installation instructions.
- Fix some bugs and set LISA as the default method to predict transcription factors for scRNA-seq. Note that the docker image includes the lisa conda environment but not the required pre-computed genome datasets. Please download the hg38 or mm10 datasets and update the configuration following the installation instructions.
- Python (>= 3.0) for MAESTRO snakemake workflow
- R (>= 3.5.1) for MAESTRO R package
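The two minimum versions above can be checked before installing. The following is a minimal sketch, not part of MAESTRO: the helper names `version_ge` and `check_min_version` are hypothetical, and the script assumes `sort -V` (GNU coreutils version sort) is available and that the interpreters are invoked as `python3` and `R`.

```shell
# Return success if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report whether a tool meets a minimum version.
# Arguments: tool name, version found, minimum version.
check_min_version() {
    tool="$1"; found="$2"; minimum="$3"
    if version_ge "$found" "$minimum"; then
        echo "$tool $found: OK (>= $minimum)"
    else
        echo "$tool $found: too old (need >= $minimum)"
    fi
}

# `python3 --version` prints "Python X.Y.Z"; the version is the second field.
if command -v python3 >/dev/null 2>&1; then
    check_min_version Python "$(python3 --version 2>&1 | awk '{ print $2 }')" 3.0
fi

# `R --version` starts with "R version X.Y.Z (...)"; the version is the third field.
if command -v R >/dev/null 2>&1; then
    check_min_version R "$(R --version | head -n 1 | awk '{ print $3 }')" 3.5.1
fi
```

Tools that are not on the PATH are silently skipped, so no output for a tool means it was not found rather than that it passed.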
Installing MAESTRO with conda
MAESTRO uses the Miniconda3 package management system to harmonize all of the software packages.
Use the following commands to install Miniconda3:
$ wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash Miniconda3-latest-Linux-x86_64.sh
Users can then create an isolated environment for MAESTRO and install it with the following commands:
$ conda config --add channels defaults
$ conda config --add channels bioconda
$ conda config --add channels conda-forge
$ conda create -n MAESTRO maestro -c liulab-dfci
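After the environment is created, it can be useful to confirm that conda actually lists it before activating it. The sketch below is an illustration rather than part of MAESTRO: the helper `has_conda_env` is a hypothetical name, and it assumes that `conda env list` prints the environment name in the first column.

```shell
# Return success if `conda env list` shows an environment with the given name.
# Assumes the environment name is the first whitespace-separated column.
has_conda_env() {
    conda env list 2>/dev/null |
        awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

if has_conda_env MAESTRO; then
    echo "MAESTRO environment found; activate it with: conda activate MAESTRO"
else
    echo "MAESTRO environment not found; check the output of the conda create command"
fi
```

If conda itself is missing from the PATH, the check simply reports that the environment was not found.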
Installing the MAESTRO R package
If users already have processed datasets, such as a cell-by-gene or cell-by-peak matrix generated by Cell Ranger, they can install the MAESTRO R package alone to perform the analysis from the processed data.
$ R
> library(devtools)
> install_github("liulab-dfci/MAESTRO")
Installing Cell Ranger
MAESTRO depends on Cell Ranger and Cell Ranger ATAC to map data generated by 10X Genomics. Please install Cell Ranger and Cell Ranger ATAC before using MAESTRO. If Cell Ranger is already installed, please specify its path in the YAML configuration file.
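To find the path of an existing Cell Ranger installation, one option is to ask the shell where the executables live. This is a minimal sketch: `report_tool` is a hypothetical helper, and it assumes the executables are named `cellranger` and `cellranger-atac` and are already on the PATH. Whether the YAML configuration expects the executable itself or its installation directory is not stated here, so check the installation instructions before copying the value.

```shell
# Print the resolved location of an executable, or a note if it is not on PATH.
report_tool() {
    if path="$(command -v "$1" 2>/dev/null)"; then
        echo "$1: $path"
    else
        echo "$1: not found on PATH"
    fi
}

report_tool cellranger
report_tool cellranger-atac
```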
MAESTRO utilizes RABIT and LISA to evaluate the enrichment of transcription factors based on the marker genes from scRNA-seq clusters. To use RABIT, users first need to install it, download the RABIT index from the Cistrome website, and provide the file location of the index to MAESTRO in the YAML configuration file. Alternatively, users can use LISA to predict the potential transcription factors that regulate the marker genes from scRNA-seq clusters. Please follow the description on the LISA website to install and use this function.
MAESTRO utilizes giggle to identify enrichment of transcription factor peaks in scATAC-seq cluster-specific peaks. To run this function, users first need to install giggle, download the giggle index from the Cistrome website, and provide the file location of the index to MAESTRO in the YAML configuration file.