This repository contains the code necessary to reproduce the analysis and figures for

"Ancestry-based differences in the immune system are associated with lupus severity" (2022), Slight-Webb, S., Thomas, K., et al.

Installation

Download the code using

git clone https://github.com/KevinTThomas/sle_activity_scrnaseq.git

Requirements

The analysis requires R (>= 4.1), RStudio (>= 2021.09.0), and Python (>= 3.8), running on Ubuntu 20.04. The required R packages are listed in the DESCRIPTION file. Because this project is itself organized as an R package, the necessary packages can be installed (provided that you have BiocManager and remotes installed) with:

BiocManager::install('KevinTThomas/sle_activity_scrnaseq')

Additionally, to improve reproducibility and ease installation, a Docker container is provided. It can be retrieved using:

docker pull milescsmith/sle_activity_scrnaseq:4.1.2

The Docker container is based on the rocker/rstudio:4.1.2 image and thus runs a version of RStudio appropriate for compiling the analysis notebooks. The Dockerfile for building the above image can be found within this repository.
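As a rough sketch of how the pulled image might be started, the helper below wraps `docker run`. It relies on the conventions of the rocker/rstudio base image (RStudio Server on port 8787, a `PASSWORD` environment variable, and an `rstudio` user whose home is /home/rstudio); whether this derived image keeps those defaults is an assumption. The function name `start_rstudio`, the `DRY_RUN` switch, and the mount point inside the container are hypothetical and not part of this repository.

```shell
# Image name taken from the docker pull command above.
IMAGE="milescsmith/sle_activity_scrnaseq:4.1.2"

# start_rstudio REPO_DIR PASSWORD
# Hypothetical helper: mounts a local clone of the repository into the
# container and exposes RStudio Server on http://localhost:8787
# (assumes the rocker/rstudio defaults are unchanged in this image).
start_rstudio() {
  repo_dir="${1:-$PWD}"
  password="$2"
  if [ -z "$password" ]; then
    printf 'usage: start_rstudio REPO_DIR PASSWORD\n' >&2
    return 2
  fi
  # Build the command as positional parameters so paths are quoted safely.
  set -- docker run --rm -p 8787:8787 \
    -e "PASSWORD=${password}" \
    -v "${repo_dir}:/home/rstudio/sle_activity_scrnaseq" \
    "$IMAGE"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"   # only print the command
  else
    "$@"                 # actually start the container
  fi
}
```

Calling `start_rstudio "$PWD" some_password` from the cloned repository would then, under the assumptions above, make the notebooks available in RStudio at http://localhost:8787 (log in as `rstudio`).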

Because of the size of the data, it is recommended that the analysis be run on a Linux workstation with multiple cores and at least 64 GB of RAM or on an institutional cluster.

Analysis

The code is divided into two sections:

An analysis pipeline that processes the counts generated by Cell Ranger, performs QC and background noise removal, generates a Seurat object, normalizes data, performs batch correction, and carries out clustering and dimensional reductions.
Code for generating the figures used in the manuscript.

The code is implemented as a series of R Markdown files. Since the output of earlier notebooks is used as input for later notebooks, they should be run in the following order:

a. 01_pp_soupX.Rmd
b. 02_pp_object_construction.Rmd
c. 03_pp_qc.Rmd
d. 04_analysis.Rmd
e. 05_plotting.Rmd

These can be run either from within RStudio (using the knit function to compile the R Markdown documents into GitHub-compatible markdown files or HTML files) or on the command line using:

R -e "knitr::knit('analysis/01_pp_soupX.Rmd')"
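Because each notebook depends on the output of the previous one, the command above can be wrapped in a small loop. The following is a minimal sketch, not a script shipped with this repository: the function name `knit_all` and the `DRY_RUN` switch are hypothetical, the notebooks are assumed to live in the analysis directory, and the file names follow the list above with the first spelled as in the command-line example.

```shell
# Notebook file names, in the order they must be run.
NOTEBOOKS="01_pp_soupX.Rmd 02_pp_object_construction.Rmd 03_pp_qc.Rmd 04_analysis.Rmd 05_plotting.Rmd"

# knit_all [NOTEBOOK_DIR]
# Hypothetical helper: knits each notebook in turn and stops as soon as
# an R process exits with a non-zero status.
knit_all() {
  nb_dir="${1:-analysis}"
  for nb in $NOTEBOOKS; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      # Only print the command that would be run.
      printf '%s\n' "R -e \"knitr::knit('${nb_dir}/${nb}')\""
    else
      R -e "knitr::knit('${nb_dir}/${nb}')" || {
        printf 'knitting failed: %s\n' "$nb" >&2
        return 1
      }
    fi
  done
}
```

Running `DRY_RUN=1 knit_all` prints the five commands without executing them. Note that the loop only stops when R itself halts; depending on chunk options, knitr may record an error from a code chunk in the output document and continue, so the knitted files are still worth checking.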

A small subset of the data is provided as a toy example in the demo_data folder, and the notebooks are currently written to run on this toy dataset. To run them on the data from the study, change all instances of demo_data in the analysis/01_pp_soupX.Rmd file to the name of the directory in which the data is located. Within that data directory, create a subdirectory named "droplets" and place the final raw and filtered matrix folders for each run within their own appropriately named subdirectory. Also place a sample sheet describing the samples in the data directory; the da_samplesheet_final.csv file can serve as a template, but the sheet must have the columns

Subject_id	ancestry	classification	age	run	Hashtag
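Before starting a long run, it can help to confirm that a sample sheet has all of the required columns. The check below is a sketch under stated assumptions: it assumes the sheet is comma-separated with the column names on the first line (as the .csv extension suggests), and the function name `check_samplesheet` is hypothetical rather than part of this repository.

```shell
# Column names required by the analysis, as listed above.
REQUIRED_COLUMNS="Subject_id ancestry classification age run Hashtag"

# check_samplesheet PATH
# Hypothetical helper: returns 0 if every required column is present in
# the header line, 1 if any are missing, 2 if the file cannot be read.
check_samplesheet() {
  sheet="$1"
  if [ ! -r "$sheet" ]; then
    printf 'cannot read %s\n' "$sheet" >&2
    return 2
  fi
  # Put each header field on its own line; drop quotes and carriage returns.
  header="$(head -n 1 "$sheet" | tr -d '"\r' | tr ',' '\n')"
  missing=""
  for col in $REQUIRED_COLUMNS; do
    # -x matches the whole line, -F treats the name as a literal string.
    printf '%s\n' "$header" | grep -qxF "$col" || missing="$missing $col"
  done
  if [ -n "$missing" ]; then
    printf 'missing columns:%s\n' "$missing" >&2
    return 1
  fi
}
```

For example, `check_samplesheet path/to/samplesheet.csv && echo ok` prints `ok` only when all six columns are found; the path here is a placeholder. Column names are matched exactly, including case.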

Data

The scRNA-seq data has been deposited with the Gene Expression Omnibus and is available as series GSE189050.
