Skip to content

This GitHub repository contains all the analysis code used in, "Characterization of single cell transcriptomic responses to osteoarthritis in a naturally occurring equine model."

License

Notifications You must be signed in to change notification settings

dyammons/equine_synovialFluid_scRNA

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

46 Commits
 
 
 
 
 
 
 
 

Repository files navigation

equine_synovialFluid_scRNA

DOI

This GitHub repository contains all the analysis code used in, "Characterization of the single cell landscape in normal and osteoarthritic equine joints."

The manuscript has been conditionally accepted and will be published in Annals of Translational Medicine shortly.

Repository goals:

  • provide a resource to make the data generated from this project accessible
  • enable reproducible/transparent data reporting
  • provide analysis code to reproduce custom figures

If you have any questions or concerns, please submit an issue, contact the corresponding author(s), and/or contact Dylan Ammons at dylan.ammons @ colostate dot edu.

File structure:

  • 📁 input instructions for obtaining data associated with this study
  • 📁 scripts contains the analysis code and source file used to complete the data analysis
  • 📁 output contains the expected directory structure

Supplemental data and potential uses:

  1. Browse the data
  2. Cell type annotations
  3. Reference Mapping
  4. Module scoring

Browse the complete annotated dataset

The proccessed dataset is avaliable for browsing via the UCSC Cell Browser portal. Using the portal you can explore feature expression throughout the dataset as well as obtain the transcriptomic signatures of each cell type though an interactive webpage.

Link to the dataset: https://equine-joint-oa.cells.ucsc.edu

Link to UCSC Cell Browser documentation: https://cellbrowser.readthedocs.io/en/master/

2. Cell type annotations with defining markers

Cell markers lists were curated using the full synovial fluid dataset of 3 normal and 3 osteoarthritis (OA) affected euqine joints. The top 50 defining features (identified using FindAllMarkers for each cell type were considered, with the top 24 features evaluated for specificity using violin plots and preference given to unique features only found in the top 50 of one cell type.

Cell type gene signatures

Cell type Major markers
Macrophage APOE, PLTP, C1QC, CD68, ENSECAG00000024790, LGMN
CCL2_Macrophage CCL2, CCL8, LYVE1, P2RY13, HBEGF, PCP4L1
CD5L_Macrophage MARCO, C4BPA, C1QC, CD55, MSR1, C1QB, ENSECAG00000022247
GPNMB_Macrophage APOE, CTSD, GPNMB, LGMN, ENSECAG00000024790, PLD3, ACP5
TPPP3_Macrophage ANXA2, S100A10, S100A4, ANXA1, LGALS3, TPPP3, S100A11, ANXA5
Mo-Mac LYZ, PLAC8B, ENSECAG00000000436, SLPI, SERPINB10, AQP9
Monocyte S100A12, ENSECAG00000010117, ECATH-3, VCAN, ENSECAG00000010598, ENSECAG00000010615
Dendritic cell CPVL, CST3, DQA, ENSECAG00000039383, DRA, DRB, IRF8, NAPSA, FCER1A, FLT3
cDC1 ENSECAG00000039383, ENSECAG00000039998, CPVL, IRF8, ENSECAG00000029928, GPIHBP1, DNASE1L3, FLT3
cDC2 FCER1A, CD1A7, CD1E2, DRB, DRA, DQB, CD1C, CD1A1
cycling_DC BRCA1, CDCA7, UHRF1, CLSPN, RRM2, BRCA2, HELLS, CDT1
migDC RORC, CCR7, NGFR, CHN1, SLCO5A1, SLC38A1, MARCKSL1, IL4I1
pDC GJB2, MS4A1, GJA1, FCRLA, OSBPL10, TCF4, IL18R1
Neutrophil ENSECAG00000030548, SNX10, SOD2, ILT11B, IFIT2, CCRL2, S100P, ENSECAG00000036967, SELP
CD4 T cells LTB, ENSECAG00000036014, ENSECAG00000000910, CD5, CD3E, CD2
SELL_CD4 SELL, KLF2, CCR7, LEF1, ENSECAG00000010622, ENSECAG00000040878, EEF1A1
Activated_CD4 ENSECAG00000000910, ENSECAG00000036014, CALY, CD40LG, ICOS
Treg ARID5B, FGL2, RALA, FOXO1, TNFRSF1B, CGA
CD8 T cells CCL5, GZMA, CD7, CTSW, NKG7, CD3E, CD2, CD3G, FASLG
Effector_CD8 GZMA, CTSW, CCL5, NKG7, DRB, KLRK1
DAPL1_CD8 DAPL1, GZMK, GZMM, IDO1, IRF7
CX3CR1_T CX3CR1, ENSECAG00000031322, ENSECAG00000022193, SCML4, ZEB2
IFN-T MX1, IRF7, ISG20, ENSECAG00000033029, ISG15, XAF1, SAMD9L
gd T cells KLRB1, GNLY, ENSECAG00000038560, TRDC, BLK, KLRF1, RORA, NCR3
GZMA_gd_T ENSECAG00000038560, BLK, TRDC, ENSECAG00000039473, KLRB1, GZMA, ENSECAG00000014832, CTSW
IL23R_gd_T1 KLRF1, SCART1, TRDC, KLRB1, TNFRSF25, IL23R, TNFSF13B, RORA, IKZF2
IL23R_gd_T2 ENSECAG00000038560, BLK, SCART1, RHEX, KLRF1, TRDC, TNFRSF25, IL23R, RORA
KLRD1_T KLRB1, HOPX, CRYBG2, KLRD1, CCL5, KLRK1, TRDC, CTSW
GNLY_T GNLY, KLRB1, KLRF1, CTSW, NKG7, TRDC, LAG3
B cells ENSECAG00000039599, MS4A1, CD79A, TCF4, CD19, PRAG1
Cycling cells PAFAH2, TOP2A, ENSECAG00000036105, H2AZ1, TUBA1A, NUSAP1, CENPF, H1-3

Using the data to complete reference mapping

Reference mapping is useful tool to facilitate the identification of cell types in single cell datasets. The approach described here uses Seurat functions to identify anchors between a query dataset (external/personal data) and the reference datasets generated in this study. The default approach describes how to use the healthy only dataset, but it will also work with the combined dataset if you load that file in as the reference.

Before running the reference mapping code, a Seurat object need to be preprocessed and stored as an object named seu.obj.

#set the path to the location in which the reference file is saved
reference <- readRDS(file = "eq_synovial_fluid_annotated.rds")

#prepare the reference
reference[['integrated']] <- as(object = reference[['integrated']] , Class = "SCTAssay")
DefaultAssay(reference) <- "integrated"

#find conserved anchors with query and reference
anchors <- FindTransferAnchors(
    reference = reference,
    query = seu.obj,
    normalization.method = "SCT",
    reference.reduction = "pca",
    dims= 1:50
)

#select meta.data slot to use for label transfer -- change refdata value to use alternate labels (i.e., refdata = reference$celltype.l1)
predictions <- TransferData(
    anchorset = anchors,
    refdata = reference$celltype.l3,
    dims = 1:50
)
seu.obj <- AddMetaData(seu.obj, metadata = predictions)

#generate and save the image
pi <- DimPlot(
    seu.obj, 
    reduction = "umap", 
    group.by = "predicted.id",
    pt.size = 0.25,
    label = T,
    label.box = T,
    shuffle = F
)
ggsave("./output/referenceMap.png", width = 7, height = 7)

4. Module scoring

Module scoring is a supplemental approach that can be applied to single cell datasets with the goal of providing further insights into cell identities. The approach described below uses the Seurat function AddModuleScore and the gene lists presented in Table 2 of our associated manuscript.

The concept of the AddModuleScore() function is similar to GSEA, but also distinct in many ways. Read the Seurat documentation and/or check out this webpage for more details.

#load in the reference file
# can download with this command: wget https://raw.githubusercontent.com/dyammons/equine_synovialFluid_scRNA/main/scripts/input/genesig_long.csv
ref.df <- read.csv("genesig_long.csv", header = T)

#organize the data
modulez <- split(ref.df$gene, ref.df$celltype.l2)

#complete module scoring
seu.obj <- AddModuleScore(seu.obj, features = modulez, name = "_score")

#correct the naming -- credit to: https://github.com/satijalab/seurat/issues/2559
names(seu.obj@meta.data)[grep("_score", names(seu.obj@meta.data))] <- names(modulez)

#plot the results
features <- names(modulez)
ecScores <- DotPlot(
    seu.obj,
    assay = "RNA",
    features = features
)

ggsave("./output/dots_celltypes.png", width = 10, height = 6)

About

This GitHub repository contains all the analysis code used in, "Characterization of single cell transcriptomic responses to osteoarthritis in a naturally occurring equine model."

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages