Error: grouping factors must have > 1 sampled level #146

esrasefik · 2020-11-28T18:46:29Z

Dear MAST Team,
I am trying to use MAST for genotype-based differential expression analysis in a number of cell-type specific clusters that I identified in my single cell RNA-Seq data. I have two genotypes (wild type and a mutant strain with a hemizygous deletion of 21 genes). I have 4 replicates per genotype, so 8 mice in total (metadata on replicate ids is referred to as orig.ident below). Due to concerns related to pseudo-replication, I would like to add a random effect for replicate to my model. However, when I try to run the code pasted below, I get an error that states:

Error: grouping factors must have > 1 sampled level no 'nobs' method is available. 
Error: number of levels of each grouping factor must be < number of observations (problems: orig.ident).

The pipeline does not crash in response to this error, but I would like to understand what is causing the error message. Any feedback would be appreciated.

Note: I performed my data preprocessing and clustering using Seurat. Hence, I first had to transform my Seurat object into an SCA object. I believe I followed the vignette correctly, but in case the error is caused by a mistake in one of the steps prior to running zlm, I included all relevant code:

s_obj4.sce <- as.SingleCellExperiment(s_obj4, assay = "RNA") # s_obj4 is my Seurat object that includes the raw counts
s_obj4.sce <- logNormCounts(s_obj4.sce, log = TRUE) # normalization
s_obj4.sca <- SceToSingleCellAssay(s_obj4.sce, check_sanity = FALSE) 

cdr <- colSums(assay(s_obj4.sca) > 0) # calculate cellular detection rate
colData(s_obj4.sca)$cngeneson <- scale(cdr)

cond <- factor(colData(s_obj4.sca)$genotype)
cond <- relevel(cond, "1") # "genotype = 1" refers to wild type
colData(s_obj4.sca)$genotype <- cond

s_obj4.sca_cl1 <- subset(s_obj4.sca, CellType == '1') # keep only cells with CellType == '1' so the analysis is restricted to neurons

zlm_cl1_test <- zlm(~ genotype + cngeneson + (1 | orig.ident),
                    sca = s_obj4.sca_cl1,
                    exprs_value = 'logcounts',
                    method = "glmer",
                    ebayes = FALSE,
                    silent = TRUE,
                    fitArgsD = list(nAGQ = 0))

Thank you for any help you can give me!
Esra


gfinak · 2020-11-28T19:14:40Z

Look at the cross-tabulation of what you're actually modeling.
What does the table of genotype by orig.ident for cell type 1 look like?
The error suggests that you have only one genotype or one value of orig.ident in your data, probably after subsetting to cell type 1.
The error is probably only occurring for some genes, and may be in the continuous part of the model, since that part is conditional on a gene being expressed.
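
A minimal sketch of the check suggested above, in base R on made-up data. The data frame `cells` is a hypothetical stand-in for `as.data.frame(colData(s_obj4.sca_cl1))`, and `expr` is an invented expression vector for one gene; neither comes from the original post. It tabulates genotype by replicate for all cells, then counts how many replicates remain once only the cells expressing the gene are kept, which is the subset the continuous part of the model is fitted on.

```r
# Toy stand-in for the cell metadata of one cluster (values are made up).
cells <- data.frame(
  genotype   = factor(rep(c("1", "2"), each = 6)),
  orig.ident = factor(rep(c("wt_a", "wt_b", "mut_a", "mut_b"), each = 3))
)

# Hypothetical log-expression of a single gene across the 12 cells:
# detected only in cells from replicate "wt_a".
expr <- c(1.2, 0.8, 2.1, 0, 0, 0, 0, 0, 0, 0, 0, 0)

# Cross-tabulation over all cells in the cluster.
tab_all <- table(cells$genotype, cells$orig.ident)
print(tab_all)

# Same question restricted to cells where the gene is detected.
expressing <- expr > 0
tab_expr <- table(droplevels(cells[expressing, ])$orig.ident)
n_levels_expr <- length(tab_expr)
print(tab_expr)

# A random intercept (1 | orig.ident) needs at least two sampled levels.
if (n_levels_expr < 2) {
  message("Only ", n_levels_expr, " replicate has expressing cells for this gene.")
}
```

In this toy case the full table looks fine (four replicates, two per genotype), but the expressing cells all come from one replicate, so a per-gene fit with `(1 | orig.ident)` would see a single level of the grouping factor.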

esrasefik · 2020-11-28T19:49:01Z

Ah that makes so much sense! Thank you for your quick response. In that case, do you think filtering the data for each cluster in the following way could help fix this issue: A gene should be expressed in at least 10% (just as an example) of the wild type or mutant cells of a given cluster. I believe the FindMarkers function in Seurat, which provides an option to use MAST for DE analysis, implements a thresholding rule like this. Is there a similar way I could filter my data in MAST prior to zlm or in the same function as zlm? Thank you again!

Esra

gfinak · 2020-11-28T19:52:37Z

Basically yes. When I do per cluster analysis I'll do the filtering in a loop. There's no specific API for this in MAST.
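
One way such a loop might look, sketched in base R on a toy matrix. The helper `keep_genes`, the 10% threshold, and all data below are illustrative assumptions, not MAST API and not the maintainers' code. The rule is the one proposed earlier in the thread: within each cluster, keep a gene if it is detected in at least `min_frac` of the cells of either genotype.

```r
# Hypothetical helper: TRUE for genes detected in at least min_frac of the
# cells of at least one genotype. expr is a genes x cells matrix.
keep_genes <- function(expr, genotype, min_frac = 0.10) {
  # Detection fraction per gene, one column per genotype.
  frac <- do.call(cbind, lapply(
    split(seq_len(ncol(expr)), genotype),
    function(idx) rowMeans(expr[, idx, drop = FALSE] > 0)
  ))
  rowSums(frac >= min_frac) > 0
}

# Toy data: 3 genes, 20 cells, 2 clusters, 2 genotypes (all made up).
cluster  <- rep(c("1", "2"), each = 10)
genotype <- rep(rep(c("wt", "mut"), each = 5), times = 2)
expr <- rbind(
  geneA = rep(1, 20),                    # detected in every cell
  geneB = c(rep(0, 10), rep(1.5, 10)),   # detected only in cluster 2
  geneC = rep(0, 20)                     # never detected
)

# Filter separately within each cluster.
kept <- list()
for (cl in unique(cluster)) {
  in_cl <- cluster == cl
  keep <- keep_genes(expr[, in_cl, drop = FALSE], genotype[in_cl])
  kept[[cl]] <- rownames(expr)[keep]
}
print(kept)
```

With a real SingleCellAssay object, the logical vector could presumably be used to subset rows of the per-cluster object before calling zlm. A detection-rate filter alone does not guarantee that the expressing cells span more than one replicate, so a per-gene check on the number of orig.ident levels may still be needed.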

esrasefik · 2020-11-28T20:20:41Z

I see. Thank you again! Given the need for cluster-specific filtering prior to zlm with random effect, I assume that cellular detection rate should be re-calculated for each cluster after filtering. Could you please confirm if that is accurate?

Esra

jgamache014 mentioned this issue Mar 30, 2021

simulate() error when running Power.Rmd script kdzimm/PseudoreplicationPaper#2

Closed
