error message when I am using estimateDispersions() function. #20

ikwak2 · 2017-03-20T02:03:37Z

Thank you for developing such nice tools for analyzing scRNA-seq data. I enjoyed using Monocle 1. I have now reinstalled Monocle to try Census counts and to visualize my data.

However, I am getting errors that I didn't have previously.
Below are the error message from the estimateDispersions() function and my sessionInfo().

load("Xerr.RData") # npX : scRNA-seq expression data, pd = pheno, AnnotatedDataFrame, fd = feature, AnnotatedDataFrame.
pXX <- newCellDataSet(npX, phenoData = pd, featureData = fd)

rpc_matrix <- relative2abs(pXX)

pXX <- newCellDataSet(as(as.matrix(rpc_matrix), "sparseMatrix"),
                      phenoData = pd,
                      featureData = fd,
                      lowerDetectionLimit = 1,
                      expressionFamily = negbinomial.size())

pXX <- estimateSizeFactors(pXX)
pXX <- estimateDispersions(pXX)
Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) :
invalid character indexing
In addition: Warning message:
Deprecated, use tibble::rownames_to_column() instead.

sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6

locale:
[1] C

attached base packages:
[1] splines stats4 parallel stats graphics grDevices utils
[8] datasets methods base

other attached packages:
[1] monocle_2.2.0 DDRTree_0.1.4 irlba_2.1.2
[4] VGAM_1.0-3 ggplot2_2.2.1 Biobase_2.34.0
[7] BiocGenerics_0.20.0 Matrix_1.2-7.1

loaded via a namespace (and not attached):
[1] Rcpp_0.12.9 compiler_3.3.2 RColorBrewer_1.1-2
[4] plyr_1.8.4 tools_3.3.2 tibble_1.2
[7] gtable_0.2.0 lattice_0.20-34 igraph_1.0.1
[10] DBI_0.5-1 HSMMSingleCell_0.108.0 fastICA_1.2-0
[13] dplyr_0.5.0 stringr_1.1.0 cluster_2.0.5
[16] combinat_0.0-8 grid_3.3.2 R6_2.2.0
[19] qlcMatrix_0.9.5 pheatmap_1.0.8 limma_3.30.8
[22] reshape2_1.4.2 magrittr_1.5 scales_0.4.1
[25] matrixStats_0.51.0 assertthat_0.1 colorspace_1.3-2
[28] stringi_1.1.2 lazyeval_0.2.0 munsell_0.4.3
[31] slam_0.1-40

I am not sure what I've done wrong. I can send the "Xerr.RData" file if needed.

Thank you so much!
Sincerely,
ilyoup

Xiaojieqiu · 2017-03-21T06:04:25Z

We have not seen this error before. Yes, please attach the CDS file in your response here. We will be happy to take a look at it. Thanks.

ikwak2 · 2017-03-21T17:31:56Z

GitHub does not support the RData file format, so I sent the file to xqiu@uw.edu.
Thank you so much!

Xiaojieqiu · 2017-03-22T07:16:55Z

Thanks for your email. I have looked at your data. The 34th cell (column C34) has NaN values in your npX matrix (and so in the pXX CDS too). Because of this, the line Matrix::rowSums(rounded > cds@lowerDetectionLimit, na.rm = T) in disp_calc_helper_NB, which is called by the estimateDispersions function, yields only NA values, which leads to the error you saw.

After removing this cell, you can run estimateSizeFactors and estimateDispersions without error:

pXX_valid <- pXX[, -34]
pXX_valid <- estimateSizeFactors(pXX_valid)
pXX_valid <- estimateDispersions(pXX_valid)

Also, please note that estimateDispersions works better when you pool all the genes in your single-cell sample. In your example, you only have a few hundred genes.
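
The check described above can be generalized so the offending cell does not have to be located by hand. The following is a minimal sketch in base R on a toy matrix; `expr`, `bad_cells`, and `expr_valid` are hypothetical names, not part of Monocle, and with a real CellDataSet the matrix would presumably come from `exprs(pXX)`.

```r
# Toy expression matrix: 4 genes x 5 cells, with cell C3 containing NaN values.
expr <- matrix(c(1, 0, 3, 2,
                 0, 5, 1, 0,
                 NaN, NaN, NaN, NaN,
                 2, 2, 0, 1,
                 4, 0, 0, 3),
               nrow = 4,
               dimnames = list(paste0("gene", 1:4), paste0("C", 1:5)))

# is.na() is TRUE for both NA and NaN, so one test covers both cases.
bad_cells <- which(colSums(is.na(expr)) > 0)
print(bad_cells)

# Keep only cells without missing values (hypothetical name: expr_valid).
# Guard the empty case: expr[, -integer(0)] would drop every column.
expr_valid <- if (length(bad_cells) > 0) expr[, -bad_cells, drop = FALSE] else expr
dim(expr_valid)
```

With a CellDataSet, the same index could then be used in place of the hard-coded 34, for example `pXX[, -bad_cells]`.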

ikwak2 · 2017-03-22T17:00:35Z

Oh, got it. Thank you so much!

jgarces02 · 2017-07-21T15:05:04Z

Hi @Xiaojieqiu, I have the same problem, but with the markerDiffTable function. I've searched for zero or NA values, but there are none. My code is below:

path <- paste(getwd(), "2_count_outs/outs/filtered_gene_bc_matrices/GRCh38/", sep = "/")
matrix <- readMM(paste(path, "matrix.mtx", sep = ""))
pd <- read.table(paste(path, "barcodes.tsv", sep = ""))
colnames(pd) <- "cell_ID"
rownames(pd) <- pd$cell_ID
fd <- read.table(paste(path, "genes.tsv", sep = ""))
colnames(fd) <- c("transcript_ID", "gene_short_name")
rownames(fd) <- fd$transcript_ID
colnames(matrix) <- pd$cell_ID; rownames(matrix) <- fd$transcript_ID
pdata <- new("AnnotatedDataFrame", data = pd)
fdata <- new("AnnotatedDataFrame", data = fd)
rawdata <- newCellDataSet(matrix, phenoData = pdata, featureData = fdata, expressionFamily = negbinomial.size())

rawdata <- rawdata[1:30000,1:500]
rawdata <- estimateSizeFactors(rawdata)
rawdata <- estimateDispersions(rawdata)

rawdata <- detectGenes(rawdata, min_expr = 1) #zero
expressed_genes <- row.names(subset(fData(rawdata), num_cells_expressed >= 1))

gata1 <- row.names(subset(fData(rawdata), gene_short_name == "GATA1"))
gypa <- row.names(subset(fData(rawdata), gene_short_name == "GYPA"))
mpo <- row.names(subset(fData(rawdata), gene_short_name == "MPO"))
cebpb <- row.names(subset(fData(rawdata), gene_short_name == "CEBPB"))
dntt <- row.names(subset(fData(rawdata), gene_short_name =="DNTT"))
ebf1 <- row.names(subset(fData(rawdata), gene_short_name =="EBF1"))
fos <- row.names(subset(fData(rawdata), gene_short_name == "FOS"))
prdm1 <- row.names(subset(fData(rawdata), gene_short_name == "PRDM1"))
thy1 <- row.names(subset(fData(rawdata), gene_short_name == "THY1"))

cth <- newCellTypeHierarchy()
cth <- addCellType(cth, "Erythrocyte", classify_func = function(x) {x[gata1,] >= 1 & x[gypa,] >= 1}) # gata1 assumed here; the original post used ery_id, which is never defined above
cth <- addCellType(cth, "Myeloid", classify_func = function(x) {x[mpo,] >= 1 & x[cebpb,] >= 1})
cth <- addCellType(cth, "LiT", classify_func = function(x) {x[ebf1,] >= 1 & x[dntt,] >= 1})
cth <- addCellType(cth, "LiB", classify_func = function(x) {x[fos,] >= 1 & x[prdm1,] >= 1})
cth <- addCellType(cth, "Progenitors", classify_func = function(x) {x[thy1,] >= 1 & x[fos,] < 1})
rawdata_ct <- classifyCells(rawdata, cth)

marker_diff <- markerDiffTable(rawdata[expressed_genes,], cth, cores = 2)
## and here the error appears:
## Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : invalid character indexing

I can't upload my count matrix because it is in .mtx format, but if you need it I'll send it to you by email.

Thanks in advance!
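
One thing worth ruling out here, offered as a guess rather than a diagnosis: indexing a sparse matrix by a row name that is not present fails, and in the Matrix versions of that time the failure appears to have been reported as this same `invalid character indexing` error (newer Matrix releases may word it differently). Since `rawdata[expressed_genes,]` drops genes, a marker ID used inside a `classify_func` could be missing from the subset. A minimal sketch on a toy sparse matrix, where `m`, `marker_ids`, and `missing_ids` are hypothetical names:

```r
library(Matrix)

# Toy sparse matrix: 3 genes x 2 cells.
m <- Matrix(c(1, 0, 0,
              0, 2, 0),
            nrow = 3, sparse = TRUE,
            dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2")))

# Marker IDs a classifier would look up; "geneX" is deliberately absent.
marker_ids <- c("geneA", "geneX")

# Check membership before indexing instead of letting the lookup fail.
missing_ids <- setdiff(marker_ids, rownames(m))
print(missing_ids)

# Indexing by the absent name fails; the exact message depends on the Matrix version.
res <- tryCatch(m["geneX", ], error = function(e) conditionMessage(e))
print(res)
```

Applied to the code above, the analogous check would be something like `setdiff(c(gypa, mpo, cebpb), row.names(rawdata[expressed_genes,]))`; an empty result would rule this explanation out.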

vertesy · 2017-12-30T10:51:20Z

I have the same issue. There are no NA or NaN values in my expression matrix, yet I got the error:

> MyCellDataSet <- estimateDispersions(MyCellDataSet)
Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : 
  invalid character indexing
In addition: Warning message:
Deprecated, use tibble::rownames_to_column() instead.

Solution

It looks like the relative2abs() function introduces NaN values into the expression matrix.
This function also causes some cells (~10% in my case) to have only NaN values.
- When I ran Monocle on an unfiltered dataset (10K genes instead of the highest 1000), 27% of the cells were set to NaN-only values. Odd.

Replacing NAs with 0 and removing zero-only cells helped.

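rpc_matrix <- relative2abs(HSMM)

# count the missing values, then replace them with 0
NA_count <- sum(is.na(rpc_matrix))
rpc_matrix[is.na(rpc_matrix)] <- 0  # base R; the original post used na.replace(rpc_matrix, 0.), which is not in base R

# flag cells (columns) that have no reads at all
OnlyZeros <- colSums(rpc_matrix) == 0
paste(sum(OnlyZeros), "cells have zero reads in total, and there were", NA_count, "NA values before replacing NA with 0")

# keep only the cells with at least one read
Valid <- which(!OnlyZeros)
rpc_matrix <- rpc_matrix[, Valid]; dim(rpc_matrix)

# you need to subset the phenotype data too!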

PS: Additionally, the vignette code uses melt() but does not require(reshape2).
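
The reminder about subsetting the phenotype data can be sketched as follows. This is a minimal base-R example on toy data, assuming the phenotype table has one row per cell with row names matching the matrix column names; `counts`, `pheno`, and `keep` are hypothetical names.

```r
# Toy matrix: 3 genes x 4 cells; cell "c2" is NA-only, cell "c4" is zero-only.
counts <- matrix(c(1, 2, 0,
                   NA, NA, NA,
                   0, 3, 1,
                   0, 0, 0),
                 nrow = 3,
                 dimnames = list(paste0("g", 1:3), paste0("c", 1:4)))
pheno <- data.frame(batch = c("a", "a", "b", "b"),
                    row.names = colnames(counts))

# Replace missing values with 0, then flag cells that still have reads.
counts[is.na(counts)] <- 0
keep <- colSums(counts) > 0

# Apply the same filter to both objects so cells stay aligned.
counts <- counts[, keep, drop = FALSE]
pheno  <- pheno[colnames(counts), , drop = FALSE]

# Fail early if the two objects ever get out of sync.
stopifnot(identical(rownames(pheno), colnames(counts)))
dim(counts)
```

The filtered data frame can then be wrapped with `new("AnnotatedDataFrame", data = pheno)` before calling newCellDataSet(), as in the posts above.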

fereshtehizadi · 2018-09-26T18:15:27Z

Sorry, I am working with the URD package. When I try to plot markers on clusters, I always get this error:

> plotDot(object.6s.mnn, genes = c("DDB_G0267178", "DDB_G0267178", "DDB_G0285311", "DDB_G0290079", "DDB_G0267180", "DDB_G0273181"), clustering="Infomap-60")
Error in intI(i, n = d[1], dn[[1]], give.dn = FALSE) : 
  invalid character indexing
>

Could somebody please help me with this?

Thanks a lot

rpa12356 · 2024-03-11T06:32:08Z

Hello, I am using the Monocle R package to analyze single-cell data, but when I call the estimateDispersions() function, I get the following error: Error in log (ifelse (y==0, 1, y/mu)): (converted from warning to) NaNs generated
Here is my code:
cells_2<-subset(Data_harmony,labels%in%c("High-Malignant cells","low-Malignant cells"))
cells_2_matrix <- as(as.matrix(cells_2@assays$RNA@counts), 'sparseMatrix') # as.matrix() ignores a second argument, so the conversion is done with as()
p2_data <- cells_2@meta.data
p2_data$celltype <- cells_2@active.ident
f2_data <- data.frame(gene_short_name = row.names(cells_2_matrix),row.names = row.names(cells_2_matrix))
pd2 <- new('AnnotatedDataFrame', data = p2_data)
fd2 <- new('AnnotatedDataFrame', data = f2_data)
cds2 <- newCellDataSet(cells_2_matrix,
                       phenoData = pd2,
                       featureData = fd2,
                       lowerDetectionLimit = 0.5,
                       expressionFamily = negbinomial.size())
cds2 <- estimateSizeFactors(cds2)
cds2 <- estimateDispersions(cds2)

ikwak2 closed this as completed Mar 22, 2017

jgarces02 mentioned this issue Jul 24, 2017: Error message using markerDiffTable() function: "invalid character indexing" #40 (Open)
