scISR transforms the raw count but does not perform any imputation on zeros #3

Open
Rohit-Satyam opened this issue Jul 25, 2023 · 2 comments

Comments

@Rohit-Satyam

Rohit-Satyam commented Jul 25, 2023

@duct317
I tried running scISR on my dataset as follows. Surprisingly, the number of genes expressed does not go up; it stays the same. I used the following code:

library(scISR)

## Setting RNA as default assay
org <- lapply(sample.list, FUN = function(x) {
  DefaultAssay(x) <- "RNA"
  return(x)
})

## Performing Imputation
org.scisr <- lapply(org, function(x) {
  ## Convert the raw counts of the RNA assay to a dense matrix before imputing
  counts <- as.matrix(GetAssayData(x, assay = "RNA", slot = "counts"))
  scISR(counts, ncores = 5, preprocessing = FALSE, seed = 12345)
})

## Checking if all samples were imputed or some were left as is
## (compares the total count per cell before and after scISR)

lapply(seq_along(org.scisr), function(x) {
  table(colSums(org.scisr[[x]]) == colSums(org[[x]]@assays$RNA@counts))
})
[[1]]

 TRUE 
10651 

[[2]]

FALSE  TRUE 
15349  1186 

[[3]]

FALSE  TRUE 
 9743   865 

[[4]]

FALSE  TRUE 
 9893  1086 

[[5]]

FALSE 
 6737 
scisr <- lapply(seq_along(org.scisr), function(x) {
  ## Store the scISR output as a new assay next to the original RNA assay
  temp <- CreateAssayObject(counts = as.sparse(org.scisr[[x]]))
  org[[x]][["scisr"]] <- temp
  DefaultAssay(org[[x]]) <- "scisr"
  return(org[[x]])
})

names(scisr) <- names(org)
l <- lapply(scisr, function(x) {
  x <- NormalizeData(x, assay = "scisr")
  x <- FindVariableFeatures(x, selection.method = "vst", assay = "scisr")
  return(x)
})
## Replacing original mca because this is reference atlas
l$mca <- org$mca

saveRDS(l, "scisr.rds")

As you can see

@duct317
Owner

duct317 commented Jul 25, 2023

scISR first performs a statistical test to determine whether the data need to be imputed. It looks like scISR determined that the first dataset does not need imputation. There are changes in the other datasets.

@Rohit-Satyam
Author

@duct317 I understand that. The other samples do change, but the number of genes expressed per cell stays the same. So I think the imputation didn't work: had it worked, we would have observed an increase in the number of genes expressed in at least some cells. I say this because when I create violin plots per sample of the number of genes expressed per cell, both the distribution and the median stay the same. Of note, my count matrices are 99% zeros, and maybe because of that your method simply transforms the non-zero values and leaves the zero values untouched.
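
One way to separate the two cases described above (zeros being filled in versus non-zero values merely being rescaled) is to compare the matrices entry by entry instead of by per-cell totals, since a `colSums` comparison flags a change in either case. The following is only a minimal sketch in base R: `compare_detection` is a hypothetical helper, not part of scISR or Seurat, and it assumes both inputs are dense matrices with genes in rows, cells in columns, and identical dimensions. The toy data are made up for illustration.

```r
# Hypothetical helper (not part of scISR or Seurat): summarise how an
# imputed matrix differs from the raw counts (genes in rows, cells in columns).
compare_detection <- function(raw, imputed) {
  stopifnot(identical(dim(raw), dim(imputed)))
  was_zero <- raw == 0
  list(
    genes_per_cell_raw     = colSums(raw > 0),      # detected genes before
    genes_per_cell_imputed = colSums(imputed > 0),  # detected genes after
    zeros_filled           = sum(was_zero & imputed != 0),
    nonzeros_changed       = sum(!was_zero & imputed != raw)
  )
}

# Toy data standing in for one sample: 4 genes x 3 cells (filled column-wise).
raw <- matrix(c(0, 2, 0, 1,
                3, 0, 0, 0,
                0, 0, 5, 1),
              nrow = 4,
              dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3)))

imputed <- raw
imputed["gene1", "cell1"] <- 1.5  # a zero that was filled in
imputed["gene2", "cell1"] <- 2.4  # a non-zero value that was only rescaled

res <- compare_detection(raw, imputed)
print(res$genes_per_cell_raw)
print(res$genes_per_cell_imputed)
print(res$zeros_filled)
print(res$nonzeros_changed)
```

On the real data, something like `compare_detection(as.matrix(org[[x]]@assays$RNA@counts), org.scisr[[x]])` could be tried, assuming the scISR output keeps the same dimensions and gene/cell order as the input. If `zeros_filled` is 0 while `nonzeros_changed` is large, that would support the observation that only non-zero values were altered.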
