In [1]:
setwd('/lustre/scratch117/cellgen/team297/kt16/Ziad/raw/Trans')

In [2]:
library(SoupX)
library(Seurat)
library(Matrix)
library(reticulate)
scp = import("scanpy")
sp = import("scipy.sparse")

In [3]:
# run soupx on transcriptome data
soupx <- function(dataDirs){
    print(dataDirs)
    # generate clusters
    dat <- Read10X(paste0(dataDirs, '/filtered_feature_bc_matrix/'))
    raw <- Read10X(paste0(dataDirs, '/raw_feature_bc_matrix/'))
    seu <- CreateSeuratObject(counts = dat$`Gene Expression`)
    seu[["percent_mt"]] <- PercentageFeatureSet(seu, pattern = "^MT-")
    seu <- NormalizeData(seu, verbose = FALSE)
    seu <- FindVariableFeatures(seu, verbose = FALSE)
    seu <- ScaleData(seu, vars.to.regress = c("percent_mt", 'nCount_RNA'), verbose = FALSE)
    seu <- RunPCA(seu, features = VariableFeatures(object = seu), verbose = FALSE)
    seu <- FindNeighbors(seu, verbose = FALSE)
    # algorithm = 4 selects Leiden clustering (requires the Python leidenalg module)
    seu <- FindClusters(seu, algorithm = 4, verbose = FALSE)
    clusters <- Idents(seu)
    # run soupx
    sc <- load10X(dataDirs)
    sc <- setClusters(sc, clusters)
    sc <- autoEstCont(sc, doPlot = FALSE)
    out <- adjustCounts(sc)
    colnames(out) <- paste0(dataDirs, '_', colnames(out))
    # correct ADT data too
    sc2 <- SoupChannel(raw$`Antibody Capture`, dat$`Antibody Capture`)
    sc2 <- setClusters(sc2, clusters)
    useToEst <- estimateNonExpressingCells(sc2, nonExpressedGeneList = list(c('CD25+prot', 'PD1_prot')))
    # forceAccept = TRUE: proceed even when the estimated contamination fraction is extremely high
    sc2 <- calculateContaminationFraction(sc2, list(c('CD25+prot','PD1_prot')), useToEst, forceAccept = TRUE)
    out2 <- adjustCounts(sc2)
    # combine the new matrices
    # list() (rather than c()) keeps the two matrices as separate elements when passed to scipy
    newx = sp$hstack(list(t(out), t(out2)), format = 'csr')
    adata = scp$read_10x_mtx(paste0(dataDirs, '/filtered_feature_bc_matrix'), gex_only = FALSE)
    adata$X = newx
    # save as a .h5ad file
    adata$write(paste0(dataDirs, '/adata_soupx_trans_cite.h5ad'), compression = 'gzip')

    # also create a version where the SoupX-corrected transcriptome is combined with the original (uncorrected) CITE-seq counts
    newx = sp$hstack(list(t(out), t(dat$`Antibody Capture`)), format = 'csr')
    adata = scp$read_10x_mtx(paste0(dataDirs, '/filtered_feature_bc_matrix'), gex_only = FALSE)
    adata$X = newx
    # save as a .h5ad file
    adata$write(paste0(dataDirs, '/adata_soupx_trans.h5ad'), compression = 'gzip')
    
    # and a version using Mike's method
    # read in corrected ADT data
    cite <- read.delim(paste0(dataDirs, '/ADT_', dataDirs, '_bgCPM.txt.gz'), check.names = FALSE)
    cite <- tibble::column_to_rownames(cite, var = "ADT")
    cite <- Matrix::Matrix(as.matrix(cite), sparse = TRUE)
    # combine the new matrices
    newx = sp$hstack(list(t(out), t(cite)), format = 'csr')
    adata = scp$read_10x_mtx(paste0(dataDirs, '/filtered_feature_bc_matrix'), gex_only = FALSE)
    adata$X = newx
    # save as a .h5ad file
    adata$write(paste0(dataDirs, '/adata_soupx_trans_cite_bgshift.h5ad'), compression = 'gzip')
}
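
Before overwriting `adata$X`, it may be worth confirming that the matrices being stacked describe the same cells in the same order: a horizontal stack only needs the row counts to agree, so a reordered barcode set would pass silently. A minimal sketch, assuming a hypothetical helper `check_barcodes()` (not part of SoupX, Seurat or scanpy) and toy matrices with made-up barcodes in place of the real counts:

```r
# Hypothetical helper: verify that two feature-by-cell matrices share the same
# cell barcodes in the same order. `prefix` is stripped from the column names
# of `gex` first, mirroring the sample prefix added to the SoupX output.
check_barcodes <- function(gex, adt, prefix = "") {
    gex_bc <- colnames(gex)
    has_prefix <- startsWith(gex_bc, prefix)
    gex_bc[has_prefix] <- substring(gex_bc[has_prefix], nchar(prefix) + 1)
    adt_bc <- colnames(adt)
    if (length(gex_bc) != length(adt_bc)) {
        stop("different number of cells: ", length(gex_bc), " vs ", length(adt_bc))
    }
    if (!identical(gex_bc, adt_bc)) {
        stop("cell barcodes differ or are in a different order")
    }
    invisible(TRUE)
}

# Toy example (made-up barcodes and counts)
gex <- matrix(1:6, nrow = 2,
              dimnames = list(c("g1", "g2"), paste0("S1_", c("AAA", "CCC", "GGG"))))
adt <- matrix(1:3, nrow = 1,
              dimnames = list("ab1", c("AAA", "CCC", "GGG")))
check_barcodes(gex, adt, prefix = "S1_")
```

In the function above, a call such as `check_barcodes(out, cite, prefix = paste0(dataDirs, '_'))` just before each stacking step would be one place to use it; whether the externally corrected ADT file shares the barcode order is an assumption that this check would test.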

In [4]:
# Sample_Fq15 is left out here and processed separately further down
files <- paste0('Sample_Fq', setdiff(1:32, 15))

In [5]:
for (i in seq_along(files)) {
    soupx(files[i])
}

[1] "Sample_Fq1"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

1170 genes passed tf-idf cut-off and 480 soup quantile filter.  Taking the top 100.

Using 1576 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 21 clusters to 11651 cells.

Extremely high contamination estimated (0.72).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 72.33%

Expanding counts from 21 clusters to 11651 cells.



[1] "Sample_Fq2"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

2331 genes passed tf-idf cut-off and 935 soup quantile filter.  Taking the top 100.

Using 1817 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 22 clusters to 6908 cells.

Extremely high contamination estimated (0.73).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 73.40%

Expanding counts from 22 clusters to 6908 cells.



[1] "Sample_Fq3"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

916 genes passed tf-idf cut-off and 434 soup quantile filter.  Taking the top 100.

Using 1217 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 18 clusters to 7336 cells.

Extremely high contamination estimated (0.66).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 65.67%

Expanding counts from 18 clusters to 7336 cells.



[1] "Sample_Fq4"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

321 genes passed tf-idf cut-off and 190 soup quantile filter.  Taking the top 100.

Using 1118 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 16 clusters to 7163 cells.

Estimated contamination is very high (0.39).

Estimated global contamination fraction of 38.59%

Expanding counts from 16 clusters to 7163 cells.



[1] "Sample_Fq5"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

354 genes passed tf-idf cut-off and 131 soup quantile filter.  Taking the top 100.

Using 1155 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 15 clusters to 4158 cells.

Extremely high contamination estimated (0.58).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 57.94%

Expanding counts from 15 clusters to 4158 cells.



[1] "Sample_Fq6"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

359 genes passed tf-idf cut-off and 189 soup quantile filter.  Taking the top 100.

Using 1052 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 15 clusters to 4818 cells.

Extremely high contamination estimated (0.76).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 75.86%

Expanding counts from 15 clusters to 4818 cells.



[1] "Sample_Fq7"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

282 genes passed tf-idf cut-off and 244 soup quantile filter.  Taking the top 100.

Using 844 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 11 clusters to 5060 cells.

Estimated contamination is very high (0.37).

Estimated global contamination fraction of 37.11%

Expanding counts from 11 clusters to 5060 cells.



[1] "Sample_Fq8"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

349 genes passed tf-idf cut-off and 271 soup quantile filter.  Taking the top 100.

Using 1054 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 14 clusters to 6437 cells.

Estimated contamination is very high (0.33).

Estimated global contamination fraction of 33.33%

Expanding counts from 14 clusters to 6437 cells.



[1] "Sample_Fq9"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

256 genes passed tf-idf cut-off and 154 soup quantile filter.  Taking the top 100.

Using 788 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 13 clusters to 5646 cells.

Extremely high contamination estimated (0.56).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 56.22%

Expanding counts from 13 clusters to 5646 cells.



[1] "Sample_Fq10"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

474 genes passed tf-idf cut-off and 271 soup quantile filter.  Taking the top 100.

Using 1138 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 16 clusters to 6114 cells.

Estimated contamination is very high (0.41).

Estimated global contamination fraction of 41.41%

Expanding counts from 16 clusters to 6114 cells.



[1] "Sample_Fq11"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

445 genes passed tf-idf cut-off and 246 soup quantile filter.  Taking the top 100.

Using 999 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 14 clusters to 7871 cells.

Extremely high contamination estimated (0.71).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 71.45%

Expanding counts from 14 clusters to 7871 cells.



[1] "Sample_Fq12"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

403 genes passed tf-idf cut-off and 177 soup quantile filter.  Taking the top 100.

Using 1036 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 15 clusters to 6046 cells.

Estimated contamination is very high (0.44).

Estimated global contamination fraction of 44.22%

Expanding counts from 15 clusters to 6046 cells.



[1] "Sample_Fq13"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

476 genes passed tf-idf cut-off and 255 soup quantile filter.  Taking the top 100.

Using 1299 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 17 clusters to 7917 cells.

Extremely high contamination estimated (0.87).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 86.81%

Expanding counts from 17 clusters to 7917 cells.



[1] "Sample_Fq14"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

772 genes passed tf-idf cut-off and 446 soup quantile filter.  Taking the top 100.

Using 1440 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 18 clusters to 8264 cells.

Extremely high contamination estimated (0.78).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 78.24%

Expanding counts from 18 clusters to 8264 cells.



[1] "Sample_Fq16"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

353 genes passed tf-idf cut-off and 240 soup quantile filter.  Taking the top 100.

Using 933 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 12 clusters to 5524 cells.

Extremely high contamination estimated (0.64).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 63.63%

Expanding counts from 12 clusters to 5524 cells.



[1] "Sample_Fq17"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

706 genes passed tf-idf cut-off and 356 soup quantile filter.  Taking the top 100.

Using 1135 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 17 clusters to 5467 cells.

Extremely high contamination estimated (0.88).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 88.23%

Expanding counts from 17 clusters to 5467 cells.



[1] "Sample_Fq18"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

752 genes passed tf-idf cut-off and 291 soup quantile filter.  Taking the top 100.

Using 1208 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 18 clusters to 6690 cells.

Extremely high contamination estimated (0.79).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 79.03%

Expanding counts from 18 clusters to 6690 cells.



[1] "Sample_Fq19"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

834 genes passed tf-idf cut-off and 413 soup quantile filter.  Taking the top 100.

Using 1492 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 19 clusters to 7554 cells.

Extremely high contamination estimated (0.65).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 65.47%

Expanding counts from 19 clusters to 7554 cells.



[1] "Sample_Fq20"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

604 genes passed tf-idf cut-off and 357 soup quantile filter.  Taking the top 100.

Using 1251 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 16 clusters to 6350 cells.

Extremely high contamination estimated (0.66).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 66.35%

Expanding counts from 16 clusters to 6350 cells.



[1] "Sample_Fq21"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

732 genes passed tf-idf cut-off and 421 soup quantile filter.  Taking the top 100.

Using 1336 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 16 clusters to 6484 cells.

Extremely high contamination estimated (0.81).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 80.94%

Expanding counts from 16 clusters to 6484 cells.



[1] "Sample_Fq22"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

569 genes passed tf-idf cut-off and 337 soup quantile filter.  Taking the top 100.

Using 981 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 14 clusters to 5803 cells.

Extremely high contamination estimated (0.74).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 74.38%

Expanding counts from 14 clusters to 5803 cells.



[1] "Sample_Fq23"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

605 genes passed tf-idf cut-off and 385 soup quantile filter.  Taking the top 100.

Using 1116 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 15 clusters to 7650 cells.

Extremely high contamination estimated (0.7).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 70.49%

Expanding counts from 15 clusters to 7650 cells.



[1] "Sample_Fq24"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

704 genes passed tf-idf cut-off and 367 soup quantile filter.  Taking the top 100.

Using 1270 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 18 clusters to 9973 cells.

Extremely high contamination estimated (0.72).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 72.42%

Expanding counts from 18 clusters to 9973 cells.



[1] "Sample_Fq25"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

286 genes passed tf-idf cut-off and 161 soup quantile filter.  Taking the top 100.

Using 848 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 13 clusters to 9242 cells.

Extremely high contamination estimated (0.77).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 77.40%

Expanding counts from 13 clusters to 9242 cells.



[1] "Sample_Fq26"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

485 genes passed tf-idf cut-off and 283 soup quantile filter.  Taking the top 100.

Using 976 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 14 clusters to 8013 cells.

Extremely high contamination estimated (0.78).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 78.27%

Expanding counts from 14 clusters to 8013 cells.



[1] "Sample_Fq27"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

446 genes passed tf-idf cut-off and 259 soup quantile filter.  Taking the top 100.

Using 1103 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 16 clusters to 8578 cells.

Extremely high contamination estimated (0.72).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 72.18%

Expanding counts from 16 clusters to 8578 cells.



[1] "Sample_Fq28"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

574 genes passed tf-idf cut-off and 319 soup quantile filter.  Taking the top 100.

Using 1213 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 18 clusters to 8475 cells.

Extremely high contamination estimated (0.69).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 68.59%

Expanding counts from 18 clusters to 8475 cells.



[1] "Sample_Fq29"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

595 genes passed tf-idf cut-off and 347 soup quantile filter.  Taking the top 100.

Using 1070 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 16 clusters to 8744 cells.

Extremely high contamination estimated (0.77).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 77.07%

Expanding counts from 16 clusters to 8744 cells.



[1] "Sample_Fq30"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

359 genes passed tf-idf cut-off and 238 soup quantile filter.  Taking the top 100.

Using 972 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 14 clusters to 14159 cells.

Extremely high contamination estimated (0.81).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 81.44%

Expanding counts from 14 clusters to 14159 cells.



[1] "Sample_Fq31"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

527 genes passed tf-idf cut-off and 328 soup quantile filter.  Taking the top 100.

Using 1131 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 14 clusters to 3385 cells.

Extremely high contamination estimated (0.61).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 61.16%

Expanding counts from 14 clusters to 3385 cells.



[1] "Sample_Fq32"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

402 genes passed tf-idf cut-off and 209 soup quantile filter.  Taking the top 100.

Using 964 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 15 clusters to 7610 cells.

Extremely high contamination estimated (0.74).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 74.42%

Expanding counts from 15 clusters to 7610 cells.
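
A plain `for` loop over samples stops at the first sample that raises an error. A minimal sketch of a more forgiving runner, assuming a hypothetical helper `run_all()` that catches errors per sample and returns the names of the samples that failed; `toy_step` stands in for `soupx()` so the example runs without any data:

```r
# Hypothetical helper: apply `fn` to every sample, keep going on errors,
# and return the names of the samples that failed.
run_all <- function(samples, fn) {
    failed <- character(0)
    for (s in samples) {
        ok <- tryCatch({
            fn(s)
            TRUE
        }, error = function(e) {
            message("failed on ", s, ": ", conditionMessage(e))
            FALSE
        })
        if (!ok) failed <- c(failed, s)
    }
    failed
}

# Toy stand-in for the per-sample function (made-up sample names)
toy_step <- function(s) {
    if (s == "Sample_B") stop("missing input")
    invisible(s)
}

failed <- run_all(c("Sample_A", "Sample_B", "Sample_C"), toy_step)
print(failed)
```

With the real function this would be called as `run_all(files, soupx)`, and any sample names returned could be rerun individually.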



In [4]:
file = 'Sample_Fq15'

In [6]:
soupx(file)

[1] "Sample_Fq15"


10X data contains more than one type and is being returned as a list containing matrices of each type.

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading raw count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading cell-only count data

10X data contains more than one type and is being returned as a list containing matrices of each type.

Loading extra analysis data where available

318 genes passed tf-idf cut-off and 185 soup quantile filter.  Taking the top 100.

Using 1225 independent estimates of rho.

Estimated global rho of 0.01

Expanding counts from 18 clusters to 10859 cells.

Extremely high contamination estimated (0.76).  This likely represents a failure in estimating the contamination fraction.  Set forceAccept=TRUE to proceed with this value.

Estimated global contamination fraction of 76.02%

Expanding counts from 18 clusters to 10859 cells.
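
The "global contamination fraction" lines in the output vary widely between samples, while the automatic estimate is a rho of 0.01 throughout. Judging by the order of the messages, the large values appear to come from the antibody (ADT) step. A small sketch for tabulating them and flagging samples, using a few values copied from the log; the 0.3 and 0.5 cut-offs are assumptions inferred from where the "very high" and "extremely high" messages appear, not values taken from the SoupX documentation:

```r
# Contamination fractions (%) copied from a few samples in the log
cont <- c(Sample_Fq1 = 72.33, Sample_Fq4 = 38.59, Sample_Fq7 = 37.11,
          Sample_Fq8 = 33.33, Sample_Fq13 = 86.81, Sample_Fq17 = 88.23)

# Assumed cut-offs (as fractions), inferred from the warning messages
flag <- cut(cont / 100, breaks = c(0, 0.3, 0.5, 1),
            labels = c("ok", "very high", "extremely high"))

summary_tab <- data.frame(sample = names(cont), percent = cont, flag = flag,
                          row.names = NULL)
print(summary_tab)
```

Samples flagged "extremely high" are the ones where the log itself warns that the estimate likely represents a failure, so they are the first candidates for checking the choice of non-expressed antibodies.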

