How to perform subclustering and DE analysis on a subset of an integrated object #1883

attal-kush · 2019-07-24T19:19:42Z

Hi Team Seurat,
Similar to issue #1547,
I integrated samples across multiple batch conditions and diets after performing SCTransform (according to your most recent vignette for integration with SCTransform - Compiled: 2019-07-16). Now, I have a Seurat object with 3 assays: RNA, SCT, and Integrated.

So I have a couple of questions regarding my workflow:

For downstream DE analysis, the scale.data slot in the SCT assay has disappeared after integration. Therefore, I assume I cannot use Pearson residuals for DE analysis. Is it possible and valid instead to use values from the "data" slot of the SCT assay (log-normalized corrected values) for the MAST test?
I want to subset a specific cell type (cluster) and examine subtypes in this cell type. So, my here is my workflow:
SCT_integrated <- IntegrateData(anchorset = SCT_Integrated.anchors, normalization.method = "SCT", features.to.integrate = rownames(SCT_Integrated))
SCT_integrated <- RunPCA(SCT_integrated)
SCT_integrated <- FindNeighbors(SCT_integrated, dims = 1:15)
SCT_integrated <- RunUMAP(SCT_integrated, dims = 1:15)
SCT_integrated <- FindClusters(SCT_integrated)

control_subset <- subset(SCT_integrated, orig.ident = 'Chow')
DefaultAssay(control_subset) <- "RNA"
control_subset <- FindVariableFeatures(control_subset, selection.method = "vst", nfeatures = 3000)
DefaultAssay(control_subset) <- "integrated"
control_subset <- ScaleData(control_subset, verbose = FALSE, vars.to.regress = c( "percent.mt"))
control_subset <- RunPCA(control_subset, npcs = 30, verbose = FALSE)
control_subset <- FindNeighbors(control_subset, dims = 1:15)
control_subset <- RunUMAP(control_subset, dims = 1:15)
control_subset <- FindClusters(control_subset)

I want to know:
a. Is it valid to set features.to.integrate to all the genes in the original Seurat object if I want run subclustering on the subset using its integrated assay?
b. Is it necessary to run FindVariableFeatures on the RNA assay of the subset and get new variables to use in PCA in order to properly cluster the subset? I am worried that the top variable features of the original Seurat Object are not the same variable features of the new subset. Also, instead of changing the default assay to "RNA", finding the variable features, and changing the default assay back to "integrated", would it be make more sense to just delete those lines of code and just change:
control_subset <- RunPCA(control_subset, npcs = 30, verbose = FALSE) to
control_subset <- RunPCA(control_subset, npcs = 30, verbose = FALSE, features = Variable Features(control_subset))
c. Should FindVariableFeatures be run on the RNA assay, the integrated assay, or the SCT assay? (I assume if I just need to delete the 3 lines of code I just mentioned above and change
control_subset <- RunPCA(control_subset, npcs = 30, verbose = FALSE) to
control_subset <- RunPCA(control_subset, npcs = 30, verbose = FALSE, features = Variable Features(control_subset)),
then the answer is to run it on the integrated assay).
d. Should ScaleData be run on the subset prior to PCA even though the subset comes from an integrated object prepped from SCT? (I ask because in the new integration vignette, it explicitly mentions not to run ScaleData after running the IntegrateData function)?

I have a conceptual question about the batch-correction (integration) model developed by Seurat (the one from the most recent vignette for integration with SCTransform - Compiled: 2019-07-16). Does this batch-correction overfit the data so much so such that legitimate biological differences in gene expression profiles of cells from different diets (HFD, LFD, Chow) are gone? Since the data I am analyzing comes from different diets as well as different batches, will batch-correction make me unable to determine differences in gene expression of cells from different diets? If so, would only performing batch correction on batches of the same diet and merging all the diets together without batch correction be a valid method of retaining gene expression differences between diet but not batches?
If I decide that batch correction is not required for my samples, could I subset cells from my original Seurat Object (after running Quality Control and clustering on it), set the assay to "RNA", and and run the standard SCTransform pipeline. In other words, is this workflow valid:
SCT_not_integrated <- FindClusters(SCT_not_integrated)
control_subset <- subset(SCT_not_integrated, orig.ident = 'Chow')
DefaultAssay(control_subset) <- "RNA"
control_subset <- SCTransform(control_subset, vars.to.regress = "percent.mt") %>% RunPCA() %>% FindNeighbors(dims = 1:15) %>% RunUMAP(dims = 1:15) %>% FindClusters()

In addition, since I am not integrating the subset, is it recommended to use the "scale.data" slot in the SCT assay for DE analysis or continue using the "data" slot in the SCT assay for this subset?

Thanks for all your help

MediciPrime · 2019-07-24T20:35:15Z

@attal-kush I hope its okay to piggyback of your question. I have been following the SCTransform integration tutorial and it doesn't mention how to FindClusters or identify cluster specific markers. I simply used the FindNeighbors and FindClusters command in order to create the 'seurat_clusters' list in the meta.data. Could you please let me know if the steps below are the correct way to go about identifying clusters and markers?

Steps

First the following steps were performed in the order that they were displayed: SCTransform, SelectIntegrationFeatures, PrepSCTIntegration, FindIntegrationAnchors, IntegrateData, RunPCA and RunUMAP.
At this point the tutorial displayed the UMAP plots with DimPlots and went forward to combine additional human PBMC datasets from eight different technologies. However I did the following:
all.combined <- FindNeighbors(all.combined, dim = 1:10)
all.combined <- FindClusters(all.combined, resolution = 0.2)
Next I perform FindConservedMarkers on each of the cell clusters to identify conserved gene markers for each cell cluster.

Are these the correct steps to follow? I just want to make sure the Seurat Team agrees with my workflow for identifying the cell clusters and conserved markers for the integrated and sctransform analysis.

Thank you!

graddi · 2019-09-08T16:14:45Z

Hello Seurat Team,

Thank you for the wonderful package. I have similar questions as @attal-kush with regards to reclustering of a subset of an integrated object. The ideal workflow is not clear to me and perusing the vignettes and past issues did not clarify it fully.

Is this workflow indeed the best? Or should we go directly onto integrated dataset and RunPCA?

control_subset <- subset(SCT_integrated, orig.ident = 'Chow')
DefaultAssay(control_subset) <- "RNA"
control_subset <- FindVariableFeatures(control_subset, selection.method = "vst", nfeatures = 3000)
DefaultAssay(control_subset) <- "integrated"
control_subset <- ScaleData(control_subset, verbose = FALSE, vars.to.regress = c( "percent.mt"))
control_subset <- RunPCA(control_subset, npcs = 30, verbose = FALSE)
control_subset <- FindNeighbors(control_subset, dims = 1:15)
control_subset <- RunUMAP(control_subset, dims = 1:15)
control_subset <- FindClusters(control_subset)

j-andrews7 · 2019-09-19T01:45:07Z

@MediciPrime That looks correct to me, though your resolution=0.2 parameter is quite low.

I would also like to know the recommended way of doing this.

MediciPrime · 2019-09-19T15:06:07Z

Thank you for your reply @j-andrews7!

atermanini · 2019-10-21T13:37:13Z

Hi all, I'm also interested to this topic: what is the best way to subset and reclustering data starting from an integrating dataset? After subsetting clusters of interest (subsetting by ident) I have a Seurat object with RNA, SCT and integrated assay, and dimensional reduction (pca, tsne, umap) coming from the original Seurat object. The integrated assay consists of 3000 features comings from the original integration analysis (so choosed from the whole dataset, and not only from cells of the subset).

Thanks!!
Alberto

amayer21 · 2019-10-24T15:18:25Z

Hi all,
I'm also interested in understanding better how to do this. I'm writing here to be sure to receive an email when somebody will post an explanation here :-)

My intuition would have been that when I focus on 1 cluster, I can start the whole process (including integration) as if it was a fresh dataset, especially given the fact that the integration has focused on genes that were highly variable in the whole dataset but may not be variable at all in the population I'm focusing on... But reading a few posts and issues here, it's not the way to go and I would like to understand why and to know how to do it properly.

Thank you @satijalab for this amazing tool and the amazing tutorials !!!!
All the best,
Alice

tilofrei · 2019-10-27T20:47:42Z

a) My approach would be to just run FindClusters() with a higher resolution on the whole dataset until the desired subclustering is reached. Then find the DEGs between 2 clusters with FindMarkers(ident.1=, ident.2=).

b) Running FindVariableGenes() and RunPCA() again on the integrated dataset does not seem helpful to me because the limited feature space of 3000 is not changed. The alternative would be to subset() the population of interest and run the complete preprocessing including integration only on those cells again. That enables to change the feature space.

But I'm also curious how others approach this!

EDIT:
After discussing with colleagues and reading other articles I decided to go for option b)

See this paper also:

JingleW · 2020-01-13T06:41:50Z

Hi all, I'm also interested in this issue, and wonder what is the best way to subset and reclustering data starting from an integrating dataset?
As suggested by #2042, you can change the set of features to be integrated by using the features.to.integrate argument in IntegrateData. By default, this is set to the VariableFeatures. How to set the 'features.to.integrate' as all the features?

JRPortman · 2020-02-18T15:34:10Z

Hi All,

I am also stuck on this issue too. I followed a similar approach to @amayer21 with regards to treating the data set as new after removing clusters/cells. Which of course included re-calculating the variable genes (on the "RNA" Slot) and re-integration. For the same reasons, I felt this was the most intuitive way. From my understanding, including all genes into the "Feature.to.integrate" functions will give you a gene matrix of all genes altered based on the integration, but the PCA analysis and subsequent non-linear dimensionality reduction and clustering will still be calculated based on the 2000 features found in the "Find.Integration.anchors" functions (unless otherwise stated), which change depending on the original data used, ie subsetted or whole.
From reading the other issues posted regarding the topic it appears that any kind of re-analysis prior to integration is not recommended, and that once subsetted a integrated data set should just be re-scaled and the pipeline followed on from this point on.

Similar to @amayer21 I am wondering what the best way to approach this is, and why treating a subsetted data set as new is not the correct way to run an integrated analysis pipeline?

Thank you @satijalab for this amazing analysis package.
Best wishes
Jordan

KyriakisDimitrios · 2020-03-20T10:02:34Z

Hi All,
I followed a similar approach to @attal-kush. Does anyone has found a better solution to re-project a cluster of the dataset?

Excess-Code · 2020-03-29T18:03:53Z

I know that we shouldn't rescale subsetted data from an integrated object but is it possible to RunUMAP on the subsetted data so I can at least get a plot? I have increased the resolution on FindClusters to analyze the integrated object and get my cluster of interested subclustered enough for DEG analysis but would simply like a new UMAP plot to visualize expression within that group of clusters.

LuciaJim · 2020-04-01T00:16:06Z

Hello,
Still in the same situation.
@timoast , how can we finally tackle this issue?

It seems that a repeated possibility would be to change the features.to.integrate argument in IntegrateData to all_common_features between the different integrated datasets, however I have a quite big dataset (100.000 cells) and I'm experiencing memory issues:

Error in subCsp_rows(x, i, drop = drop) : 
  Cholmod error 'problem too large' at file ../Core/cholmod_sparse.c, line 92

In any case, could this workflow (slightly modified from the one from @attal-kush) be accepted to subcluster from an integrated object?

commongenes_tokeep <- Reduce(intersect, lapply(SCT_Integrated.anchors@object.list, rownames))
SCT_integrated <- IntegrateData(anchorset = SCT_Integrated.anchors, normalization.method = "SCT", features.to.integrate = commongenes_tokeep)
SCT_integrated <- RunPCA(SCT_integrated)
SCT_integrated <- RunUMAP(SCT_integrated, dims = 1:30)
SCT_integrated <- FindNeighbors(SCT_integrated, dims = 1:30)
SCT_integrated <- FindClusters(SCT_integrated, resolution=0.8)

DefaultAssay(Sample_subset) <- "RNA"
Sample_subset <- subset(SCT_integrated, idents = c(0, 2, 3, 7))
Sample_subset <- ScaleData(Sample_subset, vars.to.regress = c("percent.mt")
Sample_subset <- RunPCA(Sample_subset)
Sample_subset <- RunUMAP(Sample_subset, dims = 1:30)
Sample_subset <- FindNeighbors(Sample_subset, dims = 1:30)
Sample_subset <- FindClusters(Sample_subset, resolution = 0.8)

vwtlin · 2020-06-22T10:15:47Z

@attal-kush Your questions are so comprehensive and I am also curious if there is a practical way to analyse the subsetted cells.

Btw, regarding DE analysis in your question 1, according to #1836 (comment), it says that both RNA and SCT assay could be used for DE analysis if my understanding is correct.

Diozao666 · 2020-08-03T19:19:05Z

it makes no sense to me the not to use the integrated assay on every downstream analysis. isn't the whole point of integration to remove batch effects?

I tried this two ways

after integration I subsetted my cells of interest and did SCTransform on the RNA assay for clustering, but for DE I used the RNA assay, as it is officially recommended (from what I understand, the batch effects are still there). I did see batch effects here (cells from different batches did not share clusters).
after integration, I subsetted my cells of interest using the integrated assay, and I still see apparent batch effects.

and reading this issue I only got more confused.

Zha0rong · 2020-08-03T19:48:19Z

Hi @attal-kush ,
I have also been working on the single cell dataset and there are several times that i need to subcluster a proportion cell type. I have tried several approaches before and i think they may be helpful:

Choose a subset of cells, and use the integration assay to Run PCA, umap, findneighbors and findclusters to do subclustering. The pro of this approach is that it is fast and easy. However there are a few times that i found some genes that are primary markers for one certain subtype of the cells i want to sub clustering do not exist in the integration assay, which may lead to some problems.
Choose a subset of cells, and then split by samples and then re-run the integration steps (select integration features, find anchors and integrate data). The pro of this approach is that I use this method to solve the problem in the previous approach and now i have the genes that are primary markers for the cell sub types. However i do not believe this is the correct approach to do integration so i did not choose this method.
I hope this helps.

Diozao666 · 2020-08-03T20:15:58Z

Hi @attal-kush ,
I have also been working on the single cell dataset and there are several times that i need to subcluster a proportion cell type. I have tried several approaches before and i think they may be helpful:

Choose a subset of cells, and use the integration assay to Run PCA, umap, findneighbors and findclusters to do subclustering. The pro of this approach is that it is fast and easy. However there are a few times that i found some genes that are primary markers for one certain subtype of the cells i want to sub clustering do not exist in the integration assay, which may lead to some problems.

Choose a subset of cells, and then split by samples and then re-run the integration steps (select integration features, find anchors and integrate data). The pro of this approach is that I use this method to solve the problem in the previous approach and now i have the genes that are primary markers for the cell sub types. However i do not believe this is the correct approach to do integration so i did not choose this method.
I hope this helps.

which do you use for DE and heatmap?

owenwilkins · 2020-08-17T13:44:28Z

did the Seurat team ever respond to this..?

vertesy · 2020-09-18T15:17:30Z

I think the proper way is to subset before integration as in Smillie et al. 2019 as referred to by @tilofreiwald. Note that @timoast from the Seurat team recommended otherwise, although I never seen an explanation why would this not best way to go.

While I did not test the above, I tested running FindVariableFeatures() (or not), and I recommend re-running FindVariableFeatures(). Note, that tested this on one data set only so far.

Without `FindVariableFeatures()` it is basically a zoom-in on the original (before subset) UMAP.

Thus not much is gained.

After re-running `FindVariableFeatures()`: the neurons and corresponding progenitors (7,8) are linked

Also, cells previously occurring as cluster outliers from cl7 found their way to the corresponding clusters.

Note that overall, the major structure is conserved, the effect may be particular to this data set.

JingleW · 2020-09-24T02:49:42Z

Hi @vertesy ,
I used the first way as @Zha0rong described for re-clustering of subset cells, choosing a subset and then use the integration assay to Run PCA, umap, findneighbors and findclusters to do subclustering. It works, however, for some types of cells, not very well. So I guess FindVariableFeatures of the subset cells should be tried. But I am not sure which assay should be used for FindVariableFeatures of the subset cells, RNA, SCT, or Integrated? I did integration with SCTransform.
Thank you.

JackieMium · 2021-01-10T02:36:40Z

Glad to find out so many of you thinking about the same problem here, sad to realize there is indeed no pratical guide about how to do this properly yet.
Subsetting the before integrating data to interested cells and then do the whole integration, followed by PCA, umap, findneighbors and findclusters seemed reasonale to me. But how do I subset a data before clustering? As cell identity is only available after intergration and clustering?

ajneville · 2021-01-20T05:50:34Z

I would like some help with this thread as well. Thank you @satijalab !!!!

vwtlin · 2021-02-01T09:25:22Z

Glad to find out so many of you thinking about the same problem here, sad to realize there is indeed no pratical guide about how to do this properly yet.
Subsetting the before integrating data to interested cells and then do the whole integration, followed by PCA, umap, findneighbors and findclusters seemed reasonale to me. But how do I subset a data before clustering? As cell identity is only available after intergration and clustering?

This issue may help you address your question.
#2812 (comment)

riselinuh · 2021-02-09T12:05:40Z

It would be nice if Satija lab could give more clear instruction on how to proceed in case of high versus low heterogeneity after subsettting. @satijalab, could you please help us? Thank you!

denvercal1234GitHub · 2021-03-24T11:47:45Z

Hi all,

I wonder if anyone has found a definitive answer for this?

I also have similar question.

I did SCTransform() workflow, then subset a cluster of interest.
I then change DefaultAssay to RNA, run SCTransform() again setting the do.scale = TRUE, and do.center = TRUE.
So, that means I run NormalizeData(), FindVariableFeatures() and ScaleData() on my RNA assay again after subsetting a cluster.
Does it look right?

Thank you so much!

Dragonmasterx87 · 2021-06-03T16:22:14Z

@vertesy just came here to chime in after seeing your comment mate, so I tried what you are suggesting, and I see no marked difference, in fact, I don't have the data to show rn because I've a lot on my plate currently, but subset>integrate>re-cluster is more laborious and less useful than integrate>subset>re-cluster. (by re-cluster I mean the entire subsetted dataset is treated as an independent body of cells and re-analyzed similar to what you allude to. I think of this as if you FACS sorted cells and were to analyze them independently of the other cell types).

Everyone: I strongly suggest using the RNA assay for all DE. This is because the RNA slot is a true representative of biological variation, when someone tries to reproduce your findings they won't perform a negative binomial regression on their PCR. Now I understand that batch variation is a pain in the a** but honestly one has to assume this will occur naturally in a PCR as well. Honestly now I'm very stringent on what my definition of a DE is because minor gene fluctuations in scRNAseq data are very unreliable and reside within the realm of false-positive dropouts. Nevertheless, I have seen that normalized RNA (log norm'd) is very reproducible in a PCR/bulk RNAseq/rnaFISH exp (if your DE gene FC is >1.5x and expressed in atleast 50% of cells).

Unless a gene is not expressed (n-reads) at 1/p* try to forget about it just like a bad day (p* being the relative mean gene expression taking into account cDNA library construction efficiency, which in the case of 10x is 15%, or 1/p* = 1/0.15 ≅ 7 reads/cell/gene). You can read more on the concept here in Martin's paper.

As far as heterogeneity goes, if you keep sub-sampling till you reach 2 cells you will find differences between even them. So there is really no simple answer because heterogeneous populations themselves should be reproducibly present in multiple individuals, in order to identify that cell type as an organism-specific cell not a donor-specific transcriptional fluctuation in a particular cell population.

Cheers, all look forward to learning more on this when the devs respond.
🐲

atakanekiz · 2021-12-03T21:29:04Z

I am also wondering if there is an official recommendation for this task. Replies here and in some other GitHub issues have slightly different approaches but they all make general sense. Out of all possible solutions, I feel like performing the analysis as @tilofreiwald's "option b" would be the best. That way, one would avoid the pitfall described in @Zha0rong's first scenario because the sub-clustering would have been driven by the variable features recalculated in the data subset.

We would all appreciate it if @timoast or others from the @satijalab can chime in. Many, many thanks for the great package and continued support!

atakanekiz · 2021-12-04T11:56:23Z

As the proof is in the pudding, I decided to try different approaches on my own data and share my findings here. I hope it is useful. The sample code is also provided at the end.

My scenario is very similar to what @attal-kush described. In short:

I have 6 scRNAseq runs of mixed immune cells
Data were initially integrated using SC transform vignette (resulting in combined_all object in the sample code below)
I subsetted all T cells (ie. non zero expression of Cd3e and Cd3g markers in the RNA assay)
Using this subsetted data, I tried 4 different approaches:
- Approach 1: Default reintegration > Re-cluster (following Integration tutorial)
- Approach 2: SCT reintegration > Re-cluster (following Integration tutorial)
- Approach 3: No re-integration > Re-scale > Re-cluster (following 3k PBMC Guided Tutorial)
- Approach 4: No re-integration > SC transform > Re-cluster (following SCT tutorial)

I found that the first and second approaches lead to a nice integration while the third and fourth lead to an uncorrected batch effect (see the image below). That would be great if someone can confirm or deny :)

Subsetting

DefaultAssay(combined_all) <- "RNA"
tcell <- subset(x = combined_all, subset = Cd3e > 0 & Cd3g > 0)

Approach 1

tcell_list <- SplitObject(tcell, split.by = "sample")
tcell_list <- lapply(X = tcell_list, FUN = function(x) {
    x <- NormalizeData(x)
    x <- FindVariableFeatures(x, selection.method = "vst", nfeatures = 2000)
})
features <- SelectIntegrationFeatures(object.list = tcell_list)
anchors <- FindIntegrationAnchors(object.list = tcell_list, anchor.features = features)

# set k.weight so that it is not larger than the number of cells in the smallest dataset
# https://github.com/satijalab/seurat/issues/3936
tcell_combined1 <- IntegrateData(anchorset = anchors, k.weight = 46)
DefaultAssay(tcell_combined1) <- "integrated"

tcell_combined1 <- ScaleData(tcell_combined1, , vars.to.regress = "percent.mt", verbose = FALSE)
tcell_combined1 <- RunPCA(tcell_combined1, npcs = 30, verbose = FALSE)
tcell_combined1 <- RunUMAP(tcell_combined1, reduction = "pca", dims = 1:10)
tcell_combined1 <- FindNeighbors(tcell_combined1, reduction = "pca", dims = 1:10)
tcell_combined1 <- FindClusters(tcell_combined1, resolution = 0.5)

Approach 2

tcell_list2 <- SplitObject(tcell, split.by = "sample")
tcell_list2 <- lapply(tcell_list2, SCTransform, vars.to.regress = "percent.mt")
features  <- SelectIntegrationFeatures(tcell_list2, nfeatures = 3000)
tcell_list2 <- PrepSCTIntegration(tcell_list2, anchor.features = features)
anchors <- FindIntegrationAnchors(tcell_list2, normalization.method = "SCT", anchor.features = features)
tcell_combined2 <- IntegrateData(anchorset = anchors, normalization.method = "SCT",
                                 k.weight = 46)
tcell_combined2 <- RunPCA(tcell_combined2)
tcell_combined2 <- RunUMAP(tcell_combined2, reduction = "pca", dims = 1:10)
tcell_combined2 <- FindNeighbors(tcell_combined2, dims = 1:10)
tcell_combined2 <- FindClusters(tcell_combined2, dims = 1:10)

Approach 3

tcell_combined3 <- tcell
tcell_combined3 <- NormalizeData(tcell_combined3)
tcell_combined3 <- FindVariableFeatures(tcell_combined3)
tcell_combined3 <- ScaleData(tcell_combined3, vars.to.regress = "percent.mt")
tcell_combined3 <- RunPCA(tcell_combined3, features = VariableFeatures(tcell_combined3))
tcell_combined3 <- RunUMAP(tcell_combined3, dims=1:10)
tcell_combined3 <- FindNeighbors(tcell_combined3, dims = 1:10)
tcell_combined3 <- FindClusters(tcell_combined3, dims = 1:10)

Approach 4

tcell_combined4 <- tcell
tcell_combined4 <- SCTransform(tcell_combined4, vars.to.regress = "percent.mt")
tcell_combined4 <- RunPCA(tcell_combined4)
tcell_combined4 <- RunUMAP(tcell_combined4, dims=1:10)
tcell_combined4 <- FindNeighbors(tcell_combined4, dims = 1:10)
tcell_combined4 <- FindClusters(tcell_combined4, dims = 1:10)

vertesy mentioned this issue Nov 26, 2019

How to do subclustering the best? vertesy/Seurat.multicore#4

Closed

atakanekiz mentioned this issue Jan 24, 2020

Supervised clustering on a subset of integrated object (best practices?) #2539

Closed

fisherj-2212 mentioned this issue Sep 7, 2020

Subsetting integrated data #3465

Closed

github-actions bot mentioned this issue Nov 30, 2020

为什么你的DoHeatmap热图画出来是“糊”的？ ixxmu/mp_duty#498

Closed

denvercal1234GitHub mentioned this issue Feb 22, 2021

How to merge clusters and what steps needed after merging in SCTransform workflow? #4128

Closed

yuhanH closed this as completed Mar 25, 2022

Gesmira mentioned this issue Jun 2, 2023

Discrepancy between UMAP and graph-based clustering after subset #7398

Closed

whitneyt1 mentioned this issue May 1, 2024

Remove empty image slots after SplitObject() #8844

Closed

Alexis-Varin mentioned this issue Jul 31, 2024

Having trouble with this code sce <- as.SingleCellExperiment(CR_SM) #9150

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to perform subclustering and DE analysis on a subset of an integrated object #1883

How to perform subclustering and DE analysis on a subset of an integrated object #1883

attal-kush commented Jul 24, 2019

MediciPrime commented Jul 24, 2019

graddi commented Sep 8, 2019

j-andrews7 commented Sep 19, 2019

MediciPrime commented Sep 19, 2019

atermanini commented Oct 21, 2019

amayer21 commented Oct 24, 2019 •

edited

Loading

tilofrei commented Oct 27, 2019 •

edited

Loading

JingleW commented Jan 13, 2020

JRPortman commented Feb 18, 2020

KyriakisDimitrios commented Mar 20, 2020

Excess-Code commented Mar 29, 2020

LuciaJim commented Apr 1, 2020

vwtlin commented Jun 22, 2020

Diozao666 commented Aug 3, 2020 •

edited

Loading

Zha0rong commented Aug 3, 2020

Diozao666 commented Aug 3, 2020

owenwilkins commented Aug 17, 2020

vertesy commented Sep 18, 2020 •

edited

Loading

JingleW commented Sep 24, 2020

JackieMium commented Jan 10, 2021

ajneville commented Jan 20, 2021

vwtlin commented Feb 1, 2021

riselinuh commented Feb 9, 2021

denvercal1234GitHub commented Mar 24, 2021

Dragonmasterx87 commented Jun 3, 2021

atakanekiz commented Dec 3, 2021 •

edited

Loading

atakanekiz commented Dec 4, 2021 •

edited

Loading

How to perform subclustering and DE analysis on a subset of an integrated object #1883

How to perform subclustering and DE analysis on a subset of an integrated object #1883

Comments

attal-kush commented Jul 24, 2019

MediciPrime commented Jul 24, 2019

graddi commented Sep 8, 2019

j-andrews7 commented Sep 19, 2019

MediciPrime commented Sep 19, 2019

atermanini commented Oct 21, 2019

amayer21 commented Oct 24, 2019 • edited Loading

tilofrei commented Oct 27, 2019 • edited Loading

JingleW commented Jan 13, 2020

JRPortman commented Feb 18, 2020

KyriakisDimitrios commented Mar 20, 2020

Excess-Code commented Mar 29, 2020

LuciaJim commented Apr 1, 2020

vwtlin commented Jun 22, 2020

Diozao666 commented Aug 3, 2020 • edited Loading

Zha0rong commented Aug 3, 2020

Diozao666 commented Aug 3, 2020

owenwilkins commented Aug 17, 2020

vertesy commented Sep 18, 2020 • edited Loading

Without FindVariableFeatures() it is basically a zoom-in on the original (before subset) UMAP.

After re-running FindVariableFeatures(): the neurons and corresponding progenitors (7,8) are linked

JingleW commented Sep 24, 2020

JackieMium commented Jan 10, 2021

ajneville commented Jan 20, 2021

vwtlin commented Feb 1, 2021

riselinuh commented Feb 9, 2021

denvercal1234GitHub commented Mar 24, 2021

Dragonmasterx87 commented Jun 3, 2021

atakanekiz commented Dec 3, 2021 • edited Loading

atakanekiz commented Dec 4, 2021 • edited Loading

Subsetting

Approach 1

Approach 2

Approach 3

Approach 4

amayer21 commented Oct 24, 2019 •

edited

Loading

tilofrei commented Oct 27, 2019 •

edited

Loading

Diozao666 commented Aug 3, 2020 •

edited

Loading

vertesy commented Sep 18, 2020 •

edited

Loading

Without `FindVariableFeatures()` it is basically a zoom-in on the original (before subset) UMAP.

After re-running `FindVariableFeatures()`: the neurons and corresponding progenitors (7,8) are linked

atakanekiz commented Dec 3, 2021 •

edited

Loading

atakanekiz commented Dec 4, 2021 •

edited

Loading