Error while running do_GroupwiseDEPlot() #3

Closed
aditisk opened this issue Oct 20, 2022 · 7 comments
Labels
bug Something isn't working

Comments

@aditisk

aditisk commented Oct 20, 2022

Hi @enblacar,

Thank you so much for developing this package; we are definitely going to use it in all our publications. We really love all the customization options and the detailed stepwise instructions.

I am running into an error when I am running the do_GroupwiseDEPlot() function (error pasted below):

SCpubr::do_GroupwiseDEPlot(sample = baseline,de_genes = de_genes)
Error: There are no positive values in the matrix.

I double checked the first few rows of the de_genes tibble and there are positive log FC values for the top DE genes. Could you please help me figure out what is triggering the error above ?

Thanks

@enblacar
Owner

Hi @aditisk,

Thanks for using the package! I am very happy to hear that you find it useful!

I am sorry to see that you are facing errors. Indeed, with the package release and many more people using it, I am discovering new bugs! hahahah

This one might be a bit tricky to debug... First of all:

  • Make sure that you are providing the correct assay and slot. By default, assay is set to SCT, but you may have used Seurat::NormalizeData() instead of Seurat::SCTransform() when normalizing. The same goes for the slot: make sure you use the same one from which the DE genes were computed. Also, provide the DE genes object exactly as Seurat returns it.

  • Second, make sure that you are using group.by appropriately. Make sure it is a factor/character metadata column and, whenever possible, refers at least to the same groups that the DE genes were computed for.

Having this checked:

This error arises because, when I generate the heatmaps, I expect (and enforce) only positive values in the heatmap bodies. This applies to the p-value heatmap (which is -log10 transformed, with p-values of 0 given a very large value) and to the log2FC heatmap, since I assume we are searching for marker genes, which should be the ones with a positive avg_log2FC. This can be set up in Seurat::FindAllMarkers(). The same is expected from the expression heatmaps if the slot is set to "data" (I expect values to be either 0 or positive).
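The positive-value expectation described above can be checked before calling the plotting function. The following is only a minimal sketch in base R: the data frame is a toy stand-in with made-up values that borrows the column names Seurat::FindAllMarkers() returns, and the choice of p_val_adj for the p-value column is an assumption for illustration, not a statement about what SCpubr uses internally.

```r
# Toy stand-in for the table returned by Seurat::FindAllMarkers().
# All values are invented for illustration only.
de_genes <- data.frame(
  gene       = c("GeneA", "GeneB", "GeneC", "GeneD"),
  cluster    = factor(c("0", "0", "1", "1")),
  avg_log2FC = c(1.8, -0.9, 2.3, 0.4),
  p_val_adj  = c(1e-50, 1e-10, 0, 0.01),
  stringsAsFactors = FALSE
)

# Count rows that would violate the "only positive values" expectation.
n_negative <- sum(de_genes$avg_log2FC <= 0)

# Keep only up-regulated markers (similar in spirit to only.pos = TRUE).
de_pos <- de_genes[de_genes$avg_log2FC > 0, , drop = FALSE]

# -log10 of a p-value of exactly 0 is Inf, which is why zeros
# need to be replaced by a large finite value before plotting.
neg_log10_p <- -log10(de_pos$p_val_adj)
```

If n_negative is greater than zero for your own table, recomputing the markers with only.pos = TRUE or filtering as above are two possible ways to satisfy the expectation.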

If the previous comments did not help and you are getting the same error, try installing this specific commit, and try this same function with the same parameters and see if it works. I have noticed a couple of bugs that should be fixed by the next release.

Please note that you should only use this development version for this specific case, and treat any function that does not come from a CRAN release as "prone to break", especially if it is not even a GitHub release but rather the @HEAD commit.

You can install the version like this:

devtools::install_github("enblacar/SCpubr", ref = "163f1bf2b16b528ffe66c182d81252c8d2a7ac44")

Let me know if this fixes the problem and if not, let's keep the debugging process going!

Enrique

@aditisk
Author

aditisk commented Oct 21, 2022

Hi @enblacar, thank you so much for your prompt and detailed response. The initial suggestions didn't work, so I installed the @HEAD commit and I am getting a different error now:

Error: 'SCT' is not an assay

I am not using SCTransform so that assay is missing in my Seurat object. The only assays I have are RNA and ADT so I tried using the RNA assay and 'data' slot. How would you suggest proceeding with objects that don't have SCT assays ? Thanks.

@enblacar
Owner

Hi @aditisk,

I see! So you are working with a CITE-Seq dataset. I developed SCpubr based on a pure 10X dataset without other data modalities, so that might explain why we are facing some bugs here.

However, I managed to get it to work for Seurat's CITE-Seq dataset as follows:

# Data and tutorial from: https://satijalab.org/seurat/articles/multimodal_vignette.html

# Install the dataset:
SeuratData::InstallData("cbmc")

# Load the dataset.
cbmc <- cbmc.SeuratData::cbmc

# Normalize the dataset.
cbmc <- Seurat::NormalizeData(cbmc)

# Find variable features.
cbmc <- Seurat::FindVariableFeatures(cbmc)

# Scale the data.
cbmc <- Seurat::ScaleData(cbmc)

# Run dimensional reduction: PCA.
cbmc <- Seurat::RunPCA(cbmc, verbose = FALSE)

# Compute clustering.
cbmc <- Seurat::FindNeighbors(cbmc, 
                              dims = 1:30)

cbmc <- Seurat::FindClusters(cbmc, 
                             resolution = 0.8, 
                             verbose = FALSE)

# Run dimensional reduction: UMAP.
cbmc <- Seurat::RunUMAP(cbmc, 
                        dims = 1:30)

# Set the current idents to the seurat clusters from clustering.
Seurat::Idents(cbmc) <- cbmc$seurat_clusters

# Set the current assay to the transcriptomics data.
Seurat::DefaultAssay(cbmc) <- "RNA"

# Find DE genes. Use only positive avg_log2FC values (we want to define markers).
de_genes <- Seurat::FindAllMarkers(cbmc, 
                                   assay = "RNA", 
                                   only.pos = TRUE)

# Compute the plot.
p <- SCpubr::do_GroupwiseDEPlot(sample = cbmc, 
                                de_genes = de_genes, 
                                assay = "RNA", 
                                slot = "data", 
                                cell_size = 3,
                                group.by = c("seurat_clusters", "orig.ident"))

With this code, I managed to get it to run and got the following output:
[Image: output plot, alt text "test"]

Could you give it a try and tell me if it works for you? Please make sure that:

  • You are setting your default assay to RNA.
  • You are providing RNA to the assay parameter.
  • You are providing data to the slot parameter, unless you stated otherwise to Seurat::FindAllMarkers().

If it works with the @HEAD commit you recently downloaded, could you also try it back in the official release and let me know as well?

install.packages("SCpubr")

If not, we will take a deep look at your dataset!

Hope this helps!
Enrique

@enblacar added the bug label Oct 21, 2022

@aditisk
Author

aditisk commented Oct 21, 2022

Hi @enblacar, thanks for the suggestion. When I opened this ticket, I was using a categorical variable but to match what you were able to reproduce using the cbmc dataset, I tried using 'seurat_clusters' this time. I was able to make the plots using that variable. I specified the assay as well as group.by explicitly at each step so it worked this time (using the CRAN release).

However, when I switched back to a categorical variable again, I am getting a new error:
Error in `.rowNamesDF<-`(x, value = value) : missing values in 'row.names' are not allowed

So it seems like it works with 'seurat_clusters' but other categorical variables are still giving some error. Any suggestions on how to proceed with categorical variables ? Thanks.

@enblacar
Owner

Hi @aditisk,

Happy to see that it is working under normal conditions!

I will need more info on this variable:

  • Is it a categorical variable stored in the object metadata (sample@meta.data)?
  • When you run class(sample@meta.data[, "<your variable>"]), is the result either character or factor?
  • Also, since the categorical variable is going to end up in the rows of the heatmap, NA values should not be included (I need to test this further, but it seems to be what is causing your problem).

I would say you should re-check these points, and make sure you do not include NAs. Let me know if it works! If the NAs were the problem, I will make sure to fix it by the next update.
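The three checks listed above can be sketched in base R as follows. This is an illustrative example only: the data frame is a toy stand-in for sample@meta.data, and "condition" is a hypothetical column name, not one taken from the dataset discussed here.

```r
# Toy stand-in for sample@meta.data; "condition" is a hypothetical column.
meta <- data.frame(
  seurat_clusters = factor(c("0", "1", "0", "1", "2")),
  condition       = c("ctrl", "treated", NA, "ctrl", NA),
  row.names       = paste0("cell_", 1:5),
  stringsAsFactors = FALSE
)

group_var <- "condition"

# 1. Is the variable stored as a metadata column?
has_column <- group_var %in% colnames(meta)

# 2. Is it a character or factor column?
is_categorical <- is.character(meta[[group_var]]) || is.factor(meta[[group_var]])

# 3. How many cells have a missing value for it?
n_missing <- sum(is.na(meta[[group_var]]))

# Tally the groups, showing NA as its own category when present.
table(meta[[group_var]], useNA = "ifany")
```

On a real object, replacing meta with sample@meta.data and group_var with the column passed to group.by should give the same three answers.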

Best,
Enrique

@aditisk
Author

aditisk commented Oct 24, 2022

Hi @enblacar, you were correct about the NAs. My categorical variable was missing a value for 1 sample and had NAs in around 4k cells which was causing the problem. I removed these cells before running the function and now it works.
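For readers facing the same issue, removing the affected cells could look roughly like the sketch below. It is an assumption about the workflow, not the exact code used in this thread; the data frame is a toy stand-in for sample@meta.data and "condition" is a hypothetical column name.

```r
# Toy stand-in for sample@meta.data; "condition" is a hypothetical column.
meta <- data.frame(
  condition = c("ctrl", "treated", NA, "ctrl", NA),
  row.names = paste0("cell_", 1:5),
  stringsAsFactors = FALSE
)

# Names of the cells that have a value for the grouping variable.
cells_keep <- rownames(meta)[!is.na(meta$condition)]

# Metadata restricted to those cells.
meta_clean <- meta[cells_keep, , drop = FALSE]

# With a real Seurat object, one option (assumed, not confirmed in this
# thread) would be to subset by cell name before plotting:
#   sample <- subset(sample, cells = cells_keep)
```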

Thanks for your help in troubleshooting this.

@aditisk closed this as completed Oct 24, 2022

@enblacar
Owner

Hi @aditisk,

Happy it worked!

Best,
Enrique
