Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plotting multiple genes in the same FeaturePlot #3521

Closed
Sergio-ote opened this issue Sep 18, 2020 · 15 comments
Closed

Plotting multiple genes in the same FeaturePlot #3521

Sergio-ote opened this issue Sep 18, 2020 · 15 comments

Comments

@Sergio-ote
Copy link

Hi everyone!
I am using some small gene signatures (3-5 genes) to identify populations in my dataset using FeaturePlot(). I would like to get a plot like this one https://www.nature.com/articles/s42003-020-0922-4/figures/1 (Figure 1c), but so far I have not managed to sort it out.

I have already checked the Seurat visualization vignette, the option for 2 genes mentioned in #1343 (not suitable for more than 2 genes) and the average mean expression mentioned in #528. This last option would be fine, but I get a lot of noise in clusters that are unimportant for my signature because i.e. I get 3 very specific markers that are expressed only in my cluster of interest, but the 4th one is expressed in high levels in this specific cluster but in low levels in the rest of the clusters. When I plot I obtain a messy FeaturePlot where all the clusters look similar.

Is there a way to obtain this kind of plot using Seurat, or do I need another package? Thank you in advance!

@samuel-marsh
Copy link
Collaborator

Hi,

Not member of dev team but hopefully can be helpful. I would try and see if using module scores would accomplish what you are trying to do. AddModuleScore can be used to see if expression of given gene set is enriched vs set of randomly selected (but based on expression bins) control genes. This might help to clean up the plot as it sounds like the enrichment of the whole gene set would likely be cell type specific whereas one particular gene might also be expressed in other cell types.

Best,
Sam

A basic workflow would be something like:

cell_typeA_marker_gene_list <- list(c("Gene1", "Gene2", "Gene3", "Gene4"))
object <- AddModuleScore(object = object, features = cell_typeA_marker_gene_list, name = "cell_typeA_score")
FeaturePlot(object = object, features = "cell_typeA_score1")

Note that gene list for AddModuleScore must be supplied in list class or coerced as part of function to list otherwise score will be created for each gene and not the gene set.
Also note that you must add "1" to the end of whatever name you provided in AddModuleScore as this is automatically appended to the name provided in the new metadata column which stores the score.

@Sergio-ote
Copy link
Author

Sergio-ote commented Sep 18, 2020

Thank you very much @samuel-marsh ! I think you solved my problem. Could this approach be used for larger gene signatures? What are the limitations that I should be aware of?

@samuel-marsh
Copy link
Collaborator

HI,

You definitely can use for larger gene sets but I think getting too big could be an issue for couple reasons. For one, given the sparsity of single cell data if your gene set is too large then it may not appear enriched via module score when biologically it may be. The inverse is of course also true if your gene set is too small, then the number of cells with enriched scores may be too high. If you want to know more about the method I suggest reading the original paper and methods that the function is derived from (Tirosh et al., 2016; see manual entry for more details).

Best,
Sam

@Yale73
Copy link

Yale73 commented Aug 12, 2021

@samuel-marsh Hi,

I used exactly the same code as yours but found no matter how many genes I used, I got the same plot. Do you run the same issue or not? If not, can you help me with it?

This one used 50 genes
50 genes

This one is 3 genes
3-gens

I also tried other lists and ran the same issues.
Rplot

Thanks,
Yale

@samuel-marsh
Copy link
Collaborator

Hi Yale,

Can you post the exact code that you ran for ID1 with either 50 or 3 genes?

Thanks,
Sam

@ksaunders73
Copy link

ksaunders73 commented Aug 12, 2021

Heyo @Yale73! I am a noobie so take my word with a grain of salt heh, but maybe try setting min.cutoff and max.cutoff the same between your lists and see if that helps? Like min.cutoff = 0 and max.cutoff = 0.3? I dunno how you got such pretty plots with yours though and it seems like they're the same scale so maybe not so helpful

@Yale73
Copy link

Yale73 commented Aug 12, 2021

@samuel-marsh
The code for 3 genes

IL17_signature_gene_list <- list(c("CXCL13", "IL17F", "CCR6"))
object <- AddModuleScore(object = our, features = IL17_signature_gene_list, name = "ID")
FeaturePlot(object = our, features = "ID1", order=T)+
  scale_color_viridis(discrete = FALSE, option="turbo")

50 genes

IL17_signature_gene_list <-list(c("CXCL13","MTRNR2L12", "CD7", "MGAT4A", "FTH1", "LAYN", 
                                   "IL17F", "KLRB1", "GNLY", "CPM", "CTSH", "GBP5", "CLEC2B",
                                   "SOX4","GZMB","CD2","CEBPD", "ODF2L","LAG3", "LRRN3",
                                   "ARHGEF12","PTPN13","TNFAIP3","TRPS1","SNX9","METRNL","BTG1", 
                                   "JUN","TMIGD2", "SPOCK2","GABARAPL1","PMEPA1","HIST1H1E","RBPJ",
                                   "LINC01871","MAP3K4", "H1FX", "UBC","GALNT1","PNRC1","MUC20-OT1",
                                   "RPS26","GABPB1-AS1","CHN1","NAP1L4","PTMS", "F2R","DAPK2",
                                   "CTLA4","CCR6"))

object <- AddModuleScore(object = our, features = IL17_signature_gene_list, name = "IL17_signature_gene_list")
FeaturePlot(object = our, features = "IL17_signature_gene_list1", order=T)+
  scale_color_viridis(discrete = FALSE, option="turbo")

Today, when I reran the 50 one, I got an error:

Error: None of the requested features were found: IL17_signature_gene_list1 in slot data
In addition: Warning message:
In FetchData(object = object, vars = c(dims, "ident", features),  :
  The following requested variables were not found: IL17_signature_gene_list1

Thanks,
Yale

@Sergio-ote
Copy link
Author

Hi @Yale73,
you are saving the results of AddModuleScore() in the object "object" but then using the object "our" as an input for FeaturePlot(). Try with FeaturePlot(object = object, features = "IL17_signature_gene_list1"). Hope that solves your issue.

Best,
Sergio

@Yale73
Copy link

Yale73 commented Aug 16, 2021

@Sergio-ote Hi,
You are totally right, I didn't notice I used object to AddModuleScore().
Now it works. Thanks for your help.

All the best,
Yale

@jimena2
Copy link

jimena2 commented Aug 17, 2022

Has anyone encountered this error when trying this?

I ran this

cell_typeA_marker_gene_list <- list(c("Gene1", "Gene2", "Gene3", "Gene4"))
object <- AddModuleScore(object = object, features = cell_typeA_marker_gene_list, name = "cell_typeA_score")
FeaturePlot(object = object, features = "cell_typeA_score1")

and got:

Error in sample.int(length(x), size, replace, prob) : 
  cannot take a sample larger than the population when 'replace = FALSE'

after trying to define object

@haseebz
Copy link

haseebz commented Mar 1, 2023

Did you try changing the DefaultAssay(object) <- "RNA"

@alexvnesta
Copy link

alexvnesta commented Apr 5, 2023

Same issue as jimena2 here:

DefaultAssay(seurat_object) <- "RNA"
cell_typeA_marker_gene_list <- list("KRT18", "KRT19", "KRT8", "KRT7")
seurat_object <- AddModuleScore(object = seurat_object, features = cell_typeA_marker_gene_list, name = "cell_typeA_marker_gene_list")
Error in sample.int(length(x), size, replace, prob) :
cannot take a sample larger than the population when 'replace = FALSE'

SessionInfo says Seurat_4.3.0

@jimena2
Copy link

jimena2 commented Aug 22, 2023

@haseebz this worked for me, thank you!

@paolabc
Copy link

paolabc commented Nov 28, 2023

Hi There

Thank you for this solution.
I am also considering if there is a way to do it for different set of genes such as vein gene sets (stab2/dab2) and arterial gene sets(dll4/cxcr4a).

I would like to have different color for represent these gene sets. What I tried so far

vein_marker_gene_list <- list(c("stab2", "dab2")) 
arterial_marker_gene_list <- list(c("stab2", "dab2"))

object <- AddModuleScore(object = object, features = vein_marker_gene_list, name = "vein_score")
object <- AddModuleScore(object = object, features = arterial_marker_gene_list , name = "arterial_score")
 
FeaturePlot(object = object, features = c("vein_cell_score1",'arterial_cell_score1'), order=T,reduction = 'umap',blend=T)+
  scale_color_viridis(discrete = FALSE, option="D")

This FeaturePlot generates 3 plots, and I am interested in the third one, which contains the two gene sets. However,it does not clear the expression difference between the gene sets plotting. I also tried to adjust the blend.threshold.

Also, I noticied that when I split my FeaturePlot based on orig.ident, the control and treatment have the same expression pattern. Should I separate the data from them before performing AddModuleScore?

I apppreacite so much any suggestion.

Thank you!

@mtekman
Copy link

mtekman commented Apr 9, 2024

I was still getting the error about population size, but nothing in the thread worked. What did work was playing with the ctrl parameter.

sm <- AddModuleScore(sm, features = list(in_genes), name = "EMT_sig", ctrl = 50)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants