fill_by for violin plots #175

Open
alanocallaghan opened this issue Nov 3, 2022 · 7 comments

@alanocallaghan
Owner

See discussion in #174

How should this handle positions? E.g. the last few examples below all look sub-optimal in different ways.

library("scater")
example_sce <- mockSCE()
example_sce <- logNormCounts(example_sce)
colData(example_sce) <- cbind(colData(example_sce), perCellQCMetrics(example_sce))
plotColData(example_sce, y = "detected", x = "Cell_Cycle")

plotColData(example_sce, y = "detected", x = "Cell_Cycle", colour_by = "Cell_Cycle")

plotColData(example_sce, y = "detected", x = "Cell_Cycle", colour_by = "Cell_Cycle", fill_by = "Cell_Cycle")

plotColData(example_sce, y = "detected", x = "Cell_Cycle", point_fun = function(...) list(), fill_by="Mutation_Status")

plotColData(example_sce, y = "detected", x = "Cell_Cycle", fill_by="Mutation_Status")

plotColData(example_sce, y = "detected", x = "Cell_Cycle", fill_by="Mutation_Status", colour_by = "Mutation_Status")

plotColData(example_sce, y = "detected", x = "Cell_Cycle", fill_by="Cell_Cycle", colour_by = "Mutation_Status")

plotColData(example_sce, y = "detected", x = "Cell_Cycle", colour_by="Cell_Cycle", fill_by = "Mutation_Status")

@alanocallaghan
Owner Author

See branch fill-by

@kikegoni

kikegoni commented Nov 4, 2022

That's perfect!! Thanks a lot for the nice examples!

@kikegoni

kikegoni commented Nov 4, 2022

Just as a suggestion (I can adapt it from your code), it would be great to add an option to modify the alpha = 0.2 parameter of the fill_by violins here:

plot_out <- plot_out + do.call(geom_violin, c(viol_args, list(colour = "gray60", alpha = 0.2, scale = "width", width = 0.8)))
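A minimal sketch of how the hard-coded alpha could be exposed, using plain ggplot2 rather than scater's internals. The helper `violin_layer` and its `violin_alpha` argument are hypothetical names chosen for illustration; only the `do.call(geom_violin, ...)` pattern comes from the line above, and `mtcars` stands in for the real plotting data.

```r
library(ggplot2)

# Hypothetical helper (not part of scater): builds the violin layer with a
# user-supplied alpha instead of the fixed 0.2.
violin_layer <- function(viol_args = list(), violin_alpha = 0.2) {
    do.call(
        geom_violin,
        c(viol_args, list(colour = "gray60", alpha = violin_alpha,
                          scale = "width", width = 0.8))
    )
}

# Stand-in data: mtcars replaces the per-cell data frame used by plotColData.
p <- ggplot(mtcars, aes(factor(cyl), mpg, fill = factor(am))) +
    violin_layer(violin_alpha = 0.6)
```

Keeping 0.2 as the default would leave existing plots unchanged while letting callers override the transparency.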

@alanocallaghan
Owner Author

I don't like the behaviour shown here except when the fill_by, x, and colour_by arguments all match. However, dodging the jittered points means setting their dodge and jitter widths to match the violins, and choosing how to group the points (probably by the same variable as fill_by). I guess that would also mean exposing a group_by argument along with dodge_width, jitter_width...
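For illustration, here is roughly what matching the dodge of points and violins looks like in plain ggplot2. This is only a sketch under assumptions: the simulated data frame stands in for the colData, and `dodge_width` and `jitter_width` are ordinary variables here, not existing scater arguments.

```r
library(ggplot2)

# Simulated stand-in for the per-cell data (column names are illustrative).
set.seed(1)
df <- data.frame(
    Cell_Cycle = sample(c("G1", "S", "G2M"), 200, replace = TRUE),
    Mutation_Status = sample(c("positive", "negative"), 200, replace = TRUE),
    detected = rnorm(200, mean = 1500, sd = 100)
)

dodge_width <- 0.8   # shared by violins and points so they line up
jitter_width <- 0.2  # horizontal spread of points within each dodged group

p <- ggplot(df, aes(Cell_Cycle, detected, fill = Mutation_Status)) +
    geom_violin(
        colour = "gray60", alpha = 0.2, scale = "width", width = 0.8,
        position = position_dodge(width = dodge_width)
    ) +
    geom_point(
        aes(colour = Mutation_Status),
        position = position_jitterdodge(
            jitter.width = jitter_width,
            dodge.width = dodge_width
        )
    )
```

Because both layers are grouped by the same variable and share one dodge width, each cloud of points sits over its own violin; grouping the points by a different variable than the fill would break that alignment.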

@shangguandong1996

Hi developers,

It seems that fill_by throws an error when it differs from colour_by:

library("scater")
example_sce <- mockSCE()
example_sce <- logNormCounts(example_sce)
colData(example_sce) <- cbind(colData(example_sce), perCellQCMetrics(example_sce))
> plotColData(example_sce, y = "detected", x = "Cell_Cycle", colour_by = "Cell_Cycle", fill_by = "Mutation_Status")
Error:
! Problem while computing aesthetics.
i Error occurred in the 1st layer.
Caused by error in `.data[["Mutation_Status"]]`:
! Column `Mutation_Status` not found in `.data`.
Run `rlang::last_trace()` to see where the error occurred.
> rlang::last_trace()
<error/rlang_error>
Error:
! Problem while computing aesthetics.
i Error occurred in the 1st layer.
Caused by error in `.data[["Mutation_Status"]]`:
! Column `Mutation_Status` not found in `.data`.
---
Backtrace:
     x
  1. +-base (local) `<fn>`(x)
  2. +-ggplot2:::print.ggplot(x)
  3. | +-ggplot2::ggplot_build(x)
  4. | \-ggplot2:::ggplot_build.ggplot(x)
  5. |   \-ggplot2:::by_layer(...)
  6. |     +-rlang::try_fetch(...)
  7. |     | +-base::tryCatch(...)
  8. |     | | \-base (local) tryCatchList(expr, classes, parentenv, handlers)
  9. |     | |   \-base (local) tryCatchOne(expr, names, parentenv, handlers[[1L]])
 10. |     | |     \-base (local) doTryCatch(return(expr), name, parentenv, handler)
 11. |     | \-base::withCallingHandlers(...)
 12. |     \-ggplot2 (local) f(l = layers[[i]], d = data[[i]])
 13. |       \-l$compute_aesthetics(d, plot)
 14. |         \-ggplot2 (local) compute_aesthetics(..., self = self)
 15. |           \-ggplot2:::scales_add_defaults(...)
 16. |             \-base::lapply(aesthetics[new_aesthetics], eval_tidy, data = data)
 17. |               \-rlang (local) FUN(X[[i]], ...)
 18. +-Mutation_Status
 19. +-rlang:::`[[.rlang_data_pronoun`(.data, "Mutation_Status")
 20. | \-rlang:::data_pronoun_get(...)
 21. \-rlang:::abort_data_pronoun(x, call = y)
> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936 
[2] LC_CTYPE=Chinese (Simplified)_China.936   
[3] LC_MONETARY=Chinese (Simplified)_China.936
[4] LC_NUMERIC=C                              
[5] LC_TIME=Chinese (Simplified)_China.936    

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] scater_1.27.9               ggplot2_3.4.2              
 [3] scuttle_1.4.0               SingleCellExperiment_1.16.0
 [5] SummarizedExperiment_1.24.0 Biobase_2.54.0             
 [7] GenomicRanges_1.46.1        GenomeInfoDb_1.30.1        
 [9] IRanges_2.28.0              S4Vectors_0.32.4           
[11] BiocGenerics_0.40.0         MatrixGenerics_1.6.0       
[13] matrixStats_0.63.0          devtools_2.4.5             
[15] usethis_2.1.6              

loaded via a namespace (and not attached):
 [1] bitops_1.0-7              fs_1.5.2                  tools_4.1.0              
 [4] profvis_0.3.7             utf8_1.2.3                R6_2.5.1                 
 [7] irlba_2.3.5.1             vipor_0.4.5               DBI_1.1.3                
[10] colorspace_2.1-0          urlchecker_1.0.1          withr_2.5.0              
[13] gridExtra_2.3             tidyselect_1.1.2          prettyunits_1.1.1        
[16] processx_3.7.0            compiler_4.1.0            cli_3.4.1                
[19] BiocNeighbors_1.12.0      DelayedArray_0.20.0       scales_1.2.1             
[22] callr_3.7.2               stringr_1.4.1             digest_0.6.29            
[25] XVector_0.34.0            pkgconfig_2.0.3           htmltools_0.5.3          
[28] sessioninfo_1.2.2         sparseMatrixStats_1.6.0   fastmap_1.1.0            
[31] htmlwidgets_1.5.4         rlang_1.1.1               rstudioapi_0.13          
[34] shiny_1.7.2               DelayedMatrixStats_1.16.0 generics_0.1.3           
[37] BiocParallel_1.28.3       dplyr_1.0.9               RCurl_1.98-1.12          
[40] magrittr_2.0.3            BiocSingular_1.10.0       GenomeInfoDbData_1.2.7   
[43] Matrix_1.3-4              Rcpp_1.0.10               ggbeeswarm_0.7.2         
[46] munsell_0.5.0             fansi_1.0.4               viridis_0.6.3            
[49] lifecycle_1.0.3           stringi_1.7.8             zlibbioc_1.40.0          
[52] pkgbuild_1.4.0            grid_4.1.0                parallel_4.1.0           
[55] promises_1.2.0.1          ggrepel_0.9.3             crayon_1.5.1             
[58] miniUI_0.1.1.1            lattice_0.20-45           cowplot_1.1.1            
[61] beachmat_2.10.0           ps_1.6.0                  pillar_1.9.0             
[64] ScaledMatrix_1.2.0        pkgload_1.3.0             glue_1.6.2               
[67] remotes_2.4.2             vctrs_0.6.2               httpuv_1.6.5             
[70] gtable_0.3.3              purrr_0.3.4               assertthat_0.2.1         
[73] cachem_1.0.5              rsvd_1.0.5                mime_0.12                
[76] xtable_1.8-4              later_1.3.0               viridisLite_0.4.2        
[79] tibble_3.2.1              beeswarm_0.4.0            memoise_2.0.1            
[82] ellipsis_0.3.2 

@Yunuuuu
Contributor

Yunuuuu commented Dec 19, 2023

It would be nice if plotExpression also supported the fill_by argument.

@Yunuuuu
Contributor

Yunuuuu commented Dec 19, 2023

I attempted to implement it, but incorporating this functionality into plotExpression would complicate the function significantly, because user inputs are hard to predict, especially when the group aesthetic is used for the violin plot. I therefore ended up using makePerCellDF instead. However, I am unsure whether it is worth adding a function that returns the data in long format for plotting.

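# Assumes `sce_object` is a SingleCellExperiment whose colData has `label` and
# `celltypes` columns, `markers` is a character vector of feature names, and
# `n_col` is the number of facet columns.
library(ggplot2)

# One row per cell: colData columns plus one column per marker.
data <- scuttle::makePerCellDF(sce_object, features = markers)
# Reshape to long format: one row per cell/feature pair.
data <- tidyr::pivot_longer(data,
        cols = tidyr::all_of(markers),
        names_to = "Feature",
        values_to = "logcounts"
)
# Violins per label, filled by cell type, one facet per feature.
violin_plot <- ggplot(data, aes(factor(label), logcounts)) +
        geom_violin(aes(fill = celltypes), scale = "width", width = 0.8) +
        scale_fill_brewer(type = "qual", palette = "Set3") +
        guides(fill = guide_legend(
            title = "Cell type", override.aes = list(size = 2L), ncol = 1L
        )) +
        labs(x = NULL) +
        facet_wrap(vars(Feature),
            ncol = n_col, scales = "free_x"
        ) +
        cowplot::theme_cowplot(font_size = 10L) +
        theme(axis.text.x = element_text(size = 6L))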
