Too Many Points Overlapping Leads to Texts Still Are Overlapping #163

Closed
Winnie09 opened this issue May 24, 2020 · 7 comments

@Winnie09

Hi, my plot has too many points that need labels, so many of the labels still overlap each other. Is there a way to spread the labels out so that they are readable? The x and y axes have to start from 0. Here is my code:

ggplot(pd,aes(x=MAST,y=Wilcoxon,label=mtd,color=mtd)) + geom_point() + geom_text_repel() + theme_bw() + theme(legend.position = 'none')+
  scale_color_manual(values=v)+
  xlim(c(xmin,max(pd$MAST)+x_add)) + ylim(c(ymin,max(pd$Wilcoxon)+0.01))

Here is the plot:
[Image: plot1]

Adjusting the nudge and segment arguments does not seem to work with the values I chose:

ggplot(pd,aes(x=MAST,y=Wilcoxon,label=mtd,color=mtd)) + geom_point() + geom_text_repel(nudge_x = .15,
    box.padding = 0.5,
    nudge_y = 1,
    segment.curvature = -0.1,
    segment.ncp = 3,
    segment.angle = 30) + theme_bw() + theme(legend.position = 'none')+
  scale_color_manual(values=v)
dev.off()

Here is the plot:
[Image: plot2]

Data

Here is the output of the plotdata:

> pd
           MAST Wilcoxon           mtd
alra        0.8      0.5          ALRA
autoimpute  0.0      0.0    AutoImpute
baynorm     0.4      0.0       bayNorm
dca         0.0      0.0           DCA
deepimpute  7.4      0.0    DeepImpute
drimpute    0.2      0.0      DrImpute
knnsmooth   0.0      0.0 kNN-smoothing
magic       0.0      0.0         MAGIC
mcimpute    0.1      0.0      mcImpute
pblr        1.4      0.0          PBLR
raw         0.0      0.5        no_imp
saver       0.0      0.0         SAVER
saverx      0.0      0.0        SAVERX
scimpute    0.6      0.0      scImpute
screcover   0.0      0.1     scRecover
scscope     0.1      1.1       scScope
scVI        0.0      0.0          scVI
viper       0.3      0.0         VIPER

Thank you!

@slowkow
Owner

slowkow commented May 25, 2020

Thanks for opening the new issue.

You might consider changing the limits on the x and y axes to give more space for the text labels.

Sometimes it is difficult to label all points, so you might also consider a different way to visualize the data.
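As a rough sketch of the first suggestion, the upper end of each axis can be extended while the lower end stays at 0. The data frame below is copied from the table in the original post. The 20% expansion is an arbitrary assumption to tune, and `expansion()` requires ggplot2 3.3.0 or later.

```r
library(ggplot2)
library(ggrepel)

# Data copied from the table in the original post.
pd <- data.frame(
  MAST = c(0.8, 0, 0.4, 0, 7.4, 0.2, 0, 0, 0.1, 1.4, 0, 0, 0, 0.6, 0, 0.1, 0, 0.3),
  Wilcoxon = c(0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0.1, 1.1, 0, 0),
  mtd = c("ALRA", "AutoImpute", "bayNorm", "DCA", "DeepImpute", "DrImpute",
          "kNN-smoothing", "MAGIC", "mcImpute", "PBLR", "no_imp", "SAVER",
          "SAVERX", "scImpute", "scRecover", "scScope", "scVI", "VIPER"),
  stringsAsFactors = FALSE
)

p <- ggplot(pd, aes(x = MAST, y = Wilcoxon, label = mtd)) +
  geom_point() +
  geom_text_repel() +
  # limits = c(0, NA): lower limit fixed at 0, upper limit taken from the data.
  # expansion(mult = c(0, 0.2)): no padding below 0, 20% padding above the maximum
  # (0.2 is an assumed value, not one from this thread).
  scale_x_continuous(limits = c(0, NA), expand = expansion(mult = c(0, 0.2))) +
  scale_y_continuous(limits = c(0, NA), expand = expansion(mult = c(0, 0.2))) +
  theme_bw()

print(p)
```

This only adds room on the positive side, so it may not be enough on its own when several points sit at exactly the same coordinates, as the points at (0, 0) do here.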

@Winnie09
Author

Thanks for replying! The x and y axes have to start at 0 because the values are non-negative, so extending the axes into the negative range does not make sense, and extending them further on the positive side did not help. Do you have any other suggestions?

@stephaniehicks

Hi @slowkow -- just following up to see if you had any suggestions on how to handle @Winnie09's problem. We have tried to extend the limits on the x and y axes to make more room for the text labels, but the reviewers of a paper that we are writing were not OK with this as the axes only make sense when they start at 0. I was curious if there was some way of spreading out the text given that we needed to work within some limits (e.g. positive ranges)? Thanks in advance for any guidance you are willing to provide!

@lmweber

lmweber commented May 29, 2020

Sorry, I just responded on Twitter without looking at this issue first!

I am pretty sure the force argument can do something like this, by pushing the labels away from each other until they are no longer overlapping.
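A minimal sketch of what that could look like, using the data from the original post. The specific values (`force = 5`, `max.iter = 1e5`, `box.padding = 0.5`) are untested guesses to tune, not recommendations from this thread, and `max.overlaps` assumes ggrepel 0.9.0 or later.

```r
library(ggplot2)
library(ggrepel)

# Data copied from the table in the original post.
pd <- data.frame(
  MAST = c(0.8, 0, 0.4, 0, 7.4, 0.2, 0, 0, 0.1, 1.4, 0, 0, 0, 0.6, 0, 0.1, 0, 0.3),
  Wilcoxon = c(0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0.1, 1.1, 0, 0),
  mtd = c("ALRA", "AutoImpute", "bayNorm", "DCA", "DeepImpute", "DrImpute",
          "kNN-smoothing", "MAGIC", "mcImpute", "PBLR", "no_imp", "SAVER",
          "SAVERX", "scImpute", "scRecover", "scScope", "scVI", "VIPER"),
  stringsAsFactors = FALSE
)

p <- ggplot(pd, aes(x = MAST, y = Wilcoxon, label = mtd, color = mtd)) +
  geom_point() +
  geom_text_repel(
    force = 5,               # stronger repulsion between labels (default is 1)
    max.iter = 1e5,          # allow more iterations for the layout to settle
    box.padding = 0.5,       # extra padding around each label
    min.segment.length = 0,  # always draw a segment from label to point
    max.overlaps = Inf,      # assumes ggrepel >= 0.9.0: never drop a label
    xlim = c(0, NA),         # keep labels at or above 0 on both axes
    ylim = c(0, NA),
    seed = 42                # reproducible label placement
  ) +
  scale_x_continuous(limits = c(0, NA)) +
  scale_y_continuous(limits = c(0, NA)) +
  theme_bw() +
  theme(legend.position = "none")

print(p)
```

Because label placement is iterative and randomized, fixing `seed` makes it easier to compare the effect of different `force` values between runs.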

@slowkow
Owner

slowkow commented May 29, 2020

If you want to try different options for ggrepel, you might find the options described and showcased in the documentation helpful. I'd be very happy to review a pull request that enhances the documentation.

However, please consider that ggrepel is not always the right tool for the job. There are cases where using ggrepel is a poor choice, and other tools can generate better figures that communicate results more clearly.

Below is one possibility that you might consider.

In my opinion, this visualization enjoys the following benefits that the scatter plot does not:

  • I can easily read all of the names of all of the methods because they are aligned along the y-axis.
  • There are no distracting line segments between names and data points that are necessarily included in a ggrepel figure.
  • I can easily see which items are different for "MAST" and "Wilcoxon".
  • I can easily see which items have zero for "MAST" and zero for "Wilcoxon".
  • Additional items can easily be added as additional rows or columns, if desired.
library(tidyverse)
#> Warning: package 'tibble' was built under R version 3.6.2

x <- "blah MAST Wilcoxon           mtd
alra        0.8      0.5          ALRA
autoimpute  0.0      0.0    AutoImpute
baynorm     0.4      0.0       bayNorm
dca         0.0      0.0           DCA
deepimpute  7.4      0.0    DeepImpute
drimpute    0.2      0.0      DrImpute
knnsmooth   0.0      0.0 kNN-smoothing
magic       0.0      0.0         MAGIC
mcimpute    0.1      0.0      mcImpute
pblr        1.4      0.0          PBLR
raw         0.0      0.5        no_imp
saver       0.0      0.0         SAVER
saverx      0.0      0.0        SAVERX
scimpute    0.6      0.0      scImpute
screcover   0.0      0.1     scRecover
scscope     0.1      1.1       scScope
scVI        0.0      0.0          scVI
viper       0.3      0.0         VIPER"

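# Parse the whitespace-aligned table: collapse runs of spaces into tabs,
# then drop the row-name column ("blah").
d <- read_tsv(str_replace_all(x, " +", "\t")) %>% select(MAST, Wilcoxon, mtd)
# Order the methods by the absolute difference between MAST and Wilcoxon.
mtd_levels <- d$mtd[order(abs(d$MAST - d$Wilcoxon))]
# Reshape to long format: one row per (method, test) pair, in columns "name" and "value".
d <- pivot_longer(d, cols = c("MAST", "Wilcoxon"))
d$mtd <- factor(d$mtd, mtd_levels)

# One bar per method and one facet per test; coord_flip() puts the method names on the y-axis.
ggplot(d) +
  aes(mtd, value) +
  geom_col() +
  coord_flip() +
  facet_grid(~ name, scales = "free")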

Created on 2020-05-29 by the reprex package (v0.3.0)

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.1 (2019-07-05)
#>  os       macOS Catalina 10.15.5      
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       America/New_York            
#>  date     2020-05-29                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                        
#>  assertthat    0.2.1      2019-03-21 [1] CRAN (R 3.6.0)                
#>  backports     1.1.6      2020-04-05 [1] CRAN (R 3.6.1)                
#>  broom         0.5.5      2020-02-29 [1] CRAN (R 3.6.0)                
#>  callr         3.4.3      2020-03-28 [1] CRAN (R 3.6.2)                
#>  cellranger    1.1.0      2016-07-27 [1] CRAN (R 3.6.0)                
#>  cli           2.0.2      2020-02-28 [1] CRAN (R 3.6.0)                
#>  colorspace    1.4-1      2019-03-18 [1] CRAN (R 3.6.0)                
#>  crayon        1.3.4      2017-09-16 [1] CRAN (R 3.6.0)                
#>  curl          4.3        2019-12-02 [1] CRAN (R 3.6.1)                
#>  DBI           1.1.0      2019-12-15 [1] CRAN (R 3.6.0)                
#>  dbplyr        1.4.2      2019-06-17 [1] CRAN (R 3.6.0)                
#>  desc          1.2.0      2018-05-01 [1] CRAN (R 3.6.0)                
#>  devtools      2.2.2      2020-02-17 [1] CRAN (R 3.6.0)                
#>  digest        0.6.25     2020-02-23 [1] CRAN (R 3.6.0)                
#>  dplyr       * 0.8.5      2020-03-07 [1] CRAN (R 3.6.0)                
#>  ellipsis      0.3.0      2019-09-20 [1] CRAN (R 3.6.0)                
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 3.6.0)                
#>  fansi         0.4.1      2020-01-08 [1] CRAN (R 3.6.1)                
#>  farver        2.0.3      2020-01-16 [1] CRAN (R 3.6.0)                
#>  forcats     * 0.5.0      2020-03-01 [1] CRAN (R 3.6.0)                
#>  fs            1.4.1      2020-04-04 [1] CRAN (R 3.6.1)                
#>  generics      0.0.2      2018-11-29 [1] CRAN (R 3.6.0)                
#>  ggplot2     * 3.3.0      2020-03-05 [1] CRAN (R 3.6.0)                
#>  glue          1.4.0      2020-04-03 [1] CRAN (R 3.6.1)                
#>  gtable        0.3.0      2019-03-25 [1] CRAN (R 3.6.0)                
#>  haven         2.2.0      2019-11-08 [1] CRAN (R 3.6.0)                
#>  highr         0.8        2019-03-20 [1] CRAN (R 3.6.0)                
#>  hms           0.5.3      2020-01-08 [1] CRAN (R 3.6.0)                
#>  htmltools     0.4.0      2019-10-04 [1] CRAN (R 3.6.0)                
#>  httr          1.4.1      2019-08-05 [1] CRAN (R 3.6.0)                
#>  jsonlite      1.6.1      2020-02-02 [1] CRAN (R 3.6.0)                
#>  knitr         1.28       2020-02-06 [1] CRAN (R 3.6.0)                
#>  labeling      0.3        2014-08-23 [1] CRAN (R 3.6.0)                
#>  lattice       0.20-41    2020-04-02 [1] CRAN (R 3.6.1)                
#>  lifecycle     0.2.0      2020-03-06 [1] CRAN (R 3.6.0)                
#>  lubridate     1.7.4      2018-04-11 [1] CRAN (R 3.6.0)                
#>  magrittr      1.5        2014-11-22 [1] CRAN (R 3.6.0)                
#>  memoise       1.1.0.9000 2020-01-12 [1] Github (r-lib/memoise@58d3972)
#>  mime          0.9        2020-02-04 [1] CRAN (R 3.6.0)                
#>  modelr        0.1.6      2020-02-22 [1] CRAN (R 3.6.0)                
#>  munsell       0.5.0      2018-06-12 [1] CRAN (R 3.6.0)                
#>  nlme          3.1-145    2020-03-04 [1] CRAN (R 3.6.0)                
#>  pillar        1.4.3      2019-12-20 [1] CRAN (R 3.6.0)                
#>  pkgbuild      1.0.6      2019-10-09 [1] CRAN (R 3.6.0)                
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 3.6.1)                
#>  pkgload       1.0.2      2018-10-29 [1] CRAN (R 3.6.0)                
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 3.6.0)                
#>  processx      3.4.2      2020-02-09 [1] CRAN (R 3.6.1)                
#>  ps            1.3.2      2020-02-13 [1] CRAN (R 3.6.0)                
#>  purrr       * 0.3.3      2019-10-18 [1] CRAN (R 3.6.0)                
#>  R6            2.4.1      2019-11-12 [1] CRAN (R 3.6.0)                
#>  Rcpp          1.0.4.10   2020-05-01 [1] Github (RcppCore/Rcpp@95d0854)
#>  readr       * 1.3.1      2018-12-21 [1] CRAN (R 3.6.0)                
#>  readxl        1.3.1      2019-03-13 [1] CRAN (R 3.6.0)                
#>  remotes       2.1.1      2020-02-15 [1] CRAN (R 3.6.0)                
#>  reprex        0.3.0      2019-05-16 [1] CRAN (R 3.6.0)                
#>  rlang         0.4.5      2020-03-01 [1] CRAN (R 3.6.1)                
#>  rmarkdown     2.1        2020-01-20 [1] CRAN (R 3.6.0)                
#>  rprojroot     1.3-2      2018-01-03 [1] CRAN (R 3.6.0)                
#>  rvest         0.3.5      2019-11-08 [1] CRAN (R 3.6.0)                
#>  scales        1.1.0      2019-11-18 [1] CRAN (R 3.6.0)                
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 3.6.0)                
#>  stringi       1.4.6      2020-02-17 [1] CRAN (R 3.6.0)                
#>  stringr     * 1.4.0      2019-02-10 [1] CRAN (R 3.6.0)                
#>  testthat      2.3.2      2020-03-02 [1] CRAN (R 3.6.0)                
#>  tibble      * 3.0.0      2020-03-30 [1] CRAN (R 3.6.2)                
#>  tidyr       * 1.0.2      2020-01-24 [1] CRAN (R 3.6.0)                
#>  tidyselect    1.0.0      2020-01-27 [1] CRAN (R 3.6.0)                
#>  tidyverse   * 1.3.0      2019-11-21 [1] CRAN (R 3.6.0)                
#>  usethis       1.5.1      2019-07-04 [1] CRAN (R 3.6.0)                
#>  vctrs         0.2.4      2020-03-10 [1] CRAN (R 3.6.1)                
#>  withr         2.1.2      2018-03-15 [1] CRAN (R 3.6.0)                
#>  xfun          0.12       2020-01-13 [1] CRAN (R 3.6.0)                
#>  xml2          1.3.0      2020-04-01 [1] CRAN (R 3.6.2)                
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 3.6.0)                
#> 
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library

@lmweber

lmweber commented May 29, 2020

Here is a link to @stephaniehicks's Twitter thread asking about this: https://twitter.com/stephaniehicks/status/1266208568862224384

There were also some other good suggestions there, e.g. adding a legend for all points and showing labels only for a subset (those that are far away from the rest).
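One way the subset idea might be sketched with the data from the original post: keep a color legend for every method, and set the label to an empty string for points near the crowded origin. The cutoffs (`MAST > 0.5` or `Wilcoxon > 0.3`) and the `label` column are hypothetical choices for illustration, not something proposed in the Twitter thread.

```r
library(ggplot2)
library(ggrepel)

# Data copied from the table in the original post.
pd <- data.frame(
  MAST = c(0.8, 0, 0.4, 0, 7.4, 0.2, 0, 0, 0.1, 1.4, 0, 0, 0, 0.6, 0, 0.1, 0, 0.3),
  Wilcoxon = c(0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 0, 0, 0, 0.1, 1.1, 0, 0),
  mtd = c("ALRA", "AutoImpute", "bayNorm", "DCA", "DeepImpute", "DrImpute",
          "kNN-smoothing", "MAGIC", "mcImpute", "PBLR", "no_imp", "SAVER",
          "SAVERX", "scImpute", "scRecover", "scScope", "scVI", "VIPER"),
  stringsAsFactors = FALSE
)

# Hypothetical cutoffs: label only the points away from the cluster near (0, 0).
far <- pd$MAST > 0.5 | pd$Wilcoxon > 0.3
# Empty-string labels are not drawn, but their points still repel other labels.
pd$label <- ifelse(far, pd$mtd, "")

p <- ggplot(pd, aes(x = MAST, y = Wilcoxon, color = mtd)) +
  geom_point() +
  # show.legend = FALSE keeps text glyphs out of the color legend.
  geom_text_repel(aes(label = label), show.legend = FALSE) +
  scale_x_continuous(limits = c(0, NA)) +
  scale_y_continuous(limits = c(0, NA)) +
  labs(color = "Method") +
  theme_bw()

print(p)
```

With 18 methods the default colors are hard to tell apart, so a manual palette such as the `scale_color_manual(values = v)` call from the original post would probably still be needed.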

@Winnie09
Author

Winnie09 commented Jun 4, 2020

Thank you all for your help! These are great solutions. I will try that.

@slowkow slowkow closed this as completed Jun 4, 2020