stop labelling when point density is too high (feature proposal) #15

Closed
ginolhac opened this issue Jan 15, 2016 · 2 comments
Comments

@ginolhac

First of all, I would like to thank you a lot for this fantastic package!
Labelling has been troublesome for a long time, and it is great to see a nice implementation that actually works well.

Not an issue, but a feature proposal.
It would be nice to add a point density threshold above which labelling is skipped. Beyond a certain density it is impossible to do a good job, so adding labels could simply be skipped.

Concrete case: volcano plots, which show the false discovery rate as a function of gene expression.
[Image: rplot]

Of course, the shape of the volcano provides information on the actual analysis, but which genes are where is the real question. In a shiny app, thanks to @wch, we can now use ggplot2 and interactively zoom in. In the static example above, I displayed the gene symbols only for the more extreme cases and it is still messy. But if we displayed those names only when repelling could be performed nicely, we could still zoom in on a region with unknown gene symbols and see them appear.
Of course, a threshold needs to be determined from the data density, but how? Divide the plot area into regions and estimate the density inside each? Sorry, I cannot really help with the implementation.
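
For illustration, the "divide the plot area into regions" idea can be sketched roughly as follows. This is a minimal, language-agnostic sketch written in Python with only the standard library; it is not part of ggrepel, and the function name `sparse_points` and the parameters `bins` and `max_per_cell` are hypothetical choices made for this example.

```python
from collections import Counter


def sparse_points(points, xlim, ylim, bins=10, max_per_cell=3):
    """Return the visible points lying in grid cells with at most max_per_cell points.

    points: iterable of (x, y, label) tuples (hypothetical input layout).
    xlim, ylim: (min, max) of the currently displayed plot area.
    """
    (x0, x1), (y0, y1) = xlim, ylim

    def cell(x, y):
        # Map a coordinate to its (column, row) cell; the upper edge is
        # clamped so that it falls into the last cell.
        col = min(int((x - x0) / (x1 - x0) * bins), bins - 1)
        row = min(int((y - y0) / (y1 - y0) * bins), bins - 1)
        return col, row

    # Only points inside the current view take part in the density estimate.
    visible = [(x, y, label) for x, y, label in points
               if x0 <= x <= x1 and y0 <= y <= y1]
    counts = Counter(cell(x, y) for x, y, _ in visible)

    # Keep a point only if its cell is not overcrowded.
    return [(x, y, label) for x, y, label in visible
            if counts[cell(x, y)] <= max_per_cell]
```

Because the grid is recomputed for the displayed limits, a cluster that is too dense in the global view can fall below the threshold once the user zooms in, so its labels would appear at that point. The cell count and threshold would still have to be tuned by hand.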

Thanks again for a great package!

@slowkow
Owner

slowkow commented Jan 15, 2016

Thanks for the comment! I had exactly this use case in mind when I developed ggrepel, and I'm glad it's being put to good use.

You might consider a very simple solution to the overcrowding problem: make the plot bigger or make the text smaller. This might not be possible, but maybe it's worth considering.

More generally, I believe the user should decide which points should be labeled. In other words, I believe it is the user's responsibility to determine if the data point density is too high. I don't know if it is sensible to implement density calculations within the ggrepel package and use those somehow to determine which points will be labeled... I'd rather try to keep the code simple if possible.
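
As a rough illustration of the user choosing which points to label, a filter applied before plotting might look like the sketch below. It is written in Python for brevity, and the field names (`symbol`, `fdr`, `change`) and the default cut-offs are assumptions made for this example, not anything defined by ggrepel.

```python
def genes_to_label(genes, max_fdr=0.01, min_abs_change=2.0):
    """Return the symbols of the genes that pass the user's own cut-offs.

    genes: list of dicts with hypothetical keys "symbol", "fdr" and "change".
    """
    return [g["symbol"] for g in genes
            if g["fdr"] <= max_fdr and abs(g["change"]) >= min_abs_change]
```

Only the returned subset would then be passed to the labelling layer, while all points are still drawn.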

@ginolhac
Author

I see your point and agree. Actually, following your idea, I am now thinking that I could make the labels really small in a large plot area, but tweak the label size when zooming is performed. Basically, in the shiny app, I would know exactly the coordinates that the user chooses and could increase the label size accordingly.
That should do, and does not require any extra coding from you ;) as long as the label size is small enough for the global view.
Thanks for the tip!
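
One possible way to derive the label size from the zoomed coordinates is sketched below in Python. The square-root scaling rule, the function name `zoomed_label_size` and the `max_size` cap are assumptions made for this example; the thread does not specify how the size should grow.

```python
import math


def zoomed_label_size(base_size, full_xlim, full_ylim, zoom_xlim, zoom_ylim,
                      max_size=5.0):
    """Scale a label size by the linear zoom factor, capped at max_size.

    base_size is the (small) size used for the global view.
    """
    full_area = (full_xlim[1] - full_xlim[0]) * (full_ylim[1] - full_ylim[0])
    zoom_area = (zoom_xlim[1] - zoom_xlim[0]) * (zoom_ylim[1] - zoom_ylim[0])
    if zoom_area <= 0:
        raise ValueError("zoomed region must have a positive area")
    # The square root turns the area ratio into a linear magnification.
    scale = math.sqrt(full_area / zoom_area)
    return min(base_size * scale, max_size)
```

With this rule, zooming to half the range on both axes doubles the label size, and the cap keeps labels readable in very tight zooms.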
