Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Experimental: new position nudge functions #183

Closed
wants to merge 25 commits into from
Closed

Experimental: new position nudge functions #183

wants to merge 25 commits into from

Conversation

aphalo
Copy link
Contributor

@aphalo aphalo commented Jan 9, 2021

I wrote three new nudge functions. I think they are a step towards "less-dumb" nudging. position_s_nudge_repel() does symmetric nudging right/left and up/down on either side of a "divide", by default the divide is the centroid of the data computed with mean(), but this can be set to another function or a numeric constant. position_r_nudge_repel() is similar but does the nudging radially away or towards a point, again by default the centroid. The third one is position_t_nudge_repel(). It is a bit more sophisticated and the code still needs some polishing. It fits a smoothing spline to the data, computes the first derivative, and uses this to apply the nudging away from the line. I have added examples, in some cases too many examples, to illustrate how they work.
Please, let me know what you think.

@slowkow
Copy link
Owner

slowkow commented Jan 10, 2021

Thank you so much! This sounds great, and I will try to have a closer look soon. I like that you are thinking about nudging in a different way than I do.

It's interesting that the R CMD check failed on Windows for your pull request. I don't see any error messages that can help to figure out what happened. By contrast, the current ggrepel code does not fail on Windows ... do you have any ideas why this might be happening? I'm puzzled.

@slowkow
Copy link
Owner

slowkow commented Jan 10, 2021

These are really cool functions! Thanks again, I'm impressed. I think this would be a really nice addition. I just want to discuss a bit of design first, thinking out loud about the user experience.

I looked over all the examples, and I don't think they're excessive at all. Actually, even more examples might be nice to get new users up to speed. I think the arguments to your functions are intuitive once you see enough examples.

Comment 1: Function names

As users type the names of these functions, would it be more natural to name them like this?

position_nudge_repel_r
position_nudge_repel_s
position_nudge_repel_t
position_nudge_repel

That way, if we want one of the (r, s, t) variants, it is the same keystrokes as the "boring" nudge (i.e. position_nudge_repel) until the final two keystrokes _s.

Comment 2: Merge 4 functions into 1?

I'm also wondering what you think about merging all 4 functions into one function. Do you think it might be more or less comfortable for users to have 1 instead of 4 functions?

Maybe something like:

PositionNudgeRepel <- ggproto("PositionNudgeRepel", Position,
  x = 0,
  y = 0,
  x_center = NA,
  y_center = NA,
  x_split = NA,
  y_split = NA,
  spline = NA,
  setup_params = ...,
  compute_layer = ...
)

Then, a user might write this to use two of your specialty nudges (radially away from center, horizontally away from x_split):

position_nudge_repel(
  x = 1,
  y = 1,
  x_center = mean,
  y_center = mean,
  x_split = mean,
  y_split = NA
)

Some if-statements would check which of the new parameters are not NA, then trigger based on what the user gives.

On the other hand, there is something nice about having separate functions. For the radial nudge, a user doesn't necessarily need to remember to pass mean by writing:

position_r_nudge_repel(x = 1, y = 1)

I think I'm leaning toward the one-function instead of the four-function design. What are your thoughts about this? Do you prefer the separate functions?

Thanks so much for contributing! :)

@aphalo
Copy link
Contributor Author

aphalo commented Jan 10, 2021

Thank, you! Somehow, our discussion last week and looking at your nudge function got me thinking, and as I implemented one of the functions I got the idea for the next one...
Comment 1: I agree, the names you suggest are much better as they will popup with auto-completion in RStudio,
Comment 2: combining the functions into only one could make the documentation too complex. Maybe two functions would be enough, position_nudge_repel(), position_nudge_repel_s(), and position_nudge_repel_r() could be combined into one and position_nudge_repel_t() kept separate, and maybe with a more illustrative name. My feeling is that position_nudge_repel_t() needs testing and tweaking and we may need to allow users to pass additional parameters for degree of smoothing or methods. At the moment it does not work for fewer than four observations and maybe lm() would work. So I think it would be wiser to keep position_nudge_repel_t() independent.
I will look into combining the functions in the next few days, and if you agree I will keep, at least for the time being position_nudge_repel_t().
I will run the test with many points locally and write a follow up comment later today.

@aphalo
Copy link
Contributor Author

aphalo commented Jan 10, 2021

I run the tests after installing either of the branches: in both cases they pass the tests.
I run R check on both branches, and in both cases the test fails when called as part of the check.
So both branches locally behave the same. The lack of useful information agrees with my experience that when the code fails, R simply crashes. I do not think there can be any significant difference in the C++ code or in the geoms as I merged master back into the branch I am working on after your most recent commit.
These two zip files are the ggrepel.Rcheck folder zipped after I checked each of the branches. Just in case this is of some use.
ggrepel-master.Rcheck.zip
ggrepel-nudge.by.pedro.Rcheck.zip

Merge position_nudge_repel_r() and position_nudge_repel_s()  into position_nudge_repel().
@aphalo
Copy link
Contributor Author

aphalo commented Jan 11, 2021

I have now done the merging of the three functions. I think that I managed to get the default behavior set so that users will almost never have to pass an argument to parameter direction, i.e., make an explicit choice from the three behaviors originally in the three functions. Earlier examples seem to work unchanged except one where an argument to direction is needed to get the same behavior as with the individual functions.
Once we have a polished version of the position_nudge_repel_t() function we can decide whether to merge it or not.

@slowkow
Copy link
Owner

slowkow commented Jan 11, 2021

Great! I will have a look soon. I'm excited about this pull request — I think users are going to love it.

Fix defaults and edits examples.
Add method = "linear" and direction = "split"
@aphalo
Copy link
Contributor Author

aphalo commented Jan 12, 2021

position_nudge_repel_t() is starting to look good. One could try to support other kinds of model fits, but computing the local slope at each point could become rather difficult. Please, let me know what you think. I think that as it is it already can solve a bunch of cases.
I had a look at the "random" crashes but I cannot figure out what is going on. I tried to add a Makevars.win to set compiler options but it was not obeyed. I thought it could be worthwhile disabling optimizations to see if this solved the crashes, but I was unable to get the compiler gg 8.3.0 to stop using -O2

Add formal parameter abline so that arbitrary straight lines can be entered. Add a couple of examples of its use.
Ensures a minimum distance from line/curve to nudged label by overriding "normal" nudging for points very near or on the line or curve. Testing still needed.
@aphalo
Copy link
Contributor Author

aphalo commented Jan 13, 2021

One important thing to consider is the names of the new formal parameters. Once they are in use in a released version it will be very difficult to change them, so your feedback on this would be useful. I will now do some testing of these position functions with tests in 'ggspectra' based on real use cases like light spectra from the sun and various lamp types. I think it is better not to merge the two functions. I am afraid a single merged function would become too complex to use and document. I think that now the interfaces a rather intuitive (at least for me) and easy to grasp.

@aphalo
Copy link
Contributor Author

aphalo commented Jan 13, 2021

I do not see commits from you in the last few hours, but now I see "All checks have passed" for my most recent commit! Surprising!

@slowkow
Copy link
Owner

slowkow commented Jan 13, 2021

I think that now the interfaces a rather intuitive (at least for me) and easy to grasp.

This is the goal! If users will find it comfortable and intuitive, then I think we've succeeded. Based on the few examples I have seen so far, I think many people will be using your nudge functions.

It sounds like you are ready for me to review once again, so I will go ahead and check to see if everything is looking OK to me. I'll write another comment once I do that.

I do not see commits from you in the last few hours, but now I see "All checks have passed" for my most recent commit! Surprising!

This is the second time I have seen code simultaneously showing two behaviors:

  • failing on a user's machine
  • succeeding on a CI service (that should be a fair representation of the user's machine)

I've come to expect that the CI services (e.g. Travis, Github Actions) are not necessarily representative of the actual computers that people use at home or work. I don't know why.

I was crossing my fingers and hoping that the test failure (on Github Actions Windows) was reproducing your experience, but it seems that is not reliable.

@aphalo aphalo marked this pull request as ready for review January 14, 2021 19:25
@aphalo
Copy link
Contributor Author

aphalo commented Jan 14, 2021

The failure under macOS is unrelated to 'ggrepel'.

@aphalo
Copy link
Contributor Author

aphalo commented Jan 17, 2021

@slowkow I have added the complementary "normal" version (for geom_text() and geom_label()) of these nudging functions to 'ggpmisc'. The tentative names I am using are position_nudge_center() and position_nudge_line(). In the second of these I have just added support for polynomial fits with lm(). I used package 'polynom' to easily compute the derivative and the prediction from it. I was already importing some functions from 'polynom' in 'ggpmisc'. In the case of 'ggrepel' this would add a new dependency, and may not be worthwhile, specially that repulsion and connection with segments gives additional flexibility.

Please, let me know what you think... whether you would like me to add this feature or not. I am not in a hurry to release 'ggpmisc' so I will wait until you have reviewed my pull request and we have reached a consensus about names to use for the parameters and functions in the two packages, as it would be best to be consistent.

@aphalo
Copy link
Contributor Author

aphalo commented Jan 19, 2021

@slowkow Please, do not yet review this pull request. I have found some bugs that are easier to see in the absence of repulsion. I will let you know, hopefully next week, when these are all fixed.
I changed its status back to "Draft".

@aphalo aphalo changed the title Experimental: three new position nudge functions Experimental: new position nudge functions Jan 19, 2021
@aphalo aphalo marked this pull request as draft January 19, 2021 11:24
In some cases center_x or center_y could remain as NULL instead of being set to mean.
As position_nudge_repel() and position_nudge_repel_t() compute summaries from data or fit splines or polynomials, there is a need to obey grouping and facets. As the grouping may be best ignored in some cases, I replaced the compute_layer function by a compute_panel function and included in this function the code to support grouping.
There was also a problem in position_nudge_repel_t() with the computation of the angles, which required taking into consideration the range of the data, for this to work consistently across groups, we get the information from scales, which is a parameter of the panel function, but not of the layer function.
For what I have tested, things seem to work reasonably well. 
I haven't thoroughly tested if the use of the compute_panel function changes the behaviour compared to position_nudge(). It also remains a test of facets with free scales. I would expect this not to work well, but I do not know if it works with position_nudge().
@aphalo aphalo marked this pull request as ready for review January 21, 2021 15:00
@aphalo
Copy link
Contributor Author

aphalo commented Jan 21, 2021

@slowkow There are just a couple of large commits as I refined the code mostly within 'ggpmisc' as without repulsion the nudging was easier to debug and test. Then I copied back my work into 'ggrepel'. I haven't yet added examples with grouping and facets, but I tested them in 'ggpmisc'. I've done quite a lot of progress, but I think more thorough testing of grouping and specially facets is still needed. Anyway, it would be good to get some feedback, when you have some time to play with the code. Merging the two functions, I think would make the function too complex, both in their interfaces and the logic of the code.

@aphalo
Copy link
Contributor Author

aphalo commented Jan 22, 2021

@slowkow An idea occurred to me overnight: if the 'ggprepel' position functions would name the returned variables differently, i.e., keep using x and y columns for the nudged values and some other name, say x_orig and y_orig for the copy of the input, they would be backwards compatible and usable with any ggplot geom. This would just eliminate the need to include similar functions in both 'ggrepel' and 'ggpmisc'. They could be simply included in 'ggrepel' (or if you prefer in 'ggpmisc'). This would require changes to the geoms in 'ggrepel' but I think the advantages would be many, both immediately and in the future. Such a change would not break any user code as far as I can think.

@aphalo
Copy link
Contributor Author

aphalo commented Jan 22, 2021

@slowkow I implemented the change in naming as it was just a 5 min task. It does work. This implementation is "quick and dirty" in that I left the existing code in the geoms unchanged, but added code to rename the columns to the previously used names. This could be improved if you agree to this renaming.
Then one could have a simple version of position_nudge() identical to that in 'ggplot2', but that keeps the original position. This could be called ggrepel::position_nudge(), i.e., override the version in 'ggplot2', to keep things as simple as possible for users. One could even very easily do the same with any other position function in 'ggplot2' that could be useful with the repulsive geoms.
I think I should for a while stop throwing ideas at you, and let you digest all this in peace...

@slowkow
Copy link
Owner

slowkow commented Jan 22, 2021

I like the new naming! Thanks for the great idea. I'll have to spend some time looking through the new examples to build confidence, but it sounds like a very good change.

@aphalo
Copy link
Contributor Author

aphalo commented Jan 25, 2021

I submitted to CRAN 'ggpmisc' 0.3.8 without the position_nudge...() functions, I attach here some examples that are different to those in my pull request, all of them using geom_text() making it easier to see the effect of the nudging by itself.

Rmarkdown source

@aphalo
Copy link
Contributor Author

aphalo commented Mar 27, 2021

@slowkow We both seem to be quite busy. The new nudge functions in my pull request will require to be supported by a maintainer. If you think they would be a burden in 'ggrepel', they could be in their own package, with myself as maintainer, or I could add them to 'ggpmisc'. At this point described only for use with non-repulsive geoms. When and if you decide to accept the suggested name change for the columns in data 'ggrepel', they would be also usable with the repulsive geoms. However, in the meantime they would be available for use with non-repulsive geoms.
Please, let me know what you think.

Add missing ggplot2:: and change direction -> params$direction in one warning message.
@slowkow
Copy link
Owner

slowkow commented Mar 29, 2021

@aphalo Thanks for bringing my attention back to this! I hope you're doing well. I apologize for the long delays, but it is difficult to prioritize open-source work over other things that need my attention more urgently.

Maybe the best forward is for you to maintain the new code for nudges in ggpmisc (or elsewhere)? Here's what I'm thinking:

  • If you maintain the nudges in your own package, you will have freedom to update your code as you see fit without me slowing you down. Sorry about that — you are moving more quickly than me, and I don't want to hold you back. I appreciate your patience, but I don't want to be a hindrance.
  • You're absolutely right that the nudges have utility outside the context of ggrepel. This makes me think that the most natural place for them might be in a separate package. I think there's a good chance that other package developers may eventually want to depend on your nudges, too. For example, packages like ggnetwork might benefit from having flexible nudges.
  • Honestly, I think anyone using ggplot2 would enjoy more flexible nudges, so it might be worth a moment to consider if ggplot2 maintainers would accept them directly into ggplot2... just a thought.
  • I will certainly plan to modify ggrepel so that it works correctly with your functions, because I want to use these new nudges in my own projects. I think you're right that it is a trivial change to the ggrepel code to make this happen.
  • I'll also plan to add documentation examples to ggrepel that demonstrate your functions, to encourage users to try them out. The docs can direct users to install the package that is required for these examples.

By the way, I was hoping to get a closer look at your examples to see how they work without ggrepel, but they don't seem to be working for me. Here's what I get when I knit your Rmd. Sorry if I'm missing something obvious... (before knitting, I git cloned your ggpmisc repo and installed with R CMD INSTALL .)

advanced-nudging.pdf

Thank you again for developing this code and sharing it! 🙏

@aphalo
Copy link
Contributor Author

aphalo commented Mar 29, 2021

@slowkow No problem at all. I am doing well, thanks, and you? I understand your situation well. I had not touched this code in all this time myself. I have also been busy.

I think, based on what Hadley has written, he does not want to add new functionality to ggplot2. So, I guess what he could potentially accept is adding an additional parameter to the existing position functions in ggplot2 to enable the saving of the original x and y coordinates when they are modified by the position functions. It could be good to rise an issue for ggplot2 early enough to agree with Hadley et al. at least on the name to use. I am using x_orig and y_orig but this can be changed. On the other hand, maybe better wait until you have tested that the change in naming I have suggested does not break things.

There could be some use for a geom_text() and geom_label() with no repulsion but with the ability to add segments or arrows for nudged text. I started writing these but rapidly stopped as they would have lots of very similar code to that in geom_text_repel() and gem_label_repel() so maybe adding to your geoms a simple switch to disable repulsion, or writing a wrapper with suitable defaults to disable repulsion could be a better solution.

I need to think what to do with 'ggpmisc', but probably it would be best to split it into a package with the general purpose functions (including nudging) and another with the statistics-related plot decorations.

I will have a look at the examples, I probably have broken something when moving around the code. I'll let you know what I find.

Thank you for being open to suggestions and dialogue! And for the wonderful tool that ggrepel is!

@aphalo
Copy link
Contributor Author

aphalo commented Apr 2, 2021

@slowkow About the examples based on 'ggpmisc': I had removed the nudge functions before submitting 'ggpmisc' 0.3.8 to CRAN. I put them back into the main branch of 'ggpmisc' rather recently. The examples using geom_text() and 'ggpmisc' are now working again for me. Sorry about this!

@aphalo
Copy link
Contributor Author

aphalo commented Apr 26, 2021

@slowkow If you agree I can close this pull request and make a new one with only the name change code. The new position_nudge functions are now in 'ggpmisc' and do not need to be included in 'ggprepel'. There are two possibilities. 1) Change the naming used in 'ggrepel', which would allow position_nudge_repel() to be used with non-repulsive geoms. 2) Not to change the naming used within 'ggrepel' at this point but add code to translate the new naming as used in 'ggpmisc' if found in the data. The second option is the least disruptive, and would need the least testing as the new code, a few lines, would be only be run if the new naming is encountered. Please, let me know what you think.

@slowkow
Copy link
Owner

slowkow commented Apr 27, 2021

@aphalo Thanks for the update! Sorry I haven't been able to find more time for this.

I'd appreciate if you could propose these changes:

  1. Change the naming used in 'ggrepel', which would allow position_nudge_repel() to be used with non-repulsive geoms.

If you don't think it'll break other packages, then I'm for it.

@aphalo
Copy link
Contributor Author

aphalo commented Apr 28, 2021

@slowkow o.k., I wrote the code some weeks ago but I need to check it carefully before making the new pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants