Enable hjust to place text proportionally along a line within the plot bounds #60

Closed
jwhendy opened this issue Jan 18, 2022 · 11 comments

@jwhendy

jwhendy commented Jan 18, 2022

Greetings,

Awesome package! I just was introduced to it via my StackOverflow question here, and the answer-er ran into issues with reproducing my desired result given the hjust option. This bit from the manual is very enticing, though in the example it didn't seem entirely accurate?

Although such lines aren’t curved, there are some benefits to using the geomtextpath functions if a labelled reference line is required: only a single call is needed, co-ordinates are not required for the text label, ...

Mainly, while coordinates aren't required, it does seem that tailoring hjust to the specific line is still necessary. I'll re-create the example here.

base plot

library(ggplot2)

set.seed(123)
df <- data.frame(
  x = runif(100, 0, 1),
  y = runif(100, 0, 1))

lines <- data.frame(
  intercept = rep(0, 5),
  slope = c(0.1, 0.25, 0.5, 1, 2))

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_abline(aes(intercept = intercept, slope = slope),
              linetype = "dashed", data = lines)

grab_2022-01-18_083517

manual labeling (trial and error to construct the data frame of x, y, and labels)

This would be the ideal output of some automated solution, placing the labels nicely off to the top/side.

labels <- data.frame(
  x = c(rep(1, 3), 0.95, 0.47),
  y = c(0.12, 0.28, 0.53, 1, 1),
  label = lines$slope)

p + geom_text(aes(label = label), color = "red", data = labels)

grab_2022-01-18_083626

solution from camille using geom_textabline

library(geomtextpath)
p + geom_textabline(aes(intercept = intercept, 
                        slope = slope,
                        label = as.character(slope)),
                    data = lines,
                    gap = FALSE,
                    offset = unit(0.2, "lines"),
                    text_only = TRUE)

grab_2022-01-18_083844

The hjust argument isn't doing exactly what I would expect given typical ggplot2 meaning (left, center, right justification). I understand from camille's answer that 0.5 is the default value, however from trial and error starting at 0 and incrementing by 0.1, I found that 0.4 is the first time they show up:

p + geom_textabline(aes(intercept = intercept, 
                        slope = slope,
                        label = as.character(slope)),
                    data = lines,
                    gap = FALSE,
                    offset = unit(0.2, "lines"),
                    text_only = TRUE,
                    hjust = 0.4)

grab_2022-01-18_084200

And by 0.6, the top label is already off the plot area, with the rest following at 0.7.

p + geom_textabline(aes(intercept = intercept, 
                        slope = slope,
                        label = as.character(slope)),
                    data = lines,
                    gap = FALSE,
                    offset = unit(0.2, "lines"),
                    text_only = TRUE,
                    hjust = 0.6)

grab_2022-01-18_084213

#27 and #34 seem related to this, but for contours. Basically giving a sort of "auto-placement", though in this case it seems a lot more straightforward, just figuring out the min/max range of the line vs. having to figure out where a contour is "flattest".

I'll include her suggestions for "manual" placement, either from the data itself, or by using the plot object internals:

library(dplyr)  # provides %>% and mutate()

# y = intercept + slope * x
xmax <- max(df$x) 
# or layer_scales(p)$x$get_limits()[2] for data range
# or ggplot_build(p)$layout$panel_params[[1]]$x.range[2] for panel range
ymax <- max(df$y)
lines_calc <- lines %>%
  mutate(xcalc = pmin((ymax - intercept) / slope, xmax),
         ycalc = pmin(intercept + slope * xmax, ymax))

p +
  geom_text(aes(x = xcalc, y = ycalc, label = as.character(slope)),
            data = lines_calc, vjust = 0, nudge_y = 0.02)

I've honestly never programmed "under the hood" in an R package, but would be willing to try if you think there is merit/feasibility to any of these approaches? Or if there was a suggested couple of files to look in for something like the $get_limits trick that might be used for this package, I could take those as a starting point?

Let me know what you think, and thanks for your consideration/thoughts!

@teunbrand
Collaborator

I don't think the auto-placement rules are at the heart of the issue. I think one reason for the discrepancy between how hjust behaves and what our intuition says is the lines below, which expand the x-range:

if (coord$clip == "on" && coord$is_linear()) {
  # Ensure the line extends well outside the panel to avoid visible line
  # ending for thick lines
  ranges$x <- ranges$x + c(-1, 1) * diff(ranges$x)
}

A second reason is because the y-values for the slope = 2 are out-of-bounds, creating a longer 'arc'-length than what somebody perceives when the panel is clipped.

I don't really see a way around this problem besides manually clipping (calculate intersection points with the axes). The proper place for this is the Geom{Text/Label}abline ggproto classes, without any adjustments more downstream. This is essentially incompatible with the line ending trick mentioned above. Allan, do you think that fixing the hjust is more important than having nice line-endings?
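For illustration only, manual clipping of a reference line could look roughly like the sketch below. This is not code from geomtextpath: `clip_abline` is a hypothetical helper, and the panel limits are assumed to be exactly 0 to 1 on both axes (no scale expansion).

```r
# Hypothetical helper: clip the line y = intercept + slope * x to a panel.
# Returns the two end points of the visible segment, or NULL if the line
# does not pass through the panel.
clip_abline <- function(intercept, slope, xrange, yrange) {
  # Start from the horizontal extent of the panel
  x <- xrange
  if (slope != 0) {
    # x-values at which the line crosses the bottom and top panel edges
    x_at_y <- sort((yrange - intercept) / slope)
    x <- c(max(x[1], x_at_y[1]), min(x[2], x_at_y[2]))
  } else if (intercept < yrange[1] || intercept > yrange[2]) {
    # Horizontal line entirely above or below the panel
    return(NULL)
  }
  if (x[1] > x[2]) return(NULL)
  data.frame(x = x, y = intercept + slope * x)
}

# Assumed panel limits of [0, 1] on both axes
clip_abline(0, 2,   xrange = c(0, 1), yrange = c(0, 1))  # leaves through the top edge
clip_abline(0, 0.5, xrange = c(0, 1), yrange = c(0, 1))  # leaves through the right edge
```

With clipped end points like these, hjust = 0 and hjust = 1 would refer to the visible ends of the line, at the cost of the line-ending trick described above.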

As for the ideal solution to your problem, I don't think placing the labels at 0-degree angles at the edges is within the scope of this function: placing a label along the line would be.

@jwhendy
Author

jwhendy commented Jan 18, 2022

I don't think the auto-placement rules are at the heart of the issue.

I think my intuitive use of "auto-place" vs. what's meant in this project (e.g. for contours) is confusing. Sorry about that. I just meant "given a line, auto-place the labels proportionally along it without me having to figure out the positions via trial and error manually."

I can change the title if it's helpful? "Enable placement of text proportionally along a line within the plot bounds using hjust"?

This is essentially incompatible with the line ending trick mentioned above. Allan, do you think that fixing the hjust is more important than having nice line-endings?

Forgive my guessing, given that I'm new to the library, but would a third option be to split up the handling for lines vs. labels? I noticed that the chunk at L176 is repeated at L284... do these have to be identical?

@teunbrand
Collaborator

"given a line, auto-place the labels proportionally along it without me having to figure out the positions via trial and error manually."

Yes, I agree, that is exactly what I think it should do, but at the moment that doesn't happen for the abline variant.

Forgive my guessing, given that I'm new to the library, but would a third option be to split up the handling for lines vs. labels? I noticed that the chunk at L176 is repeated at L284... do these have to be identical?

There is some redundancy between code because we have a plain text and label variant (with textbox) that are very similar. We have to keep labels and lines together because we allow the gap argument to break up the line if it appears to intersect with the text.

@AllanCameron
Owner

Hi John - thanks for writing. You give a clear demonstration of the problem, and it was one I was aware of when I was writing the geom_textabline function - as Teun says, it is fairly clear where the problem lies: the reference lines are all extended way off the plotting area to ensure that ugly line ends aren't visible. This means that for the reference line functions, the hjust currently needs to be tweaked by the user on a per-line basis. This is pretty easy to do using the scale_hjust_manual function, which was included specifically for this purpose.

p + geom_textabline(aes(intercept = intercept, 
                        slope = slope,
                        label = as.character(slope),
                        hjust = as.character(slope)),
                    data = lines,
                    gap = FALSE,
                    offset = unit(0.2, "lines"),
                    text_only = TRUE,
                    color = "red") +
  scale_hjust_manual(values = c(0.65, 0.65, 0.65, 0.65, 0.5))

However, the problem is that the hjust values we need to supply are a bit off. The ends of the bottom four on-screen lines effectively map to 0.33 - 0.67 instead of 0 - 1 because of the line extension code that Teun pointed out, and the ends of the top line effectively map to 0.33 - 0.5 because it doesn't reach the right edge of the screen.
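The arithmetic behind those figures can be checked with a small sketch. It assumes the panel spans exactly 0 to 1 on both axes (ignoring ggplot2's default scale expansion) and that position along a straight line is proportional to x; the variable names are illustrative only.

```r
slopes <- c(0.1, 0.25, 0.5, 1, 2)
xrange <- c(0, 1)  # assumed panel limits, no scale expansion
yrange <- c(0, 1)

# Extended x-range from the line extension code: three times the panel width
ext <- xrange + c(-1, 1) * diff(xrange)

# Visible x-extent of each line y = slope * x (intercept 0, positive slope)
x_start <- pmax(xrange[1], yrange[1] / slopes)
x_end   <- pmin(xrange[2], yrange[2] / slopes)

# Visible ends expressed as fractions of the full extended line
visible <- data.frame(
  slope = slopes,
  from  = (x_start - ext[1]) / diff(ext),
  to    = (x_end - ext[1]) / diff(ext)
)
print(round(visible, 2))
```

Under these assumptions the first four lines are visible between about 0.33 and 0.67 of their extended length, and the slope = 2 line between about 0.33 and 0.5, because it leaves through the top of the panel at x = 0.5.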

@Teun - I don't think we need to shorten the lines to get the hjust working more intuitively - I think we can calculate the visible portion of the line inside the draw_panel function and remap the hjust onto it.
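As a rough illustration of the remapping idea (a sketch only, not the package's draw_panel code; `remap_hjust` is a hypothetical helper), a user-facing hjust in [0, 1] can be rescaled linearly onto the visible stretch of the extended line:

```r
# Hypothetical helper: map a user-facing hjust in [0, 1] onto the visible
# part of an extended line. `from` and `to` are the positions of the visible
# ends, expressed as fractions of the full extended line.
remap_hjust <- function(hjust, from, to) {
  from + hjust * (to - from)
}

# A line visible between 1/3 and 2/3 of its extended length
remap_hjust(c(0, 0.5, 1), from = 1/3, to = 2/3)

# A line that leaves the panel early, visible between 1/3 and 1/2
remap_hjust(1, from = 1/3, to = 0.5)
```

With this mapping, hjust = 1 lands on the point where the line exits the panel, whichever edge that is.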

@jwhendy changed the title from "The ability to auto-place geom_textabline labels proportionally along a line" to "Enable hjust to place text proportionally along a line within the plot bounds" on Jan 18, 2022
@jwhendy
Author

jwhendy commented Jan 18, 2022

@teunbrand

I tweaked the title to hopefully be more accurate.

There is some redundancy between code because we have a plain text and label variant (with textbox) that are very similar. We have to keep labels and lines together because we allow the gap argument to break up the line if it appears to intersect with the text.

Bah, sorry, I hadn't parsed the meaning of the two functions in R/geom_textabline. I thought one was creating just the line, specifically, and the other was applying the text (my interpretation of "label"). If this were the case, it seemed that the extremes of the line and text could be different. Now I get that these are just two variants.

@AllanCameron

I don't think we need to shorten the lines to get the hjust working more intuitively - I think we can calculate the visible portion of the line inside the draw_panel function and remap the hjust onto it.

This was my hope, but I wasn't sure where one would start. Basically, keep using your expanded lines trick while also "smartly" calculating the true bounds (and thus fitting to either xmax or ymax, whichever is hit first). Would you want me to try implementing something? I'm not sure of the exact flow of things, but I might be able to prototype something... let me know!

This is pretty easy to do using the scale_hjust_manual function, which was included specifically for this purpose.

Awesome and this is still an easier workaround with at most half the guesses required vs. my approach in the example :)

Thanks for the quick help and consideration!

@AllanCameron
Owner

Thanks @jwhendy

Would you want me to try implementing something?

You are of course very welcome to clone the repo, make changes, and submit a pull request, but Teun and I are quite well immersed in the code base (and in particular the effects that changing one part might have on another), so if you wait a couple of days (or maybe even hours!) we'll see what we can do.

@jwhendy
Author

jwhendy commented Jan 18, 2022

@AllanCameron indeed, exactly the sort of cost benefit analysis I was looking for :) I really appreciate it, and was super surprised to see the SO answer on a new package. This will certainly become one of my goto's like ggrepel since stumbling on it!

@teunbrand
Collaborator

teunbrand commented Jan 18, 2022

I'll take on this issue; there are some opportunities for refactoring here as well (unless Allan already has a working solution at the moment he reads this).

@jwhendy
Author

jwhendy commented Jan 18, 2022

Just came here to link to the answer I posted with your solution, @AllanCameron. Couldn't help but see the PR and the close as well. You are fast!

I think this is CloseEnough, but I did re-install and re-run my example. What are your thoughts on the behavior of hjust = 1? I would intuitively have expected either:

  • all the text to be slightly off the screen. Perhaps 1 is "all the way to the edge" and with e.g. left alignment, this would anchor all left text edges to the plot boundary, or
  • always just barely inside the plot boundary, with a value of 1 meaning "go as far as you can while still being visible"

I did not expect some to be more visible than others, and some still off the plot border.

library(ggplot2)
library(geomtextpath)

set.seed(123)
df <- data.frame(
  x = runif(100, 0, 1),
  y = runif(100, 0, 1))
lines <- data.frame(
  intercept = rep(0, 5),
  slope = c(0.1, 0.25, 0.5, 1, 2))

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_abline(aes(intercept = intercept, slope = slope),
              linetype = "dashed", data = lines)
p + geom_textabline(aes(intercept = intercept, 
                        slope = slope,
                        label = as.character(slope)),
                    hjust = 1,
                    data = lines,
                    gap = FALSE,
                    text_only = TRUE,
                    offset = unit(0.2, "lines"),
                    color = "red")

grab_2022-01-18_164005

Using hjust=0.95 works great:

grab_2022-01-18_164033

@AllanCameron
Owner

AllanCameron commented Jan 18, 2022

Yes John, @teunbrand has come up with the goods even quicker than I expected!

The "2" is off the page simply because its vertical justification nudges it beyond the plotting margins. If you use a vjust of 0.5 or larger you should see it popping into view.

I think I will close this issue for now, as I am much happier with the new behaviour and it had a lower overhead than expected.

Many thanks for bringing this to our attention John - it's useful to get feedback to help iron out these early bugs.

@jwhendy
Author

jwhendy commented Jan 18, 2022

The "2" is off the page simply because its vertical justification nudges it beyond the plotting margins. If you use a vjust of 0.5 or larger you should see it popping into view.

I didn't see a change with hjust=1, vjust=1, and others are also partially off the page, but I also completely agree with:

I think I will close this issue for now, as I am much happier with the new behaviour and it had a lower overhead than expected.

Indeed, works for me and this is easily 95% improved on the behavior of hjust. Much appreciated to the both of you!
