Keywords for placement of label #34

Closed
teunbrand opened this issue Dec 10, 2021 · 12 comments

Comments

@teunbrand
Collaborator

teunbrand commented Dec 10, 2021

We were discussing in #27 that it might be convenient to have labels placed at some position. In particular, we were discussing hjust = "auto" for placing the label at the flattest part of the curve, but that got me thinking about other placement rules. I think the following keywords for the hjust parameter make sense:

  • "flattest", as we discussed before: at the flattest part of the curve. We can also do "steepest" to do the inverse, but that makes less sense to me.
  • "xmin"/"xmax"/"xmid" for placement at the leftmost/rightmost or middle horizontal position on the curve.
  • "ymin"/"ymax"/"ymid" for placement at the bottom, top or middle vertical position on the curve.
  • "before"/"after" for doing the equivalent of hjust = -1 or hjust = 2, where the text is anchored a textwidth away from the stated anchor point. This would translate to before the start of the curve or after the end of the curve (where the angles would be extrapolated based on the first/last angle). Consequently, this text would be flat and we can use the simplified per-string placement instead of the per-character placement.

This is not an exhaustive list, but these came to mind.
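
As a rough illustration of how keywords like these could be resolved, the sketch below maps a keyword to a numeric hjust expressed as a proportion of arc length along the path. The function name keyword_to_hjust is hypothetical and this is not the package's implementation; only the min/max keywords are covered.

```r
# Hypothetical helper: convert a placement keyword into a numeric hjust in
# [0, 1], measured as a proportion of cumulative arc length along the path.
keyword_to_hjust <- function(x, y, keyword) {
  # Cumulative arc length at each vertex, starting at 0
  arclength <- c(0, cumsum(sqrt(diff(x)^2 + diff(y)^2)))
  # Index of the vertex selected by the keyword
  i <- switch(keyword,
    xmin = which.min(x),
    xmax = which.max(x),
    ymin = which.min(y),
    ymax = which.max(y),
    stop("Unknown keyword: ", keyword)
  )
  arclength[i] / arclength[length(arclength)]
}

x <- seq(0, pi, length.out = 101)
y <- sin(x)
keyword_to_hjust(x, y, "xmin")  # start of the path
keyword_to_hjust(x, y, "ymax")  # the peak, roughly halfway along by symmetry
```

Because the result is a proportion of arc length, it can be passed on wherever a numeric hjust is already accepted.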

@AllanCameron
Owner

This might be a nice touch. Though "flat" isn't the opposite of "steep" in this context. It's the opposite of "curved", so for example, in your economics example, the peaks are steep but they are also the "flattest" regions for text to be placed.
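
The distinction can be made concrete with a finite-difference estimate of curvature. This is a minimal sketch using a hypothetical helper, path_curvature, rather than code from the package: a steep straight line has (numerically) zero curvature, while the vertex of a parabola has zero slope but clearly non-zero curvature.

```r
# Central differences, one-sided at the two ends
central_diff <- function(v) {
  n <- length(v)
  c(v[2] - v[1], (v[3:n] - v[1:(n - 2)]) / 2, v[n] - v[n - 1])
}

# Hypothetical helper: unsigned curvature of a path at each vertex,
# |x'y'' - y'x''| / (x'^2 + y'^2)^(3/2), estimated by finite differences
path_curvature <- function(x, y) {
  dx  <- central_diff(x)
  dy  <- central_diff(y)
  ddx <- central_diff(dx)
  ddy <- central_diff(dy)
  abs(dx * ddy - dy * ddx) / (dx^2 + dy^2)^1.5
}

x <- seq(-1, 1, length.out = 201)

# Steep but straight: slope of 10, curvature of essentially zero
max(path_curvature(x, 10 * x))

# Horizontal at the vertex (x = 0), yet curved there
path_curvature(x, x^2)[101]
```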

@teunbrand
Collaborator Author

teunbrand commented Dec 10, 2021

You're right, I was thinking about horizontal and vertical instead, which also might be neat options, but it is indeed not necessarily the flattest part. Your comment about using the curvature for placing the label on the flattest part makes way more sense to me now.

@AllanCameron
Owner

I've made some progress with the automatic label placement on the flattest areas of a plot, using a modified rolling mean of curvature which finds the least curved section and sets the hjust as the proportion of the arclength at that point. I have created a little function that processes the data frame from inside geom_textpath, but this could maybe be called from inside textpathGrob and made more efficient (since it uses the split-apply-bind method). Really just a proof of concept at the moment, but it seems to work pretty well:

library(geomtextpath)
#> Loading required package: ggplot2

df <- data.frame(x = 1:100, y = cos(seq(0, 2 * pi, len = 100)),
                 label = "A text label of moderate length.")

ggplot(df, aes(x, y, label = label)) + geom_textpath()

ggplot(df, aes(x, y, label = label)) + geom_textpath(hjust = "auto")

set.seed(1)

df <- data.frame(x = rnorm(100), y = rnorm(100))

ggplot(df, aes(x, y)) + geom_labeldensity2d()

ggplot(df, aes(x, y)) + geom_labeldensity2d(hjust = "auto")

It even finds a place for your label in the difficult economics example:

p <- ggplot(economics, aes(date, unemploy)) +
  geom_path(colour = "grey")
p + geom_textpath(
    aes(label = "Decline", group = 1),
    hjust = "auto", size = 5, include_line = FALSE)

Created on 2021-12-10 by the reprex package (v2.0.0)
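
The idea described at the top of this comment (smooth the curvature, take the least curved section, and express its position as a proportion of arc length) can be sketched in base R. This is only an illustration under assumptions: flattest_hjust and its window argument are hypothetical, a plain centred rolling mean stands in for the "modified" one mentioned above, and the curvature is computed in raw data coordinates.

```r
# Hypothetical helper, not the geomtextpath implementation.
# Assumes at least 3 distinct points and an odd window size.
flattest_hjust <- function(x, y, window = 11) {
  # Central differences, one-sided at the two ends
  cdiff <- function(v) {
    n <- length(v)
    c(v[2] - v[1], (v[3:n] - v[1:(n - 2)]) / 2, v[n] - v[n - 1])
  }
  dx <- cdiff(x)
  dy <- cdiff(y)
  # Unsigned curvature at each vertex
  k <- abs(dx * cdiff(dy) - dy * cdiff(dx)) / (dx^2 + dy^2)^1.5
  # Centred rolling mean; the first and last few values are NA
  smooth <- as.numeric(stats::filter(k, rep(1 / window, window), sides = 2))
  i <- which.min(smooth)  # which.min() skips NA values
  # Arc length up to vertex i as a proportion of the total
  arclength <- c(0, cumsum(sqrt(diff(x)^2 + diff(y)^2)))
  arclength[i] / arclength[length(arclength)]
}

x <- seq(0, 2 * pi, length.out = 200)
# For a cosine wave the least curved sections are around the inflection points
flattest_hjust(x, cos(x))
```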

@teunbrand
Collaborator Author

teunbrand commented Dec 10, 2021

Yes, that does seem to work pretty well! I don't really worry about efficiency outside of the makeContent code as it doesn't need to run every time the user resizes their window (but all else being equal, more efficiency is better than less efficiency).
The only reason I can see to run this from within the makeContent code is that we would then know the exact text width, so we could choose an optimal window for calculating the running mean and get the appropriate curvature.

I tried testing whether the point of minimum curvature is stable under aspect ratio deformation, but this appears not to be the case.

set.seed(42)

# Random walk
x <- cumsum(rnorm(200))
y <- cumsum(rnorm(200))
plot(x, y, type = 'l')

# Aspect ratios to test
asp <- seq(1, 5, length.out = 100)

# Calculate curvature for every ratio
curv <- vapply(asp, function(mult) {
  geomtextpath:::.get_curvature(x * mult, y)
}, numeric(length(x)))

# Visualise curvature
image(
  list(y = asp, x = 1:200, z = curv),
  useRaster = TRUE, col = hcl.colors(255, "YlOrRd", rev = TRUE)
)

# The minima are not always at the same point
min_curv <- apply(curv, 2, which.min)
all(min_curv == min_curv[1])
#> [1] FALSE

Created on 2021-12-10 by the reprex package (v2.0.1)

However, there aren't many distinct minima in the example above (just 3), and if you use set.seed(0) there is only a single one, so my guess is that the minimum is relatively stable under deformation? (Update: I tested 100 seeds, and 37 of them had only 1 minimum.)

@AllanCameron
Owner

No, curvature isn't stable under aspect ratio changes. A circle has fixed curvature all the way round, but if you change the aspect ratio you get an ellipse, which has higher curvature at the ends of its long axis than at the ends of its short axis.
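
A small numerical check of the circle/ellipse point, as a sketch with a hypothetical finite-difference helper (path_curvature is not a package function). The first and last two vertices are dropped because the one-sided differences there are less accurate.

```r
# Central differences, one-sided at the two ends
central_diff <- function(v) {
  n <- length(v)
  c(v[2] - v[1], (v[3:n] - v[1:(n - 2)]) / 2, v[n] - v[n - 1])
}

# Hypothetical helper: unsigned curvature at each vertex of a path
path_curvature <- function(x, y) {
  dx <- central_diff(x)
  dy <- central_diff(y)
  abs(dx * central_diff(dy) - dy * central_diff(dx)) / (dx^2 + dy^2)^1.5
}

t <- seq(0, 2 * pi, length.out = 401)
interior <- 3:(length(t) - 2)

# Unit circle: curvature is the same everywhere
k_circle <- path_curvature(cos(t), sin(t))[interior]
range(k_circle)

# Same path with x stretched by 3 (an aspect ratio change): curvature now varies
k_ellipse <- path_curvature(3 * cos(t), sin(t))[interior]
range(k_ellipse)
```

Since the curvature profile changes shape under stretching, the location of its minimum can move as well, which is consistent with the FALSE result in the random-walk test above.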

@AllanCameron
Owner

AllanCameron commented Dec 11, 2021

I've moved the auto hjust inside the makeContent mechanism (it's now inside the anchor points function). It seems to work pretty well:

library(geomtextpath)
#> Loading required package: ggplot2

df <- data.frame(x = rep(sin(seq(0, 2*pi, len = 100)), 2),
                 y = rep(cos(seq(0, 2*pi, len = 100)), 2),
                 z = rep(c("A", "B"), each = 100),
                 label = "I think this is the flattest part of the curve")

p <- ggplot(df, aes(x, y, group = z, label = label)) + 
       geom_textpath(vjust = 1.2, size = 6, hjust = "auto")

p + facet_grid(z~.)

p + facet_grid(.~z)

Created on 2021-12-11 by the reprex package (v2.0.0)

@byteit101

"xmin"/"xmax"/"xmid" for placement at the leftmost/rightmost or middle horizontal position on the curve.

I like these, as they are stable under aspect ratio changes. Could it be generalized for all xpos/ypos? I know right now I have some plots that I have to adjust hjust whenever I resize them, either directly, or indirectly via adding or removing legends, titles, etc. Such an option would be very useful for them.

This is not an exhaustive list, but these came to mind.

A probably tricky-to-implement idea: avoid the other textpaths from the other groups/colors. Something like that would be great for the plot that I used when asking the original question.

@AllanCameron
Owner

I have implemented the positions mentioned above (though "auto" is just "flattest"). I will leave this issue open until we have had a play and done some testing. The "check overlap" that @byteit101 mentions is probably a separate issue.

library(geomtextpath)
#> Loading required package: ggplot2

p <- ggplot(iris, aes(x = Sepal.Length, group = 1))

p + geom_textpath(aes(label = "Default"), stat = "density", size = 6)

p + geom_textpath(aes(label = "auto"), stat = "density", size = 6, 
                  hjust = "auto")

p + geom_textpath(aes(label = "xmin"), stat = "density", size = 6, 
                  hjust = "xmin")

p + geom_textpath(aes(label = "xmid"), stat = "density", size = 6, 
                  hjust = "xmid")

p + geom_textpath(aes(label = "xmax"), stat = "density", size = 6, 
                  hjust = "xmax")

p + geom_textpath(aes(label = "ymin"), stat = "density", size = 6, 
                  hjust = "ymin")

p + geom_textpath(aes(label = "ymid"), stat = "density", size = 6, 
                  hjust = "ymid")

p + geom_textpath(aes(label = "ymax"), stat = "density", size = 6, 
                  hjust = "ymax")

Created on 2021-12-12 by the reprex package (v2.0.0)

The "ymax" setting is actually pretty useful:

 ggplot(iris, aes(x = Sepal.Length, colour = Species)) +
   geom_textpath(aes(label = Species), stat = "density",
                 size = 6, fontface = 2, hjust = "ymax", vjust = -0.2)


@teunbrand
Collaborator Author

This looks great! Out of curiosity, in the ymid case, is the left/right choice arbitrary or determined by something?
I thought you might like this thread: https://twitter.com/timelyportfolio/status/1469683836107866120.

@AllanCameron
Owner

Ah...I had noticed that the repo's stars had more than doubled in 24h but couldn't figure out why. Now I know!

The ymid literally finds the point on the path nearest the mean y value.
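
A minimal sketch of that rule, assuming "nearest" means the smallest absolute difference from the mean; the name ymid_index is hypothetical and this is not the package code. In this sketch, which.min() returns the first of any tied candidates, so on a symmetric curve the earlier (left) point wins. Whether the package resolves ties the same way is an assumption.

```r
# Hypothetical helper: index of the path vertex whose y value is closest
# to the mean of all y values
ymid_index <- function(y) {
  which.min(abs(y - mean(y)))
}

# Symmetric peak: vertices 2 and 4 are equally close to the mean (1.2)
y <- c(0, 1, 4, 1, 0)
ymid_index(y)
#> [1] 2
```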

I can't figure out why the text isn't centered over the peaks with the "ymax" setting. I'll have a look at this and refactor the code (it's unnecessarily repetitive), plus write some tests before closing this issue.

@AllanCameron
Owner

The text wasn't centered over the peak because the default halign was "left", so any vjust below 0.5 shifted the text to line up with the first letter of a string that would be nicely centered on the peak at a vjust of 0.5. I have switched the default to "center", since I am guessing that positioning single-line labels is a more common task than using multi-line labels, and in any case the user can change the halign if printing multi-line text. It seems unreasonable to expect the casual user to know that they should change the halign to correctly position single-line text.

@AllanCameron
Owner

I have added tests for this and we're back at 100% code coverage. The results look as expected on all 3 geoms, so I'll close this issue for now.
