Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hue_pal should preserve referencial transparency #81

Merged
merged 13 commits into from
Jun 15, 2018

Conversation

zeehio
Copy link
Contributor

@zeehio zeehio commented Nov 1, 2016

When hue_pal is used inside a for loop, hue_pal arguments may not be evaluated until the end of the loop and referencial transparency is lost.

See the failing test case on the first commit:

test_that("hue_pal preserves referencial transparency", {
  col1 <- hue_pal(h.start = 0)
  col2 <- hue_pal(h.start = 90)
  
  colours <- list()
  hues <- c(0, 90)
  for (i in 1:2) {
    colours[[i]] <- hue_pal(h.start = hues[i])
  }
  expect_equal(col1(1), colours[[1]](1))
  expect_equal(col2(1), colours[[2]](1))
})

The proposed patch on the second commit simply uses force on the input arguments of hue_pal, fixing the issue.

Other palettes may suffer from the same issues. If there is interest in this PR I can fix them as well.

@zeehio zeehio force-pushed the patch-1 branch 2 times, most recently from 0c63fc2 to 15361f3 Compare November 1, 2016 23:11
@codecov-io
Copy link

Current coverage is 63.18% (diff: 71.69%)

Merging #81 into master will increase coverage by 2.04%

@@             master        #81   diff @@
==========================================
  Files            26         26          
  Lines           803        853    +50   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            491        539    +48   
- Misses          312        314     +2   
  Partials          0          0          

Powered by Codecov. Last update d58d83a...15361f3

@zeehio
Copy link
Contributor Author

zeehio commented Nov 1, 2016

I have reviewed the code looking for places where referencial transparency was not honoured, typically this kind of code:

function(arg1) {
  function(x) {
     x + arg1
  }
}

Had to be changed to:

function(arg1) {
  force(arg1)
  function(x) {
     x + arg1
  }
}

By using force, when the inner function is called, it will find the arg1 value that was set when the outer function was called.

Let the force be with you.

@zeehio
Copy link
Contributor Author

zeehio commented Nov 9, 2016

@hadley, can this PR be reviewed/merged before the next release? Is it too late already to ask this?

@hadley
Copy link
Member

hadley commented Nov 9, 2016

It's too late, sorry.

@zeehio
Copy link
Contributor Author

zeehio commented Jun 26, 2017

I just rebased this PR, in case you want to merge it before the next scales release.

R/pal-hue.r Outdated
@@ -23,7 +23,11 @@
#' show_col(hue_pal(h = c(180, 270))(9))
#' show_col(hue_pal(h = c(270, 360))(9))
hue_pal <- function(h = c(0, 360) + 15, c = 100, l = 65, h.start = 0, direction = 1) {

h <- force(h)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to assign

R/trans.r Outdated
@@ -24,7 +24,10 @@
trans_new <- function(name, transform, inverse, breaks = extended_breaks(), format = format_format(), domain = c(-Inf, Inf)) {
if (is.character(transform)) transform <- match.fun(transform)
if (is.character(inverse)) inverse <- match.fun(inverse)

force(name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list() will force

R/formatter.r Outdated
@@ -265,6 +279,8 @@ trans_format <- function(trans, format = scientific_format()) {
#' \code{\link{format.POSIXct}}
#' @export
format_format <- function(...) {
force(list(...))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe make force_all <- function(...) list(...) just to make this and others slightly cleaner?

@zeehio
Copy link
Contributor Author

zeehio commented Jun 26, 2017

Thanks, with force_all the code is more clean. I also fixed the recently added viridis_pal.

Note that the github web interface is having some issues, hopefully my last commit will appear here soon.

R/a-helper.r Outdated
@@ -0,0 +1,2 @@
# Evaluates all arguments (see #81)
force_all <- function(...) list(...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you put in utils.R please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I had to use @include utils.r where needed to make sure the Collate will be well done.

@hadley
Copy link
Member

hadley commented Jun 26, 2017

And add a bullet to NEWS? I wouldn't mention referential transparency, but I'm not sure how to describe otherwise.

@zeehio
Copy link
Contributor Author

zeehio commented Jun 26, 2017

This is the current NEWS bullet:

  • scales now works as expected when it is used inside a for loop. In previous
    package versions if a scales function was used with variable custom parameters
    inside a for loop, some of the parameters were not evaluated until the end
    of the loop, due to how R lazy evaluation works. (@zeehio, hue_pal should preserve referencial transparency #81).

It's not an awesome description, but I hope it is clear enough...

@hadley
Copy link
Member

hadley commented Jun 27, 2017

I don't understand why you needed @include utils.r. Which of those functions is getting executed during package build?

@zeehio
Copy link
Contributor Author

zeehio commented Jun 27, 2017

I removed all the @include directives except the one in formatter.r, that is required.

It is needed because the dollar_format() function is called at build time in the dollar <- dollar_format() assignment.

R/formatter.r Outdated
@@ -4,6 +4,7 @@
#' @param x a numeric vector to format
#' @return a function with single parameter x, a numeric vector, that
#' returns a character vector
#' @include utils.r
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is necessary because a bit below (R/formatter.r, line 104) the function dollar is defined as follows:

#' @export
#' @rdname dollar_format
dollar <- dollar_format()

So dollar_format() is called at build time and dollar_format() calls force_all(), defined in utils.r. If you would rather that I put the @include utils.r in the dollar definition below, feel free to tell me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d rather get rid of the dependency altogether somehow

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, I have got rid of it in the last commit :-) : 98c42d0

R/formatter.r Outdated
@@ -158,6 +162,9 @@ scientific <- function(x, digits = 3, ...) {
#' ordinal(1:10)

ordinal_format <- function(x) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this even have an argument?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, good point. It shouldn't. I am going to remove that argument.

scales now works as expected when it is used inside a for loop.

Before, when a scales function was used with a variable custom parameters
inside a for loop, some of the parameters were not evaluated until the end
of the loop, due to how R lazy evaluation works.
Collate in DESCRIPTION is not required
@zeehio
Copy link
Contributor Author

zeehio commented Jun 6, 2018

@hadley, I see you merged master into the PR branch, I expected the merge to go the other way around :-P

Anyway, If you need anything from my end feel free to tell me, I don't mind rebasing if you like. The build failure seems unrelated to this PR and only happens in older R versions that I do not have installed right now though...

@dpseidel
Copy link
Collaborator

dpseidel commented Jun 8, 2018

Build seems to be failing because of timezone tests in the earlier versions. Since timezones are platform specific this could be a travis/linux specific issue. Not sure how to debug. (Apologies for the late edit, accidently hit comment before I was finished!)

Using Rstudio cloud to debug this, R Version 3.1.3:

Error: Test failed: 'time scales learn timezones'
* tz(x) not equal to "UTC".
1/1 mismatches
x[1]: ""
y[1]: "UTC"
* tz(x) not equal to "GMT".
1/1 mismatches
x[1]: ""
y[1]: "GMT"
* tz2(x) not equal to "GMT".
1/1 mismatches
x[1]: "UTC"
y[1]: "GMT"

@hadley
Copy link
Member

hadley commented Jun 8, 2018

Easiest I think to just skip the test with e.g. skip_if_not(getRversion() > "3.1")

@dpseidel
Copy link
Collaborator

dpseidel commented Jun 8, 2018

Hi @zeehio,

You'll see all builds are passing now but there are still a couple things we would like to see happen before merging:

  • It's curious that dollar_format is the only function that can't use force_all. Hadley suspects that this is a downstream issue in how it's implemented that should be fixed. Would you be willing to investigate this further?

  • Since the changes implemented are more about forcing evaluation in function factories than referential transparency, some of your wording needs some tweaking to better reflect the impact of the changes (e.g. in your test names)

  • The new pattern for test files is to match the test file name to the R file name (as usethis::use_test() does). Can you correct your test file name to reflect this? This may also mean that your tests need to be spread over multiple files.

Can you make these changes and update the PR? If you need anything from my end, I'm around to help! Thanks!

@zeehio
Copy link
Contributor Author

zeehio commented Jun 9, 2018

Thanks for the review and the indications. I believe I addressed all the points you mentioned, although for the first point I'd like to know what solution you prefer.

  • About the force_all and dollar_format thing, this is what I understand. I would like to know if I am right and what is the solution you prefer, if you don't like the current one.

The difference between dollar_format and the other functions is that dollar_format is the only function using force_all that is called when the package is being built (this happens in order to define the dollar function here). When the package is built its functions are "sourced" in alphabetic order according to the "C" locale. dollar_format is defined before dollar on the same file so it is always found, however when dollar_format calls force_all at build time, force_all is not yet defined because it is in another .r file (utils.r). There are multiple solutions to ensure that it works:

  • Option 1: Use #' @include utils.r. This will make roxygen define a Collate field in the DESCRIPTION file. The Collate field will enforce that R/utils.r is always sourced before R/formatter.r during the package build. For some reason, Hadley did not like that solution and wants to avoid it.

  • Option 2: Do not use force_all inside dollar_format. force_all is a very simple function. It just evaluates its arguments so it is easy to replace. This is the current solution on the PR.

  • Option 3: Move force_all from utils.r to formatter.r and place it before dollar_format. I don't like that much this alternative solution, because force_all is used across several files and not specifically related to "format" things.

  • Option 4: Rename "utils.r" to "aaa-utils.r" so it is sourced first.

  • I renamed the test names as suggested (e.g. from "hue_pal preserves referencial transparency" to "hue_pal arguments are forcely evaluated on each call #81").

  • About the test file names, I have updated them following the new convention (now we have tests/test-pal-hue.r and tests/test-breaks.r). I saw that the tests from R/breaks.r were split in different files tests/breaks-extended.R, tests/breaks-log.r, tests/breaks-minor.r. I did not merge those test files into a single one because I guess it goes beyond this PR.

Thanks for your detailed comments, your review and your time! 😄

@hadley
Copy link
Member

hadley commented Jun 9, 2018

Why is dollar different to every other function of that form?

@zeehio
Copy link
Contributor Author

zeehio commented Jun 9, 2018

In R/formatter.r I see these two similar function definitions:

  • dollar <- dollar_format()
  • percent <- percent_format()

So dollar is not the only case. However, the definition of percent_format does not have any argument, so it does not use force_all. dollar is the only function defined by calling another function that uses force_all. Why is it that way? You wrote it at 2010-12-23: e0a41e0.

Whether or not it could be deprecated or removed, I have no idea, but I (humbly) feel it goes beyond this PR.

Considering that:

  • You suggested me to define and use the function force_all.
  • Option 4 (using aaa-helper.r instead of utils.r) was my first choice, and you did not like it.
  • Option 1 (use Collate) was my second choice, and you did not like it

I see there are only left two options out of the four I proposed. I still don't understand your reasons for disliking the Collate solution, but I accept your criteria because a) it is your package and b) I trust your judgement (you are Hadley Wickham!). Also, I usually don't ask you for details on your decisions because I don't want to waste (more of) your time with questions that I usually understand with my own reading/work/time.

However this time either I get some rationale on what solution you would like or why you aren't convinced by any of the four proposals (if that's the case), or I assume my incompetence to address this issue and surrender this pull request. After all, if in 1.5 years I've been unable to get it merged I am starting to doubt I ever will.

In any case, I have learnt a lot during the process and I am very grateful to all of you for that.

@hadley
Copy link
Member

hadley commented Jun 9, 2018

Sorry for being so vague. I've now taken a thorough look at the code and can give you a better path forward. The basic problem, I think, is that dollar_format() is the wrong way around — the majority of the work is done inside of the anonymous function rather than in dollar(). Contrast the organisation of dollar_format() to comma_format() which is simple enough to fit in a few lines:

comma_format <- function(...) {
  function(x) comma(x, ...)
}

comma <- function(x, ...) {
  format(x, ..., big.mark = ",", scientific = FALSE, trim = TRUE)
}

I think this is a better organisation since the bulk of the code is stored inside a function that you can test directly, rather than via a layer of indirection.

I appreciate your patience with this pull request — I know it's taken a very long time and you have been very understanding — if you don't want to work on it anymore, @dpseidel can take it over the finish line.

@zeehio
Copy link
Contributor Author

zeehio commented Jun 10, 2018

Good news, everyone!

Following the path you gave me I rewrote dollar_format in terms of dollar and percent_format in terms of percent. So the issue with force_all being needed at build time is not an issue anymore.

I fully agree with you that this approach is better. And thanks to your specific explanation I've learnt a bit about code organization as well. So big thanks again! 😃

If you review the commits, you will also see that I dropped ... from dollar. The ... were arguments supposed to be passed on to the format function inside, but they were never passed on and no one ever complained about it (as far as I could see). See the git blame page to get an idea of the history if you want. I can bring the ... back if you really want them (even though they have never been used).

@dpseidel dpseidel merged commit 4056ec3 into r-lib:master Jun 15, 2018
zeehio added a commit to zeehio/ggplot2 that referenced this pull request Feb 13, 2020
Scales 1.0.0 fixed a bug r-lib/scales#81. The bug can lead to wrong plots when they are defined inside for loops.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants