New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rescale() works with Date, POSIXct and POSIXlt objects. #75

Merged
merged 2 commits into from Jun 26, 2017

Conversation

Projects
None yet
4 participants
@zeehio
Copy link
Contributor

zeehio commented Mar 1, 2016

This pull request deals with #74 as discussed.

Changes:

  • documents the change in NEWS.md
  • adds tests for rescaling dates (simple tests but enough to show and test the feature).
  • edits R/bounds.r, checking for Date, POSIXct and POSIXlt inputs; converting them to numeric.

Thanks for your time. Feel free to tell me if there is anything that does not convince you.

@zeehio zeehio force-pushed the zeehio:scales_dates branch 2 times, most recently from 6864ea5 to 1fc1920 Mar 1, 2016

@zeehio zeehio force-pushed the zeehio:scales_dates branch from 1fc1920 to 7aea0a1 Nov 1, 2016

@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Nov 1, 2016

Current coverage is 61.71% (diff: 100%)

Merging #75 into master will increase coverage by 0.57%

@@             master        #75   diff @@
==========================================
  Files            26         26          
  Lines           803        815    +12   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
+ Hits            491        503    +12   
  Misses          312        312          
  Partials          0          0          

Powered by Codecov. Last update d58d83a...8044799

R/bounds.r Outdated
#' If needed, converts input from classes that lack division and multiplication
#' to classes that can be scaled.
#' @param x vector of values to convert, if needed
adapt_input_class <- function(x) {

This comment has been minimized.

@hadley

hadley Nov 1, 2016

Member

This shouldn't be documented

This comment has been minimized.

@zeehio

zeehio Nov 1, 2016

Author Contributor

I left the comments as conventional R comments, so it does not appear under man/. I can remove the comments completely if you prefer.

R/bounds.r Outdated
#' to classes that can be scaled.
#' @param x vector of values to convert, if needed
adapt_input_class <- function(x) {
if (any(class(x) %in% c("Date", "POSIXct", "POSIXlt", "POSIXt"))) {

This comment has been minimized.

@hadley

hadley Nov 1, 2016

Member

Please use inherits() here.

A better formulation would be:

if (...) {
  ...
} else {
 ...
}

as then you no longer need explicit return() statements..

This comment has been minimized.

@zeehio

zeehio Nov 1, 2016

Author Contributor

Done, thank you for your quick review.

@zeehio zeehio force-pushed the zeehio:scales_dates branch from 7aea0a1 to 8044799 Nov 1, 2016

@karawoo

This comment has been minimized.

Copy link
Collaborator

karawoo commented Jun 23, 2017

This would be useful for some of the scales stuff I'm working on for ggplot2 – it would allow scaling alpha with datetimes, for one. But I think we need a way to go back to datetimes after the conversion, otherwise you get legends that look like this:

dat <- data.frame(
  x = sample(1:10, 10),
  y = (1:10)^2,
  dt = seq(
    as.POSIXct("2017-06-01 15:28:44 PDT"),
    as.POSIXct("2017-06-12 15:28:44 PDT"),
    length.out = 10
  )
)

ggplot(dat, aes(x, y)) +
  geom_point(aes(alpha = dt))

scale_alpha_datetimes

@zeehio zeehio force-pushed the zeehio:scales_dates branch from 8044799 to 5e8ef20 Jun 23, 2017

@zeehio

This comment has been minimized.

Copy link
Contributor Author

zeehio commented Jun 23, 2017

@karawoo Thanks for reviewing this. I believe that what you are asking for fits better in the ggplot2 package and not in the scales package.

Here is your example, with a draft of what the ggplot2::scale_alpha_datetime could be:

library(ggplot2)
dat <- data.frame(
  x = sample(1:10, 10),
  y = (1:10)^2,
  dt = seq(
    as.POSIXct("2017-06-01 15:28:44 PDT"),
    as.POSIXct("2017-06-12 15:28:44 PDT"),
    length.out = 10
  )
)


scale_alpha_datetime <- function(name = waiver(), origin = "1970-01-01 00:00.00 UTC", ...) {
  origin <- force(origin)
  label_fun <- function(breaks) {
    as.POSIXct(breaks, origin = origin)
  }
  scale_alpha_continuous(name = name, labels = label_fun, ...)
}


ggplot(dat, aes(x, y)) +
  geom_point(aes(alpha = dt)) +
  scale_alpha_datetime()

imatge

The actual scale_alpha_datetime function should be closer to the other scale_*_datetime functions, to provide better label formatting and a more reasonable choice of date breaks.

Link to the relevant ggplot2 issue: tidyverse/ggplot2#1526

R/bounds.r Outdated
# If needed, converts input from classes that lack division and multiplication
# to classes that can be scaled.
# @param x vector of values to convert, if needed
adapt_input_class <- function(x) {

This comment has been minimized.

@hadley

hadley Jun 23, 2017

Member

I'm not sure I follow the reasoning here. Why not just unconditionally coerce to numeric?

This comment has been minimized.

@zeehio

zeehio Jun 23, 2017

Author Contributor

scales::rescale mainly depends on elementary arithmetic operations (addition, subtraction, multiplication and division). Coercing to numeric is a good way to define multiplication and division operations for POSIXct objects, but it is not the way to go for all R objects. For instance integer64 numbers, from the bit64 package, define their own arithmetic operators to prevent loss of precision.

If the coercion to numeric is done, there will be a warning when using some integer64 objects:

x <- bit64::as.integer64(c(2^50, 2^54+1, 2^60+2))
from <- range(x, na.rm = TRUE)
# I need to compute the `from` range here due to an unrelated bit64 bug triggered
# when range(..., finite = TRUE) is used. (this bug has already been
# reported to the bit64 maintainer)

# No warning given now:
scales::rescale(x = x, from = from)
# [1] 0.00000000 0.01466276 1.00000000

# If we force coercion to numeric there is a warning:
scales::rescale(x = as.numeric(x), from = as.numeric(from))
# Warning messages:
# 1: In as.double.integer64(from) :
#   integer precision lost while converting to double
# 2: In as.double.integer64(x) :
#   integer precision lost while converting to double
# [1] 0.00000000 0.01466276 1.00000000

I wanted to provide an example were the as.numeric approach is wrong, but scales::rescale calls scales::zero_range that has a default tolerance too large for integer64 data. I could work towards letting the user give a custom tolerance value, but for the sake of the explanation you can set tol to zero manually in the zero_range definition in r/bounds.r and rebuild, in case you want to reproduce this example:

x <- bit64::as.integer64(2^60) + 1:3 
# x has three consecutive numbers, we expect the result to be c(0, 0.5, 1)
x_range <- range(x, na.rm = TRUE)
scales::rescale(x = x, from = x_range) # Correct result
# [1] 0.0 0.5 1.0
scales::rescale(x = as.numeric(x), from = as.numeric(x_range)) # loss of precision
# [1] 0.5 0.5 0.5

Even if we have tolerance issues in this case, I prefer to work towards making scales work with other objects, and therefore not coercing everything to numeric.

But you have the last word, though 😃

This comment has been minimized.

@hadley

hadley Jun 23, 2017

Member

In that case I'd prefer to use an S3 generic.

This comment has been minimized.

@zeehio

zeehio Jun 26, 2017

Author Contributor

Using an S3 generic is the best approach, my apologies for not thinking about them when I wrote this patch. I was now able to provide a method (and test case) for integer64 data as well.

rescale and rescale_mid as S3 generics
Provide support for Date, POSIXct, POSIXlt and integer64 types, as well as numeric.

@zeehio zeehio force-pushed the zeehio:scales_dates branch from 64507f2 to 497a1f0 Jun 26, 2017

@hadley hadley merged commit 0101575 into r-lib:master Jun 26, 2017

3 checks passed

codecov/patch 80% of diff hit (target 60.91%)
Details
codecov/project 61.25% (+0.34%) compared to ed68c7b
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@hadley

This comment has been minimized.

Copy link
Member

hadley commented Jun 26, 2017

Thanks!

zeehio added a commit to zeehio/scales that referenced this pull request Oct 17, 2017

Provides a default method for rescale and rescale_mid
The idea behind r-lib#74 and r-lib#75 was to use an S3 method so we never coerced to numeric
unless we had to. The aim was to prevent future cases similar to the integer64 one
where the coercion to numeric implies a loss of precision.

Reality is that there are many more numeric types out there that do not inherit from the numeric
class (like the dist object mentioned in r-lib#102), compared to the number of cases similar to integer64, where the numeric coercion is wrong.

So there is a trade-off:

- Not providing the `default` methods has the advantage that we only rescale the classes that we know can be rescaled, and the disadvantage that we get issues like r-lib#102 when we find a new object. This is what was implemented in r-lib#75

- Providing `rescale.default` and `rescale_mid.default` has the advantage of catching all the numeric-like types that do not inherit the numeric class (such as the dist object) and working as expected. However, we risk to deliver wrong results without an error if a case similar to integer64 appeared. This is what this commit implements.

@zeehio zeehio deleted the zeehio:scales_dates branch Feb 18, 2018

zeehio added a commit to zeehio/scales that referenced this pull request Jun 9, 2018

Provides a default method for rescale and rescale_mid
The idea behind r-lib#74 and r-lib#75 was to use an S3 method so we never coerced to numeric
unless we had to. The aim was to prevent future cases similar to the integer64 one
where the coercion to numeric implies a loss of precision.

Reality is that there are many more numeric types out there that do not inherit from the numeric
class (like the dist object mentioned in r-lib#102), compared to the number of cases similar to integer64, where the numeric coercion is wrong.

So there is a trade-off:

- Not providing the `default` methods has the advantage that we only rescale the classes that we know can be rescaled, and the disadvantage that we get issues like r-lib#102 when we find a new object. This is what was implemented in r-lib#75

- Providing `rescale.default` and `rescale_mid.default` has the advantage of catching all the numeric-like types that do not inherit the numeric class (such as the dist object) and working as expected. However, we risk to deliver wrong results without an error if a case similar to integer64 appeared. This is what this commit implements.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment