fix tests for ties.method='last' to work on R 3.1 #4243

MichaelChirico · 2020-02-14T01:51:35Z

Follow up to #4242

ties.method='last' was only introduced in R 3.3 (i.e. after 3.1), so the tests comparing base::rank to frank fail on R 3.1.

This approach shuts off ties.method='last' testing for R 3.1 (easiest solution).

cc @jangorecki

PS anyone remember why we apparently shut off ties.method='random' testing? It was failing for me here, not sure why.

codecov · 2020-02-14T02:02:49Z

Codecov Report

Merging #4243 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #4243      +/-   ##
==========================================
+ Coverage   99.61%   99.61%   +<.01%     
==========================================
  Files          72       72              
  Lines       13873    13874       +1     
==========================================
+ Hits        13819    13820       +1     
  Misses         54       54

Impacted Files	Coverage Δ
R/frank.R	`100% <100%> (ø)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 230cfcf...518f39a.

mattdowle · 2020-02-15T00:37:46Z

Very nice solution using setdiff ... formals. This way if any more are added to R-devel in future we'll know straight away in dev. Nice.
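For readers following along, here is a minimal sketch of the setdiff/formals idea, assuming the test loop iterates over whichever ties.method values the running R's base::rank supports. The variable names (`base_ties`, `ties_to_test`) are illustrative and are not taken from the actual test file.

```r
# ties.method choices supported by base::rank in the running R session;
# the default argument is stored as an unevaluated call c("average", ...),
# so eval() turns it into a character vector
base_ties = eval(formals(base::rank)$ties.method)

# hypothetical: drop any method we do not want in the comparison loop
ties_to_test = setdiff(base_ties, "random")

x = c(3, 1, 3, NA, 2)
for (tm in ties_to_test) {
  # every method listed by this R version should run without error
  r = base::rank(x, ties.method = tm, na.last = "keep")
  stopifnot(length(r) == length(x))
}
```

Because the list comes from the installed R rather than being hard-coded, 'last' simply does not appear on R versions that lack it, and any method added to base::rank later shows up in the loop automatically.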

I looked at the history and couldn't find the ties.method='random' test being turned off. Rather, it seems it just wasn't tested from the start: 1ff5e1e#diff-d322b46c285376a06f948ecef50b6763
frank was later moved from setkey.R to frank.R on March 9, 2016, and random wasn't tested then either.
Then you caught it and added coverage in test 1962.027.
So the question remains why random can't be added in this loop.
Including random works for me locally. Turning it on in this branch to see if it passes CI.
When you ran locally did you run through test.data.table() which sets the RNG to R's old default for test purposes?

mattdowle · 2020-02-15T01:15:04Z

Ok I get a test fail too. Just in the second test (1369) with na.last=NA. So I left "random" on in 1368 and added a comment to 1369 so this branch should pass for now.
Will have another look later to see why there's a difference with na.last=NA ...

MichaelChirico · 2020-02-15T01:31:09Z

I did look at it briefly, and yes, I think NA is the issue:

# base::rank
random = sort.list(order(x, stats::runif(sum(!nas))))
# frankv
if (ties.method == "random") {
  v = stats::runif(nrow(x))
  if (is.na(na.last)) {
    idx = which_(nas, FALSE)
    set(x, idx, '..stats_runif..', v[idx])
  } else set(x, NULL, '..stats_runif..', v)
  order = if (length(order) == 1L) c(rep(order, length(cols)), 1L) else c(order, 1L)
  cols = c(cols, ncol(x))
}

runif is potentially called a different number of times in the two functions, which would throw off the seeds?

mattdowle · 2020-02-15T01:50:11Z

But the test sets the seed before the calls. Isn't it runif(sum(!nas)) vs runif(nrow(x))? That is, frank draws for the NA positions too and removes them afterwards, which could result in a different ordering than the base way. Both are right I guess, just different.

MichaelChirico · 2020-02-15T02:06:59Z

say the first element of the input is NA

then frank will use the first random draw on an NA, while base::rank will not take any draws, so the random ordering will end up different IINM.

I don't think we need to force exact consistency between rank and frank here -- random ordering should be random after all
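To illustrate the draw-offset point above, here is a small sketch that uses only base R. It mimics the two drawing strategies from the snippets quoted earlier, without calling frank itself; the names `draws_base` and `draws_frank` are made up for this example.

```r
x = c(NA, 3, 3, 1)
nas = is.na(x)

# base::rank style: draw only for the non-NA elements
set.seed(42)
draws_base = stats::runif(sum(!nas))

# frankv style as quoted above: draw for every row, keep the non-NA positions
set.seed(42)
draws_frank = stats::runif(length(x))[!nas]

# same seed, but the leading NA consumed the first draw,
# so the kept draws are shifted by one position
identical(draws_base, draws_frank)
identical(draws_base[2:3], draws_frank[1:2])
```

The first comparison is FALSE and the second is TRUE: both approaches read the same random stream, but they assign different draws to the same non-NA elements, so ties can be broken differently even with an identical seed.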

mattdowle · 2020-02-15T02:20:56Z

Yes both are random. But we may as well match base unless there's a reason not to. In this case it was easier to match base and turn the test on with 'random' included (thanks to your spot) so it could all be put to bed. Otherwise we'd have had to find another way to test random with na.last=NA.

MichaelChirico · 2020-02-15T02:34:09Z

nice! I see the difference was only a few lines so yes that's preferred

fix tests to work on R 3.1

cf1cdd7

mattdowle added this to the 1.12.9 milestone Feb 15, 2020

turned 'random' on in the loop and added back the two set.seed

85a6b49

comment added to 1369 about 'random' with TODO

5d2847e

ties='random' with na.last=NA now consistent with base and test 1369 …

518f39a

…put to bed

mattdowle merged commit 29ad52f into master Feb 15, 2020

mattdowle deleted the frank-last-3.1 branch February 15, 2020 02:32

jangorecki modified the milestones: 1.12.11, 1.12.9 May 26, 2020
