Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WISH: _R_CHECK_LENGTH_1_LOGIC2_=package:_R_CHECK_PACKAGE_NAME_,abort,verbose #71

Closed
HenrikBengtsson opened this issue Dec 3, 2019 · 16 comments

Comments

@HenrikBengtsson
Copy link

Please consider adding:

_R_CHECK_LENGTH_1_LOGIC2_=package:_R_CHECK_PACKAGE_NAME_,abort,verbose

to R_env_vars.{bat,sh} so that bugs like:

> c(TRUE, TRUE) && TRUE
Error in c(TRUE, TRUE) && TRUE : 
  'length(x) = 2 > 1' in coercion to 'logical(1)'

are caught.

@hpages
Copy link
Contributor

hpages commented Dec 3, 2019

Makes sense to me, especially since we already set _R_CHECK_LENGTH_1_CONDITION_ to this value (I'm actually surprised that we need 2 separate environment variables to control these things, which are IMO programming mistakes of the same nature).

BTW, it seems that this does not work interactively:

Sys.setenv(`_R_CHECK_LENGTH_1_LOGIC2_`="package:_R_CHECK_PACKAGE_NAME_,abort,verbose")
c(TRUE, TRUE) && TRUE
# [1] TRUE

Sys.setenv(`_R_CHECK_LENGTH_1_CONDITION_`="package:_R_CHECK_PACKAGE_NAME_,abort,verbose")
if (c(TRUE, FALSE)) "yes" else "no"
# [1] "yes"
# Warning message:
# In if (c(TRUE, FALSE)) "yes" else "no" :
#   the condition has length > 1 and only the first element will be used

Is it only supposed to work in the context of R CMD check or am I missing something?

@HenrikBengtsson
Copy link
Author

HenrikBengtsson commented Dec 3, 2019

Is it only supposed to work in the context of R CMD check or am I missing something?

Yes, the package:_R_CHECK_PACKAGE_NAME_ part makes sure the error is only produced on the package being tested. This way your package will pass R CMD check with all OK in cases where one of your dependent packages has this bug. This strategy will put the burden on the maintainer where the bug occurs. This is the strategy that has been used on CRAN for quite a while, i.e. incoming package updates are tested for this but not existing ones, which would risk blowing up all of CRAN (there are still lots of these bugs out there).

So, use ...=true or ...=verbose to check for these bugs regardless of namespace etc., e.g.

Sys.setenv(`_R_CHECK_LENGTH_1_LOGIC2_`="true")
c(TRUE, TRUE) && TRUE
Error in c(TRUE, TRUE) && TRUE : 
  'length(x) = 2 > 1' in coercion to 'logical(1)'

Sys.setenv(`_R_CHECK_LENGTH_1_CONDITION_`="true")
if (c(TRUE, FALSE)) "yes" else "no"
Error in if (c(TRUE, FALSE)) "yes" else "no" : 
  the condition has length > 1

@hpages
Copy link
Contributor

hpages commented Dec 3, 2019

Ok I see. Thanks for the clarification. Done in c28f34d

@hpages hpages closed this as completed Dec 3, 2019
@HenrikBengtsson
Copy link
Author

Awesome. Enabling this on Bioc, in addition to CRAN, will speed up the process of being able to make this bug an error by default in R.

@hpages
Copy link
Contributor

hpages commented Dec 3, 2019

Hopefully. But when you realize that base R itself still contains numerous misuses of & (& used instead of && in if statements, so not exactly the same as above but still showing confusion about proper use of &&), you know it's going to be a long road...

@hpages
Copy link
Contributor

hpages commented Dec 19, 2019

@HenrikBengtsson @lshep This new setting is currently causing R CMD check to fail on the following 72 Bioconductor packages on all platforms in devel:

AnalysisPageServer
annotate
AnnotationDbi
ASpli
BgeeDB
birta
CAGEr
CAMERA
ClassifyR
CNORfeeder
consensusSeekeR
DOSE
fabia
FEM
flowType
geecc
genbankr
genefu
GeneOverlap
GeneSelectMMD
graph
GSCA
Gviz
HTSeqGenie
ImpulseDE
iPAC
IPPD
JunctionSeq
LEA
limma
lumi
matter
MEIGOR
MergeMaid
messina
metagenomeFeatures
metaseqR
methVisual
methylPipe
minfi
MotIV
MPFE
netbiov
OGSA
ontoProc
openPrimeR
PathwaySplice
PharmacoGx
ppiStats
prada
rBiopaxParser
rDGIdb
reb
recoup
Repitools
RITAN
rTANDEM
scsR
SELEX
SGSeq
ShortRead
SIAMCAT
SIM
spotSegmentation
SSPA
Streamer
SummarizedExperiment
survcomp
TSRchitect
twoddpcr
variancePartition
VariantFiltering

So we might disable this for now (there are other more urgent problems in BioC devel that we want to focus on at the moment).

@hpages hpages reopened this Dec 19, 2019
@lshep
Copy link
Contributor

lshep commented Dec 19, 2019

@hpages up to you. We knew this would break a lot of packages and to me the sooner it can get cleaned up the better. They will all have to deal with the errors eventually whether we implemented it now, 3 months from now, 6 months from now... especially if the overall plan is to make this a default in R.

@HenrikBengtsson
Copy link
Author

Thanks. For the record: 72 out of ~1823 gives that 3.9% of the packages (still) have this bug. (That is in line with what I've found when running 100's of revdep checks in the past).

I think there is a risk that one of these _R_CHECK_LENGTH_1_*_ bugs may produce scientifically incorrect results. Who knows, there might even be another Duke Study in the pipelines out there right now because of these mistakes. I argue for flipping the switch and having these bugs fixed ASAP.

PS. I've submitted quite a few PR bug fixes on this and the fixes have often been quick and straightforward when the bugs where harmless - only a few required deep brain cycles from the maintainers.

@HenrikBengtsson
Copy link
Author

HenrikBengtsson commented Dec 19, 2019

Also, the sooner we can get all Bioconductor and CRAN packages to pass their own tests with _R_CHECK_LENGTH_1_LOGIC2_=package:_R_CHECK_PACKAGE_NAME_,abort,verbose, the sooner we can get to making _R_CHECK_LENGTH_1_LOGIC2_={warn,true} the new default in R. Note that the first setting only captures bugs in code that is covered by package tests. There will be plenty more that will be revealed at run-time by users when we run with _R_CHECK_LENGTH_1_LOGIC2_={warn,true}. So there's still a long journey ahead and we're already two years in but speeding up the first step will really help.

@hpages
Copy link
Contributor

hpages commented Dec 19, 2019

Disabled for now (commit 88bfa1f). Will enable this again at some point after the holidays. Keeping the issue open in the mean time.

H.

@lshep
Copy link
Contributor

lshep commented Jan 24, 2020

I have re-enabled this on our devel builders yesterday 473d26b and it should be reflected in today or tomorrow's report. I will try and send an email to packages that are affected by this setting Monday and probably post an announcement on the bioc-devel mailing list today (hopefully some will fix before monday and less to contact)

@lshep lshep closed this as completed Jan 24, 2020
@HenrikBengtsson
Copy link
Author

@HenrikBengtsson @lshep This new setting is currently causing R CMD check to fail on the following 72 Bioconductor packages on all platforms in devel: ...

It's been ~6 weeks since you enabled the _R_CHECK_LENGTH_1_LOGIC2_ on devel. I'm curious to hear how many of the packages has been fixed since the last tally. Is this something you can run easily?

@lshep
Copy link
Contributor

lshep commented Mar 3, 2020

I sent out private emails to all failing Bioconductor packages this week so it might take another week or two to have a significant change since many didn't acknowledge my first group email but many responded to this one ... I'm hoping the numbers improved and I'm hoping even more so in the next two weeks or so ...

@lshep
Copy link
Contributor

lshep commented Apr 20, 2020

@HenrikBengtsson We just realized that we miss catching some ERROR's when using package:_R_CHECK_PACKAGE_NAME_,abort,verbose currently because it only checks for the ERROR's in R CMD check , we currently run R CMD check --no-vignette since we re-build all the packages ourselves with R CMD build -- we wanted to save time and not put so much strain on the builders to rebuild all the vignettes -- so we would miss any ERROR's triggered in vignette -- Do you know if there is a similar flag that would limit the check to the package specific code but that can be applied to R CMD build as well? (these aren't very well documented)

@HenrikBengtsson
Copy link
Author

I think setting environment variable _R_CHECK_PACKAGE_NAME to the name of the package before you build would work, e.g.

$ export _R_CHECK_LENGTH_1_LOGIC2_="package:_R_CHECK_PACKAGE_NAME_,abort,verbose"
$ export _R_CHECK_LENGTH_1_CONDITION_="package:_R_CHECK_PACKAGE_NAME_,abort,verbose"
$ _R_CHECK_PACKAGE_NAME=foo R CMD build foo

@hpages
Copy link
Contributor

hpages commented Apr 21, 2020

Thanks Henrik.

@lshep That works but that means that if we want to use this solution in the context of the SPB or the nightly builds then we need to modify the exact command that we use for the BUILD step (we need to prepend it with _R_CHECK_PACKAGE_NAME_=<pkgname>). I'm not sure we want to go there though because that is putting one more complication (even if it's a small one) in the way of reproducibility by package maintainers.

Unfortunately at this point I don't see an easy way to use _R_CHECK_LENGTH_1_CONDITION_ and _R_CHECK_LENGTH_1_LOGIC2_ settings that would give us consistent behavior between R CMD build and R CMD check without making our setup even more confusing.

The 2 alternatives I can think of are:

  1. Stop using the --no-vignettes flag when we run R CMD check so we catch errors triggered by the execution of the vignette code. The cost will be longer builds but we can't afford this at the moment.
  2. Set _R_CHECK_LENGTH_1_CONDITION_ and _R_CHECK_LENGTH_1_LOGIC2_ to true. However, as we discussed earlier, this would trigger the error even if it doesn't occur in the code of the package being built or checked, which is probably too brutal. But at least the errors would be triggered whatever the context is (R CMD build, R CMD check, or interactive session) which would greatly improve reproducibility.

Just putting them here for the record but nothing satisfying. What other options do we have?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants