Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-22281][SPARKR] Handle R method breaking signature changes #19557

Closed
wants to merge 4 commits into from

Conversation

felixcheung
Copy link
Member

@felixcheung felixcheung commented Oct 23, 2017

What changes were proposed in this pull request?

This is to fix the code for the latest R changes in R-devel, when running CRAN check

checking for code/documentation mismatches ... WARNING
Codoc mismatches from documentation object 'attach':
attach
Code: function(what, pos = 2L, name = deparse(substitute(what),
backtick = FALSE), warn.conflicts = TRUE)
Docs: function(what, pos = 2L, name = deparse(substitute(what)),
warn.conflicts = TRUE)
Mismatches in argument default values:
Name: 'name' Code: deparse(substitute(what), backtick = FALSE) Docs: deparse(substitute(what))

Codoc mismatches from documentation object 'glm':
glm
Code: function(formula, family = gaussian, data, weights, subset,
na.action, start = NULL, etastart, mustart, offset,
control = list(...), model = TRUE, method = "glm.fit",
x = FALSE, y = TRUE, singular.ok = TRUE, contrasts =
NULL, ...)
Docs: function(formula, family = gaussian, data, weights, subset,
na.action, start = NULL, etastart, mustart, offset,
control = list(...), model = TRUE, method = "glm.fit",
x = FALSE, y = TRUE, contrasts = NULL, ...)
Argument names in code not in docs:
singular.ok
Mismatches in argument names:
Position: 16 Code: singular.ok Docs: contrasts
Position: 17 Code: contrasts Docs: ...

With attach, it's pulling in the function definition from base::attach. We need to disable that but we would still need a function signature for roxygen2 to build with.

With glm it's pulling in the function definition (ie. "usage") from the stats::glm function. Since this is "compiled in" when we build the source package into the .Rd file, when it changes at runtime or in CRAN check it won't match the latest signature. The solution is not to pull in from stats::glm since there isn't much value in doing that (none of the param we actually use, the ones we do use we have explicitly documented them)

Also with attach we are changing to call dynamically.

How was this patch tested?

Manually.

  • check documentation output - yes
  • check help ?attach ?glm - yes
  • check on other platforms, r-hub, on r-devel etc..

args <- as.list(environment()) # capture all function parameters - this must be the first line
newEnv <- assignNewEnv(args$what)
args$what <- newEnv
do.call(attach, args)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since I'm changing to do.call - so that we can retain the param name/value order without having to hardcode them here - we are no longer being flagged with the NOTE on calling attach :)

@SparkQA
Copy link

SparkQA commented Oct 23, 2017

Test build #82972 has finished for PR 19557 at commit 7b145ba.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

Is there a reason we can't use the same glm trick for attach ? I guess this was explained above but I'm wondering if there is a reason the base::attach is not compiled in the same way ?

@felixcheung felixcheung changed the title [SPARK-22281][SPARKR] Handle R method breaking signature changes [SPARK-22281][SPARKR][WIP] Handle R method breaking signature changes Oct 24, 2017
@felixcheung
Copy link
Member Author

felixcheung commented Oct 24, 2017

unfortunately attach part didn't work on r-devel.

I think you are right, I'm going try to apply the similar approach for attach - i started that way before I found out method signature was not the cause for glm. So now I'm going backtrack to re-do attach too. I think it should work.

edit: wait, actually I did try that. it doesn't work for attach because it will complain there is no \usage block. I'll need to change it back to the fixed, non-dynamic function signature.

@SparkQA
Copy link

SparkQA commented Oct 24, 2017

Test build #83004 has finished for PR 19557 at commit a9b74de.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Oct 24, 2017

Test build #83005 has finished for PR 19557 at commit 442711d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@shivaram
Copy link
Contributor

@felixcheung Are the docs on this version good ?

@felixcheung
Copy link
Member Author

felixcheung commented Oct 24, 2017 via email

newEnv <- assignNewEnv(what)
attach(newEnv, pos = pos, name = name, warn.conflicts = warn.conflicts)
function(what, pos = 2L, name = deparse(substitute(what), backtick = FALSE),
warn.conflicts = TRUE) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changing this to match the latest R-devel param signature

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is the only thing we might need to update if R changes something ? Also how does this keep working with the older R version as well

Copy link
Member Author

@felixcheung felixcheung Oct 25, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will need to be updated manually...
from my testing on r-release it's working against the older signature. I think for generic to work it just need to match the param names and order but not default value.

this does change the behavior a bit, from name = deparse(substitute(what)) to name = deparse(substitute(what), backtick = FALSE)

args <- as.list(environment()) # capture all parameters - this must be the first line
newEnv <- assignNewEnv(args$what)
args$what <- newEnv
do.call(attach, args)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calling attach this way to avoid it getting flagged...

R/run-tests.sh Outdated
@@ -38,6 +38,7 @@ FAILED=$((PIPESTATUS[0]||$FAILED))
NUM_CRAN_WARNING="$(grep -c WARNING$ $CRAN_CHECK_LOG_FILE)"
NUM_CRAN_ERROR="$(grep -c ERROR$ $CRAN_CHECK_LOG_FILE)"
NUM_CRAN_NOTES="$(grep -c NOTE$ $CRAN_CHECK_LOG_FILE)"
HAS_PACKAGE_VERSION_WARN="$(grep -c "Insufficient package version" $CRAN_CHECK_LOG_FILE)"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changes to this file is from the other PR plus one change to $NUM_CRAN_NOTES - 1

@felixcheung
Copy link
Member Author

felixcheung commented Oct 25, 2017

well, this change avoided the warning but disabled help too (which is probably why it's working)

> showMethods("attach")
Function: attach (package base)
what="ANY"
what="SparkDataFrame"

> showMethods("fitted")
Function: fitted (package stats)
object="ANY"
object="BisectingKMeansModel"
object="KMeansModel"

> ?fitted
Help on topic ‘fitted’ was found in the following packages:

  Package               Library
  SparkR                /opt/spark-2.3.0-bin-hadoop2.7/R/lib
  stats                 /usr/lib/R/library

Choose one

1: Get fitted result from a k-means model {SparkR}
2: Extract Model Fitted Values {stats}

Selection: 0
> ?attach
> ?SparkR::fitted
> ?SparkR::attach
No documentation for ‘attach’ in specified packages and libraries:
you could try ‘??attach’
> ?attach,SparkDataFrame-method
Error: unexpected ',' in "?attach,"

basically it shows without asking the base::attach, ?SparkR::attach doesn't work either.
I can confirm the attach.Rd attach.html getting knitr though, so not sure why the help index isn't generated.

this > ?"attach,SparkDataFrame-method"and ?"glm,formula,ANY,SparkDataFrame-method" works though, I think I know what it is missing

@SparkQA
Copy link

SparkQA commented Oct 25, 2017

Test build #83038 has finished for PR 19557 at commit 8f84aa6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Copy link
Member Author

tested on windows, r-hub/r-devel

@felixcheung felixcheung changed the title [SPARK-22281][SPARKR][WIP] Handle R method breaking signature changes [SPARK-22281][SPARKR] Handle R method breaking signature changes Oct 26, 2017
@shivaram
Copy link
Contributor

LGTM.

@felixcheung
Copy link
Member Author

rebased

@SparkQA
Copy link

SparkQA commented Oct 31, 2017

Test build #83248 has finished for PR 19557 at commit 1b41f73.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Copy link
Member Author

merged to master/2.2

asfgit pushed a commit that referenced this pull request Nov 8, 2017
## What changes were proposed in this pull request?

This is to fix the code for the latest R changes in R-devel, when running CRAN check
```
checking for code/documentation mismatches ... WARNING
Codoc mismatches from documentation object 'attach':
attach
Code: function(what, pos = 2L, name = deparse(substitute(what),
backtick = FALSE), warn.conflicts = TRUE)
Docs: function(what, pos = 2L, name = deparse(substitute(what)),
warn.conflicts = TRUE)
Mismatches in argument default values:
Name: 'name' Code: deparse(substitute(what), backtick = FALSE) Docs: deparse(substitute(what))

Codoc mismatches from documentation object 'glm':
glm
Code: function(formula, family = gaussian, data, weights, subset,
na.action, start = NULL, etastart, mustart, offset,
control = list(...), model = TRUE, method = "glm.fit",
x = FALSE, y = TRUE, singular.ok = TRUE, contrasts =
NULL, ...)
Docs: function(formula, family = gaussian, data, weights, subset,
na.action, start = NULL, etastart, mustart, offset,
control = list(...), model = TRUE, method = "glm.fit",
x = FALSE, y = TRUE, contrasts = NULL, ...)
Argument names in code not in docs:
singular.ok
Mismatches in argument names:
Position: 16 Code: singular.ok Docs: contrasts
Position: 17 Code: contrasts Docs: ...
```

With attach, it's pulling in the function definition from base::attach. We need to disable that but we would still need a function signature for roxygen2 to build with.

With glm it's pulling in the function definition (ie. "usage") from the stats::glm function. Since this is "compiled in" when we build the source package into the .Rd file, when it changes at runtime or in CRAN check it won't match the latest signature. The solution is not to pull in from stats::glm since there isn't much value in doing that (none of the param we actually use, the ones we do use we have explicitly documented them)

Also with attach we are changing to call dynamically.

## How was this patch tested?

Manually.
- [x] check documentation output - yes
- [x] check help `?attach` `?glm` - yes
- [x] check on other platforms, r-hub, on r-devel etc..

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #19557 from felixcheung/rattachglmdocerror.

(cherry picked from commit 2ca5aae)
Signed-off-by: Felix Cheung <felixcheung@apache.org>
@asfgit asfgit closed this in 2ca5aae Nov 8, 2017
MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018
## What changes were proposed in this pull request?

This is to fix the code for the latest R changes in R-devel, when running CRAN check
```
checking for code/documentation mismatches ... WARNING
Codoc mismatches from documentation object 'attach':
attach
Code: function(what, pos = 2L, name = deparse(substitute(what),
backtick = FALSE), warn.conflicts = TRUE)
Docs: function(what, pos = 2L, name = deparse(substitute(what)),
warn.conflicts = TRUE)
Mismatches in argument default values:
Name: 'name' Code: deparse(substitute(what), backtick = FALSE) Docs: deparse(substitute(what))

Codoc mismatches from documentation object 'glm':
glm
Code: function(formula, family = gaussian, data, weights, subset,
na.action, start = NULL, etastart, mustart, offset,
control = list(...), model = TRUE, method = "glm.fit",
x = FALSE, y = TRUE, singular.ok = TRUE, contrasts =
NULL, ...)
Docs: function(formula, family = gaussian, data, weights, subset,
na.action, start = NULL, etastart, mustart, offset,
control = list(...), model = TRUE, method = "glm.fit",
x = FALSE, y = TRUE, contrasts = NULL, ...)
Argument names in code not in docs:
singular.ok
Mismatches in argument names:
Position: 16 Code: singular.ok Docs: contrasts
Position: 17 Code: contrasts Docs: ...
```

With attach, it's pulling in the function definition from base::attach. We need to disable that but we would still need a function signature for roxygen2 to build with.

With glm it's pulling in the function definition (ie. "usage") from the stats::glm function. Since this is "compiled in" when we build the source package into the .Rd file, when it changes at runtime or in CRAN check it won't match the latest signature. The solution is not to pull in from stats::glm since there isn't much value in doing that (none of the param we actually use, the ones we do use we have explicitly documented them)

Also with attach we are changing to call dynamically.

## How was this patch tested?

Manually.
- [x] check documentation output - yes
- [x] check help `?attach` `?glm` - yes
- [x] check on other platforms, r-hub, on r-devel etc..

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes apache#19557 from felixcheung/rattachglmdocerror.

(cherry picked from commit 2ca5aae)
Signed-off-by: Felix Cheung <felixcheung@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants