Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Impute srcrefs for subexpressions #154

Merged
merged 22 commits into from Mar 17, 2016
Merged

Impute srcrefs for subexpressions #154

merged 22 commits into from Mar 17, 2016

Conversation

@krlmlr
Copy link
Member

krlmlr commented Mar 9, 2016

This allows collecting coverage for non-braced subexpressions in if, for, and while clauses. Works by imputing missing srcrefs on the fly (in trace_calls()) by analyzing getParseData() output.

Caveats:

  • This doesn't work for me in package_coverage(), I don't know why. Perhaps package_coverage() somehow needs to know all srcrefs beforehand?
  • Probably broken for non-ASCII characters.
  • Percentage changes in one of the tests.

Fixes #39.

Test file:

f <- function() {
  if (FALSE)
    FALSE

  for (i in character())
    FALSE

  while (FALSE)
    FALSE

  repeat
    break
}

cv <- covr::function_coverage(f, f())
covr::shine(cv)

screenshot from 2016-03-09 16-32-53

Kirill Müller added 2 commits Mar 9, 2016
- percentage changes in one of the tests
@lintr-bot

This comment has been minimized.

Copy link

lintr-bot commented on 72ad35c Mar 9, 2016

R/trace_calls.R:136:39: style: Commas should always have a space after.

pd_child <- pd[pd$parent == expr_id,]
                                      ^
@jimhester

This comment has been minimized.

Copy link
Member

jimhester commented Mar 9, 2016

This is awesome thank you for working on it, I had a couple aborted attempts and this has been a nagging problem I have been wanting to fix.

There shouldn't be any difference in srcrefs for package_coverage().

However package_coverage() does use a subprocess, so you need to actually install the package with your changes before they will be used.

Kirill Müller
@lintr-bot

This comment has been minimized.

Copy link

lintr-bot commented on f6d9d1c Mar 9, 2016

R/trace_calls.R:136:39: style: Commas should always have a space after.

pd_child <- pd[pd$parent == expr_id,]
                                      ^
@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Mar 9, 2016

package_coverage(): It seems that the package loading mechanism does something "evil": It concatenates all R files to one, and adds #line directives. This breaks getParseData() on which my code relies. devtools::load_all() loads the code from the original files; would you mind using that instead of loadNamespace?

@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Mar 10, 2016

Using load_all() breaks the S4 tests. Bummer...

Kirill Müller added 3 commits Mar 9, 2016
@jimhester

This comment has been minimized.

Copy link
Member

jimhester commented Mar 10, 2016

I originally used load_all(), but as you saw S4 doesn't work correctly with it. I ran into the line directive issue for getting the source of each file as well. Here is the function which actually installs the code when you run R CMD INSTALL (https://github.com/wch/r-source/blob/b156e3a711967f58131e23c1b1dc1ea90e2f0c43/src/library/tools/R/admin.R#L205-L335).

To get around this I think we could just re-parse the text in the parent source reference to get the token data. Something like

txt <- as.character(parent_ref)
sf <- srcfile(txt)
parse(text = txt, srcfile = sf)
pd <- getParseData(sf)

Then you need to fix the line and column numbers from the tokens with those from where the if block starts.

@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Mar 10, 2016

An alternative would be to fix getParseData() in this case. The "srcfile" attribute is an environment which contains an "original" object; I think the parse data belongs there, but is instead attached to the last file in the package.

Test package: https://github.com/krlmlr/covr.dummy
Example output: http://rpubs.com/krlmlr/getParseData

Just posted to r-devel.

@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Mar 10, 2016

Until this is fixed in R, I suggest we tweak the package loading process: Create a "last" file ourselves with an otherwise useless function, grab the parseData from there and put it where it belongs.

@jimhester

This comment has been minimized.

Copy link
Member

jimhester commented Mar 10, 2016

From your report it looks like the last srcref will always have the parse data attached. Could we just find that srcref and copy the parse data to the other srcrefs?

Figuring out the last file is slightly complicated due to Collate directives, locale issues etc.

@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Mar 10, 2016

What if there are no objects in the last file? Think good ol' zzz.R.

Kirill Müller added 3 commits Mar 10, 2016
This reverts commit 4982d94.
Kirill Müller
also be aware of line offsets due to #line directives
@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Mar 10, 2016

screenshot from 2016-03-10 16-38-48

Kirill Müller added 2 commits Mar 10, 2016
Kirill Müller
@lintr-bot

This comment has been minimized.

Copy link

lintr-bot commented on bf38a63 Mar 10, 2016

R/parse_data.R:6:1: style: lines should not be more than 120 characters.

​    warning("Parse data not found, coverage may be inaccurate. Try declaring a function in the last file of your R package.",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

R/parse_data.R:22:17: style: Variable and function names should be all lowercase.

original[[1]]$parseData <- parse_data[[1L]]
                ^~~~~~~~~
@jimhester

This comment has been minimized.

Copy link
Member

jimhester commented Mar 10, 2016

Ok so we can tokens of the entire package including line directives from any srcref in the package with the following.

get_tokens <- function(srcref) {
  getParseData(attr(parse(text = attr(getSrcref(srcref), "srcfile")$original$lines), "srcfile"))
}

Then you would need to fix the line numbers based on the line directives.

This works because the object returned by parse returns a srcfile with the parse data attached.

@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Mar 10, 2016

Much easier, don't need to reparse at all. Working on a custom version of getParseData() to make it a tad faster. Functionality is in place already, tests pass, only lintr keeps complaining.

Kirill Müller added 5 commits Mar 10, 2016
Kirill Müller
Kirill Müller
- avoids sorting
- could use data.table for lookup performance here
Kirill Müller
@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Mar 10, 2016

Tests pass (AppVeyor requires an appveyor.yml file; if you delete the project there, it shouldn't show up here anymore). Character encoding still might be an issue (if we're ever looking at exact column position of the imputed srcrefs).

There will be warnings if parse data cannot be repaired.

Kirill Müller added 2 commits Mar 10, 2016
Kirill Müller
safeguard
seems to happen (at least) with the :: operator
@krlmlr krlmlr changed the title WIP: Impute srcrefs for subexpressions Impute srcrefs for subexpressions Mar 10, 2016
Kirill Müller
@krlmlr krlmlr mentioned this pull request Mar 10, 2016
make_srcref(5),
make_srcref(6, 7)
)
src_ref[seq_along(x)]

This comment has been minimized.

Copy link
@jimhester

jimhester Mar 10, 2016

Member

We need this to handle if's without else's right? Might be worth adding a comment so it is clear why the if case is different than for and while.

Kirill Müller
get_parse_data(x$original)
else if (exists("covr_parse_data", x))
x$covr_parse_data
else if (!is.null(data <- x[["parseData"]])) {

This comment has been minimized.

Copy link
@jimhester

jimhester Mar 10, 2016

Member

Could you put braces around the bodies of these conditionals, I prefer to be explicit to avoid issues in the future when statements are added and someone forgets to add braces.

This comment has been minimized.

Copy link
@krlmlr

krlmlr Mar 10, 2016

Author Member

Like this:

  if (inherits(x, "srcref")) {
    get_parse_data(attr(x, "srcfile"))
  } else if (exists("original", x)) {

?

This comment has been minimized.

Copy link
@jimhester
Kirill Müller
@krlmlr

This comment has been minimized.

Copy link
Member Author

krlmlr commented Mar 10, 2016

Done. Should we wait for community feedback before merging? The changes here are rather invasive.

@jimhester

This comment has been minimized.

Copy link
Member

jimhester commented Mar 10, 2016

Well the current behavior is clearly wrong, so if we wait it shouldn't need to be long.

@krlmlr krlmlr mentioned this pull request Mar 11, 2016
@jimhester jimhester merged commit 6442dd8 into r-lib:master Mar 17, 2016
1 check passed
1 check passed
continuous-integration/wercker Wercker build passed
Details
@krlmlr krlmlr deleted the krlmlr:feature/39-impute-srcref branch Mar 17, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants
You can’t perform that action at this time.