Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upVectorize arguments of webshot #33
Conversation
|
Wow, this looks really nice! Thanks for your efforts adding this functionality. |
|
I like the general approach, although since the change is pretty substantial, I'm concerned about the risk of breaking existing behavior. If you can verify that you've tested the behavior for the following (vectorized and un-vectorized), that would help:
Also, it would be good to have one or two examples in the README and in the vignette. |
| #' @param file Name of output file. Should end with \code{.png}, \code{.pdf}, or | ||
| #' \code{.jpeg}. | ||
| #' @param url A vector of URLs to visit. | ||
| #' @param file A Vector of names of output files. Should end with \code{.png}, |
wch
Nov 28, 2016
Owner
Should be "vector" instead of "Vector".
Should be "vector" instead of "Vector".
wch
Nov 28, 2016
Owner
Also, can you describe what happens to filenames if there are multiple URLs but only one filename?
Also, can you describe what happens to filenames if there are multiple URLs but only one filename?
| #' \code{.jpeg}. | ||
| #' @param url A vector of URLs to visit. | ||
| #' @param file A Vector of names of output files. Should end with \code{.png}, | ||
| #' \code{.pdf}, or \code{.jpeg}. |
wch
Nov 28, 2016
Owner
Could you indent consistently with the other documentation? If you're using RStudio, you can press Ctrl-Shift-/ (or Cmd-Shift-/ on Mac) to automatically reflow/indent correctly.
Could you indent consistently with the other documentation? If you're using RStudio, you can press Ctrl-Shift-/ (or Cmd-Shift-/ on Mac) to automatically reflow/indent correctly.
| @@ -112,6 +124,17 @@ webshot <- function( | |||
| stop("Need url.") | |||
| } | |||
|
|
|||
| if (length(url) > 1) { | |||
| if (length(file) == 1) { | |||
| file <- sapply(1:length(url), function(i) { | |||
wch
Nov 28, 2016
Owner
Please use vapply or lapply instead of sapply -- this makes it easier for others to infer what is returned.
Also, use seq_along(url) instead of 1:length(url).
Please use vapply or lapply instead of sapply -- this makes it easier for others to infer what is returned.
Also, use seq_along(url) instead of 1:length(url).
| gsub("\\.(.{3,4})$", replacement, file) | ||
| }) | ||
| } else if (length(file) != length(url)) { | ||
| stop("parameters 'url' and 'file' should have same length") |
wch
Nov 28, 2016
Owner
Should say that file can also be length 1.
Should say that file can also be length 1.
| @@ -112,6 +124,17 @@ webshot <- function( | |||
| stop("Need url.") | |||
| } | |||
|
|
|||
| if (length(url) > 1) { | |||
| if (length(file) == 1) { | |||
wch
Nov 28, 2016
Owner
It would be helpful to have a short comment describing what this block does.
It would be helpful to have a short comment describing what this block does.
| if (opts.selector) { | ||
| opts.selector = opts.selector.split(","); | ||
| } | ||
| var data = JSON.parse(args[1]); |
wch
Nov 28, 2016
Owner
I think data is a bit vague. Something like allOpts would be better (though there's probably a better name that could be used).
I think data is a bit vague. Something like allOpts would be better (though there's probably a better name that could be used).
| var opts = response.data; | ||
|
|
||
| // Prepare options | ||
| opts = utils.merge(opt_defaults, opts); |
wch
Nov 28, 2016
Owner
I believe the utils.merge function has the side effect of modifying opts_defaults, which is undesirable, since that variable gets reused now.
I believe the utils.merge function has the side effect of modifying opts_defaults, which is undesirable, since that variable gets reused now.
FrancoisGuillem
Nov 28, 2016
Author
Contributor
Does replacing this by opts = utils.merge(opts, opt_defaults); resolve the problem ?
Does replacing this by opts = utils.merge(opts, opt_defaults); resolve the problem ?
FrancoisGuillem
Nov 28, 2016
Author
Contributor
Ho no I see the problem, forget it.
Ho no I see the problem, forget it.
wch
Nov 28, 2016
•
Owner
That has different behavior: it will overwrite values in opts with those from opt_defaults, so that the defaults will always supersede the values in opts. I think it would make sense to add a (shallow) object cloning function called utils.clone.
That has different behavior: it will overwrite values in opts with those from opt_defaults, so that the defaults will always supersede the values in opts. I think it would make sense to add a (shallow) object cloning function called utils.clone.
| this.capture(filename, cr); | ||
| // Go to url and perform the desired screenshot | ||
| this.thenOpen(url, function(r) { | ||
| this.zoom(opts.zoom) |
wch
Nov 28, 2016
Owner
I'm not 100% sure about this, but I believe that it's more efficient to do the zoom and viewport sizing before opening the URL.
I think it would be better to do the chaining like this:
casper
.zoom(...)
.viewport(...)
.thenOpen(...)
.wait(..., function() {
...
});
I'm not 100% sure about this, but I believe that it's more efficient to do the zoom and viewport sizing before opening the URL.
I think it would be better to do the chaining like this:
casper
.zoom(...)
.viewport(...)
.thenOpen(...)
.wait(..., function() {
...
});| }) | ||
| } | ||
|
|
||
| if (!is.null(selector)) data$selector <- argToVec(selector) |
wch
Nov 28, 2016
Owner
selector and expand should probably check for the correct number of items (either 1 or 4) in the R code instead of in the Javascript, because an error in the JS could now result in a partially-successful run. Better to have it not work at all than to partially work.
selector and expand should probably check for the correct number of items (either 1 or 4) in the R code instead of in the Javascript, because an error in the JS could now result in a partially-successful run. Better to have it not work at all than to partially work.
| eval(opts.eval); | ||
| } | ||
| casper.eachThen(data, function(response) { | ||
| var url = response.data.url; |
wch
Nov 28, 2016
Owner
url and file don't need to be pulled out separately from the rest of the opts, do they?
url and file don't need to be pulled out separately from the rest of the opts, do they?
…optsList" in the R code for consistency
|
I think I have corrected every problem. I have tested unvectorized and vectorized versions of examples of the vignette and everything looks fine. Invalid URLs are better handled. Before, I Have not tested functions resize and shrink because I have not the required tools on my profesional computer. Could you test it ? |
|
@cuche27 Looking good! I have more comments though... I don't have time right now to test out those functions but I may in the next couple of days. @yihui with these changes, the object of class @jimhester will the vector of names work with the code you've written to use |
| @@ -21,6 +21,11 @@ | |||
| #' } | |||
| #' @export | |||
| resize <- function(filename, geometry) { | |||
| mapply(.resize, filename = filename, geometry = geometry) | |||
wch
Nov 29, 2016
Owner
2 issues with mapply:
- Like
sapply, the return type isn't known in advance (it could be a list, char vector, numeric vector, etc.) unless you use SIMPLIFY = FALSE.
- If you pass an unnamed char vector to
mapply, it returns a named vector or list, which is undesirable here. (For example, try mapply(function(x) 1, c("a","b")).)
To ensure that the result is an unnamed char vector, it should be something like:
as.character(mapply( ... , USE.NAMES = FALSE, SIMPLIFY = FALSE))
Also, I think the documentation needs to be updated to indicate that the args can be vectors. This would be similar to the documentation changes for webshot.
2 issues with mapply:
- Like
sapply, the return type isn't known in advance (it could be a list, char vector, numeric vector, etc.) unless you useSIMPLIFY = FALSE. - If you pass an unnamed char vector to
mapply, it returns a named vector or list, which is undesirable here. (For example, trymapply(function(x) 1, c("a","b")).)
To ensure that the result is an unnamed char vector, it should be something like:
as.character(mapply( ... , USE.NAMES = FALSE, SIMPLIFY = FALSE))Also, I think the documentation needs to be updated to indicate that the args can be vectors. This would be similar to the documentation changes for webshot.
| structure(filename, class = "webshot") | ||
| } | ||
|
|
||
| .resize <- function(filename, geometry) { |
wch
Nov 29, 2016
Owner
I'd prefer this to be called resize_one.
I'd prefer this to be called resize_one.
| if(!is.null(expand) && !is.list(expand)) expand <- list(expand) | ||
|
|
||
| # Check length of arguments | ||
| lengthOfArgs <- vapply(environment(), length, numeric(1)) |
wch
Nov 29, 2016
Owner
I don't like the vapply(environment()), because it's fragile (if code above is modified to add a new var, it could break) and because using vapply on an environment just seems strange.
I don't like the vapply(environment()), because it's fragile (if code above is modified to add a new var, it could break) and because using vapply on an environment just seems strange.
| gsub("\\.(.{3,4})$", replacement, file) | ||
| }) | ||
| } | ||
|
|
||
| if (is_windows()) { | ||
| url <- fix_windows_url(url) |
wch
Nov 29, 2016
Owner
I think fix_windows_url won't work with multiple URLs.
I think fix_windows_url won't work with multiple URLs.
FrancoisGuillem
Nov 29, 2016
Author
Contributor
It looks fine actually. All functions used in fix_windows_url accept vectors.
It looks fine actually. All functions used in fix_windows_url accept vectors.
wch
Nov 29, 2016
Owner
Yes, but it calls if(grepl(...)), and if is not vectorized. For example:
> if (c(T, F, F)) 1 else 0
[1] 1
Warning message:
In if (c(T, F, F)) 1 else 0 :
the condition has length > 1 and only the first element will be used
Yes, but it calls if(grepl(...)), and if is not vectorized. For example:
> if (c(T, F, F)) 1 else 0
[1] 1
Warning message:
In if (c(T, F, F)) 1 else 0 :
the condition has length > 1 and only the first element will be used
FrancoisGuillem
Nov 29, 2016
Author
Contributor
Oh yes I missed that. It should be fixed now.
Oh yes I missed that. It should be fixed now.
| ' phantomjs webshot.js <url> <name>.png [options]'); | ||
| if (args.length < 2) { | ||
| console.log( | ||
| 'usage: phantomjs webshot.js <optsList>\n' + |
wch
Nov 29, 2016
Owner
I'd prefer the original formatting, which puts "Usage:" on a separate line:
Usage:
phantomjs webshot.js <optsList>
...
I'd prefer the original formatting, which puts "Usage:" on a separate line:
Usage:
phantomjs webshot.js <optsList>
...
| casper.zoom(opts.zoom) | ||
| .viewport(opts.zoom * opts.vwidth, opts.zoom * opts.vheight) | ||
| .thenOpen(opts.url, function(r) { | ||
| casper.wait(opts.delay * 1000, function() { |
wch
Nov 29, 2016
Owner
I think the usual convention is to use this instead of casper, right?
I think the usual convention is to use this instead of casper, right?
FrancoisGuillem
Nov 29, 2016
Author
Contributor
I fill unconfortable with this convention but indeed it is the case. I will change it back :)
I fill unconfortable with this convention but indeed it is the case. I will change it back :)
| @@ -77,6 +81,11 @@ resize <- function(filename, geometry) { | |||
| #' } | |||
| #' @export | |||
| shrink <- function(filename) { | |||
| mapply(.shrink, filename = filename) | |||
| structure(filename, class = "webshot") | |||
wch
Nov 29, 2016
Owner
Same issues here as with resize.
Same issues here as with resize.
| #' of multiple URLs, this parameter can also be a list with same length as | ||
| #' \code{url} with each element of the list being "viewport" or a | ||
| #' four-elements numeric vector. This option is not compatible with | ||
| #' \code{selector} |
wch
Nov 29, 2016
Owner
Missing period at end of sentence.
Missing period at end of sentence.
| @@ -45,6 +56,12 @@ | |||
| #' # Might need a longer delay for all assets to display | |||
| #' webshot("http://rstudio.github.io/leaflet", delay = 0.5) | |||
| #' | |||
| #' # One can also take screenshots of several URLs with only one command. | |||
| #' # This is more efficient than calling multiple times 'webshot'. | |||
wch
Nov 29, 2016
Owner
"... calling 'webshot' multiple times."
"... calling 'webshot' multiple times."
| @@ -2,6 +2,8 @@ | |||
| #' | |||
| #' @inheritParams webshot | |||
| #' @param app A Shiny app object, or a string naming an app directory. | |||
| #' @param file Name of output file. Should end with \code{.png}, \code{.pdf}, or | |||
wch
Nov 29, 2016
Owner
I don't think this is necessary, because appshot's documentation should inherit values from webshot.
I don't think this is necessary, because appshot's documentation should inherit values from webshot.
|
I can make changes in knitr after this PR is merged. |
|
No this does not currently work with printing within knitr documents. The |
|
Actually this seems to be all that is needed to get it working. diff --git a/R/webshot.R b/R/webshot.R
index 02f9c0d..1aadbdc 100644
--- a/R/webshot.R
+++ b/R/webshot.R
@@ -236,9 +236,11 @@ webshot <- function(
}
knit_print.webshot <- function(x, ...) {
- res <- readBin(x, "raw", file.size(x))
- ext <- gsub(".*[.]", "", basename(x))
- structure(list(image = res, extension = ext), class = "html_screenshot")
+ lapply(x, function(filename) {
+ res <- readBin(filename, "raw", file.size(filename))
+ ext <- gsub(".*[.]", "", basename(filename))
+ structure(list(image = res, extension = ext), class = "html_screenshot")
+ })
}A simple test is knit(text = c("```{r}", "webshot(c(\"https://www.r-project.org/\", \"https://cran.r-project.org/package=webshot\"))", "```"))
#> [1] "\n```r\nwebshot(c(\"https://www.r-project.org/\", \"https://cran.r-project.org/package=webshot\"))\n```\n\n" |
|
I have tested functions |
|
Thanks! |
|
Oh, I just realized that the result of the |
Hello,
This pull request solves #32 . It vectorizes all arguments of "webshot" in order to improve performance when taking a lot of screenshots. For instance, on my laptop, taking 10 screenshots with default parameters takes 7 seconds instead of 27 seconds.
Parameters for each screenshot are passed to script webshot.js as a JSON array. Then
casper.eachThenis used to loop over each configuration object.