New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DevPackage Dependencies #902

Merged
merged 12 commits into from Sep 2, 2015

Conversation

Projects
None yet
4 participants
@jimhester
Member

jimhester commented Aug 25, 2015

This allows package authors to specify DevPackage in their DESCRIPTION,
which devtools will use to install development versions of the
dependencies prior to looking for them in normal repositories. The
format is 'type|user/repo', where type is optional (default github)
defining the install_type to use, e.g. (git, gitorious, svn, ect.).

An example description would be (for the ggplot2 book for instance)

Package: ggbook2
Title: ggbook2 is not really an R package, but this is here to list the dependencies for building the ggplot2 book.
Version: 0.1
Authors@R: person("First", "Last", email = "first.last@example.com",
                  role = c("aut", "cre"))
Depends: R (>= 3.1.0)
URL: https://github.com/hadley/ggplot2-book
Imports:
  lubridate,
  rvest,
  magrittr,
  dplyr,
  plyr,
  tidyr,
  xtable,
  nlme,
  effects,
  broom,
  hexbin,
  maps,
  Hmisc,
  devtools,
  readr,
  rmarkdown,
  Lahman,
  ggmap,
  directlabels,
  wesanderson,
  babynames
DevPackage:
  hadley/bookdown,
  hadley/scales, 
  hadley/ggplot2,
  hadley/ggplot2movies,
  hadley/tidyr,
  hadley/directlabels, 
  jrnold/ggthemes
SystemRequirements: pandoc (>= 1.12.3) - http://johnmacfarlane.net/pandoc

This would allow people to install it with simply

devtools::install_github("hadley/ggplot2-book")

This implementation seems to work fine as-is, but I had a few questions as far as the API

  1. Is this useful (I think it is)!
  2. Is DevPackage a good name for this or should we use something else?
  3. is | a good delimiter for the type? I picked it because it should not appear in URLs or user/repository names. Using / or : seem like they would require more complicated logic to parse properly in all cases.
  4. Where is the best place to document/popularize this feature? The functions to implement are all non-exported.
  5. What is the best way to test this, just mock the install functions and verify we are passing them the correct parameters?
@hadley

This comment has been minimized.

Member

hadley commented Aug 25, 2015

  1. I like it. It's especially nice since it means you no longer need to describe in multiple places (e.g. .travis.yml and README.md and somewhere else that I forget, but is annoying to update).
  2. DevPackage seems ok to me, but I suspect there's a better name. Note also that CRAN will warn about non-standard fields in the DESCRIPTION, but I guess that's actually ok because they should be all removed by the time you submit to CRAN.
  3. Seems reasonable to me.
  4. Hmmm good question. Maybe we need an installation vignette that describes in general how dependencies are fixed?
  5. Rather than mocking (which I really rather dislike), I'd rather you separate the code out into parsing vs. doing. Then just test the parsing code. (The doing should just be a simple lapply() so shouldn't need testing)
@jimhester

This comment has been minimized.

Member

jimhester commented Aug 26, 2015

That is true about the non-standard fields, although it is only included in the general NOTE, and Writing R Extensions does state

There is no restriction on the use of other fields not mentioned here (but using other capitalizations of these field names would cause confusion). Fields Note, Contact (for contacting the authors/developers) and MailingList are in common use. Some repositories (including CRAN and R-forge) add their own fields.

So it shouldn't be be hard to justify it if needed. But the argument is largely moot as you said because once you submit to CRAN you wouldn't need this anyway.

It would be possible to get the name to pass the check by prefixing it with X-CRAN or with a standard field suffixed with Note, see r-source/library/tools/R/QC.R#L6605-L6610. But even the best of those options ImportsNote, Additional_repositoriesNote, ect seem a little hacky to me.

Other names we could consider are RemotePackages, RemoteDepends, just Remotes...

I will work on some tests and ping you when they are done!

@jimhester

This comment has been minimized.

Member

jimhester commented Aug 31, 2015

Ok e43e46e has tests, a note to NEWS and a vignette with example usage. Let me know if I can make it clearer on first reading.

I changed the field name to Remotes as I think it is both shorter and more accurate. Plus theoretically this doesn't have to just be for development packages, you could distribute packages entirely without CRAN if you wanted.

The main remaining limitation with this approach is there is no way to supply any additional arguments to the install_* functions. This won't be a large deal for the git based install functions, as installing from a branch/sha1/pull/tag is all supported in the syntax (@, #, ect). However for install_svn not being able to supply a branch or subdirectory parameter is pretty limiting. Even for the install_github() it would be nice to support specifying the host= argument for example.

However I don't know what syntax I could use that would be easily parsable. I had initially thought of a syntax like Remotes: user/repo(opt1 = 1, opt2 = 2), user2/repo2. Matching parenthesis with regular expressions while possible with modern engines is tricky to get right. It would also complicate the initial dependency splitting code (to disambiguate commas within parenthesis and without).

So I think it is ok to leave passing additional options out for now, as I am not sure how much use they would get and the current approach is much simpler.

@@ -0,0 +1,86 @@
---
title: "Devtools Remote Dependencies"

This comment has been minimized.

@hadley

hadley Sep 2, 2015

Member

Could we just make this "Devtools dependencies" and include remote versions as a section?

Remotes: github|hadley/ggplot2
```
### Git ###

This comment has been minimized.

@hadley

hadley Sep 2, 2015

Member

I think these could just be a single bulleted list

R/deps.R Outdated
# remote type
dev_remote_type <- function(remotes) {
dev_packages <- trimws(unlist(strsplit(remotes, ",[[:space:]]*")))

This comment has been minimized.

@hadley

hadley Sep 2, 2015

Member

I think this will requiring a bump in the R version in DESCRIPTION?

This comment has been minimized.

@jimhester

jimhester Sep 2, 2015

Member

I forgot trimws() was only available in new R versions. We can replace it with trimws <- function(x, gsub("^[[:space:]]+|[[:space:]]+$", "", x)) so we don't have to bump the version.

This comment has been minimized.

@jimhester

jimhester Sep 2, 2015

Member

I think lengths is also only available in R-devel as well, I will use the appropriate vapply instead.

@hadley

This comment has been minimized.

Member

hadley commented Sep 2, 2015

This looks good overall and I'm happy with the name Remotes.

I agree that it's fine to leave out additional options for other endpoints for now.

Could you please add a few tests for badly formed remote specifications?

@jimhester

This comment has been minimized.

Member

jimhester commented Sep 2, 2015

Ok jimhester@ce16021 should be good to go, I think I addressed all of your comments. Thank you for the review!

R/deps.R Outdated
return()
}
dev_packages <- trimws(unlist(strsplit(remotes, ",[[:space:]]*")))

This comment has been minimized.

@hadley

hadley Sep 2, 2015

Member

You forgot to make this trim_ws

R/deps.R Outdated
pieces <- strsplit(dev_packages, "|", fixed = TRUE)
lengths <- vapply(pieces, length, integer(1))

This comment has been minimized.

@hadley

hadley Sep 2, 2015

Member

I feel like this might be simpler if you just iterated once over pieces:

parse_one <- function(parts) {
  if (length(x) == 1) {
    type <- "github"
    repo <- x
  } else if (length(x) == 2) {
    type <- x[1]
    repo <- x[2]
  } else {
    stop("Malformed...")
  }
  fun <- match.fun(paste0("install_", type))
}
@wch

This comment has been minimized.

Member

wch commented Sep 2, 2015

Some thoughts:

  • It would be good for release() to check for the Remotes field and issue a warning or possibly even an error.
  • I personally don't like having github be the default. I know that it's by far the most common source of packages, but still, I think the location of a package should be explicit.
  • I think a :: separator would be nicer than |. For example, github::hadley/ggplot2, local::/pkgs/testthat, svn::https://github.com/hadley/stringr.

Also, is there a way to specify private git/github/bitbucket/svn repos? I suppose one possibility is for users to install those packages separately. But it would be nice not to have to do that. (I've almost never had to install a private package so there may well be an obvious way.)

@jimhester

This comment has been minimized.

Member

jimhester commented Sep 2, 2015

  • Agreed on release() checking for Remotes, that is a good suggestion!
  • The package authors can choose to make it explicit if they would like in the current implementation. I personally prefer making this syntax as simple as possible for the common case, but if you and hadley prefer it to be explicit always it can be so.
  • I agree :: is nicer than |, and also is familiar to R users as a bonus. I am happy to switch to :: as long as we can be sure it won't occur in URLs. Looking at https://en.wikipedia.org/wiki/Uniform_resource_identifier it looks safe to me.

private repos can currently be accessed in a couple of ways

  1. Using the git remote (rather than github and the direct private git url) e.g.
    git|https://github.com/hadley/devtools.git. This should also work for the other git based services.
  2. Using the github remote and setting the GITHUB_PAT environment variable to your github personal access token. This could be done on travis using their encrypted environment variables for instance.
  3. For svn it caches your credentials by default so it assuming you have used svn on the private repo before on that machine it should work.
    That information should probably be included in the vignette as well.

That should handle a large number of the cases however 100% support would require us to define a syntax that would allow passing in additional arguments to the backends. Something like (using the suggested ::)

Remotes: 
  github::hadley/ggplot[host = "github.company.com", repos = "cran.company.com"], 
  github::hadley/testthat

The only real tricky part in this is disambiguate the commas within the options from those outside, as the remotes are split on commas first.

@hadley

This comment has been minimized.

Member

hadley commented Sep 2, 2015

I prefer to omit the github - I think the vast vast majority of packages use github so it's ok to be terse in that case. I don't think we need to deal with private repos now - we can see what happens as people use this tool more.

@jimhester

This comment has been minimized.

Member

jimhester commented Sep 2, 2015

Ok! @hadley any opinion on using :: as Winston suggested? I can change that, add a check to release() to remove the Remotes: and fix the conflicts, then it can be merged.

@gaborcsardi

This comment has been minimized.

Member

gaborcsardi commented Sep 2, 2015

How about calling it Repositories? I think it is a better name than Remotes, especially if some of them are local.

Or how about using (and abusing) the already used Additional_repositories field?

@gaborcsardi

This comment has been minimized.

Member

gaborcsardi commented Sep 2, 2015

This is a great PR, by the way!

@jimhester

This comment has been minimized.

Member

jimhester commented Sep 2, 2015

R CMD check already does some stuff with Additional_repositories (https://github.com/wch/r-source/blob/1b8396c/src/library/tools/R/QC.R#L6746-L6808), so I don't think you could use it without causing errors. Plus that field name always grates on my nerves with it's random underscore and lowercase, inconsistency is the only consistency in R-base!

I am not opposed to using Repositories apart from being a longer name, as it is slightly more accurate...

@hadley

This comment has been minimized.

Member

hadley commented Sep 2, 2015

I don't really like Repositories because technically those urls are not repositories (they're individual packages).

I do think I prefer :: to |

@gaborcsardi

This comment has been minimized.

Member

gaborcsardi commented Sep 2, 2015

@hadley Fair point, I was already thinking about extending this to repositories. 👍)

jimhester added some commits Aug 25, 2015

DevPackage Dependencies
This allows package authors to specify DevPackage in their DESCRIPTION,
which devtools will use to install development versions of the
dependencies prior to looking for them in normal repositories.  The
format is 'type|user/repo', where type is optional (default github)
defining the install_type to use, e.g. (git, gitorious, svn, ect.).
Refactor remote dependency installation and add tests
1. Use Remotes as DESCRIPTION field
2. Rename internal functions to dev_remote_type, which does parsing and
function lookup
Need to use `[[` accessor rather than `$`
The latter partially matches remotesha.
@jimhester

This comment has been minimized.

Member

jimhester commented Sep 2, 2015

As of 2e9589b everything uses :: now, I added a check for Remotes: to release() and I rebased on eda31f7

hadley added a commit that referenced this pull request Sep 2, 2015

@hadley hadley merged commit 7211ff8 into r-lib:master Sep 2, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
@hadley

This comment has been minimized.

Member

hadley commented Sep 2, 2015

Thanks - this is awesome :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment