package dependencies and interdependencies #11

Closed
slager opened this issue Feb 22, 2023 · 15 comments

@slager
Contributor

slager commented Feb 22, 2023

I ran into some continued installation issues, which I think I've figured out how to resolve. A few observations:

  1. BirdFlowModels currently imports BirdFlowR, which seems unnecessary since it's just a data object
  2. the BirdFlowR vignette currently tries to load BirdFlowModels at build time, and errors out if the package doesn't exist yet
  3. If you install BirdFlowModels first to circumvent number 2, the current behavior is that it also installs its dependency BirdFlowR, but doesn't build the BirdFlowR vignette.
  4. (I think) BirdFlowR requires the package rnaturalearthdata, but it's only in Suggests, so it's not installed by default, which also throws an error during the vignette build.

Overall, these things make the installation rather hard to control/troubleshoot.

A proposed solution:

  • In the BirdFlowR DESCRIPTION, move rnaturalearthdata from Suggests to Imports
  • In the BirdFlowModels DESCRIPTION, remove BirdFlowR from Depends and remove the entire Remotes section
  • In the BirdFlowModels NAMESPACE, remove import(BirdFlowR)
  • In the README.md for BirdFlowR, flip the order of the install_github lines so that BirdFlowModels is installed before BirdFlowR & its vignette.

When I made these changes, everything installed clean for me in a fresh Docker container. Also happy to submit any of this or upcoming suggestions as a PR(s) if that's easier.

@ethanplunkett
Contributor

Thanks! That all makes sense. Building the two side by side, I hadn't encountered this directly. It has also all been working in the automated checks on GitHub, which should be installing both packages fresh, but since the co-dependence adds roadblocks to installation it makes sense to simplify.

I'm not certain the BirdFlowModels package will pass R CMD check if it doesn't import or depend on BirdFlowR, which defines the class of its object; but as you say there's no code there, and with lazy loading I'm not sure that running R CMD check will ever even require the object to be in memory. I guess I'll find out quickly if it doesn't pass the check.

@slager
Contributor Author

slager commented Feb 22, 2023

Sounds good. My current understanding is that the .rda file contains all the necessary class information inside it natively for it to be loaded as an object into any R session, and that the BirdFlowR functions won't actually be needed until a method is called on the amewoo object. I don't see anything in the BirdFlowModels package that would necessitate importing BirdFlowR, but if it does fail a check because of this, I'm happy to be corrected and interested in learning why.

@ethanplunkett
Contributor

Just tried it. Apparently (1) R CMD check does test-load the object, (2) if it's an S3 class you don't need to have the package that exports the class loaded, and (3) for S4 classes you do need to import or depend on the package that defines the class. So dropping the dependency on BirdFlowR initially broke BirdFlowModels, until I added Matrix to the Imports field and imported the Matrix classes from it, because BirdFlowR imports Matrix and uses it to store the sparse marginal matrices. In any case, BirdFlowModels is passing the check now without importing or depending on BirdFlowR.
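
The S3/S4 distinction described above can be illustrated with a small sketch. It assumes the Matrix package is installed (it ships with standard R distributions) and uses a made-up S3 class, toy_model, purely for illustration; it is not the actual BirdFlow class.

```r
# A sparse matrix from Matrix is an S4 object: its class definition
# lives in the Matrix package, so that package must be available
# (imported or depended on) wherever the object is loaded and checked.
m <- Matrix::sparseMatrix(i = c(1, 2), j = c(1, 2), x = c(0.5, 0.5))
is_s4_matrix <- isS4(m)

# The class attribute of an S4 object records the defining package.
defining_pkg <- attr(class(m), "package")

# An S3 object is just a base R structure with a class attribute.
# "toy_model" is a hypothetical class name used only for this sketch.
s3_obj <- structure(list(marginal = m), class = "toy_model")
is_s4_wrapper <- isS4(s3_obj)

print(is_s4_matrix)
print(defining_pkg)
print(is_s4_wrapper)
```

This would be consistent with what the check reported: an S3 wrapper loads without the package that defines it, while any S4 objects stored inside it still need their defining package.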

@ethanplunkett
Contributor

A related question: I've been debating moving terra from Imports to Depends. The difference would be that terra functions would be automatically attached to the search path when BirdFlowR is loaded. Generally Depends is frowned on, as it can produce name conflicts and because it's the older way of doing things. That said, if you expect your package's users to be using the functions in the other package, then Depends is recommended. Currently, if users don't load terra explicitly, it's possible to produce a SpatRaster object with BirdFlow and then get an error when you try to plot it:

library(BirdFlowR)
library(BirdFlowModels)
a <- rast(amewoo, 1)  # works, producing a SpatRaster 
plot(a)  # Throws error "invalid type passed to graphics function" because terra's plot method isn't on the search path

ethanplunkett added a commit to birdflow-science/BirdFlowModels that referenced this issue Feb 22, 2023
@slager
Contributor Author

slager commented Feb 22, 2023

Nice (re: getting the packages to install separately and pass the checks)!

For Imports vs. Depends, I am a fan of using Imports, and then using terra::plot(a) to disambiguate in this case. Having examples of that and a note in the documentation could clue people in, or, alternatively, BirdFlowR could export a small wrapper function to accomplish this. I'm not a fan of putting terra into Depends, because then a much larger number of functions from terra will be attached to the search path, causing other potential issues.
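
A minimal sketch of the wrapper idea, assuming terra stays in Imports. The function name plot_raster is hypothetical and is not an existing BirdFlowR export.

```r
# Hypothetical wrapper: plot_raster is an assumed name, not part of BirdFlowR.
# It calls terra's plot explicitly, so it does not rely on terra being
# attached to the user's search path.
plot_raster <- function(x, ...) {
  terra::plot(x, ...)
}
```

With something like this exported, users could call plot_raster(a) after library(BirdFlowR) alone; calling terra::plot(a) directly achieves the same thing without any new function.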

@slager
Contributor Author

slager commented Feb 22, 2023

With the new commit I'm still experiencing this error during vignette build. I'm not entirely sure what's going on, but when I put rnaturalearthdata into Imports for BirdFlowR, it resolved this issue. Maybe that's how to fix it or maybe there's another way?

── R CMD build ───────────────────────────────────────────────────────────────────────
✔  checking for file ‘/tmp/RtmpQTPqAu/remotes1232cbac44/birdflow-science-BirdFlowR-4f69a19/DESCRIPTION’ ...
─  preparing ‘BirdFlowR’:
✔  checking DESCRIPTION meta-information ...
─  installing the package to build vignettes
E  creating vignettes (37.2s)
   --- re-building ‘BirdFlowR.Rmd’ using rmarkdown
   Quitting from lines 153-157 (BirdFlowR.Rmd) 
   Error: processing vignette 'BirdFlowR.Rmd' failed with diagnostics:
   Failed to install the rnaturalearthdata package.
    Please try installing the package for yourself using the following command: 
    install.packages("rnaturalearthdata")
   --- failed re-building ‘BirdFlowR.Rmd’
   
   SUMMARY: processing the following file failed:
     ‘BirdFlowR.Rmd’
   
   Error: Vignette re-building failed.
   Execution halted
Error: Failed to install 'BirdFlowR' from GitHub:
  ! System command 'R' failed

@ethanplunkett
Contributor

rnaturalearth is weird. It suggests rnaturalearthdata but doesn't depend on or import it. Then the functions that need rnaturalearthdata run a check to see if it's installed and attempt to install it if it isn't. I think that's the install that's failing for you in Docker.

The Writing R Extensions manual states that packages needed only for examples and vignettes should be under Suggests, not Imports, to make lean installs possible. I suspect that CRAN wouldn't let rnaturalearth include their own large data package in Imports and that's why they have it in Suggests. All this makes me hesitant to put it in BirdFlowR's Imports field. Tomorrow I'll add code to install the rnaturalearthdata package to the beginning of the vignette and see if that helps.
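
For a package that stays in Suggests, one common pattern is to test for it before the code that needs it. The sketch below is an alternative to installing from inside the vignette, not what the vignette currently does; the variable name has_ne_data is an assumption made for illustration.

```r
# Check once, near the top of the vignette, whether the suggested
# data package is available. requireNamespace() returns TRUE or FALSE
# and does not attach the package or attempt an install.
has_ne_data <- requireNamespace("rnaturalearthdata", quietly = TRUE)

if (has_ne_data) {
  message("rnaturalearthdata found; map chunks can run.")
} else {
  message("rnaturalearthdata not installed; skipping map chunks.")
}
```

In an R Markdown vignette the same flag could be passed to the knitr chunk option eval (for example eval = has_ne_data) so that chunks depending on the data package are skipped rather than failing the build.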

ethanplunkett added a commit that referenced this issue Feb 22, 2023
…. Data package is now installed before RBirdFlow in readme and vignette.
@ethanplunkett ethanplunkett self-assigned this Feb 22, 2023
@slager
Contributor Author

slager commented Feb 23, 2023

Even with the new commit I'm still getting this error during the vignette build:

── R CMD build ───────────────────────────────────────────────────────────────────────
✔  checking for file ‘/tmp/Rtmpj1HJ89/remotes1237e4da47a/birdflow-science-BirdFlowR-365b1bf/DESCRIPTION’ (347ms)
─  preparing ‘BirdFlowR’:
✔  checking DESCRIPTION meta-information ...
─  installing the package to build vignettes
E  creating vignettes (49.4s)
   --- re-building ‘BirdFlowR.Rmd’ using rmarkdown
   Quitting from lines 156-160 (BirdFlowR.Rmd) 
   Error: processing vignette 'BirdFlowR.Rmd' failed with diagnostics:
   Failed to install the rnaturalearthdata package.
    Please try installing the package for yourself using the following command: 
    install.packages("rnaturalearthdata")
   --- failed re-building ‘BirdFlowR.Rmd’
   
   SUMMARY: processing the following file failed:
     ‘BirdFlowR.Rmd’
   
   Error: Vignette re-building failed.
   Execution halted
Error: Failed to install 'BirdFlowR' from GitHub:
  ! System command 'R' failed

@ethanplunkett
Contributor

ethanplunkett commented Feb 23, 2023

Last night I did some testing and devtools::install_cran() was failing to install rnaturalearthdata with a warning: "package ‘rnaturalearthdata’ is not available for this version of R"; but utils::install.packages() worked fine. This morning both functions can install it. Both functions use getOption("repos") to determine which CRAN mirror to use, set to "https://cran.rstudio.com/" on my system, so it's not a question of different mirrors. Last night my theory was that the devtools function had an excessively stringent requirement about what version of R the package was built under, and thus wasn't considering the binary as appropriate. The rnaturalearthdata package on CRAN has never been updated. This is by design: the intent is that the data should never change, so it shouldn't need to be re-downloaded often. It is probably also not rebuilt often, but I'm not certain of that.

Both last night and this morning, available.packages(ignore_repo_cache = TRUE) returned the same version of rnaturalearthdata and the MD5sum hasn't changed, so I don't think the change in behavior I'm seeing is from an updated package build on the repository. I'm baffled. Possibly this large package intermittently fails to install? The CRAN check on GitHub hasn't had problems, though, and the behavior last night was consistent over multiple trials.

I am going to switch to the standard install.packages() in the vignette, as maybe it's more robust.
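
The repository comparison described above could be scripted roughly as follows. The helper name repo_record is hypothetical; it only wraps utils::available.packages() and reads columns that function returns by default.

```r
# Hypothetical helper: report the version, checksum, and repository
# that a package would currently be installed from.
# The default for `ap` queries the configured repositories, bypassing
# the cached index; a pre-fetched matrix can be passed in instead.
repo_record <- function(pkg,
                        ap = utils::available.packages(ignore_repo_cache = TRUE)) {
  if (!pkg %in% rownames(ap)) {
    return(NULL)  # package not offered by the configured repositories
  }
  ap[pkg, c("Version", "MD5sum", "Repository")]
}
```

Running getOption("repos") and repo_record("rnaturalearthdata") at two different times and comparing the results would show whether the mirror or the package build changed in between.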

@slager
Contributor Author

slager commented Feb 23, 2023

Weird!

@slager
Contributor Author

slager commented Feb 24, 2023

Hmmm, looks like rnaturalearthdata still isn't installing properly during the vignette build. This works fine when I call the same install command interactively immediately afterwards, as you can see below, so it must be something about how the vignette build is handling it differently.

> devtools::install_github("birdflow-science/BirdFlowR", build_vignettes = TRUE, upgrade = 'always')
Downloading GitHub repo birdflow-science/BirdFlowR@HEAD
e1071        (1.7-12 -> 1.7-13) [CRAN]
rnaturale... (NA     -> 0.3.2 ) [CRAN]
lubridate    (1.9.1  -> 1.9.2 ) [CRAN]
Installing 3 packages: e1071, rnaturalearth, lubridate
Installing packages into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/e1071_1.7-13.tar.gz'
Content type 'binary/octet-stream' length 576353 bytes (562 KB)
==================================================
downloaded 562 KB

trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/rnaturalearth_0.3.2.tar.gz'
Content type 'binary/octet-stream' length 640761 bytes (625 KB)
==================================================
downloaded 625 KB

trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/lubridate_1.9.2.tar.gz'
Content type 'binary/octet-stream' length 962315 bytes (939 KB)
==================================================
downloaded 939 KB

* installing *binary* package ‘e1071’ ...
* DONE (e1071)
* installing *binary* package ‘rnaturalearth’ ...
* DONE (rnaturalearth)
* installing *binary* package ‘lubridate’ ...
* DONE (lubridate)

The downloaded source packages are in
	‘/tmp/RtmpM0xauo/downloaded_packages’
── R CMD build ───────────────────────────────────────────────────────────────────────
✔  checking for file ‘/tmp/RtmpM0xauo/remotes1235922f162/birdflow-science-BirdFlowR-5ab946f/DESCRIPTION’ (801ms)
─  preparing ‘BirdFlowR’:
✔  checking DESCRIPTION meta-information ...
─  installing the package to build vignettes (483ms)
E  creating vignettes (48.3s)
   --- re-building ‘BirdFlowR.Rmd’ using rmarkdown
   Quitting from lines 156-160 (BirdFlowR.Rmd) 
   Error: processing vignette 'BirdFlowR.Rmd' failed with diagnostics:
   Failed to install the rnaturalearthdata package.
    Please try installing the package for yourself using the following command: 
    install.packages("rnaturalearthdata")
   --- failed re-building ‘BirdFlowR.Rmd’
   
   SUMMARY: processing the following file failed:
     ‘BirdFlowR.Rmd’
   
   Error: Vignette re-building failed.
   Execution halted
Error: Failed to install 'BirdFlowR' from GitHub:
  ! System command 'R' failed
> install.packages('rnaturalearthdata')
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/rnaturalearthdata_0.1.0.tar.gz'
Content type 'binary/octet-stream' length 3234253 bytes (3.1 MB)
==================================================
downloaded 3.1 MB

* installing *binary* package ‘rnaturalearthdata’ ...
* DONE (rnaturalearthdata)

The downloaded source packages are in
	‘/tmp/RtmpM0xauo/downloaded_packages’

@ethanplunkett
Contributor

I was curious how this is working on GitHub Actions for BirdFlowR.

Our action calls this action to install the packages:

Which uses pak (not devtools!) to install packages.

Maybe try this in docker:

install.packages("pak")
pak::pkg_install("github::birdflow-science/BirdFlowModels", ask = FALSE, dependencies = TRUE)

That should install both the formal dependencies (Depends, Imports) and the packages that are under Suggests. The pak manual doesn't include the word vignette anywhere, and a test shows that it doesn't build vignettes, so if we really want a vignette we could follow up with a call to devtools::install_github("birdflow-science/BirdFlowR"), or use pak for rnaturalearthdata and then install BirdFlowR with devtools. Both of these options feel like kludges. Once I've set up pkgdown, the vignettes will be rendered on the website and it may matter less that they aren't installed locally.

@ethanplunkett
Contributor

Or maybe the install instructions if you want to build vignettes should be:

install.packages("devtools")  
install.packages("rnaturalearthdata")
devtools::install_github("birdflow-science/BirdFlowModels")  # data package
devtools::install_github("birdflow-science/BirdFlowR", build_vignettes = TRUE)

@slager
Contributor Author

slager commented Feb 24, 2023

Agreed that it might be easier at this point just to provide separate instructions for building the vignette rather than continue troubleshooting rnaturalearthdata installation.

The steps below worked well for me to build the packages and vignette from a clean rocker/geospatial:4.2.2 container.

(We could consider using remotes:: instead of devtools:: in the instructions, since it will sometimes require end users to install/compile fewer things.)

https://devtools.r-lib.org/#conscious-uncoupling

> library(remotes)
> install.packages("rnaturalearthdata")
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/rnaturalearthdata_0.1.0.tar.gz'
Content type 'binary/octet-stream' length 3234253 bytes (3.1 MB)
==================================================
downloaded 3.1 MB

* installing *binary* package ‘rnaturalearthdata’ ...
* DONE (rnaturalearthdata)

The downloaded source packages are in
	‘/tmp/Rtmpqq3hNn/downloaded_packages’
> 
> remotes::install_github("birdflow-science/BirdFlowModels")  # data package
Downloading GitHub repo birdflow-science/BirdFlowModels@HEAD
── R CMD build ───────────────────────────────────────────────────────────────────────
✔  checking for file ‘/tmp/Rtmpqq3hNn/remotes123511cf7b6/birdflow-science-BirdFlowModels-b11d55e/DESCRIPTION’ (513ms)
─  preparing ‘BirdFlowModels’:
✔  checking DESCRIPTION meta-information ...
─  checking for LF line-endings in source and make files and shell scripts
─  checking for empty or unneeded directories
─  building ‘BirdFlowModels_0.0.0.9002.tar.gz’
   
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
* installing *source* package ‘BirdFlowModels’ ...
** using staged installation
** R
** data
*** moving datasets to lazyload DB
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (BirdFlowModels)
> remotes::install_github("birdflow-science/BirdFlowR", build_vignettes = TRUE, upgrade = 'always')
Downloading GitHub repo birdflow-science/BirdFlowR@HEAD
e1071        (1.7-12 -> 1.7-13) [CRAN]
rnaturale... (NA     -> 0.3.2 ) [CRAN]
lubridate    (1.9.1  -> 1.9.2 ) [CRAN]
Installing 3 packages: e1071, rnaturalearth, lubridate
Installing packages into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/e1071_1.7-13.tar.gz'
Content type 'binary/octet-stream' length 576353 bytes (562 KB)
==================================================
downloaded 562 KB

trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/rnaturalearth_0.3.2.tar.gz'
Content type 'binary/octet-stream' length 640761 bytes (625 KB)
==================================================
downloaded 625 KB

trying URL 'https://packagemanager.posit.co/cran/__linux__/jammy/latest/src/contrib/lubridate_1.9.2.tar.gz'
Content type 'binary/octet-stream' length 962315 bytes (939 KB)
==================================================
downloaded 939 KB

* installing *binary* package ‘e1071’ ...
* DONE (e1071)
* installing *binary* package ‘rnaturalearth’ ...
* DONE (rnaturalearth)
* installing *binary* package ‘lubridate’ ...
* DONE (lubridate)

The downloaded source packages are in
	‘/tmp/Rtmpqq3hNn/downloaded_packages’
── R CMD build ───────────────────────────────────────────────────────────────────────
✔  checking for file ‘/tmp/Rtmpqq3hNn/remotes1237e44f353/birdflow-science-BirdFlowR-b63fbf4/DESCRIPTION’ (686ms)
─  preparing ‘BirdFlowR’:
✔  checking DESCRIPTION meta-information
─  installing the package to build vignettes (535ms)
✔  creating vignettes (1m 52.7s)
─  checking for LF line-endings in source and make files and shell scripts (437ms)
─  checking for empty or unneeded directories
─  building ‘BirdFlowR_0.0.0.9002.tar.gz’
   
Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)
* installing *source* package ‘BirdFlowR’ ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (BirdFlowR)

@ethanplunkett
Contributor

Docker file added with: 3885c69 (instructions)
Updated README installation instructions: 964a62f
