"path must exist" with pak::lockfile_install() on archived/old packages #425

Closed
mcanouil opened this issue Oct 14, 2022 · 8 comments
Labels
bug an unexpected problem or unintended behavior

Comments

@mcanouil

mcanouil commented Oct 14, 2022

Hi,

I have been facing an issue when restoring a lock file with pak when the version is no longer the latest available on CRAN.

For example, a lock file generated with pak for curl 4.3.2
{
  "lockfile_version": 1,
  "os": "Debian GNU/Linux 11 (bullseye)",
  "r_version": "R version 4.2.0 (2022-04-22)",
  "platform": "x86_64-pc-linux-gnu",
  "packages": [
    {
      "ref": "curl",
      "package": "curl",
      "version": "4.3.2",
      "type": "standard",
      "direct": false,
      "binary": false,
      "dependencies": [],
      "vignettes": false,
      "metadata": {
        "RemotePkgRef": "curl",
        "RemoteType": "standard",
        "RemoteRef": "curl",
        "RemoteRepos": "https://cloud.r-project.org",
        "RemotePkgPlatform": "source",
        "RemoteSha": "4.3.2"
      },
      "sources": ["https://cloud.r-project.org/src/contrib/curl_4.3.2.tar.gz", "https://cloud.r-project.org/src/contrib/Archive/curl_4.3.2.tar.gz"],
      "target": "src/contrib/curl_4.3.2.tar.gz",
      "platform": "source",
      "rversion": "*",
      "directpkg": false,
      "license": "MIT + file LICENSE",
      "dep_types": ["Depends", "Imports", "LinkingTo"],
      "params": [],
      "install_args": "",
      "needscompilation": true,
      "sha256": "90b1facb4be8b6315bb3d272ba2dd90b88973f6ea1ab7f439550230f8500a568",
      "filesize": 793345,
      "repotype": "cran"
    }
  ]
}
A Dockerfile (based on Debian 11), built using docker buildx build --platform "linux/amd64" --file Dockerfile .
ARG VARIANT="bullseye"
FROM --platform=linux/amd64 buildpack-deps:${VARIANT}-curl

ARG RIG_VERSION="latest"
ARG R_VERSION="4.2.0"
COPY pkg.lock pkg.lock
RUN wget -q -P /tmp/ "https://github.com/r-lib/rig/releases/download/${RIG_VERSION}/rig-linux-${RIG_VERSION#v}.tar.gz" \
    && tar -C /usr/local -zxvf /tmp/rig-linux-${RIG_VERSION#v}.tar.gz \
    && rig add ${R_VERSION}
COPY inst/setup/curl.lock pkg.lock
RUN Rscript -e "pak::lockfile_install(update = TRUE)"
The produced log
[+] Building 9.2s (9/9) FINISHED                                                                                                                                
 => [internal] load build definition from Dockerfile_curl                                                                                                  0.0s
 => => transferring dockerfile: 518B                                                                                                                       0.0s
 => [internal] load .dockerignore                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                            0.0s
 => [internal] load metadata for docker.io/library/buildpack-deps:bullseye-curl                                                                            1.1s
 => [1/5] FROM docker.io/library/buildpack-deps:bullseye-curl@sha256:126f5c7e09be8d7b1e64a6f07e164ac91ea8b7f9980edb7f4f14c3c306ec6011                      0.0s
 => [internal] load build context                                                                                                                          0.0s
 => => transferring context: 1.36kB                                                                                                                        0.0s
 => CACHED [2/5] COPY inst/setup/pkg.lock pkg.lock                                                                                                         0.0s
 => CACHED [3/5] RUN wget -q -P /tmp/ "https://github.com/r-lib/rig/releases/download/${RIG_VERSION}/rig-linux-${RIG_VERSION#v}.tar.gz"     && tar -C /us  0.0s
 => [4/5] COPY inst/setup/curl.lock pkg.lock                                                                                                               0.0s
 => ERROR [5/5] RUN Rscript -e "pak::lockfile_install(update = TRUE)"                                                                                      8.0s
------                                                                                                                                                          
 > [5/5] RUN Rscript -e "pak::lockfile_install(update = TRUE)":                                                                                                 
#0 5.268 i Installing lockfile 'pkg.lock'                                                                                                                       
#0 6.176                                                                                                                                                        
#0 6.213 > Will install 1 package.                                                                                                                              
#0 6.481 > Will download 1 CRAN package (793.35 kB).                                                                                                            
#0 6.509 + curl   4.3.2 [bld][cmp][dl] (793.35 kB)
#0 6.600 i Getting 1 pkg (793.35 kB)
#0 7.398 x Failed to download curl 4.3.2 (source)
#0 7.557 i Building curl 4.3.2
#0 7.874 
#0 7.874 Error: <callr_remote_error: `path` must exist>
#0 7.878  in process 49 
#0 7.878 -->
#0 7.878 <simpleError: `path` must exist>
#0 7.894 
#0 7.894  Stack trace:
#0 7.894 
#0 7.894  12. (function (...)  ...
#0 7.894  13. base:::withCallingHandlers(cli_message = function(msg) { ...
#0 7.894  14. get("lockfile_install_internal", asNamespace("pak"))(...)
#0 7.894  15. plan$install()
#0 7.894  16. pkgdepends:::install_package_plan(plan, lib = private$library,  ...
#0 7.894  17. base:::withCallingHandlers({ ...
#0 7.894  18. pkgdepends:::start_task(state, task)
#0 7.894  19. pkgdepends:::start_task_build(state, task)
#0 7.894  20. pkgdepends:::make_build_process(path, pkg, tmp_dir, lib, vignettes,  ...
#0 7.894  21. withr::with_libpaths(c(tmplib, lib), action = "prefix", pkgbuild_process$ne ...
#0 7.894  22. base:::force(code)
#0 7.894  23. pkgbuild_process$new(path, tmp_dir, binary = binary, vignettes = vignettes, ...
#0 7.894  24. pkgbuild:::initialize(...)
#0 7.894  25. pkgbuild:::rcb_init(self, private, super, path, dest_path, binary,  ...
#0 7.894  26. pkgbuild:::build_setup(path, dest_path, binary, vignettes, manual,  ...
#0 7.894  27. base:::stop("`path` must exist", call. = FALSE)
#0 7.894  28. base:::.handleSimpleError(function (e)  ...
#0 7.894  29. h(simpleError(msg, call))
#0 7.895  30. base:::stop(e)
#0 7.895  31. (function (e)  ...
#0 7.895 
#0 7.895  x `path` must exist 
#0 7.895 
#0 7.897 Execution halted
------
ERROR: failed to solve: executor failed running [/bin/sh -c Rscript -e "pak::lockfile_install(update = TRUE)"]: exit code: 1

Right now, the latest version of curl is 4.3.3 on CRAN.
If the lock file is pointing to the CRAN version, it is restored properly.

curl 4.3.3 lock file
{
  "lockfile_version": 1,
  "os": "Debian GNU/Linux 11 (bullseye)",
  "r_version": "R version 4.2.0 (2022-04-22)",
  "platform": "x86_64-pc-linux-gnu",
  "packages": [
    {
      "ref": "curl",
      "package": "curl",
      "version": "4.3.3",
      "type": "standard",
      "direct": true,
      "binary": false,
      "dependencies": [],
      "vignettes": false,
      "needscompilation": true,
      "metadata": {
        "RemoteType": "standard",
        "RemotePkgRef": "curl",
        "RemoteRef": "curl",
        "RemoteRepos": "https://cloud.r-project.org",
        "RemotePkgPlatform": "source",
        "RemoteSha": "4.3.3"
      },
      "sources": ["https://cloud.r-project.org/src/contrib/curl_4.3.3.tar.gz", "https://cloud.r-project.org/src/contrib/Archive/curl_4.3.3.tar.gz"],
      "target": "src/contrib/curl_4.3.3.tar.gz",
      "platform": "source",
      "rversion": "*",
      "directpkg": true,
      "license": "MIT + file LICENSE",
      "sha256": "3567b6acad40dad68acfe07511c853824839d451a50219a96dd6d125ed617c9e",
      "filesize": 670416,
      "dep_types": ["Depends", "Imports", "LinkingTo", "Suggests"],
      "params": [],
      "install_args": "",
      "repotype": "cran"
    }
  ]
}
Docker build log from the same Dockerfile as above
[+] Building 45.4s (10/10) FINISHED                                                                                                                             
 => [internal] load build definition from Dockerfile_curl                                                                                                  0.1s
 => => transferring dockerfile: 37B                                                                                                                        0.0s
 => [internal] load .dockerignore                                                                                                                          0.0s
 => => transferring context: 2B                                                                                                                            0.0s
 => [internal] load metadata for docker.io/library/buildpack-deps:bullseye-curl                                                                            1.2s
 => [internal] load build context                                                                                                                          0.0s
 => => transferring context: 1.37kB                                                                                                                        0.0s
 => [1/5] FROM docker.io/library/buildpack-deps:bullseye-curl@sha256:126f5c7e09be8d7b1e64a6f07e164ac91ea8b7f9980edb7f4f14c3c306ec6011                      0.0s
 => CACHED [2/5] COPY inst/setup/pkg.lock pkg.lock                                                                                                         0.0s
 => CACHED [3/5] RUN wget -q -P /tmp/ "https://github.com/r-lib/rig/releases/download/${RIG_VERSION}/rig-linux-${RIG_VERSION#v}.tar.gz"     && tar -C /us  0.0s
 => [4/5] COPY inst/setup/curl.lock pkg.lock                                                                                                               0.0s
 => [5/5] RUN Rscript -e "pak::lockfile_install(update = TRUE)"                                                                                           41.9s
 => exporting to image                                                                                                                                     2.1s 
 => => exporting layers                                                                                                                                    2.1s 
 => => writing image sha256:b6dcc0d3c9d22c040f2d4f0c5e0b200b7164d96e4a37869c25d506217cdd66f3 

The original issue was encountered with https://github.com/mcanouil/eggla/tree/main/inst/setup using the following GitHub Action workflow https://github.com/mcanouil/eggla/blob/main/.github/workflows/build-docker.yml.

@mcanouil mcanouil changed the title from "path must exist" when pak::lockfile_install(update = TRUE) to "path must exist" with pak::lockfile_install() on archived/old packages Oct 14, 2022
@gaborcsardi
Member

From the manual:

Note, since the URLs of CRAN and most CRAN-like repositories change over time, in practice you cannot use the lock file much later. For example, binary packages of older package version might be deleted from the repository, breaking the URLs in the lock file.

Currently the intended use case of lock files is on CI systems, to facilitate caching. The (hash of the) lock file provides a good key for caching systems.

Unfortunately you cannot reliably use pak's lock files like this currently.
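
To illustrate the caching use case mentioned above, here is a minimal sketch in base R that derives a cache key from the contents of a lock file. The helper name `lockfile_cache_key()` and the `r-pkgs-` prefix are hypothetical, and MD5 is used only because `tools::md5sum()` ships with R; a CI system may well compute its own hash instead.

```r
# Sketch: derive a cache key from the contents of a lock file.
# `lockfile_cache_key()` and the "r-pkgs-" prefix are hypothetical names.
lockfile_cache_key <- function(path, prefix = "r-pkgs-") {
  stopifnot(file.exists(path))
  # tools::md5sum() returns a named character vector; drop the names.
  paste0(prefix, unname(tools::md5sum(path)))
}

# Demo on a temporary file standing in for pkg.lock.
lock <- tempfile(fileext = ".lock")
writeLines('{"lockfile_version": 1, "packages": []}', lock)
key1 <- lockfile_cache_key(lock)

# Any change to the lock file yields a different key,
# so a stale cache is not reused.
writeLines('{"lockfile_version": 1, "packages": [{"ref": "curl"}]}', lock)
key2 <- lockfile_cache_key(lock)
```

The key only changes when the lock file changes, which is what makes it usable for cache invalidation.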

@mcanouil
Author

Oh, I missed that part. 🤦‍♂️
Any suggestion to achieve the same thing?

  • Maybe renv+pak as installer 🤔
  • Does pak work with MRAN? 🤔

@gaborcsardi
Member

OTOH, this specific case should work fine, and it is indeed a bug in pak/pkgdepends: the alternative URL is wrong, instead of

https://cloud.r-project.org/src/contrib/Archive/curl_4.3.2.tar.gz

it should be

https://cloud.r-project.org/src/contrib/Archive/curl/curl_4.3.2.tar.gz
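
The difference between the two URLs is the per-package subdirectory under Archive/. A minimal base R sketch of that layout is shown here; `cran_archive_url()` is a hypothetical helper written for illustration and is not a function from pak, pkgdepends, or pkgcache.

```r
# Sketch: build the CRAN archive URL for a given package version.
# `cran_archive_url()` is a hypothetical helper, not part of pak or pkgcache.
cran_archive_url <- function(package, version,
                             repo = "https://cloud.r-project.org") {
  # Archived sources live in a subdirectory named after the package:
  # <repo>/src/contrib/Archive/<package>/<package>_<version>.tar.gz
  sprintf("%s/src/contrib/Archive/%s/%s_%s.tar.gz",
          repo, package, package, version)
}

url <- cran_archive_url("curl", "4.3.2")
print(url)
```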

@gaborcsardi gaborcsardi added the bug an unexpected problem or unintended behavior label Oct 14, 2022
mcanouil added a commit to mcanouil/eggla that referenced this issue Oct 14, 2022
@gaborcsardi
Member

gaborcsardi commented Oct 14, 2022

Yes, I think we could document the cases when lock files are reliable.

E.g. if you only use CRAN source packages, then they should work fine.

An MRAN snapshot should also be fine, as that never changes.

It is possible that RSPM snapshots also work, and packages from Bioc could be fine as well (possibly even their binaries), but I would need to double check that.

@mcanouil
Copy link
Author

Thanks, I think the safest option would be to set MRAN as the default repository in the Dockerfile and document it on the website for my current use case (when most of the development is over).
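
As a rough sketch of what pinning the default repository to a snapshot could look like in R, the snippet below sets the `repos` option to a dated snapshot URL. The snapshot date is an illustrative assumption, not a value from this thread, and the code only sets the option; it does not check that the snapshot is reachable.

```r
# Sketch: point the default CRAN repository at a dated snapshot.
# The date below is an illustrative assumption; pick the one you need.
snapshot_date <- "2022-10-14"
snapshot_repo <- paste0("https://mran.microsoft.com/snapshot/", snapshot_date)

# Packages resolved after this call come from the snapshot, not live CRAN.
options(repos = c(CRAN = snapshot_repo))
```

In a Dockerfile this would presumably be placed in a site-wide R profile or passed to Rscript before the lock file is created.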

gaborcsardi added a commit to r-lib/pkgcache that referenced this issue Oct 14, 2022
gaborcsardi added a commit that referenced this issue Oct 14, 2022
@gaborcsardi
Member

This is now fixed in dev pkgcache and will be fixed in the next dev pak builds. I'll comment here when they are available.

@gaborcsardi
Member

This should be available now, except on arm64 platforms.

@mcanouil
Author

Thanks, for now I simply manually updated the archive URLs.
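
For anyone applying the same workaround, the manual edit could be scripted roughly as follows. This is a sketch that treats the lock file as plain text; `fix_archive_urls()` is a hypothetical helper, and the demo uses a temporary file holding one `sources` line rather than a real pkg.lock.

```r
# Sketch: rewrite wrong archive URLs in a lock file, treating it as plain text.
# `fix_archive_urls()` is a hypothetical helper written for this example.
fix_archive_urls <- function(lines) {
  # ".../Archive/<pkg>_<ver>.tar.gz" -> ".../Archive/<pkg>/<pkg>_<ver>.tar.gz"
  # URLs that already contain the <pkg>/ subdirectory do not match.
  gsub("/src/contrib/Archive/([^/_\"]+)_",
       "/src/contrib/Archive/\\1/\\1_", lines)
}

# Demo on a temporary file containing a "sources" entry like the one above.
lock <- tempfile(fileext = ".lock")
writeLines(
  paste0('"sources": ["https://cloud.r-project.org/src/contrib/curl_4.3.2.tar.gz", ',
         '"https://cloud.r-project.org/src/contrib/Archive/curl_4.3.2.tar.gz"],'),
  lock
)
fixed <- fix_archive_urls(readLines(lock))
writeLines(fixed, lock)
```

Running the helper twice leaves the file unchanged, so it should be safe to apply to a lock file that has already been corrected.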

mcanouil added a commit to mcanouil/eggla that referenced this issue Dec 5, 2022
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Dec 18, 2022
# pkgcache 2.0.3

* The `built` and `sysreqs` columns of the metadata cache are always
  character vectors now, and not logicals, as they used to be in some
  edge cases in the past.

* The `deps` column of the metadata cache is not a tibble any more,
  but a data frame with a `tbl` class, as it should be.

* `cran_archive_*()` functions now only download the metadata if it is newer
  than what you have currently.

* `cran_archive_cleanup()` now does not ignore the `force` argument.

* The `sources` column in the metadata cache now has the correct URL for
  packages in the CRAN archive (r-lib/pak#425).

# pkgcache 2.0.2

* pkgcache error messages are better now.

* pkgcache now does not compress the metadata cache files, which makes
  loading the metadata cache faster.