Cannot access files in inst/ #1

Open
tiernanmartin opened this issue Jul 27, 2018 · 6 comments


@tiernanmartin
Owner

While working within the drakepkg directory itself, it would be nice if I could make() plans that include filepaths in the inst/ directory.

That doesn't work at present: when a user installs the package, the subdirectories of inst/ are moved to the package root directory, which breaks the file paths included in the plan.
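To make the path change concrete, here is a minimal sketch using base R's system.file(), which resolves paths relative to the installed package root (so the inst/ prefix is gone). The helper name pkg_file() is an assumption for illustration and is not part of drakepkg; the inst/exdata/ location is also assumed from the file path used later in this thread.

```r
# Hypothetical helper (not part of drakepkg): resolve a file shipped under
# inst/ by asking R where the installed package keeps it.
pkg_file <- function(..., package = "drakepkg") {
  # mustWork = TRUE turns a missing file into an error instead of "".
  system.file(..., package = package, mustWork = TRUE)
}

# Assumed layout: a file at inst/exdata/other-iris.xlsx in the source tree
# would be looked up without the inst/ prefix once the package is installed:
# pkg_file("exdata", "other-iris.xlsx")

# Runnable illustration with a package that ships with R:
desc_path <- pkg_file("DESCRIPTION", package = "stats")
file.exists(desc_path)
```

A plan that builds its paths this way would not depend on where inst/ sits relative to the working directory, although it would only see the files as of the last install.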

@tiernanmartin
Owner Author

tiernanmartin commented Jul 27, 2018

My current solution to this problem includes the following steps:

  1. Move all inst/ subdirectories to the package's root directory
  2. Add these directories to both .gitignore and .Rbuildignore
  3. Create directory junctions (I work on a Windows machine) for each of these directories which link to inst/
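The linking step above could be scripted roughly as follows. This is a sketch under one reading of the steps, namely that the real files stay under inst/ and the root-level directory is only a link to them; swap the arguments if your layout is the other way round. link_dir() is a hypothetical helper name. Sys.junction() is only available in R on Windows, so the sketch falls back to file.symlink() elsewhere.

```r
# Hypothetical helper: create `link` so that it points at the existing
# directory `target`.
link_dir <- function(target, link) {
  target <- normalizePath(target, mustWork = TRUE)
  if (file.exists(link)) {
    stop("`link` already exists: ", link)
  }
  if (.Platform$OS.type == "windows") {
    ok <- Sys.junction(target, link)  # directory junction (Windows only)
  } else {
    ok <- file.symlink(target, link)  # symbolic link on other platforms
  }
  invisible(ok)
}

# Example (assumed layout): expose inst/exdata as ./exdata in the package root.
# link_dir("inst/exdata", "exdata")
```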

Now, I can run make(plan_example) from within the package and it can find all of the filepaths included in the plan.

I doubt this is considered a best practice, so I'll keep this issue open until I find a more elegant solution.

@wlandau

wlandau commented Jul 31, 2018

What about a step in the plan that makes the files available?

plan <- drake_plan(
  get_files = target(
    command = {
      file_out("exdata/other-iris.xlsx")
      copy_pkg_files()
    },
    trigger = trigger(change = packageVersion("drakepkg"))
  )
)
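The plan above calls copy_pkg_files() without defining it. One possible shape for such a helper is sketched below; the signature, the defaults, and the assumption that the files live in an exdata/ directory of the installed package are all guesses for illustration, not the actual drakepkg code.

```r
# Hypothetical implementation of copy_pkg_files(); the arguments are assumed.
copy_pkg_files <- function(package = "drakepkg", subdir = "exdata", dest = ".") {
  # Locate the directory inside the installed package (errors if missing).
  src <- system.file(subdir, package = package, mustWork = TRUE)
  dir.create(dest, recursive = TRUE, showWarnings = FALSE)
  # Copy the directory itself, producing <dest>/<basename of subdir>/...
  ok <- file.copy(src, dest, recursive = TRUE, overwrite = TRUE)
  if (!all(ok)) {
    stop("Could not copy ", src, " to ", dest)
  }
  invisible(file.path(dest, basename(src)))
}

# With the defaults, the call in the plan would create ./exdata/ in the
# working directory, matching file_out("exdata/other-iris.xlsx"):
# copy_pkg_files()
```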

@wlandau

wlandau commented Jul 31, 2018

Or maybe you could just return the formally packaged versions of the datasets.

plan <- drake_plan(
  get_iris_data = target(
    command = {
      data(otheriris)
      otheriris
    },
    trigger = trigger(change = packageVersion("drakepkg"))
  )
)

@tiernanmartin
Owner Author

I like the way trigger() is used in your suggestions. I definitely would not have thought to connect packageVersion() to a trigger!

@tiernanmartin
Owner Author

I think it would be best for most use cases if a workflow's data are included as formal datasets in the package -- it's very intuitive. The USGS groundwater model package that I mentioned before is a good illustration of this approach.

Unfortunately, in my particular use case (GIS-type projects), this approach will run into file size limits. A GitHub-hosted package with a bunch of ~5 GB .rda files isn't going to work very well.

Instead, I would like to host the files outside the package (right now I'm experimenting with osf.io at Ben Marwick's suggestion) and write plans that allow the package user to download the files into their working directory. The directory junctions allow the same plan to work regardless of whether it's run in the directory where I'm developing drakepkg or in the working directory of someone who is using the package.
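As a rough illustration of the download step, the sketch below uses only base R's download.file() rather than any OSF client, since the hosting details are not settled in this thread. The helper name fetch_if_missing() and the example URL are placeholders, not real drakepkg or osf.io identifiers.

```r
# Hypothetical helper: download `url` to `dest` unless `dest` already exists.
fetch_if_missing <- function(url, dest) {
  if (file.exists(dest)) {
    return(invisible(dest))  # already downloaded, nothing to do
  }
  dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
  # mode = "wb" keeps binary files (e.g. .xlsx, .rda) intact on Windows.
  status <- utils::download.file(url, destfile = dest, mode = "wb", quiet = TRUE)
  if (status != 0) {
    stop("Download failed: ", url)
  }
  invisible(dest)
}

# Placeholder URL; the real location would depend on where the data are hosted.
# fetch_if_missing("https://example.org/other-iris.xlsx", "exdata/other-iris.xlsx")
```

Because the destination is a path relative to the working directory, the same call would behave the same way in the package source tree and in a user's project directory.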

Since my use case is likely to be rarer (and not a good fit for a minimal example), I'll revise the package to include otheriris as a formal dataset and add a note somewhere about my external data workaround.

@wlandau

wlandau commented Aug 1, 2018

osf.io and osfr sound great! There is a lot to explore here, and I think OSF deserves attention in fully-implemented examples. Would you be open to having a plan that uses osfr::download_files() to get data and triggers based on osfr::get_files_info()?
