Maybe use a cache for larger data sets #2
Comments
https://github.com/emilhvitfeldt/textdata is an expanded version of what you are proposing. You are free to take bits and pieces as you need.
Yea that looks great! I also didn't know rappdirs was an actual package. This will be helpful too
yes rappdirs is gonna save you a lot of headaches
Alternatively, gargle stores things in
Should have:
should also have a way to overwrite the file path
yea i think the order of figuring out what path to use should look something like:
Perfect timing for some new data too. @trang1618 😃
Would pins be able to solve this problem as well?
We are doing all this in https://github.com/tidymodels/modeldatatoo 😄
This issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with a reprex: https://reprex.tidyverse.org) and link to this issue.
To get around CRAN's package size limit, we could try having URLs that point to data sets that live on GitHub in this repo, and then cache them on the user's machine.
I imagine it would look like:
We could follow the lead of pak, which uses the following function to determine where R's global permanent cache is:
https://github.com/r-lib/pak/blob/e65de1e9630dbfcaf1044718b742bf806486b107/R/utils.R#L84
and then we could save into
<cache-path>/model-data/ames.rds
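As a rough sketch of the download-and-cache idea (not the actual modeldata implementation), something like the following could work. The helper names data_cache_dir() and fetch_cached_data(), and the base_url argument, are hypothetical. tools::R_user_dir() (R >= 4.0.0) is used here only as a stand-in for pak's helper or rappdirs, and the cache_dir argument is one assumed way to let the user overwrite the path.

```r
# Hypothetical helper: resolve (and create) the cache directory.
# A user-supplied `cache_dir` wins; otherwise fall back to R's per-user
# cache location, which ends in a "model-data" folder.
data_cache_dir <- function(cache_dir = NULL) {
  if (is.null(cache_dir)) {
    cache_dir <- tools::R_user_dir("model-data", which = "cache")
  }
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  cache_dir
}

# Hypothetical helper: return the data set `name`, downloading
# <base_url>/<name>.rds only if it is not already in the cache.
fetch_cached_data <- function(name, base_url, cache_dir = NULL) {
  path <- file.path(data_cache_dir(cache_dir), paste0(name, ".rds"))
  if (!file.exists(path)) {
    # Download to a temporary file first so that a failed download
    # does not leave a truncated file in the cache.
    tmp <- tempfile(fileext = ".rds")
    on.exit(unlink(tmp), add = TRUE)
    utils::download.file(paste0(base_url, "/", name, ".rds"), tmp,
                         mode = "wb", quiet = TRUE)
    file.copy(tmp, path, overwrite = TRUE)
  }
  readRDS(path)
}
```

With this shape, data_ames() would be a thin wrapper that calls fetch_cached_data("ames", base_url = ...) with whatever URL the repo ends up serving the files from.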
To be even faster, we would only load the data once per R session. Once we load it from the cache directory, we would store it in an environment internal to modeldata and pull it from there each time data_ames() is called. So it might look more like:
The datasets themselves would actually live in a folder in this repo that would be .Rbuildignore-d. For example: inst/data/ames.rds, and then ignore inst/data.
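The once-per-session idea could be sketched roughly as below. This is only an illustration under assumptions: .session_cache and load_once() are made-up names, and data_ames() takes a path argument here just to keep the example self-contained, whereas in the package the path would come from the cache directory.

```r
# Environment private to the package; it lives for the whole R session.
.session_cache <- new.env(parent = emptyenv())

# Hypothetical helper: run `loader()` the first time `name` is requested,
# then serve the stored object from the environment on later calls.
load_once <- function(name, loader) {
  if (!exists(name, envir = .session_cache, inherits = FALSE)) {
    assign(name, loader(), envir = .session_cache)
  }
  get(name, envir = .session_cache, inherits = FALSE)
}

# Assumed signature for illustration only; the real data_ames() would
# look up the cached file itself instead of taking `path`.
data_ames <- function(path) {
  load_once("ames", function() readRDS(path))
}
```

After the first call, data_ames() never touches the disk again in that session, so repeated calls cost only an environment lookup.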