This repository has been archived by the owner on Jan 12, 2024. It is now read-only.
Allow local file caching to be disabled when appropriate #6
Labels
intake: Intake data catalogs
performance: Make data go faster by using less memory, disk, network, compute, etc.
Local file caching via `simplecache::` is hugely valuable when you have a lot of cheap disk and a slower net connection (WFH), but it's not necessarily appropriate in a cloud computing context (e.g. our JupyterHub or CI/CD) where the network is extremely fast, there are no data egress fees, and fast disk is more likely to be constrained.

If we are going to use our Intake data catalog as a primary means of accessing versioned, processed data, the user should be able to turn off caching when appropriate. Is this as easy as not setting `PUDL_INTAKE_CACHE` so there's no designated location for the cache? Or can it / should it be set explicitly in the arguments to the data source?
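
To make the two options concrete, here is a minimal sketch of how the choice could be made in one place. It rests on assumptions: `resolve_urlpath` and its `use_cache` argument are hypothetical names, not part of the existing catalog, and the catalog's URL paths are assumed to carry a `simplecache::` prefix. One thing worth checking before relying on the env var alone: as far as I know, fsspec's `simplecache` falls back to a temporary directory when no `cache_storage` is given, so leaving `PUDL_INTAKE_CACHE` unset may move the cache rather than disable it. Dropping the prefix avoids that.

```python
import os

CACHE_PREFIX = "simplecache::"


def resolve_urlpath(urlpath, use_cache=None, env=None):
    """Return (urlpath, storage_options) with local caching on or off.

    Hypothetical helper, not part of the catalog. If use_cache is None,
    caching is enabled only when PUDL_INTAKE_CACHE is set.
    """
    env = os.environ if env is None else env
    cache_dir = env.get("PUDL_INTAKE_CACHE")
    if use_cache is None:
        use_cache = bool(cache_dir)

    # Normalize to the bare remote URL so the prefix is never doubled.
    bare = urlpath[len(CACHE_PREFIX):] if urlpath.startswith(CACHE_PREFIX) else urlpath

    if not use_cache:
        # No prefix means fsspec reads straight from the remote store.
        return bare, {}

    # Only pass cache_storage when a location was actually designated.
    options = {"simplecache": {"cache_storage": cache_dir}} if cache_dir else {}
    return CACHE_PREFIX + bare, options
```

The returned pair is meant to be handed on as, for example, `fsspec.open(urlpath, **options)` or as the `storage_options` of a data source. Whether our catalog's drivers accept it in exactly that form is something to verify.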