Description
Currently, _refresh_cache is called from many methods to ensure the local cache is up to date.
Unfortunately, _refresh_cache makes several network requests to fetch metadata from the cloud storage service. A run through _refresh_cache that doesn't download a new copy still appears to make 4 of these requests, while a run that does download makes a total of 6. On my current internet connection, each of these requests takes 350-400 ms. This means that any cloud path method that hits _refresh_cache for a file that exists in cloud storage may take a minimum of 1-2 sec, no matter how small the file is.
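As a rough check on that estimate, here is the arithmetic spelled out, assuming the requests run one after another and each costs the 350-400 ms quoted above. The helper name `estimated_latency_s` is made up for illustration.

```python
def estimated_latency_s(num_requests, per_request_ms):
    """Total time in seconds if the requests are issued sequentially."""
    return num_requests * per_request_ms / 1000


# Request counts (4 and 6) and per-request latency (350-400 ms) are the
# figures reported in this issue; everything else is simple multiplication.
no_download_range = (estimated_latency_s(4, 350), estimated_latency_s(4, 400))
with_download_range = (estimated_latency_s(6, 350), estimated_latency_s(6, 400))

print("no download:   %.1f-%.1f s" % no_download_range)
print("with download: %.1f-%.1f s" % with_download_range)
```

Under these assumptions the metadata overhead alone is about 1.4-1.6 s without a download and about 2.1-2.4 s with one, before any file contents are transferred.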
cloudpathlib/cloudpathlib/cloudpath.py
Lines 612 to 650 in 8b230c3
    def _refresh_cache(self, force_overwrite_from_cloud=False):
        # nothing to cache if the file does not exist; happens when creating
        # new files that will be uploaded
        if not self.exists():
            return
        if self.is_dir():
            raise ValueError("Only individual files can be cached")
        # if not exist or cloud newer
        if (
            not self._local.exists()
            or (self._local.stat().st_mtime < self.stat().st_mtime)
            or force_overwrite_from_cloud
        ):
            # ensure there is a home for the file
            self._local.parent.mkdir(parents=True, exist_ok=True)
            self.download_to(self._local)
            # force cache time to match cloud times
            os.utime(self._local, times=(self.stat().st_mtime, self.stat().st_mtime))
        if self._dirty:
            raise OverwriteDirtyFile(
                f"Local file ({self._local}) for cloud path ({self}) has been changed by your code, but "
                f"is being requested for download from cloud. Either (1) push your changes to the cloud, "
                f"(2) remove the local file, or (3) pass `force_overwrite_from_cloud=True` to "
                f"overwrite."
            )
        # if local newer but not dirty, it was updated
        # by a separate process; do not overwrite unless forced to
        if self._local.stat().st_mtime > self.stat().st_mtime:
            raise OverwriteNewerLocal(
                f"Local file ({self._local}) for cloud path ({self}) is newer on disk, but "
                f"is being requested for download from cloud. Either (1) push your changes to the cloud, "
                f"(2) remove the local file, or (3) pass `force_overwrite_from_cloud=True` to "
                f"overwrite."
            )
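One possible direction is to fetch the metadata once and reuse it across the checks. The sketch below is not cloudpathlib code: `FakeCloudFile`, `check_naive`, and `check_single_fetch` are hypothetical names, and it assumes each `exists()`, `is_dir()`, and mtime lookup costs one request. It only counts simulated requests on the no-download path.

```python
class FakeCloudFile:
    """Hypothetical stand-in for a cloud path; counts simulated metadata requests."""

    def __init__(self, mtime):
        self._mtime = mtime
        self.requests = 0

    def fetch_metadata(self):
        # Assumption: every metadata lookup is one network round trip.
        self.requests += 1
        return {"exists": True, "is_dir": False, "mtime": self._mtime}

    def exists(self):
        return self.fetch_metadata()["exists"]

    def is_dir(self):
        return self.fetch_metadata()["is_dir"]

    def mtime(self):
        return self.fetch_metadata()["mtime"]


def check_naive(cloud, local_mtime):
    """Same order of checks as the snippet above; each one fetches again."""
    if not cloud.exists():
        return False
    if cloud.is_dir():
        raise ValueError("Only individual files can be cached")
    needs_download = local_mtime < cloud.mtime()
    if local_mtime > cloud.mtime():
        raise RuntimeError("local copy is newer than the cloud copy")
    return needs_download


def check_single_fetch(cloud, local_mtime):
    """Same checks, but against a single metadata snapshot."""
    meta = cloud.fetch_metadata()
    if not meta["exists"]:
        return False
    if meta["is_dir"]:
        raise ValueError("Only individual files can be cached")
    if local_mtime > meta["mtime"]:
        raise RuntimeError("local copy is newer than the cloud copy")
    return local_mtime < meta["mtime"]


naive = FakeCloudFile(mtime=100.0)
naive_result = check_naive(naive, local_mtime=100.0)

single = FakeCloudFile(mtime=100.0)
single_result = check_single_fetch(single, local_mtime=100.0)

print(naive.requests, single.requests)
```

With an up-to-date local copy, the naive version issues four simulated requests, matching the count reported above, while the snapshot version issues one. The trade-off is that the snapshot can go stale if the cloud object changes between the fetch and the later checks.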
