async fn build_gcs_object_store(uri: &str) -> Result<Arc<dyn OSObjectStore>> {
    // GCS enables cache for public buckets, we disable to improve consistency
    let mut headers = HeaderMap::new();
    headers.insert(CACHE_CONTROL, "no-cache".parse().unwrap());
Does this affect reads or writes? Do you see any implications of hardcoding the cache behavior on the user's behalf?
It applies to all requests (read and write), but in practice what it does is set the "Cache-Control" metadata of the uploaded files. object_store does not cache responses, so lance will always retrieve the files when performing a GET, even for objects that could be cached.
Does it mark the file as "uncached" even if users want to cache it (outside of the lance library)? For example, if users want a cache set up for their data lake, will this overwrite their cache setup, or is it only applicable to our lance library?
It will apply to all HTTP caches. So, for example, if Chrome used to cache files downloaded from the bucket, it will no longer do so. I'm not sure how the cache setup for data lakes works, but if the bucket is non-public, the files are already set up to not be cached.
GCloud docs on Caching: https://cloud.google.com/storage/docs/xml-api/reference-headers#cachecontrol
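To make the effect on downstream HTTP caches concrete, here is a minimal sketch of how a cache might decide whether it can serve a stored copy without going back to the bucket. It is not part of this PR or of object_store: `reusable_without_revalidation` is a hypothetical helper, it looks only at the `Cache-Control` value, and it ignores `Expires`, `Age`, and heuristic freshness.

```rust
/// Hypothetical helper (not from lance or object_store): returns true if an
/// HTTP cache may serve a stored response without first revalidating it with
/// the origin, judging only by the `Cache-Control` value.
fn reusable_without_revalidation(cache_control: &str) -> bool {
    let mut max_age: Option<u64> = None;
    for directive in cache_control.split(',') {
        let directive = directive.trim().to_ascii_lowercase();
        // `no-store` forbids storing the response; `no-cache` allows storing
        // it but requires revalidation before every reuse.
        if directive == "no-cache" || directive == "no-store" {
            return false;
        }
        // Remember the freshness lifetime if one is given and parses cleanly.
        if let Some(value) = directive.strip_prefix("max-age=") {
            max_age = value.parse().ok();
        }
    }
    // Without a positive max-age this simplified model treats the response
    // as not reusable.
    matches!(max_age, Some(seconds) if seconds > 0)
}
```

With the header from this PR, `reusable_without_revalidation("no-cache")` is false, while an illustrative cacheable value such as `"public, max-age=3600"` gives true. In other words, a browser or proxy may still store the object, but it has to check with GCS before reusing it.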
close #780