test(bigframes): Disable gcsfs cache for tests #16995
Code Review
This pull request adds a session-scoped pytest fixture to disable the gcsfs listings cache, preventing failures caused by stale file metadata during system tests. The review feedback suggests that simply instantiating GCSFileSystem is insufficient for global effect; instead, the configuration should be set via fsspec.config.conf and the instance cache should be cleared, requiring an additional import of fsspec.
    # gcsfs by default uses a cache that can be stale, causing file loads to
    # fail if the file was uploaded indirectly (eg via bq export job) during the
    # course of the tests. disable the cache to avoid this.
    gcsfs.GCSFileSystem(use_listings_cache=False)
Instantiating gcsfs.GCSFileSystem(use_listings_cache=False) only creates a single instance with the cache disabled and does not affect other instances created elsewhere in the codebase (e.g., by bigframes internals or pandas). To globally disable the listings cache for all gcsfs instances during the test session, you should set the configuration in fsspec.config.conf and clear the instance cache to ensure any existing instances are recreated with the new settings.
Suggested change:
    - gcsfs.GCSFileSystem(use_listings_cache=False)
    + fsspec.config.conf["gcs"] = {"use_listings_cache": False}
    + gcsfs.GCSFileSystem.clear_instance_cache()
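For illustration, the suggested approach could be wrapped in a small helper, sketched below under a few assumptions. The name `disable_listings_cache` and its parameters are hypothetical and do not appear in the PR. Unlike the one-line suggestion, this sketch merges into any existing `"gcs"` settings instead of overwriting them. The config mapping and the filesystem class are passed in as arguments so the helper can be exercised without network access or a real gcsfs install.

```python
from typing import Any, MutableMapping


def disable_listings_cache(
    conf: MutableMapping[str, Any],
    fs_cls: Any,
    protocol: str = "gcs",
) -> None:
    """Make use_listings_cache=False a default kwarg for `protocol`.

    Hypothetical helper: `conf` is expected to be fsspec.config.conf and
    `fs_cls` a filesystem class such as gcsfs.GCSFileSystem.
    """
    # Copy the existing per-protocol defaults so unrelated settings survive.
    settings = dict(conf.get(protocol, {}))
    settings["use_listings_cache"] = False
    conf[protocol] = settings
    # fsspec reuses cached filesystem instances; drop them so instances
    # requested later are built with the new defaults.
    fs_cls.clear_instance_cache()


# Possible wiring in conftest.py (assumes pytest, fsspec and gcsfs are
# installed; the fixture name and autouse=True are assumptions):
#
#   @pytest.fixture(scope="session", autouse=True)
#   def _disable_gcsfs_listings_cache():
#       disable_listings_cache(fsspec.config.conf, gcsfs.GCSFileSystem)
```

Values in `fsspec.config.conf` act only as defaults, so a caller that passes `use_listings_cache` explicitly when constructing a filesystem still overrides this setting.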
    from datetime import datetime
    from typing import Dict, Generator, Optional

    import gcsfs