Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-96463: fix load_tzdata raising OSError, not `ZoneInfoNotFound… #99602

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

sobolevn
Copy link
Member

@sobolevn sobolevn commented Nov 19, 2022

I expect this to fail at first. This will make sure that we have tzdata installed at least somewhere in out test suite.

@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @sobolevn for commit 95dcf3e 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 19, 2022
@sobolevn sobolevn added the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 20, 2022
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @sobolevn for commit 22be349 🤖

If you want to schedule another build, you need to add the ":hammer: test-with-buildbots" label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-buildbots Test PR w/ buildbots; report in status section label Nov 20, 2022
@sobolevn
Copy link
Member Author

Looks like we don't have tzdata installed on any buildbot / CI 🤔
I don't think it a good idea. Since tzdata does impact the zoneinfo module quite a lot.

CC @pganssle

@gpshead
Copy link
Member

gpshead commented Nov 20, 2022

The ARM Raspbian bot has tzdata installed as a required package, and I suspect most anything else Debian based does as well.

@pganssle
Copy link
Member

Are we talking about the system tzdata package, or the PyPI package?

@sobolevn
Copy link
Member Author

sobolevn commented Nov 20, 2022

I am talking about these lines:

try:
    importlib.metadata.metadata("tzdata")
    HAS_TZDATA_PKG = True
except importlib.metadata.PackageNotFoundError:
    HAS_TZDATA_PKG = False

if HAS_TZDATA_PKG:
    raise ValueError('Fail the test')

@FFY00
Copy link
Member

FFY00 commented Dec 15, 2022

That'd be the PyPI package, but that is not required if you have system provided tzdata. I would expect most, if not all, the build bots to have system provided tzdata, so the behavior we are seeing makes sense.

@classmethod
def _new_instance(cls, key):
obj = super().__new__(cls)
obj._key = key
obj._file_path = obj._find_tzfile(key)
if obj._file_path is not None:
file_obj = open(obj._file_path, "rb")
else:
file_obj = _common.load_tzdata(key)
with file_obj as f:
obj._load_file(f)
return obj

We have two ways of loading tzdata. The first is via tzpath (searches in PYTHONTZPATH and TZDATA), which happens in zoneinfo._tzpath._find_tzfile, with the tzpath being initialized in zoneinfo._tzpath.reset_tzpath. This commonly uses system provided tzdata.

def find_tzfile(key):
"""Retrieve the path to a TZif file from a key."""
_validate_tzfile_path(key)
for search_path in TZPATH:
filepath = os.path.join(search_path, key)
if os.path.isfile(filepath):
return filepath
return None

def reset_tzpath(to=None):
global TZPATH
tzpaths = to
if tzpaths is not None:
if isinstance(tzpaths, (str, bytes)):
raise TypeError(
f"tzpaths must be a list or tuple, "
+ f"not {type(tzpaths)}: {tzpaths!r}"
)
if not all(map(os.path.isabs, tzpaths)):
raise ValueError(_get_invalid_paths_message(tzpaths))
base_tzpath = tzpaths
else:
env_var = os.environ.get("PYTHONTZPATH", None)
if env_var is not None:
base_tzpath = _parse_python_tzpath(env_var)
else:
base_tzpath = _parse_python_tzpath(
sysconfig.get_config_var("TZPATH")
)
TZPATH = tuple(base_tzpath)

The second one, is via a tzadata package, which happens in zonoinfo._common.load_tzdata, the function name could reflect this. The tzdata package exists to allow the zoneinfo module to still lookup data when the system does not provide tzdata. In these cases, the user can install tzdata via pip install tzdata.

def load_tzdata(key):
from importlib import resources
components = key.split("/")
package_name = ".".join(["tzdata.zoneinfo"] + components[:-1])
resource_name = components[-1]
try:
return resources.files(package_name).joinpath(resource_name).open("rb")
except (ImportError, FileNotFoundError, UnicodeEncodeError):
# There are three types of exception that can be raised that all amount
# to "we cannot find this key":
#
# ImportError: If package_name doesn't exist (e.g. if tzdata is not
# installed, or if there's an error in the folder name like
# Amrica/New_York)
# FileNotFoundError: If resource_name doesn't exist in the package
# (e.g. Europe/Krasnoy)
# UnicodeEncodeError: If package_name or resource_name are not UTF-8,
# such as keys containing a surrogate character.
raise ZoneInfoNotFoundError(f"No time zone found with key {key}")

The current tests override tzpath and let ZoneInfo find the tzdata via that mechanism, we don't have a test for load_tzdata it sems, perhaps it would make sense to add it.

@contextlib.contextmanager
def tzpath_context(self, tzpath, block_tzdata=True, lock=TZPATH_LOCK):
def pop_tzdata_modules():
tzdata_modules = {}
for modname in list(sys.modules):
if modname.split(".", 1)[0] != "tzdata": # pragma: nocover
continue
tzdata_modules[modname] = sys.modules.pop(modname)
return tzdata_modules
with lock:
if block_tzdata:
# In order to fully exclude tzdata from the path, we need to
# clear the sys.modules cache of all its contents — setting the
# root package to None is not enough to block direct access of
# already-imported submodules (though it will prevent new
# imports of submodules).
tzdata_modules = pop_tzdata_modules()
sys.modules["tzdata"] = None
old_path = self.module.TZPATH
try:
self.module.reset_tzpath(tzpath)
yield
finally:
if block_tzdata:
sys.modules.pop("tzdata")
for modname, module in tzdata_modules.items():
sys.modules[modname] = module
self.module.reset_tzpath(old_path)

def setUp(self):
with contextlib.ExitStack() as stack:
stack.enter_context(
self.tzpath_context(
self.tzpath,
block_tzdata=self.block_tzdata,
lock=TZPATH_TEST_LOCK,
)
)
self.addCleanup(stack.pop_all().close)
super().setUp()

Adding the invalid key to test_bad_keys won't do anything because the data will not be loaded via load_tzdata 🤣

The equivalent code for the tzdata codepath is find_tzfile, which does not have the same issue with raising OSError.

def find_tzfile(key):
"""Retrieve the path to a TZif file from a key."""
_validate_tzfile_path(key)
for search_path in TZPATH:
filepath = os.path.join(search_path, key)
if os.path.isfile(filepath):
return filepath
return None

Like I mentioned above, I think we need a specific test for load_tzdata.

@sobolevn does that make sense to you? Is there anything you think I am missing? I hope this makes things clearer for you 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants