BUG: Tests fragile with respect to the system locale #44625

Open
3 tasks done
burnpanck opened this issue Nov 26, 2021 · 5 comments
Labels
Bug Testing pandas testing functions or related to the test suite Timeseries

Comments

@burnpanck
Contributor

burnpanck commented Nov 26, 2021

Fragile tests with respect to the system locale

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of pandas.
  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

On an OS with its locale set to, e.g., de_CH, run the following:

pytest pandas/tests/tools/test_to_datetime.py

Issue Description

The to_datetime tests appear to be fragile with respect to the environment they are run in. These tests were introduced in #25541, and the problem was already noted there: #25541 (comment). However, the "solution" described in a subsequent comment, #25541 (comment), is not satisfactory: the tests are simply excluded if the locale of the running environment is set to either zh_CN or it_IT, presumably the two non-English locales used by the developers involved in that issue at the time. I happen to be on de_CH, and I am sure there are many more potential contributors out there who have trouble running the tests locally without knowing why.

Expected Behavior

All tests on the master branch should pass, independent of the system locale.
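
As an illustration of what locale independence could look like in a test, here is a minimal sketch that pins LC_TIME to a known value for the duration of a block. It uses only the standard library; the helper name use_locale is hypothetical and is not taken from the pandas test suite.

```python
import contextlib
import locale
import time


@contextlib.contextmanager
def use_locale(name, category=locale.LC_TIME):
    """Temporarily switch `category` to `name` (hypothetical helper)."""
    # Calling setlocale with only a category queries the current setting.
    previous = locale.setlocale(category)
    try:
        locale.setlocale(category, name)
        yield
    finally:
        # Restore whatever was active before, even if the body raised.
        locale.setlocale(category, previous)


# The "C" locale is always available, so English month abbreviations
# parse the same way regardless of the system locale.
with use_locale("C"):
    parsed = time.strptime("01 Mar 2021", "%d %b %Y")
    month = parsed.tm_mon

print(month)  # 3
```

Note that locale.setlocale changes the setting for the whole process, so a helper like this is only safe where tests do not run concurrently in threads.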

Installed Versions

INSTALLED VERSIONS

commit : aad39a8
python : 3.9.7.final.0
python-bits : 64
OS : Darwin
OS-release : 21.1.0
Version : Darwin Kernel Version 21.1.0: Wed Oct 13 17:33:23 PDT 2021; root:xnu-8019.41.5~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : de_CH.UTF-8
LANG : en_GB.UTF-8
LOCALE : de_CH.UTF-8

pandas : 1.4.0.dev0+1072.gaad39a86d5
numpy : 1.21.4
pytz : 2021.3
dateutil : 2.8.2
pip : 21.3.1
setuptools : 57.4.0
Cython : 0.29.24
pytest : 6.2.5
hypothesis : 6.27.1
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : 2021.07.0
fastparquet : None
gcsfs : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 6.0.1
pyxlsb : None
s3fs : 2021.07.0
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : None

@burnpanck burnpanck added Bug Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 26, 2021
@jbrockmendel
Member

PR to improve this would be welcome

@jbrockmendel
Member

@burnpanck i could use your help testing a locale-related hypothesis. can you tell me what you get from np.dtype(np.datetime64).name, np.dtype(np.timedelta64).name and np.dtype("Float16")?

@burnpanck
Contributor Author

@jbrockmendel: I fear your hypothesis is going to be falsified; I don't seem to get anything remotely locale-dependent:

import numpy as np

# Try to build a dtype from each spec and print its repr and str form.
for dts in [
    np.datetime64,
    np.timedelta64,
    "float16",
    "Float16",
]:
    try:
        dt = np.dtype(dts)
    except Exception as ex:
        print(f"{dts!r} failed! {ex!r}")
    else:
        print(f'{dts!r}: {dt!r} ("{dt!s}")')

outputs

<class 'numpy.datetime64'>: dtype('<M8') ("datetime64")
<class 'numpy.timedelta64'>: dtype('<m8') ("timedelta64")
'float16': dtype('float16') ("float16")
'Float16' failed! TypeError("data type 'Float16' not understood")

@jbrockmendel
Member

Thanks for your help

@burnpanck
Contributor Author

burnpanck commented Dec 22, 2021

What makes things even more difficult is that, under current master and on my local machine, the tests now pass when invoked as pytest pandas/tests/tools/test_to_datetime.py but not when invoked as ./test_fast.sh. Again, I know neither these tests nor the code they are supposed to test, so it is hard for me to understand what is really going on. All I can say is that the current master is broken for me, and I would not be surprised if that were also the case for anyone else not on en_US, it_IT or zh_CN.
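
One way to compare the two invocations might be to print the locale settings each process actually sees, for example from a temporary test or a conftest hook. The following is only a diagnostic sketch using the standard library; the function name locale_report is hypothetical, and it makes no assumption about what ./test_fast.sh does.

```python
import locale
import os


def locale_report():
    """Collect the locale-related settings visible to the current process."""
    # Environment variables as inherited from the shell (None if unset).
    report = {name: os.environ.get(name) for name in ("LC_ALL", "LC_TIME", "LANG")}
    # Calling setlocale with only a category queries it without changing it.
    report["effective LC_TIME"] = locale.setlocale(locale.LC_TIME)
    return report


for key, value in locale_report().items():
    print(f"{key}: {value!r}")
```

The effective LC_TIME can differ from the environment variables if nothing in the process has called locale.setlocale yet, so a difference between the two runs here would point to code that runs in one invocation but not the other.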

@mroeschke mroeschke added Testing pandas testing functions or related to the test suite Timeseries and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Dec 27, 2021