CI: New Numpydev version broke ci #44343

phofl · 2021-11-07T15:42:10Z

I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

Two tests are affected

Issue Description


    @pytest.mark.parametrize("quantile", [0.0, 0.1, 0.45, 0.5, 1])
    @pytest.mark.parametrize(
        "interpolation", ["linear", "lower", "higher", "nearest", "midpoint"]
    )
    @pytest.mark.parametrize(
        "data",
        [
            [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
            [8.0, 1.0, 3.0, 4.0, 5.0, 2.0, 6.0, 7.0],
            [0.0, np.nan, 0.2, np.nan, 0.4],
            [np.nan, np.nan, np.nan, np.nan],
            [np.nan, 0.1, np.nan, 0.3, 0.4, 0.5],
            [0.5],
            [np.nan, 0.7, 0.6],
        ],
    )
    def test_rolling_quantile_interpolation_options(quantile, interpolation, data):
        # Tests that rolling window's quantile behavior is analogous to
        # Series' quantile for each interpolation option
        s = Series(data)
    
        q1 = s.quantile(quantile, interpolation)
        q2 = s.expanding(min_periods=1).quantile(quantile, interpolation).iloc[-1]
    
        if np.isnan(q1):
            assert np.isnan(q2)
        else:
>           assert q1 == q2
E           assert 0.040000000000000036 == 0.04000000000000001

pandas/tests/window/test_rolling.py:1664: AssertionError
________ test_rolling_quantile_interpolation_options[data2-linear-0.45] ________
[gw0] linux -- Python 3.9.7 /usr/share/miniconda/envs/pandas-dev/bin/python

quantile = 0.45, interpolation = 'linear', data = [0.0, nan, 0.2, nan, 0.4]

    @pytest.mark.parametrize("quantile", [0.0, 0.1, 0.45, 0.5, 1])
    @pytest.mark.parametrize(
        "interpolation", ["linear", "lower", "higher", "nearest", "midpoint"]
    )
    @pytest.mark.parametrize(
        "data",
        [
            [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
            [8.0, 1.0, 3.0, 4.0, 5.0, 2.0, 6.0, 7.0],
            [0.0, np.nan, 0.2, np.nan, 0.4],
            [np.nan, np.nan, np.nan, np.nan],
            [np.nan, 0.1, np.nan, 0.3, 0.4, 0.5],
            [0.5],
            [np.nan, 0.7, 0.6],
        ],
    )
    def test_rolling_quantile_interpolation_options(quantile, interpolation, data):
        # Tests that rolling window's quantile behavior is analogous to
        # Series' quantile for each interpolation option
        s = Series(data)
    
        q1 = s.quantile(quantile, interpolation)
        q2 = s.expanding(min_periods=1).quantile(quantile, interpolation).iloc[-1]
    
        if np.isnan(q1):
            assert np.isnan(q2)
        else:
>           assert q1 == q2
E           assert 0.18000000000000005 == 0.18000000000000002

Expected Behavior

Passing

Installed Versions

INSTALLED VERSIONS

commit : 0861da6
python : 3.8.12.final.0
python-bits : 64
OS : Linux
OS-release : 5.11.0-38-generic
Version : #42~20.04.1-Ubuntu SMP Tue Sep 28 20:41:07 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.4.0.dev0+1056.g0861da680b
numpy : 1.22.0.dev0+1693.gf52acb53a
pytz : 2021.1
dateutil : 2.8.2
pip : 21.2.4
setuptools : 58.0.4
Cython : 0.29.24
pytest : 6.2.5
hypothesis : 6.23.1
sphinx : 4.2.0
blosc : None
feather : None
xlsxwriter : 3.0.1
lxml.etree : 4.6.3
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.28.0
pandas_datareader: None
bs4 : 4.10.0
bottleneck : 1.3.2
fsspec : 2021.11.0
fastparquet : 0.7.1
gcsfs : 2021.05.0
matplotlib : 3.4.3
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.9
pandas_gbq : None
pyarrow : 5.0.0
pyxlsb : None
s3fs : 2021.11.0
scipy : 1.7.1
sqlalchemy : 1.4.25
tables : 3.6.1
tabulate : 0.8.9
xarray : 0.18.2
xlrd : 2.0.1
xlwt : 1.3.0
numba : 0.53.1
None

Process finished with exit code 0

The text was updated successfully, but these errors were encountered:

jreback · 2021-11-07T17:45:46Z

cc @seberg

seberg · 2021-11-07T18:15:18Z

Ping @bzah in case you have an idea. Percentile/quantile was fixed with respect to the interpolation (any input welcome, as mentioned on the mailing list I would like to rename interpolation to `method).

Not quite sure how much we should worry about it, but my guess is that this is due to rounding differences in:

    return n * quantiles + (
             alpha + quantiles * (1 - alpha - beta)
     ) - 1

behaving different for different n in the case where alpha=1 and beta=1 (the case for the default "linear" method).

I was wondering before if we should rewrite it, maybe even using _lerp, but not sure. E.g.: quantiles * (n + 1 - alpha - beta) + (alpha - 1) (not thought through.)

bzah · 2021-11-08T09:54:28Z

Yes it really seems to come from the rounding of _compute_virtual_index which uses the formula @seberg has given.

You can see that with:

data = [0.0, np.nan, 0.2, np.nan, 0.4]

# the new formula with values replaced 
len(data) * 0.45 + (
            1 + 0.45 * (1 - 1 - 1)
    ) - 1
# Out[20]: 1.7999999999999998

0.45 * (len(data) - 1)
# Out[19]: 1.8

To avoid this issue we could use (n -1) * quantile when the method 7 (a.k.a linear) is used and the full formula for others methods. But it would make the code less readable.

alimcmaster1 · 2021-11-08T19:02:28Z

Unless a fix for this is coming soon on the numpy side? shall we add this build against numpy dev to our allowed failures - since there is quite a CI backlog.

numpy/numpy#19857

jreback · 2021-11-08T19:06:31Z

yeah let's either xfail that test or allow failures for now

seberg · 2021-11-08T19:15:34Z

Mid-term, the right thing seems to be either relax the test to allow a few ULPs of error, or modify NumPy. But I guess you are talking about short-term CI-unbreaking?

jreback · 2021-11-08T19:17:03Z

sure we should fix the test too
but this is making our ci red so let's fix that first

seberg · 2021-11-09T22:18:29Z

I merged the fixup that ensures we use the simpler (and thus more precise) formula again for the default. That should fix this issue.

A heads up though: I am hoping to rename interpolation= to method= with a DeprecationWarning. This would be very soon and should be in 1.22. If that is problematic, let us know and we could reconsider or take it slower.

QuLogic · 2022-01-16T00:18:52Z

A heads up though: I am hoping to rename interpolation= to method= with a DeprecationWarning. This would be very soon and should be in 1.22. If that is problematic, let us know and we could reconsider or take it slower.

Is this bit fixed in Pandas somewhere?

phofl added Bug Needs Triage Issue that has not been reviewed by a pandas team member CI Continuous Integration Dependencies Required and optional dependencies and removed Needs Triage Issue that has not been reviewed by a pandas team member labels Nov 7, 2021

alimcmaster1 mentioned this issue Nov 8, 2021

CI: xfail tests failing on numpy dev #44362

Merged

1 task

bzah mentioned this issue Nov 9, 2021

MAINT: Update quantile default lerp method numpy/numpy#20331

Merged

lithomas1 mentioned this issue Nov 14, 2021

Revert "CI: xfail tests failing on numpy dev" #44452

Merged

alimcmaster1 closed this as completed in #44452 Nov 15, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CI: New Numpydev version broke ci #44343

CI: New Numpydev version broke ci #44343

phofl commented Nov 7, 2021

INSTALLED VERSIONS

jreback commented Nov 7, 2021

seberg commented Nov 7, 2021

bzah commented Nov 8, 2021

alimcmaster1 commented Nov 8, 2021

jreback commented Nov 8, 2021

seberg commented Nov 8, 2021

jreback commented Nov 8, 2021

seberg commented Nov 9, 2021

QuLogic commented Jan 16, 2022

CI: New Numpydev version broke ci #44343

CI: New Numpydev version broke ci #44343

Comments

phofl commented Nov 7, 2021

Reproducible Example

Issue Description

Expected Behavior

Installed Versions

INSTALLED VERSIONS

jreback commented Nov 7, 2021

seberg commented Nov 7, 2021

bzah commented Nov 8, 2021

alimcmaster1 commented Nov 8, 2021

jreback commented Nov 8, 2021

seberg commented Nov 8, 2021

jreback commented Nov 8, 2021

seberg commented Nov 9, 2021

QuLogic commented Jan 16, 2022