assert_almost_equal #13357

Closed
jaysw opened this issue Jun 3, 2016 · 9 comments · Fixed by #30562

@jaysw

jaysw commented Jun 3, 2016

Hi, I'm wondering if the following behavior of pandas._testing.assert_almost_equal is expected:

Code Sample, a copy-pastable example if possible

from pandas import _testing
_testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=True)

Expected Output

Expect no output / no AssertionError

Actual Output

AssertionError                            Traceback (most recent call last)
<ipython-input-199-ace78e82c603> in <module>()
      1 from pandas import _testing
----> 2 _testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=True)

pandas/src/testing.pyx in pandas._testing.assert_almost_equal (pandas/src/testing.c:3887)()

pandas/src/testing.pyx in pandas._testing.assert_almost_equal (pandas/src/testing.c:3653)()

AssertionError: expected 0.00001 but got 0.00001, with decimal 3

Note that the numbers first differ at the sixth decimal place, yet the message suggests they differ at the third.

output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Darwin
OS-release: 15.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.18.1
nose: None
pip: 8.0.2
setuptools: 21.2.1
Cython: None
numpy: 1.11.0
scipy: 0.17.1
statsmodels: None
xarray: None
IPython: 4.2.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: 1.5.1
openpyxl: 2.3.3
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None

@jreback
Contributor

jreback commented Jun 3, 2016

Hmm, that seems odd. In any event, try with master: you can now pass an integer to check_less_precise (it still breaks, but you get more flexibility).

@jreback jreback added Bug Testing pandas testing functions or related to the test suite Difficulty Intermediate labels Jun 3, 2016
@jreback jreback added this to the 0.18.2 milestone Jun 3, 2016
@jaysw
Author

jaysw commented Jun 3, 2016

Hi, thanks for the speedy response! Looking at the master branch, we will see the same problem:

https://github.com/pydata/pandas/blob/master/pandas/src/testing.pyx#L197

decimal_almost_equal(1, fb / fa, decimal)

The numbers (fa, fb) are divided and the ratio is compared to 1, which is interesting, but not quite the same as comparing the digits after the decimal point (as the docstring suggests). Perhaps I'm just reading the docs incorrectly.
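To make the difference concrete, here is a minimal pure-Python sketch of the two comparisons. The helper names and the `0.5 * 10 ** -decimal` threshold are assumptions for illustration, not the actual Cython code; only the `decimal_almost_equal(1, fb / fa, decimal)` call shape comes from the line linked above.

```python
def decimal_almost_equal(desired, actual, decimal):
    """Hypothetical stand-in: True if the values agree to `decimal` places.

    The 0.5 * 10 ** -decimal threshold is an assumption, not taken from pandas.
    """
    return abs(desired - actual) < 0.5 * 10 ** -decimal


def relative_check(fa, fb, decimal):
    # Mirrors the quoted call: the ratio fb / fa is compared to 1.
    return decimal_almost_equal(1, fb / fa, decimal)


def absolute_check(fa, fb, decimal):
    # The docstring reading: compare the two values directly.
    return decimal_almost_equal(fa, fb, decimal)


fa, fb = 0.000011, 0.000012
for decimal in (3, 2, 1):
    print(decimal, relative_check(fa, fb, decimal), absolute_check(fa, fb, decimal))
```

Under these assumptions the ratio is about 1.09, so the relative check is False for 3, 2 and 1 decimals, while the absolute difference of 1e-6 passes the direct check at every one of them.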

@sinhrks
Member

sinhrks commented Jun 4, 2016

+1 to always compare using specified precision (not fb / fa).

@jorisvandenbossche jorisvandenbossche modified the milestones: 0.20.0, 0.19.0 Aug 21, 2016
@jreback jreback modified the milestones: 0.20.0, Next Major Release Mar 23, 2017
@mmngreco
Contributor

mmngreco commented Aug 1, 2017

The problem is still present:

Here are some examples based on the first comment.

>>> import pandas as pd
>>> pd._libs.testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=True)
Traceback (most recent call last):
  File "C:\Anaconda3\envs\worker\lib\site-packages\IPython\core\interactiveshell.py", line 2862, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-12-20c1c2335303>", line 1, in <module>
    pd._libs.testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=True)
  File "pandas/_libs/testing.pyx", line 59, in pandas._libs.testing.assert_almost_equal (pandas\_libs\testing.c:4156)
  File "pandas/_libs/testing.pyx", line 209, in pandas._libs.testing.assert_almost_equal (pandas\_libs\testing.c:3863)
AssertionError: expected 0.00001 but got 0.00001, with decimal 3

>>> pd._libs.testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=2)
Traceback (most recent call last):
  File "C:\Anaconda3\envs\worker\lib\site-packages\IPython\core\interactiveshell.py", line 2862, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-13-4fe4a2043aa2>", line 1, in <module>
    pd._libs.testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=2)
  File "pandas/_libs/testing.pyx", line 59, in pandas._libs.testing.assert_almost_equal (pandas\_libs\testing.c:4156)
  File "pandas/_libs/testing.pyx", line 209, in pandas._libs.testing.assert_almost_equal (pandas\_libs\testing.c:3863)
AssertionError: expected 0.00001 but got 0.00001, with decimal 2

>>> pd._libs.testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=1)
Traceback (most recent call last):
  File "C:\Anaconda3\envs\worker\lib\site-packages\IPython\core\interactiveshell.py", line 2862, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-14-15b3658bbb6c>", line 1, in <module>
    pd._libs.testing.assert_almost_equal(0.000011, 0.000012, check_less_precise=1)
  File "pandas/_libs/testing.pyx", line 59, in pandas._libs.testing.assert_almost_equal (pandas\_libs\testing.c:4156)
  File "pandas/_libs/testing.pyx", line 209, in pandas._libs.testing.assert_almost_equal (pandas\_libs\testing.c:3863)
AssertionError: expected 0.00001 but got 0.00001, with decimal 1

I don't know how to make this work. In addition, I find the message a little weird.
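One possible explanation for the odd message, assuming it is built with a five-decimal-place format (the output above is consistent with that, but the thread does not confirm it): both values round to the same string, so the text hides the digit where they actually differ.

```python
expected, actual = 0.000012, 0.000011

# Assumed format: five decimal places, as the traceback text suggests.
message = "expected %.5f but got %.5f" % (expected, actual)
print(message)  # expected 0.00001 but got 0.00001

# repr() keeps enough digits to tell the two values apart.
detailed = "expected %r but got %r" % (expected, actual)
print(detailed)
```

Printing with `%r` (or a wider format) would at least show the two values as distinct.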

@jreback
Contributor

jreback commented Aug 1, 2017

> The problem is still present:

And this is an open issue. You are welcome to submit a PR to fix it.

@NewbiZ

NewbiZ commented Aug 3, 2017

I would like to fix it, but I am not 100% sure that I understand the rationale of the code.

When using

decimal_almost_equal(1, fb / fa, decimal)

instead of

decimal_almost_equal(fa, fb, decimal)

From my understanding, the idea here is that the comparison precision (represented by decimal here) is expressed relative to 1, so it scales along with the numbers being compared. I guess the goal is to be able to provide a relative precision and have it work with numbers spanning very different ranges.
While interesting, it is a bit confusing, and not clearly specified in the documentation.

Should we update the documentation or the code? I would vote to change the code. The whole point of providing the precision is that the user knows the correct epsilon to use with the values being compared.

Did I understand the rationale of the code correctly? Are you OK with a PR to change the code?
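The trade-off between the two readings can be sketched with the standard library's `math.isclose`, which takes a relative and an absolute tolerance separately. The `compare` helper and the 5e-4 tolerance are hypothetical, chosen only for illustration; this is not the pandas implementation.

```python
import math

TOL = 5e-4  # illustrative tolerance, an assumption


def compare(a, b):
    """Return (relative_ok, absolute_ok) for the pair (a, b)."""
    relative_ok = math.isclose(a, b, rel_tol=TOL, abs_tol=0.0)
    absolute_ok = math.isclose(a, b, rel_tol=0.0, abs_tol=TOL)
    return relative_ok, absolute_ok


# Small values: the relative test fails, the absolute test passes.
print(compare(0.000011, 0.000012))
# Large values: the relative test passes, the absolute test fails.
print(compare(1000000.0, 1000000.4))
```

A relative tolerance follows the magnitude of the inputs, which helps across very different ranges but rejects small values that differ by one unit in the last decimal place. An absolute tolerance behaves the way the docstring reading suggests, but it has to be chosen with the scale of the data in mind.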

@gfyoung
Member

gfyoung commented Aug 3, 2017

@NewbiZ: PR changes are welcome so long as they fix the issue AND do not break existing tests! 😄

@mmngreco
Contributor

mmngreco commented Aug 3, 2017

@jreback I'm sorry, maybe I didn't pick the right words. I just wanted to leave a reminder and add a couple of things; it wasn't a complaint at all. You're right, I tried to fix it, but I found it difficult to understand the reasoning behind some parts of this function.

Best,

joaoleveiga pushed a commit to joaoleveiga/pandas that referenced this issue Dec 30, 2019
This commit adds a new keyword argument `check_low_values`, that will
allow the approximate comparison of numerics based on literal decimal
places. This is particularly useful when comparing low values:

    # This fails because it's doing (1 - .1 / .1001)
    assert_almost_equal(0.1, 0.1001, check_less_precise=True)

    # This will work as intuitively expected
    assert_almost_equal(
        0.1, 0.1001,
        check_less_precise=True,
        check_low_values=True
    )
@joaoleveiga
Contributor

take

@jreback jreback modified the milestones: Contributions Welcome, 1.1 Jun 24, 2020