BUG: comparison warning with NaN when DataFrame/Series broadcasting #16378

Closed
Zhang18 opened this Issue May 17, 2017 · 10 comments

Zhang18 commented May 17, 2017

Code Sample

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'A' : [np.nan, 2.0, np.nan]})
>>> df
     A
0  NaN
1  2.0
2  NaN
>>> x = pd.Series([1, 1, 1])
>>> x
0    1
1    1
2    1
dtype: int64
>>> df.clip_lower(x, axis=0)
/usr/local/lib/python3.5/dist-packages/pandas/core/ops.py:1253: RuntimeWarning: invalid value encountered in greater_equal
  result = op(x, y)
     A
0  NaN
1  2.0
2  NaN

>>> df.clip_lower(x, axis=0)
     A
0  NaN
1  2.0
2  NaN

Problem description

The RuntimeWarning is emitted only the first time clip_lower is called, not on subsequent calls. This is inconsistent.

Expected Output

I propose suppressing this warning altogether, since the use cases that trigger it seem common and legitimate.

Output of pd.show_versions()

>>> pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-75-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.1
pytest: None
pip: 9.0.1
setuptools: 20.7.0
Cython: None
numpy: 1.12.1
scipy: 0.19.0
xarray: None
IPython: 6.0.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 0.7.3
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None

Contributor

TomAugspurger commented May 17, 2017

This repeatedly shows warnings for me. Do you perhaps set a warnings filter somewhere, like

import warnings
warnings.simplefilter('once', RuntimeWarning)

An easy way to test is to call

In [19]: def f():
    ...:     raise RuntimeWarning("test")
    ...:     return 2

a couple of times.
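
One caveat on the test function: `raise RuntimeWarning("test")` raises an ordinary exception, so the warnings filters are never consulted and it fails on every call regardless of any filter. To exercise a filter, the message has to go through `warnings.warn`. The following is a minimal sketch of that distinction; the helper names `f_raise`, `f_warn`, `count_warnings`, and `raises_every_time` are hypothetical and not part of the thread.

```python
import warnings


def f_raise():
    # `raise` treats the warning class as a plain exception;
    # the warnings filters are bypassed entirely.
    raise RuntimeWarning("test")


def f_warn():
    # `warnings.warn` routes the message through the active filters.
    warnings.warn("test", RuntimeWarning)
    return 2


def count_warnings(action, calls=3):
    """Call f_warn() `calls` times under `action`; return how many were shown."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action, RuntimeWarning)
        for _ in range(calls):
            f_warn()
    return len(caught)


def raises_every_time(calls=3):
    """Return True if f_raise() raises on each of `calls` calls."""
    raised = 0
    for _ in range(calls):
        try:
            f_raise()
        except RuntimeWarning:
            raised += 1
    return raised == calls


once_count = count_warnings("once")      # filter suggested in the comment
always_count = count_warnings("always")  # no de-duplication
always_raises = raises_every_time()
print(once_count, always_count, always_raises)
```

Under the `'once'` filter only the first of the three `f_warn()` calls should be recorded, while `f_raise()` fails on every call no matter which filter is installed.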

Zhang18 commented May 17, 2017

Here is the continuous screen output, per your suggestion. Apparently f() keeps raising warnings, whereas clip_lower does not.

Python 3.5.2 (default, Nov 17 2016, 17:05:23)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> def f():
...     raise RuntimeWarning("test")
...     return 1
...
>>> f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
RuntimeWarning: test
>>> f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
RuntimeWarning: test
>>> f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
RuntimeWarning: test
>>> df = pd.DataFrame({'A' : [np.nan, 2.0, np.nan]})
>>> x = pd.Series([1, 1, 1])
>>> df.clip_lower(x, axis=0)
/usr/local/lib/python3.5/dist-packages/pandas/core/ops.py:1253: RuntimeWarning: invalid value encountered in greater_equal
  result = op(x, y)
     A
0  NaN
1  2.0
2  NaN
>>> df.clip_lower(x, axis=0)
     A
0  NaN
1  2.0
2  NaN

Contributor

TomAugspurger commented May 17, 2017

Can you post the output of warnings.filters? It seems unlikely, but it's possible to filter warnings from specific libraries.

Zhang18 commented May 17, 2017

Here you go:

>>> pprint.pprint(warnings.filters)
[('ignore',
  re.compile('numpy.ndarray size changed', re.IGNORECASE),
  <class 'Warning'>,
  re.compile(''),
  0),
 ('ignore',
  re.compile('numpy.ufunc size changed', re.IGNORECASE),
  <class 'Warning'>,
  re.compile(''),
  0),
 ('ignore',
  re.compile('numpy.dtype size changed', re.IGNORECASE),
  <class 'Warning'>,
  re.compile(''),
  0),
 ('always', None, <class 'numpy.lib.polynomial.RankWarning'>, None, 0),
 ('default', None, <class 'urllib3.exceptions.SNIMissingWarning'>, None, 0),
 ('default', None, <class 'urllib3.exceptions.SubjectAltNameWarning'>, None, 0),
 ('ignore', None, <class 'DeprecationWarning'>, None, 0),
 ('ignore', None, <class 'PendingDeprecationWarning'>, None, 0),
 ('ignore', None, <class 'ImportWarning'>, None, 0),
 ('ignore', None, <class 'BytesWarning'>, None, 0),
 ('ignore', None, <class 'ResourceWarning'>, None, 0),
 ('always', None, <class 'urllib3.exceptions.SecurityWarning'>, None, 0),
 ('default',
  None,
  <class 'urllib3.exceptions.InsecurePlatformWarning'>,
  None,
  0),
 ('default', None, <class 'requests.exceptions.FileModeWarning'>, None, 0)]
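
None of the filters listed above matches a RuntimeWarning with the "invalid value encountered" message, so Python falls back to its default action, which prints a given warning once per issuing location (module and line number). That may explain why the warning from ops.py:1253 appears only on the first call in a plain interpreter session. A minimal sketch of that behavior, with hypothetical call sites `site_a` and `site_b` standing in for two different source lines:

```python
import warnings


def site_a():
    # First issuing location.
    warnings.warn("invalid value encountered", RuntimeWarning)


def site_b():
    # Second issuing location: same text and category, different line.
    warnings.warn("invalid value encountered", RuntimeWarning)


with warnings.catch_warnings(record=True) as caught:
    # "default" is the action applied when no filter entry matches.
    warnings.simplefilter("default")
    for _ in range(3):
        site_a()
        site_b()

shown = len(caught)
print(shown)
```

Each call site is invoked three times, but the warning should be recorded only once per site.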

Zhang18 commented May 17, 2017

But inconsistency aside, do you agree this warning is unnecessary and should be removed?

Contributor

TomAugspurger commented May 17, 2017

Sorry, I missed that section of your original post. I think that was fixed by #16373. Can you try on master?

Contributor

jreback commented May 17, 2017

This is a different issue.
This has to do with broadcasting of a Series and a DataFrame.

In [6]: df = pd.DataFrame({'A' : [np.nan, 2.0, np.nan]})

In [7]: x = pd.Series([1, 1, 1])

In [8]: df.ge(x, axis=0)
/Users/jreback/pandas/pandas/core/ops.py:1253: RuntimeWarning: invalid value encountered in greater_equal
  result = op(x, y)
Out[8]: 
       A
0  False
1   True
2  False

Contributor

jreback commented May 17, 2017

I recall the same issue being reported, but can't find it with a quick search. @TomAugspurger any idea?

jreback changed the title from "clip throw inconsistent warnings against NaN" to "BUG: comparison warning with NaN when DataFrame/Series broadcasting" May 17, 2017

jreback added this to the Next Major Release milestone May 17, 2017

Contributor

jreback commented May 17, 2017

Note that the fix here is the same as in #16373, e.g. wrapping the operation with np.errstate.
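
As a rough illustration of what wrapping with np.errstate looks like, here is a minimal sketch on plain NumPy arrays. The helper `ge_ignoring_nan` is hypothetical and is not the pandas implementation; it only shows the pattern of silencing NumPy's "invalid value" floating-point warning around a comparison.

```python
import warnings

import numpy as np


def ge_ignoring_nan(left, right):
    """Elementwise left >= right with NumPy's 'invalid' FP warning silenced."""
    with np.errstate(invalid="ignore"):
        return np.asarray(left) >= np.asarray(right)


values = np.array([np.nan, 2.0, np.nan])
bounds = np.array([1.0, 1.0, 1.0])

with warnings.catch_warnings(record=True) as caught:
    # Record everything so that any warning that slipped through is visible.
    warnings.simplefilter("always")
    mask = ge_ignoring_nan(values, bounds)

print(mask.tolist(), len(caught))
```

Comparisons against NaN evaluate to False, so the mask matches the df.ge(x, axis=0) output shown earlier in the thread, and no RuntimeWarning is recorded inside the errstate block. Whether the unwrapped comparison warns at all depends on the NumPy version and platform.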

Contributor

AndrewArcher commented May 22, 2017

I'm going to start working on this.
