BUG: IntervalTree should not have NaN in nodes #23352

Closed
jschendel opened this issue Oct 26, 2018 · 0 comments


Code Sample, a copy-pastable example if possible

Initializing an IntervalTree with arrays containing np.nan can trigger a RuntimeWarning:

In [2]: left, right = [0, 1, 2, np.nan], [1, 2, 3, np.nan]

In [3]: tree = pd._libs.interval.IntervalTree(left, right, leaf_size=2)
/home/jeremy/anaconda3/lib/python3.6/site-packages/numpy/lib/function_base.py:4033: RuntimeWarning: Invalid value encountered in median
  r = func(a, **kwargs)

This causes some attributes to be incorrectly set as np.nan, which can lead to incorrect results:

In [4]: tree.get_loc(0.5)
Out[4]: array([0, 1, 2, 3])
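
The mechanism can be illustrated without pandas. This is only a sketch, not the actual IntervalTree code: it assumes the tree derives a pivot from a median over the interval bounds, which the traceback suggests but this issue does not show. The midpoint formula and the names `midpoints`, `pivot`, and `clean_pivot` are hypothetical.

```python
import warnings

import numpy as np

left = np.array([0, 1, 2, np.nan])
right = np.array([1, 2, 3, np.nan])

# Hypothetical pivot: median of the interval midpoints (an assumption,
# not taken from the pandas source).
midpoints = (left + right) / 2

with warnings.catch_warnings():
    # Older NumPy versions emit the RuntimeWarning shown above here.
    warnings.simplefilter("ignore", RuntimeWarning)
    pivot = np.median(midpoints)  # NaN, because one midpoint is NaN

# Every ordered comparison against NaN is False, so a query key can
# never be placed strictly left or strictly right of a NaN pivot.
key = 0.5
goes_left = key < pivot
goes_right = key > pivot

# Dropping the NaN entries before taking the median gives a usable pivot.
valid = ~np.isnan(midpoints)
clean_pivot = np.median(midpoints[valid])
```

With a NaN pivot, neither `goes_left` nor `goes_right` is true, which is one plausible way a query could end up matching every interval stored at that node.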

Problem description

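Initializing an IntervalTree with data containing np.nan triggers a RuntimeWarning from NumPy's median calculation. The NaN values that end up in the tree's attributes can then cause queries such as get_loc to return incorrect results.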

Expected Output

I'd expect no warnings to be raised and for the get_loc query to be correct: tree.get_loc(0.5) should return only index 0, since 0.5 lies only in the first interval (0 to 1).
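
As a reference point for what "correct" means here, a brute-force lookup over the same data can be sketched as follows. It is not the IntervalTree algorithm, and `brute_force_get_loc` is a hypothetical helper. It assumes right-closed intervals and treats any interval with a NaN bound as matching nothing.

```python
import math


def brute_force_get_loc(left, right, key):
    """Return positions i where left[i] < key <= right[i].

    Intervals with a NaN bound are skipped, so they never match.
    Right-closed intervals are an assumption of this sketch.
    """
    matches = []
    for i, (lo, hi) in enumerate(zip(left, right)):
        if math.isnan(lo) or math.isnan(hi):
            continue  # a missing interval contains no key
        if lo < key <= hi:
            matches.append(i)
    return matches


left = [0, 1, 2, math.nan]
right = [1, 2, 3, math.nan]

result = brute_force_get_loc(left, right, 0.5)
print(result)  # [0]
```

A linear scan like this is far slower than a tree on large inputs, but it is a handy oracle for checking tree results in a test.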

Output of pd.show_versions()

INSTALLED VERSIONS

commit: 437f31c
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.14.29-galliumos
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.0.dev0+824.g437f31c
pytest: 3.5.1
pip: 18.0
setuptools: 39.1.0
Cython: 0.28.2
numpy: 1.14.3
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 6.4.0
sphinx: 1.7.4
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.3
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: 2.5.3
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.4
lxml: 4.2.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.7
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None

jschendel added this to the 0.24.0 milestone on Oct 26, 2018

jschendel referenced this issue on Oct 26, 2018 in a merged pull request:

BUG: Fix IntervalTree handling of NaN #23353 (4 of 4 tasks complete)