JSON nested_to_record Silently Drops Top-Level None Values #21356

Closed
WillAyd opened this Issue Jun 7, 2018 · 4 comments

3 participants
@WillAyd
Member

WillAyd commented Jun 7, 2018

xref #21164 (comment)

nested_to_record is silently dropping None values that appear at the top of the JSON. This is IMO unexpected and undesirable.

Code Sample, a copy-pastable example if possible

In [2]: from pandas.io.json.normalize import nested_to_record

In [3]: data = {
   ...:     "id": None,
   ...:     "location": {
   ...:         "country": None
   ...:     }
   ...: }

In [5]: nested_to_record(data)
Out[5]: {'location.country': None}

Problem description

The top level None value should not be dropped but rather preserved along with lower levels for consistency.

Expected Output

In [5]: nested_to_record(data)
Out[5]: {'id': None, 'location.country': None}

Note this will break a few tests in pandas/test_normalize.py
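The expected behavior can be illustrated with a minimal, standalone sketch. This is not the pandas implementation of `nested_to_record`; `flatten_record` and its `prefix` and `sep` parameters are hypothetical names used only to show a flattening routine that keeps `None` values at every level, including the top one.

```python
def flatten_record(record, prefix="", sep="."):
    """Flatten nested dicts into a single-level dict with joined keys.

    Hypothetical helper for illustration; not the pandas implementation.
    """
    flat = {}
    for key, value in record.items():
        # Build the joined key; top-level keys have no prefix.
        full_key = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the joined key as prefix.
            flat.update(flatten_record(value, full_key, sep))
        else:
            # Scalars, including None, are kept at every nesting level.
            flat[full_key] = value
    return flat


data = {"id": None, "location": {"country": None}}
result = flatten_record(data)
print(result)  # {'id': None, 'location.country': None}
```

The point of the sketch is that the `None` check is simply absent: a value is either a dict to recurse into or a leaf to keep, so top-level and nested `None` values are treated the same way.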

Output of pd.show_versions()

INSTALLED VERSIONS

commit: ab6aaf7
python: 3.6.4.final.0
python-bits: 64
OS: Darwin
OS-release: 17.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.0.dev0+67.gab6aaf73a
pytest: 3.4.1
pip: 10.0.1
setuptools: 38.5.1
Cython: 0.27.3
numpy: 1.14.1
scipy: 1.0.0
pyarrow: 0.8.0
xarray: 0.10.0
IPython: 6.2.1
sphinx: 1.7.0
patsy: 0.5.0
dateutil: 2.6.1
pytz: 2018.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: 0.4.0
matplotlib: 2.1.2
openpyxl: 2.5.0
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.5
pymysql: 0.8.0
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: 0.1.3
fastparquet: 0.1.4
pandas_gbq: 0.4.1
pandas_datareader: None

@daminisatya

Contributor

daminisatya commented Jun 7, 2018

Hi,
I'm looking for a good first timer issue. Would like to take this up!
Thank you.

daminisatya added a commit to daminisatya/pandas that referenced this issue Jun 7, 2018

@jorisvandenbossche

Member

jorisvandenbossche commented Jun 7, 2018

@WillAyd Shouldn't this target 0.23.1?
I didn't fully follow the discussion in the other issues/PR, but I understand that this reverts a change made between 0.22 and 0.23?

@WillAyd

Member

WillAyd commented Jun 7, 2018

It really depends on whether or not you think this is a bug or breaks backwards compatibility. The change that introduced it (#20030) had the explicit intention of dropping those values.

I personally would consider it a bug, so I am fine if we decide it's a better fit for 0.23.1, but I don't have too strong a preference either way.

@jorisvandenbossche

Member

jorisvandenbossche commented Jun 7, 2018

If we change our minds about what the best solution is, I would also regard it as a regression fix (although an intentionally included regression ...).

Let's tag it 0.23.1.

@jorisvandenbossche jorisvandenbossche added this to the 0.23.1 milestone Jun 7, 2018

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Jun 12, 2018

david-liu-brattle-1 added a commit to david-liu-brattle-1/pandas that referenced this issue Jun 18, 2018

Sup3rGeo added a commit to Sup3rGeo/pandas that referenced this issue Oct 1, 2018
