ENH: json_normalize should allow a different separator than . #14883
Comments
The dot notation for accessing columns is just a convenience. You can still get the column normally using
jowens commented Dec 15, 2016
I understand. I'm merely offering the observation that columns with
Sure thing. Just wanted to point that out in case it was keeping you from doing something you needed to do now since you said that you were new to pandas.
jowens commented Dec 15, 2016
Yeah, it's the vega-lite bug I filed (vega/vega-lite#1775) that's my proximate difficulty here.
jreback added the IO JSON label Dec 15, 2016
This would be quite easy to add; PRs welcome.
jreback added Enhancement, Difficulty Novice, Effort Low labels Dec 15, 2016
jreback added this to the Next Major Release milestone Dec 15, 2016
This was referenced Dec 16, 2016
jreback added a commit to jowens/pandas that referenced this issue Jan 22, 2017 (jowens + jreback, e707e79)
jreback modified the milestone: 0.20.0, Next Major Release Mar 28, 2017
jreback added a commit to jowens/pandas that referenced this issue Mar 28, 2017 (jowens + jreback, 6a0f954)
jreback added a commit to jowens/pandas that referenced this issue Mar 28, 2017 (jowens + jreback, 8edc40e)
jreback closed this in 34c6bd0 Mar 28, 2017
mattip added a commit to mattip/pandas that referenced this issue Apr 3, 2017 (jreback + mattip, 75b6512)
linebp added a commit to linebp/pandas that referenced this issue Apr 17, 2017 (jreback + linebp, 6ca3087)
jowens commented Dec 14, 2016
Problem description
The above snippet shows that it's not ideal to have `.` as a character in a column name. (I'm running into this when using Vega for data visualization, vega/vega-lite#1775.) When `json_normalize` flattens a nested input JSON, it separates the nesting levels with a `.`. I believe this happens on this line: https://github.com/pandas-dev/pandas/blob/7d8bc0deaeb8237a0cf361048363c78f4867f218/pandas/io/json.py#L831
I'd like to see an additional argument to `json_normalize`, `separator`, with default `.`, that specifies the character (string) that separates nesting levels. In the line of code above, `'.'.join(val)` would be replaced by `separator.join(val)` (if I'm reading what that line does correctly). I could use, say, `_` to use underscore instead of period.
n00b at pandas, please correct me if I'm doing anything wrong.
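To make the proposal concrete, here is a minimal, pure-Python sketch of the idea. It is not pandas' actual implementation: `flatten_record` and its `separator` and `prefix` parameters are hypothetical names used only for illustration. The one thing it is meant to show is that replacing a hard-coded `'.'.join(...)` with `separator.join(...)` is enough to make the joining character configurable.

```python
def flatten_record(record, separator=".", prefix=()):
    """Flatten nested dicts into one level, joining key paths with `separator`.

    Hypothetical helper for illustration only; not part of pandas.
    """
    flat = {}
    for key, value in record.items():
        path = prefix + (str(key),)
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the key path along.
            flat.update(flatten_record(value, separator, path))
        else:
            # The proposed change: separator.join(...) instead of '.'.join(...).
            flat[separator.join(path)] = value
    return flat


record = {"id": 1, "name": {"first": "Ada", "last": "Lovelace"}}

flat_dot = flatten_record(record)                         # default separator
flat_underscore = flatten_record(record, separator="_")   # underscore instead

print(flat_dot)
print(flat_underscore)
```

With the default separator the nested keys come out as `name.first` and `name.last`. With `separator="_"` they become `name_first` and `name_last`, which avoids the `.` in column names that trips up Vega.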
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Darwin
OS-release: 16.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.US-ASCII
LOCALE: None.None
pandas: 0.19.1
nose: 1.3.7
pip: 9.0.1
setuptools: 30.3.0
Cython: 0.25.2
numpy: 1.11.2
scipy: 0.18.1
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: 1.5.1
patsy: None
dateutil: 2.6.0
pytz: 2016.10
blosc: None
bottleneck: 1.2.0
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.4.1
xlrd: 1.0.0
xlwt: None
xlsxwriter: None
lxml: 3.6.0
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None