Dataframe constructor fails when given dict with None value #14381
Comments
gitj added a commit to ColumbiaCMB/kid_readout that referenced this issue Oct 9, 2016 (aaff3ab)
jreback added Bug, Reshaping, Difficulty Intermediate, Effort Low labels Oct 9, 2016
jreback added this to the Next Major Release milestone Oct 9, 2016
So this works correctly in the following cases.
The behavior in 0.18.1 is actually wrong; the value should be coerced. Pull requests to fix are welcome.
jorisvandenbossche modified the milestone: 0.19.1, Next Major Release Oct 10, 2016
brandonmburroughs referenced this issue Oct 11, 2016: BUG: Dataframe constructor when given dict with None value #14392 (Merged)
shawnheide added a commit to shawnheide/pandas that referenced this issue Oct 11, 2016 (6eddbab)
brandonmburroughs referenced this issue Oct 11, 2016: Dataframe constructor does not coerce data=[None] to np.nan #14393 (Closed)
jreback added the Missing-data label Oct 11, 2016
Hey @brandonmburroughs, I saw that you're working on this too and beat me to the PR. No worries, I wasn't as far along. Just wanted to let you know that the same problem shows up with the Series constructor too, i.e. Series([None]) fails to coerce to NaN. I looked at fixing it a little further down the stack in series.py, but didn't check with any tests yet. Feel free to see my commit above that referenced this.
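For readers who want to see what Series([None]) does on their own install, here is a minimal sketch. It is not taken from the commit mentioned above, and the dtype and stored value may differ between pandas versions, so the snippet only prints them rather than assuming a result.

```python
import pandas as pd

# A one-element Series built from a list that holds only None.
s = pd.Series([None])

first = s.iloc[0]
print("pandas", pd.__version__)
print("dtype:", s.dtype)
# pd.isna treats both None and NaN as missing, so this line reads the same
# whether or not the constructor coerced the value.
print("missing:", pd.isna(first), "| stored as None:", first is None)
```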
gitj commented Oct 11, 2016
I was going to work on a PR but looks like you guys are on top of it. Thanks!
@shawnheide I actually noticed this problem after I created my PR. I created an issue (#14393) about this and there is some discussion going on there as to how to handle this as the cases are different. Depending upon how they want to handle the API design, your fix may be better suited to handle all cases.
@jreback Given your comment in #14393 (comment), I would personally say that the above case should not coerce to NaN, but keep the None. Thoughts? But in that case, @brandonmburroughs, your PR should be updated.
yeah open to having it be pre-0.19.0 behavior (IOW, remain as
To illustrate, in pandas 0.18:
So for 0.19.1, I would choose to go back to 0.18.1 behaviour, so not coercing to NaN (keep as None).
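If the constructor keeps None, code that needs NaN can still ask for it explicitly. The following is a rough sketch of one way to do that, assuming a column whose non-missing values are numeric; the sample data is made up for illustration and is not from this thread.

```python
import numpy as np
import pandas as pd

# Start from an explicit object dtype so the column holds None regardless of
# what the constructor would infer by default.
s = pd.Series([None, 1.5], dtype=object)

# Casting to float turns None into NaN and leaves the number untouched.
as_float = s.astype("float64")

print(s.tolist())
print(as_float.tolist())
```

Opting in with astype keeps the default constructor behaviour out of the picture, which is why it does not depend on how the discussion here is resolved.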
gitj commented Oct 9, 2016 (edited by jorisvandenbossche)
A small, complete example of the issue
Expected Output
This previously worked with a sensible output in 0.18.1:
In [2]: pd.DataFrame(dict(a=None), index=[0])
Out[2]:
      a
0  None
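A standalone version of the call above, as a sketch for checking which behaviour a given install has. The printing is an addition for illustration; per this report the call fails on 0.19.0, and on other versions whether the cell holds None or NaN depends on the release.

```python
import pandas as pd

# Same call as in the report: a scalar None broadcast over a one-row index.
df = pd.DataFrame(dict(a=None), index=[0])

value = df.loc[0, "a"]
print("pandas", pd.__version__)
print(df)
print(df.dtypes)
# None and NaN both count as missing, so this check does not depend on
# which of the two the constructor produced.
print("missing:", pd.isna(value), "| stored as None:", value is None)
```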
Output of pd.show_versions()
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 3.2.0-4-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.24
numpy: 1.11.2
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.2.0
sphinx: 1.4.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.5
lxml: 3.6.0
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None
Broken version:
INSTALLED VERSIONS
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Linux
OS-release: 3.2.0-4-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: None.None
pandas: 0.19.0
nose: 1.3.7
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.24
numpy: 1.11.2
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.2.0
sphinx: 1.4.1
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.4
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.5
lxml: 3.6.0
bs4: 4.3.2
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None