BUG: truncated repr with pd.NA in object dtype column shows "NaN" #33065

HYChou0515 · 2020-03-27T10:57:29Z

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

# This returns a repr containing '<NA>', as expected
str(pd.DataFrame(np.full((60, 1), pd.NA)))

# This returns a repr containing 'NaN', which is wrong
str(pd.DataFrame(np.full((61, 1), pd.NA)))

# The problem is only in the repr; the stored entry is still pd.NA
# This returns '<NA>'
str(pd.DataFrame(np.full((61, 1), pd.NA)).iloc[0, 0])

Problem description

pd.NA should always be rendered as <NA> in the DataFrame repr, regardless of the number of rows; showing NaN instead is confusing.
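A minimal sketch for checking the row-count threshold on a given pandas installation. The helper name `repr_shows_na` is hypothetical, and the display options are pinned on the assumption that the report used the defaults (`display.max_rows` of 60). The result for more than 60 rows depends on the pandas version, so no output is asserted for those cases.

```python
import numpy as np
import pandas as pd


def repr_shows_na(n_rows: int) -> bool:
    """Return True if str() of an all-pd.NA object-dtype frame contains '<NA>'.

    Hypothetical helper for illustration; not part of pandas.
    """
    df = pd.DataFrame(np.full((n_rows, 1), pd.NA, dtype=object))
    # Pin the display options so truncation starts above 60 rows (assumed defaults).
    with pd.option_context("display.max_rows", 60, "display.min_rows", 10):
        return "<NA>" in str(df)


# Probe row counts on both sides of the truncation threshold.
results = {n: repr_shows_na(n) for n in (59, 60, 61, 62)}
for n, ok in results.items():
    print(f"{n} rows: {'<NA> shown' if ok else 'no <NA> in repr'}")
```

On an affected version, the 61- and 62-row frames would be the ones reported without `<NA>`.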

Expected Output

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : None
python : 3.8.1.final.0
python-bits : 64
OS : Linux
OS-release : 4.15.0-91-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : None.None

pandas : 1.0.1
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 45.2.0
Cython : None
pytest : 5.3.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.12.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : 3.2.0
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.3.5
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None


jorisvandenbossche · 2020-03-27T13:35:13Z

@HYChou0515 Thanks for the report! This is indeed a bug in the repr of truncated dataframes.

Note that it only happens for object dtype though, not if using one of the new dtypes that use pd.NA:

In [16]: print(pd.DataFrame(np.full((61, 1), pd.NA)))   
      0
0   NaN
1   NaN
2   NaN
3   NaN
4   NaN
..  ...
56  NaN
57  NaN
58  NaN
59  NaN
60  NaN

[61 rows x 1 columns]

In [17]: print(pd.DataFrame(np.full((61, 1), pd.NA), dtype="string"))
       0
0   <NA>
1   <NA>
2   <NA>
3   <NA>
4   <NA>
..   ...
56  <NA>
57  <NA>
58  <NA>
59  <NA>
60  <NA>

[61 rows x 1 columns]
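The comparison above can be sketched as a script. This is only an illustration: it uses `astype("string")` rather than passing `dtype="string"` to the constructor, and it pins the display options to the assumed defaults. What the object-dtype frame prints depends on the pandas version, so only the "string" dtype result is checked.

```python
import numpy as np
import pandas as pd

# Object-dtype frame from the report, plus a copy converted to the "string" dtype.
obj_df = pd.DataFrame(np.full((61, 1), pd.NA, dtype=object))
str_df = obj_df.astype("string")

# Force the truncated repr (61 rows > max_rows of 60) for both frames.
with pd.option_context("display.max_rows", 60, "display.min_rows", 10):
    obj_text = str(obj_df)
    str_text = str(str_df)

# The stored values are pd.NA in both cases; only the rendering can differ.
print("object dtype shows <NA>:", "<NA>" in obj_text)
print("string dtype shows <NA>:", "<NA>" in str_text)
```

Converting to a nullable dtype changes the column's dtype, so it is a workaround for display purposes rather than a fix for the repr itself.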

MichaelTiemannOSC · 2023-07-21T23:23:15Z

Agree with the OP--it's very confusing to see pd.NA rendered as NaN!

HYChou0515 changed the title ~~When df more than 60 rows, str(pd.NA) becomes 'NaN'~~ When df has more than 60 rows, str(pd.NA) becomes 'NaN' Mar 27, 2020

jorisvandenbossche added the Bug and Output-Formatting (__repr__ of pandas objects, to_string) labels Mar 27, 2020

jorisvandenbossche changed the title ~~When df has more than 60 rows, str(pd.NA) becomes 'NaN'~~ BUG: truncated repr with pd.NA in object dtype column shows "NaN" Mar 27, 2020

simonjayhawkins mentioned this issue Mar 27, 2020 in: pd.NA in object dtype #32931 (Open)
