Series not honoring class __repr__ or __str__ #18843

Open
achapkowski opened this issue Dec 19, 2017 · 8 comments
Labels
Bug Output-Formatting __repr__ of pandas objects, to_string

Comments

@achapkowski

Code Sample, a copy-pastable example if possible

class foo(dict):
    def __init__(self, iterable=None, **kwargs):
        if iterable is None:
            iterable = ()
        super(foo, self).__init__(iterable)
        self.update(kwargs)
    def __repr__(self):
        return ",".join(self.keys())
    def __str__(self):
        return ",".join(self.keys())

f = foo({'alpha' : 'b',
    'beta' : 'c'})

import pandas as pd
pd.DataFrame(data=[['A', 1, f]], columns=['D', 'F', 'G'])

Problem description

For a Series holding a custom object, I want to control what is shown when it is printed or displayed in an IPython notebook. The object foo is a simple class that overrides __str__ and __repr__, but pandas still displays the object's dictionary content, not the view I want to show end users. How do I control that?

Expected Output

alpha,beta

What I get is:

{'alpha': 'b', 'beta': 'c'}

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 94 Stepping 3, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.21.1
pytest: 3.3.1
pip: 9.0.1
setuptools: 38.2.4
Cython: None
numpy: 1.11.2
scipy: 0.18.1
pyarrow: None
xarray: None
IPython: 5.3.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 1.5.3
openpyxl: None
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 0.999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@TomAugspurger
Contributor

This is probably the same as #17695 (you inherit from dict, so your objects are iterable). It's difficult for pandas to support formatting arbitrary objects.

Your simple example could be solved by not subclassing dict, and just storing your iterable on an internal ._data attribute. But that likely isn't a solution for your real problem.
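A minimal sketch of that suggestion, assuming a hypothetical class name `KeyView` and the `._data` attribute mentioned above. The wrapper holds a dict instead of inheriting from one and defines neither `__iter__` nor `__len__`, so a check based on `iter()` and `len()` would not treat it as a sequence:

```python
class KeyView:
    """Hypothetical wrapper: stores a dict on ._data instead of subclassing dict."""

    def __init__(self, iterable=None, **kwargs):
        # Build the internal dict from an optional mapping/iterable plus keywords.
        self._data = dict(iterable or (), **kwargs)

    def get(self, key, default=None):
        # Explicit accessor; __getitem__ is deliberately left out so that
        # iter() cannot fall back to the legacy sequence protocol.
        return self._data.get(key, default)

    def __repr__(self):
        # str() falls back to __repr__, so one method covers both.
        return ",".join(self._data)


f = KeyView({"alpha": "b", "beta": "c"})
print(repr(f))  # alpha,beta
```

As noted above, this only helps when the class hierarchy can be changed.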

@achapkowski
Author

@TomAugspurger not subclassing dict is not an option, since the other classes are already established. Is there a way to override the print function (not optimal), or to set something on the class that says "use the __repr__"?

@TomAugspurger
Contributor

TomAugspurger commented Dec 19, 2017 via email

@jamesmyatt
Contributor

jamesmyatt commented Jun 24, 2019

In pprint_thing, why is hasattr(thing, '__next__') a special case?

if hasattr(thing, '__next__'):

@mroeschke mroeschke added the Bug label May 16, 2020
@rajeee

rajeee commented Jul 3, 2020

Bump to @jamesmyatt's question: why is hasattr(thing, '__next__') a special case in pprint_thing?
The __next__ attribute is present on iterators, and it is not clear why iterator objects would be printed directly with str while other kinds of objects are passed through the as_escaped_unicode function before printing.
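For context on the question, `__next__` is what separates an iterator from an object that is merely iterable. A small sketch using only built-ins (no pandas internals are reproduced here):

```python
d = {"alpha": "b", "beta": "c"}
it = iter(d)

# A dict is iterable but is not itself an iterator.
print(hasattr(d, "__next__"))   # False
print(hasattr(it, "__next__"))  # True

# Both expose __iter__, so that attribute alone cannot tell them apart.
print(hasattr(d, "__iter__"), hasattr(it, "__iter__"))  # True True

# Walking an iterator consumes it; the dict itself is unaffected.
first = next(it)
remaining = list(it)
print(first, remaining)  # alpha ['beta']
print(list(d))           # ['alpha', 'beta']
```

Displaying an iterator's elements would therefore consume it. Whether that motivated the check is not stated in this thread.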

@lgharibashvili

Quick and dirty patch for those who cannot wait for the fix:

from pandas.io.formats import printing as pd_printing
pd_printing.is_sequence = lambda obj: False

@mzeitlin11
Member

@rajeee, @jamesmyatt I'm not sure why that check is there. A well-tested PR that tries to fix this issue by removing it would be the next step, if you (or anyone else) are interested!

@mzeitlin11 mzeitlin11 added this to the Contributions Welcome milestone Apr 15, 2021
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022
@danking
Contributor

danking commented Oct 16, 2023

This bug is no longer a bug:

Out[48]: 
   D  F                            G
0  A  1  {'alpha': 'b', 'beta': 'c'}

But a similar issue arises when you subclass Mapping (but not dict). A Mapping is a Collection, which is both an Iterable and a Sized; these define __iter__ and __len__ respectively, which triggers Pandas' special logic.
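A minimal sketch of such a class, using the hypothetical name `KeysOnly`. It is not a dict, yet it satisfies both the `Iterable` and `Sized` checks, and iterating over it yields only the keys:

```python
from collections.abc import Iterable, Mapping, Sized


class KeysOnly(Mapping):
    """Hypothetical Mapping subclass that does not inherit from dict."""

    def __init__(self, data):
        self._data = dict(data)

    # The three abstract methods that Mapping requires.
    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __repr__(self):
        return ",".join(self._data)


m = KeysOnly({"alpha": "b", "beta": "c"})
print(isinstance(m, dict))                            # False
print(isinstance(m, Iterable), isinstance(m, Sized))  # True True
print(list(m))                                        # ['alpha', 'beta']
print(repr(m))                                        # alpha,beta
```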

We could change the isinstance check to use isinstance(value, Mapping), but then our custom mappings would look like dicts. This still seems like an improvement over seeing only the keys.
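To illustrate the trade-off, here is a sketch of a formatter that checks for `Mapping` before anything else. `format_cell` is a hypothetical stand-in, not pandas code, and `MappingProxyType` is used only as a ready-made mapping that is not a dict:

```python
from collections.abc import Mapping
from types import MappingProxyType


def format_cell(value):
    """Hypothetical formatter: render any Mapping like a dict, else use repr()."""
    if isinstance(value, Mapping):
        pairs = ", ".join(f"{k!r}: {v!r}" for k, v in value.items())
        return "{" + pairs + "}"
    return repr(value)


proxy = MappingProxyType({"alpha": "b", "beta": "c"})
print(isinstance(proxy, dict))  # False
print(format_cell(proxy))       # {'alpha': 'b', 'beta': 'c'}
print(format_cell(3))           # 3
```

The mapping's own `__repr__` is bypassed, which is the "will look like dicts" effect described above.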

It does seem a lot easier to just override is_sequence to ignore our custom classes.
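A sketch of what that override could look like. `default_is_sequence` is a stand-in written for this example, on the assumption that the library check succeeds when both `iter()` and `len()` work and the object is not a string; `ignoring` and `Custom` are hypothetical names:

```python
def default_is_sequence(obj):
    """Stand-in for the library predicate (assumed behaviour, see lead-in)."""
    try:
        iter(obj)
        len(obj)
        return not isinstance(obj, (str, bytes))
    except (TypeError, AttributeError):
        return False


def ignoring(original, *ignored_types):
    """Wrap a predicate so instances of ignored_types never count as sequences."""
    def is_sequence(obj):
        if isinstance(obj, ignored_types):
            return False
        return original(obj)  # defer to the wrapped check for everything else
    return is_sequence


class Custom:
    """Hypothetical iterable, sized class with its own repr."""

    def __init__(self, *keys):
        self._keys = keys

    def __iter__(self):
        return iter(self._keys)

    def __len__(self):
        return len(self._keys)

    def __repr__(self):
        return ",".join(self._keys)


patched = ignoring(default_is_sequence, Custom)
c = Custom("alpha", "beta")
print(default_is_sequence(c))  # True
print(patched(c))              # False
print(patched([1, 2, 3]))      # True
```

Unlike the blanket `lambda obj: False` patch quoted earlier in this thread, this keeps the default behaviour for lists and other sequences. Assigning such a wrapper over `pd_printing.is_sequence` would still rely on a private module attribute and may break between pandas versions.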

9 participants