CLN: Refactor string special methods to common decorator + safe unicode everywhere #4090

Closed
jtratner opened this Issue Jul 1, 2013 · 13 comments

Comments

Projects
None yet
2 participants
Contributor

jtratner commented Jul 1, 2013

I was implementing some new objects for another PR and noticed that string methods are duplicated throughout (particularly on objects that don't inherit from each other). Wrote this up on this branch - https://github.com/jtratner/pandas/tree/refactor_string_magic_methods .

If you thing this is worthwhile, I'll split it up a little, add a bit better documentation and submit.

This is used multiple times:

    def __str__(self):
        """
        Return a string representation for a particular object.

        Invoked by str(obj) in both py2/py3.
        Yields Bytestring in Py2, Unicode String in py3.
        """

        if py3compat.PY3:
            return self.__unicode__()
        return self.__bytes__()

    def __bytes__(self):
        """
        Return a string representation for a particular object.

        Invoked by bytes(obj) in py3 only.
        Yields a bytestring in both py2/py3.
        """
        from pandas.core.config import get_option

        encoding = get_option("display.encoding")
        return self.__unicode__().encode(encoding, 'replace')

    def __repr__(self):
        """
        Return a string representation for a particular object.

        Yields Bytestring in Py2, Unicode String in py3.
        """
        return str(self)

Unicode tends to vary, but often is like this:

    def __unicode__(self):
        """
        Return a string representation for a particular object.

        Invoked by unicode(obj) in py2 only. Yields a Unicode String in both
        py2/py3.
        """
        prepr = pprint_thing(self, escape_chars=('\t', '\r', '\n'), quote_strings=True)
        return '%s(%s)' % (type(self).__name__, prepr)

Additionally, a number of objects aren't using a unicode-safe representation of themselves, so this would resolve that as well. Would this be useful to include?

Contributor

jreback commented Jul 1, 2013

I had this change in my refactor of series - but u r welcome to do separately

should be a top level pandas base class ( really should be PandasObject but that is used now)

that PandasObject, NDFrame, Index and such inherit from

with methods like this

Contributor

jtratner commented Jul 1, 2013

One problem is that core/generic imports MultiIndex...should the baseclass go in a different file to avoid the circular import?

Contributor

jreback commented Jul 1, 2013

should be like a mix-in

I think Index makes sense as well here

On Jun 30, 2013, at 8:41 PM, Jeff Tratner notifications@github.com wrote:

@jreback So would you want to make Index inherit from this separately too?


Reply to this email directly or view it on GitHub.

Contributor

jtratner commented Jul 1, 2013

Okay, should be totally simple to adapt.

Contributor

jreback commented Jul 1, 2013

yes maybe core/base.py ?

Contributor

jtratner commented Jul 1, 2013

@jreback haha, I had just starting editing core/base right before I saw your note.

Contributor

jtratner commented Jul 1, 2013

@jreback we could rename PandasObject and call the other PandasContainer or something like that...

Contributor

jreback commented Jul 1, 2013

that works

Contributor

jreback commented Jul 1, 2013

FYI there are also some other public objects that prob should inherit from this as well

Categorical, HDFStore (maybe)

others I am sure

Contributor

jtratner commented Jul 1, 2013

Yeah, I'd caught those. Covered by Series, Frame and Panel via
PandasObject. Will be inherited from by Index, Int64Index and MultiIndex.
Also Sparse* and a number of items in HDFStore. Would you want HDFStore to
inherit from PandasObject or would it make more sense to make a separate
StringMixin (that requires a __unicode__ method)

On Sun, Jun 30, 2013 at 8:57 PM, jreback notifications@github.com wrote:

FYI there are also some other public objects that prob should inherit from
this as well

Categorical, HDFStore (maybe)

others I am sure


Reply to this email directly or view it on GitHubhttps://github.com/pydata/pandas/issues/4090#issuecomment-20259037
.

Contributor

jreback commented Jul 1, 2013

there be a real base class mixin
and maybe some other base classes (eg for HDStore)

shouldn't be too complicated

Contributor

jtratner commented Jul 1, 2013

@jreback what? Slightly confused. I was thinking fo:

class StringMixin(object): # adds string methods

class PandasObject(StringMixin) # baseclass for Index, Categorical, PandasContainer, etc.

class PandasContainer(PandasObject): # baseclass for NDFrame
Contributor

jreback commented Jul 1, 2013

yep that's good

first 2 in core/base

3rd replaces current PandasObject def

jreback closed this in #4092 Jul 2, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment