CLN: Refactor string special methods #4092
jtratner
referenced
this pull request
Jul 1, 2013
Merged
ENH/BUG: Fix names, levels and labels handling in MultiIndex #4039
|
Can squash this into fewer commits if you want ... just like to lay out the changes clearly for people to review. |
|
are there any major api changes here? |
cpcloud
and 1 other
commented on an outdated diff
Jul 1, 2013
| @@ -175,6 +175,16 @@ pandas 0.12 | ||
| ``bs4`` + ``html5lib`` when lxml fails to parse. a list of parsers to try | ||
| until success is also valid | ||
| - more consistency in the to_datetime return types (give string/array of string inputs) (:issue:`3888`) | ||
| + - The internal ``pandas`` class hierarchy has changed (slightly). The | ||
| + previous ``PandasObject`` now is called ``PandasContainer`` and a new | ||
| + ``PandasObject`` has become the baseclass for ``PandasContainer`` as well | ||
| + as ``Index``, ``Categorical``, ``GroupBy``, ``SparseList``, and | ||
| + ``SparseArray`` (+ their base classes). Currently, ``PandasObject`` | ||
| + provides string methods (from ``StringMixin``). (:issue:`4090`) | ||
| + - New ``StringMixin`` that, given a ``__unicode__`` method, gets python 2 and | ||
| + python 3 compatible string methods (``__str__``, ``__bytes__``, and | ||
| + ``__repr__``). Plus string safety throughout. Now employed in many places | ||
| + throughout the pandas library. (:issue:`4090`) | ||
| **Experimental Feautres** |
|
|
|
ahem user facing api changes is what i meant :) |
cpcloud
commented on the diff
Jul 1, 2013
| + Yields Bytestring in Py2, Unicode String in py3. | ||
| + """ | ||
| + return str(self) | ||
| + | ||
| +class PandasObject(StringMixin): | ||
| + """baseclass for various pandas objects""" | ||
| + | ||
| + def __unicode__(self): | ||
| + """ | ||
| + Return a string representation for a particular object. | ||
| + | ||
| + Invoked by unicode(obj) in py2 only. Yields a Unicode String in both | ||
| + py2/py3. | ||
| + """ | ||
| + # Should be overwritten by base classes | ||
| + return object.__repr__(self) |
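The diff above shows only part of the mixin, so here is a minimal sketch of how a `StringMixin` could derive `__str__`, `__bytes__`, and `__repr__` from a single `__unicode__` method. It is not the code from this PR: the `PY3` flag, the utf-8 encoding, and the `Labelled` subclass are assumptions made for illustration.

```python
import sys

PY3 = sys.version_info[0] >= 3  # assumed helper flag, not taken from the PR


class StringMixin(object):
    """Derive __str__, __bytes__ and __repr__ from one __unicode__ method."""

    def __unicode__(self):
        raise NotImplementedError

    def __str__(self):
        # Unicode text on Python 3, an encoded bytestring on Python 2.
        if PY3:
            return self.__unicode__()
        return self.__bytes__()

    def __bytes__(self):
        # The encoding is an assumption; utf-8 is used here for illustration.
        return self.__unicode__().encode("utf-8", "replace")

    def __repr__(self):
        # Matches the `return str(self)` line visible in the diff.
        return str(self)


class PandasObject(StringMixin):
    """Base class with a fallback __unicode__, as in the diff above."""

    def __unicode__(self):
        # Fallback used when a subclass does not override __unicode__.
        return object.__repr__(self)


class Labelled(PandasObject):  # hypothetical subclass, for illustration only
    def __init__(self, name):
        self.name = name

    def __unicode__(self):
        return u"Labelled(%s)" % self.name


print(repr(Labelled("a")))  # Labelled(a)
```

A subclass only has to supply `__unicode__`; classes that do not override it fall back to the default `object` repr.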
jtratner
Contributor
|
jtratner
commented on the diff
Jul 1, 2013
| @@ -8,13 +8,13 @@ enhancements along with a large number of bug fixes. | ||
| Highlites include a consistent I/O API naming scheme, routines to read html, | ||
| write multi-indexes to csv files, read & write STATA data files, read & write JSON format | ||
| -files, Python 3 support for ``HDFStore``, filtering of groupby expressions via ``filter``, and a | ||
| +files, Python 3 support for ``HDFStore``, filtering of groupby expressions via ``filter``, and a |
jtratner
Contributor
|
cpcloud
commented on the diff
Jul 1, 2013
| @@ -201,6 +202,10 @@ def __init__(self, obj, keys=None, axis=0, level=None, | ||
| def __len__(self): | ||
| return len(self.indices) | ||
| + def __unicode__(self): | ||
| + # TODO: Better unicode/repr for GroupBy object |
jtratner
Contributor
|
|
This doesn't change any user-facing API at all. It does change the default NDFrame |
|
this would close #3231; the only remaining question is do we need a PandasScalar? |
|
I didn't put this into Period or Timestamp - should it go there too?
|
|
would it make string methods in Period-Timestamp ? |
|
what purpose would |
|
Side note: added PandasObject to |
|
my question is are there ops on scalars that are useful outside of their role in arrays and repring themselves, personally i almost never use the scalar versions for anything except inspection |
|
also is there a problem with moving |
|
No. Would that just be
|
|
yep |
|
only reason to have |
|
if just need the instance check then can just fake with

    def is_pd_scalar(obj):
        return isinstance(obj, (Timestamp, Period))

adding to the instance check if (when?) more scalar types are added. |
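A runnable sketch of that instance check, assuming nothing beyond the comment above. `Timestamp` and `Period` are empty stand-ins here so the snippet runs without pandas (in practice they would be imported from pandas), and the `_SCALAR_TYPES` tuple is a hypothetical name for the place where new scalar types would be added.

```python
class Timestamp(object):
    """Stand-in for pandas.Timestamp (illustration only)."""


class Period(object):
    """Stand-in for pandas.Period (illustration only)."""


# Extend this tuple if (when?) more scalar types are added.
_SCALAR_TYPES = (Timestamp, Period)


def is_pd_scalar(obj):
    """Return True if obj is an instance of one of the listed scalar types."""
    return isinstance(obj, _SCALAR_TYPES)


print(is_pd_scalar(Period()))  # True
print(is_pd_scalar(3.14))      # False
```

This keeps the check in one function, so no shared `PandasScalar` base class is needed just to answer "is this a pandas scalar?".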
|
i think ok for 0.12 |
|
|
|
@jtratner maybe squash it down a bit? |
|
If it's list like, use PandasContainer? Think that's already the case.
|
|
retract need for PandasScalar.....no need really (and if there is legit need later, can always add)
|
|
Yeah, I'm going to squash it down...just wanted to show steps while As an aside, it might make more sense to just create an abstract base class
|
|
FWIW i like that you don't squash for a while, while you're building, i'm starting to do that. i think it helps show the thought process. plenty of examples too, i think the series-as-ndframe pr has a ton of commits |
|
Plus, if you squash, it keeps the text of all the commits which is nice. For me, I like to make sure every commit passes its tests and distinct changes should end up in different commits to make it easier to localize issues if they occur. (e.g. via git bisect) |
|
@cpcloud - I added the constructor argument to the baseclass and removed a
|
jtratner
added some commits
Jul 1, 2013
|
just confirming there's no user-facing API changes and have you run the perf regression test? |
|
@wesm there's definitely no user-facing API changes. I ran |
|
nope that would be it. bombs away |
test suite above @wesm - is that what reasonably close? |
|
looks good @jtratner merging UNODIR (unless otherwise directed) |
|
@jreback go ahead - all good on my end |
jreback
added a commit
that referenced
this pull request
Jul 2, 2013
|
|
jreback |
030f613
|
jtratner commented Jul 1, 2013
closes #4090, #3231
pandas class hierarchy has changed (slightly). The previous PandasObject now is called PandasContainer and a new PandasObject has become the baseclass for PandasContainer as well as Index, Categorical, GroupBy, SparseList, and SparseArray (+ their base classes). Currently, PandasObject provides string methods (from StringMixin).
StringMixin that, given a __unicode__ method, gets python 2 and python 3 compatible string methods (__str__, __bytes__, and __repr__). Now employed in many places throughout the pandas library.
used in the string or repr methods of a library).
I'm hoping to use this for the new objects in the MultiIndex naming PR, so hopefully it's okay to merge for v0.12.
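The hierarchy change described here can be pictured with empty stand-in classes. This is only a sketch of the inheritance relationships named in the description, not the pandas definitions themselves: the real classes carry other bases and behaviour that are omitted.

```python
class StringMixin(object):
    """Stand-in: the real mixin builds string methods from __unicode__."""


class PandasObject(StringMixin):
    """New base class for various pandas objects."""


class PandasContainer(PandasObject):
    """The class that was previously named PandasObject."""


# Stand-ins for the classes the description lists as PandasObject subclasses.
class Index(PandasObject):
    """Stand-in for Index."""


class Categorical(PandasObject):
    """Stand-in for Categorical."""


class GroupBy(PandasObject):
    """Stand-in for GroupBy."""


class SparseList(PandasObject):
    """Stand-in for SparseList."""


class SparseArray(PandasObject):
    """Stand-in for SparseArray."""


# Every listed class picks up the string methods through PandasObject.
for cls in (PandasContainer, Index, Categorical, GroupBy, SparseList, SparseArray):
    assert issubclass(cls, StringMixin)
```

In this picture `Index` shares the string methods without becoming a `PandasContainer`.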