CLN: Refactor string special methods #4092

Merged
merged 5 commits into from Jul 2, 2013

4 participants
Contributor

jtratner commented Jul 1, 2013

closes #4090, #3231

  • The internal pandas class hierarchy has changed (slightly). The
    previous PandasObject now is called PandasContainer and a new
    PandasObject has become the baseclass for PandasContainer as well
    as Index, Categorical, GroupBy, SparseList, and
    SparseArray (+ their base classes). Currently, PandasObject
    provides string methods (from StringMixin).
  • New StringMixin that, given a __unicode__ method, gets python 2 and
    python 3 compatible string methods (__str__, __bytes__, and
    __repr__). Now employed in many places throughout the pandas library.
  • Tried to improve safe unicode handling (e.g., where user/external input is
    used in the string or repr methods of a library).
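
For readers skimming the thread, the idea behind the mixin can be sketched roughly as below. This is an illustration only, not the code in the PR: the class name `StringMixinSketch`, the `ENCODING` attribute, and the `Point` example are made up for this note, and the real implementation may choose the encoding differently.

```python
import sys

PY3 = sys.version_info[0] >= 3


class StringMixinSketch(object):
    """Illustrative mixin: subclasses only need to define __unicode__."""

    # Assumption for this sketch; the real code may pick the encoding elsewhere.
    ENCODING = "utf-8"

    def __unicode__(self):
        raise NotImplementedError

    def __str__(self):
        # Text on Python 3, encoded bytes on Python 2.
        if PY3:
            return self.__unicode__()
        return self.__bytes__()

    def __bytes__(self):
        return self.__unicode__().encode(self.ENCODING, "replace")

    def __repr__(self):
        return str(self)


class Point(StringMixinSketch):
    """Hypothetical subclass used only to exercise the mixin."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __unicode__(self):
        return u"Point(%s, %s)" % (self.x, self.y)
```

With this shape, `str(Point(1, 2))`, `repr(Point(1, 2))`, and `bytes(Point(1, 2))` all derive from the single `__unicode__` definition.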

I'm hoping to use this for the new objects in the MultiIndex naming PR, so
hopefully it's okay to merge for v0.12.

Contributor

jtratner commented Jul 1, 2013

Can squash this into fewer commits if you want ... just like to lay out the changes clearly for people to review.

Member

cpcloud commented Jul 1, 2013

are there any major api changes here?

@cpcloud cpcloud and 1 other commented on an outdated diff Jul 1, 2013

doc/source/release.rst
@@ -175,6 +175,16 @@ pandas 0.12
``bs4`` + ``html5lib`` when lxml fails to parse. a list of parsers to try
until success is also valid
- more consistency in the to_datetime return types (give string/array of string inputs) (:issue:`3888`)
+ - The internal ``pandas`` class hierarchy has changed (slightly). The
+ previous ``PandasObject`` now is called ``PandasContainer`` and a new
+ ``PandasObject`` has become the baseclass for ``PandasContainer`` as well
+ as ``Index``, ``Categorical``, ``GroupBy``, ``SparseList``, and
+ ``SparseArray`` (+ their base classes). Currently, ``PandasObject``
+ provides string methods (from ``StringMixin``). (:issue:`4090`)
+ - New ``StringMixin`` that, given a ``__unicode__`` method, gets python 2 and
+ python 3 compatible string methods (``__str__``, ``__bytes__``, and
+ ``__repr__``). Plus string safety throughout. Now employed in many places
+ throughout the pandas library. (:issue:`4090`)
**Experimental Feautres**
@cpcloud

cpcloud Jul 1, 2013

Member

can u correct this? :)

@jtratner

jtratner Jul 1, 2013

Contributor

Correct what?

@jtratner

jtratner Jul 1, 2013

Contributor

you mean move it out of API changes?

@cpcloud

cpcloud Jul 1, 2013

Member

no there's a typo, Feautres -> Features

Member

cpcloud commented Jul 1, 2013

ahem user facing api changes is what i meant :)

@cpcloud cpcloud commented on the diff Jul 1, 2013

pandas/core/base.py
+        Yields Bytestring in Py2, Unicode String in py3.
+        """
+        return str(self)
+
+class PandasObject(StringMixin):
+    """baseclass for various pandas objects"""
+
+    def __unicode__(self):
+        """
+        Return a string representation for a particular object.
+
+        Invoked by unicode(obj) in py2 only. Yields a Unicode String in both
+        py2/py3.
+        """
+        # Should be overwritten by base classes
+        return object.__repr__(self)
@cpcloud

cpcloud Jul 1, 2013

Member

should this be super(PandasObject, self).__repr__()?

@jtratner

jtratner Jul 1, 2013

Contributor

No. That would be an infinite loop. (because StringMixin calls __str__, etc.) That's the only awkward part of this setup...but for most PandasObjects, it works out okay.

@cpcloud

cpcloud Jul 1, 2013

Member

oh whoops blah sorry
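
A small sketch of why the `super()` call would loop, assuming a mixin shaped like the one described in this PR (`__repr__` delegates to `str(self)`, which delegates to `__unicode__`). The names `Mixin`, `Loops`, `Works`, and `recurses` are hypothetical and exist only for this illustration.

```python
class Mixin(object):
    """Stand-in for a mixin whose __repr__ and __str__ route to __unicode__."""

    def __str__(self):
        return self.__unicode__()

    def __repr__(self):
        return str(self)


class Loops(Mixin):
    def __unicode__(self):
        # super() resolves to Mixin.__repr__, which calls str(self), which
        # calls __unicode__ again, so the cycle never ends on its own.
        return super(Loops, self).__repr__()


class Works(Mixin):
    def __unicode__(self):
        # Bypasses the mixin and uses the default object repr directly.
        return object.__repr__(self)


def recurses(obj):
    """Return True if repr(obj) blows the recursion limit."""
    try:
        repr(obj)
    except RecursionError:  # available on Python 3.5+
        return True
    return False
```

Here `recurses(Loops())` is true while `recurses(Works())` is false, which matches the explanation above for calling `object.__repr__(self)` directly.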

@jtratner jtratner commented on the diff Jul 1, 2013

doc/source/v0.12.0.txt
@@ -8,13 +8,13 @@ enhancements along with a large number of bug fixes.
Highlites include a consistent I/O API naming scheme, routines to read html,
write multi-indexes to csv files, read & write STATA data files, read & write JSON format
-files, Python 3 support for ``HDFStore``, filtering of groupby expressions via ``filter``, and a
+files, Python 3 support for ``HDFStore``, filtering of groupby expressions via ``filter``, and a
@jtratner

jtratner Jul 1, 2013

Contributor

@cpcloud is it worth it to fix the trailing whitespace? Easy to remove the fix, but it was bothering me a little.

@cpcloud

cpcloud Jul 1, 2013

Member

sure why not? i suppose it makes it more difficult to review...but for docs i don't think it's a big deal

@cpcloud cpcloud commented on the diff Jul 1, 2013

pandas/core/groupby.py
@@ -201,6 +202,10 @@ def __init__(self, obj, keys=None, axis=0, level=None,
     def __len__(self):
         return len(self.indices)
+    def __unicode__(self):
+        # TODO: Better unicode/repr for GroupBy object
@cpcloud

cpcloud Jul 1, 2013

Member

would love this, maybe an issue?

@jtratner

jtratner Jul 1, 2013

Contributor

What would you want it to show? It calculates values on the fly, right? So probably just the keys?

@cpcloud

cpcloud Jul 1, 2013

Member

yeah gb is lazy so yeah keys i guess. just something more informative than the object repr
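
One possible shape for such a repr, as a rough sketch only: `LazyGroupingSketch` is a hypothetical stand-in and not the pandas `GroupBy` class, and the output format is just an assumption about what "more informative" could look like.

```python
class LazyGroupingSketch(object):
    """Hypothetical stand-in for a lazy group-by object (not pandas code)."""

    def __init__(self, keys):
        self.keys = list(keys)

    def __repr__(self):
        # Show only the grouping keys, so no aggregation is triggered.
        return "<%s keys=%r>" % (type(self).__name__, self.keys)
```

Because the repr touches only the stored keys, printing the object stays cheap even when the grouped computation is expensive.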

Contributor

jtratner commented Jul 1, 2013

This doesn't change any user-facing API at all. It does change the default NDFrame __unicode__/__repr__ (which is overridden on all the standard pandas objects anyways), but aside from that doesn't change anything.

Contributor

jreback commented Jul 1, 2013

this would close #3231; the only remaining question is: do we need a PandasScalar?

Contributor

jtratner commented Jul 1, 2013

I didn't put this into Period or Timestamp - should it go there too?
(wasn't sure if it would be overkill...)


Contributor

jreback commented Jul 1, 2013

would it make string methods in Period-Timestamp ?

Member

cpcloud commented Jul 1, 2013

what purpose would PandasScalar serve? unifying the repring API of the various scalar types? might be useful if there are more scalar types planned, e.g., Timedelta

Contributor

jtratner commented Jul 1, 2013

Side note: added PandasObject to Block and BlockManager in internals and to Period too.

Member

cpcloud commented Jul 1, 2013

my question is are there ops on scalars that are useful outside of their role in arrays and repring themselves, personally i almost never use the scalar versions for anything except inspection

Member

cpcloud commented Jul 1, 2013

also is there a problem with moving _constructor into PandasObject?

Contributor

jtratner commented Jul 1, 2013

No. Would that just be return self.__class__ unless overridden?


Member

cpcloud commented Jul 1, 2013

yep
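
The default being agreed on here could look roughly like this. It is a sketch under assumptions: `BaseSketch`, `Plain`, `Downcasting`, and the `head` method are hypothetical names for illustration, not the classes touched by this PR.

```python
class BaseSketch(object):
    """Hypothetical base class with a default _constructor."""

    def __init__(self, values):
        self.values = list(values)

    @property
    def _constructor(self):
        # Default: build results with the instance's own class.
        return self.__class__

    def head(self, n):
        # Methods that return new objects go through _constructor.
        return self._constructor(self.values[:n])


class Plain(BaseSketch):
    pass


class Downcasting(BaseSketch):
    @property
    def _constructor(self):
        # Override: results come back as a different class.
        return Plain
```

Subclasses that are happy with the default inherit it untouched, and only the ones that need a different result type override the property.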

Contributor

jreback commented Jul 1, 2013

only reason to have PandasScalar as base class of Timestamp, Period is to distinguish these in _is_list_like; so you can do isinstance(obj, PandasScalar), but maybe a little thin... so prob hold off on this

Member

cpcloud commented Jul 1, 2013

if just need the instance check then can just fake with

def is_pd_scalar(obj):
    return isinstance(obj, (Timestamp, Period))

adding to the instance check if (when?) more scalar types are added.

Contributor

jreback commented Jul 1, 2013

i think ok for 0.12

Member

cpcloud commented Jul 1, 2013

👍 for 0.12 here as well

Contributor

jreback commented Jul 1, 2013

@jtratner maybe squash it down a bit?

Contributor

jtratner commented Jul 1, 2013

If it's list like, use PandasContainer? Think that's already the case.

Contributor

jreback commented Jul 1, 2013

retract need for PandasScalar... no need really (and if there is a legit need later, can always add)


Contributor

jtratner commented Jul 1, 2013

Yeah, I'm going to squash it down...just wanted to show steps while
considering I guess.

As an aside, it might make more sense to just create an abstract base class
(e.g. something that checks that the object descends from PandasObject and
can't iterate)
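
A rough sketch of what such an abstract base class might look like, using `__subclasshook__` from the standard `abc` machinery. Everything here is an assumption for illustration: `PandasObject` is a bare stand-in, `PandasScalarABC`, `PeriodLike`, and `IndexLike` are hypothetical names, and the "can't iterate" check only looks for `__iter__` (objects that iterate via `__getitem__` alone would slip through).

```python
from abc import ABCMeta


class PandasObject(object):
    """Bare stand-in for the base class discussed in this thread."""


class PandasScalarABC(metaclass=ABCMeta):
    """Hypothetical ABC: PandasObject descendants that define no __iter__."""

    @classmethod
    def __subclasshook__(cls, other):
        if cls is not PandasScalarABC:
            return NotImplemented
        descends = issubclass(other, PandasObject)
        # Look for __iter__ anywhere in the class hierarchy.
        iterable = any("__iter__" in vars(base) for base in other.__mro__)
        return descends and not iterable


class PeriodLike(PandasObject):
    """Scalar-like: no iteration support."""


class IndexLike(PandasObject):
    """Container-like: supports iteration."""

    def __iter__(self):
        return iter(())
```

With this in place, `isinstance(PeriodLike(), PandasScalarABC)` is true without `PeriodLike` inheriting from the ABC, while `IndexLike()` and plain Python scalars such as `3` are rejected.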

Member

cpcloud commented Jul 1, 2013

FWIW i like that you don't squash for a while, while you're building, i'm starting to do that. i think it helps show the thought process. plenty of examples too, i think the series-as-ndframe pr has a ton of commits

Contributor

jtratner commented Jul 1, 2013

Plus, if you squash, it keeps the text of all the commits which is nice. For me, I like to make sure every commit passes its tests and distinct changes should end up in different commits to make it easier to localize issues if they occur. (e.g. via git bisect)

Contributor

jreback commented Jul 1, 2013

looks ok to merge into 0.12 -

any objections
@wesm, @y-p

Contributor

jtratner commented Jul 1, 2013

@cpcloud - I added the constructor argument to the baseclass and removed a
number of instances where it was erroneous.


jtratner added some commits Jul 1, 2013

@jtratner jtratner CLN: Refactor string methods and add PandasObject
Previous PandasObject becomes PandasContainer. New PandasObject becomes
baseclass for more elements (like Index, Categorical, etc.), moves
string methods to baseclass and subclassing objects need only define
`__unicode__` methods to get all string methods for free (and Py2/3
compatible).

CLN: Cleanup extraneous str methods from Panel

CLN: Remove unnecessary string methods from frame

CLN: Change name of TestPandasObjects --> TestPandasContainer
411d13f
@jtratner jtratner CLN: Make more core objects inherit PandasObject
CLN: Make Categorical inherit from PandasObject

CLN: Make GroupBy inherit from PandasObject

CLN/ENH: Make Sparse* into PandasObjects

Plus get all the string methods working...

CLN: Index now a PandasObject + str method cleanup

CLN: Make tseries/index fit with PandasObject.

CLN: Use PandasObject in internals + cleanup

CLN: Make Period into a PandasObject + cleanup

CLN: Remove extraneous __repr__ from io/excel
0cf93aa
@jtratner jtratner CLN: Have PyTables, stats, & Stata use StringMixin
CLN: Make PyTables unicode safe + add StringMixin

CLN: Make StataMissingValue use StringMixin

ENH: Use StringMixin for addl string methods in stats
7222e5a
@jtratner jtratner DOC: New class hierarchy + StringMixin
DOC/CLN: Remove extra whitespace from v0.12.0.txt

DOC: Add PR issue number too

DOC: Fix spelling error
8468b13
@jtratner jtratner CLN: Move _constructor checks to PandasObject base a558314
Owner

wesm commented Jul 2, 2013

just confirming there's no user-facing API changes and have you run the perf regression test?

Contributor

jtratner commented Jul 2, 2013

@wesm there's definitely no user-facing API changes. I ran test_perf and it looked similar, is there a different "perf regression test"?

Owner

wesm commented Jul 2, 2013

nope that would be it. bombs away

Contributor

jtratner commented Jul 2, 2013


Invoked with :
--ncalls: 3
--repeats: 3


-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_constructor_ndarray                    |   0.0540 |   0.0629 |   0.8586 |
series_constructor_ndarray                   |   0.0160 |   0.0173 |   0.9220 |
timeseries_slice_minutely                    |   0.0653 |   0.0707 |   0.9246 |
groupby_frame_apply_overhead                 |  13.3937 |  14.3940 |   0.9305 |
groupbym_frame_apply                         |  66.8457 |  71.7251 |   0.9320 |
ctor_index_array_string                      |   0.0637 |   0.0683 |   0.9325 |
groupby_frame_singlekey_integer              |   2.9016 |   3.1103 |   0.9329 |
groupby_frame_cython_many_columns            |   3.5630 |   3.7840 |   0.9416 |
sparse_frame_constructor                     |   7.0070 |   7.4353 |   0.9424 |
query_store_table                            |   6.6509 |   7.0089 |   0.9489 |
timeseries_add_irregular                     |  23.4577 |  24.7170 |   0.9491 |
indexing_dataframe_boolean_no_ne             |  85.8157 |  90.0673 |   0.9528 |
frame_iteritems                              |   2.6190 |   2.7456 |   0.9539 |
frame_xs_row                                 |   0.0433 |   0.0454 |   0.9545 |
timeseries_large_lookup_value                |   0.0287 |   0.0300 |   0.9550 |
read_table_multiple_date                     | 279.6036 | 291.7473 |   0.9584 |
dti_reset_index_tz                           |  17.8350 |  18.6047 |   0.9586 |
read_table_multiple_date_baseline            | 130.9160 | 136.5434 |   0.9588 |
frame_drop_dup_inplace                       |   3.5776 |   3.7280 |   0.9597 |
merge_2intkey_sort                           |  51.1840 |  53.2420 |   0.9613 |
frame_xs_col                                 |   0.0317 |   0.0330 |   0.9614 |
timeseries_asof_single                       |   0.0420 |   0.0436 |   0.9636 |
query_store_table_wide                       |  15.8637 |  16.4570 |   0.9639 |
reshape_stack_simple                         |   1.5557 |   1.6137 |   0.9640 |
frame_to_csv_mixed                           | 282.5810 | 292.7710 |   0.9652 |
reindex_fillna_pad_float32                   |   0.0827 |   0.0857 |   0.9657 |
groupby_frame_median                         |   8.6180 |   8.9057 |   0.9677 |
frame_iloc_dups                              |   0.2497 |   0.2580 |   0.9680 |
concat_series_axis1                          |  75.8933 |  78.4010 |   0.9680 |
append_frame_single_mixed                    |   1.0621 |   1.0967 |   0.9684 |
concat_small_frames                          |  19.0393 |  19.6483 |   0.9690 |
write_store_table_mixed                      | 229.4963 | 236.8270 |   0.9690 |
groupby_multi_python                         | 125.1554 | 129.0573 |   0.9698 |
groupby_apply_dict_return                    |  59.1354 |  60.9446 |   0.9703 |
frame_get_numeric_data                       |   0.1006 |   0.1036 |   0.9709 |
write_store_table_panel                      | 141.1357 | 145.1124 |   0.9726 |
frame_ctor_nested_dict_int64                 | 121.8780 | 125.2630 |   0.9730 |
indexing_panel_subset                        |   0.6837 |   0.7026 |   0.9731 |
melt_dataframe                               |   2.0859 |   2.1434 |   0.9732 |
unstack_sparse_keyspace                      |   1.8423 |   1.8926 |   0.9734 |
frame_iteritems_cached                       |   0.0884 |   0.0907 |   0.9746 |
reindex_fillna_backfill_float32              |   0.0893 |   0.0916 |   0.9748 |
dti_reset_index                              |   0.2216 |   0.2271 |   0.9762 |
timeseries_sort_index                        |  24.4863 |  25.0766 |   0.9765 |
join_dataframe_integer_key                   |   1.9927 |   2.0394 |   0.9771 |
frame_insert_500_columns                     | 131.8157 | 134.8596 |   0.9774 |
groupby_sum_booleans                         |   1.1047 |   1.1300 |   0.9776 |
frame_reindex_both_axes_ix                   |   0.4540 |   0.4640 |   0.9786 |
write_store_table                            |  96.8526 |  98.9670 |   0.9786 |
panel_from_dict_same_index                   |  33.4070 |  34.1267 |   0.9789 |
reindex_fillna_pad                           |   0.0897 |   0.0916 |   0.9792 |
timeseries_timestamp_tzinfo_cons             |   0.0190 |   0.0194 |   0.9795 |
groupby_multi_different_functions            |  18.9217 |  19.3143 |   0.9797 |
series_align_irregular_string                |  91.0884 |  92.9247 |   0.9802 |
join_dataframe_index_single_key_small        |   8.5344 |   8.7027 |   0.9807 |
groupby_series_simple_cython                 |   7.4213 |   7.5617 |   0.9814 |
sparse_series_to_frame                       | 159.4783 | 162.4194 |   0.9819 |
groupby_simple_compress_timing               |  39.7201 |  40.4267 |   0.9825 |
reindex_fillna_backfill                      |   0.0954 |   0.0970 |   0.9828 |
frame_insert_100_columns_begin               |  26.9523 |  27.4216 |   0.9829 |
join_dataframe_integer_2key                  |   6.7627 |   6.8777 |   0.9833 |
panel_from_dict_equiv_indexes                |  33.6480 |  34.2067 |   0.9837 |
timeseries_1min_5min_ohlc                    |   0.8053 |   0.8186 |   0.9837 |
write_store                                  |   7.7853 |   7.9123 |   0.9839 |
match_strings                                |   0.4560 |   0.4633 |   0.9842 |
series_ctor_from_dict                        |   3.4590 |   3.5137 |   0.9844 |
period_setitem                               | 211.7360 | 215.0184 |   0.9847 |
reindex_daterange_pad                        |   0.2046 |   0.2077 |   0.9855 |
append_frame_single_homogenous               |   0.2707 |   0.2747 |   0.9855 |
reindex_daterange_backfill                   |   0.2060 |   0.2090 |   0.9856 |
datetimeindex_normalize                      |   3.8527 |   3.9083 |   0.9858 |
stat_ops_level_frame_sum                     |   3.4913 |   3.5400 |   0.9863 |
write_store_mixed                            |  20.1573 |  20.4153 |   0.9874 |
mask_bools                                   |  18.2454 |  18.4613 |   0.9883 |
frame_repr_tall                              |   2.4837 |   2.5130 |   0.9883 |
panel_from_dict_two_different_indexes        |  57.2840 |  57.9487 |   0.9885 |
indexing_dataframe_boolean_rows              |   0.2880 |   0.2913 |   0.9885 |
frame_multi_and                              |  37.4480 |  37.8576 |   0.9892 |
timeseries_1min_5min_mean                    |   0.7963 |   0.8050 |   0.9892 |
frame_reindex_both_axes                      |   0.3664 |   0.3703 |   0.9893 |
read_store_table_panel                       |  39.2367 |  39.6434 |   0.9897 |
frame_sort_index_by_columns                  |  51.4247 |  51.9369 |   0.9901 |
frame_to_string_floats                       |  58.5093 |  59.0810 |   0.9903 |
frame_ctor_list_of_dict                      |  87.5090 |  88.3610 |   0.9904 |
indexing_dataframe_boolean_rows_object       |   0.5854 |   0.5906 |   0.9911 |
read_store                                   |   2.4203 |   2.4416 |   0.9913 |
frame_fillna_inplace                         |  12.3087 |  12.4167 |   0.9913 |
series_string_vector_slice                   | 265.6504 | 267.9363 |   0.9915 |
frame_repr_wide                              |   1.1756 |   1.1857 |   0.9916 |
frame_get_dtype_counts                       |   0.1160 |   0.1170 |   0.9918 |
stats_rank2d_axis0_average                   |  31.7893 |  32.0413 |   0.9921 |
replace_fillna                               |   2.1260 |   2.1423 |   0.9924 |
read_parse_dates_iso8601                     |   1.7130 |   1.7260 |   0.9925 |
lib_fast_zip                                 |  12.9871 |  13.0843 |   0.9926 |
stats_rank_average_int                       |  26.5594 |  26.7567 |   0.9926 |
groupby_transform                            | 173.6223 | 174.8737 |   0.9928 |
frame_ctor_nested_dict                       |  90.2743 |  90.9127 |   0.9930 |
index_datetime_intersection                  |  14.6950 |  14.7897 |   0.9936 |
groupby_multi_different_numpy_functions      |  17.4886 |  17.5977 |   0.9938 |
datetimeindex_add_offset                     |   0.2177 |   0.2190 |   0.9938 |
stats_rank2d_axis1_average                   |  21.4036 |  21.5360 |   0.9939 |
series_value_counts_int64                    |   2.7160 |   2.7327 |   0.9939 |
stats_corr_spearman                          | 104.6833 | 105.2810 |   0.9943 |
write_store_table_wide                       | 162.8940 | 163.8193 |   0.9944 |
index_datetime_union                         |  14.8407 |  14.9223 |   0.9945 |
indexing_dataframe_boolean                   |   7.7270 |   7.7680 |   0.9947 |
frame_loc_dups                               |   1.1547 |   1.1603 |   0.9951 |
read_csv_comment2                            |  30.6133 |  30.7567 |   0.9953 |
read_csv_thou_vb                             |  40.9724 |  41.1607 |   0.9954 |
read_csv_vb                                  |  28.4580 |  28.5730 |   0.9960 |
stat_ops_level_series_sum_multiple           |   9.4200 |   9.4560 |   0.9962 |
mask_floats                                  |   5.5910 |   5.6113 |   0.9964 |
frame_mult                                   |   7.3610 |   7.3800 |   0.9974 |
frame_drop_duplicates_na                     |  22.7741 |  22.8264 |   0.9977 |
indexing_dataframe_boolean_st                |  10.9363 |  10.9614 |   0.9977 |
stat_ops_level_frame_sum_multiple            |  10.9456 |  10.9647 |   0.9983 |
stats_rolling_mean                           |   1.3836 |   1.3857 |   0.9985 |
reindex_multiindex                           |   1.5523 |   1.5543 |   0.9987 |
frame_reindex_axis1                          |   4.3393 |   4.3446 |   0.9988 |
merge_2intkey_nosort                         |  22.0267 |  22.0497 |   0.9990 |
timeseries_period_downsample_mean            |  13.1647 |  13.1753 |   0.9992 |
frame_fillna_many_columns_pad                |  15.4930 |  15.5030 |   0.9994 |
lib_fast_zip_fillna                          |  18.4803 |  18.4876 |   0.9996 |
frame_boolean_row_select                     |   0.2367 |   0.2367 |   1.0000 |
series_drop_duplicates_int                   |   0.8540 |   0.8536 |   1.0005 |
index_int64_intersection                     |  27.7560 |  27.7333 |   1.0008 |
timeseries_timestamp_downsample_mean         |   8.9353 |   8.9267 |   1.0010 |
reindex_frame_level_align                    |   1.4460 |   1.4443 |   1.0012 |
frame_mult_no_ne                             |   8.2893 |   8.2797 |   1.0012 |
series_align_left_monotonic                  |  20.2010 |  20.1760 |   1.0012 |
read_store_mixed                             |   5.6214 |   5.6120 |   1.0017 |
replace_replacena                            |   2.2024 |   2.1986 |   1.0017 |
write_csv_standard                           |  60.8846 |  60.7440 |   1.0023 |
sort_level_one                               |   6.9184 |   6.9010 |   1.0025 |
timeseries_infer_freq                        |   9.5960 |   9.5716 |   1.0025 |
join_dataframe_index_single_key_bigger_sort  |  19.8517 |  19.7976 |   1.0027 |
series_align_int64_index                     |  43.9860 |  43.8654 |   1.0028 |
reshape_unstack_simple                       |   3.9103 |   3.8990 |   1.0029 |
stat_ops_level_series_sum                    |   2.6906 |   2.6813 |   1.0035 |
series_value_counts_strings                  |   6.4830 |   6.4564 |   1.0041 |
frame_multi_and_st                           |  51.8977 |  51.6740 |   1.0043 |
series_drop_duplicates_string                |   0.6460 |   0.6430 |   1.0047 |
timeseries_asof_nan                          |  12.1260 |  12.0676 |   1.0048 |
join_dataframe_index_single_key_bigger       |   9.0393 |   8.9947 |   1.0050 |
timeseries_to_datetime_iso8601               |   6.8889 |   6.8447 |   1.0065 |
frame_to_csv                                 | 193.5613 | 192.2576 |   1.0068 |
groupby_multi_size                           |  36.8676 |  36.6083 |   1.0071 |
groupby_multi_cython                         |  23.3246 |  23.1566 |   1.0073 |
write_store_table_dc                         | 222.9393 | 221.2910 |   1.0074 |
frame_to_csv2                                | 161.1224 | 159.8310 |   1.0081 |
frame_add_no_ne                              |   7.9737 |   7.9063 |   1.0085 |
stat_ops_series_std                          |   0.3087 |   0.3060 |   1.0086 |
timeseries_asof                              |  12.2716 |  12.1510 |   1.0099 |
frame_drop_duplicates                        |  24.6320 |  24.3850 |   1.0101 |
join_dataframe_index_multi                   |  26.2380 |  25.9730 |   1.0102 |
read_store_table_mixed                       |   6.4773 |   6.4117 |   1.0102 |
panel_from_dict_all_different_indexes        |  80.5343 |  79.6703 |   1.0108 |
frame_add                                    |   7.2591 |   7.1804 |   1.0110 |
groupby_last                                 |   5.1003 |   5.0430 |   1.0114 |
datetimeindex_unique                         |   0.1477 |   0.1460 |   1.0114 |
read_csv_standard                            |  15.3503 |  15.1657 |   1.0122 |
groupby_first                                |   4.4200 |   4.3623 |   1.0132 |
groupby_pivot_table                          |  27.1097 |  26.7387 |   1.0139 |
groupby_last_float32                         |   4.9826 |   4.9140 |   1.0140 |
reindex_frame_level_reindex                  |   1.3996 |   1.3803 |   1.0140 |
reshape_pivot_time_series                    | 225.6830 | 222.5023 |   1.0143 |
read_store_table                             |   3.9197 |   3.8604 |   1.0154 |
frame_add_st                                 |   8.5664 |   8.4220 |   1.0171 |
groupby_first_float32                        |   4.4951 |   4.4113 |   1.0190 |
groupby_indices                              |  10.1364 |   9.9407 |   1.0197 |
sort_level_zero                              |   6.9944 |   6.8517 |   1.0208 |
frame_reindex_columns                        |   0.4106 |   0.4020 |   1.0216 |
frame_mult_st                                |   8.6043 |   8.3633 |   1.0288 |
groupby_multi_series_op                      |  21.5487 |  20.9064 |   1.0307 |
index_int64_union                            |  87.2873 |  84.6050 |   1.0317 |
stats_rank_average                           |  35.2510 |  34.1436 |   1.0324 |
frame_reindex_axis0                          |   1.1787 |   1.1410 |   1.0331 |
frame_reindex_upcast                         |  19.1567 |  18.4720 |   1.0371 |
dataframe_reindex                            |   0.4817 |   0.4636 |   1.0389 |
frame_drop_dup_na_inplace                    |   3.5870 |   3.4497 |   1.0398 |
frame_fancy_lookup_all                       |  25.4450 |  24.3870 |   1.0434 |
frame_fancy_lookup                           |   2.3506 |   2.2329 |   1.0527 |
read_store_table_wide                        |  20.6024 |  19.5236 |   1.0553 |
frame_multi_and_no_ne                        |  89.0367 |  83.9037 |   1.0612 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster than the baseline.
Seed used: 1234

Target [cdbd7d4] : CLN: Move _constructor checks to PandasObject base
Base   [cdb3b2c] : Merge pull request #4091 from cpcloud/fix-share-params-grouped-hist


test suite above @wesm - is that reasonably close?

Contributor

jreback commented Jul 2, 2013

looks good

@jtratner merging UNODIR (unless otherwise directed)

Contributor

jtratner commented Jul 2, 2013

@jreback go ahead - all good on my end

@jreback jreback added a commit that referenced this pull request Jul 2, 2013

@jreback jreback Merge pull request #4092 from jtratner/refactor_string_special_methods
CLN: Refactor string special methods
030f613

@jreback jreback merged commit 030f613 into pandas-dev:master Jul 2, 2013

jtratner deleted the jtratner:refactor_string_special_methods branch Sep 21, 2013
