BUG: Fixes GH9311 groupby on datetime64 #9345

Merged
merged 1 commit into from
Feb 14, 2015

chrisbyboston
datetime64 columns were changing at the nanosecond scale when
a groupby aggregator was applied.

closes #9311
closes #6620
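
A minimal sketch of how a nanosecond-scale change can arise. It assumes the drift comes from routing int64 nanosecond values through float64, which the description above does not state explicitly; `stamp_ns` and `roundtrip_through_float` are illustrative names, not pandas API.

```python
# Nanoseconds since the Unix epoch for 2015-01-01T00:00:00.000000001 UTC.
SECONDS_2015 = 1_420_070_400  # 2015-01-01T00:00:00Z
stamp_ns = SECONDS_2015 * 1_000_000_000 + 1


def roundtrip_through_float(ns: int) -> int:
    """Convert to a 64-bit float and back, as a float-based aggregation would."""
    return int(float(ns))


# float64 has a 53-bit significand, so integers of this magnitude
# cannot all be represented exactly; the trailing nanosecond is lost.
drift = roundtrip_through_float(stamp_ns) - stamp_ns
print(stamp_ns, roundtrip_through_float(stamp_ns), drift)
```

Keeping the values as int64 for the whole aggregation avoids this loss, which is presumably why the fix touches the dtype handling in `aggregate`.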

@@ -1504,7 +1504,11 @@ def aggregate(self, values, how, axis=0):
raise NotImplementedError
out_shape = (self.ngroups,) + values.shape[1:]

if is_numeric_dtype(values.dtype):
if is_datetime:
Member

What does this do to operations like groupby('foo').mean() for datetime columns? Currently these raise DataError: No numeric types to aggregate but perhaps they shouldn't...

Contributor

mean of datetime64 is a can of worms. It can be done, but completely separate from this.

Member

Agreed... wanted to make sure to understand if that was changing that behavior for groupby aggregations.

Contributor

yep, I think that should be caught before here though (iow there is a test for it now that asserts that it fails)

@shoyer
Member

shoyer commented Jan 25, 2015

This looks like some nice fixes but also looks likely under-tested -- you've changed a lot of functions and only added tests for one of them.

@@ -1487,7 +1487,7 @@ def wrapper(*args, **kwargs):
(how, dtype_str))
return func, dtype_str

- def aggregate(self, values, how, axis=0):
+ def aggregate(self, values, how, axis=0, is_datetime=False):

Contributor

you can't put a flag like this here, it's very odd. There are already introspection determinations in the grouper objects.

@jreback added the Bug, Dtype Conversions, and Groupby labels Jan 25, 2015
@jreback added this to the 0.16.0 milestone Jan 25, 2015
@shoyer
Member

shoyer commented Jan 25, 2015

I would probably scale this back to only change operations without any possibility of precision/overflow issues -- I think that's first/last/min/max/nth. The arithmetic mean/sum/var/std should probably wait until we define it even on datetime64 without grouping, as @jreback mentions above (though efforts there would certainly be appreciated).

Also, it is almost impossible to write too many tests :).
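
A rough sketch of the distinction drawn above, using plain Python integers to stand in for int64 nanosecond timestamps. The sample values and variable names are assumptions made for illustration; nothing here is pandas code.

```python
INT64_MAX = 2**63 - 1
BASE_NS = 1_420_070_400 * 1_000_000_000  # 2015-01-01T00:00:00Z in nanoseconds

# Ten timestamps one nanosecond apart.
stamps = [BASE_NS + i for i in range(10)]

# first/last/min/max only select an existing element, so the result
# is always one of the inputs and no precision can be lost.
picked_first = stamps[0]
picked_max = max(stamps)
selection_is_exact = picked_first in stamps and picked_max in stamps

# sum (and therefore mean) builds a new value. Python ints are unbounded,
# but the same total would not fit in an int64 accumulator.
total = sum(stamps)
sum_fits_in_int64 = total <= INT64_MAX

print(selection_is_exact, sum_fits_in_int64)
```

Selection-style reductions are safe by construction, while arithmetic reductions need a decision about overflow and rounding before they can be defined for datetime64.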

@chrisbyboston
Author

Awesome. Thanks for the feedback. I'll have updates, and more tests in here shortly.

@chrisbyboston
Author

@shoyer or @jreback - I'm very close to having the above fixes ready on this PR. I do need a bit of guidance on one thing though. Now that the Cython functions are using iNaT for nan values on the integer functions, they are returning the integer value (-9223372036854775808) back in the ndarray, as I would expect.

In order to get it working so that na functions will work (like fillna), I need to replace those values back with something that will be evaluated as na. Is there a helper function somewhere to do that?

@shoyer
Member

shoyer commented Jan 30, 2015

@iwschris iNaT -> NaT happens automatically when you cast back to datetime64. You can use values.view('datetime64[ns]') for that, assuming values is an array with int64 dtype. This won't work for integers -- you'll need to upcast those to float.
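
A small sketch of the view-based conversion described above, using NumPy directly. The array contents and the names `INAT`, `raw`, `as_dt`, and `as_float` are made up for illustration; only `ndarray.view`, `np.isnat`, and `np.iinfo` are real NumPy calls.

```python
import numpy as np

# The int64 sentinel quoted earlier in the thread (-9223372036854775808).
INAT = np.iinfo(np.int64).min

# Hypothetical output of an integer aggregation: one real value, one missing.
raw = np.array([1_420_070_400_000_000_001, INAT], dtype=np.int64)

# Reinterpreting the same bytes as datetime64[ns] turns the sentinel into NaT
# without touching the other values.
as_dt = raw.view("datetime64[ns]")
print(np.isnat(as_dt))

# For genuine integer data there is no NaT, so the missing slot has to be
# represented by upcasting to float and writing NaN.
as_float = raw.astype(np.float64)
as_float[raw == INAT] = np.nan
print(np.isnan(as_float))
```

The view costs nothing and is lossless, whereas the float upcast can round large integers, so it only makes sense for columns that were not datetimes to begin with.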

@jreback
Contributor

jreback commented Jan 30, 2015

@iwschris just do the view as @shoyer suggests
you know you have an int64 (or int32) so this will work
(only take a view if it's actually a datetime64/timedelta64 in the first place)
this should be done in python for consistency (we just normally return basic types to/from cython)

@chrisbyboston
Author

Awesome. Thanks!

@chrisbyboston
Author

@shoyer -- Pretty sure this is ready for review again. I was able to keep the safe casting turned on in internals.py and I validated that vbench is still looking pretty good.

Here's the output from vbench:


Invoked with :
--ncalls: 3
--repeats: 3


-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
index_from_series_ctor                       |   0.0147 |   0.0337 |   0.4363 |
series_constructor_ndarray                   |   0.0137 |   0.0277 |   0.4943 |
concat_empty_frames1                         |   0.7093 |   1.3227 |   0.5363 |
frame_get_numeric_data                       |   0.0956 |   0.1766 |   0.5414 |
join_dataframe_index_single_key_small        |   7.9147 |  13.9790 |   0.5662 |
groupby_ngroups_100_first                    |   0.2720 |   0.4776 |   0.5694 |
read_csv_infer_datetime_format_custom        |   7.8737 |  11.8280 |   0.6657 |
join_dataframe_index_single_key_bigger       |   8.9413 |  12.7583 |   0.7008 |
groupby_ngroups_100_last                     |   0.2457 |   0.3486 |   0.7046 |
groupby_ngroups_10000_median                 |   2.1223 |   2.9956 |   0.7085 |
frame_fancy_lookup_all                       |  10.6823 |  14.9640 |   0.7139 |
groupby_ngroups_100_max                      |   0.2611 |   0.3640 |   0.7172 |
groupby_ngroups_100_min                      |   0.2487 |   0.3421 |   0.7270 |
groupby_ngroups_100_count                    |   0.2720 |   0.3714 |   0.7325 |
frame_from_series                            |   0.1073 |   0.1454 |   0.7381 |
ctor_index_array_string                      |   0.0143 |   0.0193 |   0.7407 |
dataframe_resample_mean_numpy                |   2.4650 |   3.2423 |   0.7603 |
frame_iteritems_cached                       |   0.4021 |   0.5227 |   0.7692 |
index_float64_boolean_indexer                |   2.8100 |   3.6487 |   0.7701 |
frame_constructor_ndarray                    |   0.0670 |   0.0866 |   0.7734 |
reindex_frame_level_align                    |   0.5527 |   0.6983 |   0.7914 |
frame_count_level_axis0_multi                |  53.3690 |  66.9137 |   0.7976 |
timeseries_custom_bmonthbegin_incr_n         |   0.1923 |   0.2387 |   0.8059 |
frame_ctor_dtindex_BYearEndx1                |   1.1930 |   1.4800 |   0.8061 |
frame_reindex_axis0                          | 123.4897 | 152.9363 |   0.8075 |
frame_assign_timeseries_index                |   0.5610 |   0.6904 |   0.8126 |
read_csv_infer_datetime_format_iso8601       |   1.4773 |   1.8067 |   0.8177 |
groupby_frame_cython_many_columns            |   1.9461 |   2.3790 |   0.8180 |
frame_repr_wide                              |  10.0773 |  12.3091 |   0.8187 |
frame_ctor_dtindex_BDayx2                    |   0.9710 |   1.1800 |   0.8229 |
frame_nonunique_equal                        |   6.3940 |   7.7110 |   0.8292 |
frame_ctor_dtindex_QuarterBeginx1            |   0.9793 |   1.1793 |   0.8304 |
reindex_fillna_pad                           |   0.2143 |   0.2580 |   0.8306 |
frame_apply_ref_by_name                      |  10.7396 |  12.9287 |   0.8307 |
groupby_ngroups_10000_max                    |   1.5074 |   1.8050 |   0.8351 |
groupby_ngroups_10000_first                  |   1.5857 |   1.8973 |   0.8358 |
groupby_ngroups_100_cummax                   |  11.6933 |  13.9883 |   0.8359 |
index_datetime_intersection                  |   8.0390 |   9.6114 |   0.8364 |
frame_mask_floats                            |   4.5877 |   5.4599 |   0.8403 |
frame_drop_duplicates                        |  11.3926 |  13.4079 |   0.8497 |
groupby_ngroups_100_std                      |   0.3264 |   0.3827 |   0.8530 |
frame_reindex_columns                        |   0.2379 |   0.2789 |   0.8530 |
groupby_multi_size                           |  17.3423 |  20.3076 |   0.8540 |
series_ix_slice                              |   0.0416 |   0.0486 |   0.8562 |
datetimeindex_unique                         |   0.0723 |   0.0843 |   0.8577 |
groupby_first_float64                        |   2.2407 |   2.6090 |   0.8588 |
frame_ctor_dtindex_BQuarterEndx2             |   1.1080 |   1.2867 |   0.8611 |
reindex_frame_level_reindex                  |   0.5770 |   0.6690 |   0.8624 |
timeseries_custom_bmonthbegin_decr_n         |   0.2031 |   0.2350 |   0.8641 |
frame_ctor_dtindex_MonthEndx1                |   1.0313 |   1.1810 |   0.8732 |
groupby_series_nth_none                      |   1.0096 |   1.1547 |   0.8744 |
groupby_ngroups_100_rank                     |  11.8863 |  13.5717 |   0.8758 |
stat_ops_frame_sum_float_axis_1              |   3.5137 |   3.9994 |   0.8785 |
series_ctor_from_dict                        |   2.0247 |   2.3017 |   0.8796 |
frame_ctor_dtindex_MonthEndx2                |   1.0126 |   1.1484 |   0.8818 |
groupby_frame_singlekey_integer              |   1.3860 |   1.5713 |   0.8821 |
frame_ctor_dtindex_BYearBeginx1              |   1.1864 |   1.3447 |   0.8823 |
frame_drop_dup_na_inplace                    |   1.7404 |   1.9697 |   0.8836 |
groupby_frame_nth_any                        |   4.4223 |   4.9994 |   0.8846 |
frame_drop_dup_inplace                       |   1.8404 |   2.0734 |   0.8876 |
reindex_daterange_pad                        |   0.5534 |   0.6224 |   0.8892 |
groupby_ngroups_10000_var                    |   1.6543 |   1.8600 |   0.8894 |
timeseries_infer_freq                        |   6.6311 |   7.4290 |   0.8926 |
frame_interpolate_some_good                  |   0.9607 |   1.0746 |   0.8940 |
groupby_ngroups_100_cummin                   |  10.7533 |  12.0024 |   0.8959 |
read_csv_standard                            |   8.7403 |   9.7334 |   0.8980 |
timeseries_to_datetime_iso8601               |   3.1000 |   3.4483 |   0.8990 |
timeseries_to_datetime_YYYYMMDD              |   6.3020 |   6.9950 |   0.9009 |
sql_string_write_sqlalchemy                  |  54.9703 |  60.7810 |   0.9044 |
groupby_ngroups_10000_count                  |   1.5697 |   1.7330 |   0.9058 |
groupby_ngroups_10000_diff                   | 905.3814 | 999.3493 |   0.9060 |
frame_html_repr_trunc_mi                     |  29.4023 |  32.4394 |   0.9064 |
groupby_ngroups_10000_std                    |   1.8214 |   2.0063 |   0.9078 |
groupby_ngroups_10000_sem                    |   2.1150 |   2.3276 |   0.9087 |
frame_from_records_generator_nrows           |   0.6553 |   0.7210 |   0.9090 |
index_int64_union                            |  52.9044 |  58.1080 |   0.9104 |
multiindex_from_product                      |   8.3253 |   9.1114 |   0.9137 |
index_datetime_union                         |   7.9623 |   8.7016 |   0.9150 |
frame_ctor_dtindex_Weekx2                    |   0.7860 |   0.8587 |   0.9153 |
groupby_nth_object_any                       | 872.7303 | 951.7753 |   0.9169 |
groupby_ngroups_100_mean                     |   0.2930 |   0.3186 |   0.9197 |
packers_read_pack                            |  62.9237 |  68.2081 |   0.9225 |
groupby_frame_apply_overhead                 |   5.9593 |   6.4523 |   0.9236 |
timeseries_large_lookup_value                |   0.0126 |   0.0137 |   0.9244 |
packers_write_json_mixed_float_int_str       |  93.9660 | 101.5189 |   0.9256 |
frame_ctor_dtindex_BYearBeginx2              |   1.1083 |   1.1961 |   0.9266 |
eval_frame_add_python_one_thread             |  15.0924 |  16.2520 |   0.9286 |
frame_iloc_dups                              |   0.1764 |   0.1896 |   0.9300 |
groupby_ngroups_10000_sum                    |   1.8277 |   1.9616 |   0.9317 |
groupby_ngroups_10000_last                   |   1.7217 |   1.8463 |   0.9325 |
packers_read_stata                           |  36.7910 |  39.4524 |   0.9325 |
read_csv_precise_converter                   |   1.1957 |   1.2817 |   0.9328 |
timeseries_asof                              |   5.7844 |   6.1994 |   0.9331 |
series_value_counts_int64                    |   1.5830 |   1.6963 |   0.9332 |
frame_dtypes                                 |   0.0799 |   0.0857 |   0.9332 |
panel_shift_minor                            |   0.0700 |   0.0750 |   0.9333 |
groupby_dt_size                              |  18.6746 |  19.9850 |   0.9344 |
timeseries_custom_bday_cal_incr_neg_n        |   0.0196 |   0.0210 |   0.9356 |
index_str_boolean_indexer                    |   6.8731 |   7.3450 |   0.9357 |
groupby_first_float32                        |   2.4620 |   2.6309 |   0.9358 |
stat_ops_frame_sum_float_axis_0              |   3.4143 |   3.6473 |   0.9361 |
groupby_multi_different_functions            |   8.2520 |   8.8123 |   0.9364 |
timeseries_iter_datetimeindex_preexit        |   9.0630 |   9.6117 |   0.9429 |
frame_repr_tall                              |  17.1143 |  18.1334 |   0.9438 |
packers_read_json_date_index                 | 139.4784 | 147.7621 |   0.9439 |
dtype_infer_datetime64                       |   5.7657 |   6.0943 |   0.9461 |
read_csv_roundtrip_converter                 |   1.8900 |   1.9963 |   0.9468 |
groupby_ngroups_100_prod                     |   0.3593 |   0.3793 |   0.9472 |
lib_fast_zip_fillna                          |   8.9866 |   9.4833 |   0.9476 |
frame_reindex_both_axes_ix                   |  27.4524 |  28.9690 |   0.9476 |
sort_level_zero                              |   8.5460 |   9.0003 |   0.9495 |
groupby_ngroups_10000_cumprod                | 987.3126 | 1039.5793 |   0.9497 |
groupby_indices                              |   4.2933 |   4.5194 |   0.9500 |
frame_ctor_dtindex_MonthBeginx2              |   1.0253 |   1.0786 |   0.9506 |
series_getitem_slice                         |   0.0339 |   0.0357 |   0.9510 |
read_csv_skiprows                            |  11.6111 |  12.2014 |   0.9516 |
sql_float_write_fallback                     |  23.6900 |  24.8843 |   0.9520 |
sql_write_sqlalchemy                         | 121.5783 | 127.5000 |   0.9536 |
packers_read_json                            | 142.3674 | 149.2996 |   0.9536 |
sql_float_read_table_sqlalchemy              |  11.9420 |  12.5156 |   0.9542 |
frame_ctor_dtindex_Weekx1                    |   0.8196 |   0.8589 |   0.9542 |
groupby_ngroups_100_head                     |   0.5653 |   0.5923 |   0.9544 |
join_non_unique_equal                        |   0.3424 |   0.3587 |   0.9546 |
groupby_multi_different_numpy_functions      |   8.0523 |   8.4341 |   0.9547 |
merge_2intkey_sort                           |  24.8954 |  26.0690 |   0.9550 |
frame_ctor_dtindex_BMonthBeginx1             |   1.1203 |   1.1717 |   0.9562 |
groupby_transform_multi_key2                 |  33.9526 |  35.4800 |   0.9570 |
series_timestamp_compare                     |   7.5290 |   7.8657 |   0.9572 |
packers_write_stata_with_validation          |  31.8860 |  33.2757 |   0.9582 |
frame_ctor_dtindex_DateOffsetx2              |   0.8277 |   0.8633 |   0.9588 |
strings_count                                |   4.7774 |   4.9780 |   0.9597 |
sql_float_read_query_fallback                |   6.5816 |   6.8550 |   0.9601 |
frame_apply_pass_thru                        |   3.2713 |   3.4067 |   0.9603 |
index_float64_div                            |   2.0644 |   2.1497 |   0.9603 |
frame_multi_and_no_ne                        |  20.8013 |  21.6600 |   0.9604 |
index_int64_intersection                     |  10.9847 |  11.4144 |   0.9624 |
frame_ctor_dtindex_DateOffsetx1              |   0.8663 |   0.8997 |   0.9629 |
timeseries_iter_periodindex_preexit          |   9.4426 |   9.8027 |   0.9633 |
strings_len                                  |   1.3343 |   1.3837 |   0.9643 |
frame_to_csv                                 | 110.6810 | 114.6793 |   0.9651 |
strings_cat                                  |   0.5057 |   0.5236 |   0.9657 |
frame_count_level_axis0_mixed_dtypes_multi   | 107.9280 | 111.7450 |   0.9658 |
join_dataframe_index_multi                   |  12.2537 |  12.6839 |   0.9661 |
groupby_ngroups_10000_mad                    | 3435.5440 | 3555.5203 |   0.9663 |
sql_float_write_sqlalchemy                   |  55.9303 |  57.8830 |   0.9663 |
frame_ctor_dtindex_Easterx1                  |   0.9393 |   0.9716 |   0.9667 |
groupby_ngroups_100_any                      |   6.9167 |   7.1410 |   0.9686 |
stats_rank2d_axis1_average                   |   8.8743 |   9.1600 |   0.9688 |
groupby_ngroups_10000_describe               | 12356.2800 | 12750.7817 |   0.9691 |
groupby_multi_series_op                      |  10.1536 |  10.4600 |   0.9707 |
packers_write_json                           |  73.1703 |  75.3726 |   0.9708 |
series_getitem_array                         |   0.3147 |   0.3237 |   0.9723 |
frame_dropna_axis1_any_mixed_dtypes          | 164.5314 | 169.1933 |   0.9724 |
frame_dropna_axis0_any_mixed_dtypes          | 159.9306 | 164.4520 |   0.9725 |
concat_small_frames                          |  40.3993 |  41.5320 |   0.9727 |
timeseries_timestamp_downsample_mean         |   3.4340 |   3.5297 |   0.9729 |
frame_ctor_dtindex_BusinessDayx2             |   0.9197 |   0.9443 |   0.9739 |
groupby_transform_series2                    | 108.0357 | 110.8853 |   0.9743 |
groupby_ngroups_100_sum                      |   0.3707 |   0.3804 |   0.9745 |
frame_ctor_dtindex_YearEndx1                 |   0.8994 |   0.9213 |   0.9762 |
series_xs_mi_ix                              |   2.9663 |   3.0383 |   0.9763 |
packers_write_csv                            | 943.0904 | 965.2163 |   0.9771 |
timeseries_asof_single                       |   0.0173 |   0.0177 |   0.9776 |
frame_dropna_axis0_all_mixed_dtypes          | 192.6063 | 196.9613 |   0.9779 |
groupby_nth_datetimes_any                    | 889.3877 | 909.3717 |   0.9780 |
frame_ctor_dtindex_CDayx2                    |   0.8940 |   0.9136 |   0.9785 |
frame_interpolate                            |  64.3644 |  65.7176 |   0.9794 |
timeseries_iter_datetimeindex                | 539.4704 | 550.7627 |   0.9795 |
frame_to_string_floats                       |  22.1376 |  22.5883 |   0.9800 |
eval_frame_mult_python                       |  16.0577 |  16.3810 |   0.9803 |
frame_ctor_dtindex_CBMonthBeginx2            |   2.0204 |   2.0594 |   0.9811 |
panel_shift                                  |   0.0716 |   0.0730 |   0.9815 |
frame_ctor_dtindex_BQuarterEndx1             |   1.0660 |   1.0850 |   0.9826 |
reindex_daterange_backfill                   |   0.7203 |   0.7331 |   0.9827 |
packers_write_json_date_index                |  83.0213 |  84.4580 |   0.9830 |
timeseries_custom_bmonthend_decr_n           |   0.2303 |   0.2340 |   0.9840 |
frame_mask_bools                             |   6.4433 |   6.5400 |   0.9852 |
timeseries_add_irregular                     |   9.7550 |   9.8964 |   0.9857 |
frame_apply_lambda_mean                      |   4.3870 |   4.4500 |   0.9858 |
series_drop_duplicates_string                |   0.4007 |   0.4063 |   0.9861 |
frame_ctor_nested_dict                       |  56.1170 |  56.9040 |   0.9862 |
frame_ctor_dtindex_YearBeginx1               |   0.8856 |   0.8980 |   0.9863 |
groupby_pivot_table                          |  12.7386 |  12.9116 |   0.9866 |
frame_interpolate_some_good_infer            |   2.0733 |   2.1013 |   0.9866 |
frame_ctor_dtindex_YearEndx2                 |   0.9444 |   0.9566 |   0.9872 |
groupby_ngroups_100_all                      |   7.2843 |   7.3783 |   0.9873 |
groupby_ngroups_10000_cumsum                 | 987.8446 | 1000.3951 |   0.9875 |
groupby_ngroups_100_sem                      |   0.6073 |   0.6146 |   0.9881 |
groupby_transform_multi_key1                 |  49.5160 |  50.1057 |   0.9882 |
melt_dataframe                               |   1.4334 |   1.4487 |   0.9894 |
stat_ops_frame_sum_int_axis_1                |   3.4297 |   3.4643 |   0.9900 |
strings_contains_many                        |   4.6546 |   4.6970 |   0.9910 |
frame_mult                                   |   4.3540 |   4.3923 |   0.9913 |
groupby_frame_apply                          |  29.8660 |  30.1157 |   0.9917 |
panel_pct_change_major                       | 5035.1934 | 5074.3450 |   0.9923 |
read_csv_comment2                            |  20.2267 |  20.3773 |   0.9926 |
datetimeindex_add_offset                     |   0.1920 |   0.1934 |   0.9930 |
strings_endswith                             |   3.2713 |   3.2940 |   0.9931 |
frame_apply_np_mean                          |   4.2990 |   4.3247 |   0.9941 |
datetimeindex_normalize                      |   2.6603 |   2.6757 |   0.9942 |
frame_dropna_axis1_any                       |  19.4923 |  19.6040 |   0.9943 |
packers_write_json_mixed_delta_int_tstamp    | 102.4027 | 102.9747 |   0.9944 |
read_csv_vb                                  |  16.3047 |  16.3777 |   0.9955 |
groupby_ngroups_10000_cumcount               |  57.6493 |  57.8650 |   0.9963 |
frame_object_equal                           |   6.4127 |   6.4360 |   0.9964 |
frame_ctor_dtindex_Easterx2                  |   0.9423 |   0.9456 |   0.9965 |
series_ix_scalar                             |   0.0236 |   0.0237 |   0.9966 |
write_csv_standard                           |  38.6047 |  38.7180 |   0.9971 |
groupby_transform_multi_key4                 | 100.7117 | 100.9640 |   0.9975 |
groupby_ngroups_10000_size                   |   3.2973 |   3.3036 |   0.9981 |
groupby_nth_float32_none                     |  66.8260 |  66.9477 |   0.9982 |
timeseries_with_format_no_exact              | 601.0760 | 601.8333 |   0.9987 |
frame_fillna_many_columns_pad                |   3.2874 |   3.2890 |   0.9995 |
datetimeindex_infer_dst                      |   2.6283 |   2.6280 |   1.0001 |
panel_from_dict_same_index                   |  30.4280 |  30.4210 |   1.0002 |
groupby_ngroups_10000_value_counts           | 3861.8680 | 3859.5047 |   1.0006 |
timeseries_asof_nan                          |   5.5869 |   5.5803 |   1.0012 |
frame_multi_and                              |  22.1247 |  22.0914 |   1.0015 |
frame_add_st                                 |   4.3914 |   4.3824 |   1.0020 |
groupby_frame_median                         |   4.9524 |   4.9326 |   1.0040 |
frame_fancy_lookup                           |   2.5950 |   2.5843 |   1.0042 |
frame_ctor_dtindex_QuarterEndx1              |   1.0297 |   1.0243 |   1.0052 |
groupby_last_object                          |  12.3967 |  12.3320 |   1.0052 |
timeseries_is_month_start                    |   2.3900 |   2.3774 |   1.0053 |
series_align_left_monotonic                  |  12.2104 |  12.1450 |   1.0054 |
lib_fast_zip                                 |   6.3469 |   6.3126 |   1.0054 |
frame_ctor_dtindex_CBMonthEndx2              |   2.8620 |   2.8457 |   1.0057 |
groupby_ngroups_10000_pct_change             | 2949.9527 | 2930.2514 |   1.0067 |
frame_fillna_inplace                         |   7.3990 |   7.3454 |   1.0073 |
eval_frame_add_python                        |  14.7847 |  14.6750 |   1.0075 |
groupby_nth_object_none                      | 479.3561 | 475.7130 |   1.0077 |
timeseries_slice_minutely                    |   0.0400 |   0.0397 |   1.0080 |
frame_mult_st                                |   4.3757 |   4.3403 |   1.0081 |
groupby_multi_python                         |  64.3490 |  63.8003 |   1.0086 |
frame_reindex_both_axes                      |  27.2237 |  26.9873 |   1.0088 |
dataframe_reindex                            |   0.2473 |   0.2450 |   1.0094 |
groupby_nth_datetimes_none                   | 460.5076 | 456.1773 |   1.0095 |
frame_ctor_dtindex_QuarterEndx2              |   1.0264 |   1.0166 |   1.0096 |
frame_iloc_big                               |   0.1190 |   0.1176 |   1.0115 |
sql_datetime_write_sqlalchemy                | 105.9604 | 104.7047 |   1.0120 |
timestamp_ops_diff1                          |  12.7310 |  12.5770 |   1.0122 |
series_value_counts_strings                  |   3.2010 |   3.1614 |   1.0125 |
frame_ctor_dtindex_BMonthEndx2               |   0.9724 |   0.9593 |   1.0136 |
frame_shift_axis_1                           |  13.0233 |  12.8467 |   1.0137 |
frame_ctor_dtindex_BYearEndx2                |   1.1053 |   1.0897 |   1.0144 |
frame_dropna_axis0_any                       |  20.1437 |  19.8566 |   1.0145 |
frame_ctor_dtindex_CBMonthBeginx1            |   2.2947 |   2.2613 |   1.0148 |
reshape_stack_simple                         |   1.7076 |   1.6827 |   1.0148 |
reindex_fillna_pad_float32                   |   0.4029 |   0.3970 |   1.0150 |
append_frame_single_homogenous               |   0.8920 |   0.8783 |   1.0156 |
frame_add_no_ne                              |   4.4303 |   4.3620 |   1.0157 |
indexing_dataframe_boolean_rows_object       |   0.3884 |   0.3823 |   1.0158 |
groupby_multi_cython                         |  11.0527 |  10.8797 |   1.0159 |
frame_ctor_dtindex_CDayx1                    |   0.9046 |   0.8896 |   1.0169 |
groupby_ngroups_10000_rank                   | 1007.1193 | 989.4890 |   1.0178 |
groupby_nth_float64_none                     |  65.9187 |  64.7260 |   1.0184 |
frame_count_level_axis1_mixed_dtypes_multi   |  81.4800 |  79.9817 |   1.0187 |
frame_sort_index_by_columns                  |  31.9147 |  31.3116 |   1.0193 |
groupby_ngroups_10000_head                   |  65.1494 |  63.9053 |   1.0195 |
frame_to_csv_mixed                           | 503.7344 | 494.0030 |   1.0197 |
unstack_sparse_keyspace                      |   0.9720 |   0.9530 |   1.0199 |
read_csv_thou_vb                             |  14.9190 |  14.6270 |   1.0200 |
timeseries_with_format_replace               | 825.8780 | 809.6297 |   1.0201 |
frame_dropna_axis1_all_mixed_dtypes          | 201.0813 | 197.0267 |   1.0206 |
groupby_transform_multi_key3                 | 563.4913 | 551.8860 |   1.0210 |
timedelta_convert_int                        |   0.1133 |   0.1109 |   1.0215 |
match_strings                                |   0.2960 |   0.2897 |   1.0217 |
timeseries_iter_periodindex                  | 937.8570 | 917.8220 |   1.0218 |
join_dataframe_integer_key                   |   1.1926 |   1.1670 |   1.0220 |
stat_ops_series_std                          |   0.3860 |   0.3777 |   1.0221 |
frame_multi_and_st                           |  22.6813 |  22.1707 |   1.0230 |
frame_apply_axis_1                           |  58.1466 |  56.8313 |   1.0231 |
groupby_ngroups_100_diff                     |  11.0393 |  10.7853 |   1.0236 |
frame_ctor_dtindex_BQuarterBeginx1           |   1.1190 |   1.0923 |   1.0244 |
frame_ctor_dtindex_BQuarterBeginx2           |   1.1107 |   1.0840 |   1.0246 |
frame_float_equal                            |   1.3397 |   1.3063 |   1.0256 |
frame_from_records_generator                 |  54.8224 |  53.4543 |   1.0256 |
strings_contains_many_noregex                |   2.0823 |   2.0300 |   1.0258 |
strings_get                                  |   2.3750 |   2.3136 |   1.0266 |
replace_replacena                            |   0.4417 |   0.4303 |   1.0266 |
dtype_infer_timedelta64_2                    |   8.5163 |   8.2900 |   1.0273 |
frame_html_repr_trunc_si                     |  21.0226 |  20.4570 |   1.0276 |
dataframe_resample_mean_string               |   2.4560 |   2.3890 |   1.0280 |
groupby_ngroups_10000_unique                 | 511.7454 | 497.7160 |   1.0282 |
frame_shift_axis0                            |   9.3033 |   9.0457 |   1.0285 |
frame_dropna_axis0_all                       |  45.4573 |  44.1926 |   1.0286 |
groupby_ngroups_100_nunique                  |   7.7484 |   7.5307 |   1.0289 |
sql_string_write_fallback                    |  23.1417 |  22.4884 |   1.0291 |
dataframe_resample_max_numpy                 |   1.2786 |   1.2423 |   1.0292 |
panel_pct_change_items                       | 5977.8221 | 5807.0730 |   1.0294 |
frame_loc_dups                               |   0.6920 |   0.6714 |   1.0307 |
frame_ctor_dtindex_BMonthBeginx2             |   1.1263 |   1.0924 |   1.0311 |
read_csv_default_converter                   |   1.3087 |   1.2683 |   1.0318 |
read_table_multiple_date_baseline            |  62.8977 |  60.9293 |   1.0323 |
groupby_ngroups_100_value_counts             |  37.9097 |  36.7200 |   1.0324 |
strings_upper                                |   4.4840 |   4.3400 |   1.0332 |
eval_frame_and_python_one_thread             |  21.4583 |  20.7507 |   1.0341 |
frame_insert_100_columns_begin               |  25.8463 |  24.9857 |   1.0344 |
groupby_agg_builtins1                        |   7.4780 |   7.2234 |   1.0353 |
eval_frame_and_python                        |  20.7856 |  20.0644 |   1.0359 |
series_getitem_list_like                     |   0.1480 |   0.1427 |   1.0367 |
strings_strip                                |   3.4354 |   3.3133 |   1.0368 |
frame_ctor_dtindex_BMonthEndx1               |   0.9767 |   0.9417 |   1.0372 |
frame_mult_no_ne                             |   4.4643 |   4.3034 |   1.0374 |
groupby_ngroups_100_cumcount                 |   0.5494 |   0.5294 |   1.0378 |
frame_add                                    |   4.5600 |   4.3924 |   1.0382 |
series_string_vector_slice                   | 168.7703 | 162.5583 |   1.0382 |
groupby_ngroups_10000_cummin                 | 1077.7996 | 1038.0610 |   1.0383 |
groupby_ngroups_10000_all                    | 720.6297 | 694.0330 |   1.0383 |
strings_extract                              |  34.1103 |  32.8497 |   1.0384 |
concat_empty_frames2                         |   0.7330 |   0.7057 |   1.0386 |
frame_ctor_dtindex_CBMonthEndx1              |   2.9496 |   2.8390 |   1.0390 |
packers_write_json_T                         |  87.2324 |  83.9403 |   1.0392 |
groupby_simple_compress_timing               |  22.7056 |  21.8263 |   1.0403 |
frame_ctor_dtindex_YearBeginx2               |   0.8867 |   0.8500 |   1.0432 |
dtype_infer_uint32                           |   0.3046 |   0.2920 |   1.0433 |
groupby_frame_nth_none                       |   1.5750 |   1.5093 |   1.0435 |
strings_startswith                           |   3.4940 |   3.3460 |   1.0442 |
strings_encode_decode                        |   0.2256 |   0.2160 |   1.0445 |
sql_datetime_read_as_native_sqlalchemy       |  19.7659 |  18.9184 |   1.0448 |
groupby_ngroups_10000_mean                   |   1.6270 |   1.5570 |   1.0450 |
frame_dropna_axis1_all                       |  54.1590 |  51.7350 |   1.0469 |
groupby_dt_timegrouper_size                  |  14.8040 |  14.1377 |   1.0471 |
groupby_agg_builtins2                        |  33.8887 |  32.3486 |   1.0476 |
packers_write_sql                            | 1956.6897 | 1866.3857 |   1.0484 |
panel_pct_change_minor                       | 5288.4736 | 5042.6934 |   1.0487 |
frame_reindex_axis1                          | 143.6096 | 136.9224 |   1.0488 |
sort_level_one                               |   8.7850 |   8.3740 |   1.0491 |
timeseries_1min_5min_ohlc                    |   0.6716 |   0.6401 |   1.0493 |
groupby_ngroups_10000_nunique                | 704.0600 | 670.6223 |   1.0499 |
stats_rank2d_axis0_average                   |  18.2641 |  17.3707 |   1.0514 |
groupby_apply_dict_return                    |  25.1443 |  23.9117 |   1.0515 |
strings_lower                                |   4.9067 |   4.6660 |   1.0516 |
groupby_transform_series                     |  15.4867 |  14.7260 |   1.0517 |
series_ix_array                              |   0.5894 |   0.5604 |   1.0518 |
multiindex_with_datetime_level_sliced        |   0.1463 |   0.1390 |   1.0526 |
groupby_ngroups_10000_skew                   | 981.4417 | 930.7914 |   1.0544 |
groupby_ngroups_100_skew                     |   9.8927 |   9.3813 |   1.0545 |
frame_ctor_nested_dict_int64                 |  60.3494 |  57.1480 |   1.0560 |
frame_to_html_mixed                          | 165.5410 | 156.7531 |   1.0561 |
frame_ctor_list_of_dict                      |  59.4960 |  56.2257 |   1.0582 |
groupby_ngroups_100_mad                      |  35.4846 |  33.4873 |   1.0596 |
groupby_ngroups_10000_tail                   |  60.6120 |  57.1950 |   1.0597 |
groupby_transform                            | 119.0536 | 111.8713 |   1.0642 |
stats_rank_average                           |  25.0426 |  23.5287 |   1.0643 |
timestamp_ops_diff2                          |  16.7730 |  15.7553 |   1.0646 |
frame_ctor_dtindex_BusinessDayx1             |   0.9303 |   0.8733 |   1.0652 |
sql_write_fallback                           |  48.0767 |  45.0877 |   1.0663 |
timestamp_series_compare                     |   7.7377 |   7.2530 |   1.0668 |
indexing_dataframe_boolean_rows              |   0.2377 |   0.2227 |   1.0675 |
reshape_unstack_simple                       |   2.2411 |   2.0994 |   1.0675 |
strings_rstrip                               |   3.0786 |   2.8836 |   1.0676 |
dti_reset_index_tz                           |   4.7610 |   4.4587 |   1.0678 |
multiindex_with_datetime_level_full          |   9.3650 |   8.7677 |   1.0681 |
datetime_index_intersection                  |   0.2949 |   0.2760 |   1.0685 |
concat_series_axis1                          |  65.8151 |  61.4983 |   1.0702 |
series_getitem_pos_slice                     |   0.0397 |   0.0370 |   1.0708 |
dtype_infer_float32                          |   0.2507 |   0.2340 |   1.0713 |
stat_ops_frame_mean_int_axis_1               |   4.8420 |   4.5130 |   1.0729 |
stat_ops_frame_mean_float_axis_0             |   3.8706 |   3.6034 |   1.0742 |
timeseries_custom_bday_decr                  |   0.0240 |   0.0223 |   1.0747 |
series_iloc_list_like                        |   0.2007 |   0.1867 |   1.0749 |
frame_reindex_upcast                         |   5.9930 |   5.5740 |   1.0752 |
indexing_panel_subset                        |   0.6670 |   0.6204 |   1.0752 |
packers_read_pickle                          | 108.4936 | 100.8810 |   1.0755 |
groupby_ngroups_10000_any                    | 707.5600 | 657.6886 |   1.0758 |
dtype_infer_timedelta64_1                    |  46.3394 |  42.9693 |   1.0784 |
groupby_ngroups_10000_cummax                 | 1078.3180 | 998.9180 |   1.0795 |
groupby_ngroups_10000_prod                   |   1.8747 |   1.7360 |   1.0799 |
reindex_fillna_backfill                      |   0.2646 |   0.2450 |   1.0801 |
groupby_ngroups_100_tail                     |   0.6087 |   0.5634 |   1.0804 |
strings_contains_few                         |   4.5360 |   4.1974 |   1.0807 |
sql_read_query_fallback                      |  26.6887 |  24.6690 |   1.0819 |
groupby_series_nth_any                       |   3.1127 |   2.8744 |   1.0829 |
frame_ctor_dtindex_CustomBusinessDayx1       |   0.9580 |   0.8833 |   1.0845 |
strings_center                               |   3.7570 |   3.4637 |   1.0847 |
stat_ops_level_frame_sum                     |   2.1319 |   1.9654 |   1.0848 |
strings_lstrip                               |   3.3210 |   3.0596 |   1.0854 |
frame_get_dtype_counts                       |   0.0757 |   0.0697 |   1.0855 |
frame_ctor_dtindex_BDayx1                    |   0.9207 |   0.8463 |   1.0879 |
merge_2intkey_nosort                         |  13.6313 |  12.5194 |   1.0888 |
stats_rolling_mean                           |   0.8846 |   0.8110 |   1.0907 |
dataframe_resample_min_string                |   1.4046 |   1.2860 |   1.0922 |
groupby_ngroups_100_var                      |   0.3340 |   0.3057 |   1.0928 |
timeseries_sort_index                        |   7.1480 |   6.5390 |   1.0931 |
groupby_ngroups_100_pct_change               |  30.0250 |  27.4397 |   1.0942 |
packers_read_sql                             | 456.4653 | 417.0200 |   1.0946 |
groupby_ngroups_100_unique                   |   5.5910 |   5.1040 |   1.0954 |
replace_large_dict                           | 9793.7127 | 8937.8887 |   1.0958 |
timedelta_convert_string_seconds             | 103.9890 |  94.8300 |   1.0966 |
sql_read_table_sqlalchemy                    |  32.6030 |  29.7074 |   1.0975 |
read_table_multiple_date                     | 144.7484 | 131.8140 |   1.0981 |
frame_ctor_dtindex_QuarterBeginx2            |   1.0433 |   0.9487 |   1.0998 |
frame_ctor_dtindex_CustomBusinessDayx2       |   0.9689 |   0.8806 |   1.1003 |
stats_corr_spearman                          |  68.1484 |  61.8983 |   1.1010 |
packers_read_csv                             | 157.6234 | 143.1127 |   1.1014 |
panel_from_dict_two_different_indexes        |  62.8486 |  56.9550 |   1.1035 |
frame_getitem_single_column2                 |  19.1580 |  17.3460 |   1.1045 |
eval_frame_chained_cmp_python_one_thread     |  91.1884 |  82.3380 |   1.1075 |
frame_count_level_axis1_multi                |  86.7036 |  78.2163 |   1.1085 |
frame_apply_user_func                        |  70.7947 |  63.8500 |   1.1088 |
groupby_ngroups_100_size                     |   0.4110 |   0.3699 |   1.1111 |
timeseries_custom_bmonthend_incr_n           |   0.2093 |   0.1884 |   1.1114 |
stats_rank_pct_average                       |  29.2093 |  26.2606 |   1.1123 |
groupby_ngroups_10000_min                    |   1.9766 |   1.7763 |   1.1128 |
frame_ctor_dtindex_MonthBeginx1              |   1.0486 |   0.9410 |   1.1143 |
groupby_multi_count                          |   6.1553 |   5.5184 |   1.1154 |
groupby_ngroups_100_cumprod                  |  11.5967 |  10.3873 |   1.1164 |
groupby_series_simple_cython                 | 167.5251 | 149.9200 |   1.1174 |
sql_read_query_sqlalchemy                    |  32.5234 |  29.1040 |   1.1175 |
index_float64_mul                            |   1.8547 |   1.6540 |   1.1213 |
i8merge                                      | 880.4169 | 784.4897 |   1.1223 |
stat_ops_frame_sum_int_axis_0                |   3.5937 |   3.2016 |   1.1224 |
frame_drop_duplicates_na                     |  12.6983 |  11.3060 |   1.1232 |
timedelta_convert_string                     |  87.5190 |  77.7940 |   1.1250 |
series_loc_scalar                            |   0.0267 |   0.0237 |   1.1275 |
groupby_last_float64                         |   3.5220 |   3.1199 |   1.1289 |
series_drop_duplicates_int                   |   0.6540 |   0.5790 |   1.1294 |
series_ix_list_like                          |   0.1517 |   0.1343 |   1.1296 |
packers_write_stata                          |  20.9417 |  18.5220 |   1.1306 |
timeseries_custom_bday_cal_decr              |   0.0234 |   0.0207 |   1.1308 |
strings_replace                              |  11.4644 |  10.1280 |   1.1319 |
period_setitem                               |  13.2340 |  11.6623 |   1.1348 |
timeseries_custom_bday_incr                  |   0.0140 |   0.0123 |   1.1355 |
frame_to_csv2                                | 110.0953 |  96.9253 |   1.1359 |
groupby_ngroups_100_describe                 | 137.4177 | 120.8344 |   1.1372 |
packers_write_json_mixed_float_int           | 131.6490 | 115.6390 |   1.1384 |
strings_get_dummies                          |  66.3884 |  58.2893 |   1.1389 |
timeseries_custom_bmonthend_incr             |   0.1527 |   0.1340 |   1.1394 |
packers_read_stata_with_validation           |  55.2636 |  48.4761 |   1.1400 |
replace_fillna                               |   0.9890 |   0.8654 |   1.1429 |
dtype_infer_int32                            |   0.4090 |   0.3577 |   1.1433 |
frame_isnull                                 |   0.4657 |   0.4070 |   1.1443 |
join_dataframe_integer_2key                  |   3.3087 |   2.8860 |   1.1465 |
series_align_irregular_string                |  48.6543 |  42.3963 |   1.1476 |
frame_xs_row                                 |   0.0333 |   0.0290 |   1.1479 |
dti_reset_index                              |   0.3053 |   0.2657 |   1.1493 |
append_frame_single_mixed                    |   1.4187 |   1.2324 |   1.1512 |
strings_findall                              |   8.1600 |   7.0873 |   1.1514 |
timeseries_1min_5min_mean                    |   0.6646 |   0.5770 |   1.1519 |
multiindex_duplicated                        |  96.3167 |  83.5230 |   1.1532 |
packers_write_pickle                         | 130.7033 | 113.0910 |   1.1557 |
timeseries_custom_bday_cal_incr              |   0.0216 |   0.0187 |   1.1574 |
panel_from_dict_equiv_indexes                |  39.0920 |  33.7410 |   1.1586 |
frame_insert_500_columns_end                 |  84.1594 |  72.6186 |   1.1589 |
frame_to_csv_date_formatting                 |  11.6611 |  10.0466 |   1.1607 |
frame_iteritems                              |  25.1820 |  21.6257 |   1.1644 |
series_iloc_list_like                        |   0.7620 |   0.6526 |   1.1676 |
strings_repeat                               |   3.8571 |   3.3034 |   1.1676 |
reshape_pivot_time_series                    | 157.2340 | 134.5960 |   1.1682 |
strings_title                                |   7.7453 |   6.6237 |   1.1693 |
timeseries_year_incr                         |   0.0164 |   0.0140 |   1.1705 |
stat_ops_frame_mean_float_axis_1             |   4.6740 |   3.9916 |   1.1709 |
stat_ops_level_series_sum                    |   1.4474 |   1.2354 |   1.1716 |
packers_write_json_mixed_float_int_T         | 107.0007 |  91.1396 |   1.1740 |
stats_rank_pct_average_old                   |  28.4254 |  24.0947 |   1.1797 |
indexing_dataframe_boolean_no_ne             |  85.9000 |  72.7403 |   1.1809 |
strings_join_split                           |  36.1074 |  30.5471 |   1.1820 |
read_parse_dates_iso8601                     |   1.1773 |   0.9950 |   1.1832 |
stats_rank_average_int                       |  19.7994 |  16.6473 |   1.1893 |
series_loc_array                             |   0.9330 |   0.7813 |   1.1942 |
left_outer_join_index                        | 2329.4190 | 1949.8839 |   1.1946 |
groupby_int_count                            |   3.7150 |   3.0710 |   1.2097 |
strings_slice                                |   3.2950 |   2.7227 |   1.2102 |
sparse_frame_constructor                     |   5.0067 |   4.1353 |   1.2107 |
groupby_ngroups_100_cumsum                   |  12.2750 |  10.1330 |   1.2114 |
sparse_series_to_frame                       | 121.3010 |  99.8526 |   1.2148 |
indexing_dataframe_boolean_st                |  90.5453 |  74.4573 |   1.2161 |
groupby_sum_booleans                         |   1.0333 |   0.8493 |   1.2166 |
groupby_first_object                         |  15.8263 |  12.9640 |   1.2208 |
frame_ctor_dtindex_Nanox2                    |   0.9380 |   0.7677 |   1.2218 |
groupby_transform_ufunc                      | 101.8643 |  83.3070 |   1.2228 |
sql_float_read_query_sqlalchemy              |  12.8566 |  10.4984 |   1.2246 |
frame_ctor_dtindex_Microx1                   |   0.9426 |   0.7663 |   1.2301 |
reindex_fillna_backfill_float32              |   0.2473 |   0.2007 |   1.2325 |
frame_ctor_dtindex_Nanox1                    |   0.9500 |   0.7660 |   1.2403 |
sql_datetime_read_and_parse_sqlalchemy       |  18.3130 |  14.7080 |   1.2451 |
timeseries_period_downsample_mean            |  11.1174 |   8.8491 |   1.2563 |
dtype_infer_int64                            |   0.6446 |   0.5107 |   1.2622 |
groupby_last_float32                         |   3.0770 |   2.4350 |   1.2637 |
stat_ops_frame_mean_int_axis_0               |   4.5147 |   3.5590 |   1.2685 |
panel_from_dict_all_different_indexes        | 100.3427 |  78.9137 |   1.2715 |
reindex_multiindex                           |   1.3320 |   1.0437 |   1.2762 |
timeseries_year_apply                        |   0.0153 |   0.0120 |   1.2781 |
series_align_int64_index                     |  34.4250 |  26.7910 |   1.2849 |
strings_pad                                  |   4.4510 |   3.4587 |   1.2869 |
stat_ops_level_series_sum_multiple           |   4.9117 |   3.8057 |   1.2906 |
series_iloc_array                            |   5.0873 |   3.8664 |   1.3158 |
read_csv_infer_datetime_format_ymd           |   2.0707 |   1.5700 |   1.3189 |
dtype_infer_float64                          |   0.6750 |   0.5116 |   1.3192 |
frame_ctor_dtindex_Hourx1                    |   0.9096 |   0.6883 |   1.3216 |
frame_getitem_single_column                  |  19.6416 |  14.7943 |   1.3276 |
frame_ctor_dtindex_Microx2                   |   0.9057 |   0.6807 |   1.3305 |
series_loc_slice                             |   0.0476 |   0.0357 |   1.3341 |
stat_ops_level_frame_sum_multiple            |   6.6206 |   4.9623 |   1.3342 |
frame_ctor_dtindex_Minutex1                  |   0.9073 |   0.6800 |   1.3344 |
index_float64_boolean_series_indexer         |   3.6701 |   2.7480 |   1.3355 |
strings_match                                |   6.0484 |   4.5284 |   1.3357 |
dataframe_resample_min_numpy                 |   1.6863 |   1.2569 |   1.3416 |
eval_frame_chained_cmp_python                |  90.7466 |  66.8840 |   1.3568 |
index_str_boolean_series_indexer             |  10.8276 |   7.8300 |   1.3828 |
frame_ctor_dtindex_Minutex2                  |   0.8980 |   0.6429 |   1.3968 |
packers_write_pack                           |  27.5940 |  19.7157 |   1.3996 |
frame_ctor_dtindex_Dayx1                     |   0.9130 |   0.6493 |   1.4061 |
join_dataframe_index_single_key_bigger_sort  |  12.1287 |   8.5853 |   1.4127 |
frame_ctor_dtindex_Millix1                   |   0.8970 |   0.6333 |   1.4164 |
frame_ctor_dtindex_Millix2                   |   0.9647 |   0.6777 |   1.4236 |
frame_ctor_dtindex_Dayx2                     |   0.9143 |   0.6410 |   1.4264 |
frame_ctor_dtindex_Hourx2                    |   0.9403 |   0.6537 |   1.4385 |
groupby_ngroups_100_median                   |   0.4290 |   0.2973 |   1.4429 |
dataframe_resample_max_string                |   1.7533 |   1.2083 |   1.4511 |
frame_ctor_dtindex_Secondx1                  |   0.9033 |   0.6204 |   1.4561 |
frame_ctor_dtindex_Secondx2                  |   0.9097 |   0.6200 |   1.4674 |
timeseries_custom_bday_cal_incr_n            |   0.0263 |   0.0173 |   1.5183 |
series_getitem_label_slice                   |   0.0650 |   0.0427 |   1.5233 |
series_iloc_slice                            |   0.0410 |   0.0264 |   1.5542 |
indexing_dataframe_boolean                   | 110.2073 |  69.7950 |   1.5790 |
strings_contains_few_noregex                 |   3.1060 |   1.9280 |   1.6110 |
frame_boolean_row_select                     |   0.3196 |   0.1919 |   1.6654 |
frame_xs_mi_ix                               |   4.7997 |   2.8503 |   1.6839 |
datetime_index_union                         |   0.0747 |   0.0406 |   1.8395 |
eval_frame_mult_python_one_thread            |  27.5110 |  14.1670 |   1.9419 |
timeseries_custom_bday_apply_dt64            |   0.0373 |   0.0130 |   2.8598 |
groupby_first_datetimes                      |  24.7090 |   7.6466 |   3.2314 |
groupby_last_datetimes                       |  29.9554 |   9.0907 |   3.2952 |
timeseries_custom_bday_apply                 |   0.0403 |   0.0114 |   3.5455 |
timeseries_day_incr                          |   0.0237 |   0.0060 |   3.9733 |
timeseries_day_apply                         |   0.0267 |   0.0053 |   5.0149 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster than the baseline.
Seed used: 1234

Target [f2b54fb] : BUG: Fixes GH9311 groupby on datetime64

datetime64 columns were changing at the nano-second scale when
applying a groupby aggregator.
Base   [5fd1fbd] : Merge pull request #9318 from jorisvandenbossche/doc-api-dt

DOC: delete removed Timedelta properties (see GH9257) from API overview

@@ -4,6 +4,11 @@
# or we get a bootstrapping problem
from StringIO import StringIO

MAX_INT8 = 127

@shoyer
Member

shoyer commented Feb 3, 2015

OK, this looks much more reasonable to me.

I'm slightly troubled by the groupby_first_datetimes and groupby_last_datetimes performance tests -- aren't those the exact operations you tried hard to keep fast here? Can you run those tests again (see the -r option to vbench) to verify that they slowed down, and if so, figure out why? Maybe it's casting='safe'?
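
For reference, the ratio column is the head timing divided by the base timing, so the two benchmarks in question can be checked by hand. The sketch below is only an illustration; `vbench_ratio` is a hypothetical helper, not part of vbench, and the timings are copied from the first table above.

```python
def vbench_ratio(head_ms, base_ms):
    """Return head/base; a value above 1.0 means the target commit is slower."""
    return head_ms / base_ms


# (head[ms], base[ms]) pairs copied from the first benchmark table.
timings = {
    'groupby_first_datetimes': (24.7090, 7.6466),
    'groupby_last_datetimes': (29.9554, 9.0907),
}

for name, (head, base) in timings.items():
    ratio = vbench_ratio(head, base)
    verdict = 'slower' if ratio > 1.0 else 'faster or equal'
    print('%-28s ratio=%.4f (%s)' % (name, ratio, verdict))
```

Both ratios come out above 3, which is why these two rows stand out from the rest of the table.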

@@ -85,6 +87,8 @@
_dataframe_apply_whitelist = \
_common_apply_whitelist | frozenset(['dtypes', 'corrwith'])

_non_arithmetic_agg = ('first', 'last', 'min', 'max', 'nth', 'count')
Member

This should probably be a frozenset

@chrisbyboston
Author

@shoyer Thanks for looking at it so quickly. I'll make the changes suggested, and I'll use a profiler to figure out why groupby_first_datetimes and groupby_last_datetimes are slower.

@chrisbyboston
Author

@shoyer

Alright, I found it. Here's the relevant function:

    def _try_coerce_result(self, result):
        """ reverse of try_coerce_args """
        if isinstance(result, np.ndarray):
            if result.dtype == 'i8':
                result = tslib.array_to_datetime(
                    result.astype(object).ravel()).reshape(result.shape)
            elif result.dtype.kind in ['i', 'f', 'O']:
                result = result.astype('M8[ns]', casting='safe')
        elif isinstance(result, (np.integer, np.datetime64)):
            result = lib.Timestamp(result)
        return result

We aren't even hitting the casting='safe' section any more because the new Cython functions keep the datetime64 as an i8 when it comes into this function. The slowdown comes from the line that gets hit in the if result.dtype == 'i8' condition. Additionally, casting='safe' is useless in this elif block, because in numpy, none of the dtypes we're looking for can be safely cast to 'M8[ns]'. Here's the proof:

In [5]: np.can_cast(np.int8, 'M8[ns]')
Out[5]: False

In [6]: np.can_cast(np.int16, 'M8[ns]')
Out[6]: False

In [7]: np.can_cast(np.int32, 'M8[ns]')
Out[7]: False

In [8]: np.can_cast(np.int64, 'M8[ns]')
Out[8]: False

In [9]: np.can_cast(np.float32, 'M8[ns]')
Out[9]: False

In [10]: np.can_cast(np.float64, 'M8[ns]')
Out[10]: False

In [11]: np.can_cast('O', 'M8[ns]')
Out[11]: False
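
The same check can be scripted so it is easy to rerun against whichever NumPy is installed. This is a sketch only: `safe_cast_report` is a hypothetical helper, and the answers are deliberately not hard-coded because casting rules can differ between NumPy versions.

```python
import numpy as np


def safe_cast_report(target='M8[ns]'):
    """Map each source dtype name to whether NumPy allows a 'safe' cast to target."""
    sources = [np.int8, np.int16, np.int32, np.int64,
               np.float32, np.float64, np.dtype('O')]
    return {
        np.dtype(src).name: bool(np.can_cast(src, target, casting='safe'))
        for src in sources
    }


for name, allowed in sorted(safe_cast_report().items()):
    print('%-8s -> M8[ns] safe cast allowed: %s' % (name, allowed))
```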

What's more, this block...

            if result.dtype == 'i8':
                result = tslib.array_to_datetime(
                    result.astype(object).ravel()).reshape(result.shape)

...I don't believe is necessary any more, as I think it was covering for this error in numpy 1.6.
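
To illustrate why the object round trip is avoidable, here is a sketch that reinterprets int64 nanoseconds as datetime64[ns] directly. It is an assumption about one possible cleanup, not the change made in this PR, and `coerce_i8_to_datetime64` is a hypothetical name.

```python
import numpy as np


def coerce_i8_to_datetime64(result):
    """Hypothetical helper: reinterpret int64 nanoseconds as datetime64[ns].

    Uses a view, so no Python objects are created and no data is copied.
    Arrays of any other dtype are returned unchanged.
    """
    if isinstance(result, np.ndarray) and result.dtype == np.dtype('i8'):
        return result.view('M8[ns]')
    return result


nanos = np.array([1356998400000000000, 1357084800000000000], dtype='i8')
stamps = coerce_i8_to_datetime64(nanos)
print(stamps.dtype, np.shares_memory(nanos, stamps))
```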

I'm going to clean this function up and make sure all the tests are passing and that vbench looks better, and I'll push up my changes.

@shoyer
Member

shoyer commented Feb 4, 2015

@iwschris Interesting -- sounds like a good plan to me!

@jreback
Contributor

jreback commented Feb 4, 2015

@iwschris The point of that coercion was to handle the cases where the returned result was non-i8, e.g. an op was applied to a datetime64 (say mean in a groupby) that returned a float / object. So it prob wasn't hit very much (maybe not tested at all).

@chrisbyboston
Author

@shoyer I think I've made all the changes that you requested, and the groupby performance is fixed. Let me know if you see anything else that needs to be changed.

@chrisbyboston
Author

Cool. I think that last push takes care of everything mentioned up to this point. Anything else?

@jreback
Contributor

jreback commented Feb 13, 2015

just FYI

In [1]: x = pd.date_range('20130101',periods=3)

In [2]: x.values            
Out[2]: 
array(['2012-12-31T19:00:00.000000000-0500',
       '2013-01-01T19:00:00.000000000-0500',
       '2013-01-02T19:00:00.000000000-0500'], dtype='datetime64[ns]')

In [3]: x.values.view('i8').base
Out[3]: array([1356998400000000000, 1357084800000000000, 1357171200000000000])

In [4]: x.values.astype('i8',copy=False).base

In [5]: np.asarray(x.values,'i8').base

The reason we always want to take a view on an M8/m8 object is that it does NOT copy, whereas the other 2 methods always copy (even with the copy=False flag).

So these DO need to be treated separately (from a regular int64).
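
The copy behaviour can also be checked with `np.shares_memory` instead of inspecting `.base`. This is a sketch using plain NumPy arrays (an assumption made to keep it free of pandas); it compares only the view and astype routes from the transcript above.

```python
import numpy as np

values = np.array(['2013-01-01', '2013-01-02', '2013-01-03'], dtype='M8[ns]')

# Reinterpret the same 8-byte buffer as int64 nanoseconds since the epoch.
as_view = values.view('i8')

# copy=False only skips the copy when no dtype conversion is required;
# M8[ns] -> i8 is a conversion, so a new array is allocated.
as_cast = values.astype('i8', copy=False)

print('view shares memory:  ', np.shares_memory(values, as_view))
print('astype shares memory:', np.shares_memory(values, as_cast))
```

Because the view aliases the original buffer, writing through it would also change the datetime64 array, which is worth keeping in mind when passing it to code that mutates its input.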

@chrisbyboston
Author

I see. Good info.

Newest stuff now pushed.

@shoyer
Member

shoyer commented Feb 13, 2015

@jreback interesting -- didn't realize that about view vs astype for datetime64

@jreback
Contributor

jreback commented Feb 13, 2015

@iwschris can you just run another vbench vs master (you can limit it with -r groupby|timeseries if you want), just to check? Thanks

@chrisbyboston
Author

You bet. Should have it in a few minutes.

@chrisbyboston
Author

@jreback looks like objects weren't accounted for in that group of if/elif statements. Making that change now...

@jreback
Contributor

jreback commented Feb 13, 2015

@iwschris right, yeh that should be the else.

datetime64 columns were changing at the nano-second scale when
applying a groupby aggregator.
@chrisbyboston
Author

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
groupby_sum_multiindex                       |   0.7521 |   3.2990 |   0.2280 |
groupby_first_float32                        |   2.3560 |   4.8750 |   0.4833 |
groupby_ngroups_100_max                      |   0.2607 |   0.4440 |   0.5871 |
groupby_multi_index                          | 586.1234 | 994.6311 |   0.5893 |
groupby_ngroups_100_count                    |   0.2876 |   0.4670 |   0.6159 |
timeseries_day_apply                         |   0.0319 |   0.0513 |   0.6223 |
groupby_ngroups_100_last                     |   0.2426 |   0.3810 |   0.6368 |
groupby_ngroups_10000_size                   |   3.7566 |   5.5623 |   0.6754 |
timeseries_custom_bday_cal_decr              |   0.0196 |   0.0290 |   0.6767 |
timeseries_custom_bmonthbegin_decr_n         |   0.1824 |   0.2547 |   0.7161 |
groupby_agg_builtins1                        |   6.5397 |   9.0700 |   0.7210 |
groupby_ngroups_100_first                    |   0.2743 |   0.3740 |   0.7335 |
timeseries_day_incr                          |   0.0233 |   0.0313 |   0.7437 |
groupby_ngroups_10000_last                   |   1.9563 |   2.5903 |   0.7552 |
groupby_int_count                            |   2.6743 |   3.5147 |   0.7609 |
groupby_agg_builtins2                        |  32.5734 |  42.7121 |   0.7626 |
groupby_series_nth_none                      |   0.9303 |   1.2077 |   0.7703 |
timeseries_custom_bday_cal_incr_n            |   0.0164 |   0.0211 |   0.7774 |
groupby_transform_series                     |  15.4217 |  19.6993 |   0.7829 |
groupby_last_float32                         |   2.4937 |   3.1406 |   0.7940 |
timeseries_custom_bday_incr                  |   0.0120 |   0.0150 |   0.7989 |
groupby_ngroups_100_skew                     |   9.2690 |  11.5190 |   0.8047 |
groupby_ngroups_10000_first                  |   1.5870 |   1.9684 |   0.8062 |
groupby_ngroups_10000_unique                 | 528.3200 | 650.3677 |   0.8123 |
groupby_ngroups_100_sem                      |   0.5520 |   0.6727 |   0.8207 |
groupby_ngroups_100_min                      |   0.3023 |   0.3684 |   0.8207 |
groupby_frame_apply_overhead                 |   6.6214 |   8.0397 |   0.8236 |
groupby_ngroups_10000_var                    |   1.7893 |   2.1513 |   0.8317 |
groupby_ngroups_100_sum                      |   0.3630 |   0.4364 |   0.8319 |
groupby_transform_ufunc                      |  90.9894 | 108.7949 |   0.8363 |
groupby_transform_multi_key4                 |  96.6847 | 115.4637 |   0.8374 |
groupby_dt_size                              |  19.4210 |  23.1177 |   0.8401 |
groupby_ngroups_100_head                     |   0.5563 |   0.6620 |   0.8403 |
timeseries_custom_bday_decr                  |   0.0197 |   0.0234 |   0.8435 |
groupby_transform_series2                    | 110.8003 | 130.9373 |   0.8462 |
groupby_ngroups_10000_max                    |   1.5440 |   1.8240 |   0.8465 |
groupby_transform_multi_key3                 | 492.3480 | 579.4737 |   0.8496 |
groupby_simple_compress_timing               |  22.3200 |  26.1633 |   0.8531 |
timeseries_to_datetime_iso8601               |   3.0967 |   3.6070 |   0.8585 |
groupby_int64_overflow                       | 251.3187 | 291.4154 |   0.8624 |
groupby_ngroups_10000_rank                   | 972.6830 | 1118.1310 |   0.8699 |
groupby_transform_multi_key2                 |  29.4696 |  33.8450 |   0.8707 |
groupby_ngroups_10000_all                    | 712.3833 | 810.1400 |   0.8793 |
groupby_transform_multi_key1                 |  43.5643 |  49.4163 |   0.8816 |
timeseries_1min_5min_ohlc                    |   0.7386 |   0.8353 |   0.8842 |
groupby_frame_singlekey_integer              |   1.3994 |   1.5797 |   0.8858 |
groupby_multi_different_numpy_functions      |   7.7173 |   8.6620 |   0.8909 |
timeseries_asof_nan                          |   5.7263 |   6.4077 |   0.8937 |
timeseries_custom_bmonthend_decr_n           |   0.2110 |   0.2357 |   0.8951 |
groupby_ngroups_10000_diff                   | 891.7777 | 987.4310 |   0.9031 |
groupby_multi_count                          |   5.0724 |   5.6121 |   0.9038 |
timeseries_custom_bmonthbegin_incr_n         |   0.1677 |   0.1847 |   0.9079 |
groupby_nth_float64_none                     |  69.3804 |  76.1610 |   0.9110 |
groupby_multi_different_functions            |   8.1933 |   8.9907 |   0.9113 |
groupby_indices                              |   4.6080 |   5.0390 |   0.9145 |
timeseries_asof                              |   5.8460 |   6.3770 |   0.9167 |
groupby_ngroups_10000_nunique                | 673.3130 | 732.7573 |   0.9189 |
timeseries_custom_bday_cal_incr              |   0.0207 |   0.0223 |   0.9253 |
groupby_ngroups_100_all                      |   7.4526 |   8.0520 |   0.9256 |
groupby_ngroups_100_pct_change               |  28.9074 |  30.9927 |   0.9327 |
groupby_apply_dict_return                    |  26.3223 |  28.0710 |   0.9377 |
groupby_nth_object_any                       | 907.5707 | 960.7120 |   0.9447 |
groupby_ngroups_100_size                     |   0.4393 |   0.4650 |   0.9448 |
groupby_series_nth_any                       |   2.8097 |   2.9716 |   0.9455 |
groupby_multi_size                           |  18.4967 |  19.5157 |   0.9478 |
groupby_ngroups_10000_sem                    |   2.3330 |   2.4590 |   0.9488 |
timeseries_year_incr                         |   0.0133 |   0.0140 |   0.9489 |
groupby_ngroups_10000_mad                    | 3652.3110 | 3848.0051 |   0.9491 |
groupby_first_datetimes                      |   7.6036 |   8.0037 |   0.9500 |
groupby_ngroups_100_mean                     |   0.2724 |   0.2867 |   0.9501 |
groupby_ngroups_100_any                      |   7.2740 |   7.6460 |   0.9513 |
groupby_nth_datetimes_any                    | 870.5579 | 914.8546 |   0.9516 |
timeseries_sort_index                        |   7.9620 |   8.3427 |   0.9544 |
timeseries_custom_bday_apply                 |   0.0134 |   0.0140 |   0.9545 |
groupby_multi_series_op                      |   9.9360 |  10.3896 |   0.9563 |
groupby_ngroups_100_rank                     |  11.3663 |  11.8790 |   0.9568 |
groupby_nth_object_none                      | 514.6056 | 536.9283 |   0.9584 |
groupby_last_datetimes                       |   9.1613 |   9.5367 |   0.9606 |
groupby_pivot_table                          |  11.9127 |  12.3960 |   0.9610 |
timeseries_period_downsample_mean            |   8.3439 |   8.6637 |   0.9631 |
groupby_ngroups_100_cummin                   |  11.8447 |  12.2477 |   0.9671 |
groupby_last_object                          |  13.5950 |  14.0096 |   0.9704 |
groupby_ngroups_100_unique                   |   5.3563 |   5.5177 |   0.9707 |
groupby_ngroups_10000_mean                   |   1.6189 |   1.6673 |   0.9710 |
groupby_first_object                         |  13.5777 |  13.9623 |   0.9725 |
groupby_frame_median                         |   4.6940 |   4.8234 |   0.9732 |
timeseries_with_format_replace               | 849.3273 | 870.5983 |   0.9756 |
timeseries_with_format_no_exact              | 642.9497 | 657.4820 |   0.9779 |
timeseries_custom_bmonthend_incr             |   0.1337 |   0.1367 |   0.9779 |
groupby_ngroups_10000_any                    | 680.4217 | 694.6084 |   0.9796 |
groupby_ngroups_100_median                   |   0.2913 |   0.2970 |   0.9810 |
groupby_ngroups_100_mad                      |  37.1737 |  37.8356 |   0.9825 |
timeseries_to_datetime_YYYYMMDD              |  10.6564 |  10.8293 |   0.9840 |
timeseries_iter_periodindex_preexit          |   9.1586 |   9.2920 |   0.9856 |
groupby_multi_cython                         |  11.9693 |  12.1080 |   0.9885 |
groupby_ngroups_100_cummax                   |  11.3430 |  11.4384 |   0.9917 |
groupby_sum_booleans                         |   0.9013 |   0.9084 |   0.9922 |
groupby_ngroups_10000_describe               | 13609.3757 | 13623.5873 |   0.9990 |
timeseries_large_lookup_value                |   0.0137 |   0.0137 |   1.0000 |
groupby_ngroups_10000_cumprod                | 1134.5750 | 1133.7350 |   1.0007 |
groupby_ngroups_10000_cummax                 | 1093.6377 | 1090.4837 |   1.0029 |
groupby_nth_float32_none                     |  71.3633 |  71.0537 |   1.0044 |
groupby_ngroups_100_describe                 | 134.6430 | 134.0171 |   1.0047 |
groupby_ngroups_10000_sum                    |   1.7863 |   1.7767 |   1.0054 |
groupby_ngroups_10000_pct_change             | 3164.2790 | 3113.4090 |   1.0163 |
groupby_ngroups_10000_cummin                 | 1061.4703 | 1044.0420 |   1.0167 |
groupby_ngroups_10000_count                  |   1.7674 |   1.7337 |   1.0194 |
groupby_ngroups_10000_value_counts           | 4022.8783 | 3946.0400 |   1.0195 |
groupby_ngroups_10000_median                 |   2.3727 |   2.3223 |   1.0217 |
groupby_ngroups_100_prod                     |   0.4040 |   0.3926 |   1.0291 |
groupby_ngroups_10000_cumsum                 | 1125.8500 | 1093.0517 |   1.0300 |
groupby_nth_datetimes_none                   | 482.6593 | 468.3119 |   1.0306 |
timeseries_timestamp_downsample_mean         |   3.8143 |   3.6987 |   1.0313 |
timeseries_iter_datetimeindex                | 562.7003 | 545.4220 |   1.0317 |
timeseries_infer_freq                        |   7.6024 |   7.3264 |   1.0377 |
groupby_ngroups_10000_cumcount               |  66.6266 |  64.1980 |   1.0378 |
groupby_transform                            | 124.0466 | 119.5150 |   1.0379 |
timeseries_1min_5min_mean                    |   0.6224 |   0.5994 |   1.0383 |
groupby_frame_apply                          |  31.1930 |  29.9776 |   1.0405 |
groupby_series_simple_cython                 | 173.9823 | 166.2357 |   1.0466 |
groupby_first_float64                        |   2.7541 |   2.6210 |   1.0508 |
timeseries_custom_bday_apply_dt64            |   0.0147 |   0.0140 |   1.0511 |
groupby_ngroups_10000_tail                   |  67.1217 |  63.7630 |   1.0527 |
timeseries_iter_periodindex                  | 1014.3323 | 961.3767 |   1.0551 |
groupby_ngroups_100_value_counts             |  42.9380 |  40.6311 |   1.0568 |
groupby_frame_nth_any                        |   4.6680 |   4.3880 |   1.0638 |
groupby_ngroups_100_var                      |   0.2960 |   0.2766 |   1.0701 |
groupby_ngroups_100_cumcount                 |   0.6820 |   0.6333 |   1.0768 |
groupby_ngroups_100_cumsum                   |  11.8704 |  11.0226 |   1.0769 |
groupby_ngroups_100_nunique                  |   8.1023 |   7.5150 |   1.0782 |
groupby_ngroups_100_diff                     |  11.5736 |  10.7230 |   1.0793 |
groupby_ngroups_10000_head                   |  67.4507 |  62.1850 |   1.0847 |
groupby_multi_python                         |  78.9371 |  72.3193 |   1.0915 |
groupby_ngroups_100_std                      |   0.4003 |   0.3661 |   1.0936 |
timeseries_iter_datetimeindex_preexit        |  11.6250 |  10.5534 |   1.1015 |
timeseries_custom_bday_cal_incr_neg_n        |   0.0213 |   0.0193 |   1.1029 |
timeseries_add_irregular                     |  11.4637 |  10.3450 |   1.1081 |
groupby_ngroups_10000_skew                   | 1084.5517 | 978.5136 |   1.1084 |
groupby_ngroups_100_tail                     |   0.6386 |   0.5750 |   1.1107 |
groupby_ngroups_100_cumprod                  |  11.9090 |  10.6903 |   1.1140 |
groupby_frame_nth_none                       |   1.7730 |   1.5844 |   1.1191 |
groupby_dt_timegrouper_size                  |  17.4917 |  15.5451 |   1.1252 |
groupby_ngroups_10000_prod                   |   2.0370 |   1.7896 |   1.1382 |
groupby_ngroups_10000_std                    |   2.1477 |   1.8667 |   1.1505 |
frame_assign_timeseries_index                |   0.6467 |   0.5604 |   1.1540 |
groupby_frame_cython_many_columns            |   2.5174 |   2.1220 |   1.1863 |
timeseries_is_month_start                    |   3.0104 |   2.5110 |   1.1989 |
timeseries_custom_bmonthend_incr_n           |   0.2177 |   0.1777 |   1.2250 |
timeseries_asof_single                       |   0.0223 |   0.0180 |   1.2434 |
groupby_ngroups_10000_min                    |   2.2504 |   1.8086 |   1.2442 |
groupby_last_float64                         |   3.3420 |   2.6394 |   1.2662 |
timeseries_slice_minutely                    |   0.0567 |   0.0400 |   1.4175 |
timeseries_year_apply                        |   0.0286 |   0.0160 |   1.7910 |
timeseries_timestamp_tzinfo_cons             |   0.0143 |   0.0077 |   1.8557 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

@chrisbyboston
Author

@jreback - vbench output posted, and the modification for object is done.

Let me know if there is anything else.

@shoyer
Member

shoyer commented Feb 13, 2015

vbench looks pretty good to me -- some nice speedups for grouped aggregations!

@jreback
Contributor

jreback commented Feb 13, 2015

the vbench differences are actually due to other PRs.

Here is what I get:

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
groupby_ngroups_100_count                    |   0.3027 |   0.4513 |   0.6707 |
groupby_ngroups_100_min                      |   0.2793 |   0.4063 |   0.6875 |
groupby_ngroups_100_max                      |   0.2987 |   0.4203 |   0.7107 |
groupby_ngroups_100_last                     |   0.2950 |   0.4013 |   0.7350 |
groupby_ngroups_100_first                    |   0.2836 |   0.3827 |   0.7412 |
groupby_int_count                            |   2.7003 |   3.2799 |   0.8233 |
groupby_ngroups_10000_min                    |   1.8367 |   2.2127 |   0.8301 |
groupby_ngroups_10000_max                    |   1.9150 |   2.1427 |   0.8937 |
timeseries_infer_freq                        |   7.5614 |   8.4263 |   0.8973 |
groupby_dt_timegrouper_size                  |  15.5920 |  17.3516 |   0.8986 |
groupby_simple_compress_timing               |  24.6580 |  27.4227 |   0.8992 |
groupby_ngroups_100_cumsum                   |  11.6700 |  12.9297 |   0.9026 |
timeseries_to_datetime_YYYYMMDD              |   7.5997 |   8.3993 |   0.9048 |
timeseries_sort_index                        |   7.4020 |   8.1673 |   0.9063 |
groupby_ngroups_100_tail                     |   0.6397 |   0.7053 |   0.9069 |
groupby_nth_float32_none                     |  70.5680 |  77.5150 |   0.9104 |
timeseries_custom_bmonthend_incr_n           |   0.1960 |   0.2136 |   0.9174 |
groupby_sum_booleans                         |   0.9297 |   1.0130 |   0.9178 |
groupby_ngroups_100_diff                     |  10.6417 |  11.5767 |   0.9192 |
groupby_ngroups_100_value_counts             |  41.5277 |  45.1584 |   0.9196 |
groupby_ngroups_10000_std                    |   2.0053 |   2.1660 |   0.9258 |
timeseries_period_downsample_mean            |   9.2463 |   9.9670 |   0.9277 |
groupby_ngroups_100_size                     |   0.4164 |   0.4483 |   0.9287 |
timeseries_iter_datetimeindex_preexit        |  10.2930 |  11.0653 |   0.9302 |
timeseries_iter_datetimeindex                | 558.3330 | 600.1287 |   0.9304 |
groupby_transform_multi_key4                 | 102.0021 | 109.6350 |   0.9304 |
groupby_ngroups_10000_value_counts           | 4049.3347 | 4349.6927 |   0.9309 |
timeseries_custom_bday_apply_dt64            |   0.0140 |   0.0150 |   0.9312 |
groupby_ngroups_10000_sum                    |   1.9740 |   2.1197 |   0.9313 |
groupby_multi_index                          | 527.2981 | 565.1563 |   0.9330 |
groupby_ngroups_10000_cummax                 | 1053.8023 | 1128.4770 |   0.9338 |
groupby_agg_builtins2                        |  29.9434 |  32.0370 |   0.9346 |
groupby_ngroups_10000_prod                   |   1.9640 |   2.0967 |   0.9367 |
timeseries_custom_bday_cal_incr              |   0.0190 |   0.0203 |   0.9373 |
plot_timeseries_period                       | 100.2293 | 106.8187 |   0.9383 |
groupby_ngroups_10000_last                   |   2.0286 |   2.1590 |   0.9396 |
groupby_ngroups_10000_cumprod                | 1070.9914 | 1139.1160 |   0.9402 |
timeseries_1min_5min_mean                    |   0.6700 |   0.7126 |   0.9402 |
timeseries_to_datetime_iso8601               |   3.4893 |   3.7096 |   0.9406 |
groupby_ngroups_10000_skew                   | 975.7330 | 1034.3137 |   0.9434 |
groupby_ngroups_10000_describe               | 11928.2053 | 12641.3703 |   0.9436 |
timeseries_custom_bday_cal_incr_neg_n        |   0.0220 |   0.0233 |   0.9454 |
timeseries_custom_bmonthend_decr_n           |   0.2356 |   0.2490 |   0.9464 |
timeseries_1min_5min_ohlc                    |   0.7553 |   0.7976 |   0.9470 |
timeseries_iter_periodindex_preexit          |   9.7260 |  10.2700 |   0.9470 |
groupby_ngroups_100_mad                      |  32.6534 |  34.4483 |   0.9479 |
timeseries_custom_bday_cal_incr_n            |   0.0190 |   0.0200 |   0.9484 |
timeseries_day_apply                         |   0.0257 |   0.0270 |   0.9500 |
timeseries_is_month_start                    |   2.8730 |   3.0140 |   0.9532 |
groupby_transform                            | 119.9337 | 125.6570 |   0.9545 |
timeseries_custom_bday_incr                  |   0.0134 |   0.0140 |   0.9545 |
timeseries_custom_bmonthend_incr             |   0.1537 |   0.1610 |   0.9546 |
timeseries_custom_bmonthbegin_decr_n         |   0.2077 |   0.2174 |   0.9554 |
groupby_transform_ufunc                      |  93.5323 |  97.8587 |   0.9558 |
timeseries_custom_bmonthbegin_incr_n         |   0.1920 |   0.2007 |   0.9568 |
groupby_ngroups_10000_first                  |   1.9853 |   2.0707 |   0.9588 |
timeseries_iter_periodindex                  | 969.0204 | 1010.4167 |   0.9590 |
groupby_frame_apply                          |  30.6340 |  31.9337 |   0.9593 |
timeseries_year_incr                         |   0.0150 |   0.0157 |   0.9594 |
timeseries_with_format_no_exact              | 629.6630 | 655.2920 |   0.9609 |
groupby_frame_cython_many_columns            |   2.4337 |   2.5264 |   0.9633 |
groupby_ngroups_10000_cumsum                 | 1083.9800 | 1123.7434 |   0.9646 |
groupby_last_datetimes                       |  10.0000 |  10.3630 |   0.9650 |
groupby_multi_count                          |   6.0413 |   6.2490 |   0.9668 |
timeseries_custom_bday_decr                  |   0.0210 |   0.0217 |   0.9670 |
timeseries_custom_bday_apply                 |   0.0126 |   0.0130 |   0.9695 |
groupby_transform_multi_key3                 | 581.4103 | 598.2854 |   0.9718 |
groupby_nth_object_any                       | 897.0937 | 922.5266 |   0.9724 |
groupby_nth_object_none                      | 490.6300 | 504.4353 |   0.9726 |
timeseries_year_apply                        |   0.0146 |   0.0150 |   0.9735 |
groupby_multi_series_op                      |  10.7660 |  11.0053 |   0.9783 |
groupby_ngroups_100_cumcount                 |   0.6344 |   0.6467 |   0.9810 |
groupby_ngroups_10000_all                    | 721.7060 | 735.6390 |   0.9811 |
groupby_ngroups_100_std                      |   0.3716 |   0.3787 |   0.9813 |
timeseries_with_format_replace               | 851.9327 | 867.9880 |   0.9815 |
groupby_ngroups_10000_mad                    | 3273.8944 | 3332.5966 |   0.9824 |
groupby_ngroups_10000_cumcount               |  64.3473 |  65.4813 |   0.9827 |
groupby_ngroups_100_describe                 | 118.8650 | 120.9197 |   0.9830 |
groupby_multi_python                         |  75.6930 |  76.8147 |   0.9854 |
groupby_transform_multi_key1                 |  51.7964 |  52.5324 |   0.9860 |
timeseries_slice_minutely                    |   0.0454 |   0.0460 |   0.9862 |
groupby_ngroups_100_cumprod                  |  11.7360 |  11.8947 |   0.9867 |
frame_assign_timeseries_index                |   0.6424 |   0.6500 |   0.9883 |
groupby_sum_multiindex                       |   0.9010 |   0.9106 |   0.9894 |
groupby_ngroups_10000_unique                 | 526.0944 | 531.6893 |   0.9895 |
groupby_ngroups_10000_median                 |   2.4913 |   2.5120 |   0.9918 |
groupby_nth_float64_none                     |  73.2677 |  73.7294 |   0.9937 |
timeseries_large_lookup_value                |   0.0149 |   0.0150 |   0.9947 |
groupby_multi_different_functions            |   9.1166 |   9.1650 |   0.9947 |
groupby_ngroups_100_prod                     |   0.4179 |   0.4200 |   0.9951 |
groupby_ngroups_100_all                      |   8.0884 |   8.1267 |   0.9953 |
groupby_ngroups_10000_pct_change             | 3417.7426 | 3432.1783 |   0.9958 |
groupby_ngroups_100_nunique                  |   8.5557 |   8.5843 |   0.9967 |
groupby_ngroups_10000_size                   |   3.6917 |   3.7020 |   0.9972 |
groupby_ngroups_10000_tail                   |  65.5094 |  65.5634 |   0.9992 |
groupby_multi_cython                         |  11.8380 |  11.8477 |   0.9992 |
groupby_ngroups_100_sum                      |   0.4176 |   0.4177 |   0.9998 |
timeseries_asof_single                       |   0.0207 |   0.0207 |   1.0000 |
timeseries_custom_bday_cal_decr              |   0.0223 |   0.0223 |   1.0000 |
groupby_int64_overflow                       | 286.5373 | 286.4160 |   1.0004 |
groupby_ngroups_100_cummax                   |  12.5953 |  12.5740 |   1.0017 |
timeseries_day_incr                          |   0.0270 |   0.0269 |   1.0029 |
groupby_ngroups_10000_any                    | 714.5850 | 711.8286 |   1.0039 |
groupby_series_nth_any                       |   3.5603 |   3.5450 |   1.0043 |
groupby_frame_apply_overhead                 |   6.8093 |   6.7707 |   1.0057 |
groupby_nth_datetimes_any                    | 918.3880 | 912.4471 |   1.0065 |
groupby_series_nth_none                      |   1.1896 |   1.1800 |   1.0081 |
groupby_ngroups_100_cummin                   |  12.1826 |  12.0743 |   1.0090 |
groupby_frame_median                         |   5.9024 |   5.8440 |   1.0100 |
groupby_frame_nth_any                        |   5.2697 |   5.2167 |   1.0102 |
groupby_ngroups_100_head                     |   0.6737 |   0.6667 |   1.0105 |
groupby_last_object                          |  14.8334 |  14.6753 |   1.0108 |
groupby_ngroups_100_var                      |   0.3307 |   0.3270 |   1.0114 |
groupby_ngroups_10000_var                    |   2.1030 |   2.0790 |   1.0115 |
groupby_ngroups_10000_nunique                | 771.5917 | 761.1760 |   1.0137 |
groupby_ngroups_10000_diff                   | 1028.4077 | 1010.5730 |   1.0176 |
groupby_frame_nth_none                       |   1.9077 |   1.8697 |   1.0203 |
groupby_pivot_table                          |  14.1603 |  13.8530 |   1.0222 |
groupby_last_float32                         |   3.0050 |   2.9330 |   1.0245 |
groupby_transform_series2                    | 112.3273 | 109.4310 |   1.0265 |
groupby_ngroups_100_rank                     |  13.1743 |  12.8287 |   1.0269 |
groupby_apply_dict_return                    |  29.3473 |  28.5690 |   1.0272 |
groupby_ngroups_10000_cummin                 | 1117.0100 | 1080.1910 |   1.0341 |
groupby_first_object                         |  14.8550 |  14.3493 |   1.0352 |
groupby_first_datetimes                      |   9.3730 |   9.0377 |   1.0371 |
groupby_ngroups_100_pct_change               |  37.1253 |  35.6517 |   1.0413 |
groupby_ngroups_10000_head                   |  67.1093 |  64.4406 |   1.0414 |
groupby_ngroups_10000_rank                   | 1145.4223 | 1098.6357 |   1.0426 |
groupby_ngroups_100_skew                     |  11.2367 |  10.7393 |   1.0463 |
groupby_transform_multi_key2                 |  36.3350 |  34.6060 |   1.0500 |
groupby_series_simple_cython                 | 183.0710 | 174.3507 |   1.0500 |
groupby_nth_datetimes_none                   | 443.9003 | 422.1927 |   1.0514 |
groupby_multi_size                           |  19.4400 |  18.4393 |   1.0543 |
timeseries_asof_nan                          |   2.4397 |   2.3093 |   1.0564 |
groupby_transform_series                     |  17.8304 |  16.7394 |   1.0652 |
groupby_multi_different_numpy_functions      |   9.3890 |   8.8094 |   1.0658 |
groupby_ngroups_100_median                   |   0.3647 |   0.3417 |   1.0675 |
groupby_ngroups_100_any                      |   8.1367 |   7.6074 |   1.0696 |
groupby_ngroups_100_unique                   |   6.4450 |   5.9753 |   1.0786 |
groupby_first_float64                        |   2.7567 |   2.5383 |   1.0860 |
groupby_dt_size                              |  22.9206 |  21.0621 |   1.0882 |
groupby_ngroups_10000_mean                   |   2.0073 |   1.8340 |   1.0945 |
groupby_ngroups_100_mean                     |   0.3433 |   0.3096 |   1.1088 |
groupby_last_float64                         |   3.6613 |   2.9603 |   1.2368 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster than the baseline.
Seed used: 1234

Target [5f6cbf8] : BUG: Fixes GH9311 groupby on datetime64
datetime64 columns were changing at the nano-second scale when
applying a groupby aggregator.
Base   [f2882b8] : Merge pull request #9479 from jreback/align
BUG: bug in partial setting of with a DatetimeIndex (GH9478)
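
For reading the tables above: the ratio column appears to be head time divided by base time (for example 0.3027 / 0.4513 ≈ 0.6707). Below is a minimal sketch for parsing one row and labelling it. `parse_vbench_row` and `verdict` are hypothetical helpers, not part of vbench or pandas, and the 5% noise tolerance is an arbitrary assumption.

```python
# One row copied from the table above.
ROW = "groupby_ngroups_100_count                    |   0.3027 |   0.4513 |   0.6707 |"


def parse_vbench_row(line):
    """Split a 'name | head[ms] | base[ms] | ratio |' row into typed fields."""
    cells = [cell.strip() for cell in line.split("|")]
    return cells[0], float(cells[1]), float(cells[2]), float(cells[3])


def verdict(ratio, tolerance=0.05):
    """Label a head/base ratio; the tolerance is an assumed noise band."""
    if ratio < 1.0 - tolerance:
        return "faster"
    if ratio > 1.0 + tolerance:
        return "slower"
    return "about the same"


name, head_ms, base_ms, ratio = parse_vbench_row(ROW)
print(name, round(head_ms / base_ms, 4), verdict(ratio))
```

Because the printed timings are rounded, a recomputed ratio may differ from the printed one in the last digit.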

@iwschris I think you were benching against an older version.

in any event it looks fine.

ping when green

@chrisbyboston
Author

Good to know for future PRs. It just takes a fetch and a rebase, right?

@jreback
Contributor

jreback commented Feb 13, 2015

yep, then I run with

./test_perf.sh -b master -t HEAD -r 'groupby|timeseries'

if you originally checked out with git checkout -b your_branch --track master,
then a fetch/rebase will work

(or you can set the tracking branch, e.g. git branch your_branch --set-upstream-to origin/master (or local master), however you prefer)

@chrisbyboston
Author

Yep, our vbench results match now.

@chrisbyboston
Author

@jreback it's green.

jreback added a commit that referenced this pull request Feb 14, 2015
BUG: Fixes GH9311 groupby on datetime64
jreback merged commit 3f24b87 into pandas-dev:master Feb 14, 2015
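
For context on the bug the merged commit addresses (datetime64 values shifting at the nanosecond scale): one plausible mechanism, assumed here rather than spelled out in this thread, is an int64 nanosecond timestamp passing through a float64, which carries only 53 bits of mantissa. This is a minimal sketch in plain Python, whose `float` is the same IEEE 754 double. The timestamp is an illustrative value, not taken from the original issue.

```python
# Nanoseconds since the epoch for 2015-01-01 00:00:00.123456789 UTC
# (illustrative value, not from the original bug report).
original_ns = 1420070400 * 10**9 + 123456789

# Round-trip through a 64-bit float, as a float64-based code path would.
round_tripped_ns = int(float(original_ns))

# Between 2**60 and 2**61 adjacent doubles are 256 apart, so the
# low-order nanoseconds cannot survive the conversion.
drift_ns = round_tripped_ns - original_ns
print(original_ns, round_tripped_ns, drift_ns)
```

Keeping such values as int64 throughout avoids this loss, which is one reason a datetime-specific branch in the aggregation code can matter.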
@jreback
Contributor

jreback commented Feb 14, 2015

thanks @iwschris!

working with cython and generated code is a bit non-trivial. thanks for all of the patience and effort!

feel free to look at other issues! (hint hint... #4095), might be an interesting one

@chrisbyboston
Author

Thanks for working with me on it! I'll take a peek at #4095.

@shoyer
Member

shoyer commented Feb 14, 2015

Indeed, really nicely done

Labels: Bug, Dtype Conversions, Groupby