
API: Add equals method to NDFrames. #5283

Merged
merged 1 commit into pandas-dev:master from unutbu/array-equivalent on Jan 24, 2014

Conversation

@unutbu (Contributor) commented Oct 20, 2013

Also adds array_equivalent, which
is similar to np.array_equal except that it handles object arrays and
treats NaNs in corresponding locations as equal.

closes #5183
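
For illustration, a minimal sketch of the intended difference from np.array_equal (the import path follows this PR's diff; the merged behavior is what the tests below pin down):

import numpy as np
from pandas.core.common import array_equivalent  # added by this PR

nan = np.nan

# np.array_equal treats nan as unequal to itself:
np.array_equal(np.array([1.0, nan]), np.array([1.0, nan]))    # False

# array_equivalent treats NaNs in corresponding locations as equal:
array_equivalent(np.array([1.0, nan]), np.array([1.0, nan]))  # True

# ...and it also handles object arrays:
array_equivalent(np.array([1.0, nan], dtype=object),
                 np.array([1.0, nan], dtype=object))          # True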

@jreback (Contributor) commented Oct 20, 2013

please run a perf check on this (test_perf.sh)

these comparisons are used everywhere

do you need the shape check?
the null check might kill perf on this
why aren't you doing == and != ?

@jtratner (Contributor) commented Oct 20, 2013

@jreback - seems like it doesn't work for this example, but we could be missing something

left = pd.Float64Index([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, nan], dtype='object')
right = pd.Float64Index([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, nan], dtype='object')

# OR 
left = np.array([1.0, 2.0, nan], dtype=object)
right = np.array([1.0, 2.0, nan], dtype=object)

(fully enumerated here - https://gist.github.com/unutbu/7070565)

@jtratner (Contributor) commented Oct 20, 2013

to be explicit:

left = np.array([1.0, 2.0, nan], dtype=object)
right = np.array([1.0, 2.0, nan], dtype=object)

left != right
Out[16]: array([False, False, False], dtype=bool)

left != left
Out[17]: array([False, False, False], dtype=bool)

right != right
Out[18]: array([False, False, False], dtype=bool)

nan != nan
Out[19]: True

@jtratner (Contributor) commented Oct 20, 2013

Though I guess they compare true with ==, so not a real issue - we were going back and forth on another PR because sometimes nan handling can be confusing :P

@jreback (Contributor) commented Oct 20, 2013

you have to astype to float before you can do the comparison (not sure exactly why); it only works if they are all float values (so you need to do it in a try/except)

@unutbu (Contributor, Author) commented Oct 20, 2013

@jreback: I'm working on installing vbench and figuring out how to run test_perf.sh...

@unutbu (Contributor, Author) commented Oct 20, 2013

@jreback: When I run

time ./test_perf.sh -b array-equivalent -t array-equivalent^ 

I get

sqlalchemy.exc.IntegrityError: (IntegrityError) column checksum is not unique u'INSERT INTO benchmarks (checksum, name, description) VALUES (?, ?, ?)' ('ea1993ef61c3cc4e871d2cce3c5d983c', 'eval_frame_chained_cmp_python', None)

I see I can limit test_perf.sh to one test, such as

time ./test_perf.sh -b array-equivalent -t array-equivalent^ -r reindex

which yielded

    Invoked with :
    --ncalls: 3
    --repeats: 3


    -------------------------------------------------------------------------------
    Test name                                    | head[ms] | base[ms] |  ratio   |
    -------------------------------------------------------------------------------
    reindex_frame_level_align                    |   2.6046 |  10.1856 |   0.2557 |
    dataframe_reindex                            |   0.4900 |   0.6377 |   0.7684 |
    frame_reindex_axis0                          | 110.6919 | 126.7160 |   0.8735 |
    frame_reindex_columns                        |   0.4164 |   0.4683 |   0.8890 |
    frame_reindex_both_axes_ix                   |  43.5000 |  46.9437 |   0.9266 |
    reindex_frame_level_reindex                  |   2.3306 |   2.3570 |   0.9888 |
    frame_reindex_upcast                         |  16.1486 |  16.2884 |   0.9914 |
    reindex_fillna_pad_float32                   |   0.5860 |   0.5894 |   0.9942 |
    reindex_fillna_backfill_float32              |   0.5997 |   0.6014 |   0.9972 |
    frame_reindex_both_axes                      |  46.7057 |  46.7397 |   0.9993 |
    reindex_daterange_pad                        |   2.9510 |   2.9523 |   0.9995 |
    reindex_fillna_backfill                      |   1.0234 |   1.0213 |   1.0020 |
    reindex_fillna_pad                           |   0.8663 |   0.8514 |   1.0175 |
    reindex_multiindex                           |   1.5457 |   1.5034 |   1.0281 |
    frame_reindex_axis1                          | 558.3910 | 510.9200 |   1.0929 |
    reindex_daterange_backfill                   |   3.4040 |   2.9933 |   1.1372 |
    -------------------------------------------------------------------------------
    Test name                                    | head[ms] | base[ms] |  ratio   |
    -------------------------------------------------------------------------------

    Ratio < 1.0 means the target commit is faster then the baseline.
    Seed used: 1234

    Target [5c6116c] : Merge pull request #5281 from cancan101/index_meta_data_doc

    DOC: Added versionadded for "Setting index metadata"
    Base   [8c8ef7d] : ENH: Add array_equivalent, to address the handling of NaNs when comparing arrays for equality.

    Added NDFrame.equals

    Index, Float64Index, and MultiIndex's equal method now uses array_equivalent
    instead of np.array_equal.

Clearly I don't know what I'm doing. What is the right test_perf.sh command?
I see there are other choices for -r in pandas/vb_suite. But which is the right/relevant one(s)?

@jreback (Contributor) commented Oct 20, 2013

-b should be the commit before the first of yours, and -t should be the last commit of yours

generally I rebase to master before this

@unutbu (Contributor, Author) commented Oct 20, 2013

With array-equivalent rebased to master,

time ./test_perf.sh -b master -t array-equivalent 

yields vb_suite.log

@jreback (Contributor) commented Oct 20, 2013

concat_series_axis1                          | 204.8774 |  83.7650 |   2.4459 |
reindex_frame_level_align                    |   8.9770 |   1.2484 |   7.1910 |

so look at these in master and in your PR using %prun...and see if you can figure out what's up...

@jreback reviewed pandas/core/common.py (outdated):
        null_right = np.isnan(right)
    except TypeError:
        return np.array_equal(left, right)
    else:

@jreback (Contributor) commented Oct 21, 2013

I think you can just coerce to float (if it fails, your fallback is fine, though that itself takes some time; it might be better just to check the index type first). You don't need the isnull/isnan checking at all, just do (left != left) & (right != right)

@unutbu (Contributor, Author) commented Oct 21, 2013

@jreback I tried

def array_equivalent(left, right):
    left, right = np.asarray(left), np.asarray(right)
    try:
        left = left.astype(float)
        right = right.astype(float)        
    except (ValueError, TypeError):
        return np.array_equal(left, right)
    else:
        return (left.shape == right.shape
                and ((left == right) | (left != left) & (right != right)).all())

time ./test_perf.sh -b master -t coerce-to-float yields (using Python2.7, Numpy 1.7)

series_align_irregular_string                |  97.3604 |  68.7210 |   1.4167 |
series_align_left_monotonic                  |  32.0517 |  22.5259 |   1.4229 |
concat_series_axis1                          | 430.0770 |  82.5344 |   5.2109 |
reindex_frame_level_align                    |  23.5590 |   1.2616 |  18.6734 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

Also, coercing to float drops the imaginary part of complex arrays:

>>> np.array([nan, 1+1j], dtype='complex').astype(float)
array([ nan,   1.])

So np.isnan will (I think) handle more dtypes than (x != x), and has comparable, maybe even favorable speed, when applied to float arrays:

In [6]: x = np.array([1, 2, nan])

In [7]: %timeit x != x
1000000 loops, best of 3: 1.23 µs per loop

In [5]: %timeit np.isnan(x)
1000000 loops, best of 3: 1.1 µs per loop

@jreback (Contributor) commented Oct 21, 2013

hah....so my suggestion made it worse!

I think you need to detect whether you need to do this in the first place (maybe by only checking on Index/Float64Index types, as Int64Index cannot hold nan)....so you avoid the try/except overhead

@unutbu (Contributor, Author) commented Oct 21, 2013

With the current commit, test_perf.sh yields

groupby_simple_compress_timing               |  54.9030 |  47.4270 |   1.1576 |
frame_iloc_dups                              |   0.3117 |   0.2663 |   1.1704 |
index_int64_intersection                     |  41.4550 |  33.6334 |   1.2326 |
groupby_series_simple_cython                 |   7.6556 |   5.9413 |   1.2885 |
series_align_left_monotonic                  |  30.3703 |  22.4893 |   1.3504 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

I'm going to try adding a check for Int64Index arrays next...

@jreback (Contributor) commented Oct 21, 2013

also try doing test_perf again....these could be 'random'.....(e.g. if they are not similar on subsequent runs, then it's just an artifact of the data)....you can also try with a bigger n (--ncalls)

@jreback (Contributor) commented Oct 21, 2013

@unutbu have a look at #5219; I believe your replacement will simply be called by that, yes?

@jtratner (Contributor) commented Oct 21, 2013

question here - why do you need to cast it to float first? I thought it worked with just ==? I'm sure I'm missing something but just wanted to make sure we had an example that fails using ==. (or maybe it's just float dtype that fails)

@jreback (Contributor) commented Oct 21, 2013

I think object dtype that has floats in it (in other words, Float64Index) fails; not sure why though

@unutbu (Contributor, Author) commented Oct 21, 2013

@jreback Regarding #5219, yes, I am striving to make array_equivalent a drop-in replacement for np.array_equal. It should behave exactly like np.array_equal except that NaNs in corresponding locations should be treated as equal.

The tests in test_array_equivalent in tests/test_common.py show the behavior I'm currently testing for.

@hayd (Contributor) commented Oct 21, 2013

Perhaps related: a weird bug in numpy's assert_array_equal I came across in a test a while ago (that I can't repro outside of the test suite): #4458.

(@unutbu: you're doing pandas pull requests now, awesome!)

@unutbu (Contributor, Author) commented Oct 21, 2013

@jtratner: I did try coercing to float (#5283 (diff)), but found there were problems. (See the link for more details.)

(Fixed incorrect link.)

Currently, array_equivalent uses np.isnan instead of pd.isnull because it is faster, but since it raises TypeError or NotImplementedError (Python2.6 or 3.2) on object arrays (unlike pd.isnull), I'm using np.array_equal as a fallback.
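
A quick demonstration of that dtype sensitivity (np.isnan works on float arrays but raises on object arrays):

import numpy as np

np.isnan(np.array([1.0, np.nan]))                # array([False,  True])
np.isnan(np.array([1.0, np.nan], dtype=object))  # raises TypeError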

@jtratner (Contributor) commented Oct 21, 2013

Again, can we take a quick step back here: what's an example where it doesn't work to compare ndarrays with == (let's assume that array_equivalent always gets ndarrays for now)? So you don't have to deal with Index subclasses - you will always get actual ndarrays.

if you pass an array of floats with dtype object and some are nan, it compares incorrectly with ==, right?

@jtratner (Contributor) commented Oct 21, 2013

So if you're thinking of Float64Index - just do '.view(ndarray)' so you're not dealing with anything at the pandas level.

Once we get it to work for ndarray, then we can consider what to do for NDFrame and friends. (trivial to view Index as ndarray for now)

@unutbu (Contributor, Author) commented Oct 21, 2013

@jtratner: I don't quite understand. What is the "it" in the phrase "where it doesn't work..."?

Currently the test

assert array_equivalent(np.array([nan, None], dtype='object'),
                        np.array([nan, None], dtype='object'))

passes.

@jtratner (Contributor) commented Oct 21, 2013

Finally have a computer - just needed to look at something for myself. I mean, where (a == b) | ((a != a) & (b != b)) doesn't work, since that's what I'd expect to work everywhere with a check for matching dtypes.

@jtratner (Contributor) commented Oct 21, 2013

I just used this:

def array_equiv(n1, n2):
    return n1.shape == n2.shape and n1.dtype == n2.dtype and ((n1 == n2) | ((n1 != n1) & (n2 != n2))).all()

And it worked for all of these - am I missing why this is complicated? Is there a numpy version issue?

import numpy as np
nan = np.nan
for func in [
             lambda : np.array([0.1, 0.2, np.nan, 0.3], dtype=object),
             lambda : np.array([0.1, 0.2, np.nan, 0.3, np.nan], dtype=float),
             lambda : np.array([None, None, np.nan, None], dtype=object),
             lambda : np.array([], dtype=object)]:
    assert array_equiv(func(), func())

Then callers should be responsible for checking anything at pandas-level.

@unutbu (Contributor, Author) commented Oct 21, 2013

How about:

import numpy as np
import pandas as pd
import pandas.core.common as com

def array_equiv(n1, n2):
    return n1.shape == n2.shape and n1.dtype == n2.dtype and ((n1 == n2) | ((n1 != n1) & (n2 != n2))).all()

index = np.random.random(10)
df1 = pd.DataFrame(np.random.random(10,), index=index, columns=['floats'])
df1['dates'] = pd.date_range('2000-1-1', periods=10, freq='T')
df1.ix[::2] = np.nan

print(array_equiv(df1.values, df1.values))
# False

However, my array_equivalent does not handle object arrays correctly either. To work around the above problem, I had to add code to NDFrame.equals to test each column separately.
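
A hypothetical sketch of that column-by-column workaround (frame_equals is an illustrative name, not this PR's actual code, which compares the underlying blocks; the import path follows the PR's diff):

from pandas.core.common import array_equivalent

def frame_equals(df1, df2):
    # compare column by column so each comparison sees a homogeneous
    # dtype instead of one mixed-dtype object array from .values
    if df1.shape != df2.shape or not df1.columns.equals(df2.columns):
        return False
    return all(array_equivalent(df1[c].values, df2[c].values)
               for c in df1.columns)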

@jtratner (Contributor) commented Oct 21, 2013

okay, thanks - just wanted to make sure we had something that explicitly didn't work for the simpler version.

@jreback (Contributor) commented Oct 21, 2013

actually...why don't we do both...

use the simpler version...if it's True, then we are done (as we don't have false positives); however, a False can fall back to the slower version
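
A sketch of that hybrid (names hypothetical; np.array_equal stands in for the slower, more careful fallback):

import numpy as np

def _fast_equiv(n1, n2):
    # jtratner's simple version: exact shape/dtype match plus
    # elementwise equality with nan-in-both-positions treated as equal
    return (n1.shape == n2.shape and n1.dtype == n2.dtype and
            ((n1 == n2) | ((n1 != n1) & (n2 != n2))).all())

def array_equivalent(n1, n2):
    if _fast_equiv(n1, n2):
        return True                # no false positives, so True is final
    return np.array_equal(n1, n2)  # stand-in for the slower fallback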

@jtratner reviewed pandas/core/common.py (outdated):
>>> array_equivalent([1, nan, 2], [1, 2, nan])
False
"""
if isinstance(left, pd.Int64Index):

@jtratner (Contributor) commented Oct 21, 2013

you can change this to something like if not issubclass(left.dtype.type, (np.object_, np.floating)): return np.array_equal(left, right), right? Given that only object and floating can hold nan?
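
Spelled out, that early-out might look like this (a sketch along the lines of jtratner's suggestion; the merged code may differ):

import numpy as np

def array_equivalent(left, right):
    left, right = np.asarray(left), np.asarray(right)
    if left.shape != right.shape:
        return False
    # only floating and object dtypes can hold nan, so any other dtype
    # can be compared exactly, skipping the nan-masking machinery
    if not issubclass(left.dtype.type, (np.object_, np.floating)):
        return np.array_equal(left, right)
    return bool(((left == right) | ((left != left) & (right != right))).all())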

@jreback (Contributor) commented Jan 21, 2014

can you put in some tests for datetime/timedeltas (including with NaT)? and bools too

you might need to change the comparisons to something like this:

def equals(self, other):
    if self.dtype != other.dtype or self.shape != other.shape:
        return False
    return np.array_equal(self._try_operate(self.values),
                          self._try_operate(other.values))

_try_operate essentially does .view('i8') as needed (this way you won't have to change anything else)

(it might work w/o this ...not sure exactly what np.array_equal does)
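
To illustrate the idea (NaT-vs-NaT comparison semantics have varied across numpy versions, so the i8 view makes the test deterministic):

import numpy as np

left = np.array(['2014-01-01', 'NaT'], dtype='datetime64[ns]')
right = left.copy()

# viewing as int64 maps NaT to a sentinel integer, which compares equal:
np.array_equal(left.view('i8'), right.view('i8'))  # True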

@unutbu (Contributor, Author) commented Jan 21, 2014

While writing a test for timedeltas and bools, I've come upon an interesting problem:

Suppose df1 and df2 are defined this way:

import numpy as np
import pandas as pd
index = np.random.random(10)
df1 = pd.DataFrame(np.random.random(10,), index=index, columns=['floats'])
df1['text'] = 'the sky is so blue. we could use more chocolate.'.split()
df1['start'] = pd.date_range('2000-1-1', periods=10, freq='T')
df1['end'] = pd.date_range('2000-1-1', periods=10, freq='D')
df1['diff'] = df1['end'] - df1['start']
df1['bool'] = (np.arange(10) % 3 == 0)
df1.ix[::2] = np.nan
df2 = df1.copy()

Then the underlying blocks look like this:

In [2]: df1._data.blocks
Out[2]: 
[DatetimeBlock: [start, end], 2 x 10, dtype: datetime64[ns],
 FloatBlock: [floats], 1 x 10, dtype: float64,
 ObjectBlock: [text], 1 x 10, dtype: object,
 TimeDeltaBlock: [diff], 1 x 10, dtype: timedelta64[ns],
 FloatBlock: [bool], 1 x 10, dtype: float64]

In [3]: df2._data.blocks
Out[3]: 
[DatetimeBlock: [start, end], 2 x 10, dtype: datetime64[ns],
 FloatBlock: [floats, bool], 2 x 10, dtype: float64,
 ObjectBlock: [text], 1 x 10, dtype: object,
 TimeDeltaBlock: [diff], 1 x 10, dtype: timedelta64[ns]]

df1 has two FloatBlocks while df2 has one FloatBlock.

Is there a way to massage the BlockManager into a canonical form? (or put more generally, how would you go about comparing these two BlockManagers for equality?)

@jreback (Contributor) commented Jan 21, 2014

before comparing, bm.consolidate_inplace() (it will combine the blocks); this is a normal operation and is somewhat 'lazy', e.g. only done when needed. You will see this called a lot; do it inside BlockManager.equals first thing (or after you compare shapes, but before iterating over the blocks)

blocks are created in various operations (e.g. insertion, changing a block dtype, etc.)...the consolidation merges them (if it can)

@jreback (Contributor) commented Jan 21, 2014

another slight complication: block order is not guaranteed, in that you could have [IntBlock, FloatBlock] in one and [FloatBlock, IntBlock] in another and they could be equal

so you should probably sort them in some kind of order before you iterate (actually there are many ways to handle this).
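
Putting those two comments together, BlockManager.equals might look roughly like this (a sketch using 0.13-era internals names; not the final merged code):

def equals(self, other):
    if self.shape != other.shape:
        return False
    # consolidation is lazy, so force it before inspecting the blocks
    self._consolidate_inplace()
    other._consolidate_inplace()
    if len(self.blocks) != len(other.blocks):
        return False
    # block order is not guaranteed, so compare in a canonical dtype order
    key = lambda b: b.dtype.name
    return all(b1.equals(b2) for b1, b2 in
               zip(sorted(self.blocks, key=key),
                   sorted(other.blocks, key=key)))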

@jreback closed this Jan 21, 2014

@jreback reopened this Jan 21, 2014

@unutbu (Contributor, Author) commented Jan 22, 2014

In internals.py,

def _consolidate(blocks, items):
    # sort by _can_consolidate, dtype
    gkey = lambda x: x._consolidate_key
    grouper = itertools.groupby(sorted(blocks, key=gkey), gkey)

causes the blocks to be sorted by _consolidate_key (which includes the dtype). I think this will enforce the same dtype order of the blocks when comparing blockmanagers that should be equal.

However, it is also possible that the blockmanagers might have multiple blocks of the same dtype but in different orders: [IntBlock1, IntBlock2] versus [IntBlock2, IntBlock1].

Do you know if the call to _consolidate_inplace() will cause the merged blocks to always appear in the same order?

@jreback (Contributor) commented Jan 22, 2014

Yes, the blocks CAN be in different orders; but since there are only a small number of block types, you could either order by the block types in a specific way (probably easiest), or iterate over one and find in the other

separately, you might be able to guarantee that consolidate_inplace puts them in the same order (e.g. it would insert into a specific order rather than always appending at the end); I think this would be pretty straightforward to do

@unutbu (Contributor, Author) commented Jan 22, 2014

I think I need some help. I've been trying to create a test where the current code fails, but haven't been able to find one.

I'm pushing my test_internals.py to help clarify the case I'm worried about.
But it still passes the test because the index is unique and so in _merge_blocks

        # unique, can reindex
        if items.is_unique:
            return new_block.reindex_items_from(items)

makes the returned value the same for both blockmanagers because items is the same.

I wonder if there might be a problem if items is not unique, but I haven't been able to create such an example.

Can you help me find an example which breaks the current code?

@jreback (Contributor) commented Jan 22, 2014

here's a non-unique example; essentially the placement is a set index to locations (as opposed to the unique case, where .ref_locs computes the indexer); here it is 'set' (by the calling function). You need this for the non-unique case to map the items in a block to the ref_items, as they both could be non-unique (even across blocks).

This may not answer your question about the unique case, which I am thinking, because of the reindex, actually DOES guarantee orderings (certainly on the items), but maybe on the blocks (as I said, I cannot prove that it does not work)

from pandas.core.internals import make_block, BlockManager
import numpy as np
from pandas import Index

index = Index(list('aaabbb'))
block1 = make_block(np.arange(12).reshape(3,4), list('aaa'), index, placement=[0,1,2])
block2 = make_block(np.arange(12).reshape(3,4)*10, list('bbb'), index, placement=[3,4,5])
block1.ref_items = block2.ref_items = index
bm1 = BlockManager([block1, block2], [index, np.arange(block1.shape[1])])
bm2 = BlockManager([block2, block1], [index, np.arange(block1.shape[1])])

print "before consolidation"
print bm1
print bm1.blocks[0]._ref_locs
print bm2.blocks[0]._ref_locs
print bm2
print bm1.blocks[0]._ref_locs
print bm2.blocks[0]._ref_locs

bm1._consolidate_inplace()
bm2._consolidate_inplace()

print "\nafter consolidation"
print bm1
print bm1.blocks[0]._ref_locs
print bm2
print bm2.blocks[0]._ref_locs

output

before consolidation
BlockManager
Items: Index([u'a', u'a', u'a', u'b', u'b', u'b'], dtype='object')
Axis 1: Int64Index([0, 1, 2, 3], dtype='int64')
IntBlock: [a, a, a], 3 x 4, dtype: int64
IntBlock: [b, b, b], 3 x 4, dtype: int64
[0 1 2]
[3 4 5]
BlockManager
Items: Index([u'a', u'a', u'a', u'b', u'b', u'b'], dtype='object')
Axis 1: Int64Index([0, 1, 2, 3], dtype='int64')
IntBlock: [b, b, b], 3 x 4, dtype: int64
IntBlock: [a, a, a], 3 x 4, dtype: int64
[0 1 2]
[3 4 5]

after consolidation
BlockManager
Items: Index([u'a', u'a', u'a', u'b', u'b', u'b'], dtype='object')
Axis 1: Int64Index([0, 1, 2, 3], dtype='int64')
IntBlock: [a, a, a, b, b, b], 6 x 4, dtype: int64
[0 1 2 3 4 5]
BlockManager
Items: Index([u'a', u'a', u'a', u'b', u'b', u'b'], dtype='object')
Axis 1: Int64Index([0, 1, 2, 3], dtype='int64')
IntBlock: [b, b, b, a, a, a], 6 x 4, dtype: int64
[3 4 5 0 1 2]
@@ -4004,6 +4024,9 @@ def _merge_blocks(blocks, items, dtype=None, _can_consolidate=True):
        raise AssertionError("_merge_blocks are invalid!")
    dtype = blocks[0].dtype

    if not items.is_unique:

@unutbu (Contributor, Author) commented Jan 23, 2014

The example you gave did indeed break the code. I've added your example to test_internals.py and am handling this case by sorting the blocks according to their ref_locs.
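
For reference, a sketch of that sort (canonicalize and blocks_equal are hypothetical helpers: dtype first, then ref_locs, so equal managers visit their blocks in the same order):

def canonicalize(block):
    # deterministic sort key: dtype name, then the block's item placement
    return (block.dtype.name, list(block.ref_locs))

def blocks_equal(bm1, bm2):
    left = sorted(bm1.blocks, key=canonicalize)
    right = sorted(bm2.blocks, key=canonicalize)
    return len(left) == len(right) and all(
        b1.equals(b2) for b1, b2 in zip(left, right))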

@jreback (Contributor) commented Jan 23, 2014

ok, looks good. can you do a quick perf test (just comment if it's not ok). I would add a small mention in the main docs (and put in a link from v0.13.1.txt), maybe in a sub-section near any/all/bool (IIRC in basics.rst). Also please add a one-liner in the release notes.

@y-p @jtratner ??

@unutbu (Contributor, Author) commented Jan 23, 2014

Problem:

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_apply_np_mean                          |   3.4624 |   1.9017 |   1.8207 |
frame_apply_lambda_mean                      |   3.3963 |   1.2426 |   2.7331 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

I re-ran these benchmarks and found the ratio is consistently large.

@jreback (Contributor) commented Jan 23, 2014

are you rebased on master? I just added these

@unutbu (Contributor, Author) commented Jan 23, 2014

Oops, thanks for the reminder. Now, much better:

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
frame_apply_np_mean                          |   3.1480 |   3.2330 |   0.9737 |
frame_apply_lambda_mean                      |   3.1850 |   3.1913 |   0.9980 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

@jreback (Contributor) commented Jan 23, 2014

yep...that looks fine

@jreback (Contributor) commented Jan 24, 2014

this looks ok to me.....@y-p ? @jorisvandenbossche

@unutbu rebase maybe just to fix the release notes if you have a chance

@ghost commented Jan 24, 2014

Can't review, up to you.

@unutbu (Contributor, Author) commented Jan 24, 2014

I think the Travis test failed for a reason unrelated to my commits. Is there a way to restart Travis on the same build, or should I push an innocuous change to try it again?

@jreback (Contributor) commented Jan 24, 2014

there is a little button on the rhs of the screen where you can restart an individual job

or you can always

git commit -C HEAD --amend and then force-push (resets the commit hash and forces a rebuild)

API: Add equals method to NDFrames. (Implemented with `array_equivalent`, which is similar to `np.array_equal` except that it handles object arrays and treats NaNs in corresponding locations as equal.)

TST: Add tests for NDFrame.equals and BlockManager.equals

DOC: Mention the equals method in basics, release and v.0.13.1
jreback added a commit that referenced this pull request Jan 24, 2014
Merge pull request #5283 from unutbu/array-equivalent
API: Add equals method to NDFrames.

@jreback merged commit 929fd1c into pandas-dev:master Jan 24, 2014

1 check passed: The Travis CI build passed
@@ -215,6 +215,14 @@ These operations produce a pandas object the same type as the left-hand-side input
that is of dtype ``bool``. These ``boolean`` objects can be used in indexing operations,
see :ref:`here<indexing.boolean>`

As of v0.13.1, Series, DataFrames and Panels have an equals method to compare if

@jreback (Contributor) commented Jan 24, 2014

I merged this, thanks! maybe as a small follow-up....can you explain in the docs why one would need to do this? maybe a small example is in order?

@unutbu (Contributor, Author) commented Jan 24, 2014

@jreback: Sure; I tried pushing to here, but since that did not work, I've opened PR #6072.

@jreback (Contributor) commented Jan 24, 2014

yep, already merged

one thing on the doc update

can you put a link from v0.13.1 back to your new section

thanks
