
PERF: block-wise arithmetic for frame-with-frame #32779

Merged: 63 commits, May 19, 2020

Conversation

@jbrockmendel (Member) commented Mar 17, 2020

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

@jbrockmendel (Member Author)

Did numpy recently change something that would make np.errstate(all="ignore") not suppress a DeprecationWarning?
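For context on the question above (this sketch is editorial, not part of the PR): `np.errstate` only controls numpy's floating-point error handling, while a `DeprecationWarning` is dispatched through Python's `warnings` machinery, so the two are orthogonal and `errstate` cannot suppress it.

```python
import warnings

import numpy as np

# np.errstate only governs floating-point error handling (divide,
# over/underflow, invalid); it never touches Python-level warnings
# such as DeprecationWarning, which go through the warnings module.
with np.errstate(all="ignore"):
    result = np.float64(1.0) / np.float64(0.0)  # no RuntimeWarning emitted

# Suppressing a DeprecationWarning requires the warnings machinery instead:
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    warnings.warn("deprecated", DeprecationWarning)  # silently swallowed
```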

@jreback (Contributor) left a comment

looking pretty good. do you have benchmarks?

if you can create some helper functions as indicated, that would be great.

nbs = blk._split_op_result(res_values)
res_blks.extend(nbs)
continue

Contributor

can you add some comments here (general + line comments as needed)


if not isinstance(blk_vals, np.ndarray):
# 1D EA
assert len(locs) == 1, locs
Contributor

might be clearer to extract, for example, this section (368 to 374) into a function, so the structure of what you are doing is more apparent.

@jreback jreback added Performance Memory or execution speed performance Internals Related to non-user accessible pandas implementation labels Mar 19, 2020
@jreback (Contributor) commented Mar 19, 2020

also this has a linked issue IIRC? (you just commented today on it, I think)

@jbrockmendel (Member Author)

will add comments as requested

before implementing helper functions, I'm going to wait for the where/putmask PRs (#32769,#32791, #32846) to go through, as I think that will allow us to simplify this in a more elegant way

@pep8speaks commented Mar 25, 2020

Hello @jbrockmendel! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2020-05-17 21:35:14 UTC

@jbrockmendel (Member Author)

Refactored the new function here so as to have one main execution path. This makes the debugging assertions clunkier, but what's left is much smoother than the previous implementation.

@TomAugspurger (Contributor) left a comment

Seems nice overall. Can you ensure we have ASVs for this (I think we do) and post the results?

result = masked_arith_op(left, right, op)
with warnings.catch_warnings():
# suppress warnings from numpy about element-wise comparison
warnings.simplefilter("ignore", DeprecationWarning)
Contributor

What warning are we suppressing here, and is this future-proof? If this is the usual

In [2]: np.array([1, 2]) == 'a'
/Users/taugspurger/.virtualenvs/dask-dev/bin/ipython:1: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  #!/Users/taugspurger/Envs/dask-dev/bin/python
Out[2]: False

then in the future the result will be array([False, False]). Is that what we want?

Member Author

this warning-catching was necessary to make the npdev build pass; I'm going to see if I can revert it

Member Author

looks like catching this is needed in some form, but I can move the catching so as to cast a less wide net.
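One way to cast that narrower net (an editorial sketch, not the PR's actual code: the helper name is hypothetical) is to filter only on the specific warning message, and only around the single comparison that triggers it, rather than a blanket `simplefilter` over the whole operation.

```python
import warnings

import numpy as np

def eq_with_narrow_suppression(a, b):
    # Hypothetical helper: suppress only numpy's "elementwise comparison"
    # warning, and only around this one comparison. `message` is a regex
    # matched against the warning text; category=Warning covers both the
    # DeprecationWarning and FutureWarning variants numpy has emitted.
    with warnings.catch_warnings():
        warnings.filterwarnings(
            "ignore", message="elementwise comparison", category=Warning
        )
        return a == b

res = eq_with_narrow_suppression(np.array([1, 2]), "a")
```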

pandas/core/ops/__init__.py (4 resolved review threads)
@jbrockmendel (Member Author) commented Apr 1, 2020

Just pushed with some new ASVs; about 200x speedup in the best-case scenario:

arr = np.random.randn(10 ** 6).reshape(500, 2000).astype(np.float64)
df = pd.DataFrame(arr)
df[1000] = df[1000].astype(np.float32)
df.iloc[:, 1000:] = df.iloc[:, 1000:].astype(np.float32)

df2 = pd.DataFrame(arr)
df2[1000] = df2[1000].astype(np.int64)
df2.iloc[:, 500:1500] = df2.iloc[:, 500:1500].astype(np.int64)

df._consolidate_inplace()
df2._consolidate_inplace()

In [35]: %timeit df + df2                                                                                                                                                                         
572 ms ± 55 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)  # <-- master
113 ms ± 10.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)  # <-- PR

In [38]: %timeit df + df                                                                                                                                                                                
527 ms ± 69.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)  # <-- master
2.51 ms ± 24.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)  # <-- PR

@jbrockmendel (Member Author)

gentle ping

@jorisvandenbossche (Member)

Sorry for the slow follow-up, will take a look at your latest changes tomorrow!

@jorisvandenbossche (Member)

just updated with a simpler variant of #33597 that is effectively operating column-wise in pretty-precisely those situations where doing so will avoid copies being made.

That indeed nicely avoids copies in some cases!
However, the example that I gave above (third paragraph at #32779 (comment)) is still using "take" instead of slice: so suppose a left df with "int, int, float, int" columns and a right df with "int, int, int, int" columns (eg because of a NaN being present). Such a case is still copying a part of the right block.
Now, I suppose further special cases can be added to _slice_take_blocks_ax0 to also avoid a copy here.


Personally, I still think that with something a little bit simpler (eg the "only do block-wise in case of identical block layout"), we can reduce the complexity quite a bit while at the same time preserving the performance speedup for the majority of use cases of wide frames.
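The slice-versus-take distinction discussed above boils down to numpy indexing semantics: a basic slice of a block's values returns a view into the same buffer, while fancy indexing ("take") materializes a fresh copy. A minimal illustration in plain numpy (not pandas internals):

```python
import numpy as np

values = np.arange(12).reshape(3, 4)  # stand-in for a 2D block's values

sliced = values[0:2]    # basic slice -> view, shares the block's memory
taken = values[[0, 1]]  # fancy indexing ("take") -> freshly allocated copy

sliced[0, 0] = 99   # mutation is visible through `values`
taken[0, 1] = -1    # leaves `values` untouched
```

This is why `_slice_take_blocks_ax0` needs special cases to stay zero-copy: only when the requested locations form a contiguous slice can it hand back views.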

operator.gt,
operator.ge,
operator.lt,
operator.le,
Member

Is it needed to test it with all those different ops?
I think what we are interested in catching with the benchmark is the general code handling wide dataframes (how the dispatching to the actual op is done, dispatching to block/column instead of series, etc.), not the actual op, right? For those aspects they all use the same code, so testing all ops seems like overkill (it just adds to the number of benchmarks needlessly, making it harder to run the benchmark suite and interpret the results).

Member Author

Is it needed to test it with all those different ops?

No strong opinion here.

making it harder to run the benchmark suite

Yah, this is a hassle. Best guess is eventually we're going to land on a "throw hardware at the problem" solution, but that doesn't help the local-running troubles.

Member

No strong opinion here.

Then let's keep only 2 of them or so, I would say

Member Author

how about four: one comparison, one logical, floordiv (for which we have special handling), and one other arithmetic
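A sketch of what that trimmed parameterization could look like, in ASV's `params`/`setup`/`time_*` style (the class and method names below are illustrative, not the PR's actual benchmark code):

```python
import operator

import numpy as np
import pandas as pd

class FrameWithFrameOps:
    # Hypothetical ASV-style benchmark: one comparison, one logical op,
    # floordiv (special-cased in pandas), and one other arithmetic op.
    params = [operator.add, operator.floordiv, operator.gt, operator.and_]
    param_names = ["op"]

    def setup(self, op):
        df = pd.DataFrame(np.random.randn(10, 1000))
        # logical (bitwise) ops need boolean data, not floats
        self.df = df > 0 if op is operator.and_ else df

    def time_op_frame_with_frame(self, op):
        op(self.df, self.df)
```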

asv_bench/benchmarks/arithmetic.py (resolved)
pandas/core/ops/blockwise.py (resolved)
pandas/core/internals/managers.py (resolved)
warn = PerformanceWarning if box_with_array is not pd.DataFrame else None
warn = None
if box_with_array is not pd.DataFrame or tz_naive_fixture is None:
warn = PerformanceWarning
Member

Do you know why this changed? We do / do not raise a warning on the array-level?

Member Author

We don't raise a warning when operating op(arr, length_1_object_array), which turns out to be the case described here (in reverse)

@jorisvandenbossche (Member)

Expanding on the last paragraph of my comment above:

Personally, I still think that with something a little bit simpler (eg the "only do block-wise in case of identical block layout"), we can reduce the complexity quite a bit while at the same time preserving the block-wise performance speedup for the majority of use cases of wide frames (in the assumption that wide dataframes typically have uniform dtypes).

Now, if both Jeff and Tom are fine with the current PR instead of my reduced proposal, I am not going to further block it. I made my stance clear, but if the majority prefers otherwise, I go with that.

@jbrockmendel (Member Author)

@jorisvandenbossche thanks for the new round of comments. Looking into the non-slicing case you pointed out now.

@jbrockmendel (Member Author)

Updated to avoid copy in the case Joris identified.

gentle ping @jreback @TomAugspurger

@TomAugspurger (Contributor)

I haven't been able to stay up to date with this. I wouldn't wait around for me.

@jbrockmendel (Member Author)

Travis is green, just hasn't updated the icon here

@jbrockmendel (Member Author)

rebased+green

blocks = []
for i, ml in enumerate(slobj):
nb = blk.getitem_block([ml], new_mgr_locs=i)
print(nb.shape, np.values.shape)
Contributor

extra print here; can you use a list comprehension here

Member Author

updated+green
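The suggested rewrite, sketched with a stand-in for Block.getitem_block (the helper below is illustrative, not pandas' actual API) and with the stray debug print dropped:

```python
import numpy as np

def getitem_block(values, locs):
    # stand-in for Block.getitem_block: fancy-index so the result stays 2D
    return values[locs]

values = np.arange(12).reshape(3, 4)  # a block's 2D values
slobj = [2, 0, 1]                     # manager locations to pick out

# loop form from the diff, minus the debug print
blocks = []
for i, ml in enumerate(slobj):
    blocks.append(getitem_block(values, [ml]))

# equivalent list comprehension, as the reviewer suggested
blocks2 = [getitem_block(values, [ml]) for ml in slobj]
```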

elif only_slice:
# GH#33597 slice instead of take, so we get
# views instead of copies
for i, ml in zip(taker, mgr_locs):
Contributor

can you use a list comprehension

Member Author

above on 1317 I can; here it is less clean

# else:
# assert res_values.shape == lvals.shape, (res_values.shape, lvals.shape)

for nb in nbs:
Contributor

can you do this here

return new_mgr


def _reset_block_mgr_locs(nbs: List["Block"], locs):
Contributor

returns None

Contributor

though I would actually return the List['Block'] here as conceptually simpler to grok

Member Author

though I would actually return the List['Block'] here as conceptually simpler to grok

that can give the misleading impression that args are not being altered in place.
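The convention at issue, sketched with a minimal stand-in (FakeBlock is illustrative, not pandas' actual Block class): a helper that mutates its arguments in place returns None, mirroring list.sort, precisely so that callers aren't misled into thinking the inputs were left unchanged.

```python
from typing import List

class FakeBlock:
    # minimal stand-in for an internals Block, just the mgr_locs attribute
    def __init__(self, mgr_locs: int):
        self.mgr_locs = mgr_locs

def _reset_block_mgr_locs(nbs: List[FakeBlock], locs) -> None:
    # mutates each block in place; returning None signals the side effect,
    # the same convention as list.sort / random.shuffle in the stdlib
    for nb in nbs:
        nb.mgr_locs = locs[nb.mgr_locs]

locs = [10, 11, 12]
blocks = [FakeBlock(0), FakeBlock(2)]
result = _reset_block_mgr_locs(blocks, locs)
```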

@jreback jreback merged commit b9ad20a into pandas-dev:master May 19, 2020
@jreback (Contributor) commented May 19, 2020

ok thanks @jbrockmendel

let's see how this goes
