Avoid substituting keys that occur multiple times #10646

fjetter · 2023-11-23T14:11:44Z

I haven't run any widespread tests yet. However, in the affected cases the current behavior computes a task twice and apparently also prevents some kind of numpy optimization, as reported in #10645.

The graph topology here doesn't strike me as rare or very special

So the add in this graph is basically fused into a matmul-transpose blob / subgraphcallable that accepts two identical inputs. I would argue that it is a bug in blockwise_fusion that such a blob is generated in the first place: rewriting that subgraph callable so that it accepts just a single dependency would restore this fusion.

I'm not very familiar with the blockwise fusion code; this is what I was able to find first.

With this PR the above graph optimizes to

This change detects the situation and effectively stops fusing in that case. If the blockwise fusion bug were fixed, the graph would fuse into one task again, but without computing the intermediate result twice.
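To make the double computation concrete, here is a toy sketch in plain Python. It is not dask's actual fusion code: the dict-of-tuples graph, the tiny executor, and the helper names (count_occurrences, substitute, inline_if_single_use, run) are all assumptions made up for illustration. It shows that inlining a key that is referenced twice duplicates its work, and that a simple occurrence check avoids this.

```python
from collections import Counter

calls = Counter()  # records how often the expensive task really runs


def inc(x):
    calls["inc"] += 1
    return x + 1


def add(a, b):
    return a + b


def count_occurrences(task, key):
    """Count how often `key` is referenced inside a (possibly nested) task tuple."""
    if isinstance(task, tuple):
        return sum(count_occurrences(arg, key) for arg in task)
    return 1 if task == key else 0


def substitute(task, key, replacement):
    """Replace every reference to `key` in `task` with `replacement`."""
    if isinstance(task, tuple):
        return tuple(substitute(arg, key, replacement) for arg in task)
    return replacement if task == key else task


def execute(task, graph, cache):
    """Toy executor: results are cached per key, inlined tasks are not cached."""
    if isinstance(task, tuple) and task and callable(task[0]):
        func, *args = task
        return func(*(execute(arg, graph, cache) for arg in args))
    if isinstance(task, str) and task in graph:
        if task not in cache:
            cache[task] = execute(graph[task], graph, cache)
        return cache[task]
    return task


def inline_if_single_use(graph, key, into):
    """Hypothetical guard: inline `key` into `into` only if it is referenced at most once."""
    new_graph = dict(graph)
    if count_occurrences(graph[into], key) <= 1:
        new_graph[into] = substitute(graph[into], key, graph[key])
    return new_graph


def run(graph, key):
    """Execute `key` and report (result, number of inc calls)."""
    calls.clear()
    result = execute(key, graph, {})
    return result, calls["inc"]


graph = {"x": 1, "y": (inc, "x"), "z": (add, "y", "y")}

# Unconditional substitution turns z into (add, (inc, "x"), (inc, "x")).
naive = dict(graph, z=substitute(graph["z"], "y", graph["y"]))

# The guarded version sees that "y" occurs twice in z and leaves the graph alone.
guarded = inline_if_single_use(graph, "y", "z")

print(run(graph, "z"), run(naive, "z"), run(guarded, "z"))
```

All three graphs produce the same result, but only the naive substitution runs inc twice. Skipping the substitution gives up the fusion in exchange for not repeating work, which is the trade-off described above.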

cc @rjzamora

fjetter · 2023-11-23T14:14:23Z

I'm pretty sure this is a bug that would've been caught if this all was type annotated.

fjetter · 2023-11-23T14:29:20Z

I'm running our benchmarks to see if anything pops up. We don't have an X + 1; X.T @ X test, but I mostly want to make sure no other workloads are negatively affected.

fjetter · 2023-11-23T16:21:50Z

our benchmarks look good

Illviljan · 2023-11-25T11:04:31Z

dask/tests/test_optimization.py

@@ -227,6 +227,48 @@ def test_fuse_keys():
    )


+def test_donot_substitute_same_key_multiple_times():


Suggested change

-def test_donot_substitute_same_key_multiple_times():
+def test_donot_substitute_same_key_multiple_times() -> None:

If you add -> None to new tests, mypy will be able to check them as well. I think it's good practice to just do it if you are worried about type hints.
It helps catch edge cases that aren't always obvious when only the functions themselves are type hinted.
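A small sketch of why the annotation matters: with default settings, mypy does not type-check the body of a function that has no annotations at all (unless --check-untyped-defs is enabled). The function and test names below are made up for illustration and are not part of this PR.

```python
def double(x: int) -> int:
    return 2 * x


def test_double_unannotated():
    # No annotations on this test: by default mypy skips the body,
    # so the str argument below would go unreported.
    result = double("2")
    assert result


def test_double_annotated() -> None:
    # The "-> None" makes this an annotated function, so mypy checks the body
    # and would flag a call such as double("2") here.
    assert double(2) == 4
```

Both tests are still collected and run by pytest in the same way; the return annotation only changes what the type checker looks at.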

Avoid substituting keys that occur multiple times

474c202

Illviljan reviewed Nov 25, 2023


fjetter merged commit 80520de into dask:main Dec 12, 2023
26 of 28 checks passed

fjetter deleted the fix_fuse branch December 12, 2023 15:55
