feat(map_partitions): allow for dask collections to be passed as kwargs or as part of higher level structures #294

lgray · 2023-06-21T17:24:59Z

This will allow more expressive signatures as inputs to map_partitions.

Resulting code is pleasantly clean.


lgray · 2023-06-21T17:59:58Z

OK there we go - got all the original tests passing.


codecov-commenter · 2023-06-21T23:52:49Z

Codecov Report

Merging #294 (98b7d2f) into main (b35dbe7) will decrease coverage by 0.14%.
The diff coverage is 97.82%.


@@            Coverage Diff             @@
##             main     #294      +/-   ##
==========================================
- Coverage   91.69%   91.56%   -0.14%     
==========================================
  Files          21       21              
  Lines        2433     2466      +33     
==========================================
+ Hits         2231     2258      +27     
- Misses        202      208       +6

Impacted Files	Coverage Δ
src/dask_awkward/lib/core.py	`89.63% <97.82%> (+0.30%)`	⬆️

... and 1 file with indirect coverage changes


lgray · 2023-06-22T13:01:25Z

@douglasdavis @martindurant @agoose77 I think this is complete, would appreciate a review!

It turns out we actually need this over in coffea for correctly dealing with the triton inference server. Packing arrays into structured args/kwargs according to triton's interface makes it much clearer to the user what to do to get a meaningful answer, since we can just forward the underlying library's interface up to the user.
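To make the "forward the underlying library's interface" point concrete, here is a small plain-Python sketch. Everything in it is an assumption for illustration: `fake_infer` is a made-up function that takes a dict of named inputs and is not triton's client API, and `call_per_partition` is a hypothetical helper, not something from dask-awkward or coffea.

```python
def fake_infer(model_name, inputs, *, outputs):
    """Hypothetical stand-in for an inference call that takes named inputs.

    `inputs` maps an input name to a batch of values; `outputs` lists the
    names of the results to return.
    """
    scores = [pt * eta for pt, eta in zip(inputs["pt"], inputs["eta"])]
    return {name: scores for name in outputs}


def call_per_partition(func, model_name, partitioned_inputs, *, outputs):
    """Call func once per partition, keeping the dict-of-named-inputs shape."""
    n_parts = len(next(iter(partitioned_inputs.values())))
    return [
        func(
            model_name,
            {name: parts[i] for name, parts in partitioned_inputs.items()},
            outputs=outputs,
        )
        for i in range(n_parts)
    ]


# two partitions per named input
partitioned = {
    "pt": [[1.0, 2.0], [3.0]],
    "eta": [[0.5, 0.5], [2.0]],
}
results = call_per_partition(fake_infer, "toy_model", partitioned, outputs=["score"])
print(results)  # [{'score': [0.5, 1.0]}, {'score': [6.0]}]
```

The user writes the call with the same named-input dict the wrapped function expects, which is the ergonomic benefit described above.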

douglasdavis

Thanks @lgray! Overall this looks pretty good to me. I left a few inline comments/questions. I think it would also be a good idea to have a couple more tests mixing args and kwargs with map_partitions


lgray · 2023-06-22T16:28:50Z

A pure kwargs example I think is sufficient. We test pure args elsewhere in the code, and one args + kwargs example is enough to show it functions correctly.

I guess I can add one with a scalar arg that's broadcasted or something.

lgray · 2023-06-22T16:39:48Z

@douglasdavis all comments addressed, tests extended :-)

douglasdavis · 2023-06-22T16:40:04Z

Yea the structured_function test is a good one!

I wrote this one for some local tinkering- may be worth adding

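import awkward as ak
import dask_awkward as dak


def my_func(aaa, bbb, *, ccc=None, ddd=None):
    # ccc and ddd are keyword-only; each may be an array or a plain scalar
    return (aaa + bbb) ** ccc - ddd


# eager awkward arrays
a = ak.Array([[1, 2, 3], [4, 5], [6, 7, 8]])
b = ak.Array([[-10, -10, -10], [-10, -10], [-10, -10, -10]])
c = ak.Array([0, 1, 2])
d = 1

# lazy counterparts; the scalar is passed through unchanged
aa = dak.from_awkward(a, npartitions=2)
bb = dak.from_awkward(b, npartitions=2)
cc = dak.from_awkward(c, npartitions=2)
dd = d

# eager reference result
res = my_func(a, b, ccc=c, ddd=d)

print(res)

# same call through map_partitions, with a collection passed as a keyword argument
resl = dak.map_partitions(my_func, aa, bb, ccc=cc, ddd=dd)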

lgray · 2023-06-22T16:40:44Z

please go ahead and add it!

douglasdavis · 2023-06-22T16:48:54Z

looks good!

lgray and others added 10 commits June 21, 2023 12:23

use unpack_collections to find deps instead of top-level iteration

27f4497

[pre-commit.ci] auto fixes from pre-commit.com hooks

bf529b3

for more information, see https://pre-commit.ci

missed a comma

aac9084

not sure why it reverted the numpy changes...

43b447a

adjust map_meta as well for more flexibility

e0e52ec

traverse is a keyword to map_meta now

8e3543d

[pre-commit.ci] auto fixes from pre-commit.com hooks

8f9144e

for more information, see https://pre-commit.ci

get the lza fallback too

a1c027b

[pre-commit.ci] auto fixes from pre-commit.com hooks

bae1a79

for more information, see https://pre-commit.ci

un-tuple them

82ab6e1

- we can always use index zero since we're unpacking from list/dict at the top level

lgray and others added 5 commits June 21, 2023 13:01

forward map_partitions traverse to map_meta

f6829aa

failing test; to_meta improvement

3f715e8

function with kwargs test works as the cost of most other things

39d48fe

all tests pass, but kwargs cannot be a blockwise index

c37bf8a

general solution: unroll args to per-arg deps, repack in wrapping function

7569bb5

lgray mentioned this pull request Jun 22, 2023

fix: adjustments to callable wrap to deal with typetracers in nested python structures CoffeaTeam/coffea#843

Merged

remove spurious comments

ca0107e

lgray changed the title ~~refactor: use unpack_collections to find deps instead of top-level iteration~~ feat: use unpack_collections to find deps instead of top-level iteration Jun 22, 2023

douglasdavis reviewed Jun 22, 2023


lgray and others added 3 commits June 22, 2023 10:28

uniformly apply hyphenize

98b7d2f

Co-authored-by: Doug Davis <ddavis@ddavis.io>

address comments

ebeecba

add a test that is topologically similar to a real-world use case

32d58fa

add more topologies of arg packing to tests

04949c5

one more rest case

a5e4d6b

douglasdavis changed the title ~~feat: use unpack_collections to find deps instead of top-level iteration~~ feat: in map_partitions allow for dask collections to be passed as kwargs or as part of higher level structures Jun 22, 2023

douglasdavis changed the title ~~feat: in map_partitions allow for dask collections to be passed as kwargs or as part of higher level structures~~ feat: map_partitions: allow for dask collections to be passed as kwargs or as part of higher level structures Jun 22, 2023

lgray changed the title ~~feat: map_partitions: allow for dask collections to be passed as kwargs or as part of higher level structures~~ feat(map_partitions): allow for dask collections to be passed as kwargs or as part of higher level structures Jun 22, 2023

douglasdavis enabled auto-merge (squash) June 22, 2023 16:48

douglasdavis merged commit 6d74980 into dask-contrib:main Jun 22, 2023
23 checks passed

lgray deleted the patch-9 branch June 22, 2023 19:18
