Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(map_partitions): allow for dask collections to be passed as kwargs or as part of higher level structures #294

Merged
merged 21 commits into from
Jun 22, 2023

Conversation

lgray
Copy link
Collaborator

@lgray lgray commented Jun 21, 2023

This will allow more expressive signatures as inputs to map_partitions.

Resulting code is pleasantly clean.

@lgray
Copy link
Collaborator Author

lgray commented Jun 21, 2023

OK there we go - got all the original tests passing.

@codecov-commenter
Copy link

codecov-commenter commented Jun 21, 2023

Codecov Report

Merging #294 (98b7d2f) into main (b35dbe7) will decrease coverage by 0.14%.
The diff coverage is 97.82%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

@@            Coverage Diff             @@
##             main     #294      +/-   ##
==========================================
- Coverage   91.69%   91.56%   -0.14%     
==========================================
  Files          21       21              
  Lines        2433     2466      +33     
==========================================
+ Hits         2231     2258      +27     
- Misses        202      208       +6     
Impacted Files Coverage Δ
src/dask_awkward/lib/core.py 89.63% <97.82%> (+0.30%) ⬆️

... and 1 file with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@lgray lgray changed the title refactor: use unpack_collections to find deps instead of top-level iteration feat: use unpack_collections to find deps instead of top-level iteration Jun 22, 2023
@lgray
Copy link
Collaborator Author

lgray commented Jun 22, 2023

@douglasdavis @martindurant @agoose77 I think this is complete, would appreciate a review!

It turns out we actually need this over in coffea for correctly dealing with the triton inference server. Packing arrays into structured arts/kwargs according to triton's interface makes it much more clear to the user what to do to get a meaningful answer, since we can just forward the underlying library's interface up to the user.

Copy link
Collaborator

@douglasdavis douglasdavis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @lgray! Overall this looks pretty good to me. I left a few inline comments/questions. I think it would also be a good idea to have a couple more tests mixing args and kwargs with map_partitions

src/dask_awkward/lib/core.py Outdated Show resolved Hide resolved
src/dask_awkward/lib/core.py Outdated Show resolved Hide resolved
src/dask_awkward/lib/core.py Show resolved Hide resolved
src/dask_awkward/lib/core.py Outdated Show resolved Hide resolved
@lgray
Copy link
Collaborator Author

lgray commented Jun 22, 2023

A pure kwargs example I think is sufficient. We test pure args elsewhere in the code, and one args + kwargs example is enough to show it functions correctly.

I guess I can add one with a scalar arg that's broadcasted or something.

@lgray
Copy link
Collaborator Author

lgray commented Jun 22, 2023

@douglasdavis all comments addressed, tests extended :-)

@douglasdavis
Copy link
Collaborator

douglasdavis commented Jun 22, 2023

Yea the structured_function test is a good one!

I wrote this one for some local tinkering- may be worth adding

def my_func(aaa, bbb, *, ccc=None, ddd=None):
    return (aaa + bbb) ** ccc - ddd

a = ak.Array([[1,2,3,],[4,5],[6,7,8]])
b = ak.Array([[-10, -10, -10], [-10, -10], [-10, -10, -10]])
c = ak.Array([0, 1, 2])
d = 1

aa = dak.from_awkward(a, npartitions=2)
bb = dak.from_awkward(b, npartitions=2)
cc = dak.from_awkward(c, npartitions=2)
dd = d

res = my_func(a, b, ccc=c, ddd=d)

print(res)

resl = dak.map_partitions(my_func, aa, bb, ccc=cc, ddd=dd)

@lgray
Copy link
Collaborator Author

lgray commented Jun 22, 2023

please go ahead and add it!

@douglasdavis douglasdavis changed the title feat: use unpack_collections to find deps instead of top-level iteration feat: in map_partitions allow for dask collections to be passed as kwargs or as part of higher level structures Jun 22, 2023
@douglasdavis douglasdavis changed the title feat: in map_partitions allow for dask collections to be passed as kwargs or as part of higher level structures feat: map_partitions: allow for dask collections to be passed as kwargs or as part of higher level structures Jun 22, 2023
@lgray lgray changed the title feat: map_partitions: allow for dask collections to be passed as kwargs or as part of higher level structures feat(map_partitions): allow for dask collections to be passed as kwargs or as part of higher level structures Jun 22, 2023
@douglasdavis douglasdavis enabled auto-merge (squash) June 22, 2023 16:48
@douglasdavis
Copy link
Collaborator

looks good!

@douglasdavis douglasdavis merged commit 6d74980 into dask-contrib:main Jun 22, 2023
23 checks passed
@lgray lgray deleted the patch-9 branch June 22, 2023 19:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants