feat(map_partitions): allow for dask collections to be passed as kwargs or as part of higher level structures #294
for more information, see https://pre-commit.ci
- we can always use index zero since we're unpacking from list/dict at the top level
OK there we go - got all the original tests passing.
Codecov Report
@@ Coverage Diff @@
## main #294 +/- ##
==========================================
- Coverage 91.69% 91.56% -0.14%
==========================================
Files 21 21
Lines 2433 2466 +33
==========================================
+ Hits 2231 2258 +27
- Misses 202 208 +6
... and 1 file with indirect coverage changes
@douglasdavis @martindurant @agoose77 I think this is complete and would appreciate a review! It turns out we actually need this over in coffea to deal correctly with the triton inference server. Packing arrays into structured args/kwargs according to triton's interface makes it much clearer to the user what to do to get a meaningful answer, since we can just forward the underlying library's interface up to the user.
Thanks @lgray! Overall this looks pretty good to me. I left a few inline comments/questions. I think it would also be a good idea to have a couple more tests mixing args and kwargs with map_partitions.
Co-authored-by: Doug Davis <ddavis@ddavis.io>
I think a pure kwargs example is sufficient. We test pure args elsewhere in the code, and one args + kwargs example is enough to show it functions correctly. I guess I can add one with a scalar arg that's broadcast or something.
@douglasdavis all comments addressed, tests extended :-)
Yeah, I wrote this one for some local tinkering; it may be worth adding:

import awkward as ak
import dask_awkward as dak

def my_func(aaa, bbb, *, ccc=None, ddd=None):
    return (aaa + bbb) ** ccc - ddd

# eager awkward inputs
a = ak.Array([[1, 2, 3], [4, 5], [6, 7, 8]])
b = ak.Array([[-10, -10, -10], [-10, -10], [-10, -10, -10]])
c = ak.Array([0, 1, 2])
d = 1
# lazy dask-awkward counterparts; the scalar is passed through as-is
aa = dak.from_awkward(a, npartitions=2)
bb = dak.from_awkward(b, npartitions=2)
cc = dak.from_awkward(c, npartitions=2)
dd = d
# eager reference result
res = my_func(a, b, ccc=c, ddd=d)
print(res)
# same call through map_partitions, with a collection passed as a kwarg
resl = dak.map_partitions(my_func, aa, bb, ccc=cc, ddd=dd)
please go ahead and add it!
looks good!
This will allow more expressive signatures as inputs to map_partitions.
Resulting code is pleasantly clean.