feat: add support for custom picklers #2682

Merged
merged 4 commits into from
Sep 1, 2023

agoose77
Collaborator

@agoose77 agoose77 commented Aug 30, 2023

This PR adds support for third-party libraries to customise the pickling (__reduce_ex__) implementation. The intention is to provide a mechanism for dask-awkward to intercept serialisation in a form-preserving manner; see dask-contrib/dask-awkward#344.

The problem in dask-contrib/dask-awkward#344 is that, whilst regular users of Awkward Array are unlikely to be concerned about the form of an array, Dask requires guarantees that the form does not change during transfer between workers. As such, we need an environment-sensitive serialisation mechanism.

The simplest and most decoupled way of doing this is to let dask-awkward handle the serialisation, rather than making Awkward itself aware of dask-awkward.

With this PR it is possible to define an awkward.pickle.reduce entry point that exposes a __reduce_ex__ implementation, e.g. in pyproject.toml with hatchling:

[project.entry-points."awkward.pickle.reduce"]
dask-awkward-pickler = "dask_awkward.pickle:plugin"

where

import awkward as ak


def plugin(obj, protocol: int):
    if isinstance(obj, (ak.Array, ak.Record)):
        # Build and return the __reduce_ex__ result for this object here
        ...
    else:
        # Allow future extensions of this mechanism
        return NotImplemented
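
To illustrate how a host library could consume such an entry point, here is a minimal, self-contained sketch. It is not the code added by this PR: `load_plugins`, `custom_reduce`, `Box`, and `box_plugin` are hypothetical names, and `Box` is only a toy stand-in for `ak.Array`. The only detail taken from the description above is the `awkward.pickle.reduce` group name and the convention that a plugin returns `NotImplemented` for objects it does not handle.

```python
from importlib.metadata import entry_points

GROUP = "awkward.pickle.reduce"  # group name from the PR description


def load_plugins(group=GROUP):
    """Load every callable registered under the given entry-point group."""
    eps = entry_points()
    # Newer Pythons expose .select(); older ones return a plain dict of lists.
    if hasattr(eps, "select"):
        selected = eps.select(group=group)
    else:
        selected = eps.get(group, [])
    return [ep.load() for ep in selected]


def custom_reduce(obj, protocol, plugins):
    """Return the first plugin result that is not NotImplemented."""
    for plugin in plugins:
        result = plugin(obj, protocol)
        if result is not NotImplemented:
            return result
    return NotImplemented


class Box:
    """Toy stand-in for ak.Array (hypothetical, for illustration only)."""

    plugins = []  # a real host library might fill this via load_plugins()

    def __init__(self, data):
        self.data = data

    def __reduce_ex__(self, protocol):
        result = custom_reduce(self, protocol, self.plugins)
        if result is not NotImplemented:
            return result
        # No plugin claimed the object: fall back to default pickling.
        return super().__reduce_ex__(protocol)


def box_plugin(obj, protocol):
    """Example plugin: rebuild a Box from a copy of its data."""
    if isinstance(obj, Box):
        return (Box, (list(obj.data),))
    return NotImplemented
```

With `Box.plugins = [box_plugin]`, calling `Box([3, 1, 2]).__reduce_ex__(4)` returns the plugin's `(Box, (data,))` tuple; with an empty plugin list the default `object.__reduce_ex__` result is used instead. Returning `NotImplemented` rather than raising keeps the fallback path cheap and leaves room for plugins that only handle some object types.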

@codecov

codecov bot commented Aug 30, 2023

Codecov Report

Merging #2682 (035c891) into main (59d4235) will increase coverage by 0.00%.
The diff coverage is 85.41%.

Additional details and impacted files
Files Changed               Coverage Δ
src/awkward/_pickle.py      85.36% <85.36%> (ø)
src/awkward/highlevel.py    76.38% <85.71%> (+0.09%) ⬆️

@agoose77 agoose77 temporarily deployed to docs-preview August 30, 2023 17:46 — with GitHub Actions Inactive
@agoose77 agoose77 marked this pull request as ready for review August 31, 2023 07:08
@agoose77 agoose77 temporarily deployed to docs-preview August 31, 2023 07:24 — with GitHub Actions Inactive
Member

@jpivarski jpivarski left a comment

This is good, letting dask-awkward have the responsibility of implementing the serialization. Awkward can't know about all of its dependent projects (though dask-awkward is certainly one that we do know about!), but dependent projects by definition all know about Awkward.

So yes, I think this is the right way to go.

If we eventually do have a Form-preserving packing algorithm, we can provide that centrally in Awkward so that dependent projects that do override serialization can choose that as an option. It would be a performance improvement, but not a correctness one, so not the highest priority.

@jpivarski
Member

Oh, and this is ready to merge!

@agoose77 agoose77 merged commit 519bba6 into main Sep 1, 2023
34 checks passed
@agoose77 agoose77 deleted the agoose77/feat-custom-pickle branch September 1, 2023 20:13