[REF-2089] Use dill instead of cloudpickle for serialization #2922

Merged
merged 6 commits into from
Mar 27, 2024

@masenf masenf commented Mar 25, 2024

  • smaller size pickles
  • support dynamically defined states [REF-2265]
  • avoid issues with unpickleable globals [REF-1192]
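
For context on the second bullet, here is a minimal stdlib-only sketch of why by-reference pickling struggles with dynamically defined classes. `make_state_class` and `DynamicState` are hypothetical names for illustration, not code from this PR. The standard `pickle` module stores a class as an import path, so a class created inside a function call cannot be looked up again.

```python
import pickle


def make_state_class():
    # Hypothetical factory: the class only exists inside this call,
    # so there is no module-level name to import it from.
    class DynamicState:
        def __init__(self):
            self.count = 0

    return DynamicState


instance = make_state_class()()

try:
    pickle.dumps(instance)
    pickled_by_reference = True
except (pickle.PicklingError, AttributeError):
    # The exact exception type varies across Python versions.
    pickled_by_reference = False

print(pickled_by_reference)
```

A serializer that stores the class definition by value does not need this lookup, which is presumably the behavior the bullet is after.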

Dynamically convert EventHandler to functools.partial

  • smaller size pickles!!!
  • serialization of the state instance no longer needs to worry about serializing partials
  • initialization of state does not need to unbox EventHandler functions up front

This gets us a step closer to EventHandler as descriptor without breaking any tests or existing functionality.

Instead of converting the functions up front and assigning them to the
instance, unbox the function from the EventHandler when it is requested via
__getattribute__. This reduces the size of the per-instance pickle, because
event handler bodies do not need to be included.
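
A rough sketch of the lazy unboxing described above, assuming a toy `EventHandler` wrapper and `State` class. Both are simplified stand-ins written for illustration, not the actual classes touched by this PR. The handler stays on the class, and `__getattribute__` binds it to the instance with `functools.partial` only when it is requested.

```python
import functools


class EventHandler:
    """Hypothetical stand-in: wraps a plain function defined for a state class."""

    def __init__(self, fn):
        self.fn = fn


class State:
    """Toy state class, not the real implementation."""

    def __init__(self):
        self.count = 0

    def __getattribute__(self, name):
        value = super().__getattribute__(name)
        if isinstance(value, EventHandler):
            # Unbox lazily: bind the raw function to this instance on access.
            return functools.partial(value.fn, self)
        return value


def _increment(self, amount=1):
    self.count += amount
    return self.count


# The handler lives on the class, so it never enters the instance __dict__.
State.increment = EventHandler(_increment)

state = State()
handler = state.increment
result = handler(2)

print(result, vars(state))
```

Because `vars(state)` only holds `count`, anything that serializes the instance dictionary has no handler or partial to deal with.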

linear bot commented Mar 25, 2024

Because pydantic can be installed without cython, only use the workaround in
the case where the BaseModel.validate function is NOT a FunctionType,
indicating it's a cython function.
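
The guard described in this commit message could look roughly like the sketch below. `PurePythonModel` and `needs_workaround` are hypothetical names, and `len` is used only as a convenient example of a compiled callable, so no pydantic or Cython install is assumed.

```python
from types import FunctionType


class PurePythonModel:
    """Hypothetical stand-in for a model class whose validate is plain Python."""

    @classmethod
    def validate(cls, value):
        return value


def needs_workaround(bound_classmethod):
    # A bound classmethod exposes the wrapped callable as __func__.
    # Anything that is not a plain Python function is treated as compiled.
    return not isinstance(bound_classmethod.__func__, FunctionType)


pure_python_result = needs_workaround(PurePythonModel.validate)

# C-implemented callables such as len are not FunctionType instances either,
# which is the property the guard relies on.
len_is_python_function = isinstance(len, FunctionType)

print(pure_python_result, len_is_python_function)
```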
@masenf masenf requested a review from martinxu9 March 26, 2024 17:25

@martinxu9 martinxu9 left a comment

LGTM

@@ -2159,6 +2187,25 @@ async def modify_state(self, token: str) -> AsyncIterator[BaseState]:
await self.set_state(token, state)


# Workaround https://github.com/cloudpipe/cloudpickle/issues/408 for dynamic pydantic classes
if not isinstance(State.validate.__func__, FunctionType):
    cython_function_or_method = type(State.validate.__func__)
Contributor

so what type is a cython function?

Collaborator Author

A Cython function is compiled to C by Cython. There's no importable name for its type, though: it masquerades as a builtin, but it isn't accessible under any builtin name.

Collaborator Author

Similar to Python's own function and method types, except those are exposed as types.FunctionType and types.MethodType respectively.
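
To illustrate the point about types without an importable name, here is a small sketch using `list_iterator` as a stand-in (it is not a Cython type, just a readily available type that reports `builtins` as its module without being a builtin name). Calling `type()` on an instance is the same technique the diff uses for `State.validate.__func__`.

```python
import builtins
import types


def sample():
    pass


# The types module defines its aliases by calling type() on an instance.
function_type_matches = type(sample) is types.FunctionType

# list_iterator reports builtins as its module but has no name there,
# so calling type() on an instance is a straightforward way to capture it.
list_iterator = type(iter([]))
reports_builtins = list_iterator.__module__ == "builtins"
importable_by_name = hasattr(builtins, "list_iterator")
is_instance = isinstance(iter([1, 2]), list_iterator)

print(function_type_matches, reports_builtins, importable_by_name, is_instance)
```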

@masenf masenf merged commit b788890 into main Mar 27, 2024
66 checks passed
@masenf masenf deleted the masenf/dill-pickle branch March 27, 2024 20:47
4 participants