Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use inspect to deterministically tokenize local/lambda functions #767

Closed
wants to merge 1 commit into from

Conversation

rjzamora
Copy link
Member

Possible/partial solution to #765

This is for demonstration purposes - Needs further testing/discussion.

@fjetter
Copy link
Member

fjetter commented Jan 18, 2024

I was using / trying to use this before over in dask/dask#10808 but rejected the idea again. Sorry, I don't even remember why but something was missing.

One thing I remember is that inspect just fails at times

@fjetter
Copy link
Member

fjetter commented Jan 18, 2024

So far, the most promising approach was to tokenize the code object. This is also what cloudpickle is doing. This isn't possible for locally defined classes/types so it's also just a partial solution

@mrocklin
Copy link
Member

Lambdas can also contain other important scope, like closed-over variables.

@fjetter
Copy link
Member

fjetter commented Jan 18, 2024

so, effectively using something like https://github.com/cloudpipe/cloudpickle/blob/d003266b18336e1e603536bdbe6518bc2dcc00d3/cloudpickle/cloudpickle.py#L812-L920 works great for lambdas but the problem also pops up if types are defined in a local scope and they are a little harder to deal with

@fjetter
Copy link
Member

fjetter commented Jan 18, 2024

the most recent commit on main of cloudpickle (not released yet) likely solves the lambda problem but there are still cases that are not handled well.

@rjzamora
Copy link
Member Author

rjzamora commented Jan 18, 2024

Lambdas can also contain other important scope, like closed-over variables.

Totally agree with this. However, I'm still curious how much this matters for the tokenization purpose needed in dask-expr. (I'm sure we can come up with cases that demonstrate a problem, but I couldn't immediately imagine an obvious problem)

EDIT: Eh - I do agree that it is easy to come up with a case where the code string doesn't capture the closure well enough...

@rjzamora rjzamora closed this Jan 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants