DM-39434: Avoid defaultdict with lambda in pickled dataclasses #844

Merged
merged 2 commits into from May 31, 2023

Conversation

andy-slac
Contributor

@andy-slac andy-slac commented May 28, 2023

Pickle cannot serialize lambdas or local/nested functions, so a dataclass.field
that uses a defaultdict with a lambda as its default_factory breaks pickling.
Instead of defining a module-level factory function, I decided to use a plain
dict and its setdefault method. I also had to add an explicit pickle method to
StoredFileInfo; being a frozen dataclass, it does not work with pickle by default.
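
For readers unfamiliar with the failure mode, here is a minimal sketch of the problem and of the setdefault workaround described above. The class names, the nested layout of the mapping, and the `add` helper are hypothetical stand-ins chosen for illustration; they are not the actual daf_butler code.

```python
import pickle
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RecordsWithDefaultdict:
    """Hypothetical example: nested defaultdict whose factory is a lambda."""

    # The inner lambda is stored on the defaultdict as its default_factory,
    # and pickle has to serialize that factory along with the mapping.
    records: dict = field(default_factory=lambda: defaultdict(lambda: defaultdict(list)))


@dataclass
class RecordsWithPlainDict:
    """Hypothetical example: plain dicts populated through setdefault."""

    records: dict = field(default_factory=dict)

    def add(self, dataset_id, table, item):
        # setdefault creates the missing levels on demand, so no factory
        # object has to live inside the mapping.
        self.records.setdefault(dataset_id, {}).setdefault(table, []).append(item)


def pickles_ok(obj) -> bool:
    """Return True if obj can be serialized with pickle."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True
```

With these definitions, `pickles_ok(RecordsWithDefaultdict())` is expected to be false even for an empty mapping, because the lambda factory travels with the defaultdict, while a populated `RecordsWithPlainDict` contains only ordinary dicts and lists.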

Checklist

  • ran Jenkins
  • added a release note for user-visible changes to doc/changes

@codecov

codecov bot commented May 28, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.02 🎉

Comparison is base (ec51e16) 87.91% compared to head (c3c26c9) 87.93%.

❗ Current head c3c26c9 differs from pull request most recent head 719fcf9. Consider uploading reports for the commit 719fcf9 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #844      +/-   ##
==========================================
+ Coverage   87.91%   87.93%   +0.02%     
==========================================
  Files         268      268              
  Lines       35240    35256      +16     
  Branches     7396     7392       -4     
==========================================
+ Hits        30980    31004      +24     
+ Misses       3119     3112       -7     
+ Partials     1141     1140       -1     
| Impacted Files | Coverage Δ |
|---|---|
| python/lsst/daf/butler/core/datastoreRecordData.py | 96.00% <100.00%> (+11.06%) ⬆️ |
| python/lsst/daf/butler/core/storedFileInfo.py | 100.00% <100.00%> (ø) |
| python/lsst/daf/butler/datastores/fileDatastore.py | 82.33% <100.00%> (+0.01%) ⬆️ |
| tests/test_datastore.py | 99.54% <100.00%> (+<0.01%) ⬆️ |

Member

@timj timj left a comment

I forget why we can't use a single pydantic model for this and instead have to have a pydantic model and a dataclass. Is it for performance reasons? Presumably the pydantic model would pickle just fine.

Can you add a bit more detail about having to drop defaultdict in the commit message?

Pickle cannot serialize lambdas or local/nested functions, so a `dataclass.field`
that uses a defaultdict with a lambda as its default_factory breaks pickling.
Instead of defining a module-level factory function, I decided to use a plain
dict and its `setdefault` method. I also had to add an explicit pickle method to
StoredFileInfo; being a frozen dataclass, it does not work with pickle by default.
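
As an illustration of what an "explicit pickle method" on a frozen dataclass can look like, here is a minimal sketch. `FileInfo`, its fields, and `round_trip` are hypothetical; this is not the real StoredFileInfo, and the use of `__reduce__` is an assumption about the approach rather than a description of the merged code. The `__slots__` declaration is also an assumption: a frozen dataclass with manually declared slots is one known case where default unpickling fails, because pickle restores slot state through `setattr`, which a frozen instance rejects.

```python
import pickle
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical frozen record class; not the real StoredFileInfo."""

    # Assumption for this sketch: slots are declared by hand.
    __slots__ = ("path", "file_size")

    path: str
    file_size: int

    def __reduce__(self):
        # Tell pickle to rebuild the object by calling the constructor,
        # so unpickling never goes through the frozen __setattr__.
        return (FileInfo, (self.path, self.file_size))


def round_trip(info: FileInfo) -> FileInfo:
    """Serialize and deserialize an instance with pickle."""
    return pickle.loads(pickle.dumps(info))
```

Rebuilding through the constructor keeps the class frozen for normal use while giving pickle a path that does not need attribute assignment.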
@andy-slac
Contributor Author

> I forget why we can't use a single pydantic model for this and instead have to have a pydantic model and a dataclass. Is it for performance reasons? Presumably the pydantic model would pickle just fine.

I'm not sure what the exact reason was, but I see that the pydantic model and the dataclass have different structures; maybe the pydantic model is more YAML/JSON-friendly than the in-memory dataclass?

> Can you add a bit more detail about having to drop defaultdict in the commit message?

I updated the commit message.

@andy-slac andy-slac merged commit 15ae3e5 into main May 31, 2023
11 checks passed
@andy-slac andy-slac deleted the tickets/DM-39434 branch May 31, 2023 18:58