DM-30266 Modify serialization of some objects #573
I've got some pretty big concerns here, and I'm not actually sure it's worth trying to resolve them on this branch - originally this ticket was just going to be about making quantum node IDs into UUIDs to make other Quantum serialization changes easier, but:
- this now overlaps enough with DM-30332 that I'd really like to at least try to reconcile the branches before merging, to see if we've solved some problems in fundamentally different ways;
- the `hash()`-keyed `DimensionRecord` normalization here is broken, but that is the problem that DM-30332 aims to solve much more rigorously;
- if we can possibly avoid setting the precedent of adding these `direct` methods to all of our serialization structures, I think it'll pay off in the long run.
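The comment above does not say how the `hash()`-keyed normalization is broken. As a general illustration only, one hazard of using `hash()` values as lookup keys is that distinct objects can share a hash, so a table keyed that way can silently merge them. The sketch below uses a hypothetical `Record` stand-in (not the real `DimensionRecord`) and relies on the CPython detail that `hash(-1) == hash(-2)`:

```python
from typing import NamedTuple


class Record(NamedTuple):
    """Hypothetical stand-in for a record; not the butler DimensionRecord."""

    name: str
    value: int


def normalize_by_hash(records):
    """Deduplicate records into a table keyed by hash(record)."""
    table = {}
    for record in records:
        # First record wins; a colliding, unequal record is silently dropped.
        table.setdefault(hash(record), record)
    return table


def normalize_by_value(records):
    """Deduplicate records into a table keyed by the record's own contents."""
    table = {}
    for record in records:
        table.setdefault(tuple(record), record)
    return table


# In CPython, hash(-1) == hash(-2), so these unequal records share a hash.
a = Record("detector", -1)
b = Record("detector", -2)
by_hash = normalize_by_hash([a, b])
by_value = normalize_by_value([a, b])
```

Here `by_hash` ends up with one entry while `by_value` keeps both, even though `a != b`. Whether this is the specific failure the review has in mind is an assumption.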
I'd also like @timj to sign off on the serialization changes first, if we do plan to merge this soon; this system is more in his domain than mine (note that he may also object to some of the things I'm doing on DM-30332).
So, where does this leave us? I was hoping not to get back to DM-30332 until after some big query-system changes, because QG serialization and state stuff in general is a lower priority from a high-level perspective than QG generation stuff right now. But this ticket is mostly done, and there's a lot of content on the DM-30332 branches that I don't want to get too stale, either. And if I can get DM-30332 (or at least the parts that are basically done) out for review soon, we could rebase this branch on that and delegate to it for the `DimensionRecord` normalization logic instead of fixing it independently.
There's also the big catch that I didn't try to avoid pydantic validation on DM-30332, and the fact that @natelust has done so here makes me worry that DM-30332 could push us off a performance cliff. Maybe I can learn enough from his work here to fix my branch in that respect, but I'm also worried that just turning off validation sort of defeats the purpose of using pydantic, and it might be a sign of bigger problems to come. I've been wondering for a while now whether QG is going to be the thing that finally pushes us into adding some compiled-language code to the middleware; doing graph algorithms with hundreds of thousands of nodes in Python just strikes me as bonkers, period.
Thoughts on how to proceed welcome.
This method should only be called when the inputs are trusted.
"""
node = SerializedDatasetRef.__new__(cls)
setter = object.__setattr__
This degree of poking at class internals worries me a lot; we seem to be assuming a lot about pydantic implementation details. If we can't use `construct` instead of adding these methods, could we at least use it inside the `direct` implementations?
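For readers unfamiliar with the pattern under discussion, here is a minimal sketch of a `direct`-style constructor that bypasses validation by calling `__new__` and `object.__setattr__`. It deliberately uses a plain class so it runs with the standard library alone; `SerializedThing` and its fields are hypothetical and are not the butler or pydantic code from this PR.

```python
class SerializedThing:
    """Hypothetical stand-in; not a pydantic model and not butler code."""

    __slots__ = ("dataset_id", "run")

    def __init__(self, dataset_id: int, run: str) -> None:
        # Normal path: inputs are checked before being stored.
        if not isinstance(dataset_id, int) or dataset_id < 0:
            raise ValueError("dataset_id must be a non-negative int")
        if not isinstance(run, str):
            raise TypeError("run must be a str")
        self.dataset_id = dataset_id
        self.run = run

    @classmethod
    def direct(cls, *, dataset_id, run):
        """Build an instance without validation.

        This should only be called when the inputs are trusted.
        """
        node = cls.__new__(cls)  # allocate without running __init__
        setter = object.__setattr__  # write attributes directly
        setter(node, "dataset_id", dataset_id)
        setter(node, "run", run)
        return node


# The trusted path accepts a value that the validated path rejects.
unchecked = SerializedThing.direct(dataset_id=-5, run="some_run")
try:
    SerializedThing(-5, "some_run")
    rejected = False
except ValueError:
    rejected = True
```

The trade-off is the one raised in the review: the fast path depends on how instances store their attributes, so any change to the class internals can break it without warning. Pydantic's own `construct` classmethod (`model_construct` in pydantic v2) offers a supported way to skip validation, which is why the reviewer asks about it.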
@natelust can you respond to @TallJimbo's question here please?
He and I discussed this out of band. I'm not thrilled, but I don't see a great alternative and I'm mostly disappointed in pydantic. Hopefully we can figure out a way to clean it up later.
fbf806d to 953ce5e
Codecov Report
@@            Coverage Diff             @@
##             main     #573      +/-   ##
==========================================
- Coverage   83.85%   83.26%   -0.60%
==========================================
  Files         234      242       +8
  Lines       29781    30896    +1115
  Branches     4929     4636     -293
==========================================
+ Hits        24974    25725     +751
- Misses       3674     3968     +294
- Partials     1133     1203      +70
There don't seem to be any tests that use the serialization
953ce5e to c7cc0d0
c604def to 85eaba5
Add/Modify serialization of some objects to support a new serialized format of QuantumGraphs. In particular this introduces a new method on some objects to support direct construction if the inputs are already trusted, skipping validation steps.
85eaba5 to 08a9926
Checklist
doc/changes