New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix nested discriminated union schema gen, pt 2 #8932
Conversation
Deploying with Cloudflare Pages
|
CodSpeed Performance ReportMerging #8932 will not alter performanceComparing Summary
|
Please review |
|
||
s = recurse(s, inner) | ||
if s['type'] == 'tagged-union': | ||
return s | ||
|
||
metadata = s.get('metadata', {}) | ||
discriminator = metadata.get(CORE_SCHEMA_METADATA_DISCRIMINATOR_PLACEHOLDER_KEY, None) | ||
discriminator = metadata.pop(CORE_SCHEMA_METADATA_DISCRIMINATOR_PLACEHOLDER_KEY, None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually get rid of this once we use it
elif choice['type'] == 'definition-ref': | ||
if choice['schema_ref'] not in self.definitions: | ||
raise MissingDefinitionForUnionRef(choice['schema_ref']) | ||
self._handle_choice(self.definitions[choice['schema_ref']]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important!!! If a choice is of type definition-ref
, we want to just reuse that ref for the given choice. Before, we were going through this whole thing of fetching the value from definitions, then using that, but that ends up not working for nested / recursive schemas.
Our schema walking logic walks through both the schema and the definitions, so we can rest easy knowing that unions will be converted to tagged unions in the definitions list as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this work with JSON schema generation if e.g. the ref'ed schema is itself a discriminated union? Maybe that can't happen, and either way this seems like an improvement if no tests fail, but still
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. I believe that the walk core schema logic handles definitions schemas such that the function being applied during the walk is also applied to all of the definitions in a definitions schema, so that's why I felt comfortable removing this step.
This can be seen via the example test I added - discriminated union transformation logic is applied to the 2 schemas in the definitions list that require said changes!
I'll note, we do trace all of the way down each schema now to check for discriminated unions, which isn't the most performant, although we certainly get some nice benefits here from using refs instead of constantly pulling from definitions and then having to simplify nasty schemas. I think it'd be beneficial for us to have some codecov tests for schema building, etc for PRs like this one. |
@@ -1868,8 +1868,6 @@ class LeafState(BaseModel): | |||
state_type: Literal['leaf'] | |||
|
|||
AnyState = Annotated[Union[NestedState, LoopState, LeafState], Field(..., discriminator='state_type')] | |||
NestedState.model_rebuild() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you expand this section, you can see that this test showcases the example I've shown in the PR description 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! The edited test shows that a model rebuild is no longer necessary, and hence this does fix things.
I've added a few tests showing the now-correct behavior for the example given in #7487, along with some extra variants for Though these tests could be consolidated a bit with parametrization, I think that each of these test cases is simple enough that it's fine to leave them as singular tests that can easily be run independently. |
Regarding my comment about other high-priority fixes for discriminated unions, this was one I had in mind: #8628. I think we could include that fix + this improvement in 2.6.4. |
Trying to keep this change as simple as possible -- more
_generate_schema.py
refactoring to come in future PRs.This removes the usage of metadata keys to manage tagged union application in the latter part of schema simplification. I've made a comment on the most important line that fixed the bug we were experiencing with recursive discriminated unions.
See the code snippet below as an example of:
a) vastly increased performance compared to the faulty 2.6 logic
b) correct schema generation
The above schema is now actually correct, and isn't cluttered with metadata tags relating to the application of discriminated unions :)