AVRO-1646: Print/parse consistency for nested null namespace #724
Thanks for the clear case in the description! I'm looking into this, but there's definitely something odd going on -- not in the description but the unit test. Just reading the test and the Avro spec, I would expect the full name of the records to be Is this an ambiguity in the spec with regards to namespace inheritance? When you consider "canonical schemas", there's no way to distinguish the null |
@RyanSkraba My understanding was originally the same as yours, but after digging into it I came to the following understanding:
Thus, when That's a good point about canonical schemas - based on my reading of the spec, the canonical form of the name |
Excellent and clear explanation, thanks! To paraphrase: in Java, the fact that That sounds like a surprising rule in the current implementation, but it would avoid some problems with the same schema instance being reused in different namespaces. It doesn't seem quite right (for the semantics reason you give above), but it'd be hard to argue that it's wrong. In any case, the current behaviour that this PR is addressing is definitely wrong and is improved by this fix! It took me a while to unravel the way |
Fixes print-parse consistency for named types with null namespace,
enclosed in named fields with null namespace, which are themselves
enclosed in named fields with non-null namespace.
Currently, if a type specifies the null namespace, this overrides the
enclosing non-null namespace for the purpose of naming that type, but
not for the purposes of determining the closest enclosing namespace for
more deeply nested types. This fixes that by removing special-casing of
null when setting the default namespace for recursively parsed schemas.
For example:
when parsed and re-printed becomes
Adapted from the patch that was submitted with the AVRO-1646 ticket.
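
To make the naming rule in the description concrete, here is a minimal sketch in Python. It is a toy model of namespace inheritance and not Avro's actual parser: the `full_names` helper, the `special_case_null` flag, and the `Outer`/`Middle`/`Inner` schema are all hypothetical names chosen for illustration. It assumes only simple (dot-free) record names, and that `"namespace": ""` denotes the null namespace.

```python
def full_names(schema, default_ns=None, special_case_null=False):
    """Collect the full names of all records in a nested record schema.

    Toy model only (hypothetical helper, not part of Avro).
    special_case_null=True mimics the behaviour the description calls
    inconsistent: a null namespace names the type itself but is not
    passed down as the default for more deeply nested types.
    """
    if not isinstance(schema, dict) or schema.get("type") != "record":
        return []

    # An explicit "namespace" attribute wins; "" means the null namespace.
    if "namespace" in schema:
        ns = schema["namespace"] or None
    else:
        ns = default_ns

    name = schema["name"]
    full = f"{ns}.{name}" if ns else name

    # Decide which namespace nested types inherit by default.
    if special_case_null and ns is None:
        child_ns = default_ns  # null is ignored; the outer default leaks through
    else:
        child_ns = ns          # null is inherited like any other namespace

    names = [full]
    for field in schema.get("fields", []):
        names += full_names(field["type"], child_ns, special_case_null)
    return names


# Hypothetical schema: Inner omits "namespace" because its enclosing
# type (Middle) is already in the null namespace.
schema = {
    "type": "record", "name": "Outer", "namespace": "space",
    "fields": [{"name": "f", "type": {
        "type": "record", "name": "Middle", "namespace": "",
        "fields": [{"name": "g", "type": {
            "type": "record", "name": "Inner", "fields": []}}]}}],
}

print(full_names(schema, special_case_null=True))   # special-cased null
print(full_names(schema, special_case_null=False))  # null inherited
```

In this model, the special-cased variant resolves `Inner` into `space`, while the other variant keeps it in the null namespace alongside `Middle`. That difference is the kind of print/parse mismatch the description refers to.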
Jira
Tests
TestSchema.testDeeplyNestedNullNamespace
Commits
Documentation