-
Notifications
You must be signed in to change notification settings - Fork 117
Reintroduce type renaming with more resilient handling for unexpected protobuf names #3770
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Reintroduce type renaming with more resilient handling for unexpected protobuf names #3770
Conversation
… names (FoundationDB#3736)" (FoundationDB#3767) This reverts commit 7f1f4c9.
This adds support for retaining the protobuf names more directly for types and fields. This can happen if the user has created a meta-data proto and used a strategy for naming that differs from the one that would have been generated by our own DML. The basic strategy is to: 1. Continue to always apply the `toProtoUtils` method to produce plausible user-generated names but 1. Retain the original protobuf name in the `Type` information and then use that to get the name used to access data in the field
… that code in the RecordMetadataDeserializer with logic in the Type system
- make sure to convert FieldKeyExpression#fieldName (which is internal) to user-facing name when constructing match candidates. - also, add tests for deeply nested (and repeated) structures with non-pb-compliant field names, and an index.
… can match the indexes in the same circumstances as cases with non-escaped identifiers
…ializing RecordLayerSchemaTemplate. (#3)
hatyo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, LGTM, but I would leave the final judgement to @normen662.
ad4d344 to
5d212bc
Compare
📊 Metrics Diff Analysis ReportSummary
ℹ️ About this analysisThis automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into:
The last category in particular may indicate planner regressions that should be investigated. New QueriesCount of new queries by file:
|
This reintroduces protobuf type and field renaming. The first version was added in #3696 and #3706 and then taken out by #3726. A second version was introduced with #3736 and then removed by #3767. This adds back a variation based on the data from #3736 but updated to be more resilient to unexpected names in existing meta-data objects.
The issue with #3736 before is that if a type existed in the meta-data that was not correctly escaped (e.g.,
Type__Blahinstead ofType__0Blah), then it would fail to match the type during querying. This was a problem even if the type wasn't actually involved in a query because of how matching worked on theFullUnorderedScanExpression, meaning that any query would fail to plan if any type in the meta-data was so written.This makes things more resilient. We now do a bit more work to associate a type with its original name from the protobuf file if one is provided, recording both the user-visible name and the original storage name. The only places that we now generate new protobuf compliant names is when we construct a
Typeobject. In all other cases, we only go from the storage name to user-visible names.We still do rely on the fact that we can correctly predict the expected user-visible name by running the de-escaping logic. At some point, we may need to have a more complicated mapping, especially if we want to support more arbitrary names. That is left as future work. I could also see us wanting to do a bit more refactoring to better encapsulate this transformation.
The new test modifications made to
valid-identifiers.yamsqlcover those cases by adding new types with names that would not have been generated by any DDL statement, and then validating that (1) those do not disrupt correctly constructed queries and (2) that the problematic types can themselves be queried.In addition, this addresses some shortcomings with the match candidates where
FieldKeyExpressions (which use the internal names) would sometimes be used to generate match candidates which referenced the internal name directly. This fixes that by plugging those gaps. There are additional queries invalid-identifiers.yamsqlthat are designed to cover those matches.