Fix schema inference in case of empty messages in Protobuf/CapnProto formats #39357
Our current workaround for this bug is to use schema inference to get a preliminary schema for our (extremely) complex Protobuf messages and then edit the inferred schema to remove the illegal empty tuples. This patch would make it impossible for us to use schema inference at all, which is unfortunate. Would you consider allowing some way to DESCRIBE a Protobuf-based schema via the File engine while prohibiting the creation of any actual tables? I admit this seems like a weird solution, because CH would still produce a schema that is illegal. If you integrate this patch, we will probably pre-process our .proto files to remove empty messages.
What if I add a special setting for this? If we see a field that is an empty message and this setting is enabled, we will just skip this field during schema inference.
That would be perfect, thanks.
…oto schema inference
…otobuf-capnp-empty-message
New test is flaky with Ordinary database: https://s3.amazonaws.com/clickhouse-test-reports/0/8e98b7d47dbbcfe88c8d420b35794854e93a6361/stateless_tests__release__databaseordinary_.html
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix a bug in schema inference for empty messages in Protobuf/CapnProto formats that allowed creating a column with an empty Tuple type. Closes #39051.
Add 2 new settings, input_format_{protobuf/capnproto}_skip_fields_with_unsupported_types_in_schema_inference, that allow skipping fields with unsupported types during schema inference for the Protobuf and CapnProto formats.
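To illustrate the behaviour discussed in this thread, here is a minimal Python sketch of schema inference that either rejects or skips empty messages. It is not the ClickHouse implementation: the Message class, infer_type, infer_schema, and the skip_unsupported flag are all hypothetical names invented for this example, and the type strings only imitate ClickHouse type names.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    # Hypothetical stand-in for a Protobuf/CapnProto message descriptor:
    # maps field name -> scalar type name (str) or a nested Message.
    fields: dict = field(default_factory=dict)


def infer_type(node, skip_unsupported):
    """Return a type string, or None if the field should be skipped."""
    if isinstance(node, str):
        return node  # scalar field: type name is used as-is
    parts = []
    for name, child in node.fields.items():
        child_type = infer_type(child, skip_unsupported)
        if child_type is not None:
            parts.append(f"{name} {child_type}")
    if not parts:
        # An empty message would map to an empty Tuple, which is illegal.
        if skip_unsupported:
            return None
        raise ValueError("empty message cannot be converted to a Tuple")
    return "Tuple(" + ", ".join(parts) + ")"


def infer_schema(message, skip_unsupported=False):
    """Infer {column name: type} for the top-level fields of a message."""
    columns = {}
    for name, child in message.fields.items():
        column_type = infer_type(child, skip_unsupported)
        if column_type is not None:
            columns[name] = column_type
    return columns
```

With the flag off, any empty message makes inference fail instead of producing an empty Tuple; with the flag on, the offending field is dropped, and a message whose fields are all dropped is itself skipped.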