feat(catalog): expose InformationSchemataBuilder as public API #22499
Open
zfarrell wants to merge 1 commit into
Make InformationSchemataBuilder constructible and usable from outside the crate so downstream catalog implementations (federated or async) can produce SQL-standard information_schema.schemata rows without depending on the private InformationSchemaConfig/CatalogProviderList walk, which is built around synchronous schema_names() enumeration. Adds InformationSchemataBuilder::new(), schema(), and bumps add_schemata and finish to pub. The struct's existing internal use via InformationSchemata::builder() is unchanged.
Which issue does this PR close?
No linked issue. Happy to file one if reviewers prefer.
Rationale for this change
Downstream catalog implementations that resolve schemas asynchronously cannot reuse InformationSchemaProvider: it enumerates schemas via CatalogProvider::schema_names(), which is synchronous, so an async-only catalog has to provide its own information_schema.schemata view. Today that requires either duplicating the column layout and the row-building logic, or reaching into private items.
Exposing InformationSchemataBuilder and a schemata_schema() factory lets external crates emit byte-for-byte-compatible schemata batches without copy-pasting the contract.
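For reviewers who want a feel for the intended downstream flow, here is a minimal, dependency-free sketch. Everything in it is a hypothetical stand-in: SchemataRows imitates InformationSchemataBuilder, schemata_columns imitates schemata_schema(), and rows are plain arrays rather than an Arrow RecordBatch. The seven column names and the always-NULL trailing columns are assumptions based on the columns named in this description, not a copy of the real schema.

```rust
/// Number of columns in this sketch's `information_schema.schemata` layout.
const NUM_COLUMNS: usize = 7;

/// One row of the view; `None` models SQL NULL.
pub type Row = [Option<String>; NUM_COLUMNS];

/// Hypothetical stand-in for `schemata_schema()`: the single place that names
/// the columns. The real factory returns an Arrow `SchemaRef` instead.
pub fn schemata_columns() -> [&'static str; NUM_COLUMNS] {
    [
        "catalog_name",
        "schema_name",
        "schema_owner",
        "default_character_set_catalog",
        "default_character_set_schema",
        "default_character_set_name",
        "sql_path",
    ]
}

/// Hypothetical stand-in for `InformationSchemataBuilder`.
#[derive(Default)]
pub struct SchemataRows {
    rows: Vec<Row>,
}

impl SchemataRows {
    pub fn new() -> Self {
        Self::default()
    }

    /// Same parameter shapes as described in this PR: `&str` names and an
    /// `Option<&str>` owner. The last four columns are assumed to be NULL.
    pub fn add_schemata(&mut self, catalog_name: &str, schema_name: &str, owner_name: Option<&str>) {
        self.rows.push([
            Some(catalog_name.to_string()),
            Some(schema_name.to_string()),
            owner_name.map(String::from),
            None,
            None,
            None,
            None,
        ]);
    }

    /// Hands back the accumulated rows (the real builder yields a batch).
    pub fn finish(self) -> Vec<Row> {
        self.rows
    }
}

/// What an async-only catalog might do once its schema names are resolved:
/// by this point the names are plain strings, so no synchronous
/// `schema_names()` walk is needed.
pub fn rows_for_resolved(catalog_name: &str, resolved: &[(&str, Option<&str>)]) -> Vec<Row> {
    let mut builder = SchemataRows::new();
    for (schema_name, owner_name) in resolved {
        builder.add_schemata(catalog_name, schema_name, *owner_name);
    }
    builder.finish()
}
```

The point of the sketch is the division of labour: the catalog resolves names however it likes, and the row layout stays owned by one shared definition.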
What changes are included in this PR?
pub fn schemata_schema() -> SchemaRef extracts the column-layout factory. InformationSchemata::new now calls it instead of inlining the schema, so there is a single source of truth.
InformationSchemataBuilder becomes pub (was private) with a Default impl and a public new(). add_schemata and finish are bumped to pub. The function bodies and parameter types (&str / Option<&str>) are unchanged.
finish now returns Result<RecordBatch> instead of panicking via an internal .unwrap(). The one internal caller (PartitionStream::execute for InformationSchemata) was previously wrapping Ok(builder.finish()) and is updated to just builder.finish() since the inner expression now produces the Result directly.
Are these changes tested?
Yes. A new unit test schemata_builder_emits_canonical_schema_and_rows exercises the public API end-to-end via Default::default(), asserts the produced batch's schema matches schemata_schema(), and verifies the null pattern for schema_owner, the three default_character_set_* columns, and sql_path. The pre-existing internal users (InformationSchemata::new, PartitionStream::execute) continue to exercise the same code path through the unchanged InformationSchemata::builder() constructor.
Are there any user-facing changes?
Yes, three new public items in datafusion-catalog: schemata_schema, InformationSchemataBuilder (with its new / add_schemata / finish methods + Default impl). No existing public API is broken. The Result<RecordBatch> return on finish is a first-time-public surface, not a regression.
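To illustrate why returning a Result from finish simplifies the internal caller, here is a small sketch under stated assumptions. Batch and try_new_batch are hypothetical stand-ins for an Arrow batch and its fallible constructor, and the ragged-column check is only an example of a validation that could fail; none of this is the actual DataFusion code.

```rust
/// Stand-in for a record batch: a set of columns that must be equally long.
#[derive(Debug, PartialEq)]
pub struct Batch {
    pub columns: Vec<Vec<Option<String>>>,
}

/// Stand-in for a fallible batch constructor: rejects ragged columns.
pub fn try_new_batch(columns: Vec<Vec<Option<String>>>) -> Result<Batch, String> {
    let expected = columns.first().map_or(0, Vec::len);
    if columns.iter().any(|c| c.len() != expected) {
        return Err(format!("all columns must have {} rows", expected));
    }
    Ok(Batch { columns })
}

/// Old shape: a construction failure turns into a panic inside the builder.
pub fn finish_unwrapping(columns: Vec<Vec<Option<String>>>) -> Batch {
    try_new_batch(columns).unwrap()
}

/// New shape: the error is handed to the caller instead.
pub fn finish(columns: Vec<Vec<Option<String>>>) -> Result<Batch, String> {
    try_new_batch(columns)
}

/// Stand-in for the internal caller. Before, it had to write
/// `Ok(finish_unwrapping(columns))`; now the inner call already is a Result.
pub fn execute(columns: Vec<Vec<Option<String>>>) -> Result<Batch, String> {
    finish(columns)
}
```

With this shape an external caller can decide how to report a malformed batch rather than inheriting a panic from library code.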