What
Closes #25669
Improves the built-in schema inferrer to produce schemas that work better with the Airbyte platform by default:
instead, do this:
because the platform ignores anyOf but can extract the nested object in the second case.
How
The number/integer case can be handled by adding another extra strategy.
However, the null handling is baked too deeply into genson to get the desired behavior through extra strategies alone: the anyOf output can't be changed by user-provided strategies, and the library always expects an output for a schema node, so an "ignore null" strategy can't be implemented. To solve this, a post-processing function is introduced that traverses the built schema and rewrites the output as desired.
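As a rough illustration of the post-processing idea, the sketch below walks an already-built schema and rewrites nullable anyOf nodes. It is not the code from this PR: the function name `collapse_nullable_any_of` is hypothetical, and the target shape (folding `anyOf: [null, X]` into `X` with `type: [X.type, "null"]`) is an assumption about what the platform handles well.

```python
from typing import Any, Dict

Schema = Dict[str, Any]


def collapse_nullable_any_of(node: Schema) -> Schema:
    """Rewrite {"anyOf": [{"type": "null"}, X]} into X with a nullable type.

    Hypothetical helper: mutates and returns the given schema node.
    """
    variants = node.get("anyOf")
    null_schema = {"type": "null"}
    if isinstance(variants, list) and len(variants) == 2 and null_schema in variants:
        # Pick the variant that is not the plain null schema.
        real = variants[0] if variants[1] == null_schema else variants[1]
        real_type = real.get("type")
        # Only collapse when the other variant has a single, non-null type.
        if isinstance(real_type, str) and real_type != "null":
            node.pop("anyOf")
            node.update(real)
            node["type"] = [real_type, "null"]

    # Recurse into object properties.
    properties = node.get("properties")
    if isinstance(properties, dict):
        for value in properties.values():
            if isinstance(value, dict):
                collapse_nullable_any_of(value)

    # Recurse into array items (single-schema form only).
    items = node.get("items")
    if isinstance(items, dict):
        collapse_nullable_any_of(items)

    return node
```

Running the traversal after the schema is built avoids touching genson's internals, at the cost of a second pass over the schema.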
🚨 User Impact 🚨
Schema detection in the connector builder will change for the better. Existing connectors won't be affected.