[HUDI-4942] Fix RowSource schema provider#6817
@nsivabalan Can you please review this? I have not added a unit test yet, but I have tested it against my local Confluent Schema Registry setup. The main issue is that if a schema provider is overridden, RowSource does not take it into consideration. It simply fetched the schema based on
nsivabalan
left a comment
Is it possible to write tests?
    if (overriddenSchemaProvider != null) {
      return new InputBatch<>(res.getKey(), res.getValue(), overriddenSchemaProvider);
    }
org.apache.hudi.utilities.sources.Source#fetchNext already checks for and uses overriddenSchemaProvider.
And fetchNewData() is only called from fetchNext(). I think some misconfiguration caused the issue.
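To make the point above concrete, here is a minimal sketch of the wrapping pattern being described. It is a simplified model, not the actual Hudi code: `SchemaProviderSketch`, the one-method `SchemaProvider` interface, the two-field `InputBatch`, and the no-argument `fetchNext()`/`fetchNewData()` signatures are all assumptions made for illustration (the real methods take checkpoint and limit arguments). It only shows why, if `fetchNext()` is the sole caller of `fetchNewData()` and already swaps in the overridden provider, an extra check inside `RowSource` should not change the result.

```java
import java.util.List;

// Hypothetical, simplified stand-ins for illustration; not the real Hudi classes.
final class SchemaProviderSketch {

    /** Stand-in for a schema provider; exposes only a label so the outcome is visible. */
    interface SchemaProvider {
        String describe();
    }

    /** Stand-in for InputBatch: the fetched data plus the provider attached to it. */
    static final class InputBatch<T> {
        final T data;
        final SchemaProvider schemaProvider;

        InputBatch(T data, SchemaProvider schemaProvider) {
            this.data = data;
            this.schemaProvider = schemaProvider;
        }
    }

    /** Base source: fetchNext() is the only caller of fetchNewData(). */
    abstract static class Source<T> {
        private final SchemaProvider overriddenSchemaProvider; // null when not overridden

        Source(SchemaProvider overriddenSchemaProvider) {
            this.overriddenSchemaProvider = overriddenSchemaProvider;
        }

        /** Subclasses return data together with whatever provider they derive themselves. */
        protected abstract InputBatch<T> fetchNewData();

        /** Prefers the user-specified provider over the one chosen by the subclass. */
        public final InputBatch<T> fetchNext() {
            InputBatch<T> batch = fetchNewData();
            if (overriddenSchemaProvider != null) {
                return new InputBatch<>(batch.data, overriddenSchemaProvider);
            }
            return batch;
        }
    }

    /** Row-style source that always attaches a provider derived from the rows. */
    static final class RowSource extends Source<List<String>> {
        RowSource(SchemaProvider overriddenSchemaProvider) {
            super(overriddenSchemaProvider);
        }

        @Override
        protected InputBatch<List<String>> fetchNewData() {
            return new InputBatch<>(List.of("row-1", "row-2"), () -> "row-based");
        }
    }

    public static void main(String[] args) {
        SchemaProvider userProvider = () -> "user-specified";
        // With an override, the base class replaces the row-based provider.
        System.out.println(new RowSource(userProvider).fetchNext().schemaProvider.describe()); // prints "user-specified"
        // Without an override, the subclass's provider is kept.
        System.out.println(new RowSource(null).fetchNext().schemaProvider.describe()); // prints "row-based"
    }
}
```

In this model the override is applied in one place, the base class, so a subclass-level check like the one in the diff would be redundant. Whether the real code path behaves the same way under the reported configuration is exactly what still needs to be confirmed.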
That's actually a valid point. I wrote a unit test with evolving schema. But, it passes even without this change. I think we can hold off landing this PR. Let me investigate more.
@codope: What's the status of this PR? Do we need this anymore? If not, do we still need to root-cause the original issue?
Closing the PR. We need to root cause the issue. Something more is happening here. |
Change Logs
The default value provided by the schema provider is lost because RowSource sets a RowBasedSchemaProvider for the InputBatch. This PR fixes it by passing the user-specified schema provider.