[HUDI-735] Fixing error messages on record key field not found in schema #4342
harsh1231 wants to merge 1 commit into apache:master
    val columnSet = df.columns.toSet
    keyGenerator.getRecordKeyFieldNames.foreach(fieldName => if(!columnSet.contains(fieldName)) {
      throw new Exception(s"record key '$fieldName' does not exist in existing table schema : ${schema.toString(true)}")
Is the code style all good? I guess L235 should be indented. Did you apply the code style format as per the guidelines?
nsivabalan left a comment
1 minor comment on code style. LGTM.
Done
    val columnSet = df.columns.toSet
    keyGenerator.getRecordKeyFieldNames.foreach(fieldName => if(!columnSet.contains(fieldName)) {
      throw new Exception(s"record key '$fieldName' does not exist in existing table schema " +
Minor: the message should read "does not exist in incoming dataframe schema", since the check is against the incoming dataframe's columns.
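A minimal sketch of the check with the wording suggested above, assuming plain Scala collections in place of Spark and Hudi types. `RecordKeyCheck` and `missingRecordKeyMessage` are hypothetical names used only for illustration; `columns` stands in for `df.columns` and `recordKeyFields` for `keyGenerator.getRecordKeyFieldNames`. Listing the available columns in the message is an illustrative addition, not part of the patch.

```scala
// Hypothetical helper, not part of Hudi: it only illustrates the message wording.
object RecordKeyCheck {
  // Returns an error message for the first record key field that is missing
  // from the incoming columns, or None when every field is present.
  def missingRecordKeyMessage(recordKeyFields: Seq[String], columns: Seq[String]): Option[String] = {
    val columnSet = columns.toSet
    recordKeyFields.find(field => !columnSet.contains(field)).map { field =>
      s"record key '$field' does not exist in incoming dataframe schema " +
        s"(available columns: ${columns.mkString(", ")})"
    }
  }
}
```

A caller could throw on `Some(message)` and continue on `None`, which keeps the message construction testable without a Spark session.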
@harsh1231: there are some test failures. You may want to check them out.
    log.info(s"Registered avro schema : ${schema.toString(true)}")
    val columnSet = df.columns.toSet
    keyGenerator.getRecordKeyFieldNames.foreach(fieldName => if(!columnSet.contains(fieldName)) {
Do all key generators return valid key field names, including custom ones implemented by users outside the project? If not, the check may need to be more defensive.
Yeah, we can't control user-defined ones. If you feel this is too tight a constraint, then we probably have to drop this patch.
One option is to add a validateKeyGenProps API to our keyGenerator interface, with an empty base implementation.
We can then add this validation only for the internal implementations.
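The option described above could look roughly like the following sketch. Everything here is an assumption made for illustration: `KeyGeneratorSketch`, `SimpleKeyGeneratorSketch`, and `CustomKeyGeneratorSketch` are hypothetical stand-ins rather than Hudi's actual classes, and the signature of `validateKeyGenProps` is a guess, since the discussion only names the method.

```scala
// Hypothetical stand-in for the key generator interface (not Hudi's real API).
trait KeyGeneratorSketch {
  def recordKeyFieldNames: Seq[String]

  // Empty base implementation: user-defined key generators inherit a no-op,
  // so they are never rejected by a check they did not opt in to.
  def validateKeyGenProps(incomingColumns: Set[String]): Unit = ()
}

// Stand-in for an internal implementation that opts in to the validation.
final class SimpleKeyGeneratorSketch(val recordKeyFieldNames: Seq[String])
    extends KeyGeneratorSketch {

  override def validateKeyGenProps(incomingColumns: Set[String]): Unit = {
    val missing = recordKeyFieldNames.filterNot(incomingColumns.contains)
    if (missing.nonEmpty) {
      throw new IllegalArgumentException(
        s"record key(s) ${missing.mkString("'", "', '", "'")} " +
          "do not exist in incoming dataframe schema")
    }
  }
}

// Stand-in for a user-defined generator that keeps the inherited no-op.
final class CustomKeyGeneratorSketch extends KeyGeneratorSketch {
  val recordKeyFieldNames: Seq[String] = Seq.empty
}
```

With this shape, the writer would call `validateKeyGenProps(df.columns.toSet)` once before writing. Internal generators fail fast with a clear message, while custom ones pass through unchanged unless their authors override the method.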
Moving this back to discussion since we need to decide how to go about this.
I will take this up and put up a new PR.
What is the purpose of the pull request
If the record key does not exist in the table schema, the current error message is not clear enough for the user to understand.
Added an exception that is thrown if the record key does not exist.
Brief change log
(for example:)
Verify this pull request
(Please pick either of the following options)
This pull request is a trivial rework / code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.