Skip to content

Comments

[HUDI-735] Fixing error messages on record key field not found in schema#4342

Closed
harsh1231 wants to merge 1 commit intoapache:masterfrom
harsh1231:HUDI-735
Closed

[HUDI-735] Fixing error messages on record key field not found in schema#4342
harsh1231 wants to merge 1 commit intoapache:masterfrom
harsh1231:HUDI-735

Conversation

@harsh1231
Copy link
Contributor

@harsh1231 harsh1231 commented Dec 16, 2021

Tips

What is the purpose of the pull request

If record key does not exists in table schema , then current error messaging was not clear for user to understand.
Added exception if record key does not exist

Brief change log

(for example:)

  • Modify AnnotationLocation checkstyle rule in checkstyle.xml

Verify this pull request

(Please pick either of the following options)

This pull request is a trivial rework / code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end.
  • Added HoodieClientWriteTest to verify the change.
  • Manually verified the change by running a job locally.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.


val columnSet = df.columns.toSet
keyGenerator.getRecordKeyFieldNames.foreach(fieldName => if(!columnSet.contains(fieldName)) {
throw new Exception(s"record key '$fieldName' does not exist in existing table schema : ${schema.toString(true)}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is code style all good? guess L235 should be intended. Did you apply the code style format as per the guidelines?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@nsivabalan nsivabalan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 minor comment on code style. LGTM.

@vinothchandar vinothchandar changed the title [Hudi 735] Fixing error messages on record key not found [HUDI-735] Fixing error messages on record key not found Dec 16, 2021
@harsh1231
Copy link
Contributor Author

1 minor comment on code style. LGTM.

Done


val columnSet = df.columns.toSet
keyGenerator.getRecordKeyFieldNames.foreach(fieldName => if(!columnSet.contains(fieldName)) {
throw new Exception(s"record key '$fieldName' does not exist in existing table schema " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor. "does not exist in incoming dataframe schema"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@nsivabalan
Copy link
Contributor

@harsh1231 : there are some test failures. you may want to check them out.

log.info(s"Registered avro schema : ${schema.toString(true)}")

val columnSet = df.columns.toSet
keyGenerator.getRecordKeyFieldNames.foreach(fieldName => if(!columnSet.contains(fieldName)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do all keygenerator return some valid key field names? including custom ones implemented by users outside the project? If not, then May need to be more defensive in the check.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. we can't control user defined ones. if you feel this is too tight of a constraint, then probably we have to drop this patch.
One option is, to add validateKeyGenProps api to our keyGenerator interface and have it empty for base impl.
only for internal implementations, we can add this validation.

@nsivabalan nsivabalan changed the title [HUDI-735] Fixing error messages on record key not found [HUDI-735] Fixing error messages on record key field not found in schema Dec 28, 2021
@hudi-bot
Copy link
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@nsivabalan
Copy link
Contributor

Moving this back to discussion since we need to have some discussion on how to go about this.

@nsivabalan nsivabalan added the status:in-progress Work in progress label Jan 17, 2022
@nsivabalan
Copy link
Contributor

I will take this up and put up a new PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

status:in-progress Work in progress

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants