[HUDI-2590] Adding tests to validate different key generators #4473
Overall looks good. Let me know what you think of the comment below.
var inputDF2 = spark.read.json(spark.sparkContext.parallelize(records2, 2))

if (classOf[TimestampBasedKeyGenerator].getName.equals(keyGenClass)) {
  // incase of Timestamp based key gen, current_ts should not be updated. but dataGen.generateUpdates() would have updated
So this was the issue with the test datagen. Would special handling of timestamp keygen in the datagen itself be better than doing it here in a test case?
I did take a stab at it; it looks like we would have to touch a lot of methods in HoodieTestDataGenerator. Will take it up as a follow-up: https://issues.apache.org/jira/browse/HUDI-3152
Sounds good. Will land this.
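For readers following the thread above, here is a minimal sketch of the problem being worked around in the test. It is plain Scala with no Spark or Hudi dependency, and every name in it (`Rec`, `partitionPath`, `generateUpdates`, `pinTimestamps`) is hypothetical: it stands in for the real `TimestampBasedKeyGenerator` and `HoodieTestDataGenerator` rather than reproducing their APIs. It assumes the key generator derives something (here a day-formatted partition path) from `current_ts`, so an update that rewrites `current_ts` no longer lines up with the original record.

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

object TimestampKeyGenSketch {
  // Hypothetical record shape; not the actual Hudi test schema.
  final case class Rec(key: String, currentTs: Long, fare: Double)

  private val dayFormat =
    DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC)

  // Stand-in for a timestamp-based key generator: the partition path
  // is derived from the epoch-millis value in currentTs.
  def partitionPath(r: Rec): String =
    dayFormat.format(Instant.ofEpochMilli(r.currentTs))

  // Stand-in for a data generator whose updates also rewrite currentTs.
  def generateUpdates(recs: Seq[Rec], newTs: Long): Seq[Rec] =
    recs.map(r => r.copy(currentTs = newTs, fare = r.fare + 1.0))

  // Test-side workaround: restore each key's original currentTs so the
  // update maps to the same partition path as the insert.
  def pinTimestamps(original: Seq[Rec], updates: Seq[Rec]): Seq[Rec] = {
    val tsByKey = original.map(r => r.key -> r.currentTs).toMap
    updates.map(u => u.copy(currentTs = tsByKey.getOrElse(u.key, u.currentTs)))
  }

  def main(args: Array[String]): Unit = {
    val dayMs = 24L * 60 * 60 * 1000
    val inserts = Seq(Rec("k1", 0L, 10.0), Rec("k2", dayMs, 20.0))
    val updates = generateUpdates(inserts, 5 * dayMs)
    val pinned = pinTimestamps(inserts, updates)

    // Unpinned updates drift away from the partition of the original insert.
    assert(updates.map(partitionPath) != inserts.map(partitionPath))
    // Pinned updates keep the original partition but carry the new payload.
    assert(pinned.map(partitionPath) == inserts.map(partitionPath))
    assert(pinned.map(_.fare) == Seq(11.0, 21.0))
  }
}
```

Moving the equivalent of `pinTimestamps` into the data generator itself is what the reviewer suggests and what HUDI-3152 tracks; the sketch only shows the per-test workaround.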
What is the purpose of the pull request
Added tests that validate different query types on COW tables across different key generators.
This is a redo of #3877.
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.