-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HUDI-7571] Add api to get exception details in HoodieMetadataTableValidator with ignoreFailed mode #10960
Conversation
@the-other-tim-brown @codope Can you please take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor nitpicks but otherwise lgtm
@@ -197,6 +200,27 @@ public HoodieMetadataTableValidator(JavaSparkContext jsc, Config cfg) { | |||
this.taskLabels = generateValidationTaskLabels(); | |||
} | |||
|
|||
/** | |||
* Returns list of Throwable which were encountered during validation. This method is useful | |||
* when ignoreFailed parameter is set to true. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may also want to note you can use this option to find all issues instead of just the first one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry! Did not get this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If i'm not wrong, I think it's about making javadoc more clear that the list of throwables is actually the list of all validation failures (MDT, RLI, etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that was the idea. It's used to get back all the things that may be wrong with the MDT
* Returns true if validation has failed. This method is useful when ignoreFailed | ||
* parameter is set to true. | ||
*/ | ||
public boolean isValidationFailed() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: method name could be hasValidationFailure()
to distinguish between generic failures and validation failures?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed
@@ -180,6 +181,8 @@ public class HoodieMetadataTableValidator implements Serializable { | |||
|
|||
private final String taskLabels; | |||
|
|||
private List<Throwable> throwables = new LinkedList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: Any reason to use linked list over array list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed. No particular preference in this case. Generally I tend to use linked list if the size of list is not known and I don't need random access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this will be useful for validation. Can you please rebase and add a test for the two public methods added in this patch? Check out TestHoodieMetadataTableValidator
after rebase.
hudi-utilities/src/main/java/org/apache/hudi/utilities/HoodieMetadataTableValidator.java
Show resolved
Hide resolved
…lidator with ignoreFailed mode
@@ -197,6 +200,27 @@ public HoodieMetadataTableValidator(JavaSparkContext jsc, Config cfg) { | |||
this.taskLabels = generateValidationTaskLabels(); | |||
} | |||
|
|||
/** | |||
* Returns list of Throwable which were encountered during validation. This method is useful | |||
* when ignoreFailed parameter is set to true. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If i'm not wrong, I think it's about making javadoc more clear that the list of throwables is actually the list of all validation failures (MDT, RLI, etc).
@@ -438,6 +461,7 @@ private boolean doHoodieMetadataTableValidationOnce() { | |||
if (!cfg.ignoreFailed) { | |||
throw e; | |||
} | |||
throwables.add(e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where are you clearing up the throwables list?
if we are calling doHoodieMetadataTableValidationOnce repeatedly, don't we need to clear the list before we execute next time ?
Change Logs
When ignoreFailed is enabled, HoodieMetadataTableValidator ignores failure and continues the validation. This jira aims to add api to get list of exceptions and an api to check if validation exception was thrown.
Impact
NA
Risk level (write none, low medium or high below)
low
Documentation Update
NA
Contributor's checklist