[HUDI-3635] Fix HoodieMetadataTableValidator around comparison of partition listing #5100

Merged
nsivabalan merged 2 commits into apache:master from zhangyue19921010:HUDI-3635
Mar 30, 2022

Contributor

zhangyue19921010 commented Mar 23, 2022

What is the purpose of the pull request

Please refer to https://issues.apache.org/jira/browse/HUDI-3635 for more details.

Brief change log

(for example:)

  • Modify AnnotationLocation checkstyle rule in checkstyle.xml

Verify this pull request

(Please pick either of the following options)

This pull request is a trivial rework / code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end.
  • Added HoodieClientWriteTest to verify the change.
  • Manually verified the change by running a job locally.

Committer checklist

  • Has a corresponding JIRA in PR title & commit

  • Commit message is descriptive of the change

  • CI is green

  • Necessary doc changes done or have another open PR

  • For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.

@yihua yihua self-assigned this Mar 24, 2022
@hudi-bot
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

Contributor Author

zhangyue19921010 commented Mar 25, 2022

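This patch has been tested locally, and CI passed. I believe it's ready for review :) cc @yihua
Without this patch, HoodieMetadataTableValidator may fail when a new partition is being created but its commit has not completed yet: the partition already shows up in the file system listing but not in the metadata table listing.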

3801 [main] INFO  org.apache.hudi.common.table.log.HoodieMergedLogRecordScanner  - Size of file spilled to disk => 0
3802 [main] INFO  org.apache.hudi.metadata.HoodieBackedTableMetadata  - Opened 3 metadata log files (dataset instant=20220323111740608, metadata instant=20220323111740608) in 469 ms
3819 [main] INFO  org.apache.hudi.metadata.BaseTableMetadata  - Listed partitions from metadata: #partitions=1
3819 [main] ERROR org.apache.hudi.utilities.HoodieMetadataTableValidator  - Compare Partitions Failed! AllPartitionPathsFromFS : [20210623/0/20210623, 20210623/0/20210825] and allPartitionPathsMeta : [20210623/0/20210825]
3820 [main] ERROR org.apache.hudi.utilities.HoodieMetadataTableValidator  - Metadata table validation failed to HoodieValidationException
org.apache.hudi.exception.HoodieValidationException: Compare Partitions Failed! AllPartitionPathsFromFS : [20210623/0/20210623, 20210623/0/20210825] and allPartitionPathsMeta : [20210623/0/20210825]
     at org.apache.hudi.utilities.HoodieMetadataTableValidator.validatePartitions(HoodieMetadataTableValidator.java:435)
     at org.apache.hudi.utilities.HoodieMetadataTableValidator.doMetadataTableValidation(HoodieMetadataTableValidator.java:388)
     at org.apache.hudi.utilities.HoodieMetadataTableValidator.doHoodieMetadataTableValidationOnce(HoodieMetadataTableValidator.java:336)
     at org.apache.hudi.utilities.HoodieMetadataTableValidator.run(HoodieMetadataTableValidator.java:322)
     at MetaDataTable.HoodieMetadataTableValidatorTest.run(HoodieMetadataTableValidatorTest.java:41)
     at MetaDataTable.HoodieMetadataTableValidatorTest.main(HoodieMetadataTableValidatorTest.java:11)
Exception in thread "main" org.apache.hudi.exception.HoodieException: Unable to do hoodie metadata table validation in file:///Users/yuezhang/tmp/hudiAfTable/forecast_agg
     at org.apache.hudi.utilities.HoodieMetadataTableValidator.run(HoodieMetadataTableValidator.java:325)
     at MetaDataTable.HoodieMetadataTableValidatorTest.run(HoodieMetadataTableValidatorTest.java:41)
     at MetaDataTable.HoodieMetadataTableValidatorTest.main(HoodieMetadataTableValidatorTest.java:11)
Caused by: org.apache.hudi.exception.HoodieValidationException: Compare Partitions Failed! AllPartitionPathsFromFS : [20210623/0/20210623, 20210623/0/20210825] and allPartitionPathsMeta : [20210623/0/20210825]
     at org.apache.hudi.utilities.HoodieMetadataTableValidator.validatePartitions(HoodieMetadataTableValidator.java:435)
     at org.apache.hudi.utilities.HoodieMetadataTableValidator.doMetadataTableValidation(HoodieMetadataTableValidator.java:388)
     at org.apache.hudi.utilities.HoodieMetadataTableValidator.doHoodieMetadataTableValidationOnce(HoodieMetadataTableValidator.java:336)
     at org.apache.hudi.utilities.HoodieMetadataTableValidator.run(HoodieMetadataTableValidator.java:322)
     ... 2 more
3824 [Thread-1] INFO  org.apache.spark.SparkContext  - Invoking stop() from shutdown hook
3832 [Thread-1] INFO  org.spark_project.jetty.server.AbstractConnector  - Stopped Spark@7e70bd39{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
3834 [Thread-1] INFO  org.apache.spark.ui.SparkUI  - Stopped Spark web UI at http://localhost:4040
3844 [dispatcher-event-loop-11] INFO  org.apache.spark.MapOutputTrackerMasterEndpoint  - MapOutputTrackerMasterEndpoint stopped!
3856 [Thread-1] INFO  org.apache.spark.storage.memory.MemoryStore  - MemoryStore cleared
3856 [Thread-1] INFO  org.apache.spark.storage.BlockManager  - BlockManager stopped
3858 [Thread-1] INFO  org.apache.spark.storage.BlockManagerMaster  - BlockManagerMaster stopped
3860 [dispatcher-event-loop-3] INFO  org.apache.spark.scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint  - OutputCommitCoordinator stopped!
3869 [Thread-1] INFO  org.apache.spark.SparkContext  - Successfully stopped SparkContext
3869 [Thread-1] INFO  org.apache.spark.util.ShutdownHookManager  - Shutdown hook called
3870 [Thread-1] INFO  org.apache.spark.util.ShutdownHookManager  - Deleting directory /private/var/folders/61/77xdhf3x0x9g3t_vdd1c9_nwr4wznp/T/spark-2b6036a2-c3d4-461b-8ec0-334aa6ea68e8
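
To make the mismatch in the log above concrete, here is a minimal sketch in plain Java of one way to compare the two listings without tripping over an in-flight partition. It is an illustration under stated assumptions, not the code from this patch: the class `PartitionListingSketch`, the method `keepCommitted`, and the idea of passing in a map from each partition to the instant that created it are all hypothetical, and no Hudi API is called. The partition paths come from the log; the instant times paired with them are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class PartitionListingSketch {

  /**
   * Returns the partitions whose creating instant is in the completed set,
   * preserving the iteration order of the input map.
   * Both arguments are hypothetical inputs; how they are obtained is out of scope here.
   */
  public static List<String> keepCommitted(Map<String, String> partitionToCreatingInstant,
                                           Set<String> completedInstants) {
    List<String> committed = new ArrayList<>();
    for (Map.Entry<String, String> entry : partitionToCreatingInstant.entrySet()) {
      if (completedInstants.contains(entry.getValue())) {
        committed.add(entry.getKey());
      }
    }
    return committed;
  }

  public static void main(String[] args) {
    // Partition paths are from the log; the instants assigned to them are assumptions.
    Map<String, String> fromFs = new LinkedHashMap<>();
    fromFs.put("20210623/0/20210623", "20220323111900000"); // hypothetical in-flight instant
    fromFs.put("20210623/0/20210825", "20220323111740608"); // treated here as completed

    Set<String> completed = new HashSet<>(Arrays.asList("20220323111740608"));
    List<String> fromMetadata = Arrays.asList("20210623/0/20210825");

    // Comparing the raw file system listing reproduces the mismatch.
    boolean rawMatches = new ArrayList<>(fromFs.keySet()).equals(fromMetadata);
    // Dropping partitions created by uncommitted instants makes the listings comparable.
    boolean filteredMatches = keepCommitted(fromFs, completed).equals(fromMetadata);

    System.out.println("raw FS listing matches metadata: " + rawMatches);
    System.out.println("filtered FS listing matches metadata: " + filteredMatches);
  }
}
```

In this sketch the raw comparison reports a mismatch and the filtered one matches, which is the behavior one would want from the validator while a write to a new partition is still in progress.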

@nsivabalan nsivabalan merged commit 2b60641 into apache:master Mar 30, 2022
vingov pushed a commit to vingov/hudi that referenced this pull request Apr 3, 2022
…tition path listing (apache#5100)

Co-authored-by: yuezhang <yuezhang@freewheel.tv>

4 participants