"Manifest is missing" ValidationException when concurrent applications rewrite manifests #3466
Comments
Same issue with Iceberg.
We were just discussing this today. The message is a bit confusing, but it is the correct response. I think we should probably change it to something like "Cannot apply RewriteManifests result since manifests being replaced have already been removed in the current snapshot", because what is happening is actually the same as another validation error we raise during the commit phase. For example, say we run two optimize metadata operations simultaneously
If 1 finishes first the state of our table is
If we try to apply Optimize 2, we see that [A and B] are already deleted and not present. If we added [C''], our table would look like
And end up with duplicate records! So the correct response is to cancel the second optimize operation.
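To make the scenario above concrete, here is a minimal Python sketch of the check. It is a toy model, not Iceberg's implementation: the function names are hypothetical, and it assumes both optimize operations planned against a table holding manifests A and B, with Optimize 1 producing C' and Optimize 2 producing C''.

```python
class ManifestMissingError(Exception):
    """Toy stand-in for the ValidationException discussed in this thread."""


def apply_rewrite(current_manifests, replaced, added):
    """Apply a rewrite only if every manifest it replaces is still present."""
    missing = [m for m in replaced if m not in current_manifests]
    if missing:
        # The rewrite was planned against a snapshot that is now out of date.
        raise ManifestMissingError(f"Manifest is missing: {missing[0]}")
    kept = [m for m in current_manifests if m not in replaced]
    return kept + list(added)


def apply_rewrite_unchecked(current_manifests, replaced, added):
    """Same as above, but without validation (shows why the check exists)."""
    kept = [m for m in current_manifests if m not in replaced]
    return kept + list(added)


table = ["A", "B"]

# Optimize 1 commits first: A and B are replaced by C'.
after_first = apply_rewrite(table, ["A", "B"], ["C'"])

# Optimize 2 was planned against [A, B], so validation rejects it.
try:
    apply_rewrite(after_first, ["A", "B"], ["C''"])
    second_rejected = False
except ManifestMissingError:
    second_rejected = True

# Without validation, C' and C'' would both remain, each covering the same
# data files, which is the duplicate-records outcome described above.
unchecked = apply_rewrite_unchecked(after_first, ["A", "B"], ["C''"])
```

In this model the second rewrite can only succeed if it is planned again against the table state that already contains C'.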
Adding the label Beginner if anyone wants to take a crack at improving the error message.
I think putting the current-snapshot-id into the error message would make it easier for the user to figure out what happened.
… improve as suggestion from apache#4606 (comment)
@RussellSpitzer We have a Flink stream writing data to an Iceberg table (the table is set to 'commit.manifest-merge.enabled' = 'false'). When I execute a rewrite manifests task in Spark, this error comes up quite often, causing the task to fail. Does it mean that 'commit.manifest-merge.enabled' is not taking effect?
I have the same problem, which happens occasionally. Could someone help me analyze it? Thank you.
As I wrote above, the issue is that the rewrite command becomes out of date while running, so it fails. At least, that is my hypothesis.
This problem occurs intermittently. I think other people have also encountered it. How should we fix or work around it? Thank you.
Would using the DynamoDB lock table help at all with this issue? We have a Spark Streaming job that commits every 3 minutes and a compaction job that runs hourly (rewrite manifests, expire snapshots, rewrite data files) due to an extremely high volume of data ingest. Past a certain point, compaction began failing every time due to the streaming job updating the latest manifest before compaction could complete. I think I'm going to try the edit: ^ disabling merge made matters a bit worse, would not recommend. We're going to try calling rewriteManifests less often and wrap it in some application level retries for now.
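The application-level retry mentioned above could look roughly like the following Python sketch. All helper names and defaults here are hypothetical, and matching on the "Manifest is missing" text is an assumption based on the error message quoted in this issue. The operation must re-plan the rewrite from the current table state on every attempt, since re-committing a stale plan would fail the same validation.

```python
import time


def is_manifest_conflict(exc):
    """Assumption: the conflict can be recognized by its message text."""
    return "Manifest is missing" in str(exc)


def run_with_retries(operation, is_retryable, max_attempts=3,
                     base_delay_s=1.0, sleep=time.sleep):
    """Call operation(), retrying retryable failures with exponential backoff.

    operation must redo the whole rewrite (planning included) on each call.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Give up on the last attempt or on errors we do not recognize.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            sleep(base_delay_s * 2 ** (attempt - 1))


# Hypothetical usage with an existing SparkSession named `spark`:
# run_with_retries(
#     lambda: spark.sql(
#         "CALL spark_catalog.system.rewrite_manifests(table => 'dwm.tableA')"
#     ),
#     is_manifest_conflict,
# )
```

Retrying only reduces the failure rate; if writers commit faster than a rewrite can finish, every attempt can still lose the race, so running the rewrite less often or in a quieter window may still be needed.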
I got the same issue.
I got the same issue. My Iceberg maintenance order is expireSnapshots -> rewriteDataFiles -> rewriteManifests -> deleteOrphanFiles (older than 20 minutes). The image shows the rewriteManifests error, but after rewriteDataFiles and before rewriteManifests I fetched the currentSnapshot and printed the manifest files.
Is this issue still relevant? I saw that a PR is open, but I'm not sure whether anybody is working on it. If not, I would like to give it a try. @RussellSpitzer
This error still occurs occasionally. Thanks.
Thanks @372242283. I know this is still an issue, but I saw a PR and was wondering whether it was abandoned. If so, I would like to work on fixing this issue.
We concurrently run many sql-shell applications to overwrite different days' data in an Iceberg table, and every application ends with "CALL spark_catalog.system.rewrite_manifests(table => 'dwm.tableA', use_caching => false)". The applications rewrite tableA's manifests concurrently and throw ValidationException("Manifest is missing: xxx") in the validateDeletedManifests method.
diagnostics: User class threw exception: org.apache.spark.SparkException: org.apache.iceberg.exceptions.ValidationException: Manifest is missing: oss://xgimi-data/apps/spark/warehouse/ods.db/screen_event_log_hi/metadata/52c1e98f-02a5-4ce3-ae05-d7382a6a68c2-m2.avro
    at org.apache.iceberg.BaseRewriteManifests.lambda$validateDeletedManifests$7(BaseRewriteManifests.java:261)
    at java.util.Optional.ifPresent(Optional.java:159)
    at org.apache.iceberg.BaseRewriteManifests.validateDeletedManifests(BaseRewriteManifests.java:260)
    at org.apache.iceberg.BaseRewriteManifests.apply(BaseRewriteManifests.java:169)
    at org.apache.iceberg.SnapshotProducer.apply(SnapshotProducer.java:163)
    at org.apache.iceberg.BaseRewriteManifests.apply(BaseRewriteManifests.java:53)
    at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:276)
    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:404)
    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:213)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:197)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:189)
    at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:275)
    at org.apache.iceberg.BaseRewriteManifests.commit(BaseRewriteManifests.java:53)
    at org.apache.iceberg.actions.BaseSnapshotUpdateAction.commit(BaseSnapshotUpdateAction.java:40)
    at org.apache.iceberg.actions.RewriteManifestsAction.replaceManifests(RewriteManifestsAction.java:309)
    at org.apache.iceberg.actions.RewriteManifestsAction.execute(RewriteManifestsAction.java:196)