HADOOP-18242. ABFS Rename Failure when tracking metadata is in an incomplete state #4517

Merged 2 commits on Jul 1, 2022

@mehakmeet mehakmeet commented Jun 30, 2022

ABFS rename fails intermittently when the Storage-blob tracking
metadata is in an incomplete state. This surfaces as the error code
404 and an error message of "RenameDestinationParentPathNotFound".

To mitigate this issue, when a request fails with this response,
the ABFS client issues a HEAD call on the source file
and then retries the rename operation.

ABFS filesystem statistics track when this occurs with new counters:
  rename_recovery
  metadata_incomplete_rename_failures
  rename_path_attempts

This is a very rare occurrence and appears to be triggered under certain
heavy load conditions, just as with HADOOP-18163.

Contributed by Mehakmeet Singh.
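
The recovery flow described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the patch: `Store`, `StoreException`, `FailOnceStore` and `renameWithRecovery` are hypothetical names invented for the example, and the exact points at which the three counters are incremented are assumptions. Only the counter names, the 404 status and the "RenameDestinationParentPathNotFound" error code come from the description.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative sketch only; none of these types are the real ABFS client classes. */
class RenameRecoverySketch {

  /** Hypothetical minimal view of the store operations the recovery needs. */
  interface Store {
    void rename(String source, String destination) throws IOException;
    void head(String path) throws IOException;
  }

  /** Hypothetical exception carrying the HTTP status and service error code. */
  static class StoreException extends IOException {
    final int status;
    final String errorCode;

    StoreException(int status, String errorCode) {
      super(status + " " + errorCode);
      this.status = status;
      this.errorCode = errorCode;
    }
  }

  // Counter names are from the PR description; when each is bumped is an assumption.
  static final String RENAME_PATH_ATTEMPTS = "rename_path_attempts";
  static final String METADATA_INCOMPLETE_RENAME_FAILURES =
      "metadata_incomplete_rename_failures";
  static final String RENAME_RECOVERY = "rename_recovery";
  static final String PARENT_NOT_FOUND = "RenameDestinationParentPathNotFound";

  final Map<String, Long> counters = new TreeMap<>();

  private void increment(String name) {
    counters.merge(name, 1L, Long::sum);
  }

  /** Rename once; on the specific 404, HEAD the source and retry a single time. */
  void renameWithRecovery(Store store, String source, String destination)
      throws IOException {
    increment(RENAME_PATH_ATTEMPTS);
    try {
      store.rename(source, destination);
    } catch (StoreException e) {
      boolean metadataIncomplete =
          e.status == 404 && PARENT_NOT_FOUND.equals(e.errorCode);
      if (!metadataIncomplete) {
        throw e; // any other failure is not handled by this recovery path
      }
      increment(METADATA_INCOMPLETE_RENAME_FAILURES);
      store.head(source); // HEAD on the source file, as the description states
      increment(RENAME_PATH_ATTEMPTS);
      store.rename(source, destination); // a second failure propagates to the caller
      increment(RENAME_RECOVERY);
    }
  }

  /** Test double whose first rename fails with the 404 described above. */
  static class FailOnceStore implements Store {
    int renameCalls;
    int headCalls;

    @Override
    public void rename(String source, String destination) throws IOException {
      renameCalls++;
      if (renameCalls == 1) {
        throw new StoreException(404, PARENT_NOT_FOUND);
      }
    }

    @Override
    public void head(String path) {
      headCalls++;
    }
  }

  /** Runs one recovered rename against the test double and returns the counters. */
  static Map<String, Long> demo() {
    RenameRecoverySketch sketch = new RenameRecoverySketch();
    try {
      sketch.renameWithRecovery(new FailOnceStore(), "/src/file", "/dst/file");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sketch.counters;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

The sketch retries only once, so a rename that fails again after the HEAD call still surfaces to the caller instead of looping under sustained load.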

@mehakmeet

Testing:
Region: US-West-2
mvn -Dparallel-tests=abfs -DtestsThreadCount=8 -Dscale clean verify
All tests ran fine.

@hadoop-yetus

💔 -1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 6m 56s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. |
| | | | | _ branch-3.3 Compile Tests _ |
| -1 ❌ | mvninstall | 1m 2s | /branch-mvninstall-root.txt | root in branch-3.3 failed. |
| +1 💚 | compile | 5m 33s | | branch-3.3 passed |
| +1 💚 | checkstyle | 0m 33s | | branch-3.3 passed |
| +1 💚 | mvnsite | 1m 13s | | branch-3.3 passed |
| +1 💚 | javadoc | 0m 44s | | branch-3.3 passed |
| +1 💚 | spotbugs | 1m 14s | | branch-3.3 passed |
| -1 ❌ | shadedclient | 1m 55s | | branch has errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 49s | | the patch passed |
| +1 💚 | compile | 0m 36s | | the patch passed |
| +1 💚 | javac | 0m 36s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 26s | | the patch passed |
| +1 💚 | mvnsite | 0m 40s | | the patch passed |
| +1 💚 | javadoc | 0m 30s | | the patch passed |
| +1 💚 | spotbugs | 1m 12s | | the patch passed |
| -1 ❌ | shadedclient | 1m 54s | | patch has errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| -1 ❌ | unit | 0m 21s | /patch-unit-hadoop-tools_hadoop-azure.txt | hadoop-azure in the patch failed. |
| +0 🆗 | asflicense | 0m 23s | | ASF License check generated no output? |
| | | 26m 51s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4517/1/artifact/out/Dockerfile |
| GITHUB PR | #4517 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux 03ea3ff8121e 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 7587a37 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4517/1/testReport/ |
| Max. process+thread count | 88 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4517/1/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@mehakmeet

This looks like a version mismatch in the pom.xml of hadoop-benchmark after the vectored-IO backport, though I'm not entirely sure.

[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]   
[ERROR]   The project org.apache.hadoop:hadoop-benchmark:3.4.0-SNAPSHOT (/home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-4517/src/hadoop-tools/hadoop-benchmark/pom.xml) has 1 error
[ERROR]     Non-resolvable parent POM for org.apache.hadoop:hadoop-benchmark:3.4.0-SNAPSHOT: Could not find artifact org.apache.hadoop:hadoop-project:pom:3.4.0-SNAPSHOT and 'parent.relativePath' points at wrong local POM @ line 22, column 11 -> [Help 2]

CC: @mukund-thakur @steveloughran

@mukund-thakur

Yeah, I saw this too, later, while trying to backport internally. But how come the local branch-3.3 build is successful?

@mukund-thakur

Can you try now? Just add a dummy commit with a one-line addition somewhere; it should trigger Yetus.
I think it should be fixed by now: https://issues.apache.org/jira/browse/HADOOP-18322

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 7m 22s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 1s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 4 new or modified test files. |
| | | | | _ branch-3.3 Compile Tests _ |
| +1 💚 | mvninstall | 38m 7s | | branch-3.3 passed |
| +1 💚 | compile | 0m 51s | | branch-3.3 passed |
| +1 💚 | checkstyle | 0m 46s | | branch-3.3 passed |
| +1 💚 | mvnsite | 0m 55s | | branch-3.3 passed |
| +1 💚 | javadoc | 0m 56s | | branch-3.3 passed |
| +1 💚 | spotbugs | 1m 33s | | branch-3.3 passed |
| +1 💚 | shadedclient | 27m 19s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +1 💚 | mvninstall | 0m 44s | | the patch passed |
| +1 💚 | compile | 0m 36s | | the patch passed |
| +1 💚 | javac | 0m 36s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 23s | | the patch passed |
| +1 💚 | mvnsite | 0m 44s | | the patch passed |
| +1 💚 | javadoc | 0m 28s | | the patch passed |
| +1 💚 | spotbugs | 1m 14s | | the patch passed |
| +1 💚 | shadedclient | 27m 37s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 2m 31s | | hadoop-azure in the patch passed. |
| +1 💚 | asflicense | 0m 49s | | The patch does not generate ASF License warnings. |
| | | 113m 42s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4517/2/artifact/out/Dockerfile |
| GITHUB PR | #4517 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
| uname | Linux ac47305a59d1 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 1ecb949 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4517/2/testReport/ |
| Max. process+thread count | 552 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4517/2/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |

This message was automatically generated.

@mukund-thakur mukund-thakur left a comment

LGTM +1

@mukund-thakur mukund-thakur merged commit 90b1e73 into apache:branch-3.3 Jul 1, 2022