HADOOP-16729 out of band deletes #952

Closed

Conversation

steveloughran
Contributor

@steveloughran steveloughran commented Jun 12, 2019

This is Gabor's #802 PR rebased to trunk and with some extra changes on top

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 0 Docker mode activated.
-1 patch 12 #952 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help.
Subsystem Report/Notes
GITHUB PR #952
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-952/1/console
versions git=2.7.4
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

Gabor Bota and others added 9 commits June 12, 2019 17:11
* Tests for being in sync / not being in sync now compare etags and version IDs. This highlights that we aren't always getting back version IDs.
* Parameterized OOB test now includes the auth flag in the method name, for ease of debugging
* minor: formatting

Change-Id: I7ea062d6996c9ca00d036347c310e5d2e0fa60fe
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Comment
0 reexec 31 Docker mode activated.
_ Prechecks _
+1 dupname 1 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 10 new or modified test files.
_ trunk Compile Tests _
0 mvndep 45 Maven dependency ordering for branch
+1 mvninstall 1094 trunk passed
+1 compile 1134 trunk passed
+1 checkstyle 142 trunk passed
+1 mvnsite 120 trunk passed
+1 shadedclient 975 branch has no errors when building and testing our client artifacts.
+1 javadoc 87 trunk passed
0 spotbugs 64 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 187 trunk passed
_ Patch Compile Tests _
0 mvndep 24 Maven dependency ordering for patch
+1 mvninstall 78 the patch passed
+1 compile 1064 the patch passed
+1 javac 1064 the patch passed
-0 checkstyle 141 root: The patch generated 5 new + 48 unchanged - 1 fixed = 53 total (was 49)
+1 mvnsite 109 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 xml 1 The patch has no ill-formed XML file.
+1 shadedclient 646 patch has no errors when building and testing our client artifacts.
+1 javadoc 92 the patch passed
+1 findbugs 209 the patch passed
_ Other Tests _
+1 unit 566 hadoop-common in the patch passed.
+1 unit 296 hadoop-aws in the patch passed.
+1 asflicense 40 The patch does not generate ASF License warnings.
7080
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-952/2/artifact/out/Dockerfile
GITHUB PR #952
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle
uname Linux 96e36e1f0bd3 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 3b31694
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-952/2/artifact/out/diff-checkstyle-root.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-952/2/testReport/
Max. process+thread count 1598 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-952/2/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

Incorporates HADOOP-16368 "S3A list operation doesn't pick up etags from results" so that the tests can use that to validate consistency rather than just timestamps.

Similarly, some changes in ITestS3GuardOutOfBandOperations were added while trying to debug the problem.

Change-Id: Id809886841442a8cc42bff8f7046ade69b94e013
Change-Id: Ic0a0710d2092c2e60eabbb7bc140fac3a1545297
@steveloughran
Contributor Author

Tested: S3 Ireland (versioned) with s3guard

One failure; the usual intermittent ITestS3AContractGetFileStatusV1List.

ITestS3AContractGetFileStatusV1List>AbstractContractGetFileStatusTest.testListLocatedStatusEmptyDirectory:132->Assert.assertEquals:645->Assert.failNotEquals:834->Assert.fail:88 listLocatedStatus(test dir): directory count in 4 directories and 0 files expected:<1> but was:<4>
[INFO] 

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Comment
0 reexec 82 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 10 new or modified test files.
_ trunk Compile Tests _
0 mvndep 77 Maven dependency ordering for branch
+1 mvninstall 1147 trunk passed
+1 compile 1020 trunk passed
+1 checkstyle 144 trunk passed
+1 mvnsite 126 trunk passed
+1 shadedclient 1024 branch has no errors when building and testing our client artifacts.
+1 javadoc 93 trunk passed
0 spotbugs 68 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 193 trunk passed
_ Patch Compile Tests _
0 mvndep 22 Maven dependency ordering for patch
+1 mvninstall 81 the patch passed
+1 compile 950 the patch passed
+1 javac 950 the patch passed
-0 checkstyle 141 root: The patch generated 5 new + 50 unchanged - 1 fixed = 55 total (was 51)
+1 mvnsite 118 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 xml 1 The patch has no ill-formed XML file.
+1 shadedclient 731 patch has no errors when building and testing our client artifacts.
+1 javadoc 92 the patch passed
+1 findbugs 194 the patch passed
_ Other Tests _
+1 unit 530 hadoop-common in the patch passed.
+1 unit 288 hadoop-aws in the patch passed.
+1 asflicense 53 The patch does not generate ASF License warnings.
7101
Subsystem Report/Notes
Docker Client=18.09.5 Server=18.09.5 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-952/3/artifact/out/Dockerfile
GITHUB PR #952
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle
uname Linux 6aacccfc0fb4 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / cf84881
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-952/3/artifact/out/diff-checkstyle-root.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-952/3/testReport/
Max. process+thread count 1375 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-952/3/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 31 Docker mode activated.
_ Prechecks _
+1 dupname 1 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 10 new or modified test files.
_ trunk Compile Tests _
0 mvndep 24 Maven dependency ordering for branch
+1 mvninstall 1031 trunk passed
+1 compile 1027 trunk passed
+1 checkstyle 138 trunk passed
+1 mvnsite 130 trunk passed
+1 shadedclient 1007 branch has no errors when building and testing our client artifacts.
+1 javadoc 90 trunk passed
0 spotbugs 61 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 176 trunk passed
_ Patch Compile Tests _
0 mvndep 24 Maven dependency ordering for patch
+1 mvninstall 76 the patch passed
+1 compile 978 the patch passed
+1 javac 978 the patch passed
-0 checkstyle 141 root: The patch generated 2 new + 50 unchanged - 1 fixed = 52 total (was 51)
+1 mvnsite 125 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 xml 2 The patch has no ill-formed XML file.
+1 shadedclient 678 patch has no errors when building and testing our client artifacts.
+1 javadoc 105 the patch passed
+1 findbugs 205 the patch passed
_ Other Tests _
-1 unit 501 hadoop-common in the patch failed.
+1 unit 282 hadoop-aws in the patch passed.
+1 asflicense 41 The patch does not generate ASF License warnings.
6813
Reason Tests
Failed junit tests hadoop.util.TestDiskCheckerWithDiskIo
hadoop.util.TestReadWriteDiskValidator
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-952/4/artifact/out/Dockerfile
GITHUB PR #952
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle
uname Linux 3ed003f1e647 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / cf84881
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-952/4/artifact/out/diff-checkstyle-root.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-952/4/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-952/4/testReport/
Max. process+thread count 1345 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-952/4/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor Author

I have just pushed up a PR with changes. If I didn't need this in so that I could base my own PR atop it, I'd be seriously considering saying "Use Java 8 time over millis, as it guarantees that there won't be any bits of the code which assume it is seconds".
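A minimal sketch of the java.time point, assuming a TTL check of roughly this shape. `TtlCheck` and `isExpired` are hypothetical names, not code from this patch; the idea is only that `Duration` and `Instant` carry their units with them, so no caller can pass seconds where milliseconds are expected.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper, not part of the patch: expiry checks with java.time types. */
final class TtlCheck {
  private TtlCheck() {
  }

  /**
   * True if an entry last updated at {@code lastUpdated} has outlived
   * {@code ttl} as of {@code now}.
   */
  static boolean isExpired(Instant lastUpdated, Duration ttl, Instant now) {
    return lastUpdated.plus(ttl).isBefore(now);
  }

  public static void main(String[] args) {
    Instant updated = Instant.parse("2019-06-12T17:00:00Z");
    Duration ttl = Duration.ofMinutes(15);
    // 10 minutes after the update: still inside the TTL.
    System.out.println(isExpired(updated, ttl, updated.plus(Duration.ofMinutes(10))));
    // 20 minutes after the update: the TTL has passed.
    System.out.println(isExpired(updated, ttl, updated.plus(Duration.ofMinutes(20))));
  }
}
```

Passing `now` in explicitly, rather than reading the clock inside the method, also keeps such a check easy to test.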

latter is awful about assessing the value of all enumerated moves & countermoves.

In this instance,

S3Guard.addAncestors()

The walk up the tree uses an isDeleted() check. Should that include TTL probes?

Note: I'm not going to do that now, because I've pushed that work further into DDB itself (needed to deal with scale issues); if changes are needed then they should be based off that patch. Please review that code and suggest the next action
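To make the open question concrete, here is a sketch of an ancestor walk with an optional TTL probe. Everything in it is hypothetical: `AncestorWalk`, `Entry`, `needsEntry` and the string paths are a toy model, not S3Guard classes, and the `checkTtl` flag exists only to show both possible answers side by side.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the question; none of these types are S3Guard classes. */
final class AncestorWalk {
  /** A metadata entry: a tombstone flag plus the time it was last updated. */
  static final class Entry {
    final boolean deleted;
    final Instant lastUpdated;

    Entry(boolean deleted, Instant lastUpdated) {
      this.deleted = deleted;
      this.lastUpdated = lastUpdated;
    }
  }

  private AncestorWalk() {
  }

  /** Parent of an absolute path such as "/a/b"; null for the root "/". */
  static String parent(String path) {
    if (path.equals("/")) {
      return null;
    }
    int slash = path.lastIndexOf('/');
    return slash == 0 ? "/" : path.substring(0, slash);
  }

  /** An ancestor needs a new entry if it is missing, tombstoned, or (optionally) expired. */
  static boolean needsEntry(Entry e, boolean checkTtl, Duration ttl, Instant now) {
    if (e == null || e.deleted) {
      return true;
    }
    return checkTtl && e.lastUpdated.plus(ttl).isBefore(now);
  }

  /** Walk up from {@code path}, collecting ancestors until the first usable one. */
  static List<String> ancestorsToAdd(Map<String, Entry> store, String path,
      boolean checkTtl, Duration ttl, Instant now) {
    List<String> toAdd = new ArrayList<>();
    for (String p = parent(path); p != null && !p.equals("/"); p = parent(p)) {
      if (!needsEntry(store.get(p), checkTtl, ttl, now)) {
        break; // first usable ancestor: stop walking
      }
      toAdd.add(p);
    }
    return toAdd;
  }
}
```

In this model the two settings differ only for a live (non-tombstone) ancestor whose entry has outlived the TTL: without the probe the walk stops there, with the probe the walk re-adds it and continues upward.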

Imports

Keep the ordering of imports as we expect:

java.*
---
javax.*
---
non-org.apache
---
org.apache
---
static.*

and within each group, in order. This is critical for avoiding merge conflicts. If a class already has inconsistent ordering, don't worry, but don't make it worse. Always check the imports in reviews for this reason.
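As an illustration of that layout, a small self-contained class whose import block follows the grouping. The class itself (`ImportOrderExample`) is hypothetical, and `org.w3c.dom.Document` stands in for a third-party, non-org.apache import so that the example compiles with only the JDK; the org.apache group is shown as a comment for the same reason.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;

// org.apache.* imports would form their own group here.

import static java.util.Objects.requireNonNull;

/** Hypothetical class; only the layout of the import block matters. */
final class ImportOrderExample {
  private ImportOrderExample() {
  }

  /** Parse a small XML string and return the name of its root element. */
  static String rootElementName(String xml) {
    requireNonNull(xml, "xml");
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      return doc.getDocumentElement().getNodeName();
    } catch (Exception e) {
      throw new IllegalArgumentException("not well-formed XML", e);
    }
  }
}
```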

Tests

I got a failure in a test run in teardown of ITestDynamoDBMetadataStore; happens if the FS didn't get created. Root cause shows up on some of the other test cases: java.io.FileNotFoundException: DynamoDB table 's3guard-stevel-testing' is being deleted in region eu-west-1 at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.initTable(DynamoDBMetadataStore.java:1293)

	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)
[ERROR] testDeleteSubtreeHostPath(org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStore)  Time elapsed: 0.206 s  <<< ERROR! java.lang.NullPointerException
	at org.apache.hadoop.fs.s3a.s3guard.MetadataStoreTestBase.strToPath(MetadataStoreTestBase.java:1035)
	at org.apache.hadoop.fs.s3a.s3guard.ITestDynamoDBMetadataStore.tearDown(ITestDynamoDBMetadataStore.java:219)
	at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)

It's not caused by this PR and is addressed in my rename PR, which doesn't do that cleanup unless fs != null.
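A sketch of that kind of guard, assuming a fixture whose setup can fail before the filesystem is created. `GuardedTeardown` and `FakeFs` are hypothetical stand-ins, not the actual ITestDynamoDBMetadataStore code, and the setup failure is modelled with an unchecked exception to keep the example short.

```java
/** Hypothetical test fixture; not the actual ITestDynamoDBMetadataStore code. */
final class GuardedTeardown {
  /** Stand-in for the filesystem a test would create in setup. */
  static final class FakeFs {
    boolean cleaned;

    void deleteTestData() {
      cleaned = true;
    }
  }

  private FakeFs fs; // stays null if setup failed before creating it

  /** Create the filesystem, or fail the way a missing table would fail setup. */
  void setUp(boolean tableAvailable) {
    if (!tableAvailable) {
      throw new IllegalStateException("table is being deleted");
    }
    fs = new FakeFs();
  }

  /**
   * Clean up only if setup got far enough to create the filesystem.
   * Returns true if cleanup ran.
   */
  boolean tearDown() {
    if (fs == null) {
      return false; // nothing to clean; avoids an NPE masking the root cause
    }
    fs.deleteTestData();
    fs = null;
    return true;
  }
}
```

With the guard in place, a failed setup is reported as the setup failure itself rather than as a secondary NullPointerException from teardown.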

Also (I believe) unrelated: the Magic Committer ITest is playing up, and as the logs of the AM don't seem to be saved, I can't quite debug it.

[ERROR] testMRJob(org.apache.hadoop.fs.s3a.commit.magic.ITestMagicCommitMRJob)  Time elapsed: 458.528 s  <<< FAILURE! java.lang.AssertionError: No cleanup: unexpectedly found s3a://hwdev-steve-ireland-new/fork-0004/test/testMRJob/__magic as  S3AFileStatus{path=s3a://hwdev-steve-ireland-new/fork-0004/test/testMRJob/__magic; isDirectory=true; modification_time=0; access_time=0; owner=stevel; group=stevel; permission=rwxrwxrwx; isSymlink=false; hasAcl=false; isEncrypted=true; isErasureCoded=false} isEmptyDirectory=UNKNOWN eTag=null versionId=null
	at org.junit.Assert.fail(Assert.java:88)
	at org.apache.hadoop.fs.contract.ContractTestUtils.assertPathDoesNotExist(ContractTestUtils.java:977)
	at org.apache.hadoop.fs.contract.AbstractFSContractTestBase.assertPathDoesNotExist(AbstractFSContractTestBase.java:305)
	at org.apache.hadoop.fs.s3a.commit.magic.ITestMagicCommitMRJob.customPostExecutionValidation(ITestMagicCommitMRJob.java:96)
	at org.apache.hadoop.fs.s3a.commit.AbstractITCommitMRJob.testMRJob(AbstractITCommitMRJob.java:162)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)

Could this be related? Well, iff the path was still in S3 and the tombstone hadn't expired, yes. But I doubt that. Something I will worry about myself. I'll add some more assertions...

Plus that v1 listing failure again.

Next Steps

Gabor - pull this test down and do the scale test runs with all the options (auth, nonauth, local) and see how well it goes. If we don't see problems, or we believe they are transient unrelated issues, then I'll vote on this later on today.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 35 Docker mode activated.
_ Prechecks _
+1 dupname 1 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 10 new or modified test files.
_ trunk Compile Tests _
0 mvndep 66 Maven dependency ordering for branch
+1 mvninstall 1134 trunk passed
+1 compile 1134 trunk passed
+1 checkstyle 135 trunk passed
+1 mvnsite 123 trunk passed
+1 shadedclient 956 branch has no errors when building and testing our client artifacts.
+1 javadoc 89 trunk passed
0 spotbugs 62 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 183 trunk passed
_ Patch Compile Tests _
0 mvndep 23 Maven dependency ordering for patch
+1 mvninstall 83 the patch passed
+1 compile 1080 the patch passed
+1 javac 1080 the patch passed
-0 checkstyle 152 root: The patch generated 4 new + 50 unchanged - 1 fixed = 54 total (was 51)
+1 mvnsite 119 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 xml 2 The patch has no ill-formed XML file.
+1 shadedclient 634 patch has no errors when building and testing our client artifacts.
-1 javadoc 29 hadoop-tools_hadoop-aws generated 3 new + 1 unchanged - 0 fixed = 4 total (was 1)
-1 findbugs 122 hadoop-common in the patch failed.
-1 findbugs 15 hadoop-aws in the patch failed.
_ Other Tests _
-1 unit 30 hadoop-common in the patch failed.
-1 unit 30 hadoop-aws in the patch failed.
+1 asflicense 45 The patch does not generate ASF License warnings.
6263
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-952/5/artifact/out/Dockerfile
GITHUB PR #952
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle
uname Linux 7b5b3201c7bd 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 940bcf0
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-952/5/artifact/out/diff-checkstyle-root.txt
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-952/5/artifact/out/diff-javadoc-javadoc-hadoop-tools_hadoop-aws.txt
findbugs https://builds.apache.org/job/hadoop-multibranch/job/PR-952/5/artifact/out/patch-findbugs-hadoop-common-project_hadoop-common.txt
findbugs https://builds.apache.org/job/hadoop-multibranch/job/PR-952/5/artifact/out/patch-findbugs-hadoop-tools_hadoop-aws.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-952/5/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
unit https://builds.apache.org/job/hadoop-multibranch/job/PR-952/5/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-952/5/testReport/
Max. process+thread count 412 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-952/5/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@bgaborg

bgaborg commented Jun 13, 2019

My test results:

local: error seems unrelated

    [ERROR] Tests run: 9, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 23.849 s <<< FAILURE! - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractRootDir
    [ERROR] testRmEmptyRootDirNonRecursive(org.apache.hadoop.fs.contract.s3a.ITestS3AContractRootDir)  Time elapsed: 4.18 s  <<< ERROR!
    org.apache.hadoop.fs.PathIOException: `gabota-versioned-bucket-ireland': Cannot delete root path: s3a://gabota-versioned-bucket-ireland/
    at org.apache.hadoop.fs.s3a.S3AFileSystem.rejectRootDirectoryDelete(S3AFileSystem.java:2184)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.innerDelete(S3AFileSystem.java:2109)
    at org.apache.hadoop.fs.s3a.S3AFileSystem.delete(S3AFileSystem.java:2058)
    at org.apache.hadoop.fs.contract.AbstractContractRootDirectoryTest.testRmEmptyRootDirNonRecursive(AbstractContractRootDirectoryTest.java:116)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.lang.Thread.run(Thread.java:748)

dynamo: the testMRJob failure is known; testDynamoTableTagging failed for me the first time. Unrelated.

  [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 76.635 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.commit.staging.integration.ITestDirectoryCommitMRJob
    [ERROR] testMRJob(org.apache.hadoop.fs.s3a.commit.staging.integration.ITestDirectoryCommitMRJob)  Time elapsed: 45.052 s  <<< ERROR!
    java.io.FileNotFoundException: Path s3a://gabota-versioned-bucket-ireland/fork-0003/test/DELAY_LISTING_ME/testMRJob is recorded as deleted by S3Guard
    	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:2479)
    	at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:2450)
    	at org.apache.hadoop.fs.contract.ContractTestUtils.assertIsDirectory(ContractTestUtils.java:559)
    	at org.apache.hadoop.fs.contract.AbstractFSContractTestBase.assertIsDirectory(AbstractFSContractTestBase.java:327)
    	at org.apache.hadoop.fs.s3a.commit.AbstractITCommitMRJob.testMRJob(AbstractITCommitMRJob.java:133)
    	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.lang.reflect.Method.invoke(Method.java:498)
    	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
    	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
    	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
    	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
    	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
    	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    	at java.lang.Thread.run(Thread.java:748)

      [ERROR] Tests run: 16, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 396.307 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB
      [ERROR] testDynamoTableTagging(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB)  Time elapsed: 144.84 s  <<< ERROR!
      java.lang.IllegalArgumentException: Table s3guard.test.testDynamoTableTagging-f0baca8f-6e1a-4c44-a56e-97fd86534b2f is not deleted.
      	at com.amazonaws.services.dynamodbv2.document.Table.waitForDelete(Table.java:505)
      	at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.destroy(DynamoDBMetadataStore.java:1028)
      	at org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB.testDynamoTableTagging(ITestS3GuardToolDynamoDB.java:149)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:498)
      	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
      	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
      	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
      	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
      	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
      	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
      	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
      	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
      	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
      	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
      	at java.lang.Thread.run(Thread.java:748)
      Caused by: com.amazonaws.waiters.WaiterTimedOutException: Reached maximum attempts without transitioning to the desired state
      	at com.amazonaws.waiters.WaiterExecution.pollResource(WaiterExecution.java:86)
      	at com.amazonaws.waiters.WaiterImpl.run(WaiterImpl.java:88)
      	at com.amazonaws.services.dynamodbv2.document.Table.waitForDelete(Table.java:502)
      	... 17 more

@steveloughran
Contributor Author

testRmEmptyRootDirNonRecursive
A failure there means it didn't think the root dir was empty. We do use eventually() there to spin until the listing is empty before doing the rm, so if it failed, it's because delete() felt there were still things underneath.
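For readers unfamiliar with the pattern, a minimal sketch of a spin-until-success helper. This is a hypothetical stand-in written for illustration (`Eventually`, `requireEmpty`), not the helper the Hadoop tests use, and its signature is an assumption.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical stand-in for a retry helper; not the Hadoop test utility. */
final class Eventually {
  private Eventually() {
  }

  /**
   * Invoke {@code probe} until it returns without throwing, sleeping between
   * attempts; rethrow the last failure once the timeout has passed.
   */
  static <T> T eventually(long timeoutMillis, long intervalMillis, Supplier<T> probe) {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    while (true) {
      try {
        return probe.get();
      } catch (RuntimeException | AssertionError e) {
        if (System.nanoTime() - deadline >= 0) {
          throw e; // out of time: surface the most recent failure
        }
      }
      try {
        Thread.sleep(intervalMillis);
      } catch (InterruptedException ie) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting", ie);
      }
    }
  }

  /** Probe body: fail while the listing still has entries. */
  static List<String> requireEmpty(List<String> listing) {
    if (!listing.isEmpty()) {
      throw new AssertionError("still present: " + listing);
    }
    return listing;
  }
}
```

Note that the probe and the later delete are separate calls, so a listing that was empty during the spin says nothing about what delete() sees a moment later.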

@steveloughran
Contributor Author

  [ERROR] testDynamoTableTagging(org.apache.hadoop.fs.s3a.s3guard.ITestS3GuardToolDynamoDB)  Time elapsed: 144.84 s  <<< ERROR!
      java.lang.IllegalArgumentException: Table s3guard.test.testDynamoTableTagging-f0baca8f-6e1a-4c44-a56e-97fd86534b2f is not deleted.
      	at com.amazonaws.services.dynamodbv2.document.Table.waitForDelete(Table.java:505)
      	at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore.destroy(DynamoDBMetadataStore.java:1028)

Teardown timeout, "it happens". Converting this from an IllegalArgumentException to a new PathIOException subclass is part of HADOOP-15183. It happens sporadically as DDB deletion is eventually consistent.

@steveloughran
Contributor Author

OK, did a final test run; one failure: HADOOP-16375

@steveloughran
Contributor Author

My test failure is clearly unrelated. Gabor's may be related, but it would have to be that the delete tombstones expired between the listing and the delete. I'll leave him to improve the test debugging on a failure (catch the IOE, do a ContractTestUtils.lsR, print it, etc.).
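A sketch of the "catch the IOE, list, print" idea, using java.nio on the local filesystem so that it runs standalone. `DeleteDiagnostics` and its methods are hypothetical and do not use ContractTestUtils; the local `lsR` is only an analogue of a recursive listing.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical diagnostics helper on the local filesystem; not ContractTestUtils. */
final class DeleteDiagnostics {
  private DeleteDiagnostics() {
  }

  /** Recursive listing of {@code dir} as one sorted, newline-separated string. */
  static String lsR(Path dir) throws IOException {
    try (Stream<Path> walk = Files.walk(dir)) {
      return walk.filter(p -> !p.equals(dir))
          .map(p -> dir.relativize(p).toString())
          .sorted()
          .collect(Collectors.joining("\n"));
    }
  }

  /** Delete {@code dir} non-recursively; on failure, rethrow with the listing attached. */
  static void deleteWithDiagnostics(Path dir) throws IOException {
    try {
      Files.delete(dir);
    } catch (IOException e) {
      // Capture what was actually underneath at the moment the delete failed.
      throw new IOException("delete of " + dir + " failed; contents:\n" + lsR(dir), e);
    }
  }
}
```

The failure message then names whatever was left behind, which is the information needed to tell a leftover from another test apart from a listing that missed entries.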

+1 for this as is.

@bgaborg

bgaborg commented Jun 14, 2019

Further test results (running with -Dscale takes a while). The ITestS3AContractGetFileStatusV1List seems flaky to me with -Dscale. The first time it was OK, but the second time I got the error:

[ERROR] Tests run: 18, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 117.356 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.ITestS3AContractGetFileStatusV1List
[ERROR] testListStatusEmptyDirectory(org.apache.hadoop.fs.s3a.ITestS3AContractGetFileStatusV1List)  Time elapsed: 4.311 s  <<< FAILURE!
java.lang.AssertionError: listStatus(/fork-0003/test): directory count in 2 directories and 0 files expected:<1> but was:<2>
	at org.junit.Assert.fail(Assert.java:88)
	at org.junit.Assert.failNotEquals(Assert.java:834)
	at org.junit.Assert.assertEquals(Assert.java:645)
	at org.apache.hadoop.fs.contract.ContractTestUtils$TreeScanResults.assertSizeEquals(ContractTestUtils.java:1649)
	at org.apache.hadoop.fs.contract.AbstractContractGetFileStatusTest.testListStatusEmptyDirectory(AbstractContractGetFileStatusTest.java:91)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)

sequential-integration-tests

[ERROR] Tests run: 9, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 575.153 s <<< FAILURE! - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractRootDir
[ERROR] testRecursiveRootListing(org.apache.hadoop.fs.contract.s3a.ITestS3AContractRootDir)  Time elapsed: 180.01 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 180000 milliseconds
------
[ERROR] testRmEmptyRootDirNonRecursive(org.apache.hadoop.fs.contract.s3a.ITestS3AContractRootDir)  Time elapsed: 180.002 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 180000 milliseconds

Maybe bump up the timeout to more?

So this actually means that there's only a timeout and the flaky ITestS3AContractGetFileStatusV1List.

@steveloughran
Contributor Author

Maybe bump up the timeout to more?

That's a sign that somehow the delete didn't go through. I wouldn't say "keep extending it"; more likely there's something there which wasn't deleted. That could be some other test putting it in, or a failure of that first list call to find and delete everything.

I'd suggest thinking of "what diagnostics could you collect" rather than just hoping the problem will go away with timeouts. For example: a deep listing of directory trees before the rm is started, and again on every listing != empty outcome.

@steveloughran
Contributor Author

I should add: both these test failures are related to listing directories in one form or another, during the test and after previous tests created (and should have cleaned up) files from immediately previous test cases. Which means that they are potentially signs of a problem. Interesting that it is only ever the v1 test which fails.

Note: We can't just say "oh, this is eventual consistency at work", not if the tests are running with S3Guard on. With S3Guard enabled, it is an implicit requirement that no tests fail from eventual consistency errors on listings. HEAD/GET calls may still have some surprises.

* fix up javadocs

Change-Id: I8652eceeb8010c82a9892378c434ee5db2ad51f9
Change-Id: Iac23037e039b9a3eea2060d65fbf097198064ec8
@steveloughran
Contributor Author

merged with trunk

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Comment
0 reexec 44 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 10 new or modified test files.
_ trunk Compile Tests _
0 mvndep 64 Maven dependency ordering for branch
+1 mvninstall 1133 trunk passed
+1 compile 1028 trunk passed
+1 checkstyle 151 trunk passed
+1 mvnsite 125 trunk passed
+1 shadedclient 1065 branch has no errors when building and testing our client artifacts.
+1 javadoc 96 trunk passed
0 spotbugs 65 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 184 trunk passed
_ Patch Compile Tests _
0 mvndep 21 Maven dependency ordering for patch
+1 mvninstall 78 the patch passed
+1 compile 975 the patch passed
+1 javac 975 the patch passed
-0 checkstyle 149 root: The patch generated 4 new + 50 unchanged - 1 fixed = 54 total (was 51)
+1 mvnsite 129 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 xml 2 The patch has no ill-formed XML file.
+1 shadedclient 713 patch has no errors when building and testing our client artifacts.
-1 javadoc 33 hadoop-tools_hadoop-aws generated 3 new + 1 unchanged - 0 fixed = 4 total (was 1)
+1 findbugs 201 the patch passed
_ Other Tests _
+1 unit 522 hadoop-common in the patch passed.
+1 unit 288 hadoop-aws in the patch passed.
+1 asflicense 52 The patch does not generate ASF License warnings.
7112
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-952/6/artifact/out/Dockerfile
GITHUB PR #952
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle
uname Linux b221b8a3d465 4.4.0-141-generic #167~14.04.1-Ubuntu SMP Mon Dec 10 13:20:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / e70aeb4
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-952/6/artifact/out/diff-checkstyle-root.txt
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-952/6/artifact/out/diff-javadoc-javadoc-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-952/6/testReport/
Max. process+thread count 1542 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-952/6/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Comment
0 reexec 33 Docker mode activated.
_ Prechecks _
+1 dupname 1 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 10 new or modified test files.
_ trunk Compile Tests _
0 mvndep 67 Maven dependency ordering for branch
+1 mvninstall 1068 trunk passed
+1 compile 1011 trunk passed
+1 checkstyle 141 trunk passed
+1 mvnsite 129 trunk passed
+1 shadedclient 1004 branch has no errors when building and testing our client artifacts.
+1 javadoc 106 trunk passed
0 spotbugs 61 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 177 trunk passed
_ Patch Compile Tests _
0 mvndep 20 Maven dependency ordering for patch
+1 mvninstall 81 the patch passed
+1 compile 956 the patch passed
+1 javac 956 the patch passed
-0 checkstyle 140 root: The patch generated 4 new + 50 unchanged - 1 fixed = 54 total (was 51)
+1 mvnsite 126 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 xml 1 The patch has no ill-formed XML file.
+1 shadedclient 684 patch has no errors when building and testing our client artifacts.
+1 javadoc 103 the patch passed
+1 findbugs 204 the patch passed
_ Other Tests _
+1 unit 532 hadoop-common in the patch passed.
+1 unit 293 hadoop-aws in the patch passed.
+1 asflicense 52 The patch does not generate ASF License warnings.
6947
Subsystem Report/Notes
Docker Client=17.05.0-ce Server=17.05.0-ce base: https://builds.apache.org/job/hadoop-multibranch/job/PR-952/7/artifact/out/Dockerfile
GITHUB PR #952
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml findbugs checkstyle
uname Linux 7e3275e9f448 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / e70aeb4
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-952/7/artifact/out/diff-checkstyle-root.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-952/7/testReport/
Max. process+thread count 1598 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-952/7/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@steveloughran steveloughran deleted the s3/HADOOP-16279-oob-delete branch July 22, 2019 14:50
shanthoosh pushed a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
…job redeploys

Author: Ray Matharu <rmatharu@linkedin.com>

Reviewers: Jagadish<jagadish@apache.org>

Closes apache#952 from rmatharu/test-standbyimprovements