HADOOP-16906. Abortable #2684

Conversation

steveloughran
Contributor

This is #2667 with an extra commit; my changes:

  • markdown spec (really needs outputstream.md in, but...)
  • stats collected on invocations of abort and multipart uploads: counters and durations
  • stats of streams propagated to filesystem in abort()
  • tests use stats to verify what happens
  • verify that an aborted stream doesn't delete/overwrite an existing file
  • oh, and FSDataOutputStream uses instanceof rather than a cast plus a catch of ClassCastException. Consistent with the other uses.
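
For readers unfamiliar with the pattern in the last bullet, here is a minimal sketch of an instanceof probe in a wrapper stream. The `Abortable` interface, `BufferingStream` and `WrapperStream` below are simplified, hypothetical stand-ins written for this illustration; they are not the Hadoop classes touched by the patch.

```java
import java.io.ByteArrayOutputStream;
import java.io.FilterOutputStream;
import java.io.OutputStream;

/** Hypothetical stand-in for the interface under review; not the Hadoop class. */
interface Abortable {
  void abort();
}

/** Hypothetical inner stream which supports abort by discarding its buffer. */
class BufferingStream extends ByteArrayOutputStream implements Abortable {
  boolean aborted;

  @Override
  public void abort() {
    reset();          // drop buffered bytes so nothing is published
    aborted = true;
  }
}

/** Wrapper which forwards abort() only when the inner stream supports it. */
class WrapperStream extends FilterOutputStream implements Abortable {
  WrapperStream(OutputStream inner) {
    super(inner);
  }

  @Override
  public void abort() {
    // instanceof probe: there is no ClassCastException to catch
    // on the unsupported path.
    if (out instanceof Abortable) {
      ((Abortable) out).abort();
    } else {
      throw new UnsupportedOperationException(
          "abort() not supported by " + out.getClass().getSimpleName());
    }
  }
}
```

Wrapping a plain `ByteArrayOutputStream` and calling `abort()` raises `UnsupportedOperationException`, while wrapping a `BufferingStream` forwards the call.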

Testing in progress.

@steveloughran
Contributor Author

test run against s3 london; known failures... buffer underfill and a recent s3 change breaking ITestAssumeRole. Both unrelated

[ERROR]   ITestS3AContractUnbuffer>AbstractContractUnbufferTest.testUnbufferBeforeRead:63->AbstractContractUnbufferTest.validateFullFileContents:132->AbstractContractUnbufferTest.validateFileContents:139->Assert.assertEquals:645->Assert.failNotEquals:834->Assert.fail:88 failed to read expected number of bytes from stream. This may be transient expected:<1024> but was:<3>
[ERROR]   ITestAssumeRole.testAssumeRoleBadInnerAuth:256->expectFileSystemCreateFailure:136  Expected to find 'not a valid key=value pair (missing equal-sign) in Authorization header' but got unexpected exception: org.apache.hadoop.fs.s3a.AWSBadRequestException: Instantiate org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider: com.amazonaws.services.securitytoken.model.AWSSecurityTokenServiceException: null (Service: AWSSecurityTokenService; Status Code: 400; Error Code: IncompleteSignature; Request ID: 9050b674-e236-4b4c-8831-21149db759bc; Proxy: null):IncompleteSignature: null (Service: AWSSecurityTokenService; Status Code: 400; Error Code: IncompleteSignature; Request ID: 9050b674-e236-4b4c-8831-21149db759bc; Proxy: null)

@HeartSaVioR
Contributor

The additional commit looks pretty great. Probably need to fix the checkstyle/whitespace complaints if they're not false alarms.

Contributor

@mukund-thakur mukund-thakur left a comment

Changes look really good now. Thanks @steveloughran
Ready to go in once Yetus gives +1.

Strictly then:

> if `Abortable.abort()` does not raise `UnsupportedOperationException`
> then returns, then it guarantees that the write SHALL NOT become visible
Contributor

then -> and ?

@steveloughran
Contributor Author

Thanks @mukund-thakur

Had some thoughts over the w/e

  1. The call should return something: an AbortOutcome, which can include a list of non-critical exceptions caught during the call, just for diagnostics. It also gives us the option of adding IOStatistics to an implementation.
  2. Change the spec to differentiate "ops which MUST succeed for abort" (which for s3a is a no-op) from ops which MAY succeed and which are needed for cleanup. The MAY-succeed operations SHOULD be best effort, but MUST NOT block for retries etc. on failures.

The s3a stream abort() does do retries right now, and I think that's probably a mistake for something which is only cleanup and is usually invoked when things are starting to go wrong (e.g. data post/complete). Saying "no retries here" means that we can have a fast abort() rather than a 60-120 second spin before the failure is caught and swallowed. I'll change the relevant bits of s3a abort to once() in this situation.
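
A rough sketch of the two ideas above, under stated assumptions: the `AbortOutcome` name comes from the proposal, but its shape here, the `CleanupStep` interface and the `BestEffortAbort` helper are all hypothetical and written only for illustration. Each cleanup step is invoked exactly once, and any failure is recorded for diagnostics instead of being retried or rethrown.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical result type: the name follows the proposal, the shape is an assumption. */
final class AbortOutcome {
  private final List<IOException> cleanupExceptions;

  AbortOutcome(List<IOException> cleanupExceptions) {
    this.cleanupExceptions =
        Collections.unmodifiableList(new ArrayList<>(cleanupExceptions));
  }

  /** Non-critical failures caught during cleanup, kept only for diagnostics. */
  List<IOException> cleanupExceptions() {
    return cleanupExceptions;
  }
}

/** Hypothetical cleanup operation, e.g. cancelling one pending upload. */
interface CleanupStep {
  void run() throws IOException;
}

final class BestEffortAbort {
  private BestEffortAbort() {
  }

  /** Runs every step exactly once; failures are collected, never retried or rethrown. */
  static AbortOutcome abort(CleanupStep... steps) {
    List<IOException> caught = new ArrayList<>();
    for (CleanupStep step : steps) {
      try {
        step.run();
      } catch (IOException e) {
        caught.add(e);
      }
    }
    return new AbortOutcome(caught);
  }
}
```

A caller can log `cleanupExceptions()` and carry on; the abort itself returns as soon as every step has had its single attempt.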

Makes sense?

Contributor

@mehakmeet mehakmeet left a comment

+1, bar small nit and whitespace/checkstyles fix.


/**
* Abort any active uploads, enter closed state.
* @return
Contributor

missing Javadoc after @return

@steveloughran steveloughran force-pushed the incoming/limj/HADOOP-16906-abort-trunk branch from 8cb92e3 to af303a1 Compare February 10, 2021 13:00
HeartSaVioR and others added 6 commits February 10, 2021 14:13
…ble output stream to be terminated

No meaningful tests have been added, as I have no idea where to add them or how s3a is tested
in an integration-test manner. (Tests in TestS3ABlockOutputStream only check simple things with
everything mocked, so they can't do a real write/upload test.)
* markdown spec (really needs outputstream.md in, but...)
* stats collected on invocations of abort and multipart uploads
  -counters and durations
* stats of streams propagated to filesystem in abort()
* tests use stats to verify what happens
* test to verify that aborted stream doesn't delete/overwrite an existing file
* tests verify hasCapability("abortable")

Change-Id: I104a8e6ca6d16aa9706c411ffae871409ea9ec22

HADOOP-16906. Abortable

* Added a result which is used in testing.
* added hasPathSupport() in S3A FS, changed capability name to suit
* clarified what happens on a closed stream.
* declared that cleanup (as opposed to the actual cancel)
  MUST BE non-retrying, so abort() doesn't spin for a long time in
  the presence of network failures.
* modified S3A BlockOutputStream to not retry, either there or in the cleanup in close()

Change-Id: I82b8991c99faef8f5ff7d11c3648ac73d6540f85
Change-Id: Iced57b7402649d77cb6dfc243e6fcc80f6448a23
@steveloughran steveloughran force-pushed the incoming/limj/HADOOP-16906-abort-trunk branch from af303a1 to b5c8163 Compare February 10, 2021 14:22
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 33s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 5s Maven dependency ordering for branch
+1 💚 mvninstall 20m 5s trunk passed
+1 💚 compile 20m 46s trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 compile 17m 58s trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
+1 💚 checkstyle 3m 47s trunk passed
+1 💚 mvnsite 2m 28s trunk passed
+1 💚 shadedclient 20m 5s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 1m 42s trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 2m 23s trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
+0 🆗 spotbugs 1m 16s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 3m 32s trunk passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 26s Maven dependency ordering for patch
+1 💚 mvninstall 1m 26s the patch passed
+1 💚 compile 19m 56s the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javac 19m 56s the patch passed
+1 💚 compile 17m 55s the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
+1 💚 javac 17m 55s the patch passed
-0 ⚠️ checkstyle 3m 42s /diff-checkstyle-root.txt root: The patch generated 2 new + 35 unchanged - 0 fixed = 37 total (was 35)
+1 💚 mvnsite 2m 25s the patch passed
-1 ❌ whitespace 0m 0s /whitespace-eol.txt The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
+1 💚 shadedclient 13m 7s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 1m 38s the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 2m 23s the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
+1 💚 findbugs 3m 47s the patch passed
_ Other Tests _
+1 💚 unit 17m 17s hadoop-common in the patch passed.
+1 💚 unit 2m 3s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 55s The patch does not generate ASF License warnings.
193m 40s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2684/7/artifact/out/Dockerfile
GITHUB PR #2684
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint
uname Linux cddc862a35c1 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / cacc870
Default Java Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2684/7/testReport/
Max. process+thread count 1378 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2684/7/console
versions git=2.25.1 maven=3.6.3 findbugs=4.0.6
Powered by Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor Author

final bits of style....

Change-Id: I258e06b0b9ee4fce15162b57eb7ce4ea7012e485
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 36s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 2s Maven dependency ordering for branch
+1 💚 mvninstall 20m 5s trunk passed
+1 💚 compile 20m 39s trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 compile 17m 55s trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
+1 💚 checkstyle 3m 46s trunk passed
+1 💚 mvnsite 2m 25s trunk passed
+1 💚 shadedclient 19m 48s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 1m 39s trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 2m 19s trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
+0 🆗 spotbugs 1m 15s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 3m 34s trunk passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 24s Maven dependency ordering for patch
+1 💚 mvninstall 1m 26s the patch passed
+1 💚 compile 19m 53s the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javac 19m 53s the patch passed
+1 💚 compile 20m 44s the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
+1 💚 javac 20m 44s the patch passed
+1 💚 checkstyle 4m 29s the patch passed
+1 💚 mvnsite 2m 39s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 18m 20s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 1m 55s the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 3m 10s the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
+1 💚 findbugs 4m 37s the patch passed
_ Other Tests _
-1 ❌ unit 33m 15s /patch-unit-hadoop-common-project_hadoop-common.txt hadoop-common in the patch passed.
-1 ❌ unit 61m 36s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch passed.
+1 💚 asflicense 0m 55s The patch does not generate ASF License warnings.
279m 44s
Reason Tests
Failed junit tests hadoop.metrics2.source.TestJvmMetrics
hadoop.log.TestLogLevel
hadoop.fs.s3a.commit.staging.TestStagingDirectoryOutputCommitter
hadoop.fs.s3a.commit.staging.TestStagingCommitter
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2684/8/artifact/out/Dockerfile
GITHUB PR #2684
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint
uname Linux a3305defea5e 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 98ca6af
Default Java Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2684/8/testReport/
Max. process+thread count 1377 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2684/8/console
versions git=2.25.1 maven=3.6.3 findbugs=4.0.6
Powered by Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor Author

Test run failures are all container OOM. As the last run was only to verify the style etc. checks, which pass, I'm ignoring that and going with the success of previous runs.

Change-Id: I78b62ab746ace501f260b9e933004d454b4cbbf3
@steveloughran steveloughran merged commit 78905d7 into apache:trunk Feb 11, 2021
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 31s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 14m 7s Maven dependency ordering for branch
+1 💚 mvninstall 20m 6s trunk passed
+1 💚 compile 21m 47s trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 compile 21m 59s trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
+1 💚 checkstyle 4m 40s trunk passed
+1 💚 mvnsite 2m 39s trunk passed
+1 💚 shadedclient 23m 44s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 1m 43s trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 2m 42s trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
+0 🆗 spotbugs 1m 13s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 3m 52s trunk passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 23s Maven dependency ordering for patch
+1 💚 mvninstall 1m 29s the patch passed
+1 💚 compile 20m 56s the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javac 20m 56s the patch passed
+1 💚 compile 18m 18s the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
+1 💚 javac 18m 18s the patch passed
+1 💚 checkstyle 3m 33s the patch passed
+1 💚 mvnsite 2m 19s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 12m 58s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 1m 31s the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 2m 12s the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
+1 💚 findbugs 3m 47s the patch passed
_ Other Tests _
+1 💚 unit 17m 0s hadoop-common in the patch passed.
+1 💚 unit 1m 57s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 51s The patch does not generate ASF License warnings.
202m 54s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2684/9/artifact/out/Dockerfile
GITHUB PR #2684
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint
uname Linux c69519b6bf60 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 98ca6af
Default Java Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~20.04-b01
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2684/9/testReport/
Max. process+thread count 3060 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2684/9/console
versions git=2.25.1 maven=3.6.3 findbugs=4.0.6
Powered by Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@HeartSaVioR
Contributor

Finally! Thanks again everyone!
Just to double check: if I understand correctly, we are targeting this to 3.3.1, right? This is merged to trunk but the JIRA issue is not resolved, so I'm not sure what the target versions are.

asfgit pushed a commit that referenced this pull request Feb 17, 2021
Adds an Abortable.abort() interface for streams to enable output streams to be terminated; this
is implemented by the S3A connector's output stream. It allows for commit protocols
to be implemented which commit/abort work by writing to the final destination and
using the abort() call to cancel any write which is not intended to be committed.
Consult the specification document for information about the interface and its use.
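
The commit/abort pattern described above can be sketched with a toy in-memory store. Everything here (`ToyStore`, `PendingWrite`, `ToyCommitter`) is a hypothetical illustration and not the S3A implementation; it only assumes that data written to the final destination becomes visible on close() and never becomes visible after abort().

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Toy "object store": an object becomes visible only when its stream is closed. */
final class ToyStore {
  final Map<String, byte[]> objects = new HashMap<>();

  PendingWrite create(String key) {
    return new PendingWrite(this, key);
  }
}

/** Hypothetical stream: close() publishes the data, abort() discards it. */
final class PendingWrite extends ByteArrayOutputStream {
  private final ToyStore store;
  private final String key;
  private boolean finished;

  PendingWrite(ToyStore store, String key) {
    this.store = store;
    this.key = key;
  }

  @Override
  public void close() {
    if (!finished) {
      finished = true;
      store.objects.put(key, toByteArray());   // the write becomes visible
    }
  }

  public void abort() {
    finished = true;   // any later close() is a no-op
    reset();           // discard buffered data; the destination is untouched
  }
}

final class ToyCommitter {
  private ToyCommitter() {
  }

  /** Writes straight to the final key, then commits via close() or cancels via abort(). */
  static void writeTask(ToyStore store, String key, String data, boolean commit) {
    PendingWrite out = store.create(key);
    byte[] bytes = data.getBytes(StandardCharsets.UTF_8);
    out.write(bytes, 0, bytes.length);
    if (commit) {
      out.close();
    } else {
      out.abort();
    }
  }
}
```

In this sketch an aborted write to a key which already holds committed data leaves the earlier data in place, which mirrors the test in this PR that checks an aborted stream does not delete or overwrite an existing file.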

Contributed by Jungtaek Lim and Steve Loughran.

Change-Id: I7fcc25e9dd8c10ce6c29f383529f3a2642a201ae
@steveloughran steveloughran deleted the incoming/limj/HADOOP-16906-abort-trunk branch October 15, 2021 19:49
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
Adds an Abortable.abort() interface for streams to enable output streams to be terminated; this
is implemented by the S3A connector's output stream. It allows for commit protocols
to be implemented which commit/abort work by writing to the final destination and
using the abort() call to cancel any write which is not intended to be committed.
Consult the specification document for information about the interface and its use.

Contributed by Jungtaek Lim and Steve Loughran.

Change-Id: Ifcc5a13b07376ebebcc21c7b21f4aed4a84214ac