Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HADOOP-18410. S3AInputStream.unbuffer() does not release http connections (#4766) #4839

Conversation

steveloughran
Copy link
Contributor

#4766 cherrypicked to branch-3.3


HADOOP-16202 "Enhance openFile()" added asynchronous draining of the
remaining bytes of an S3 HTTP input stream for those operations
(unbuffer, seek) where it could avoid blocking the active
thread.

This patch fixes the asynchronous stream draining to work and so
return the stream back to the http pool. Without this, whenever
unbuffer() or seek() was called on a stream and an asynchronous
drain triggered, the connection was not returned; eventually
the pool would be empty and subsequent S3 requests would
fail with the message "Timeout waiting for connection from pool"

The root cause was that even though the fields passed in to drain() were
converted to references through the methods, in the lambda expression
passed in to submit, they were direct references

operation = client.submit(
() -> drain(uri, streamStatistics,
false, reason, remaining,
object, wrappedStream)); /* here */

Those fields were only read during the async execution, at which
point they would have been set to null (or even a subsequent read).

A new SDKStreamDrainer class peforms the draining; this is a Callable
and can be submitted directly to the executor pool.

The class is used in both the classic and prefetching s3a input streams.

Also, calling unbuffer() switches the S3AInputStream from adaptive
to random IO mode; that is, it is considered a cue that future
IO will not be sequential, whole-file reads.

Contributed by Steve Loughran.

Change-Id: Ia43339302dbe837ceee4bcfc83fd9624b3c4992c

Description of PR

How was this patch tested?

For code changes:

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

…ions (apache#4766)

HADOOP-16202 "Enhance openFile()" added asynchronous draining of the
remaining bytes of an S3 HTTP input stream for those operations
(unbuffer, seek) where it could avoid blocking the active
thread.

This patch fixes the asynchronous stream draining to work and so
return the stream back to the http pool. Without this, whenever
unbuffer() or seek() was called on a stream and an asynchronous
drain triggered, the connection was not returned; eventually
the pool would be empty and subsequent S3 requests would
fail with the message "Timeout waiting for connection from pool"

The root cause was that even though the fields passed in to drain() were
converted to references through the methods, in the lambda expression
passed in to submit, they were direct references

operation = client.submit(
 () -> drain(uri, streamStatistics,
       false, reason, remaining,
       object, wrappedStream));  /* here */

Those fields were only read during the async execution, at which
point they would have been set to null (or even a subsequent read).

A new SDKStreamDrainer class peforms the draining; this is a Callable
and can be submitted directly to the executor pool.

The class is used in both the classic and prefetching s3a input streams.

Also, calling unbuffer() switches the S3AInputStream from adaptive
to random IO mode; that is, it is considered a cue that future
IO will not be sequential, whole-file reads.

Contributed by Steve Loughran.

Change-Id: Ia43339302dbe837ceee4bcfc83fd9624b3c4992c
@steveloughran
Copy link
Contributor Author

full itests in progress; will merge unless yetus vetoes it

@steveloughran
Copy link
Contributor Author

tests s3 london at scale; buffer underflow at the usual transient location

[INFO] 
[ERROR] Failures: 
[ERROR]   ITestS3AContractUnbuffer>AbstractContractUnbufferTest.testUnbufferAfterRead:53->AbstractContractUnbufferTest.validateFullFileContents:132->AbstractContractUnbufferTest.validateFileContents:139->Assert.assertEquals:647->Assert.failNotEquals:835->Assert.fail:89 failed to read expected number of bytes from stream. This may be transient expected:<1024> but was:<392>
[INFO] 
[ERROR] Tests run: 1142, Failures: 1, Errors: 0, Skipped: 52

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 10m 20s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ branch-3.3 Compile Tests _
+1 💚 mvninstall 43m 3s branch-3.3 passed
+1 💚 compile 0m 59s branch-3.3 passed
+1 💚 checkstyle 0m 41s branch-3.3 passed
+1 💚 mvnsite 0m 51s branch-3.3 passed
+1 💚 javadoc 0m 47s branch-3.3 passed
+1 💚 spotbugs 1m 35s branch-3.3 passed
+1 💚 shadedclient 28m 21s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 1m 1s the patch passed
+1 💚 compile 0m 56s the patch passed
+1 💚 javac 0m 56s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 42s the patch passed
+1 💚 mvnsite 1m 1s the patch passed
+1 💚 javadoc 0m 47s the patch passed
+1 💚 spotbugs 2m 12s the patch passed
+1 💚 shadedclient 29m 40s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 15s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 40s The patch does not generate ASF License warnings.
126m 7s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4839/1/artifact/out/Dockerfile
GITHUB PR #4839
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 5d83b70466dd 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision branch-3.3 / 0df4b9a
Default Java Private Build-1.8.0_342-8u342-b07-0ubuntu1~18.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4839/1/testReport/
Max. process+thread count 523 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4839/1/console
versions git=2.17.1 maven=3.6.0 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran steveloughran merged commit f6c557d into apache:branch-3.3 Aug 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants