HADOOP-18410. S3AInputStream.unbuffer() not releasing http connections #4766
Conversation
The async stream draining only executes (in the other thread) if the original invoker waits for the result. This is not caused by some synchronized conflict: I made the method being invoked static to ensure that. The logging of the start and finish of the call is present, just not the bit in the middle. A test run with `join()` to make it blocking works. That would appear to leave some aspect of CompletableFuture, possibly related to how things are being wrapped in duration tracking, audit spans etc. But I can't see it, and they all seem to work everywhere else.

But without the join, no joy, even though the log above shows the drain is being executed in the new thread.

Then the timeout surfaces on the next read().
…p connections. Lots more logging at debug, more comments, *no idea why async drain doesn't work*. Also, adaptive changed the stream to go from sequential to random on unbuffer(), as it is clear the caller is doing clever things. Change-Id: I4ea2d902e5ae1c0db630091eed55e5916db97534
force-pushed from 5f8fcf6 to f199851
the key cause is that even though the fields passed in to drain() were converted to references through the methods, in the lambda expression passed in to submit, they are direct references:

```
operation = client.submit(
    () -> drain(uri, streamStatistics, false, reason, remaining,
        object, wrappedStream));  /* here */
```

the fields are only read during the async execution, not during the submit phase. The next thing the code does is reset those fields, so the async work fails with an NPE which isn't being noted. Adding the join() after submit works because it was inserted before the fields were set to null, so no NPE was silently raised. Change-Id: I417108db0daa78f711eb8dc1ed96ef619437a2bf
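The capture-timing bug described above can be reproduced without any S3A code. This is a toy sketch, not the S3AInputStream implementation: a lambda handed to an executor reads a *field* of the enclosing object, so the value is fetched when the task runs, not when it is submitted; nulling the field right after `submit()` makes the async task NPE. Copying the field to a local first captures the value instead.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy reproduction of the capture-timing bug discussed in this PR.
// All names here are invented for illustration.
public class LambdaCaptureTiming {

  String wrappedStream = "open-http-stream";

  // Buggy version: the lambda reads this.wrappedStream at execution time.
  Future<String> submitBuggy(ExecutorService pool) {
    return pool.submit(() -> drain(wrappedStream)); // field read later, not now
  }

  // Fixed version: snapshot the field into a local before building the lambda,
  // so the lambda captures the value, not the field reference.
  Future<String> submitFixed(ExecutorService pool) {
    final String stream = wrappedStream;
    return pool.submit(() -> drain(stream));
  }

  public static String drain(String stream) {
    // throws NPE if stream == null, mimicking the silent async failure
    return "drained " + stream.length() + " bytes";
  }

  public static void main(String[] args) throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    // hold the pool busy so the field can be nulled before either task runs
    CountDownLatch gate = new CountDownLatch(1);
    pool.submit(() -> {
      try { gate.await(); } catch (InterruptedException ignored) { }
    });

    LambdaCaptureTiming t = new LambdaCaptureTiming();
    Future<String> buggy = t.submitBuggy(pool);
    Future<String> fixed = t.submitFixed(pool);
    t.wrappedStream = null;   // what the stream code did right after submit()
    gate.countDown();

    System.out.println("fixed: " + fixed.get());
    try {
      buggy.get();
      System.out.println("buggy: completed (unexpected)");
    } catch (ExecutionException e) {
      System.out.println("buggy failed with: " + e.getCause());
    }
    pool.shutdown();
  }
}
```

Running it shows the fixed submission draining successfully while the buggy one fails with a NullPointerException, exactly the failure mode the join() had been masking.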
tested: s3 london
reviewed myself; needs an iteration
@@ -604,7 +604,7 @@ public synchronized void close() throws IOException {
     try {
       stopVectoredIOOperations.set(true);
       // close or abort the stream; blocking
-      awaitFuture(closeStream("close() operation", false, true));
+      closeStream("close() operation", false, true);
because this is blocking there's no need for that await future, but i think i will reinstate it for safety
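The point above is that a synchronous close hands back an already-completed future, so awaiting it cannot block and is purely defensive. A minimal sketch (invented names, not the real closeStream() signature):

```java
import java.util.concurrent.CompletableFuture;

// Sketch of a close that is sometimes synchronous, sometimes async,
// but always returns a future for a uniform API. Names are hypothetical.
public class CompletedFutureSketch {

  public static CompletableFuture<Long> closeStream(boolean blocking) {
    if (blocking) {
      long drained = drainNow();   // work done inline on the caller's thread
      // already complete: a later join()/awaitFuture() on this is a no-op
      return CompletableFuture.completedFuture(drained);
    }
    // async path: here the caller genuinely needs to await the result
    return CompletableFuture.supplyAsync(CompletedFutureSketch::drainNow);
  }

  static long drainNow() {
    return 0L;                     // stand-in for the actual drain work
  }
}
```

The blocking path's future reports isDone() immediately, which is why dropping the awaitFuture() there is safe, and why reinstating it costs nothing.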
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java
as this draining code is used in prefetch too, i'm going to pull it out into its own class. the isolation lets me add unit tests for its failure cases
Pulls out draining code into its own class SDKStreamDrainer.

1. This is a CallableRaisingIOE so can be passed into submit without any lambda expression wrapping.
2. Used in normal and prefetching streams.
3. Has unit tests of failure modes.

Change-Id: Id476f9029613c24b1070c3645ce84643b7705ed7
the last patch factors out stream draining and adds tests for corner cases, especially escalation from read to abort. it is used in classic and prefetching streams, with unit tests.

one long-standing aspect of this design is that read(buffer) will swallow any IOE raised on any read() used to fill the buffer, other than the first read(). so if a socket exception is raised partway through the read, it won't get noticed on that call, only on the subsequent one. need to modify the test to mimic this
…ead(buf) better. Any IOE in read() is only thrown if it happens when reading the first byte of the buffer. Change-Id: I53fba27bfacd105f4c3ea4128202a3215456c001
🎊 +1 overall
This message was automatically generated.
I was wondering, outside the scope of this PR, if we can also provide an S3AInputStream API to retrieve the lastAccessedTimestamp for
if there already is a field there, yes. i've been wondering if there was a way to do some retirement of long-lived input streams. it doesn't matter so much with vectored IO or prefetch, as both will be doing shorter-lived IO
Hmm, might need some sort of "self expiring cache" kind of thing
either the fs tracks all open streams (weak ref map) and scans them, or each stream schedules a worker to run every few minutes which will release the stream if idle. the fs is probably simpler, at least in terms of scheduling, thread shutdown etc
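The first option above, the filesystem tracking open streams in a weak-reference map and scanning them, could look something like this sketch. Everything here (StreamJanitor, IdleStream, the methods) is hypothetical, not an S3A API:

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Sketch of fs-side idle-stream retirement: a WeakHashMap so closed and
// garbage-collected streams drop out on their own, plus a scan pass that
// releases any stream idle for too long. In the real thing scan() would
// run on a scheduled executor every few minutes.
public class StreamJanitor {

  public interface IdleStream {
    long lastAccessed();   // epoch millis of the last read()
    void unbuffer();       // release the http connection
  }

  private final Map<IdleStream, Boolean> open =
      Collections.synchronizedMap(new WeakHashMap<>());
  private final long idleLimitMillis;

  public StreamJanitor(long idleLimitMillis) {
    this.idleLimitMillis = idleLimitMillis;
  }

  public void track(IdleStream s) {
    open.put(s, Boolean.TRUE);
  }

  /** One scan pass; returns how many idle streams were released. */
  public int scan(long now) {
    int released = 0;
    synchronized (open) {            // WeakHashMap iteration needs the lock
      for (IdleStream s : open.keySet()) {
        if (now - s.lastAccessed() > idleLimitMillis) {
          s.unbuffer();
          released++;
        }
      }
    }
    return released;
  }
}
```

Keeping the map weak means the janitor never pins a stream in memory, which is what makes the fs-side approach simpler than per-stream scheduled workers.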
need some reviews here. this is a critical bug |
LGTM +1, really nice test coverage.
  throw new IllegalStateException(
      "duplicate invocation of drain operation");
}
boolean executeAbort = shouldAbort;
Why create a new temp variable?
shouldAbort is a final arg; executeAbort will be set to true if draining doesn't work
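That escalation pattern, copying the final argument into a mutable local and promoting it to an abort when graceful draining fails, can be sketched as follows. This is a hedged illustration with invented names, not the SDKStreamDrainer code; plain close() stands in for the SDK's abort on the wrapped stream:

```java
import java.io.InputStream;

// Sketch of the drain/abort escalation discussed in this review thread.
public class DrainSketch {

  /** @return true if the stream ended up aborted. */
  public static boolean drain(final boolean shouldAbort, InputStream wrapped) {
    boolean executeAbort = shouldAbort;   // mutable copy of the final arg
    if (!executeAbort) {
      try {
        // read off the remaining bytes then close, so the underlying
        // http connection can go back to the pool for reuse
        while (wrapped.read() >= 0) {
          // discard
        }
        wrapped.close();
        return false;                     // drained and closed cleanly
      } catch (Exception e) {
        executeAbort = true;              // draining failed: escalate
      }
    }
    // abort path: give up on connection reuse; close() here stands in
    // for the SDK abort() call on the real wrapped stream
    try {
      wrapped.close();
    } catch (Exception ignored) {
      // nothing more can be done
    }
    return true;
  }
}
```

A healthy stream drains and returns false; any mid-drain failure flips executeAbort and the method reports the abort, which is the failure mode the new unit tests exercise.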
 * @return the drainer.
 */
private SDKStreamDrainer assertAborted(SDKStreamDrainer drainer) {
  Assertions.assertThat(drainer)
Assertions.assertThat(drainer.isAborted())
.isTrue();
Why not use this?
because on an assertion failure, we get drainer.toString() in the generated message.
still trying to find the best way to use assertj for simple true/false settings. this is a bit overcomplex, but it should be the most informative on a failure
private SDKStreamDrainer assertAborted(SDKStreamDrainer drainer) {
  Assertions.assertThat(drainer)
      .matches(SDKStreamDrainer::isAborted, "isAborted");
  return drainer;
return value is never used
not yet...
Overall looks good, left a few minor comments
    return false;
  } catch (Exception e) {
    // exception escalates to an abort
    LOG.debug("When closing {} stream for {}, will abort the stream",
Since we are changing the behaviour by aborting the stream, shall we have this log at WARN level at least?
don't want logs full of warnings as they create issues of their own. seen it too many times.
streamStatistics.streamClose(true, remaining);
LOG.debug("Stream {} {}: {}; remaining={}",
    uri, (shouldAbort ? "aborted" : "closed"), reason,
(shouldAbort ? "aborted" : "closed") can be replaced with just "aborted"? If the stream is closed, we would not reach here.
LOG.debug("Switching to Random IO seek policy after unbuffer() invoked");
setInputPolicy(S3AInputPolicy.Random);
nit: how about?
final S3AInputPolicy newPolicy = S3AInputPolicy.Random;
LOG.debug("Switching to {} policy after unbuffer() invoked", newPolicy);
setInputPolicy(newPolicy);
done
public FileSystem getBrittleFS() {
  return brittleFS;
}
nit: given that brittleFS usages are private, perhaps we don't need this getter method?
it's there. not going to revert it now.
LG, some minor comments and suggestions. Really liked the tests.
@@ -184,12 +184,17 @@
  <value>true</value>
</property>
nit: extra blank line
...ols/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/performance/ITestUnbufferDraining.java
  describe("Starting read/unbuffer #%d", i);
  in.read();
  in.unbuffer();
}
We can assert the number of aborts collected in IOStats after the for loop (StreamStatisticNames.STREAM_READ_ABORTED) to be 10 in this test and 0 in the above test.
done, plus asserts on the fs to verify propagation (and find bugs where they don't)
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/impl/TestSDKStreamDrainer.java
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/impl/SDKStreamDrainer.java
...ols/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/performance/ITestUnbufferDraining.java
* some of the feedback
* verify seek policy changes/doesn't change on unbuffer as appropriate.
* tests assert on iostats of stream and fs
* which identified that the FS wasn't counting unbuffer events. fixed

Change-Id: Ia4b9182cf0d61078085ea0551b3293a5ed86bbc5
Change-Id: I3745e86eccf4c51d6f35fdce5dbca75cc40e9036
tested, s3 london
Change-Id: If2499c6d47a2c5502f2f963ad57f638334336b04
...of import ordering and changes to S3ARemoteObject Change-Id: Ieb2c1a11e6893fa57099e320eab5ef352c6fdcf1
ok, i'm merging this; mukund's vote, mehakmeet's comments and yetus are all happy
…ions (apache#4766)

HADOOP-16202 "Enhance openFile()" added asynchronous draining of the remaining bytes of an S3 HTTP input stream for those operations (unbuffer, seek) where it could avoid blocking the active thread.

This patch fixes the asynchronous stream draining so that it works and so returns the stream back to the http pool.

Without this, whenever unbuffer() or seek() was called on a stream and an asynchronous drain triggered, the connection was not returned; eventually the pool would be empty and subsequent S3 requests would fail with the message "Timeout waiting for connection from pool".

The root cause was that even though the fields passed in to drain() were converted to references through the methods, in the lambda expression passed in to submit, they were direct references:

operation = client.submit(
    () -> drain(uri, streamStatistics, false, reason, remaining,
        object, wrappedStream));  /* here */

Those fields were only read during the async execution, at which point they would have been set to null (or even a subsequent read).

A new SDKStreamDrainer class performs the draining; this is a Callable and can be submitted directly to the executor pool. The class is used in both the classic and prefetching s3a input streams.

Also, calling unbuffer() switches the S3AInputStream from adaptive to random IO mode; that is, it is considered a cue that future IO will not be sequential, whole-file reads.

Contributed by Steve Loughran.

Change-Id: Ia43339302dbe837ceee4bcfc83fd9624b3c4992c
…ions -prefetch changes(#4766) Changes in HADOOP-18410 which are needed for the S3A prefetching stream; needed as part of the HADOOP-18703 backport Change-Id: Ib403ca793e29a4416e5d892f9081de5832da3b68
Lots more logging at debug, more comments, no idea why async drain doesn't work.

Also, adaptive changed the stream to go from sequential to random on unbuffer(), as it is clear the caller is doing clever things. Still doesn't make things work.
update: the second patch fixes it. It was a race condition in when the values of the fields were evaluated.
How was this patch tested?
staring at test results all afternoon commenting lines on and off.
no regression testing of existing tests.
For code changes:
If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?