
HADOOP-18410. S3AInputStream.unbuffer() not releasing http connections #4766

Merged

Conversation

@steveloughran (Contributor) commented Aug 19, 2022

Lots more logging at debug, more comments; no idea yet why the async drain
doesn't work.

Also: adaptive input policy now changes the stream from sequential to random on unbuffer(),
as it is clear the caller is doing clever things. This still doesn't make things
work.

Update: the second patch fixes it. The cause was a race condition in when the values of the fields were evaluated.

How was this patch tested?

Staring at test results all afternoon, commenting lines on and off.

No regression testing of existing tests.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@steveloughran (Contributor, Author)

The async stream draining only executes (in the other thread) if the original invoker waits for the result.

This is not caused by some synchronization conflict; I made the method being invoked static to rule that out.

And the logging of the start/finish of the call is present, just not the bit in the middle.

Test run with a join() added to make it blocking:

The [JUnit-testUnbufferDraining] thread is the one doing the unbuffer()/read() calls; the [s3a-transfer-stevel-london-unbounded-pool] threads belong to the unbounded pool into which the work
is queued. As the pool is unbounded, pool capacity is not the limit.

That would appear to leave some aspect of CompletableFuture, possibly related to how things are being wrapped in duration tracking, audit spans, etc. But I can't see it, and they all seem to work everywhere else.


```
2022-08-19 18:52:50,102 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(663)) - Closing stream unbuffer(): soft
2022-08-19 18:52:50,102 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(676)) - initiating asynchronous drain of 998 bytes
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG s3a.S3AFileSystem (S3AFileSystem.java:lambda$null$0(1609)) - Starting submitted operation in ec003fc0-e885-4793-b9b9-34296cc34f3a-00000010
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(761)) - drain or abort reason unbuffer() remaining=998 abort=false
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(771)) - draining 998 bytes
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(789)) - Drained stream of 998 bytes
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(801)) - Closing stream
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG s3a.S3AInputStream (S3AInputStream.java:drainOrAbortHttpStream(826)) - Stream s3a://stevel-london/test/testUnbufferDraining closed: unbuffer(); remaining=0
2022-08-19 18:52:50,102 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG s3a.S3AFileSystem (S3AFileSystem.java:lambda$null$0(1613)) - Completed submitted operation in ec003fc0-e885-4793-b9b9-34296cc34f3a-00000010
2022-08-19 18:52:50,104 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInstrumentation (S3AInstrumentation.java:merge(1172)) - Merging statistics into FS statistics in unbuffer(): ...

2022-08-19 18:52:50,104 [JUnit-testUnbufferDraining] INFO s3a.AbstractS3ATestBase (AbstractS3ATestBase.java:describe(219)) -

testUnbufferDraining: Starting read/unbuffer #2

```

No waiting

But without the join(), no joy, even though the log above shows the drain being executed in the new thread:

```
2022-08-19 18:36:56,109 [JUnit-testUnbufferDraining] DEBUG s3a.Invoker (DurationInfo.java:close(101)) - read: duration 0:00.000s
2022-08-19 18:36:56,109 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(663)) - Closing stream unbuffer(): soft
2022-08-19 18:36:56,109 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream (S3AInputStream.java:closeStream(676)) - initiating asynchronous drain of 998 bytes
2022-08-19 18:36:56,109 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG s3a.S3AFileSystem (S3AFileSystem.java:lambda$null$0(1609)) - Starting submitted operation in 54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010
2022-08-19 18:36:56,109 [s3a-transfer-stevel-london-unbounded-pool5-t3] DEBUG s3a.S3AFileSystem (S3AFileSystem.java:lambda$null$0(1613)) - Completed submitted operation in 54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010

2022-08-19 18:36:56,110 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInstrumentation (S3AInstrumentation.java:merge(1172)) - Merging statistics into FS statistics in unbuffer():  ...
2022-08-19 18:36:56,111 [JUnit-testUnbufferDraining] INFO s3a.AbstractS3ATestBase (AbstractS3ATestBase.java:describe(219)) -

testUnbufferDraining: Starting read/unbuffer #2

2022-08-19 18:36:56,111 [JUnit-testUnbufferDraining] DEBUG s3a.Invoker (DurationInfo.java:<init>(80)) - Starting: lazySeek
2022-08-19 18:36:56,111 [JUnit-testUnbufferDraining] DEBUG s3a.S3AInputStream (S3AInputStream.java:reopen(263)) - reopen(s3a://stevel-london/test/testUnbufferDraining) for read from new offset range[49001-50000], length=1, streamPosition=49002, nextReadPosition=49001, policy=random
2022-08-19 18:36:56,111 [JUnit-testUnbufferDraining] DEBUG impl.ChangeDetectionPolicy (ChangeDetectionPolicy.java:applyRevisionConstraint(430)) - Restricting get request to version RFW1GsJZAc_mL5eKKWRnB7tX2K9B3I98
2022-08-19 18:36:56,112 [JUnit-testUnbufferDraining] DEBUG impl.LoggingAuditor (LoggingAuditor.java:beforeExecution(327)) - [11] 54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010 Executing op_open with {action_http_get_request 'test/testUnbufferDraining' size=998, mutating=false}; https://audit.example.org/hadoop/1/op_open/54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010/?op=op_open&p1=test/testUnbufferDraining&pr=stevel&ps=6cd73b0d-cfa1-41ec-ba5a-305b3abff0b0&id=54f63eab-e0d8-48b5-a0c1-9a877576a450-00000010&t0=30&fs=54f63eab-e0d8-48b5-a0c1-9a877576a450&t1=11&ts=1660930615787
2022-08-19 18:36:58,223 [JUnit-testUnbufferDraining] DEBUG s3a.Invoker (DurationInfo.java:close(101)) - lazySeek: duration 0:02.112s
```

Then the timeout surfaces on the next read():


```
org.apache.hadoop.net.ConnectTimeoutException: re-open s3a://stevel-london/test/testUnbufferDraining at 49001 on s3a://stevel-london/test/testUnbufferDraining: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool

	at org.apache.hadoop.fs.s3a.S3AUtils.translateInterruptedException(S3AUtils.java:385)
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:198)
	at org.apache.hadoop.fs.s3a.Invoker.onceTrackingDuration(Invoker.java:149)
	at org.apache.hadoop.fs.s3a.S3AInputStream.reopen(S3AInputStream.java:276)
	at org.apache.hadoop.fs.s3a.S3AInputStream.lambda$lazySeek$1(S3AInputStream.java:429)
	at org.apache.hadoop.fs.s3a.Invoker.lambda$maybeRetry$3(Invoker.java:284)
	at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:122)
	at org.apache.hadoop.fs.s3a.Invoker.maybeRetry(Invoker.java:410)
	at org.apache.hadoop.fs.s3a.Invoker.maybeRetry(Invoker.java:282)
	at org.apache.hadoop.fs.s3a.Invoker.maybeRetry(Invoker.java:326)
	at org.apache.hadoop.fs.s3a.S3AInputStream.lazySeek(S3AInputStream.java:421)
	at org.apache.hadoop.fs.s3a.S3AInputStream.read(S3AInputStream.java:455)
	at java.io.FilterInputStream.read(FilterInputStream.java:83)
	at org.apache.hadoop.fs.s3a.performance.ITestUnbufferDraining.testUnbufferDraining(ITestUnbufferDraining.java:141)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:750)

Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Timeout waiting for connection from pool
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1219)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1165)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)
at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)
at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1524)
at org.apache.hadoop.fs.s3a.S3AFileSystem$InputStreamCallbacksImpl.getObject(S3AFileSystem.java:1600)
at org.apache.hadoop.fs.s3a.S3AInputStream.lambda$reopen$0(S3AInputStream.java:278)
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547)
at org.apache.hadoop.fs.s3a.Invoker.onceTrackingDuration(Invoker.java:147)
... 26 more
Caused by: com.amazonaws.thirdparty.apache.http.conn.ConnectionPoolTimeoutException: Timeout waiting for connection from pool
at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.leaseConnection(PoolingHttpClientConnectionManager.java:316)
at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager$1.get(PoolingHttpClientConnectionManager.java:282)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.amazonaws.http.conn.ClientConnectionRequestFactory$Handler.invoke(ClientConnectionRequestFactory.java:70)
at com.amazonaws.http.conn.$Proxy22.get(Unknown Source)
at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:190)
at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1346)
at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)
... 40 more
```

@steveloughran (Contributor, Author)

HADOOP-18410. copy fields to variables before use in a lambda expression

The key cause is that even though the fields passed in to drain() become
parameters of that method, in the lambda expression passed in to submit()
they are still direct field references:

```
operation = client.submit(
    () -> drain(uri, streamStatistics,
        false, reason, remaining,
        object, wrappedStream));  /* here */
```

The fields are only read during the async execution, not during the submit phase.

The next thing the calling code does is reset those fields to null; the async work
then fails with an NPE which isn't being noted anywhere.

Adding the join() after submit() worked because it blocked until the drain
completed, before the fields were set to null, so no NPE was silently raised.
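
As an illustration of the capture semantics, here is a minimal standalone sketch; the class and field names are hypothetical, not the actual S3AInputStream code:

```
import java.util.concurrent.CompletableFuture;

public class LambdaCaptureDemo {
  private String wrappedStream = "open-stream";

  // Buggy: the lambda captures `this` and reads the field when it RUNS,
  // not when it is submitted. If the field is nulled first, drain() sees null.
  CompletableFuture<Void> drainAsyncBuggy() {
    return CompletableFuture.runAsync(() -> drain(wrappedStream));
  }

  // Fixed: copy the field to a local variable before building the lambda;
  // the local's value is captured at submission time.
  CompletableFuture<Void> drainAsyncFixed() {
    final String stream = wrappedStream;
    return CompletableFuture.runAsync(() -> drain(stream));
  }

  private static void drain(String stream) {
    // NPE here if stream is null by the time the executor runs this.
    System.out.println("draining " + stream.toUpperCase());
  }

  public static void main(String[] args) {
    LambdaCaptureDemo demo = new LambdaCaptureDemo();
    CompletableFuture<Void> f = demo.drainAsyncFixed();
    demo.wrappedStream = null; // caller resets the field right after submit
    f.join();                  // fixed version still drains "open-stream"
  }
}
```

With drainAsyncBuggy() the outcome depends on whether the executor runs the lambda before or after the field is nulled, which is exactly the race observed above.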

@steveloughran (Contributor, Author)

tested: s3 london

@steveloughran (Contributor, Author) left a comment

reviewed myself; needs an iteration

```
@@ -604,7 +604,7 @@ public synchronized void close() throws IOException {
     try {
       stopVectoredIOOperations.set(true);
       // close or abort the stream; blocking
-      awaitFuture(closeStream("close() operation", false, true));
+      closeStream("close() operation", false, true);
```
@steveloughran (Contributor, Author): Because this is blocking there's no need for that awaitFuture(), but I think I will reinstate it for safety.

@apache apache deleted a comment from hadoop-yetus Aug 20, 2022
@apache apache deleted a comment from hadoop-yetus Aug 20, 2022
@steveloughran (Contributor, Author)

As this draining code is used in prefetch too, I'm going to:

  1. Create a StreamDrainer class which the prefetch stream will also switch to; this code is fussy and I don't want duplicates.
  2. Have it implement CallableWithIOE so it can be passed in to submit(); no need for an extra lambda expression to invoke it. (A rough sketch of that shape follows the list below.)

The isolation lets me add unit tests for its failure cases:

  • read returns -1 while data still remaining
  • read() throws ioe
  • remaining finishes while data remaining
  • abort failure
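
A rough sketch of that shape, as a simplified stand-in for what became SDKStreamDrainer; the abort hook and all names here are assumptions, not the real class or SDK types:

```
import java.io.InputStream;
import java.util.concurrent.Callable;

// Drains the remaining bytes of a stream so the underlying HTTP connection
// can be recycled; escalates to an abort if draining fails.
final class StreamDrainerSketch implements Callable<Boolean> {
  private final InputStream in;
  private final long remaining;
  private final Runnable abortHook; // hypothetical: aborts the connection

  StreamDrainerSketch(InputStream in, long remaining, Runnable abortHook) {
    this.in = in;
    this.remaining = remaining;
    this.abortHook = abortHook;
  }

  /** @return true if the stream had to be aborted. */
  @Override
  public Boolean call() {
    try {
      byte[] buffer = new byte[16 * 1024];
      long drained = 0;
      while (drained < remaining) {
        int read = in.read(buffer);
        if (read < 0) {
          break; // stream ended early; nothing more to drain
        }
        drained += read;
      }
      in.close();
      return false; // drained and closed: connection can be reused
    } catch (Exception e) {
      abortHook.run(); // draining failed: abort the HTTP connection
      return true;
    }
  }
}
```

Because it is a plain Callable, it can be handed straight to submit() with no wrapping lambda, removing the field-capture problem entirely.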

Pulls out draining code into its own class SDKStreamDrainer.

1. This is a CallableRaisingIOE so can be passed into submit without
any lambda expression wrapping.
2. Used in normal and prefetching streams.
3. Has unit tests of failure modes.

Change-Id: Id476f9029613c24b1070c3645ce84643b7705ed7
@apache apache deleted a comment from hadoop-yetus Aug 22, 2022
@steveloughran (Contributor, Author)

The last patch factors out stream draining; adds tests for corner cases, especially escalation from read to abort. Used in classic and prefetching streams, with unit tests.

One long-standing aspect of this design is that read(buffer) will swallow any IOE raised on any read() used to fill the buffer, other than the first read(). So if a socket exception is raised partway through the read, it won't get noticed on that call, only on the subsequent one. Need to modify the test to mimic this.
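
This matches the base java.io.InputStream.read(byte[], int, int) contract; a simplified rendering of that pattern shows where the exception goes missing:

```
import java.io.IOException;
import java.io.InputStream;

abstract class SwallowingRead extends InputStream {
  // Simplified form of InputStream.read(byte[], int, int): an IOException
  // on the first byte propagates, but one raised while filling the rest of
  // the buffer is swallowed and the partial count returned. The failure
  // only surfaces on the NEXT read call.
  @Override
  public int read(byte[] b, int off, int len) throws IOException {
    int c = read(); // IOException here is thrown to the caller
    if (c == -1) {
      return -1;
    }
    b[off] = (byte) c;
    int i = 1;
    try {
      for (; i < len; i++) {
        c = read(); // IOException here is caught below
        if (c == -1) {
          break;
        }
        b[off + i] = (byte) c;
      }
    } catch (IOException swallowed) {
      // the caller just sees a short read, not the failure
    }
    return i;
  }
}
```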

…ead(buf) better.

Any IOE in read() is only thrown if it occurs when reading the first byte of
the buffer.

Change-Id: I53fba27bfacd105f4c3ea4128202a3215456c001
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 0s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 42m 1s trunk passed
+1 💚 compile 0m 55s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 compile 0m 45s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 0m 43s trunk passed
+1 💚 mvnsite 0m 53s trunk passed
+1 💚 javadoc 0m 39s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 41s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 1m 27s trunk passed
+1 💚 shadedclient 24m 1s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 36s the patch passed
+1 💚 compile 0m 41s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javac 0m 41s the patch passed
+1 💚 compile 0m 34s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 0m 34s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 23s the patch passed
+1 💚 mvnsite 0m 38s the patch passed
+1 💚 javadoc 0m 19s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 28s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 1m 14s the patch passed
+1 💚 shadedclient 23m 32s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 58s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 44s The patch does not generate ASF License warnings.
106m 29s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/4/artifact/out/Dockerfile
GITHUB PR #4766
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 8d8d8310dc65 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 9e6cdd5
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/4/testReport/
Max. process+thread count 580 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/4/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 7s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 41m 24s trunk passed
+1 💚 compile 0m 53s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 compile 0m 45s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 0m 43s trunk passed
+1 💚 mvnsite 0m 53s trunk passed
+1 💚 javadoc 0m 39s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 41s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 1m 26s trunk passed
+1 💚 shadedclient 23m 56s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 36s the patch passed
+1 💚 compile 0m 45s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javac 0m 44s the patch passed
+1 💚 compile 0m 33s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 0m 33s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 25s the patch passed
+1 💚 mvnsite 0m 39s the patch passed
+1 💚 javadoc 0m 21s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 32s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 1m 15s the patch passed
+1 💚 shadedclient 25m 35s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 47s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 47s The patch does not generate ASF License warnings.
107m 48s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/5/artifact/out/Dockerfile
GITHUB PR #4766
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 4bf4682f2efe 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / fb702a4
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/5/testReport/
Max. process+thread count 572 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@virajjasani (Contributor)

I was wondering, outside the scope of this PR: could we also provide an S3AInputStream API to retrieve the lastAccessedTimestamp of the S3ObjectInputStream? Perhaps it might help in future to identify whether a given S3AInputStream is idle?

@steveloughran (Contributor, Author)

If there already is a field there, yes. I've been wondering whether there is a way to do some retirement of long-lived input streams. It doesn't matter so much with vectored IO or prefetch, as both will be doing shorter-lived IO.

@virajjasani (Contributor)

> a way to do some retirement of long-lived input streams.

Hmm, might need some sort of "self-expiring cache" kind of thing.

@steveloughran (Contributor, Author)

Either the FS tracks all open streams (weak reference map) and scans them, or each stream schedules a worker to run every few minutes which will release the stream if idle. The FS is probably simpler, at least in terms of scheduling, thread shutdown, etc.
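
Neither approach is part of this patch; purely as a sketch of the first option, with all names hypothetical:

```
import java.io.Closeable;
import java.io.IOException;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical: the filesystem keeps weak references to open streams plus
// their last-access time; a task scheduled every few minutes releases any
// stream idle too long. Streams already garbage collected simply vanish
// from the map, so tracking them does not leak.
final class IdleStreamReaper {
  private final Map<Closeable, Long> lastAccess = new WeakHashMap<>();
  private final long idleMillis;

  IdleStreamReaper(long idleMillis) {
    this.idleMillis = idleMillis;
  }

  synchronized void touch(Closeable stream) {
    lastAccess.put(stream, System.currentTimeMillis());
  }

  // Invoked from the FS's scheduled executor.
  synchronized void scan() {
    long now = System.currentTimeMillis();
    lastAccess.entrySet().removeIf(entry -> {
      if (now - entry.getValue() > idleMillis) {
        try {
          entry.getKey().close(); // or unbuffer(), to keep the stream usable
        } catch (IOException ignored) {
          // best-effort release
        }
        return true;
      }
      return false;
    });
  }
}
```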

@steveloughran (Contributor, Author)

Need some reviews here: this is a critical bug.

@mukund-thakur (Contributor) left a comment

LGTM +1, really nice test coverage.

```
    throw new IllegalStateException(
        "duplicate invocation of drain operation");
  }
  boolean executeAbort = shouldAbort;
```

@mukund-thakur (Contributor): Why create a new temp variable?

@steveloughran (Contributor, Author): shouldAbort is a final argument; executeAbort will be set to true if draining doesn't work.

```
 * @return the drainer.
 */
private SDKStreamDrainer assertAborted(SDKStreamDrainer drainer) {
  Assertions.assertThat(drainer)
```

@mukund-thakur (Contributor): Why not use this?

```
Assertions.assertThat(drainer.isAborted())
    .isTrue();
```

@steveloughran (Contributor, Author): Because on an assertion failure, we get drainer.toString() in the generated message.

Still trying to find the best way to use AssertJ for simple true/false settings. This is a bit overcomplex, but it should be the most informative on a failure.

```
private SDKStreamDrainer assertAborted(SDKStreamDrainer drainer) {
  Assertions.assertThat(drainer)
      .matches(SDKStreamDrainer::isAborted, "isAborted");
  return drainer;
```

@mukund-thakur (Contributor): Return value is never used.

@steveloughran (Contributor, Author): Not yet...

@virajjasani (Contributor) left a comment

Overall looks good; left a few minor comments.

```
  return false;
} catch (Exception e) {
  // exception escalates to an abort
  LOG.debug("When closing {} stream for {}, will abort the stream",
```

@virajjasani (Contributor): Since we are changing the behaviour by aborting the stream, shall we have this log at WARN level at least?

@steveloughran (Contributor, Author): Don't want logs full of warnings, as they create issues of their own; seen it too many times.


```
streamStatistics.streamClose(true, remaining);
LOG.debug("Stream {} {}: {}; remaining={}",
    uri, (shouldAbort ? "aborted" : "closed"), reason,
```

@virajjasani (Contributor): (shouldAbort ? "aborted" : "closed") can be replaced with just "aborted"? If the stream is closed, we would not reach here.

Comment on lines 1242 to 1243
LOG.debug("Switching to Random IO seek policy after unbuffer() invoked");
setInputPolicy(S3AInputPolicy.Random);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: how about?

        final S3AInputPolicy newPolicy = S3AInputPolicy.Random;
        LOG.debug("Switching to {} policy after unbuffer() invoked", newPolicy);
        setInputPolicy(newPolicy);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +118 to +120
```
public FileSystem getBrittleFS() {
  return brittleFS;
}
```

@virajjasani (Contributor): nit: given that brittleFS usages are private, perhaps we don't need this getter method?

@steveloughran (Contributor, Author): It's there; not going to revert it now.

@mehakmeet (Contributor) left a comment

LG, some minor comments and suggestions. Really liked the tests.

```
@@ -184,12 +184,17 @@
 <value>true</value>
 </property>

```

@mehakmeet (Contributor): nit: extra blank line.

describe("Starting read/unbuffer #%d", i);
in.read();
in.unbuffer();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can assert the number of aborts collected in IOStats after the for loop StreamStatisticNames.STREAM_READ_ABORTED to be 10 in this test and 0 in the above test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, plus asserts on the fs to verify propagation (and find bugs where they don't)
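
For example, the shape of such an assertion might be as below, assuming the stream exposes its statistics through the standard getIOStatistics() accessor; the exact helper used in the patch may differ:

```
import static org.assertj.core.api.Assertions.assertThat;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.statistics.StreamStatisticNames;

class UnbufferAssertionSketch {
  // After the read/unbuffer loop: expect 10 aborts in the brittle-FS test,
  // 0 in the clean one, on both the stream's and the filesystem's stats.
  static void assertAbortCount(FSDataInputStream in, long expected) {
    assertThat(in.getIOStatistics().counters())
        .containsEntry(StreamStatisticNames.STREAM_READ_ABORTED, expected);
  }
}
```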

* some of the feedback
* verify seek policy changes/doesn't change on unbuffer as
  appropriate.
* tests assert on iostats of stream and fs
* which identified that the FS wasn't counting
  unbuffer events. fixed

Change-Id: Ia4b9182cf0d61078085ea0551b3293a5ed86bbc5
Change-Id: I3745e86eccf4c51d6f35fdce5dbca75cc40e9036
@steveloughran (Contributor, Author)

tested, s3 london

Change-Id: If2499c6d47a2c5502f2f963ad57f638334336b04
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 59s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 42m 33s trunk passed
+1 💚 compile 1m 0s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 compile 0m 43s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 0m 38s trunk passed
+1 💚 mvnsite 0m 50s trunk passed
+1 💚 javadoc 0m 36s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 39s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 1m 39s trunk passed
+1 💚 shadedclient 24m 36s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 37s the patch passed
+1 💚 compile 0m 44s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javac 0m 44s the patch passed
+1 💚 compile 0m 33s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 0m 33s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 23s the patch passed
+1 💚 mvnsite 0m 38s the patch passed
+1 💚 javadoc 0m 18s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 29s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 1m 28s the patch passed
+1 💚 shadedclient 24m 13s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 48s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 40s The patch does not generate ASF License warnings.
107m 30s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/6/artifact/out/Dockerfile
GITHUB PR #4766
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 97061934a676 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 0921aa3
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/6/testReport/
Max. process+thread count 573 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/6/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 1m 4s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 42m 33s trunk passed
+1 💚 compile 1m 1s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 compile 0m 49s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 0m 41s trunk passed
+1 💚 mvnsite 0m 54s trunk passed
+1 💚 javadoc 0m 36s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 39s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 1m 30s trunk passed
+1 💚 shadedclient 24m 24s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 40s the patch passed
+1 💚 compile 0m 50s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javac 0m 50s the patch passed
+1 💚 compile 0m 37s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 0m 37s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 25s the patch passed
+1 💚 mvnsite 0m 40s the patch passed
+1 💚 javadoc 0m 17s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 26s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 1m 17s the patch passed
+1 💚 shadedclient 23m 49s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 40s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 39s The patch does not generate ASF License warnings.
107m 17s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/7/artifact/out/Dockerfile
GITHUB PR #4766
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 49ee26e02e16 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / c7ddcf7
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/7/testReport/
Max. process+thread count 533 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/7/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 53s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 38m 8s trunk passed
+1 💚 compile 0m 55s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 compile 0m 53s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 0m 49s trunk passed
+1 💚 mvnsite 0m 53s trunk passed
+1 💚 javadoc 0m 42s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 48s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 1m 33s trunk passed
+1 💚 shadedclient 21m 5s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 35s the patch passed
+1 💚 compile 0m 39s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javac 0m 39s the patch passed
+1 💚 compile 0m 32s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 0m 32s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 23s the patch passed
+1 💚 mvnsite 0m 40s the patch passed
+1 💚 javadoc 0m 20s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 27s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 1m 9s the patch passed
+1 💚 shadedclient 20m 26s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 43s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 50s The patch does not generate ASF License warnings.
96m 13s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/8/artifact/out/Dockerfile
GITHUB PR #4766
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 200322a1c96d 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 684db35
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/8/testReport/
Max. process+thread count 559 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/8/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

...of import ordering and changes to S3ARemoteObject

Change-Id: Ieb2c1a11e6893fa57099e320eab5ef352c6fdcf1
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 59s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 38m 8s trunk passed
+1 💚 compile 0m 59s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 compile 0m 54s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 checkstyle 0m 42s trunk passed
+1 💚 mvnsite 0m 53s trunk passed
+1 💚 javadoc 0m 44s trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 41s trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 1m 25s trunk passed
+1 💚 shadedclient 20m 38s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 40s the patch passed
+1 💚 compile 0m 43s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javac 0m 43s the patch passed
+1 💚 compile 0m 37s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 javac 0m 37s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 24s the patch passed
+1 💚 mvnsite 0m 38s the patch passed
+1 💚 javadoc 0m 20s the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1
+1 💚 javadoc 0m 27s the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
+1 💚 spotbugs 1m 10s the patch passed
+1 💚 shadedclient 20m 30s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 45s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 46s The patch does not generate ASF License warnings.
96m 7s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/9/artifact/out/Dockerfile
GITHUB PR #4766
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets xmllint
uname Linux 7f4c63c1368f 4.15.0-191-generic #202-Ubuntu SMP Thu Aug 4 01:49:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / fd382fb
Default Java Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/9/testReport/
Max. process+thread count 714 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4766/9/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran (Contributor, Author)

OK, I'm merging this; Mukund's vote, Mehakmeet's comments and Yetus are all happy.

@steveloughran steveloughran merged commit c69e16b into apache:trunk Aug 31, 2022
steveloughran added a commit to steveloughran/hadoop that referenced this pull request Aug 31, 2022
…ions (apache#4766)

HADOOP-16202 "Enhance openFile()" added asynchronous draining of the
remaining bytes of an S3 HTTP input stream for those operations
(unbuffer, seek) where it could avoid blocking the active
thread.

This patch fixes the asynchronous stream draining so that it works and
returns the stream back to the HTTP pool. Without this, whenever
unbuffer() or seek() was called on a stream and an asynchronous
drain triggered, the connection was not returned; eventually
the pool would be empty and subsequent S3 requests would
fail with the message "Timeout waiting for connection from pool".

The root cause was that even though the fields passed in to drain() were
converted to parameters of that method, in the lambda expression
passed in to submit they were direct field references:

```
operation = client.submit(
    () -> drain(uri, streamStatistics,
        false, reason, remaining,
        object, wrappedStream));  /* here */
```

Those fields were only read during the async execution, at which
point they could have been set to null (or even reassigned by a subsequent read).

A new SDKStreamDrainer class performs the draining; this is a Callable
and can be submitted directly to the executor pool.

The class is used in both the classic and prefetching s3a input streams.

Also, calling unbuffer() switches the S3AInputStream from adaptive
to random IO mode; that is, it is considered a cue that future
IO will not be sequential, whole-file reads.

Contributed by Steve Loughran.

Change-Id: Ia43339302dbe837ceee4bcfc83fd9624b3c4992c
steveloughran added a commit that referenced this pull request Aug 31, 2022
…ions (#4766)


HarshitGupta11 pushed a commit to HarshitGupta11/hadoop that referenced this pull request Nov 28, 2022
…ions (apache#4766)


asfgit pushed a commit that referenced this pull request Apr 26, 2023
…ions -prefetch changes(#4766)

Changes in HADOOP-18410 which are needed for the S3A prefetching stream; needed
as part of the HADOOP-18703 backport

Change-Id: Ib403ca793e29a4416e5d892f9081de5832da3b68
asfgit pushed a commit that referenced this pull request Apr 28, 2023
…ions -prefetch changes(#4766)
