New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-16854. ABFS: Fix for the OutOfMemoryException in AbfsOutputStream #2014
Conversation
💔 -1 overall
This message was automatically generated. |
a134eaa
to
cb5fad0
Compare
cb5fad0
to
f71c48b
Compare
💔 -1 overall
This message was automatically generated. |
Driver test results using accounts in Central India Account with HNS Support Account without HNS support |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initial review = not looked at tests
@@ -172,6 +172,12 @@ | |||
<groupId>com.google.guava</groupId> | |||
<artifactId>guava</artifactId> | |||
</dependency> | |||
<dependency> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- can you add to hadoop project pom and then refer here. its how we guarantee consistent versions.
- do we really need to add a new JAR into production just for annotations? if that is all it is for, maybe we could somehow avoid doing that
which annotations is it actually for? as VisibleForTesting is in guava
import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.MIN_VALUE_MAX_AZURE_WRITE_MEM_USAGE_PERCENTAGE; | ||
|
||
/** | ||
* Pool for byte[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
. at ehd for javadoc in java 8
} | ||
|
||
private synchronized boolean isPossibleToIssueNewBuffer() { | ||
Runtime rt = Runtime.getRuntime(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could this be done outside a sync block? it probably does a JNI call.
private synchronized boolean isPossibleToIssueNewBuffer() { | ||
Runtime rt = Runtime.getRuntime(); | ||
int bufferCountByMaxFreeBuffers = | ||
maxBuffersToPool + rt.availableProcessors(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
store availableProcessors in constructor.
|
||
initWriteBufferPool(abfsOutputStreamContext); | ||
|
||
ThreadFactory daemonThreadFactory = new ThreadFactory() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's some hadoop thread factory you should be able to lift. Also: name the threads
return null; | ||
})); | ||
} | ||
for (Future<Void> futureTask : futureTasks) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see if you can use org.apache.hadoop.fs.impl.FutureIOSupport here. And somewhere there's a method to block waiting for futures to complete without doing it sequentally; I believe it is faster
Currently in environments where memory is restricted, It is observed at times a large number of buffers needed for the execution and the same and are kept within the bufferpool for ever this lead to out Of Memory exceptions.
This change addresses certain improvemnts to AbfsOutputStream which help fix the issue.
Driver test results using accounts in Central India
mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
Account with HNS Support
[INFO] Tests run: 67, Failures: 0, Errors: 0, Skipped: 0
[WARNING] Tests run: 426, Failures: 0, Errors: 0, Skipped: 66
[WARNING] Tests run: 206, Failures: 0, Errors: 0, Skipped: 24
Account without HNS support
[INFO] Tests run: 67, Failures: 0, Errors: 0, Skipped: 0
[WARNING] Tests run: 426, Failures: 0, Errors: 0, Skipped: 240
[WARNING] Tests run: 206, Failures: 0, Errors: 0, Skipped: 24