HADOOP-16696: ABFS Always read ahead config, to use read ahead even for non sequential reads. #1708
base: trunk
Conversation
Updating with apache trunk
💔 -1 overall
This message was automatically generated.
No new feature was added; an additional configuration option was added to allow clients more control over the readAhead feature. Therefore, no additional tests were added. Tests were run manually; the results are as follows: [INFO] Tests run: 42, Failures: 0, Errors: 0, Skipped: 0
- needs some docs
- needs at least one test
Have you any benchmark numbers? And how would it compare to a Parquet client using readFully over seek + read? I believe there are opportunities there, and as https://issues.apache.org/jira/browse/IMPALA-8525 shows, significant benefits if the reader and the FS are best aligned.
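To illustrate the readFully-vs-seek+read contrast raised above, here is a toy in-memory sketch (not ABFS or S3A code; all names are illustrative). The point is what the store sees: a positioned readFully states the whole range up front, so one exactly-sized request suffices, whereas seek + read arrives as separate, possibly short requests.

```java
// In-memory stand-in for a seekable stream, contrasting the two call
// patterns a cloud connector sees from clients like Parquet readers.
public class AccessPatterns {
    private final byte[] data;
    private long pos;
    public int requestCount; // how many "GET"-like requests the store saw

    public AccessPatterns(byte[] data) { this.data = data; }

    public void seek(long p) { pos = p; }

    // seek+read style: each call is an independent request to the store,
    // and the connector only learns the wanted range piecemeal.
    public int read(byte[] buf, int off, int len) {
        requestCount++;
        if (pos >= data.length) return -1;
        int n = Math.min(len, data.length - (int) pos);
        System.arraycopy(data, (int) pos, buf, off, n);
        pos += n;
        return n;
    }

    // readFully style: the whole range is known, so one request suffices
    // and the connector can size it exactly.
    public void readFully(long position, byte[] buf) {
        requestCount++;
        System.arraycopy(data, (int) position, buf, 0, buf.length);
    }
}
```

The Hadoop equivalents are `FSDataInputStream.readFully(pos, buf)` (via PositionedReadable) versus `seek(pos)` followed by `read(buf)`.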
-        AbfsHttpConstants.FORWARD_SLASH + getRelativePath(path), contentLength,
-            abfsConfiguration.getReadBufferSize(), abfsConfiguration.getReadAheadQueueDepth(),
-            abfsConfiguration.getTolerateOobAppends(), eTag);
+        AbfsHttpConstants.FORWARD_SLASH + getRelativePath(path), contentLength, eTag, abfsConfiguration);
I'm not sure whether passing the full config down is the right approach compared to passing the explicit options in. Can you justify this?
The number of options we are passing to the stream has increased to 4, causing Hadoop checkstyle issues (total arguments > 8). Also, increasing the number of arguments doesn't feel like a scalable approach, in my opinion. I agree that passing the whole abfsConfig object isn't very elegant either. How would you feel about a new abfsInputStreamConfig structure (class) to be used for passing stream-related config options?
Thanks,
Saurabh
I'd prefer some structure "ReadContext" to pass in
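One hedged sketch of what such a "ReadContext" structure could look like — the class, field names, and defaults below are hypothetical illustrations of the builder idea, not the API that actually landed in Hadoop:

```java
// Hypothetical "ReadContext": bundles the stream-related options so the
// AbfsInputStream constructor takes one object instead of 8+ arguments,
// and new options can be added without changing the constructor signature.
public class ReadContext {
    private final int readBufferSize;
    private final int readAheadQueueDepth;
    private final boolean tolerateOobAppends;
    private final boolean alwaysReadAhead;

    private ReadContext(Builder b) {
        this.readBufferSize = b.readBufferSize;
        this.readAheadQueueDepth = b.readAheadQueueDepth;
        this.tolerateOobAppends = b.tolerateOobAppends;
        this.alwaysReadAhead = b.alwaysReadAhead;
    }

    public int getReadBufferSize() { return readBufferSize; }
    public int getReadAheadQueueDepth() { return readAheadQueueDepth; }
    public boolean isTolerateOobAppends() { return tolerateOobAppends; }
    public boolean isAlwaysReadAhead() { return alwaysReadAhead; }

    public static Builder builder() { return new Builder(); }

    public static class Builder {
        // illustrative defaults, not the real ABFS values
        private int readBufferSize = 4 * 1024 * 1024;
        private int readAheadQueueDepth = 2;
        private boolean tolerateOobAppends = false;
        private boolean alwaysReadAhead = false;

        public Builder readBufferSize(int v) { readBufferSize = v; return this; }
        public Builder readAheadQueueDepth(int v) { readAheadQueueDepth = v; return this; }
        public Builder tolerateOobAppends(boolean v) { tolerateOobAppends = v; return this; }
        public Builder alwaysReadAhead(boolean v) { alwaysReadAhead = v; return this; }
        public ReadContext build() { return new ReadContext(this); }
    }
}
```

A builder keeps the stream's constructor at a fixed arity, which sidesteps the checkstyle argument-count limit mentioned earlier in this thread.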
🎊 +1 overall
This message was automatically generated.
Hi @steveloughran,
🎊 +1 overall
This message was automatically generated.
Hi @steveloughran, @DadanielZ, thank you for your comments; I have tried to address all of them with my latest commits. Please kindly review again and let me know if you have any suggestions/concerns. Thanks,
Looks good to me, thanks for the tests!
+1
@sapant-msft please run
🎊 +1 overall
This message was automatically generated.
@DadanielZ Thanks for the suggestion; I made a minor fix to the test, please kindly help review. The test results post the fix are (ran using: mvn -T 1C -Dparallel-tests=abfs -Dscale clean verify) Thanks,
The one test (ITestAzureBlobFileSystemCLI) which was failing earlier was fixed in PR; after a sync with trunk it seems to be passing now. Thanks,
just accidentally hit the review rather than comment button. sorry
setting the readahead

I have seen the Parquet readFooter code and agree, readahead matters there. For the S3A readahead (HADOOP-13208? I forget) we worked off ORC traces and optimised for the PositionedReadable reads, which often surface as a pair of back-to-back reads; setting a minimum read ahead range handles this codepath. If you're going near read ahead, it would be good if you implement

Presumably your benchmarks were executed in Azure? That is the likely real-world deployment. I'd be curious
Regarding the patch: when would we ever really want to turn this off? It is a serious question. For ORC, our past studies with S3A say "readahead". For Parquet, the implication is the same. What other uses do we care about? An HBase trace would be good here.

A key thing I'm trying to emphasise is that a small readahead is not expensive compared to the delays caused by having to issue two near-adjacent GET requests. The main penalty is that on a seek outside the range we have to drain the stream, which costs ~(bytes-remaining/bandwidth).

I would prefer that, rather than a boolean value, the stream always does the readahead except when the range == 0. That way setReadahead(0) and the config options are the way to disable it. And as noted: I don't think it should be disabled on random IO. S3A reads always show significant speed up with readahead > 0; the only issue is what makes a good value?

Testing

Looking at the test: if you add some package-private methods to query the readahead state, e.g. (a) the range value and (b) bytes remaining in the current buffer, then you can verify that
Yes, I am adding extra homework. Sorry. But as others will note: I like my tests, and I like tests in the hadoop- test suites, as they catch regressions without needing any external tests.
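The "always read ahead unless the range is 0" preference discussed above reduces the boolean flag to a degenerate case of the range setting. A rough sketch of that policy (hypothetical names; an illustration of the idea, not the ABFS implementation):

```java
// Sketch: decide how many bytes to request from the store for one read.
// With readAheadRange == 0 the request is exactly what the caller asked
// for, i.e. "readahead disabled" -- so setReadahead(0) replaces a flag.
public class ReadAheadPolicy {
    private final long readAheadRange; // e.g. set via setReadahead(n)

    public ReadAheadPolicy(long readAheadRange) {
        if (readAheadRange < 0) {
            throw new IllegalArgumentException("range must be >= 0");
        }
        this.readAheadRange = readAheadRange;
    }

    /** Bytes to fetch for a read of {@code requested} bytes at {@code pos}. */
    public long bytesToFetch(long pos, long requested, long fileLength) {
        long want = requested + readAheadRange; // extend by the range
        return Math.min(want, fileLength - pos); // never read past EOF
    }
}
```

This also matches the observation that a slightly-too-large GET is cheap next to issuing two near-adjacent GETs, since the extension is capped at the configured range.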
Adding a config alwaysReadAhead, set to false by default, to allow abfs' read ahead capability to be used even for non-sequential reads. At the moment, only sequential reads support read ahead: a read ahead is queued only after a sequential read is made, so we miss out on gains where a non-sequential read is followed by a sequential read. For example, seek(n), read 1 byte, read 10 bytes, seek(m) could benefit from read ahead, but this is not currently supported.
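The sequential-only gating this description refers to can be sketched as a predicate: a readahead is queued only when the current read starts exactly where the previous one ended, so the first read after any seek gets no prefetch. This is a toy model with hypothetical names, not the actual AbfsInputStream logic:

```java
// Sketch of the "sequential reads only" gating the new config relaxes.
public class ReadAheadGate {
    private long nextExpectedOffset = -1;  // end offset of the previous read
    private final boolean alwaysReadAhead; // the proposed config, default false

    public ReadAheadGate(boolean alwaysReadAhead) {
        this.alwaysReadAhead = alwaysReadAhead;
    }

    /** Called per read; returns whether a readahead should be queued. */
    public boolean shouldQueueReadAhead(long offset, int length) {
        boolean sequential = (offset == nextExpectedOffset);
        nextExpectedOffset = offset + length;
        return alwaysReadAhead || sequential;
    }
}
```

In the example from the description, the 1-byte read after seek(n) and the first read after seek(m) both fail the sequential check, so only the middle read benefits unless the new option is on.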