HADOOP-13560 S3A to support huge file writes and operations -with tests #125

Closed

Conversation

steveloughran
Contributor

Adds

Scale tests for S3A huge file support:

* always running at the MB size (maybe best to make this optional)
  * configurable to bigger sizes in the auth-keys XML or on the build command line with -Dfs.s3a.scale.test.huge.filesize=1000 (see the sketch after this list)
* limited to upload, seek, read, rename, delete. The JUnit test cases are explicitly set up to run in order here.
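For illustration, a hedged sketch of how the size could also be pinned in the auth-keys XML. The property name is the one quoted above; treating the value as a plain size figure (1000, as in the -D example) matches the PR text, but the exact unit and whether auth-keys.xml is the right place for a per-developer override are assumptions.

```xml
<!-- Sketch only: fs.s3a.scale.test.huge.filesize is the property named above;
     the value 1000 mirrors the -D example, but its unit is an assumption here. -->
<property>
  <name>fs.s3a.scale.test.huge.filesize</name>
  <value>1000</value>
</property>
```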

New scalable output stream for writing, S3ABlockOutputStream:

* always saves in incremental blocks as writes proceed, with block size == partition size.
* supports the fast output stream's in-memory buffer code (for regression testing).
* supports a back end which buffers blocks in files, using round-robin disk allocation. As such, write/read bandwidth is limited to aggregate HDD bandwidth.
* adds extra failure resilience as testing throws up failure conditions (network timeouts, no response from the server on multipart commit, etc.).
* adds instrumentation, including callbacks from the AWS SDK to update gauges and counters (in progress).

What we have here is essentially something that can replace both the classic "save to a file, upload at the end" stream and the fast "store it all in RAM and hope there's space" stream. It should offer incremental upload for faster output of larger files compared to the classic file stream, with the scalability the fast one lacks, and the instrumentation to show what's happening.
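As a rough illustration (not the patch's documentation), enabling the new stream might look like the core-site.xml fragment below. fs.s3a.block.output is the option named later in this PR, and treating it as a simple boolean switch is an assumption; fs.s3a.buffer.dir is the existing S3A property for local buffer space, and using it for the buffer-by-HDD backend is likewise an assumption.

```xml
<!-- Sketch only: assumes fs.s3a.block.output is a boolean switch for the new
     S3ABlockOutputStream, and that fs.s3a.buffer.dir (an existing S3A property)
     supplies the local directories used when buffering blocks on disk. -->
<property>
  <name>fs.s3a.block.output</name>
  <value>true</value>
</property>
<property>
  <name>fs.s3a.buffer.dir</name>
  <value>/tmp/s3a</value>
</property>
```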

@@ -183,6 +199,8 @@
<include>**/ITestS3AFileSystemContract.java</include>
<include>**/ITestS3AMiniYarnCluster.java</include>
<include>**/ITest*Root*.java</include>
<include>**/ITestS3AFileContextStatistics.java</include>

Moved this line down as it was failing sometimes.

…ing on inside S3A, including a gauge of active request counts. +more troubleshooting docs. The fast output stream will retry on errors
Block streaming is in; testing at moderate scale (<100 MB).

You can choose between buffer-by-RAM (the current fast uploader) and buffer-by-HDD. In a test using an SSD and remote S3 I got ~1.38 MB/s of bandwidth, and something similar (1.44 MB/s) with RAM. But we shouldn't run out of heap on the HDD option. RAM buffering uses the existing byte-array buffers, to ease source-code migration off FastUpload (which is still there, for now).
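To show what this stream serves, here is a hedged Java sketch of an application writing a large object through the standard FileSystem API. The bucket name, write sizes, and the progress callback are illustrative assumptions; nothing here is specific to the patch beyond the fact that the write path goes through an S3A output stream.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HugeFileWriteSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Hypothetical destination; any s3a:// URI with valid credentials would do.
    Path dest = new Path("s3a://example-bucket/scale/hugefile.bin");
    FileSystem fs = FileSystem.get(dest.toUri(), conf);

    byte[] block = new byte[1 << 20];                       // 1 MB per write call
    try (FSDataOutputStream out =
             fs.create(dest, () -> System.out.println("progress callback"))) {
      for (int i = 0; i < 256; i++) {                       // ~256 MB in total
        out.write(block);                                   // data is buffered block by block as it arrives
      }
    }                                                       // close() finishes the upload
  }
}
```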

* I do plan to add pooled ByteBuffers.
* Add metrics of total and ongoing upload, including tracking what quantity of the outstanding block data has actually been uploaded.
* Supersede the fast output stream.
* Run tests, tune outcomes (especially race conditions in multipart operations).
* More debug statements.
* Fixed the name of the fs.s3a.block.output option in core-default and the docs. Thanks Rajesh!
* More attempts at managing the close() operation rigorously. No evidence this is the cause of the problem Rajesh saw, though.
* Rearranged the layout of code in S3ADataBlocks so associated classes are adjacent.
* Retry on multipart commit, adding sleep statements between retries (see the sketch after this list).
* New Progress log for logging progress at debug level in S3A. Why? Because logging events every 8 KB gets too chatty when debugging many-MB uploads.
* Gauges of active block uploads wired up.
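The multipart-commit retry mentioned above could be structured along these lines. This is a minimal sketch: the class name, attempt count, and sleep interval are assumptions, not the values or code in the patch.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class MultipartCommitRetrySketch {

  /**
   * Invoke the operation, retrying with a fixed sleep between attempts.
   * (Illustrative only; not the retry logic in the patch.)
   * @param attempts number of tries; must be at least 1
   */
  public static <T> T retry(Callable<T> operation, int attempts, long sleepMillis)
      throws IOException {
    IOException lastFailure = null;
    for (int i = 0; i < attempts; i++) {
      try {
        return operation.call();
      } catch (Exception e) {
        lastFailure = (e instanceof IOException) ? (IOException) e : new IOException(e);
        if (i + 1 < attempts) {
          try {
            Thread.sleep(sleepMillis);            // back off before the next attempt
          } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();   // preserve the interrupt status
            break;
          }
        }
      }
    }
    if (lastFailure == null) {
      throw new IOException("no attempts were made");
    }
    throw lastFailure;
  }
}
```

A caller would wrap the SDK's multipart-completion call in such a helper, so a transient no-response from the server results in a delayed re-attempt rather than an immediate failure of the whole upload.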
@steveloughran steveloughran deleted the s3/HADOOP-13560-5GB-blobs branch October 7, 2016 17:45