Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HADOOP-17981. resilient commit through etag validation #3611

Conversation

steveloughran
Copy link
Contributor

@steveloughran steveloughran commented Nov 2, 2021

Compared to #3597 this moves the recovery handling into
the application code.

draft commit message


Uses a new EtagSource API in hadoop-common (and implemented
supported in ABFS) to implement recovery from failures
during job commit of the form
"java.io.IOException: Failed to rename VersionedFileStatus"

After a rename() call fails, the etag of the file at the
destination path is compared with that of the source file
as determined when listing the parent directory.
If the two match and the source file is no longer present,
the rename is considered successful.

This algorithm relies on

  • the etag of files being different on different uploads
    (at a minimum: when different data is uploaded to the same
    destination)
  • etag values persisted across renames.
    (true for ADLS Gen2, not for S3)
  • directory list API calls returning the etags of all files.

To enable users MUST set
mapreduce.fileoutputcommitter.algorithm.version.v1.experimental.parallel.rename.recovery
to "true"

users SHOULD set
fs.azure.rename.raises.exceptions to true

That is not mandatory, but it gives better error reporting on all forms of rename
failures, and when handling recoverable errors in job commits, reduces the
number of IO requests by one, so reducing load in an IO-intensive phase
of the job.


For code changes:

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@steveloughran
Copy link
Contributor Author

somehow test fs with renaming raising exceptions is being picked up by other test suites...caching,

[ERROR] Failures:
[ERROR]   ITestAbfsFileSystemContractRename>AbstractContractRenameTest.testRenameFileOverExistingFile:138->Assert.assertFalse:65->Assert.assertTrue:42->Assert.fail:89 expected rename(abfs://stevel-testing@stevelukwest.dfs.core.windows.net/fork-7/test/source-256.txt, abfs://stevel-testing@stevelukwest.dfs.core.windows.net/fork-7/test/dest-512.txt) to be rejected with false, but got exception
[ERROR]   ITestAbfsFileSystemContractRename>AbstractContractRenameTest.testRenameNonexistentFile:77 Renaming a missing file unexpectedly threw an exception
[ERROR] Errors:
[ERROR]   ITestAzureBlobFileSystemBasics>FileSystemContractBaseTest.testRenameChildDirForbidden:728->FileSystemContractBaseTest.rename:593 » FileAlreadyExists
[ERROR]   ITestAzureBlobFileSystemBasics>FileSystemContractBaseTest.testRenameDirectoryAsExistingFile:537->FileSystemContractBaseTest.rename:593 » FileAlreadyExists
[ERROR]   ITestAzureBlobFileSystemBasics>FileSystemContractBaseTest.testRenameDirectoryMoveToNonExistentDirectory:504->FileSystemContractBaseTest.rename:593 » FileNotFound
[ERROR]   ITestAzureBlobFileSystemBasics>FileSystemContractBaseTest.testRenameFileAsExistingFile:481->FileSystemContractBaseTest.rename:593 » FileAlreadyExists
[ERROR]   ITestAzureBlobFileSystemBasics>FileSystemContractBaseTest.testRenameFileMoveToNonExistentDirectory:459->FileSystemContractBaseTest.rename:593 » FileNotFound
[ERROR]   ITestAzureBlobFileSystemBasics>FileSystemContractBaseTest.testRenameNonExistentPath:449->FileSystemContractBaseTest.rename:593 » FileNotFound
[INFO]

Copy link
Contributor

@snvijaya snvijaya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with the rename recovery logic.
As discussed, for ABFS case will work towards a long term solution where the driver and server can handle the case and avoid the client having to do the src-destn eTag checks as in ResilientCommitByRenameHelper.

@steveloughran
Copy link
Contributor Author

tested against abfs cardiff.

two failures with threads=8

HADOOP-17797. transient failure of ITestAbfsListStatusRemoteIterator.

HADOOP-17989. ITestAzureBlobFileSystemDelete failing "Operations has null HTTP response"

@steveloughran
Copy link
Contributor Author

checkstyle on imports

@steveloughran
Copy link
Contributor Author

pushed up import fixup patch.
tests: azure cardiff. all good except for HADOOP-17989. testDeleteIdempotency(org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemDelete)

Copy link
Contributor

@mukund-thakur mukund-thakur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. Will approve once the YETUS is fixed.
Nice tests.

* has the same etag. as a result, this is considered a success.
*/
@Test
public void testCommitMissingDestDir() throws Throwable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the description and javadoc is wrong here ?

@steveloughran steveloughran force-pushed the mr/HADOOP-17981-resilient-commit-etag-api branch from 9e33d29 to f7763f8 Compare November 8, 2021 10:20
@steveloughran
Copy link
Contributor Author

thanks, updated javadocs.

bear in mind this is not going to be merged in as is; once the review is in I plan to

  • isolate the etag changes into their own pr & add S3A support
  • use the resilient commit helper to commit in the manifest committer.

Rajesh Balamohan and others added 8 commits November 9, 2021 17:26
…commit

Contributed by Rajesh Balamohan

Change-Id: I8d2cedb0ba8b6aba2f1d0f814f52d77b156f1206
Compared to the previous pull request, this moves the recovery handling into
the application code.

However, the nominally simpler "fs api" strategy had actually got more complicated with
interfaces in RawLocalFS for testing, etc etc, and the hive team were
showing a dangerous amount of interest and using the same API,
judging by the comments.

This solution at one new interface in hadoop fs, which is straight forward it to declare
stable as s3 and abfs and others can all serve this field up immediately.

With the interface and part capabilities probes, anything is free to use this for:
Tracking changes across files, verifying that objects are unchanged
and this special case of resilience recovery.

What else do we need?
We would like, but don't need, an option for abfs to
throw meaningful exceptions on rename failures
and for it not to attempt to use modtimes to recover from
source file not found events.

Change-Id: I10e6657f37769df3917b3f34b3b4cd066827fba7
abfs client to
* stop trying to recover on modtime (which was broken anyway)
* optionally raise exceptions on rename failure
* not yet cut the unused obsolete methods.

Algorithm tuning during review.
* only tries on FNFE
* if rename returns false, escalation will raise FNFE
  on source not found, so automatically fall into
  recovery
* recovery attempts can be disabled

Change-Id: If979166775333560c26c93d1d20c171b168a0100
abfs integration test suites, with and without rename raising
exceptions.

commit helper counts how many operations were recovered from;
this is logged in the committer.

Change-Id: Id9a3bb49e71bccc5fa267ffb9077bfbed8008117
completely cut tests for rename idempotency in abfs and obsolete method in
AbfsClient

Change-Id: I21a42d9c3858e3cb846e0b2dc90003c47209e912
Change-Id: I2d2984e6225834d067ec550e50766e343fb11ede
Change-Id: I922802b0f8251c56878bdd5419d4524b39b39ccd
Change-Id: I2bc9aefa0755ed68115526d5f6f3610d9eec6704
@steveloughran steveloughran force-pushed the mr/HADOOP-17981-resilient-commit-etag-api branch from c67143e to b3cb4f8 Compare November 9, 2021 17:27
@apache apache deleted a comment from hadoop-yetus Nov 10, 2021
@steveloughran steveloughran marked this pull request as draft November 10, 2021 10:07
@steveloughran
Copy link
Contributor Author

converting this to a draft as i don't want it merged in; the etag pr and adoption of the resilient commit helper is how I intend to do it.

  • isolates the etag api and adds s3a coverage
  • avoids going near FileOutputCommitter

the manifest committer is faster than rajesh's patch on object stores where listings are slow (done during task commit), and because it doesn't rename in task commit, is atomic even on gcs. with rate limiting we can even keep it below the load threshold for azure storage.
and, because it doesn't touch FileOutputCommitter, safer to merge

@steveloughran
Copy link
Contributor Author

note, setting "fs.azure.rename.raises.exceptions" to "true" highlights that the default trash policy can't handle exceptions raised there

steveloughran added a commit to steveloughran/hadoop that referenced this pull request Nov 24, 2021
* Pull out from HADOOP-17981/apache#3611
* Add S3A support
* Add requirement that located status API calls MUST also support the API,
and do this for ABFS.

s3a code retains now deprecated getETag entries. lots of references in
the s3guard code which I am leaving alone.

Change-Id: I16c10c73873b3cbe60433adb9bbc7c1441221dbb
@apache apache deleted a comment from hadoop-yetus Dec 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants