[FLINK-5129] make the BlobServer use a distributed file system #2891
Closed
Sorry for the hassle, I found a regression and added a fix plus an appropriate test for it. Should be fine now.
This was the same implementation as FileSystemBlobStore#get(java.lang.String, java.io.File). Either of the two could have been removed, but the implementation makes most sense at the concrete file system abstraction layer, i.e. in FileSystemBlobStore.
…h for blobs
Also use JUnit's TemporaryFolder in BlobRecoveryITCase. This makes cleaning up simpler.
Previously, the BlobServer held a local copy and, if high availability (HA) was set, also copied jar files to a distributed file system. Upon restore, these files were copied to the local store, from which they were used. This commit abstracts the BlobServer's backing file system and makes it use the distributed file system directly in HA mode, i.e. without the local file system copy. Other than that, the behaviour does not change.
… HA mode
* re-factor the file system abstraction in FileSystemBlobStore so that it can be used by the task managers, too, which should not be able to delete files in a distributed file system shared among different nodes
* only download blobs from the blob server if not in HA mode or the distributed file system is not accessible by the BlobCache, e.g. at the task managers
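The split between read access for task managers and delete rights for the blob server could be sketched roughly as follows. This is only an illustration of the idea, not the code in this PR: the names `ReadOnlyBlobAccess`, `FullBlobAccess` and `DirectoryBlobAccess` are hypothetical, and a plain local directory stands in for the distributed file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical read-only capability, the only one a task manager would receive. */
interface ReadOnlyBlobAccess {
    byte[] read(String key) throws IOException;
}

/** Hypothetical full capability, reserved for the blob server. */
interface FullBlobAccess extends ReadOnlyBlobAccess {
    void put(String key, byte[] data) throws IOException;

    boolean delete(String key);
}

/** Stand-in implementation backed by a directory (assumption: not Flink's FileSystem API). */
final class DirectoryBlobAccess implements FullBlobAccess {
    private final Path root;

    DirectoryBlobAccess(Path root) {
        this.root = root;
    }

    @Override
    public byte[] read(String key) throws IOException {
        return Files.readAllBytes(root.resolve(key));
    }

    @Override
    public void put(String key, byte[] data) throws IOException {
        Files.createDirectories(root);
        Files.write(root.resolve(key), data);
    }

    @Override
    public boolean delete(String key) {
        try {
            return Files.deleteIfExists(root.resolve(key));
        } catch (IOException e) {
            return false;
        }
    }

    /** Returns a view that exposes read() only and cannot be cast back to FullBlobAccess. */
    ReadOnlyBlobAccess readOnlyView() {
        return this::read;
    }
}
```

In this sketch the blob server would keep the `DirectoryBlobAccess` instance, while caches would only ever be handed the result of `readOnlyView()`.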
…erver and cache
If not in high availability mode, local (and now also distributed) file systems again try to set up a unique directory structure so that other instances with the same configuration file or storage path do not interfere. This was lost in 8b9c7d9.
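One way to obtain such a unique directory structure is to create a randomly named subdirectory below the configured storage path. The following is a minimal sketch under that assumption; the class name `UniqueStorageDir`, the `blobStore-` prefix and the retry limit are chosen for illustration and are not taken from this PR.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

final class UniqueStorageDir {
    private static final int MAX_ATTEMPTS = 10; // assumption: arbitrary retry limit

    /** Creates a fresh, uniquely named subdirectory below the given base path. */
    static Path create(Path base) throws IOException {
        Files.createDirectories(base);
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            Path candidate = base.resolve("blobStore-" + UUID.randomUUID());
            try {
                // createDirectory fails if the name exists, so two instances never share a directory.
                return Files.createDirectory(candidate);
            } catch (FileAlreadyExistsException e) {
                // Name already taken by another instance: try a new random name.
            }
        }
        throw new IOException("Could not create a unique storage directory below " + base);
    }
}
```

Two instances that call `UniqueStorageDir.create` with the same storage path end up in different subdirectories and therefore do not see each other's files.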
…obStore and cleanup unused methods
Instead, the return value indicates whether a delete operation was successful. This is a result of the FileSystem abstraction layer in FileSystemBlobStore and follows the idiom that a failing delete operation is not that grave and the program can still continue.
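The idiom of reporting a failed delete through the return value, so that callers can carry on, might look like the sketch below. The class `BestEffortDelete` and its methods are hypothetical and use `java.nio.file` rather than the file system abstraction touched by this PR.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class BestEffortDelete {

    /** Returns true if the path is gone afterwards, false if the delete failed. Never throws IOException. */
    static boolean delete(Path path) {
        try {
            Files.deleteIfExists(path);
            return true;
        } catch (IOException e) {
            // A failing delete is not considered grave: report it and let the caller continue.
            return false;
        }
    }

    /** Tries to delete every path, continues past failures, and returns the number of failures. */
    static int deleteAll(List<Path> paths) {
        int failures = 0;
        for (Path path : paths) {
            if (!delete(path)) {
                failures++;
            }
        }
        return failures;
    }
}
```

A caller can log the failure count from `deleteAll` and move on, instead of aborting a cleanup pass on the first file that cannot be removed.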
This was set in 249b2ea.
Despite the tests completing successfully, I still need to check a few things:
I need to adapt a few things and choose a different approach. I'll re-open later.
Previously, the BlobServer held a local copy and, if high availability (HA)
was set, it also copied jar files to a distributed file system. Upon restore,
these files were copied to the local store, from which they were used.
This PR abstracts the BlobServer's backing file system and makes it use the
distributed file system directly in HA mode, i.e. without the local file system
copy. Other than that the behaviour should not change.
Secondly, BlobCache instances at the task managers also make use of this
distributed file system and download files from there instead of bothering
the blob server. As before, however, distributed files may only be deleted
by the blob server. If the distributed file system is not accessible at the blob
caches, the old behaviour is used.
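The lookup order described above (distributed file system first, blob server as the fallback) can be sketched as follows. This is an assumption-laden illustration: `BlobSourceSketch` and `FallbackBlobFetcher` are hypothetical names, and whether the real fallback is decided per request or once at start-up is not stated here.

```java
import java.io.IOException;

/** Hypothetical abstraction over "somewhere a blob can be fetched from". */
interface BlobSourceSketch {
    byte[] fetch(String key) throws IOException;
}

final class FallbackBlobFetcher {
    private final BlobSourceSketch distributedFs; // null when not in HA mode (assumption)
    private final BlobSourceSketch blobServer;

    FallbackBlobFetcher(BlobSourceSketch distributedFs, BlobSourceSketch blobServer) {
        this.distributedFs = distributedFs;
        this.blobServer = blobServer;
    }

    /** Reads from the distributed file system if possible, otherwise downloads from the blob server. */
    byte[] fetch(String key) throws IOException {
        if (distributedFs != null) {
            try {
                return distributedFs.fetch(key);
            } catch (IOException e) {
                // Distributed file system not accessible from here: fall back to the old behaviour.
            }
        }
        return blobServer.fetch(key);
    }
}
```

With this shape the blob server is only contacted when no distributed file system is configured or when reading from it fails.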
@uce can you have a look?