-
Notifications
You must be signed in to change notification settings - Fork 13.8k
[FLINK-6020] Introduce BlobServer#readWriteLock to synchronize file creation and deletion #3888
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
c6ff2ce
to
d308546
Compare
if (!blobFile.exists()) { | ||
// first we have to release the read lock in order to acquire the write lock | ||
readLock.unlock(); | ||
writeLock.lock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In between upgrading from read lock to write lock, multiple threads can reach this point and as far as I see, then a file can be written more often then required. I would assume the code still produces correct result, but could do duplicate work. An obvious fix would be to re-check blobFile.exists()
under the write lock, but now sure if the costs of another meta data query per write could offset occasional, but unlikely duplicate work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, I'm also not sure which version would be more efficient. Given that the file transfer dominates the execution time here, I assume that the additional check won't hurt. Will change it accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Except for my one inline comment, the changes look good.
One more thing that I noticed is that the PR also does some changes to the delete path, which are not covered by a (new) test.
Thanks for the quick review @StefanRRichter. I will address your comment and also add a test case for the delete path. |
d308546
to
b3f427a
Compare
…reation and deletion This commit introduces a BlobServer#readWriteLock in order to synchronize file creation and deletion operations in BlobServerConnection and BlobServer. This will prevent that multiple put and get operations interfere with each other and with get operations. The get operations are synchronized using the read lock in order to guarantee some kind of parallelism. Add Get and Delete operation tests This closes apache#3888.
…reation and deletion This commit introduces a BlobServer#readWriteLock in order to synchronize file creation and deletion operations in BlobServerConnection and BlobServer. This will prevent that multiple put and get operations interfere with each other and with get operations. The get operations are synchronized using the read lock in order to guarantee some kind of parallelism. Add Get and Delete operation tests This closes apache#3888.
…reation and deletion This commit introduces a BlobServer#readWriteLock in order to synchronize file creation and deletion operations in BlobServerConnection and BlobServer. This will prevent that multiple put and get operations interfere with each other and with get operations. The get operations are synchronized using the read lock in order to guarantee some kind of parallelism. Add Get and Delete operation tests This closes apache#3888.
…reation and deletion This commit introduces a BlobServer#readWriteLock in order to synchronize file creation and deletion operations in BlobServerConnection and BlobServer. This will prevent that multiple put and get operations interfere with each other and with get operations. The get operations are synchronized using the read lock in order to guarantee some kind of parallelism. Add Get and Delete operation tests This closes apache#3888.
The ConjunctFuture now returns the set of future values once it is completed. Introduce WaitingConjunctFuture; Fix thread safety issue with ResultConjunctFuture The WaitingConjunctFuture waits for the completion of its futures. The future values are discarded making it more efficient than the ResultConjunctFuture which returns the futures' values. The WaitingConjunctFuture is instantiated via FutureUtils.waitForAll(Collection<Future>). This closes apache#3873.
…teHandleStore In order to store completed checkpoints in an increasing order in ZooKeeper, the paths for the completed checkpoint is no generated by String.format("/%019d", checkpointId) instead of String.format("/%s", checkpointId). This makes sure that the converted long will always have the same length with leading 0s. Fix failing ZooKeeperCompletedCheckpointStoreITCase This closes apache#3884.
…abilityServices The HighAvailabilityService creates a single BlobStoreService instance which is shared by all BlobServer and BlobCache instances. The BlobStoreService's lifecycle is exclusively managed by the HighAvailabilityServices. This means that the BlobStore's content is only cleaned up if the HighAvailabilityService's HA data is cleaned up. Having this single point of control, makes it easier to decide when to discard HA data (e.g. in case of a successful job execution) and when to retain the data (e.g. for recovery). Close and cleanup all data of BlobStore in HighAvailabilityServices Use HighAvailabilityServices to create BlobStore Introduce BlobStoreService interface to hide close and closeAndCleanupAllData methods This closes apache#3864.
…reation and deletion This commit introduces a BlobServer#readWriteLock in order to synchronize file creation and deletion operations in BlobServerConnection and BlobServer. This will prevent that multiple put and get operations interfere with each other and with get operations. The get operations are synchronized using the read lock in order to guarantee some kind of parallelism. Add Get and Delete operation tests This closes apache#3888.
b3f427a
to
19e4fac
Compare
…reation and deletion This commit introduces a BlobServer#readWriteLock in order to synchronize file creation and deletion operations in BlobServerConnection and BlobServer. This will prevent that multiple put and get operations interfere with each other and with get operations. The get operations are synchronized using the read lock in order to guarantee some kind of parallelism. Add Get and Delete operation tests This closes #3888.
This PR is based on #3873 and #3864.
This commit introduces a BlobServer#readWriteLock in order to synchronize file creation
and deletion operations in BlobServerConnection and BlobServer. This will prevent
that multiple put, delete and get operations interfere with each other.
The get operations are synchronized using the read lock in order to guarantee some kind of
parallelism.
What this PR does not address is the handling of concurrent writes and reads to the
BlobStore
. This could be solved via SUCCESS files in order to indicate the completion of a file. However, the first read operation should now happen strictly after the write operation due to the locking.