[NC | NSFS] Add mm policy migration rule to kick in every 15 mins #7530
Explain the changes
This PR adds a new rule to the Spectrum Scale policies that migrates resident data from disk to tape every 15 minutes, alongside the restore trigger, which also runs every 15 minutes.
Why?
Previously, if a user uploaded an object to the GLACIER storage class and then issued a recall, they had to wait at least until 00:15 UTC to get their object back. That is acceptable from an S3 protocol perspective IMO, as restores can take up to days, but in this case we are simply taking more time to restore because the resident files are migrated to tape at 00:00 UTC, and only once migrated do we bring them back to disk (every 15 mins); hence the user has to wait until 00:15 UTC.
What?
With this PR, we have 3 different flows: one for migrating restored data (aka premigrated data), one for migrating data that was freshly uploaded and hence is not yet on tape (aka resident data), and one for restoring data from tape to disk.
We trigger migration of premigrated data every day at 00:00 UTC, as the granularity for restore request is in days and there is no point in issuing it more often.
We trigger the restoration every 15 mins.
We trigger migration of resident data (freshly uploaded data) every 15 mins.
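As a rough illustration of the three schedules above, here is a minimal Python sketch. It is not the policy code from this PR: the flow names and the `flows_due` helper are hypothetical, and it assumes the 15-minute triggers are aligned to quarter-hour boundaries in UTC.

```python
from datetime import datetime, timezone


def flows_due(now: datetime) -> list[str]:
    """Return the (hypothetical) flow names that fire at the given tick.

    Expects a timezone-aware datetime; it is converted to UTC first.
    """
    now = now.astimezone(timezone.utc)
    due = []
    # Assumption: 15-minute triggers fire at :00, :15, :30 and :45.
    if now.minute % 15 == 0:
        due.append("migrate_resident")   # freshly uploaded data -> tape
        due.append("restore")            # tape -> disk
    # Premigrated data is migrated only once a day, at 00:00 UTC.
    if now.hour == 0 and now.minute == 0:
        due.append("migrate_premigrated")
    return due
```

For example, a tick at 10:15 UTC yields only the resident migration and the restore, while the 00:00 UTC tick yields all three.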
With this PR, if a user uploads an object and immediately requests a restore, the restore will take at most 30 mins (15 mins to first move the data to tape and then 15 mins to move it back to disk).
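The 30-minute bound can be checked with a small Python sketch. The helpers `next_tick` and `earliest_restore` are hypothetical; the sketch assumes triggers fire on quarter-hour boundaries, that each flow completes instantly, and that a file migrated on one tick is only picked up by the restore flow on the following tick.

```python
from datetime import datetime, timedelta, timezone

INTERVAL = timedelta(minutes=15)


def next_tick(t: datetime) -> datetime:
    """First quarter-hour boundary strictly after t."""
    floor = t.replace(minute=t.minute - t.minute % 15, second=0, microsecond=0)
    return floor + INTERVAL


def earliest_restore(upload: datetime) -> datetime:
    """Earliest tick at which an object uploaded at `upload` is back on disk,
    assuming the restore is requested right after the upload."""
    migrated_at = next_tick(upload)   # resident -> migrated (disk to tape)
    return next_tick(migrated_at)     # migrated -> restored (tape to disk)


upload = datetime(2024, 1, 1, 10, 0, 1, tzinfo=timezone.utc)
wait = earliest_restore(upload) - upload
print(wait)  # just under the 30-minute worst case
```

Under these assumptions the wait ranges from just over 15 minutes (upload right before a tick) to 30 minutes (upload right at a tick).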
Can there be a race between data moving to tape and back to disk in the same flow?
I don't think so. The flows identify their candidates based on the file's state on the filesystem. If the file is resident, it will be handed over to the migration flow, while if the file is migrated, it will be handed over to the restore flow.
Tested this in the tape simulation environment.
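The state-based candidate selection can be pictured with a minimal Python sketch. The `FileState` enum, the flow names, and `flow_for` are hypothetical and only model the reasoning above; the actual selection is done by the Spectrum Scale policy rules, not by code like this.

```python
from enum import Enum


class FileState(Enum):
    RESIDENT = "resident"        # on disk only (freshly uploaded)
    PREMIGRATED = "premigrated"  # on both disk and tape (restored data)
    MIGRATED = "migrated"        # on tape only


# Each state maps to exactly one flow, so at any instant a file is a
# candidate for at most one flow.
_FLOW_BY_STATE = {
    FileState.RESIDENT: "migrate_resident",
    FileState.PREMIGRATED: "migrate_premigrated",
    FileState.MIGRATED: "restore",
}


def flow_for(state: FileState) -> str:
    """Return the (hypothetical) flow that picks up a file in this state."""
    return _FLOW_BY_STATE[state]
```

Because a file has a single state at any moment, the same file cannot be selected by both the migration flow and the restore flow on the same tick in this model.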