- Support the "GLACIER" storage class in NooBaa, which should behave similarly to the AWS "GLACIER" storage class.
- NooBaa should allow limited support of the `RestoreObject` API.
The current approach to supporting the `GLACIER` storage class is to separate the implementation into two parts.
The main NooBaa process only manages metadata on the files/objects via extended attributes and maintains the relevant
data in a log file. Another process (currently `manage_nsfs`) manages the actual movement of the files between
disk and tape.
There are 3 primary flows of concern and this document discusses all 3 of them:
- Upload an object to the `GLACIER` storage class (API: `PutObject`).
- Restore objects that were uploaded to the `GLACIER` storage class (API: `RestoreObject`).
- Copy objects where the source is an object stored in `GLACIER` (API: `PutObject`).
An important component of all the flows is the persistent log. NooBaa has a `PersistentLogger`, which is extremely simple in some senses.
It does not deal with fsync issues, partial writes, holes, etc.; it just appends data separated by a newline character.
`PersistentLogger` features:
- Exposes an `append` method which adds data to the file.
- Can perform auto rotation of the file, which makes sure that a single log is never too large for the log consumer to consume.
- Exposes a `process_inactive` method which allows "safe" iteration over the previous log files.
- Tries to make sure that no data loss happens due to process-level races:
  - `n` processes open the log file while a "consumer" swoops in and tries to process the file, effectively losing the current writes (due to processing a partially written file and ultimately invoking `unlink` on the file). This isn't possible, as the `process_inactive` method makes sure that it doesn't iterate over the "current active file".
  - `k` processes out of `n` (such that `k < n`) open the persistent log while a "consumer" swoops in and tries to process the file, effectively losing the current writes (due to unlinking the file that others hold a reference to). The `process_inactive` method will not protect against this, as technically the "current active file" is a different file, but this is still not possible because the "consumer" needs to have an "EXCLUSIVE" lock on the file before it can process it. This makes sure that for as long as any process is writing to the file, the "consumer" cannot consume the file and will block.
  - `k` processes out of `n` (such that `k < n`) open the persistent log, but before the NSFS process can get a "SHARED" lock on the file, the "consumer" process swoops in, processes the file and then issues `unlink` on it. The unlink will not delete the file, as `k` processes have an open FD to it, but as soon as those processes are done writing and close the FD, the file will be deleted, which would result in lost writes. This isn't possible, as `PersistentLogger` does not allow writing to a file until it can get a lock on the file and ensure that there are `> 0` links to the file. If there are no links, it tries to open the file again, assuming that the consumer has issued `unlink` on the file it holds the FD to.
  - Multiple processes try to swap the same file, causing issues. This isn't possible, as a process needs to acquire a "swap lock" before it performs the swap, which essentially serializes the operations. Further, the swapping is done only once by ensuring that the process only swaps if the current `inode` matches the `inode` it got when it opened the file initially; if not, it skips the swapping.
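
The locking and link-count checks described above can be sketched as follows. This is a minimal illustration in Python, not NooBaa's `PersistentLogger`: the function names `append_entry` and `consume` are hypothetical, `flock` stands in for whatever locking primitive the real implementation uses, and rotation and the "swap lock" are left out.

```python
import fcntl
import os


def append_entry(log_path: str, entry: str) -> None:
    """Append one newline-terminated entry, reopening if the file was unlinked."""
    while True:
        fd = os.open(log_path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
        try:
            # Writers take a SHARED lock; a consumer holding EXCLUSIVE blocks us.
            fcntl.flock(fd, fcntl.LOCK_SH)
            # Zero links means a consumer unlinked the file after we opened it,
            # so anything written through this FD would be lost.
            if os.fstat(fd).st_nlink == 0:
                continue  # retry: O_CREAT opens a fresh file at the same path
            os.write(fd, (entry + "\n").encode())
            return
        finally:
            os.close(fd)  # closing the FD also releases the lock


def consume(log_path: str, handle) -> None:
    """Process every entry under an EXCLUSIVE lock, then unlink the file."""
    with open(log_path, "r") as f:
        # Blocks for as long as any writer still holds a SHARED lock.
        fcntl.flock(f, fcntl.LOCK_EX)
        for line in f:
            handle(line.rstrip("\n"))
        os.unlink(log_path)
```

A writer that opened the file just before `consume` unlinked it will observe `st_nlink == 0` once it gets the lock and will reopen the path instead of writing into the deleted file.
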
- Scripts should be placed in the `config.NSFS_GLACIER_TAPECLOUD_BIN_DIR` directory.
- The `migrate` script should take a file name and perform migration of the files listed in the given file. The output should comply with the `eeadm migrate` command.
- The `recall` script should take a file name and perform recall of the files listed in the given file. The output should comply with the `eeadm recall` command.
- The `task_show` script should take a task id as its argument and output the task's status. The output should be similar to `eeadm task show -r <id>`.
- The `process_expired` script should scan the FS for all the expired files and should migrate them back to disk. NooBaa neither expects nor cares about the output generated here.
- The `low_free_space` script should output `true` if the disk has low free space, and `false` otherwise.
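
As an illustration of how small these scripts can be, here is a hedged sketch of a `low_free_space` script written in Python. Only the `true`/`false` output contract comes from the list above; the 10% threshold, the `min_free_ratio` parameter and the choice to take the path as the first argument are assumptions made for this example.

```python
import shutil
import sys


def is_low_free_space(free_bytes: int, total_bytes: int,
                      min_free_ratio: float = 0.1) -> bool:
    """Return True when less than min_free_ratio of the disk is free.

    The 10% default is an arbitrary value picked for this sketch.
    """
    return free_bytes < total_bytes * min_free_ratio


def main(path: str) -> None:
    usage = shutil.disk_usage(path)
    # The contract only requires the literal strings "true" or "false".
    print("true" if is_low_free_space(usage.free, usage.total) else "false")


if __name__ == "__main__":
    main(sys.argv[1] if len(sys.argv) > 1 else "/")
```
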
As mentioned earlier, any operation related to `GLACIER` is handled in 2 phases. One phase is immediate and
is managed by the NSFS process itself, while the other phase needs to be invoked separately and
manages the actual movement of the file.
- PutObject is requested with storage class set to `GLACIER`.
- NooBaa rejects the request if NooBaa isn't configured to support the given storage class. This is not enabled
by default and needs to be enabled via `config-local.js` by setting `config.NSFS_GLACIER_ENABLED = true` and `config.NSFS_GLACIER_LOGS_ENABLED = true`.
- NooBaa will set the storage class to `GLACIER` by setting the `user.storage_class` extended attribute.
- NooBaa creates a persistent log and appends the filename to the log file.
- Completes the upload.
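
The immediate phase above can be sketched like this. It is a simplified model, not NooBaa code: extended attributes are represented by a plain dictionary, the log by any callable that accepts a file name, and `finish_glacier_upload` and `StorageClassNotEnabled` are hypothetical names. The two boolean parameters stand in for `config.NSFS_GLACIER_ENABLED` and `config.NSFS_GLACIER_LOGS_ENABLED`.

```python
from typing import Callable, Dict


class StorageClassNotEnabled(Exception):
    """Hypothetical error for a GLACIER request on an unconfigured system."""


def finish_glacier_upload(file_path: str,
                          xattrs: Dict[str, str],
                          append_to_log: Callable[[str], None],
                          glacier_enabled: bool,
                          logs_enabled: bool) -> None:
    # Reject the request unless both settings were turned on.
    if not (glacier_enabled and logs_enabled):
        raise StorageClassNotEnabled("GLACIER storage class is not enabled")
    # Record the storage class as metadata; the data itself stays on disk.
    xattrs["user.storage_class"] = "GLACIER"
    # Queue the file for the separately invoked migration phase.
    append_to_log(file_path)
```

Note that nothing here touches the tape: the file is only marked and queued, which is why the main process can later reason about it purely from its extended attributes.
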
Once the upload is complete, the file sits on the disk until the second process kicks in and actually performs the movement
of the file. The main NooBaa process does not concern itself with the actual file state and instead relies only on the
extended attributes to judge the state of the file. The implication is that NooBaa will refuse a file read operation
even if the file is on disk, unless the user explicitly issues a `RestoreObject` (it should be noted that this is what AWS
does as well).
- A scheduler (e.g. cron, human, script, etc.) issues `noobaa-cli glacier migrate --interval <val>`.
- The command will first acquire an "EXCLUSIVE" lock to ensure that only one tape management command is running at once.
- Once the process has the lock, it will start to iterate over the potentially currently inactive files.
- Before processing a log file, the process will get an "EXCLUSIVE" lock on the file, ensuring that it is indeed the only process processing the file.
- It will read the log one line at a time and will ensure the following:
  - The file still exists.
  - The file still has the `GLACIER` storage class. (This can change if the user uploads another object with the `STANDARD` storage class.)
  - The file doesn't have any of the `RestoreObject` extended attributes. This ensures that if the file was marked for restoration as soon as it was uploaded, we don't perform the migration at all. This avoids unnecessary work and also makes sure that we don't end up racing with ourselves.
- Once a file name passes all the above criteria, we add its name to a temporary log and hand over the file
name to the `migrate` script, which should be in the `config.NSFS_GLACIER_TAPECLOUD_BIN_DIR` directory. We expect that the script will take the file name as its first parameter and will perform the migration. If `config.NSFS_GLACIER_BACKEND` is set to `TAPECLOUD` (default), then we expect the script to output data in compliance with the `eeadm migrate` command.
- We delete the temporary log that we created.
- We delete the log created by the NSFS process iff there were no failures in `migrate`. In case of failures we skip the log deletion as a way to retry during the next trigger of the script. It should be noted that NooBaa's `migrate` (`TAPECLOUD` backend) invocation does not consider `DUPLICATE TASK` an error.
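
The per-line checks in this phase can be sketched as a filter over the log entries. This is an illustration under stated assumptions rather than NooBaa's implementation: the names in `RESTORE_XATTRS` are placeholders (the text above only says "`RestoreObject` extended attributes"), and `get_xattrs` is a hypothetical callable that returns a file's extended attributes as a dictionary. On Linux such a callable could be built on `os.listxattr` and `os.getxattr`.

```python
import os
from typing import Callable, Dict, Iterable, List

STORAGE_CLASS_XATTR = "user.storage_class"
# Placeholder names: the real restore-related attribute names are not given here.
RESTORE_XATTRS = ("user.example.restore.request", "user.example.restore.expiry")


def should_migrate(path: str,
                   get_xattrs: Callable[[str], Dict[str, str]]) -> bool:
    """Apply the three checks: exists, still GLACIER, not marked for restore."""
    if not os.path.exists(path):
        return False
    xattrs = get_xattrs(path)
    if xattrs.get(STORAGE_CLASS_XATTR) != "GLACIER":
        return False  # e.g. overwritten by a STANDARD upload
    return not any(name in xattrs for name in RESTORE_XATTRS)


def select_for_migration(log_lines: Iterable[str],
                         get_xattrs: Callable[[str], Dict[str, str]]) -> List[str]:
    """Return the entries that would be written to the temporary log."""
    entries = (line.strip() for line in log_lines)
    return [p for p in entries if p and should_migrate(p, get_xattrs)]
```

The resulting list corresponds to the temporary log whose name is handed to the `migrate` script; the original log is then deleted only if that script reports no failures.
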
As mentioned earlier, any operation related to `GLACIER` is handled in 2 phases. One phase is immediate and
is managed by the NSFS process itself, while the other phase needs to be invoked separately and
manages the actual movement of the file.
- RestoreObject is requested with a non-zero positive number of days.
- NooBaa rejects the request if NooBaa isn't configured to support the given storage class. This is not enabled
by default and needs to be enabled via `config-local.js` by setting `config.NSFS_GLACIER_ENABLED = true` and `config.NSFS_GLACIER_LOGS_ENABLED = true`.
- NooBaa performs a number of checks to ensure that the operation is valid (for example, that there is no restore request already in progress).
- NooBaa saves the filename to a persistent log.
- Returns the request with success indicating that the restore request has been accepted.
- A scheduler (e.g. cron, human, script, etc.) issues `noobaa-cli glacier restore --interval <val>`.
- The command will first acquire an "EXCLUSIVE" lock to ensure that only one tape management command is running at once.
- Once the process has the lock, it will start to iterate over the potentially currently inactive files.
- Before processing a log file, the process will get an "EXCLUSIVE" lock on the file, ensuring that it is indeed the only process processing the file.
- It will read the log one line at a time and will store the names of the files that we expect to fail during an eeadm restore
(this can happen, for example, because a `RestoreObject` was issued for a file but that file was deleted before we could actually process it).
- The log is handed over to the `recall` script, which should be present in the `config.NSFS_GLACIER_TAPECLOUD_BIN_DIR` directory. We expect that the script will take the file name as its first parameter and will perform the recall. If `config.NSFS_GLACIER_BACKEND` is set to `TAPECLOUD` (default), then we expect the script to output data in compliance with the `eeadm recall` command.
- If we get any unexpected failures, we mark the run as a failure and make sure we do not delete the log file (so as to retry later).
- We iterate over the log again to set the final extended attributes. This makes sure that we communicate the latest state to the NSFS processes.
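
One way to read the "expected failures" bookkeeping above is sketched below. It assumes, for illustration only, that the recall step reports which paths failed; how NooBaa actually derives failures from the script output is not described here, and `expected_failures` and `can_delete_log` are hypothetical names.

```python
import os
from typing import Callable, Iterable, Set


def expected_failures(log_lines: Iterable[str],
                      exists: Callable[[str], bool] = os.path.exists) -> Set[str]:
    """Collect entries whose files are already gone before recall runs."""
    entries = (line.strip() for line in log_lines)
    return {p for p in entries if p and not exists(p)}


def can_delete_log(failed_paths: Iterable[str], expected: Set[str]) -> bool:
    """The log may be deleted only if every failure was anticipated."""
    return set(failed_paths) <= expected
```

If `can_delete_log` returns `False`, the log is kept so that the next scheduled run retries the same entries.
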
This is very similar to Flow 1, with some additional checks.
If the source file is not in the `GLACIER` storage class, then the normal procedure kicks in.
If the source file is in the `GLACIER` storage class, then:
- NooBaa refuses the copy if the file is not already restored (similar to AWS behaviour).
- NooBaa accepts the copy if the file is already restored (similar to AWS behaviour).
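
The copy decision above reduces to a small predicate. In this sketch, "already restored" is modelled as a restore expiry timestamp that lies in the future; that representation, the `restore_expiry` parameter and the function name `can_copy_from` are assumptions made for the example, not something this document specifies.

```python
from datetime import datetime
from typing import Optional


def can_copy_from(storage_class: str,
                  restore_expiry: Optional[datetime],
                  now: datetime) -> bool:
    """Decide whether an object may be used as a copy source."""
    if storage_class != "GLACIER":
        return True  # normal procedure, no extra checks
    # Assumption: a restored object carries an expiry that has not passed yet.
    return restore_expiry is not None and restore_expiry > now
```
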