GEODE-3799: Move backups towards a pluggable architecture #1109
nreich merged 8 commits into apache:develop
Conversation
* This effort decouples the definition of files to be backed up
from the destination of the backup. It will allow for the
development of different, pluggable strategies for conducting
backups, such as compressed archive files of backups or backups
to cloud providers.
* A new BackupDestination interface is provided and the existing
logic for saving a backup to the filesystem has been moved into
an implementation of this interface, FileSystemBackupDestination.
During a backup, all files are copied to a temporary directory
and added to a definition of the backup. This definition is then
used to send the files to their ultimate destination. In the
current case of moving the files to a specified location on the
filesystem, this is implemented by moving the files from the
temporary location to the target backup directory.
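The flow described above can be sketched as a minimal Java outline. The type names (BackupDestination, BackupDefinition, FileSystemBackupDestination) and the backupFiles method come from the PR; the single-set definition and the method bodies are simplifications for illustration, not the actual implementation:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Set;

// The pluggable destination: receives a definition of staged files and
// transports them to wherever this destination stores backups.
interface BackupDestination {
  void backupFiles(BackupDefinition definition) throws IOException;
}

// Simplified definition: just the set of files staged in the temp directory.
class BackupDefinition {
  private final Set<Path> files;
  BackupDefinition(Set<Path> files) { this.files = files; }
  Set<Path> getFiles() { return files; }
}

// Filesystem implementation: moves the staged files from the temporary
// location into the target backup directory.
class FileSystemBackupDestination implements BackupDestination {
  private final Path targetDir;
  FileSystemBackupDestination(Path targetDir) { this.targetDir = targetDir; }

  @Override
  public void backupFiles(BackupDefinition definition) throws IOException {
    Files.createDirectories(targetDir);
    for (Path staged : definition.getFiles()) {
      Files.move(staged, targetDir.resolve(staged.getFileName()),
          StandardCopyOption.REPLACE_EXISTING);
    }
  }
}
```

Because the definition is built first and only then handed to a destination, an archive or cloud destination could implement backupFiles differently without touching the staging logic.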
> import org.apache.geode.cache.DiskStore;

> class BackupDefinition {
Since BackupDefinition is passed to BackupDestination, it seems like it would be better to pass an interface instead of a class. If someone implements their own BackupDestination, I think they would only want the BackupDefinition interface to have getter methods. They are not interested in creating a definition but simply in reading an existing definition passed to their implementation of the backupFiles method.
Why is RestoreScript part of the BackupDefinition? An implementor of backupFiles would seemingly only want to know the Path of the restore script. But it is unclear how the restore script will work for other destinations. Isn't all the meta information the restore script needs already encoded in the BackupDefinition itself (i.e. all the info on the different types of files)?
Why is the definition so complex, with all the different types of files? For the purposes of BackupDestination.backupFiles, isn't a list of Path instances to back up all it needs?
It is also unclear to me why this class uses Path instead of File for the files that are being backed up. Given either one, you can create the other. Path is an interface and File is a class, so I would vote for Path. Just wondering if other reasons exist for using it.
Good point on using an interface that only exposes the getters.
The RestoreScript is part of the BackupDefinition to remove its logic from the BackupManager. Ultimately, we want to no longer have the script and instead have a manifest file that details the files and checks that need to be done to restore the backup (and do so in gemfire, not in a command line script), but that is a future feature.
The different file types are because we store those files in different directories in the backup, to prevent name collisions among other things. It is true that we could probably just have a list of files and have the methods that handle each type of file modify the path in the correct way. I think that the logic of where in the backup the file should go does belong here instead of in BackupManager.
I use Paths, generally, because they are now the preferred method of handling file IO (see https://docs.oracle.com/javase/tutorial/essential/io/legacy.html#), but also because the Files utility class takes only Paths (it was added at the same time as Path) and I generally prefer using it (though there are still a few things that are easier to do with Apache Commons).
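The getter-only interface suggested above could look like the following hypothetical sketch. The interface and class names, and the specific file-category getters, are assumptions based on the discussion of the different file types, not the actual API:

```java
import java.nio.file.Path;
import java.util.Collections;
import java.util.Set;

// Read-only view: a BackupDestination implementor receives only getters and
// cannot mutate the definition it is given.
interface ReadOnlyBackupDefinition {
  Set<Path> getOplogFiles();
  Set<Path> getConfigFiles();
  Set<Path> getUserFiles();
}

// One possible immutable implementation behind that view.
final class ImmutableBackupDefinition implements ReadOnlyBackupDefinition {
  private final Set<Path> oplogs;
  private final Set<Path> configs;
  private final Set<Path> userFiles;

  ImmutableBackupDefinition(Set<Path> oplogs, Set<Path> configs, Set<Path> userFiles) {
    this.oplogs = Collections.unmodifiableSet(oplogs);
    this.configs = Collections.unmodifiableSet(configs);
    this.userFiles = Collections.unmodifiableSet(userFiles);
  }

  public Set<Path> getOplogFiles() { return oplogs; }
  public Set<Path> getConfigFiles() { return configs; }
  public Set<Path> getUserFiles() { return userFiles; }
}
```

The builder side of the code would use the concrete class; backupFiles would be declared against the interface only.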
But what would an implementor of BackupDestination do with the RestoreScript? Are they responsible for copying it? It seems like it might fit better on FileSystemBackupDestination.
Since we do not yet have a mechanism for getting backups returned to a system or restored to the correct locations, we still need that restore script. Once we have a way to restore the files without the script, we should be able to remove it in its entirety.
> import java.io.IOException;

> public interface BackupDestination {
>   String USER_FILES_DIRECTORY = "user";
Why does the BackupDestination interface have all these String constants? They seem like implementation details that someone implementing this interface would not care about.
If we get the BackupDefinition to contain only a list of files, then these would not be needed here (and could be moved into that class). These would be used to set the correct relative paths. We should consider that.
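The idea of moving the constants alongside the definition to set relative paths could be sketched as follows. USER_FILES_DIRECTORY ("user") appears in the diff; the "config" constant and the helper class and method names are illustrative assumptions:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch: if BackupDefinition held only a list of files, directory-name
// constants like these could live with it and fix each file's relative
// location inside the backup, preventing name collisions across categories.
class BackupLayout {
  static final String USER_FILES_DIRECTORY = "user";
  static final String CONFIG_FILES_DIRECTORY = "config"; // assumed sibling constant

  // Place a file under its category directory within the backup.
  static Path relativePathFor(String categoryDirectory, Path file) {
    return Paths.get(categoryDirectory).resolve(file.getFileName());
  }
}
```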
>   return new HashSet<>();
> }
> HashSet<PersistentID> persistentIds = new HashSet<>();
> tempDirectory = Files.createTempDirectory("backup_" + System.currentTimeMillis());
My understanding is that each member backing up its disk stores now copies all the files to this tempDirectory. This seems like a big change that could cause customers to need much more disk space during a backup. Even if we use hard links to move the files to this directory, hard links are not always available, and it will then need to copy the files twice (once to the temp dir and once to the destination).
Why is a tempDirectory needed? Can we do a backup without it?
The temporary directory is required because we no longer back up the Oplog files to their final destination while holding locks that prevent Oplog deletion. It also results in the user files, config files, and deployed jars being saved in the backup in the state they were in closest to the backup, instead of being copied eventually (for example, in a future backup to cloud storage, there would otherwise be a delay between the backup being "done" and it being transported to the storage provider).
This method does not inherently use more space during the backup than we did before (recently) adding hard-link support, since we previously copied the oplog files to where the backup was conducted. Now, if the backup location is remote from the local filesystem and the temporary location is on a different filesystem than the disk stores (preventing hard links), it will use more space on the local filesystem for the duration of the backup.
This temporary directory change seems like a big change to how backup is done. If we could make sure that, when the oplogs are on multiple file systems, we copy them to a temp directory on the same file system, then I'd accept this change.
One option is to temporarily make a directory within each disk store's directory to hold that disk store's files during the backup. That should reduce the likelihood that oplogs would need to be moved between filesystems.
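The staging approach discussed above can be sketched as follows: create the temporary directory next to the disk store so a hard link (which consumes no extra disk space) is possible, and fall back to a plain copy when links are unsupported. The class, method, and directory names are illustrative, not from the PR:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class OplogStaging {
  // Stage one oplog into a temp directory inside the disk store's own
  // directory, so staging stays on the same filesystem as the source.
  static Path stage(Path oplog, Path diskStoreDir) throws IOException {
    Path tempDir = Files.createDirectories(diskStoreDir.resolve(".backup_temp"));
    Path staged = tempDir.resolve(oplog.getFileName());
    try {
      Files.createLink(staged, oplog); // same filesystem: no second copy of the data
    } catch (UnsupportedOperationException | IOException linkFailed) {
      // Links unavailable (e.g. filesystem does not support them): copy instead.
      Files.copy(oplog, staged, StandardCopyOption.REPLACE_EXISTING);
    }
    return staged;
  }
}
```

The staged path can then go into the BackupDefinition while the original oplog remains in place.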
> import org.apache.geode.internal.i18n.LocalizedStrings;

> public class FileSystemBackupDestination implements BackupDestination {
>   static final String INCOMPLETE_BACKUP_FILE = "INCOMPLETE_BACKUP_FILE";
This constant is already defined on BackupManager. I think it should only be defined in one place.
It seems wrong for BackupManager to be looking for this special file. Doesn't it need to be able to ask the BackupDestination whether a previous backup exists? I think this magic file is used to determine whether we can do an incremental backup. But since it is a file created when doing a backup, with the new BackupDestination interface BackupManager should no longer assume it will be on the file system.
That would be ideal, but I have not yet handled incremental backups in a generic way: they still need to use a baseline backup that is accessible from the filesystem.
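The suggestion of pushing this check behind the destination could look like the following filesystem-only sketch: a previous backup is usable as an incremental baseline only if it finished, i.e. its INCOMPLETE_BACKUP_FILE marker was removed on completion. The class and method names are assumptions, not part of the PR:

```java
import java.nio.file.Files;
import java.nio.file.Path;

class FileSystemBackupInspector {
  // Marker written at the start of a backup and deleted when it completes.
  static final String INCOMPLETE_BACKUP_FILE = "INCOMPLETE_BACKUP_FILE";

  // BackupManager would ask the destination this question instead of
  // probing the filesystem itself.
  static boolean previousBackupCompleted(Path backupDir) {
    return Files.isDirectory(backupDir)
        && !Files.exists(backupDir.resolve(INCOMPLETE_BACKUP_FILE));
  }
}
```

A cloud or archive destination could answer the same question from its own metadata, with no marker file involved.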
> cache.close();
> - cache = new CacheFactory(null).create();
> + cache = new CacheFactory().create();
Why is it okay to not set MCAST_PORT to "0" here?
I think it would be better to have a private method that creates the cache, and call that method both in setup and here.
Good point: I will make that change.
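The suggested helper might look like this fragment. It depends on Geode test classes and is not standalone; MCAST_PORT is the property-name constant from org.apache.geode.distributed.ConfigurationProperties:

```java
// Shared by setUp and the recreate-after-close step, so MCAST_PORT stays "0"
// (multicast disabled) in both places.
private Cache createCache() {
  Properties props = new Properties();
  props.setProperty(MCAST_PORT, "0");
  return new CacheFactory(props).create();
}
```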
Revert "GEODE-3799: Move backups towards a pluggable architecture (apache#1109)". This reverts commit 6c49506.