GEODE-3799: Move backups towards a pluggable architecture #1109

Merged
nreich merged 8 commits into apache:develop from nreich:feature/GEODE-3799
Dec 14, 2017

Conversation


@nreich nreich commented Nov 30, 2017

  • This effort decouples the definition of files to be backed up
    from the destination of the backup. It will allow for the
    development of different, pluggable strategies for conducting
    backups, such as compressed archive files of backups or backup
    to cloud providers.

  • A new BackupDestination interface is provided and the existing
    logic for saving a backup to the filesystem has been moved into
    an implementation of this interface, FileSystemBackupDestination.
    During a backup, all files are copied to a temporary directory
    and added to a definition of the backup. This definition is then
    used to send the files to their ultimate destination. In the
    current case of moving the files to a specified location on the
    filesystem, this is implemented by moving the files from the
    temporary location to the target backup directory.
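
The flow described above can be sketched roughly as follows. The names follow the PR description (BackupDefinition, BackupDestination, FileSystemBackupDestination), but the signatures here are assumptions for illustration, not Geode's actual API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashSet;
import java.util.Set;

// Rough sketch of the decoupled backup flow described above.
interface BackupDestination {
  // Send the files recorded in the definition to their ultimate destination.
  void backupFiles(BackupDefinition definition) throws IOException;
}

class BackupDefinition {
  private final Set<Path> files = new HashSet<>();

  void addFile(Path file) { files.add(file); }

  Set<Path> getFiles() { return files; }
}

// Filesystem implementation: move staged files from the temporary
// location into the target backup directory.
class FileSystemBackupDestination implements BackupDestination {
  private final Path targetDir;

  FileSystemBackupDestination(Path targetDir) { this.targetDir = targetDir; }

  @Override
  public void backupFiles(BackupDefinition definition) throws IOException {
    Files.createDirectories(targetDir);
    for (Path file : definition.getFiles()) {
      Files.move(file, targetDir.resolve(file.getFileName()));
    }
  }
}

public class BackupSketch {
  public static void main(String[] args) throws IOException {
    // Stage a file in a temporary directory, as the backup would.
    Path staging = Files.createTempDirectory("backup_staging");
    Path staged = Files.writeString(staging.resolve("oplog.crf"), "data");

    BackupDefinition definition = new BackupDefinition();
    definition.addFile(staged);

    // Moving files to their final location is delegated to the destination.
    Path target = Files.createTempDirectory("backup_target");
    new FileSystemBackupDestination(target).backupFiles(definition);
    System.out.println(Files.exists(target.resolve("oplog.crf")));
  }
}
```

A different destination (for example, one writing a compressed archive or uploading to cloud storage) would only need another implementation of backupFiles.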


import org.apache.geode.cache.DiskStore;

class BackupDefinition {
Contributor

Since BackupDefinition is passed to BackupDestination, it seems like it would be better to pass an interface instead of a class. If someone implements their own BackupDestination, I think they would only want the BackupDefinition interface to have getter methods. They are not interested in creating a definition, only in reading an existing definition passed to their implementation of the backupFiles method.

Why is RestoreScript part of the BackupDefinition? An implementor of backupFiles would, at most, want to know the Path of the restore script, and it is unclear how the restore script would work for other destinations. Isn't all the meta information the restore script needs already encoded in the BackupDefinition itself (i.e., all the info on the different types of files)?

Why is the definition so complex, with all the different types of files? For the purposes of BackupDestination.backupFiles, isn't a list of Path instances to back up all it needs?

It is also unclear to me why this class uses "Path" instead of "File" for the files that are being backed up. Given either one, you can create the other. Path is an interface and File is a class, so I would vote for Path; just wondering if other reasons exist for using it.

Author

Good point on using an interface that only exposes the getters.

The RestoreScript is part of the BackupDefinition to remove its logic from the BackupManager. Ultimately, we want to no longer have the script and instead have a manifest file that details the files and checks needed to restore the backup (and do so in GemFire, not in a command line script), but that is a future feature.

The different file types are because we store those files in different directories in the backup, to prevent name collisions among other things. It is true that we could probably just have a list of files and have the methods that handle each type of file modify the path in the correct way. I think that the logic of where in the backup the file should go does belong here instead of in BackupManager.

I use Paths, generally, because they are now the preferred way of handling file I/O (see https://docs.oracle.com/javase/tutorial/essential/io/legacy.html), and because the Files utility class (added at the same time as Path) takes Paths only. I generally prefer using it, though a few things are still easier with Apache Commons.
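
The Path/File interop mentioned here is a one-liner in each direction; a small demo (the path used is hypothetical):

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

// Either type can be derived from the other, but the java.nio.file.Files
// utility class accepts only Path, which favors using Path throughout.
public class PathDemo {
  public static void main(String[] args) {
    Path path = Paths.get("backup", "user", "data.txt");
    File file = path.toFile();   // Path -> File
    Path back = file.toPath();   // File -> Path, round-trips to an equal Path
    System.out.println(path.equals(back));
  }
}
```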

Contributor

But what would an implementor of BackupDestination do with the RestoreScript? Are they responsible for copying it? It seems like it might fit better on FileSystemBackupDestination

Author

Since we do not yet have a mechanism for getting backups returned to a system or restored to the correct locations, we still need that restore script. Once we have a way to restore the files without the script, we should be able to remove it in its entirety.

import java.io.IOException;

public interface BackupDestination {
String USER_FILES_DIRECTORY = "user";
Contributor

Why does the BackupDestination interface have all these String constants? They seem like an implementation detail that someone implementing this interface would not care about.

Author

If we get the BackupDefinition to contain only a list of files, then these would not be needed here (and could be moved into that class). These would be used to set the correct relative paths. We should consider that.
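
As an illustration of how such a constant could set a file's relative path inside the backup, here is a minimal sketch; the "user" value comes from the snippet above, while the file name is hypothetical:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Each file type is placed under its own subdirectory of the backup,
// which avoids name collisions between file types.
public class BackupLayout {
  static final String USER_FILES_DIRECTORY = "user";

  public static void main(String[] args) {
    Path backupRoot = Paths.get("backup");
    Path userFile = Paths.get("report.csv"); // hypothetical user file
    Path dest = backupRoot.resolve(USER_FILES_DIRECTORY)
        .resolve(userFile.getFileName());
    System.out.println(dest.endsWith(Paths.get("user", "report.csv")));
  }
}
```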

return new HashSet<>();
}
HashSet<PersistentID> persistentIds = new HashSet<>();
tempDirectory = Files.createTempDirectory("backup_" + System.currentTimeMillis());
Contributor

My understanding is that each member backing up its disk stores now copies all the files to this "tempDirectory". This seems like a big change that could cause customers to need much more disk space during a backup. Even if we use hard links to move the files to this directory, hard links are not always available, and in that case each file will need to be copied twice (once to the temp dir and once to the destination).

Why is a tempDirectory needed? Can we do a backup without it?

Author

The temporary directory is required because we no longer back up the Oplog files to their final destination while holding locks that prevent Oplog deletion. It also results in the user files, config files, and deployed jars being captured in the state they are in closest to the time of the backup, instead of being copied later (for example, in a future backup to cloud storage, where there would be a delay between the backup being "done" and its transport to the storage provider).

This method does not inherently use more space during the backup than we did before the recent addition of hard-link support, since back then we copied the oplog files to wherever the backup was conducted. Now, if the backup location is remote from the local filesystem and the temporary location is on a different filesystem than the disk stores (preventing hard links), it will use more space on the local filesystem for the duration of the backup.

Contributor

This temporary directory change seems like a big change to how backup is done. If we could make sure that, when the oplogs are on multiple file systems, we copy them to a temp directory on the same file system, then I'd accept this change.

Author

One option is to temporarily create a directory within each disk store's own directory to hold that disk store's files during the backup. That should make it unlikely that oplogs would need to be moved between filesystems.
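
That suggestion could be sketched as below: stage each file in a temp directory created inside the disk store's directory, so staging stays on the same filesystem and hard links remain possible, with a plain copy as the fallback. The helper and file names are hypothetical, not Geode code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: stage a backup file next to its disk store so a hard link
// (which uses no extra disk space) is possible; copy if linking fails.
public class StagingSketch {
  static Path stage(Path oplog) throws IOException {
    // Temp directory inside the disk store's directory => same filesystem.
    Path staging = Files.createTempDirectory(oplog.getParent(), "backup_staging_");
    Path staged = staging.resolve(oplog.getFileName());
    try {
      Files.createLink(staged, oplog);   // hard link: no second copy on disk
    } catch (UnsupportedOperationException | IOException e) {
      Files.copy(oplog, staged);         // fallback when links are unavailable
    }
    return staged;
  }

  public static void main(String[] args) throws IOException {
    Path diskStoreDir = Files.createTempDirectory("diskstore");
    Path oplog = Files.writeString(diskStoreDir.resolve("store.crf"), "oplog-data");
    Path staged = stage(oplog);
    // Staged copy exists and its staging dir sits inside the disk store dir.
    System.out.println(Files.exists(staged)
        && staged.getParent().getParent().equals(diskStoreDir));
  }
}
```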

import org.apache.geode.internal.i18n.LocalizedStrings;

public class FileSystemBackupDestination implements BackupDestination {
static final String INCOMPLETE_BACKUP_FILE = "INCOMPLETE_BACKUP_FILE";
Contributor

This constant is already defined on BackupManager. I think it should only be defined in one place.
It seems wrong for BackupManager to be looking for this special file. Shouldn't it instead ask the BackupDestination whether a previous backup exists? I think this magic file is used to determine whether we can do an incremental backup, but since it is a file created when doing a backup, with the new BackupDestination interface BackupManager should no longer assume it will be on the file system.

Author

That would be ideal, but I have not yet handled incremental backups in a generic way: they still require a base backup that is accessible from the filesystem.


cache.close();
cache = new CacheFactory(null).create();
cache = new CacheFactory().create();
Contributor

Why is it okay to not set MCAST_PORT to "0" here?
I think it would be better if you just had a private method that will create the cache and you could call that method in both setup and here.

Author

Good point: I will make that change.

@nreich nreich merged commit 6c49506 into apache:develop Dec 14, 2017
@nreich nreich deleted the feature/GEODE-3799 branch December 14, 2017 22:51
PivotalSarge pushed a commit to upthewaterspout/geode that referenced this pull request Dec 15, 2017
kirklund added a commit to kirklund/geode that referenced this pull request Dec 18, 2017