New backup meta schema, with file temperatures #9660
Summary: The primary goal of this change is to add support for backing up and restoring (applying on restore) file temperature metadata, without committing to either the DB manifest or the FS reported temperatures being exclusive "source of truth".

To achieve this goal, we need to add temperature information to backup metadata, which requires an updated backup meta schema. Fortunately I prepared for this in facebook#8069, which began forward compatibility in version 6.19.0 for this kind of schema update. (Previously, backup meta schema was not extensible! Making this schema update public will allow some other "nice to have" features like taking backups with hard links, and avoiding crc32c checksum computation when another checksum is already available.)

While schema version 2 is newly public, the default schema version is still 1. Until we change the default, users will need to set the schema version to 2 to enable features like temperature data backup+restore. New metadata like temperature information will be ignored with a warning in versions from 6.19.0 up to this change. The metadata is considered ignorable because a functioning DB can be restored without it.

Some detail:
* Some renaming because "future schema" is now just public schema 2.
* Initialize some atomics in TestFs (linter reported)

TODO?: consolidate terminology around either "actual" or "current" rather than mixing the two.

Test Plan: the related unit test is substantially updated for the new functionality, including some shared testing support for tracking temperatures in a FS. Some other tests and testing hooks into production code are also updated for making the backup meta schema change public.
Requesting that @ajkr review publishing backup schema_version=2 and that @jay-zhuang look at the temperature-related functionality and testing.
@@ -1424,14 +1442,20 @@ IOStatus BackupEngineImpl::CreateNewBackupWithMetadata(
      item.result.wait();
      auto result = item.result.get();
      item_io_status = result.io_status;
      Temperature temp = result.expected_src_temperature;
      if (result.current_src_temperature != Temperature::kUnknown &&
          (temp == Temperature::kUnknown ||
Does this mean that if the DB runs in non-tiered storage mode, we will still always check the temperature from the file system?
It looks like the filesystem is always queried, regardless of the storage mode.
The FileSystem object is always queried, which for non-custom FS is a no-op. Therefore, putting this behind an option would just create one more way to do the wrong thing with probably no tangible benefit vs. just always trying to do the right thing. Querying the Temperature of an open file should be an in-memory operation, because the custom FileSystem should know what tier it is talking to. And if you really want to hide Temperature info or avoid querying, you can do that with a FileSystem wrapper that always returns Temperature::kUnknown.
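The wrapper idea in the comment above could look roughly like the following. This is only a self-contained sketch: `Temperature` here is a local stand-in for the RocksDB enum, and `FileLike`, `ColdTierFile`, and `TemperatureHidingFile` are hypothetical names invented for illustration, not RocksDB classes. A real wrapper would be written against the actual FileSystem and file interfaces, which have many more methods.

```cpp
#include <memory>
#include <utility>

// Local stand-in for the RocksDB Temperature enum (assumption for this sketch).
enum class Temperature { kUnknown, kHot, kWarm, kCold };

// Hypothetical minimal file interface, reduced to the one call discussed here.
class FileLike {
 public:
  virtual ~FileLike() = default;
  virtual Temperature GetTemperature() const = 0;
};

// Hypothetical file from a custom FS that knows which tier it lives on.
class ColdTierFile : public FileLike {
 public:
  Temperature GetTemperature() const override { return Temperature::kCold; }
};

// Wrapper that owns another file and hides its temperature from callers.
class TemperatureHidingFile : public FileLike {
 public:
  explicit TemperatureHidingFile(std::unique_ptr<FileLike> target)
      : target_(std::move(target)) {}

  // Always report kUnknown; the wrapped file is never asked.
  Temperature GetTemperature() const override { return Temperature::kUnknown; }

 private:
  std::unique_ptr<FileLike> target_;
};
```

With a wrapper like this in place, a caller that prefers any known temperature over kUnknown would fall back to whatever the manifest says.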
LGTM if Siying is satisfied with the answer to his question
  // option is false (default), (b) overrides (a) if both are not UNKNOWN.
  // When true, (a) overrides (b) if both are not UNKNOWN. Regardless of this
  // setting, a known temperature overrides UNKNOWN.
  bool current_temperatures_override_manifest = false;
To me it feels more natural to restore the DB files to the pre-backup storage state, i.e., according to GetTemperature(). It'd be a behavior change though.
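For readers following the override rules, here is a small self-contained sketch of how the two temperature sources might be reconciled. It is modeled on the diff hunk and the option name, not copied from the patch: `Temperature` is a local stand-in enum, `ResolveTemperature` is a hypothetical helper, and the second half of the condition (which is cut off in the hunk shown in this thread) is an assumption inferred from the name `current_temperatures_override_manifest`.

```cpp
#include <cassert>

// Local stand-in for the RocksDB Temperature enum (assumption for this sketch).
enum class Temperature { kUnknown, kHot, kWarm, kCold };

// Hypothetical helper: pick the temperature to record for a backed-up file.
//   expected: temperature recorded in the DB manifest
//   current:  temperature reported by the FileSystem for the open file
//   current_overrides_manifest: assumed to mirror the option
//     current_temperatures_override_manifest
Temperature ResolveTemperature(Temperature expected, Temperature current,
                               bool current_overrides_manifest) {
  Temperature temp = expected;
  // A known current temperature wins if the manifest value is unknown,
  // or if the caller asked for current temperatures to take precedence.
  if (current != Temperature::kUnknown &&
      (temp == Temperature::kUnknown || current_overrides_manifest)) {
    temp = current;
  }
  return temp;
}
```

Under these assumptions, with the default setting (false) the manifest value wins whenever both are known, and in either setting a known value wins over kUnknown.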