Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MDB-21949 Use tmp folder for restore/backup access control entities #155

Merged
merged 6 commits into from
May 30, 2024

Conversation

MikhailBurdukov
Copy link
Contributor

No description provided.

@MikhailBurdukov MikhailBurdukov force-pushed the MDB-21949-Use_tmp_dir_for_access_backup_restore branch from b7e207c to 5bf74a7 Compare May 24, 2024 13:08
@@ -45,14 +45,7 @@ Feature: Backup and restore functionality of replicated access control entities
Then we got the following backups on clickhouse01
| num | state | data_count | link_count |
| 0 | created | 1 | 0 |
When we execute command on clickhouse01
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed anymore, because we don't copy entities from the ZK to /var/lib/clickhouse/access

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, if we don't want to clean up access/tmp folders - should we change this step for checking tmp dir instead?

@MikhailBurdukov
Copy link
Contributor Author

Fix for tests : #153

"""
Interface for chown
"""
shutil.chown(path, user, group)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need to wrap this function with exact the same interface under a custom name then? originally it was done only for using our context

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like unused function and can be deleted?


def __exit__(self, exc_type, exc_value, traceback):
if exc_type is not None:
logging.warning("Dont remove tmp dir {} due to exception.", self._path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, do we want to log this exception?

self._backup_replicated(
backup_tmp_path, access_control.acl_ids, context
)
self._backup_local(clickhouse_access_path, backup_tmp_path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, am I right that now _backup_replicated() just copies data from ZK to tmp dir, while _backup_local() copies local files from access path to tmp and that's it?

And so:

  • _backup_local will always replace (in case of replicated storage) actual data from ZK with local version (because currently we do not clean them up)
  • we still will have a local (and soon "outdated") version of ACL in access folder anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_backup_local will always replace (in case of replicated storage) actual data from ZK with local version (because currently we do not clean them up)

Yeah, seem we should do not replace it, if we already have the same one, but replicated.

we still will have a local (and soon "outdated") version of ACL in access folder anyway?

Yes, we will have it. I think make sense to cleanup /var/lib/clickhouse/access in restore operation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we will have it. I think make sense to cleanup /var/lib/clickhouse/access in restore operation.

because I thought that originally we could have potential issue with local copies after each backup - in case of troubles with ZK we can't use replicated storage - so we can have a conflict with local sql files

so maybe we should clean it up after backup operation as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so maybe we should clean it up after backup operation as well?

Im not sure because suppose:

  1. we have 2 acl entries: 1 - old one without zk, 2 - with zk
  2. perform backup -> removing from /var/lib/clickhouse/acces.
  3. So after ch-server restart we will lose 1 entry.

It seems fine to do cleanup on restore, but we will lose non-replicated records when we do a backup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have 2 acl entries: 1 - old one without zk, 2 - with zk

is it already possible to have 2 entities in different storages (local and replicated)? because I've tested it initially (22.x), and found that you can use only one storage type for that (if already no - maybe it makes sense)

and if 1 possible storage is still relevant - it's better to clean it up after a backup as well (think about your current state on the clusters - they all have local files), or move/remove them during migration from replicated to local

otherwise I think current approach works only for new clusters and with new ch-backup version

ch_backup/backup/layout.py Outdated Show resolved Hide resolved
user = context.ch_ctl_conf["user"]
group = context.ch_ctl_conf["group"]

create_dir(clickhouse_access_path, user, group)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Actually, I don't see a necessity of creating separate function for this (creating folder and chown it). Two separate commands are relatively expressive.
  2. clickhouse_access_path does not exist here ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clickhouse_access_path does not exist here ?

Yes, clickhouse lazily creates /var/lib/clickhouse/access

group = context.ch_ctl_conf["group"]

create_dir(clickhouse_access_path, user, group)
with temporary_directory(backup_tmp_path, user, group):
Copy link
Contributor

@aalexfvk aalexfvk May 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Do we need to invent own class instead of the TemporaryDirectory
  2. Does this class produce leftovers in case of exception and is it expected

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For TemporaryDirectory there are no option to keep folder in case of exceptions.
https://github.com/python/cpython/blob/59630f92d8223f80993e3646b0f734d27f4b8dd4/Lib/tempfile.py#L944-L966

Copy link
Contributor

@aalexfvk aalexfvk May 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we want to leave data in a temporary directory in case of an exception or a crash ? It might be corrupted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because it might be useful in trouble shooting. If we want to re-run operation(backup/restore), then old files will be replaced with a new one. Also it make sense to cleanup tmp folder before operation, if it exist.

Copy link
Contributor

@aalexfvk aalexfvk May 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, nevertheless this class is too specific for a general class(to be placed in ch_backup/util.py). Someone might want to accidentally reuse it for their own purposes
I see the following options:

  1. Write a comment/docstring that this class is intended for using for access control backup procedure and write this motivation about keeping data in case of an exception
  2. Rename it to smth like access_control_temp_directory.
  3. Create a fully functional replacement for tempfile.TemporaryDirectory with keep parameter.
    Simplified example:
@contextmanager
def temporary_directory(keep=False):
    if keep:
        tmpdir = mkdtemp()
        yield tmpdir
    else:
        with TemporaryDirectory() as tmpdir:
            yield tmpdir

@yandex yandex deleted a comment from MikhailBurdukov May 27, 2024
Comment on lines 92 to 103
if (
context.backup_meta.access_control.backup_format
== BackupStorageFormat.TAR
):
context.backup_layout.download_access_control(
restore_tmp_path, context.backup_meta.name
)
else:
for name in _get_access_control_files(acl_ids):
context.backup_layout.download_access_control_file(
restore_tmp_path, context.backup_meta.name, name
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's extract this to a separate function _download_access_control

Comment on lines 83 to 94
clickhouse_access_path = context.ch_ctl_conf["access_control_path"]
restore_tmp_path = os.path.join(
context.ch_ctl_conf["tmp_path"], context.backup_meta.name
)
user = context.ch_ctl_conf["user"]
group = context.ch_ctl_conf["group"]

if os.path.exists(clickhouse_access_path):
shutil.rmtree(clickhouse_access_path)

os.makedirs(clickhouse_access_path)
shutil.chown(clickhouse_access_path, user, group)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense to extract this lines (except related tmp path) to a separate function with clean parameter.

@MikhailBurdukov MikhailBurdukov force-pushed the MDB-21949-Use_tmp_dir_for_access_backup_restore branch from f99c3d3 to 478c79f Compare May 29, 2024 14:50
@MikhailBurdukov
Copy link
Contributor Author

@ekopanev any comments?

Copy link
Contributor

@ekopanev ekopanev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the rest LGTM

"""
Interface for chown
"""
shutil.chown(path, user, group)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems like unused function and can be deleted?

@MikhailBurdukov MikhailBurdukov merged commit 63e5fbb into main May 30, 2024
14 checks passed
@MikhailBurdukov MikhailBurdukov deleted the MDB-21949-Use_tmp_dir_for_access_backup_restore branch May 30, 2024 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants