Snh/passive backup #340
Closed
Quiet audit-log restores if there are no indices
Routes for the gist repositories being restored are calculated remotely, so we only need to rsync once per available storage server. This needs some server-side support from GitHub Enterprise 2.6.3 (unreleased). The change is backwards compatible and only affects cluster restores. backup-utils will use the old, slower script when restoring to older GitHub Enterprise versions.
Cluster: faster gist restores
Actually required when using `--files-from`. This reverts commit b110c66.
Bump version: 2.6.1
Simple benchmarking that logs the time it takes to restore data. Looks something like:

    $ cat data/26-snap/current/benchmarks/benchmark.1464253606.log
    ghe-import-mysql took: 7s
    ghe-import-redis took: 43s
    ghe-restore-repositories-dgit took: 11s
    ghe-restore-alambic-cluster-ng took: 6s
    ghe-restore-git-hooks-cluster took: 2s
    ghe-restore-es-audit-log took: 0s

The log is timestamped and stored in the benchmarks directory, along with the data that was restored.
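As a rough illustration of the logging described above, here is a minimal sketch of a timing wrapper that appends `<name> took: <N>s` lines to a timestamped log. The names `bm_run`, `BENCHMARK_DIR` and `BENCHMARK_FILE` are assumptions made for this example and are not taken from backup-utils; the actual implementation may differ.

```shell
# Hypothetical helper: time a command and append "<name> took: <N>s" to a log.
# bm_run, BENCHMARK_DIR and BENCHMARK_FILE are illustrative names only.
BENCHMARK_DIR="${BENCHMARK_DIR:-$(mktemp -d)}"
BENCHMARK_FILE="$BENCHMARK_DIR/benchmark.$(date +%s).log"

bm_run() {
  local name="$1"
  shift
  local start end status
  start=$(date +%s)
  "$@"                      # run the wrapped command with its arguments
  status=$?
  end=$(date +%s)
  echo "$name took: $((end - start))s" >> "$BENCHMARK_FILE"
  return "$status"          # preserve the wrapped command's exit status
}

# Example: time a one-second step, then show the log.
bm_run "sleep-step" sleep 1
cat "$BENCHMARK_FILE"
```

Wrapping each restore step this way keeps the timing code in one place and leaves the exit status of the step unchanged.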
Benchmarking restores
The output is sent to fd#3, so only available when using the verbose flag (-v). This is consistent with the other scripts using rsync and also helpful for support/troubleshooting.
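The fd 3 convention mentioned above can be sketched as follows. This is only an illustration: the function names are hypothetical, and pointing fd 3 at stdout in verbose mode is an assumption made for the example, not something the commit message states.

```shell
# Hypothetical helpers; names are illustrative, not from backup-utils.
# fd 3 goes to stdout when verbose (an assumption), to /dev/null otherwise.
setup_verbose() {
  if [ -n "$1" ]; then
    exec 3>&1
  else
    exec 3>/dev/null
  fi
}

# Write a message to fd 3; it is only visible in verbose mode.
log_verbose() {
  echo "$@" 1>&3
}

setup_verbose "${VERBOSE:-}"
log_verbose "rsync details would be written here"
echo "restore finished"
```

A script using this pattern can redirect a command's output with `1>&3` so that the detail only shows up when the verbose flag is set.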
Verbose rsync when restoring gist repositories
restore host keys for cluster environment as well
Revert "restore host keys for cluster environment as well"
Restores Git over SSH (babeld) keys and distributes them to all the nodes in the cluster. Initially fixed by #228 and later reverted in #229, this new patch prevents the SSH host keys from being replaced, which would break `ghe-restore` in the process (new SSH connections after the host key restore would fail). Fixes https://github.com/github/backup-utils/226
Clusters: Restore Git over SSH host keys
We now use a tarball-based backup/restore approach, fixing problems with permissions when backing up and restoring user-provided environments.

From @snh: If files or directories within a hook environment are lacking the user read and/or write bits, then a number of issues arise:

* The hook environment cannot be backed up by the backup utilities, which causes a backup failure, because the backup utilities need the user read bit to access all files and directories as the git user.
* Older backup snapshots cannot be pruned, because the backup utilities need the user write bit to remove all files and directories as the account that is running the backup utilities.

@dbussink, @WillAbides and @snh: thanks for the new implementation and feedback.
We now rsync once per cluster node available instead of rsyncing each Git repository individually.

Some simple benchmarks restoring a snapshot with 1183 repositories (13 GiB):

* Using backup-utils 2.6.1
```
real 20m34.923s
user 5m1.888s
sys 2m39.983s
```
* Using the new implementation
```
real 9m0.368s
user 2m46.912s
sys 1m18.746s
```

The old implementation is able to restore ~1 repo/s, so restoring backup snapshots with a large number of repositories and a fast network will benefit the most from this. Here's the time it takes to restore 8K repositories (~800 MiB), all of them very similar in size (100K, with a single README file added):

* Using backup-utils 2.6.1
```
real 111m45.370s
user 7m54.829s
sys 6m38.247s
```
* Using the new implementation
```
real 6m20.087s
user 0m16.509s
sys 1m42.616s
```

In clusters with more than 3 Git server nodes, backup-utils 2.6.1 also restores the repositories to all the Git server nodes available. Only three copies of a Git repository are necessary, so this patch also fixes that, speeding things up and optimizing disk usage.

/cc @github/backup-utils
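The "one rsync per node" idea can be sketched roughly as below: group the repository routes by destination node into one file list each, then hand every list to a single rsync via `--files-from`. The routes format, node names and `/data/repositories/` paths are assumptions invented for this example, and the rsync command is only printed, not executed.

```shell
# Illustrative only: the routes format, node names and paths are hypothetical.
routes=$(mktemp)
listdir=$(mktemp -d)

# Each line: <repository path> <destination node>
cat > "$routes" <<'EOF'
a/repo1.git node1
a/repo1.git node2
b/repo2.git node1
EOF

# Build one file list per node, so each node needs a single rsync run.
awk -v dir="$listdir" '{ print $1 > (dir "/" $2 ".rsync") }' "$routes"

# Print (rather than run) one rsync invocation per node.
for list in "$listdir"/*.rsync; do
  node=$(basename "$list" .rsync)
  echo "rsync -a --files-from=$list /data/repositories/ $node:/data/repositories/"
done
```

With N repositories and M nodes this issues M transfers instead of N, which is where most of the per-repository connection overhead goes away.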
Verbose rsync when using `ghe-restore` or `ghe-backup` with `-v`. Print `ghe-hook-env-update` output when using verbose mode only. /cc @dbussink
Clusters: speedup repositories restore
git-hooks backup/restore fixes
Bump version: 2.6.2
Fixes a regression in backup-utils 2.6.2 when backing up GitHub Enterprise clusters not using custom pre-receive Git hooks environments.
Git hooks backup fixes
* Use headers for linkability
* Some other minor tweaks
…up-utils into buckelij/redis-retry
Use default niceness for restores
Retry loop for redis-cli BGSAVE
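The commit above does not show its code, so the following is just a generic sketch of a retry loop of the kind the title describes. The `retry` function, its arguments, and the `flaky` stand-in command are all hypothetical; how a failed `BGSAVE` is detected in backup-utils is not stated in the source.

```shell
# Hypothetical helper: retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts are exhausted.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"        # wait before the next attempt
    fi
    i=$((i + 1))
  done
  return 1
}

# Stand-in for a command that fails twice before succeeding.
counter=$(mktemp)
echo 0 > "$counter"
flaky() {
  local n
  n=$(( $(cat "$counter") + 1 ))
  echo "$n" > "$counter"
  [ "$n" -ge 3 ]
}

retry 5 0 flaky && echo "succeeded after $(cat "$counter") attempts"
```

In a real script the stand-in would be replaced by whatever check wraps the `redis-cli BGSAVE` call, with a non-zero delay between attempts.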
Improve detection of failures in cluster backup rsync threads
Use existing Elasticsearch indices to speed up transfer during a restore
Include the user data directory in the benchmark name
Bump version: 2.10.0
Explicitly state OpenSSH in requirements
jeluhu pushed a commit that referenced this pull request on Jun 12, 2023: [PERF] Change the restore logic
No description provided.