Backup Deletion #33

Merged: 23 commits into dev from dont-fear-the-rebase, Oct 5, 2023

@OpenBagTwo OpenBagTwo commented Oct 5, 2023

Summary

Implements #8

List of Changes

  • Adds a new verb: gsb delete which can be used to delete one or more backups
    • These backups can be specified as tags, commit shortnames or full commit hashes
    • If no revisions are provided, the user will be prompted to enter the revision names (multiselect: yes, select-by-number: no)
    • This can only delete revisions within the linear commit history of the current HEAD (discoverable via repo.walk())
    • Deletion itself is performed by manually implementing the equivalent of an interactive rebase, with each deleted commit getting squashed into the next commit (but with the latter commit's message being kept):
      1. Finding the oldest revision to delete and creating a new branch that's reset to that commit's parent
      2. Fast-forwarding (git reset --hard commit && git reset --soft head) to each subsequent backup that won't be deleted
      3. Re-committing that backup to the new history
      4. Updating tags if all goes well
      5. Deleting the current gsb branch and pointing it to the rewritten branch's HEAD
    • If the history-rewrite fails for any reason, the original branch is retained, so this should be pretty safe
    • On the other hand, this does not trigger the actual deletion of any objects. See the discussion in libgit2/libgit2#3247 ("APIs for git-gc and git-reflog")
  • Under the hood, updates the pygit2 wrapper module to support:
    • branch checkouts (both new and existing)
    • branch deletions
    • tag deletions
    • choosing the initial branch name on git init (for use in testing)
    • setting timestamps explicitly when performing a git commit
  • backup.create_backup now allows for separately specifying the tag name and commit message (though this is not exposed to the CLI)
    • the default commit message is now "GSB-managed commit" instead of "Manual backup"
  • history.get_history now has two additional features (again, not exposed to the CLI):
  • On the testing front, _repo_with_history has been moved to conftest.py where the expensive module-level fixture can be used globally
    • As some additional sugar, the root fixture is a test-level fixture created by fully copying the repo-with-history so that tests can make additional revisions (or so we can, like, safely test deletions)
  • Finally, I attempted to implement #30 ("gsb history should show the last backup by default") in 417c5b4, but as the work was more complicated than it originally appeared, I ended up reverting the change so it could be tackled in a separate PR
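
To make the history-rewrite procedure from the list of changes concrete, here is a toy model in plain Python. It does not use git or pygit2, and it is not the code in this PR: `Commit`, `build_history`, and `rewrite_history` are hypothetical names invented for this sketch, and a "snapshot" string stands in for a commit's tree. It assumes a strictly linear history, as the description does.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    parent: str | None  # hash of the parent commit (None for the root)
    snapshot: str       # stand-in for the full tree captured by the backup
    message: str

    @property
    def hash(self) -> str:
        # As in git, the identifier depends on the parent, so re-parenting
        # a commit necessarily gives it a new hash.
        payload = f"{self.parent}|{self.snapshot}|{self.message}"
        return hashlib.sha1(payload.encode()).hexdigest()


def build_history(backups: list[tuple[str, str]]) -> list[Commit]:
    """Chain (snapshot, message) pairs into a linear history, oldest first."""
    history: list[Commit] = []
    for snapshot, message in backups:
        parent = history[-1].hash if history else None
        history.append(Commit(parent, snapshot, message))
    return history


def rewrite_history(
    history: list[Commit], doomed: set[str], tags: dict[str, str]
) -> tuple[list[Commit], dict[str, str]]:
    """Return a new history and tag map with the doomed commits removed."""
    positions = [i for i, c in enumerate(history) if c.hash in doomed]
    if len(positions) != len(doomed):
        raise ValueError("every revision must be in the linear history")
    if not positions:
        return list(history), dict(tags)
    oldest = positions[0]
    # Step 1: the new branch starts at the parent of the oldest doomed
    # commit, so everything before it is reused untouched.
    rewritten = list(history[:oldest])
    remapped = {c.hash: c.hash for c in rewritten}
    # Steps 2-3: re-commit each surviving snapshot on top of the new branch,
    # keeping the surviving commit's own message.
    for commit in history[oldest:]:
        if commit.hash in doomed:
            continue
        parent = rewritten[-1].hash if rewritten else None
        replacement = Commit(parent, commit.snapshot, commit.message)
        rewritten.append(replacement)
        remapped[commit.hash] = replacement.hash
    # Step 4: move surviving tags to the rewritten commits; drop the rest.
    new_tags = {
        name: remapped[target]
        for name, target in tags.items()
        if target in remapped
    }
    # Step 5 (swapping the branch pointer) is left to the caller.
    return rewritten, new_tags


history = build_history(
    [("world-v1", "first"), ("world-v2", "oops"), ("world-v3", "third")]
)
tags = {"gsb-1": history[0].hash, "gsb-3": history[2].hash}
new_history, new_tags = rewrite_history(history, {history[1].hash}, tags)
print([c.message for c in new_history])  # ['first', 'third']
```

Because the rewritten history is built as a separate list, the original is left intact if anything raises partway through, which mirrors the safety property described in the list.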

Tech Debt and Other Concerns

  • Both in the CLI prompting and in how history.get_history and fastforward.delete_backups walk the history, it feels like there's room for the core of the implementations to be shared. But at this point I think it's wise to postpone further abstraction until there's another use-case
  • The list of backups for the backup deletion CLI is currently built from two history.get_history calls (and thus two repo-history walks). It feels like this could be done all in one go. This is related to how history.get_history is getting so many non-CLI-facing modifications. At some point soon it seems likely that we'll need to abstract away the innards of the get_history method into a more flexible, lower-level utility.
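
As a rough illustration of the "all in one go" idea, the sketch below consumes a single history walk and splits it into two lists. It assumes, without confirmation from this PR, that the two existing calls separate tagged from untagged revisions. `Revision`, `walk`, and `split_by_tag` are hypothetical names and not part of gsb.

```python
from typing import Iterable, Iterator, List, NamedTuple, Optional, Tuple


class Revision(NamedTuple):
    identifier: str
    tag: Optional[str]  # None for an untagged backup


def walk(history: List[Revision]) -> Iterator[Revision]:
    """Stand-in for one (potentially expensive) walk of the repo history."""
    yield from history


def split_by_tag(
    revisions: Iterable[Revision],
) -> Tuple[List[Revision], List[Revision]]:
    """Consume the walk once, sorting revisions into (tagged, untagged)."""
    tagged: List[Revision] = []
    untagged: List[Revision] = []
    for revision in revisions:
        (tagged if revision.tag is not None else untagged).append(revision)
    return tagged, untagged


history = [Revision("c3", "gsb-2"), Revision("c2", None), Revision("c1", "gsb-1")]
tagged, untagged = split_by_tag(walk(history))
```

Passing the generator straight in means the walk is consumed exactly once, which is the point of the exercise.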

Validation Performed

  • I created a dummy repo (non-GSB, actually), used the Git CLI to create a history and then started deleting the history via gsb delete, in both prompted and argument modes.
  • I also ran gsb delete (no args) in one of my big gsb-managed saves and got the prompt I was expecting.
  • Before merging this PR, I will use gsb delete to delete a backup on one of my actual game-save backups, will follow that up by running git gc and will observe that the size of the repo has decreased

PR Type

  • This PR introduces a breaking change (will
    require a bump in the minor version)
  • The changes in this PR are high urgency and necessitate a hotfix or patch
    release (will require rebasing off of release)
  • This is a release (staging) PR (maintainer use only)

Checklist:

  • I have read the contributor's guide
  • I have run mkdocs serve locally and ensured that all API docs and
    changes I have made to the static pages are rendering correctly, with all links
    working
  • All tech debt concerns have been resolved, documented as issues, or otherwise
    accepted
  • I agree to license my contribution to this project under
    the GNU General Public License v3

- refactoring the "repo with history" fixture for shared use
- fixing some implementation mistakes with the fastforward stump
Only issue is handling pre-GSB backups.
If the branch doesn't exist, then why would you be trying to check it out?
Also: change test module name to align with CLI
... which important-logs the full history (which I don't want) and doesn't even let me use full commit hashes
also a fixture refactor so that ALL tests are using the cloned repo
Out of scope, but why not?
I just realized that with the numbering it's a lot more complicated, and since
it's outside the scope of this PR, I need to tackle it separately.

This reverts commit 417c5b4.
@OpenBagTwo OpenBagTwo linked an issue Oct 5, 2023 that may be closed by this pull request
It's what the fixture is for
@OpenBagTwo
Owner Author

$ cd /path/to/my/minecraft/worlds/backup
$ du -h -d0 .
64G	.
$ git gc
one eternity later
Enumerating objects: 55317, done.
Counting objects: 100% (55317/55317), done.
Delta compression using up to 24 threads
Compressing objects: 100% (22202/22202), done.
Writing objects: 100% (55317/55317), done.
Total 55317 (delta 37312), reused 48423 (delta 32838), pack-reused 0
Enumerating cruft objects: 1548, done.
Traversing cruft objects: 3255, done.
Counting objects: 100% (1548/1548), done.
Delta compression using up to 24 threads
Compressing objects: 100% (1238/1238), done.
Writing objects: 100% (1548/1548), done.
Total 1548 (delta 710), reused 884 (delta 310), pack-reused 0
$ du -h -d0 .
55G	.
$ gsb delete <17-untagged-backup-hashes>
Could not delete branch gsb:
    'Branch not found: gsb'
To permanently delete these backups, run the command:
  git gc
$ du -h -d0
55G	.
$ git gc
a few moments later
Enumerating objects: 55328, done.
Counting objects: 100% (55328/55328), done.
Delta compression using up to 24 threads
Compressing objects: 100% (17739/17739), done.
Writing objects: 100% (55328/55328), done.
Total 55328 (delta 37312), reused 55307 (delta 37312), pack-reused 0
Enumerating cruft objects: 1558, done.
Traversing cruft objects: 3274, done.
Counting objects: 100% (1558/1558), done.
Delta compression using up to 24 threads
Compressing objects: 100% (848/848), done.
Writing objects: 100% (1558/1558), done.
Total 1558 (delta 711), reused 1557 (delta 710), pack-reused 0
$ du -h -d0 .
55G	.

Oh right. The original branch wasn't gsb and thus was not deleted (that was the whole "Could not delete branch gsb" thing), so those old commits aren't actually orphaned.

$ git branch
* gsb
  main
$ git branch -D main
Deleted branch main (was 05a457b2).
$ git gc
a few moments later
Enumerating objects: 55328, done.
Counting objects: 100% (55328/55328), done.
Delta compression using up to 24 threads
Compressing objects: 100% (17739/17739), done.
Writing objects: 100% (55328/55328), done.
Total 55328 (delta 37312), reused 55328 (delta 37312), pack-reused 0
Enumerating cruft objects: 1558, done.
Traversing cruft objects: 3274, done.
Counting objects: 100% (1558/1558), done.
Delta compression using up to 24 threads
Compressing objects: 100% (847/847), done.
Writing objects: 100% (1558/1558), done.
Total 1558 (delta 711), reused 1558 (delta 711), pack-reused 0
$ du -h -d0 .
55G	.

🤔

one stack exchange later

$ git gc --prune=now
Enumerating objects: 55328, done.
Counting objects: 100% (55328/55328), done.
Delta compression using up to 24 threads
Compressing objects: 100% (17739/17739), done.
Writing objects: 100% (55328/55328), done.
Total 55328 (delta 37312), reused 55328 (delta 37312), pack-reused 0
$ du -h -d0
54G	.

After all that back-and-forth that's not hugely impressive, but it is something...

Anyway, clearly the message needs to be updated, because by default git gc won't delete diddly until the unreachable objects are at least two weeks old.
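
For anyone wanting to repeat the before-and-after measurement without typing du by hand, here is a minimal sketch in Python. The helpers `directory_size`, `gc_command`, and `collect_garbage` are hypothetical and not part of gsb. The sketch assumes the git executable is on the PATH, and it sums apparent file sizes, so the numbers will not match du's disk-usage figures exactly.

```python
from __future__ import annotations

import os
import subprocess
from pathlib import Path


def directory_size(root: Path) -> int:
    """Total size in bytes of the regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if not path.is_symlink():
                total += path.stat().st_size
    return total


def gc_command(prune_now: bool = True) -> list[str]:
    """Build the git gc invocation, optionally pruning without the grace period."""
    command = ["git", "gc"]
    if prune_now:
        command.append("--prune=now")
    return command


def collect_garbage(repo: Path, prune_now: bool = True) -> tuple[int, int]:
    """Run git gc inside repo and return its size in bytes before and after."""
    before = directory_size(repo)
    subprocess.run(gc_command(prune_now), cwd=repo, check=True)
    after = directory_size(repo)
    return before, after


# Example (not executed here, since it rewrites the repo's object store):
# before, after = collect_garbage(Path("/path/to/backup"))
# print(f"freed {before - after} bytes")
```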

@OpenBagTwo
Owner Author

OpenBagTwo commented Oct 5, 2023

Message now reads:

Deleted backups are now marked as "loose." To delete them immediately, run the command:
  git gc --aggressive --prune=now

(as tested on my small artificial repo)

@OpenBagTwo
Owner Author

OpenBagTwo commented Oct 5, 2023

I also tested this on one of my coding git repos:

$ du -h -d0 .
98M     .
$ git prune
$ du -h -d0 .
88M     .
$ git gc --aggressive --prune=now
Enumerating objects: 5391, done.
Counting objects: 100% (5391/5391), done.
Delta compression using up to 24 threads
Compressing objects: 100% (4744/4744), done.
Writing objects: 100% (5391/5391), done.
Total 5391 (delta 3717), reused 819 (delta 0), pack-reused 0
$ du -h -d0 .
62M     .

So this confirms that git gc --aggressive --prune=now is worth doing over just git prune (because of all that compression and packing, I guess).

I do like that git prune --dry-run enumerates all of the loose objects, though...

@OpenBagTwo OpenBagTwo merged commit 71a1357 into dev Oct 5, 2023
4 checks passed
@OpenBagTwo OpenBagTwo deleted the dont-fear-the-rebase branch October 5, 2023 18:33
@OpenBagTwo OpenBagTwo mentioned this pull request Oct 11, 2023
11 tasks
Development

Successfully merging this pull request may close these issues.

Delete a backup