Support prune empty commit #27

Open
is opened this issue Nov 8, 2013 · 12 comments

Comments

@is

is commented Nov 8, 2013

Like git filter-branch --prune-empty

@mr-c

mr-c commented Apr 4, 2014

This issue is a big deal.

BFG appears to only remove empty commits from the descendants of HEAD.

I didn't notice the discrepancy at first, and it caused a big headache for me. I ended up redoing the cleanup with git filter-branch and --prune-empty; fortunately it only took 2 minutes to run.

Here's a complete rundown of what I ran, including stripping out the synthetic GitHub pull request refs.

git clone --mirror git@github.com:ged-lab/khmer.git
cd khmer.git
du -hs
# 113M
git config --unset-all remote.origin.fetch
git config --add remote.origin.fetch '+refs/heads/*:refs/heads/*'
git config --add remote.origin.fetch '+refs/tags/*:refs/tags/*'
rm -Rf refs/pull
sed -i '/.*pull.*/d' packed-refs
# use a modified git-largest-object.sh to sort based on packed size & to work on bare repo
# examine output to craft the next command
PATHS="space separated list of paths to remove"
# double quotes so the calling shell expands $PATHS before filter-branch runs the filter
git filter-branch --force --index-filter "git rm --cached --ignore-unmatch $PATHS" --prune-empty --tag-name-filter cat -- --all
# takes ~1 minute; curly-bracket globs don't work here
rm -Rf refs/original
git reflog expire --expire=now --all
git gc --prune=now
git gc --aggressive --prune=now
du -hs
# 25M
#git push
# when you're ready
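
The modified git-largest-object.sh mentioned in the rundown is not shown in this thread. As a rough stand-in, the sketch below lists the largest blobs by on-disk (packed) size using only stock git commands. The function name largest_blobs and the default of 20 entries are assumptions made for illustration. It only reads the object database, so it should also work in a bare or mirror clone.

```shell
#!/bin/sh
# Sketch only: NOT the script used above, just one way to get similar output.
# Prints "<type> <on-disk size> <object id> <path>" for the largest blobs
# reachable from any ref, biggest first.
largest_blobs() {
    count="${1:-20}"   # hypothetical default: show the top 20
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize:disk) %(objectname) %(rest)' |
        awk '$1 == "blob"' |
        sort -k2,2 -n -r |
        head -n "$count"
}
```

Running largest_blobs 10 inside the repository and reading the path column is one way to build the list of paths to remove.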

@rtyley
Owner

rtyley commented Apr 4, 2014

BFG appears to only remove empty commits from the descendants of HEAD.

Ah, you might be getting a bit confused here - The BFG currently doesn't remove empty commits from anywhere - descendants of HEAD or no.

If you're not seeing empty commits in the history of your HEAD commit, but are seeing empty commits in other branches, this is probably because The BFG protects the contents of the HEAD commit by default, and generally won't remove files from history if they're already present in a protected commit:

http://rtyley.github.io/bfg-repo-cleaner/#protected-commits

...so if the history of your HEAD doesn't have empty commits, that's just because the contents were protected by your HEAD commit, and so the corresponding contents will not have been removed. In other branches, unprotected content will have been removed, and this may well have led to commits on those branches becoming empty diffs.
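
One quick way to test whether protection could be involved is to see whether HEAD still contains any of the file names being deleted. The sketch below is only a first check under that assumption. The function name protected_paths is made up for illustration, and it compares plain file names only, which is a simplification of whatever matching The BFG does internally.

```shell
#!/bin/sh
# Sketch only: print every path in the HEAD tree whose file name (last path
# component) equals one of the names given as arguments.
protected_paths() {
    git ls-tree -r --name-only HEAD |
        while IFS= read -r path; do
            base=${path##*/}          # strip the directory part
            for name in "$@"; do
                if [ "$base" = "$name" ]; then
                    printf '%s\n' "$path"
                fi
            done
        done
}
```

For example, protected_paths part-test.fa MSB2-surrender.fa run inside the repository prints any matching paths still present in HEAD. No output means none of those names appear in the HEAD commit.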

@mr-c

mr-c commented Apr 5, 2014

Then why did these branches diverge so much?

# original
mcrusoe@athyra:~/khmer/reposhrink/khmer.backup$ git show --raw `git merge-base master origin/feature/citations`
commit 58af106053356dfcb4a43bbc0a6f1614f7d5ac44
Author: C. Titus Brown <titus@idyll.org>
Date:   Mon Mar 31 23:21:45 2014 -0400

    added screed __version__ to info()

:100644 100644 3a9f0f0... a8f3ea9... M  khmer/khmer_args.py
# post bfg
mcrusoe@athyra:~/khmer/reposhrink/mirror-khmer.backup/khmer-bfg2.git$ git show --raw `git merge-base feature/citations master`
commit 0aaef480d02bcc955ebba0655dc1323fbe51ccc3
Author: C. Titus Brown <titus@idyll.org>
Date:   Sat Sep 18 18:42:30 2010 -0400

    fixed consume_fasta_and_tag for density approach

:100644 100644 0c79820... 73f2702... M  lib/hashbits.cc

Here is the output of that bfg run in between:

mcrusoe@athyra:~/khmer/reposhrink/mirror-khmer.backup/khmer-bfg2.git$ java -jar ~/khmer/gl-master/bfg-1.11.2.jar -D '{test-overlap1.ht,stamps-reads.fa.gz.bin,stamps-reads.fa.gz.bin.index,1m-filtered.fa,MSB2-surrender.fa,25k.fq.gz.bin,part-test.fa}' .

Using repo : /home/mcrusoe/khmer/reposhrink/mirror-khmer.backup/khmer-bfg2.git/.

Found 677 objects to protect
Found 6 tag-pointing refs : refs/tags/2012-assembly-artifacts, refs/tags/2012-paper-kmer-percolation, refs/tags/2013-khmer-counting, ...
Found 39 commit-pointing refs : HEAD, refs/heads/bleeding-edge, refs/heads/calc-median-updates, ...

Protected commits
-----------------

These are your protected commits, and so their contents will NOT be altered:

 * commit 2988a630 (protected by 'HEAD')

Cleaning
--------

Found 4161 commits
Cleaning commits:       100% (4161/4161)
Cleaning commits completed in 3,991 ms.

Updating 44 Refs
----------------

        Ref                                          Before     After   
        ----------------------------------------------------------------
        refs/heads/bleeding-edge                   | 97f15c59 | 8060ee61
        refs/heads/calc-median-updates             | c41afcd7 | 74faee10
        refs/heads/calc_best_assembly              | f1f9f5fa | bc4cc5d5
        refs/heads/docs_comparison_info            | 69bda540 | f8d62545
        refs/heads/feature/citations               | ebeb8501 | 76b4891d
        refs/heads/feature/hll-counter             | 0d449acb | 4483e1e0
        refs/heads/feature/missing_file_exceptions | 8615c17e | 8fdd72b9
        refs/heads/fishjord_graphalign             | d145b3bc | c4dc6518
        refs/heads/fix/count_overlap               | 850fd8ce | 95ca7078
        refs/heads/fix/hash_sizes                  | 5ee9cebb | 41538717
        refs/heads/galaxy-integration              | dd2648b2 | 47daa076
        refs/heads/graphalign-fj                   | e15ebb6c | 6066f4fb
        refs/heads/kmer_error_profile              | 5fd0e671 | 49da8218
        refs/heads/label_align                     | c1f25e8c | c7403ae6
        refs/heads/label_traverse                  | 2674402c | adfaabd3
        refs/heads/legacy                          | a2766d47 | 4faa86dc
        refs/heads/location_kmer                   | db03b0f1 | e52e2827
        refs/heads/master                          | 2988a630 | 17f8cc9a
        refs/heads/mwright/opt_nbm                 | 2184c2f1 | 6760460f
        refs/heads/parallel                        | 2c4ef321 | 34825b8e
        refs/heads/partition_fq_fix_legacy         | 52bbb5ff | 80d36d58
        refs/heads/protocols-v0.8.5                | 1cd14221 | 214dfd0a
        refs/heads/refactor/cython_bindings        | 7b757e47 | 39ad33d4
        refs/heads/reservoir_sampling2             | 253185b4 | eeef5570
        refs/heads/sparse_median                   | a58d8915 | 9723f244
        refs/heads/split_interleave                | 7a5c8331 | 4321513b
        refs/heads/update_trimmomatic_legacy       | 509068ea | 36cbcc81
        refs/tags/2012-assembly-artifacts          | 91af914b | 5cfd1f55
        refs/tags/2012-paper-diginorm              | c8e942ae | 49d59ef7
        refs/tags/2012-paper-kmer-percolation      | 33afbdf3 | 4129cff4
        refs/tags/2013-caltech-cemi                | 8bd74039 | bd1fd107
        refs/tags/2013-khmer-counting              | 20f56b27 | 21f51689
        refs/tags/iPlantDiscoveryEnvironment       | 1834774b | 1683c416
        refs/tags/protocols-v0.8.3                 | 997b7de6 | e5321213
        refs/tags/protocols-v0.8.5                 | 997b7de6 | e5321213
        refs/tags/v0.5                             | ca0c9919 | 6a782dba
        refs/tags/v0.6.1                           | 849a9362 | 558126d5
        refs/tags/v0.7                             | 656f7570 | 4b0af905
        refs/tags/v0.7.1                           | 5a3f3597 | ff867b10
        refs/tags/v0.8                             | f923ecf3 | 5e0280b1
        refs/tags/v0.8-rc1                         | dcb7ce6b | 58b31051
        refs/tags/v0.8-rc2                         | 43983a0b | 8542925a
        refs/tags/v0.8-rc3                         | 4f026c07 | fde26162
        refs/tags/v1.0                             | 2988a630 | 17f8cc9a

Updating references:    100% (44/44)
...Ref update completed in 76 ms.

Commit Tree-Dirt History
------------------------

        Earliest                                              Latest
        |                                                          |
        ...DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD

        D = dirty commits (file tree fixed)
        m = modified commits (commit message or parents changed)
        . = clean commits (no changes to file tree)

                                Before     After   
        -------------------------------------------
        First modified commit | 8e6b90d0 | d2a8dfe0
        Last dirty commit     | 0d449acb | 4483e1e0


In total, 6132 object ids were changed - a record of these will be written to:

        /home/mcrusoe/khmer/reposhrink/mirror-khmer.backup/khmer-bfg2.git/..bfg-report/2014-04-04/20-14-31/object-id-map.old-new.txt

BFG run is complete!

@mr-c

mr-c commented Apr 5, 2014

In the BFG'd repo, the following empty commit comes after 0aaef480d02bcc955ebba0655dc1323fbe51ccc3 and is the source of the divergence:

mcrusoe@athyra:~/khmer/reposhrink/mirror-khmer.backup/khmer-bfg2.git$ git show --raw 5ddd819fd896e3dfa1d60dd0c39cd3a68b711c7c
commit 5ddd819fd896e3dfa1d60dd0c39cd3a68b711c7c
Author: C. Titus Brown <titus@idyll.org>
Date:   Sat Sep 18 22:06:46 2010 -0400

    added annoying surrender set
mcrusoe@athyra:~/khmer/reposhrink/mirror-khmer.backup/khmer-bfg2.git$ grep 5ddd819fd896e3dfa1d60dd0c39cd3a68b711c7c ..bfg-report/2014-04-04/20-14-31/object-id-map.old-new.txt | awk '{ print $2 }'
60c191b93a34b9cf81953a17e10ae2d7bfdff848

The original version of that commit:

mcrusoe@athyra:~/khmer/reposhrink/mirror-khmer.backup/khmer.git$ git show --raw 60c191b93a34b9cf81953a17e10ae2d7bfdff848
commit 60c191b93a34b9cf81953a17e10ae2d7bfdff848
Author: C. Titus Brown <titus@idyll.org>
Date:   Sat Sep 18 22:06:46 2010 -0400

    added annoying surrender set

:000000 100644 0000000... 9ebab4b... A  data/MSB2-surrender.fa
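
The lookup in the object-id map can be made less error-prone by printing both columns and stating which one matched. The sketch below assumes each line of the map has the form "<old-id> <new-id>", as the file name object-id-map.old-new.txt suggests. The function name lookup_id and its output format are invented for illustration.

```shell
#!/bin/sh
# Sketch only: look up a (possibly abbreviated) object id in a BFG
# object-id map and report the old/new pair for every line it matches.
# Assumption: each line is "<old-id> <new-id>".
lookup_id() {
    map_file=$1
    id=$2
    awk -v id="$id" '
        index($1, id) == 1 { print "old=" $1 " new=" $2 " (matched old)" }
        index($2, id) == 1 { print "old=" $1 " new=" $2 " (matched new)" }
    ' "$map_file"
}
```

Called as lookup_id path/to/object-id-map.old-new.txt 5ddd819f, it prints one line per match, so there is no need to remember which awk column holds the pre-cleaning id.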

@rtyley
Owner

rtyley commented Apr 5, 2014

Could you share the original repo (before cleaning) with me?

Thanks for that diagnostic information you've already sent - unfortunately it doesn't quite give me enough information to get a clear picture of the evidence for your assertion that The BFG sometimes prunes empty commits. By 'pruning empty commits', I mean entire commits being removed from commit history when they no longer contain any file changes in their cleaned form, and as I said, I don't think The BFG currently does that at all.

rtyley added a commit that referenced this issue May 4, 2014
This feature removes commits that- after the cleaning process -contain *no*
file-tree change when compared to their parent commit. This would be
because the cleaning process has cleaned away whatever content it was that
was _changing_ in the original commit.

The option is off by default, it's activated by using the
`--prune-empty-commits` flag, eg:

$ bfg --delete-files foo --prune-empty-commits

#27
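
To see which commits such an option would be expected to drop, or to confirm afterwards that none are left, the definition above (no file-tree change compared to the parent) can be checked with stock git. This is a sketch under stated assumptions, not The BFG's implementation: the function name list_empty_commits is made up, and root commits and merge commits are deliberately skipped.

```shell
#!/bin/sh
# Sketch only: print every single-parent commit, reachable from any ref,
# whose tree is identical to its parent's tree (an "empty" commit).
list_empty_commits() {
    git rev-list --all --min-parents=1 --max-parents=1 |
        while IFS= read -r commit; do
            tree=$(git rev-parse "$commit^{tree}")
            parent_tree=$(git rev-parse "$commit^^{tree}")   # tree of the first parent
            if [ "$tree" = "$parent_tree" ]; then
                printf '%s\n' "$commit"
            fi
        done
}
```

Running list_empty_commits | wc -l before and after a cleaning run gives a rough count of how many empty commits the run left behind.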
rtyley added a commit that referenced this issue May 14, 2014
@lipnitsk

@rtyley, are you planning to merge these changes to master? I think this is a very useful feature.

@bogdanm

bogdanm commented Aug 21, 2015

+1. Can indeed be very useful.

@gaborbernat

+1

@okravets

okravets commented Sep 2, 2015

+1

@rtyley
Owner

rtyley commented Sep 2, 2015

Working on the many open source projects I give to the community takes up a
large proportion of my spare time. If you'd like to support development of
this feature for the BFG, please donate to help me at
https://www.bountysource.com/teams/bfg-repo-cleaner

mdengler pushed a commit to mdengler/bfg-repo-cleaner that referenced this issue Dec 22, 2015
javabrett pushed a commit to javabrett/bfg-repo-cleaner that referenced this issue May 13, 2016
javabrett pushed a commit to javabrett/bfg-repo-cleaner that referenced this issue Jan 17, 2017
@wuganhao

wuganhao commented Dec 6, 2017

+1

javabrett pushed a commit to javabrett/bfg-repo-cleaner that referenced this issue Feb 6, 2018
@farribeiro

farribeiro commented Sep 5, 2018

+1

#121 ?
#147 ?

Whoaa512 added a commit to Whoaa512/bfg-repo-cleaner that referenced this issue Apr 13, 2020
rtyley#147

Squashed commit of the following:

commit 850d967
Author: Brett Randall <javabrett@gmail.com>
Date:   Tue Feb 6 20:39:47 2018 +1100

    Updated --prune-empty-commits test: specs2 -> scalatest.

commit c008b83
Author: Brett Randall <javabrett@gmail.com>
Date:   Mon May 16 09:17:33 2016 +1000

    Consider --prune-empty-commits option as work on-its-own, allow BFG to run with prune-empty-commits as its only cleaning-task.

commit ea4c8a2
Author: Brett Randall <javabrett@gmail.com>
Date:   Fri May 13 23:00:31 2016 +1000

    API updates to bring this up to master 8abe03c 1.12.13-SNAPSHOT.

commit 56c4cfe
Author: Martin Dengler <martin@martindengler.com>
Date:   Tue Dec 22 14:08:39 2015 -0600

    Prune empty commits test typo fix

commit 8b6366d
Author: Roberto Tyley <roberto.tyley@gmail.com>
Date:   Fri May 9 09:11:54 2014 +0100

    Add nasty nasty code to address pruning the initial commit...

    ...do we want to go this far!?

commit 1caf6f1
Author: Roberto Tyley <roberto.tyley@gmail.com>
Date:   Sat May 10 13:01:54 2014 +0100

    Prune empty commits test

commit 2f866b5
Author: Roberto Tyley <roberto.tyley@gmail.com>
Date:   Sun Apr 6 23:11:14 2014 +0100

    Add the option to prune empty commits (issue rtyley#27)

    This feature removes commits that- after the cleaning process -contain *no*
    file-tree change when compared to their parent commit. This would be
    because the cleaning process has cleaned away whatever content it was that
    was _changing_ in the original commit.

    The option is off by default, it's activated by using the
    `--prune-empty-commits` flag, eg:

    $ bfg --delete-files foo --prune-empty-commits

    rtyley#27

9 participants