New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Two small 'git repack' fixes #1098
Conversation
/submit |
Submitted as pull.1098.git.1639758526.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
@@ -693,7 +693,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) | |||
write_bitmaps = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Jeff King wrote (reply to this):
On Fri, Dec 17, 2021 at 04:28:45PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Historically, we needed a single packfile in order to have reachability
> bitmaps. This introduced logic that when 'git repack' had a '-b' option
> that we should stop sending the '--honor-pack-keep' option to the 'git
> pack-objects' child process, ensuring that we create a packfile
> containing all reachable objects.
>
> In the world of multi-pack-index bitmaps, we no longer need to repack
> all objects into a single pack to have valid bitmaps. Thus, we should
> continue sending the '--honor-pack-keep' flag to 'git pack-objects'.
>
> The fix is very simple: only disable the flag when writing bitmaps but
> also _not_ writing the multi-pack-index.
>
> This opens the door to new repacking strategies that might want to keep
> some historical set of objects in a stable pack-file while only
> repacking more recent objects.
That all makes sense. Another way of thinking about it: we disable
--honor-pack-keep so we that keep packs do not interfere with writing a
pack bitmap. But with --write-midx, we skip the pack bitmap entirely.
In the end it's the same, but I wanted to emphasize that the important
hing is not so much that we are writing a midx bitmap as that we are
_not_ writing a pack bitmap. And that is what makes this OK to do (that
the repack code already disables the pack bitmap when writing a midx
one).
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 9b0be6a6ab3..1f128b7c90b 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -693,7 +693,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
> write_bitmaps = 0;
> }
> if (pack_kept_objects < 0)
> - pack_kept_objects = write_bitmaps > 0;
> + pack_kept_objects = write_bitmaps > 0 && !write_midx;
So the code change here looks good.
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 0260ad6f0e0..8c4ba6500be 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -372,4 +372,19 @@ test_expect_success '--write-midx with preferred bitmap tips' '
> )
> '
>
> +test_expect_success '--write-midx -b packs non-kept objects' '
> + git init midx-kept &&
> + test_when_finished "rm -fr midx-kept" &&
> + (
> + cd midx-kept &&
> + test_commit_bulk 100 &&
> + GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
> + git repack --write-midx -a -b &&
> + cat trace.txt | \
> + grep \"event\":\"start\" | \
> + grep pack-objects | \
> + grep \"--honor-pack-keep\"
> + )
> +'
This looks correct, though:
- do we really need this separate repo directory? The test before it
uses one, but only because it needs a very specific set of commits.
There is a long-running "midx" directory we could use, though I
think just the regular test repo would be fine, too.
- likewise, do we need 100 commits? They are not too expensive these
days with test_commit_bulk, but I think we don't even care about the
repo contents at all.
- there is no actual .keep file in your test. That's OK, as we are
just checking the args passed to pack-objects.
- useless use of cat. :) Also, you could probably do it all with one
grep. This is bikeshedding, of course, but it's nice to keep process
counts low in the test suite. Also, your middle grep is looser than
the others (it doesn't check for surrounding quotes).
So something like this would work:
test_expect_success '--write-midx -b packs non-kept objects' '
GIT_TRACE2_EVENT="$(pwd)/midx-keep-bitmap.trace" \
git -C midx repack --write-midx -a -b &&
grep "\"event\":\"start\".*\"pack-objects\".*\"--honor-pack-keep\"" \
midx-keep-bitmap.trace
'
One could perhaps argue that the combined grep is less readable. If
that's a concern, I'd suggest wrapping it in a function like:
# usage: check_trace2_arg <trace_file> <cmd> <arg>
check_trace2_arg () {
grep "\"event\":\"start\".*\"$2\".*\"$3\"" "$1"
}
All just suggestions, of course. I'd be happy enough to see the patch go
in as-is.
-Peff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Derrick Stolee wrote (reply to this):
On 12/17/2021 12:24 PM, Jeff King wrote:
> On Fri, Dec 17, 2021 at 04:28:45PM +0000, Derrick Stolee via GitGitGadget wrote:
...
>> +test_expect_success '--write-midx -b packs non-kept objects' '
>> + git init midx-kept &&
>> + test_when_finished "rm -fr midx-kept" &&
>> + (
>> + cd midx-kept &&
>> + test_commit_bulk 100 &&
>> + GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
>> + git repack --write-midx -a -b &&
>> + cat trace.txt | \
>> + grep \"event\":\"start\" | \
>> + grep pack-objects | \
>> + grep \"--honor-pack-keep\"
>> + )
>> +'
>
> This looks correct, though:
>
> - do we really need this separate repo directory? The test before it
> uses one, but only because it needs a very specific set of commits.
> There is a long-running "midx" directory we could use, though I
> think just the regular test repo would be fine, too.
>
> - likewise, do we need 100 commits? They are not too expensive these
> days with test_commit_bulk, but I think we don't even care about the
> repo contents at all.
>
> - there is no actual .keep file in your test. That's OK, as we are
> just checking the args passed to pack-objects.
>
> - useless use of cat. :) Also, you could probably do it all with one
> grep. This is bikeshedding, of course, but it's nice to keep process
> counts low in the test suite. Also, your middle grep is looser than
> the others (it doesn't check for surrounding quotes).
>
> So something like this would work:
>
> test_expect_success '--write-midx -b packs non-kept objects' '
> GIT_TRACE2_EVENT="$(pwd)/midx-keep-bitmap.trace" \
> git -C midx repack --write-midx -a -b &&
> grep "\"event\":\"start\".*\"pack-objects\".*\"--honor-pack-keep\"" \
> midx-keep-bitmap.trace
> '
>
> One could perhaps argue that the combined grep is less readable. If
> that's a concern, I'd suggest wrapping it in a function like:
>
> # usage: check_trace2_arg <trace_file> <cmd> <arg>
> check_trace2_arg () {
> grep "\"event\":\"start\".*\"$2\".*\"$3\"" "$1"
> }
>
> All just suggestions, of course. I'd be happy enough to see the patch go
> in as-is.
Thanks for the suggestions. I decided to adapt the 'test_subcommand'
helper into a 'test_subcommand_inexact' helper. The existing helper
requires the full argument list in exact order, but the new one only
cares about the given arguments (in that relative order).
Thanks,
-Stolee
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Jeff King wrote (reply to this):
On Mon, Dec 20, 2021 at 08:40:09AM -0500, Derrick Stolee wrote:
> > One could perhaps argue that the combined grep is less readable. If
> > that's a concern, I'd suggest wrapping it in a function like:
> >
> > # usage: check_trace2_arg <trace_file> <cmd> <arg>
> > check_trace2_arg () {
> > grep "\"event\":\"start\".*\"$2\".*\"$3\"" "$1"
> > }
> >
> > All just suggestions, of course. I'd be happy enough to see the patch go
> > in as-is.
>
> Thanks for the suggestions. I decided to adapt the 'test_subcommand'
> helper into a 'test_subcommand_inexact' helper. The existing helper
> requires the full argument list in exact order, but the new one only
> cares about the given arguments (in that relative order).
Heh, I should have looked to see if we had faced this problem before.
Extending that family of helpers is better still than my suggestion.
-Peff
User |
@@ -612,7 +612,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) | |||
struct tempfile *refs_snapshot = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Jeff King wrote (reply to this):
On Fri, Dec 17, 2021 at 04:28:46PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> While testing some ideas in 'git repack', I ran it with '--quiet' and
> discovered that some progress output was still shown. Specifically, the
> output for writing the multi-pack-index showed the progress.
>
> The 'show_progress' variable in cmd_repack() is initialized with
> isatty(2) and is not modified at all by the '--quiet' flag. The
> '--quiet' flag modifies the po_args.quiet option which is translated
> into a '--quiet' flag for the 'git pack-objects' child process. However,
> 'show_progress' is used to directly send progress information to the
> multi-pack-index writing logic which does not use a child process.
>
> The fix here is to modify 'show_progress' to be false if po_opts.quiet
> is true, and isatty(2) otherwise. This new expectation simplifies a
> later condition that checks both.
Makes sense. I wondered if you might have to decide what to do with
"--progress --quiet", but we do not have an explicit progress option for
git-repack in the first place.
> This is difficult to test because the isatty(2) already prevents the
> progess indicators from appearing when we redirect stderr to a file.
You'd need test_terminal. Something like this:
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 8c4ba6500b..b673c49650 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -5,6 +5,7 @@ test_description='git repack works correctly'
. ./test-lib.sh
. "${TEST_DIRECTORY}/lib-bitmap.sh"
. "${TEST_DIRECTORY}/lib-midx.sh"
+. "${TEST_DIRECTORY}/lib-terminal.sh"
commit_and_pack () {
test_commit "$@" 1>&2 &&
@@ -387,4 +388,10 @@ test_expect_success '--write-midx -b packs non-kept objects' '
)
'
+test_expect_success TTY '--quiet disables progress' '
+ test_terminal env GIT_PROGRESS_DELAY=0 \
+ git -C midx repack -ad --quiet --write-midx 2>stderr &&
+ test_must_be_empty stderr
+'
+
test_done
-Peff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Derrick Stolee wrote (reply to this):
On 12/17/2021 1:10 PM, Jeff King wrote:
> On Fri, Dec 17, 2021 at 04:28:46PM +0000, Derrick Stolee via GitGitGadget wrote:
>> This is difficult to test because the isatty(2) already prevents the
>> progess indicators from appearing when we redirect stderr to a file.
>
> You'd need test_terminal. Something like this:
>
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 8c4ba6500b..b673c49650 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -5,6 +5,7 @@ test_description='git repack works correctly'
> . ./test-lib.sh
> . "${TEST_DIRECTORY}/lib-bitmap.sh"
> . "${TEST_DIRECTORY}/lib-midx.sh"
> +. "${TEST_DIRECTORY}/lib-terminal.sh"
>
> commit_and_pack () {
> test_commit "$@" 1>&2 &&
> @@ -387,4 +388,10 @@ test_expect_success '--write-midx -b packs non-kept objects' '
> )
> '
>
> +test_expect_success TTY '--quiet disables progress' '
> + test_terminal env GIT_PROGRESS_DELAY=0 \
> + git -C midx repack -ad --quiet --write-midx 2>stderr &&
> + test_must_be_empty stderr
> +'
> +
> test_done
Thanks. I added this test.
When first running the test, it failed because I didn't have the
IO::Pty Perl module installed. I'm not sure why I don't fail with
other tests that use test_terminal. If someone knows more about
what is going on, then maybe we need to expand the TTY prereq?
Thanks,
-Stolee
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Jeff King wrote (reply to this):
On Mon, Dec 20, 2021 at 08:37:52AM -0500, Derrick Stolee wrote:
> > +test_expect_success TTY '--quiet disables progress' '
> > + test_terminal env GIT_PROGRESS_DELAY=0 \
> > + git -C midx repack -ad --quiet --write-midx 2>stderr &&
> > + test_must_be_empty stderr
> > +'
> > +
> > test_done
>
> Thanks. I added this test.
>
> When first running the test, it failed because I didn't have the
> IO::Pty Perl module installed. I'm not sure why I don't fail with
> other tests that use test_terminal. If someone knows more about
> what is going on, then maybe we need to expand the TTY prereq?
Weird. I uninstalled IO::Pty, and get:
checking prerequisite: TTY
[...prereq code...]
+ perl /home/peff/compile/git/t/test-terminal.perl sh -c test -t 1 && test -t 2
+ command /usr/bin/perl /home/peff/compile/git/t/test-terminal.perl sh -c test -t 1 && test -t 2
Can't locate IO/Pty.pm in @INC (you may need to install the IO::Pty module) (@INC contains: [...etc...]
BEGIN failed--compilation aborted at /home/peff/compile/git/t/test-terminal.perl line 5.
prerequisite TTY not satisfied
ok 25 # skip --quiet disables progress (missing TTY)
What does running with "-x -i" say for the prereq?
-Peff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Derrick Stolee wrote (reply to this):
On 12/20/2021 8:49 AM, Jeff King wrote:
> On Mon, Dec 20, 2021 at 08:37:52AM -0500, Derrick Stolee wrote:
>
>>> +test_expect_success TTY '--quiet disables progress' '
>>> + test_terminal env GIT_PROGRESS_DELAY=0 \
>>> + git -C midx repack -ad --quiet --write-midx 2>stderr &&
>>> + test_must_be_empty stderr
>>> +'
>>> +
>>> test_done
>>
>> Thanks. I added this test.
>>
>> When first running the test, it failed because I didn't have the
>> IO::Pty Perl module installed. I'm not sure why I don't fail with
>> other tests that use test_terminal. If someone knows more about
>> what is going on, then maybe we need to expand the TTY prereq?
>
> Weird. I uninstalled IO::Pty, and get:
>
> checking prerequisite: TTY
>
> [...prereq code...]
>
>
> + perl /home/peff/compile/git/t/test-terminal.perl sh -c test -t 1 && test -t 2
> + command /usr/bin/perl /home/peff/compile/git/t/test-terminal.perl sh -c test -t 1 && test -t 2
> Can't locate IO/Pty.pm in @INC (you may need to install the IO::Pty module) (@INC contains: [...etc...]
> BEGIN failed--compilation aborted at /home/peff/compile/git/t/test-terminal.perl line 5.
> prerequisite TTY not satisfied
> ok 25 # skip --quiet disables progress (missing TTY)
>
> What does running with "-x -i" say for the prereq?
Ok, I got this same error and misread it as an error. The prereq
is working just fine. Thanks for checking.
-Stolee
@@ -612,7 +612,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) | |||
struct tempfile *refs_snapshot = NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):
On Fri, Dec 17 2021, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
>
> While testing some ideas in 'git repack', I ran it with '--quiet' and
> discovered that some progress output was still shown. Specifically, the
> output for writing the multi-pack-index showed the progress.
>
> The 'show_progress' variable in cmd_repack() is initialized with
> isatty(2) and is not modified at all by the '--quiet' flag. The
> '--quiet' flag modifies the po_args.quiet option which is translated
> into a '--quiet' flag for the 'git pack-objects' child process. However,
> 'show_progress' is used to directly send progress information to the
> multi-pack-index writing logic which does not use a child process.
>
> The fix here is to modify 'show_progress' to be false if po_opts.quiet
> is true, and isatty(2) otherwise. This new expectation simplifies a
> later condition that checks both.
>
> This is difficult to test because the isatty(2) already prevents the
> progess indicators from appearing when we redirect stderr to a file.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
> builtin/repack.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/repack.c b/builtin/repack.c
> index 1f128b7c90b..c393a5db774 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -612,7 +612,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
> struct tempfile *refs_snapshot = NULL;
> int i, ext, ret;
> FILE *out;
> - int show_progress = isatty(2);
> + int show_progress;
>
> /* variables to be filled by option parsing */
> int pack_everything = 0;
> @@ -725,6 +725,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>
> prepare_pack_objects(&cmd, &po_args);
>
> + show_progress = !po_args.quiet && isatty(2);
> +
> strvec_push(&cmd.args, "--keep-true-parents");
> if (!pack_kept_objects)
> strvec_push(&cmd.args, "--honor-pack-keep");
> @@ -926,7 +928,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
> }
> strbuf_release(&buf);
> }
> - if (!po_args.quiet && show_progress)
> + if (show_progress)
> opts |= PRUNE_PACKED_VERBOSE;
> prune_packed_objects(opts);
This seems like a good change, but the documentation could really use an
update (also before this change). It just says about -q:
Pass the -q option to git pack-objects. See git-pack-objects(1).
But do various things in "git-repack" itself differently if it's
supplied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Derrick Stolee wrote (reply to this):
On 12/18/2021 4:55 AM, Ævar Arnfjörð Bjarmason wrote:
> This seems like a good change, but the documentation could really use an
> update (also before this change). It just says about -q:
>
> Pass the -q option to git pack-objects. See git-pack-objects(1).
>
> But do various things in "git-repack" itself differently if it's
> supplied.
Good point. I will update the docs.
Thanks,
-Stolee
User |
@@ -693,7 +693,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix) | |||
write_bitmaps = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):
On Fri, Dec 17 2021, Derrick Stolee via GitGitGadget wrote:
> +test_expect_success '--write-midx -b packs non-kept objects' '
> + git init midx-kept &&
> + test_when_finished "rm -fr midx-kept" &&
Nit: Better the other way around, so we'll clean up things if "git init"
dies midway, but unlikely to ever happen here.
> + (
> + cd midx-kept &&
> + test_commit_bulk 100 &&
> + GIT_TRACE2_EVENT="$(pwd)/trace.txt" \
> + git repack --write-midx -a -b &&
> + cat trace.txt | \
> + grep \"event\":\"start\" | \
The "cat" here should go in favor of "grep <pat> trace.txt", and can't
this all be on one grep line with "." between the relevant parts you're
extracting out?
Historically, we needed a single packfile in order to have reachability bitmaps. This introduced logic that when 'git repack' had a '-b' option that we should stop sending the '--honor-pack-keep' option to the 'git pack-objects' child process, ensuring that we create a packfile containing all reachable objects. In the world of multi-pack-index bitmaps, we no longer need to repack all objects into a single pack to have valid bitmaps. Thus, we should continue sending the '--honor-pack-keep' flag to 'git pack-objects'. The fix is very simple: only disable the flag when writing bitmaps but also _not_ writing the multi-pack-index. This opens the door to new repacking strategies that might want to keep some historical set of objects in a stable pack-file while only repacking more recent objects. To test, create a new 'test_subcommand_inexact' helper that is more flexible than 'test_subcommand'. This allows us to look for the --honor-pack-keep flag without over-indexing on the exact set of arguments. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
0836472
to
3e51bc6
Compare
User |
While testing some ideas in 'git repack', I ran it with '--quiet' and discovered that some progress output was still shown. Specifically, the output for writing the multi-pack-index showed the progress. The 'show_progress' variable in cmd_repack() is initialized with isatty(2) and is not modified at all by the '--quiet' flag. The '--quiet' flag modifies the po_args.quiet option which is translated into a '--quiet' flag for the 'git pack-objects' child process. However, 'show_progress' is used to directly send progress information to the multi-pack-index writing logic which does not use a child process. The fix here is to modify 'show_progress' to be false if po_opts.quiet is true, and isatty(2) otherwise. This new expectation simplifies a later condition that checks both. Update the documentation to make it clear that '-q' will disable all progress in addition to ensuring the 'git pack-objects' child process will receive the flag. Use 'test_terminal' to check that this works to get around the isatty(2) check. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
3e51bc6
to
41260bf
Compare
/submit |
Submitted as pull.1098.v2.git.1640011691.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):
|
This branch is now known as |
This patch series was integrated into seen via git@9bda063. |
This patch series was integrated into seen via git@29844f0. |
This patch series was integrated into seen via git@433e30a. |
This patch series was integrated into seen via git@af1cafe. |
This patch series was integrated into seen via git@99a82bc. |
This patch series was integrated into seen via git@353223f. |
There was a status update in the "New Topics" section about the branch Two fixes around "git repack". Will merge to 'next'. source: <pull.1098.v2.git.1640011691.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@355e313. |
This patch series was integrated into seen via git@8b19e60. |
This patch series was integrated into seen via git@a9985ec. |
This patch series was integrated into next via git@8f8474a. |
There was a status update in the "Cooking" section about the branch Two fixes around "git repack". Will merge to 'master'. source: <pull.1098.v2.git.1640011691.gitgitgadget@gmail.com> |
I was experimenting with some ideas in 'git repack' and discovered these two bugs.
The first is a "real" bug in that it repacks much more data than is necessary when repacking with '--write-midx -b' and there exists a .keep pack. The fix is simple, which is to change a condition that was added for the '-b' case before '--write-midx' existed.
The second is a UX bug in that '--quiet' did not disable all progress, at least when stderr was interactive.
Updates in v2
Thanks for the quick review!
Thanks,
-Stolee
cc: me@ttaylorr.com
cc: gitster@pobox.com
cc: Jeff King peff@peff.net
cc: Ævar Arnfjörð Bjarmason avarab@gmail.com
cc: Derrick Stolee stolee@gmail.com