multi-pack-index: fix verify on large repos #166

gitgitgadget · 2019-03-19T19:57:12Z

On the Git mailing list, Jeff Hostetler wrote (reply to this):

On 3/19/2019 10:37 AM, Jeff Hostetler via GitGitGadget wrote: > From: Jeff Hostetler <jeffhost@microsoft.com> > > Fix "git multi-pack-index verify" to handle repos with thousands > of packfiles. > > Midx verify adds the individual "packed_git" structures to the > multi_pack_index.packs array, but it does not add them to the > "repository.objects.packed_git" list. During the verification > code, each packfile is opened and scanned. And "pack_open_fds" > is incremented. If "pack_open_fds" equals the "pack_max_fds" > open_packed_git_1() calls close_one_pack() to LRU-style close > an already open packfile. But because the packfiles were never > added to the "packed_git" list, close_one_pack() does nothing. > If there are very many packfiles, Git runs out of file descriptors > and fails. > > Note that this was observed on Windows when build with GCC and > in a repository with more than (2048-25) packfiles. > > Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> > --- > midx.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/midx.c b/midx.c > index 24141a7c62..b2f33bcd90 100644 > --- a/midx.c > +++ b/midx.c > @@ -975,6 +975,9 @@ int verify_midx_file(const char *object_dir) > for (i = 0; i < m->num_packs; i++) { > if (prepare_midx_pack(m, i)) > midx_report("failed to load pack in position %d", i); > + > + if (m->packs[i]) > + install_packed_git(the_repository, m->packs[i]); > } > > for (i = 0; i < 255; i++) { > I'd like to drop this commit from this series. I talked with Stolee offline about this. It does address the problem in this one instance, but it leads us to think about other places where there may be a similar problem. Also, the sort by packfile in the next commit in this series means we'll only have 1 packfile open at a time and so this commit isn't needed in this particular case. Thanks, Jeff

gitgitgadget · 2019-03-18T15:56:21Z

On the Git mailing list, =?utf-8?B?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason wrote (reply to this):

On Mon, Mar 18 2019, Jeff Hostetler via GitGitGadget wrote: > +static int compare_pair_pos_vs_id(const void *_a, const void *_b) > +{ > + struct pair_pos_vs_id *a = (struct pair_pos_vs_id *)_a; > + struct pair_pos_vs_id *b = (struct pair_pos_vs_id *)_b; > + > + if (a->pack_int_id < b->pack_int_id) > + return -1; > + if (a->pack_int_id > b->pack_int_id) > + return 1; > + > + return 0; > +} Not a suggestion for a change, just a note that this sent me down the rabbit hole of looking at the different idioms we use for QSORT() in different places. Some use this form, some a ternary nest, and some the succinct subtraction idiom of e.g. (in this case): return b->pack_int_id - a->pack_int_id; > + > int verify_midx_file(const char *object_dir) > { > - uint32_t i; > + struct pair_pos_vs_id *pairs = NULL; > + uint32_t i, k; > struct progress *progress; > struct multi_pack_index *m = load_multi_pack_index(object_dir, 1); > verify_midx_error = 0; > @@ -997,15 +1017,36 @@ int verify_midx_file(const char *object_dir) > } > > progress = start_progress(_("Verifying object offsets"), m->num_objects); > + > + /* > + * Create an array mapping each object to its packfile id. Sort it > + * to group the objects by packfile. Use this permutation to visit > + * each of the objects and only require 1 packfile to be open at a > + * time. > + */ > + ALLOC_ARRAY(pairs, m->num_objects); > for (i = 0; i < m->num_objects; i++) { > + pairs[i].pos = i; > + pairs[i].pack_int_id = nth_midxed_pack_int_id(m, i); > + } > + QSORT(pairs, m->num_objects, compare_pair_pos_vs_id); > + > + for (k = 0; k < m->num_objects; k++) { > [...] I have not tested this (or midx in general), but isn't this new QSORT() introducing the same sort of progress stalling that I fixed for commit-graph in 890226ccb57 ("commit-graph write: add itermediate progress", 2019-01-19)? I.e. something you can work around with a "display_progress(progress, 0)" before the QSORT().

On the Git mailing list, Jeff Hostetler wrote (reply to this):

On 3/18/2019 11:53 AM, Ævar Arnfjörð Bjarmason wrote: > > On Mon, Mar 18 2019, Jeff Hostetler via GitGitGadget wrote: > >> +static int compare_pair_pos_vs_id(const void *_a, const void *_b) >> +{ >> + struct pair_pos_vs_id *a = (struct pair_pos_vs_id *)_a; >> + struct pair_pos_vs_id *b = (struct pair_pos_vs_id *)_b; >> + >> + if (a->pack_int_id < b->pack_int_id) >> + return -1; >> + if (a->pack_int_id > b->pack_int_id) >> + return 1; >> + >> + return 0; >> +} > > Not a suggestion for a change, just a note that this sent me down the > rabbit hole of looking at the different idioms we use for QSORT() in > different places. Some use this form, some a ternary nest, and some the > succinct subtraction idiom of e.g. (in this case): > > return b->pack_int_id - a->pack_int_id; Yeah, I'm not sure which way is better or worse here. An earlier draft of this function sorted by packfile id and then by OID (thinking we might benefit from some locality later when we do the verify), hence the independent if statements. But it didn't help, so I removed the other lines. On 43+M objects, your version is a hair faster, so I might as well take it instead. > >> + >> int verify_midx_file(const char *object_dir) >> { >> - uint32_t i; >> + struct pair_pos_vs_id *pairs = NULL; >> + uint32_t i, k; >> struct progress *progress; >> struct multi_pack_index *m = load_multi_pack_index(object_dir, 1); >> verify_midx_error = 0; >> @@ -997,15 +1017,36 @@ int verify_midx_file(const char *object_dir) >> } >> >> progress = start_progress(_("Verifying object offsets"), m->num_objects); >> + >> + /* >> + * Create an array mapping each object to its packfile id. Sort it >> + * to group the objects by packfile. Use this permutation to visit >> + * each of the objects and only require 1 packfile to be open at a >> + * time. >> + */ >> + ALLOC_ARRAY(pairs, m->num_objects); >> for (i = 0; i < m->num_objects; i++) { >> + pairs[i].pos = i; >> + pairs[i].pack_int_id = nth_midxed_pack_int_id(m, i); >> + } >> + QSORT(pairs, m->num_objects, compare_pair_pos_vs_id); >> + >> + for (k = 0; k < m->num_objects; k++) { >> [...] > > I have not tested this (or midx in general), but isn't this new QSORT() > introducing the same sort of progress stalling that I fixed for > commit-graph in 890226ccb57 ("commit-graph write: add itermediate > progress", 2019-01-19)? I.e. something you can work around with a > "display_progress(progress, 0)" before the QSORT(). > I wasn't tracking your commit-graph changes, but yes, I think it is. Tinkering with how to display progress, I found a couple of problems. On my 3599 packfile, 43M object example, QSORT() takes about 5 seconds. But there's about 2 seconds of setup before the sort starts. The final verify loops takes about 17 seconds. Here I put trace2 regions around the main loops and used the GIT_TR2_PERF stream. > | cmd_name | | | | | multi-pack-index (multi-pack-index) > | cmd_mode | | | | | verify > | data | r0 | 0.031295 | 0.031295 | midx | load/num_packs:3599 > | data | r0 | 0.031330 | 0.031330 | midx | load/num_objects:42704807 > | region_enter | r0 | 0.031352 | | midx | label:verify/prepare > | region_leave | r0 | 0.626547 | 0.595195 | midx | label:verify/prepare > | region_enter | r0 | 0.626602 | | midx | label:verify/oid_order > | region_leave | r0 | 1.570195 | 0.943593 | midx | label:verify/oid_order > | region_enter | r0 | 1.570253 | | midx | label:verify/sort_setup > | region_leave | r0 | 1.809723 | 0.239470 | midx | label:verify/sort_setup > | region_enter | r0 | 1.809803 | | midx | label:verify/sort > | region_leave | r0 | 6.950595 | 5.140792 | midx | label:verify/sort > | region_enter | r0 | 6.950651 | | midx | label:verify/offsets > | region_leave | r0 | 24.059736 | 17.109085 | midx | label:verify/offsets > | exit | | 24.101434 | | | code:0 So just adding a delay progress block by itself around the sort doesn't help much. It just sits there for 7 seconds before the actual progress starts. If I add a non-delay progress block around the "verify/prepare", "verify/oid_order" and the "verify/offsets" loops, we get a pretty good experience. There is the dead time while the sort() itself is running, but at least there is isn't a 5+ second frozen at 0% message on screen. I'll re-roll shortly. Thanks, Jeff

On the Git mailing list, =?utf-8?B?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason wrote (reply to this):

On Mon, Mar 18 2019, Jeff Hostetler wrote: > On 3/18/2019 11:53 AM, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote: >> >> On Mon, Mar 18 2019, Jeff Hostetler via GitGitGadget wrote: >> >>> +static int compare_pair_pos_vs_id(const void *_a, const void *_b) >>> +{ >>> + struct pair_pos_vs_id *a =3D (struct pair_pos_vs_id *)_a; >>> + struct pair_pos_vs_id *b =3D (struct pair_pos_vs_id *)_b; >>> + >>> + if (a->pack_int_id < b->pack_int_id) >>> + return -1; >>> + if (a->pack_int_id > b->pack_int_id) >>> + return 1; >>> + >>> + return 0; >>> +} >> >> Not a suggestion for a change, just a note that this sent me down the >> rabbit hole of looking at the different idioms we use for QSORT() in >> different places. Some use this form, some a ternary nest, and some the >> succinct subtraction idiom of e.g. (in this case): >> >> return b->pack_int_id - a->pack_int_id; > > Yeah, I'm not sure which way is better or worse here. > An earlier draft of this function sorted by packfile id > and then by OID (thinking we might benefit from some > locality later when we do the verify), hence the independent > if statements. But it didn't help, so I removed the other > lines. > > On 43+M objects, your version is a hair faster, so I might > as well take it instead. Cool! >> >>> + >>> int verify_midx_file(const char *object_dir) >>> { >>> - uint32_t i; >>> + struct pair_pos_vs_id *pairs =3D NULL; >>> + uint32_t i, k; >>> struct progress *progress; >>> struct multi_pack_index *m =3D load_multi_pack_index(object_dir, 1); >>> verify_midx_error =3D 0; >>> @@ -997,15 +1017,36 @@ int verify_midx_file(const char *object_dir) >>> } >>> >>> progress =3D start_progress(_("Verifying object offsets"), m->num_ob= jects); >>> + >>> + /* >>> + * Create an array mapping each object to its packfile id. Sort it >>> + * to group the objects by packfile. Use this permutation to visit >>> + * each of the objects and only require 1 packfile to be open at a >>> + * time. >>> + */ >>> + ALLOC_ARRAY(pairs, m->num_objects); >>> for (i =3D 0; i < m->num_objects; i++) { >>> + pairs[i].pos =3D i; >>> + pairs[i].pack_int_id =3D nth_midxed_pack_int_id(m, i); >>> + } >>> + QSORT(pairs, m->num_objects, compare_pair_pos_vs_id); >>> + >>> + for (k =3D 0; k < m->num_objects; k++) { >>> [...] >> >> I have not tested this (or midx in general), but isn't this new QSORT() >> introducing the same sort of progress stalling that I fixed for >> commit-graph in 890226ccb57 ("commit-graph write: add itermediate >> progress", 2019-01-19)? I.e. something you can work around with a >> "display_progress(progress, 0)" before the QSORT(). >> > > I wasn't tracking your commit-graph changes, but yes, I think it is. > > Tinkering with how to display progress, I found a couple of problems. > On my 3599 packfile, 43M object example, QSORT() takes about 5 seconds. > But there's about 2 seconds of setup before the sort starts. The final > verify loops takes about 17 seconds. > > Here I put trace2 regions around the main loops and used the > GIT_TR2_PERF stream. > >> | cmd_name | | | | | multi-pack-i= ndex (multi-pack-index) >> | cmd_mode | | | | | verify >> | data | r0 | 0.031295 | 0.031295 | midx | load/num_pac= ks:3599 >> | data | r0 | 0.031330 | 0.031330 | midx | load/num_obj= ects:42704807 >> | region_enter | r0 | 0.031352 | | midx | >> label:verify/prepare | region_leave | r0 | 0.626547 | 0.595195 | >> midx | label:verify/prepare | region_enter | r0 | 0.626602 | >> | midx | label:verify/oid_order | region_leave | r0 | >> 1.570195 | 0.943593 | midx | label:verify/oid_order | >> region_enter | r0 | 1.570253 | | midx | >> label:verify/sort_setup | region_leave | r0 | 1.809723 | 0.239470 >> | midx | label:verify/sort_setup | region_enter | r0 | >> 1.809803 | | midx | label:verify/sort | region_leave >> | r0 | 6.950595 | 5.140792 | midx | label:verify/sort | >> region_enter | r0 | 6.950651 | | midx | >> label:verify/offsets | region_leave | r0 | 24.059736 | 17.109085 | >> midx | label:verify/offsets | exit | | 24.101434 | >> | | code:0 > > So just adding a delay progress block by itself around the sort doesn't > help much. It just sits there for 7 seconds before the actual progress > starts. > > If I add a non-delay progress block around the "verify/prepare", > "verify/oid_order" and the "verify/offsets" loops, we get a pretty good > experience. > > There is the dead time while the sort() itself is running, but at least > there is isn't a 5+ second frozen at 0% message on screen. Yeah, the same with the commit-graph with my hack. I.e. it'll sit there, but at least it sits like this: What I was doing before 100% (X/Y) What I'm about to start doing 0% (0/Z) [hanging] Instead of: What I was doing before 100% (X/Y) [hanging] So that's an improvement, i.e. you know it's started that next phase at least instead of just having a non-descriptive hang. Ideally there would be some way to reach into the QSORT() and display progress there, but that's all sorts of nasty, so as the TODO comment in commit-graph.c notes I punted it.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: >>>> + if (a->pack_int_id < b->pack_int_id) >>>> + return -1; >>>> + if (a->pack_int_id > b->pack_int_id) >>>> + return 1; >>>> + >>>> + return 0; >>>> +} >>> ... >>> succinct subtraction idiom of e.g. (in this case): >>> >>> return b->pack_int_id - a->pack_int_id; >> ... >> On 43+M objects, your version is a hair faster, so I might >> as well take it instead. > > Cool! Yup, following a well-known short idiom lets readers' eyes coast over without forcing them to make sure the two if statements have comparison going in the right direction and return the constant with the right sign, etc. Even if the idiomatic version weren't faster, it is worth using it. Thanks.

On the Git mailing list, Jeff Hostetler wrote (reply to this):

On 3/18/2019 6:02 PM, Ævar Arnfjörð Bjarmason wrote: > > On Mon, Mar 18 2019, Jeff Hostetler wrote: > >> On 3/18/2019 11:53 AM, Ævar Arnfjörð Bjarmason wrote: >>> >>> On Mon, Mar 18 2019, Jeff Hostetler via GitGitGadget wrote: >>> [...] >>>> + QSORT(pairs, m->num_objects, compare_pair_pos_vs_id); >>>> + >>>> + for (k = 0; k < m->num_objects; k++) { >>>> [...] >>> >>> I have not tested this (or midx in general), but isn't this new QSORT() >>> introducing the same sort of progress stalling that I fixed for >>> commit-graph in 890226ccb57 ("commit-graph write: add itermediate >>> progress", 2019-01-19)? I.e. something you can work around with a >>> "display_progress(progress, 0)" before the QSORT(). >>> >> >> I wasn't tracking your commit-graph changes, but yes, I think it is. >> [...] >> >> There is the dead time while the sort() itself is running, but at least >> there is isn't a 5+ second frozen at 0% message on screen. > > Yeah, the same with the commit-graph with my hack. I.e. it'll sit there, > but at least it sits like this: > > What I was doing before 100% (X/Y) > What I'm about to start doing 0% (0/Z) [hanging] > > Instead of: > > What I was doing before 100% (X/Y) > [hanging] > > So that's an improvement, i.e. you know it's started that next phase at > least instead of just having a non-descriptive hang. > > Ideally there would be some way to reach into the QSORT() and display > progress there, but that's all sorts of nasty, so as the TODO comment in > commit-graph.c notes I punted it. Perhaps I'm confused or this is a Windows issue, but when I do: progress = start_delayed_progress("sorting", n); display_progress(progress, 0); QSORT(...); stop_progress(&progress); I never see the 0% message. It always does the hang with the cursor in column 0 on a blank line. If I make this a regular start_progress(), I do see the 0% message for the duration of the qsort hang. I did some similar testing around your QSORT() in commit-graph.c and got the same result. It looks like start_delayed_processing() wants to wait at least 2 seconds before displaying anything and has an interval timer to notify it that another message should be printed, but the display_progress(0) prior to the QSORT() arrives before the 2 seconds are up and so nothing is printed. It's not until we get into the loop below the QSORT that one of the display_progress(i+1) calls could cause a message to appear. Right? Jeff

On the Git mailing list, =?utf-8?B?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason wrote (reply to this):

On Tue, Mar 19 2019, Jeff Hostetler wrote: > On 3/18/2019 6:02 PM, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote: >> >> On Mon, Mar 18 2019, Jeff Hostetler wrote: >> >>> On 3/18/2019 11:53 AM, =C3=86var Arnfj=C3=B6r=C3=B0 Bjarmason wrote: >>>> >>>> On Mon, Mar 18 2019, Jeff Hostetler via GitGitGadget wrote: >>>> > [...] >>>>> + QSORT(pairs, m->num_objects, compare_pair_pos_vs_id); >>>>> + >>>>> + for (k =3D 0; k < m->num_objects; k++) { >>>>> [...] >>>> >>>> I have not tested this (or midx in general), but isn't this new QSORT() >>>> introducing the same sort of progress stalling that I fixed for >>>> commit-graph in 890226ccb57 ("commit-graph write: add itermediate >>>> progress", 2019-01-19)? I.e. something you can work around with a >>>> "display_progress(progress, 0)" before the QSORT(). >>>> >>> >>> I wasn't tracking your commit-graph changes, but yes, I think it is. >>> > [...] >>> >>> There is the dead time while the sort() itself is running, but at least >>> there is isn't a 5+ second frozen at 0% message on screen. >> >> Yeah, the same with the commit-graph with my hack. I.e. it'll sit there, >> but at least it sits like this: >> >> What I was doing before 100% (X/Y) >> What I'm about to start doing 0% (0/Z) [hanging] >> >> Instead of: >> >> What I was doing before 100% (X/Y) >> [hanging] >> >> So that's an improvement, i.e. you know it's started that next phase at >> least instead of just having a non-descriptive hang. >> >> Ideally there would be some way to reach into the QSORT() and display >> progress there, but that's all sorts of nasty, so as the TODO comment in >> commit-graph.c notes I punted it. > > Perhaps I'm confused or this is a Windows issue, but when I do: > > progress =3D start_delayed_progress("sorting", n); > display_progress(progress, 0); > QSORT(...); > stop_progress(&progress); > > I never see the 0% message. It always does the hang with the cursor in > column 0 on a blank line. If I make this a regular start_progress(), > I do see the 0% message for the duration of the qsort hang. > > I did some similar testing around your QSORT() in commit-graph.c > and got the same result. It looks like start_delayed_processing() > wants to wait at least 2 seconds before displaying anything and has > an interval timer to notify it that another message should be printed, > but the display_progress(0) prior to the QSORT() arrives before the 2 > seconds are up and so nothing is printed. It's not until we get into > the loop below the QSORT that one of the display_progress(i+1) calls > could cause a message to appear. Correct, it'll still hang the N seconds that start_delayed_progress() imposes. In the commit-graph code this would sometimes take longer than that, so I'd see the 0% earlier. But maybe in this case even on humongous repositories it's faster than that.

gitgitgadget · 2019-03-19T16:45:03Z

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Tue, Mar 19, 2019 at 10:37 AM Jeff Hostetler via GitGitGadget <gitgitgadget@gmail.com> wrote: > Add new start_sparse_progress() and start_delayed_sparse_progress() > constructors and "sparse" flag to struct progress. > > Teach stop_progress() to force a 100% complete progress message before > printing the final "done" message when "sparse" is set. > > Calling display_progress() for every item in a large set can > be expensive. If callers try to filter this for performance > reasons, such as emitting every k-th item, progress would > not reach 100% unless they made a final call to display_progress() > with the item count before calling stop_progress(). > > Now this is automatic when "sparse" is set. > > Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> > --- > diff --git a/progress.c b/progress.c > +/* > + * Here "sparse" means that the caller might use some sampling criteria to > + * decide when to call display_progress() rather than calling it for every > + * integer value in[0 .. total). In particular, the caller might not call > + * display_progress() for the last value in the range. > + * > + * When "sparse" is set, stop_progress() will automatically force the done > + * message to show 100%. > + */ > +static void finish_if_sparse(struct progress **p_progress) > +{ > + struct progress *progress = *p_progress; > + > + if (progress && > + progress->sparse && > + progress->last_value != progress->total) > + display_progress(progress, progress->total); > } There's no reason for this function to take a 'struct progress **' rather than the simpler 'struct progress *', and doing so just confuses the reader. > void stop_progress(struct progress **p_progress) > { > + finish_if_sparse(p_progress); > + > stop_progress_msg(p_progress, _("done")); > } Rather than adding a new "sparse" mode, I wonder if this entire concept can be boiled down to a single new function: void finish_progress(struct progress **p_progress, const char *msg) { struct progress *progress = *p_progress; if (progress && progress->last_value != progress->total) display_progress(progress, progress->total); if (msg) stop_progress_msg(p_progress, msg); else stop_progress(p_progress); } without having to make any other changes to the implementation or add a new field to the structure. It would mean, though, that the caller would have to remember to invoke finish_progress() rather than stop_progress(). Which leads one to wonder why this functionality is needed at all since it's easy enough for a caller to make the one extra call to simulate this: /* client code */ if (progress) display_progress(progress, progress->total); stop_progress(&progress);

On the Git mailing list, Jeff Hostetler wrote (reply to this):

On 3/19/2019 12:42 PM, Eric Sunshine wrote: > On Tue, Mar 19, 2019 at 10:37 AM Jeff Hostetler via GitGitGadget > <gitgitgadget@gmail.com> wrote: >> Add new start_sparse_progress() and start_delayed_sparse_progress() >> constructors and "sparse" flag to struct progress. >> >> Teach stop_progress() to force a 100% complete progress message before >> printing the final "done" message when "sparse" is set. >> >> Calling display_progress() for every item in a large set can >> be expensive. If callers try to filter this for performance >> reasons, such as emitting every k-th item, progress would >> not reach 100% unless they made a final call to display_progress() >> with the item count before calling stop_progress(). >> >> Now this is automatic when "sparse" is set. >> >> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> >> --- >> diff --git a/progress.c b/progress.c >> +/* >> + * Here "sparse" means that the caller might use some sampling criteria to >> + * decide when to call display_progress() rather than calling it for every >> + * integer value in[0 .. total). In particular, the caller might not call >> + * display_progress() for the last value in the range. >> + * >> + * When "sparse" is set, stop_progress() will automatically force the done >> + * message to show 100%. >> + */ >> +static void finish_if_sparse(struct progress **p_progress) >> +{ >> + struct progress *progress = *p_progress; >> + >> + if (progress && >> + progress->sparse && >> + progress->last_value != progress->total) >> + display_progress(progress, progress->total); >> } > > There's no reason for this function to take a 'struct progress **' > rather than the simpler 'struct progress *', and doing so just > confuses the reader. I was just trying to match the existing API in the stop_progress() and stop_progress_msg() routines. But yes, I can simplify this. > >> void stop_progress(struct progress **p_progress) >> { >> + finish_if_sparse(p_progress); >> + >> stop_progress_msg(p_progress, _("done")); >> } > > Rather than adding a new "sparse" mode, I wonder if this entire > concept can be boiled down to a single new function: > > void finish_progress(struct progress **p_progress, const char *msg) > { > struct progress *progress = *p_progress; > if (progress && progress->last_value != progress->total) > display_progress(progress, progress->total); > if (msg) > stop_progress_msg(p_progress, msg); > else > stop_progress(p_progress); > } > > without having to make any other changes to the implementation or add > a new field to the structure. The existing model has start_progress() and start_delayed_progress(). I was following that model and added the new option at the start. I'm not planning on adding any additional flags, but if we had some we'd want them available on the startup next to this one. > It would mean, though, that the caller > would have to remember to invoke finish_progress() rather than > stop_progress(). Right, I was trying to isolate the change to the setup, so that loop bottoms and any in-loop return points don't all have to worry about whether to call stop_ or finish_. > Which leads one to wonder why this functionality is > needed at all since it's easy enough for a caller to make the one > extra call to simulate this: > > /* client code */ > if (progress) > display_progress(progress, progress->total); > stop_progress(&progress); > "struct progress" is private and I didn't want to publish it just for this. And again as with the finish_ remarks, that leads to more places that would need to be updated or maintained. And you're right, callers could always just call: x = ...whatever... progress = start_progress(..., x); ...loop or whatever... display_progress(progress, x); stop_progress(...); but that puts the burden on the caller to keep a copy of "x" that matches the original value so that we get the 100% message. I was doing that for a while, but after 3 or 4 progress loops, I found myself wanting the progress routines to handle that bookkeeping for me. Jeff

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Tue, Mar 19, 2019 at 2:33 PM Jeff Hostetler <git@jeffhostetler.com> wrote: > On 3/19/2019 12:42 PM, Eric Sunshine wrote: > > Rather than adding a new "sparse" mode, I wonder if this entire > > concept can be boiled down to a single new function: > > > > void finish_progress(struct progress **p_progress, const char *msg) > > { > > [...] > > } > > > > without having to make any other changes to the implementation or add > > a new field to the structure. > > The existing model has start_progress() and start_delayed_progress(). > I was following that model and added the new option at the start. > I'm not planning on adding any additional flags, but if we had some > we'd want them available on the startup next to this one. Perhaps it makes sense to just take a 'flags' argument instead of 'sparse' so we don't have to worry about this going forward. In other words: #define PROGRESS_DELAYED (1 << 0) #define PROGRESS_SPARSE (1 << 1) struct progress *start_progress_x(const char *title, uint64_t total, unsigned flags); which covers "delayed" start and "sparse". ("_x" is a placeholder since I can't think of a good name). > > It would mean, though, that the caller > > would have to remember to invoke finish_progress() rather than > > stop_progress(). > > Right, I was trying to isolate the change to the setup, so that loop > bottoms and any in-loop return points don't all have to worry about > whether to call stop_ or finish_. Yes, I understand the benefit. Anyhow, my comments are akin to bikeshedding, thus not necessarily actionable.

-Original file line number
+Diff line change
@@ Expand Up / @@ -3,6 +3,7 @@ @@
     #include "config.h"
     #include "parse-options.h"
     #include "midx.h"
+    #include "trace2.h"
     static char const * const builtin_multi_pack_index_usage[] = {
     	N_("git multi-pack-index [--object-dir=<dir>] (write|verify)"),
@@ Expand Down Expand Up / @@ -40,6 +41,8 @@ int cmd_multi_pack_index(int argc, const char **argv, @@
     		return 1;
     	}
+    	trace2_cmd_mode(argv[0]);
     	if (!strcmp(argv[0], "write"))
     		return write_midx_file(opts.object_dir);
     	if (!strcmp(argv[0], "verify"))
@@ Expand Down @@

-Original file line number
+Diff line change
@@ Expand Up / @@ -8,6 +8,7 @@ @@
     #include "sha1-lookup.h"
     #include "midx.h"
     #include "progress.h"
+    #include "trace2.h"
     #define MIDX_SIGNATURE 0x4d494458 /* "MIDX" */
     #define MIDX_VERSION 1
@@ Expand Down Expand Up @@
     			      m->pack_names[i]);
     	}
+    	trace2_data_intmax("midx", the_repository, "load/num_packs", m->num_packs);
+    	trace2_data_intmax("midx", the_repository, "load/num_objects", m->num_objects);
     	return m;
     cleanup_fail:
@@ Expand Down Expand Up / @@ -958,8 +962,35 @@ static void midx_report(const char *fmt, ...) @@
     	va_end(ap);
     }
+    struct pair_pos_vs_id
+    {
+    	uint32_t pos;
+    	uint32_t pack_int_id;
+    };
+    static int compare_pair_pos_vs_id(const void *_a, const void *_b)
+    {
+    	struct pair_pos_vs_id *a = (struct pair_pos_vs_id *)_a;
+    	struct pair_pos_vs_id *b = (struct pair_pos_vs_id *)_b;
+    	return b->pack_int_id - a->pack_int_id;
+    }
+    /*
+     * Limit calls to display_progress() for performance reasons.
+     * The interval here was arbitrarily chosen.
+     */
+    #define SPARSE_PROGRESS_INTERVAL (1 << 12)
+    #define midx_display_sparse_progress(progress, n) \
+    	do { \
+    		uint64_t _n = (n); \
+    		if ((_n & (SPARSE_PROGRESS_INTERVAL - 1)) == 0) \
+    			display_progress(progress, _n); \
+    	} while (0)
     int verify_midx_file(const char *object_dir)
     {
+    	struct pair_pos_vs_id *pairs = NULL;
     	uint32_t i;
     	struct progress *progress;
     	struct multi_pack_index *m = load_multi_pack_index(object_dir, 1);
@@ Expand All / @@ -968,10 +999,15 @@ int verify_midx_file(const char *object_dir) @@
     	if (!m)
     		return 0;
+    	progress = start_progress(_("Looking for referenced packfiles"),
+    				  m->num_packs);
     	for (i = 0; i < m->num_packs; i++) {
     		if (prepare_midx_pack(m, i))
     			midx_report("failed to load pack in position %d", i);
+    		display_progress(progress, i + 1);
     	}
+    	stop_progress(&progress);
     	for (i = 0; i < 255; i++) {
     		uint32_t oid_fanout1 = ntohl(m->chunk_oid_fanout[i]);
@@ Expand All / @@ -982,6 +1018,8 @@ int verify_midx_file(const char *object_dir) @@
     				    i, oid_fanout1, oid_fanout2, i + 1);
     	}
+    	progress = start_sparse_progress(_("Verifying OID order in MIDX"),
+    					 m->num_objects - 1);
     	for (i = 0; i < m->num_objects - 1; i++) {
     		struct object_id oid1, oid2;
@@ Expand All / @@ -991,18 +1029,47 @@ int verify_midx_file(const char *object_dir) @@
     		if (oidcmp(&oid1, &oid2) >= 0)
     			midx_report(_("oid lookup out of order: oid[%d] = %s >= %s = oid[%d]"),
     				    i, oid_to_hex(&oid1), oid_to_hex(&oid2), i + 1);
+    		midx_display_sparse_progress(progress, i + 1);
     	}
+    	stop_progress(&progress);
-    	progress = start_progress(_("Verifying object offsets"), m->num_objects);
+    	/*
+    	 * Create an array mapping each object to its packfile id.  Sort it
+    	 * to group the objects by packfile.  Use this permutation to visit
+    	 * each of the objects and only require 1 packfile to be open at a
+    	 * time.
+    	 */
+    	ALLOC_ARRAY(pairs, m->num_objects);
+    	for (i = 0; i < m->num_objects; i++) {
+    		pairs[i].pos = i;
+    		pairs[i].pack_int_id = nth_midxed_pack_int_id(m, i);
+    	}
+    	progress = start_sparse_progress(_("Sorting objects by packfile"),
+    					 m->num_objects);
+    	display_progress(progress, 0); /* TODO: Measure QSORT() progress */
+    	QSORT(pairs, m->num_objects, compare_pair_pos_vs_id);
+    	stop_progress(&progress);
+    	progress = start_sparse_progress(_("Verifying object offsets"), m->num_objects);
     	for (i = 0; i < m->num_objects; i++) {
     		struct object_id oid;
     		struct pack_entry e;
     		off_t m_offset, p_offset;
-    		nth_midxed_object_oid(&oid, m, i);
+    		if (i > 0 && pairs[i-1].pack_int_id != pairs[i].pack_int_id &&
+    		    m->packs[pairs[i-1].pack_int_id])
+    		{
+    			close_pack_fd(m->packs[pairs[i-1].pack_int_id]);
+    			close_pack_index(m->packs[pairs[i-1].pack_int_id]);
+    		}
+    		nth_midxed_object_oid(&oid, m, pairs[i].pos);
     		if (!fill_midx_entry(&oid, &e, m)) {
     			midx_report(_("failed to load pack entry for oid[%d] = %s"),
-    				    i, oid_to_hex(&oid));
+    				    pairs[i].pos, oid_to_hex(&oid));
     			continue;
     		}
@@ Expand All / @@ -1017,11 +1084,13 @@ int verify_midx_file(const char *object_dir) @@
     		if (m_offset != p_offset)
     			midx_report(_("incorrect object offset for oid[%d] = %s: %"PRIx64" != %"PRIx64),
-    				    i, oid_to_hex(&oid), m_offset, p_offset);
+    				    pairs[i].pos, oid_to_hex(&oid), m_offset, p_offset);
-    		display_progress(progress, i + 1);
+    		midx_display_sparse_progress(progress, i + 1);
     	}
     	stop_progress(&progress);
+    	free(pairs);
     	return verify_midx_error;
     }

-Original file line number
+Diff line change
@@ Expand Up / @@ -309,7 +309,7 @@ void close_pack_windows(struct packed_git *p) @@
     	}
     }
-    static int close_pack_fd(struct packed_git *p)
+    int close_pack_fd(struct packed_git *p)
     {
     	if (p->pack_fd < 0)
     		return 0;
@@ Expand Down @@

-Original file line number
+Diff line change
@@ Expand Up / @@ -76,6 +76,8 @@ extern int open_pack_index(struct packed_git *); @@
      */
     extern void close_pack_index(struct packed_git *);
+    int close_pack_fd(struct packed_git *p);
     extern uint32_t get_pack_fanout(struct packed_git *p, uint32_t value);
     extern unsigned char *use_pack(struct packed_git *, struct pack_window **, off_t, unsigned long *);
@@ Expand Down @@

-Original file line number
+Diff line change
@@ Expand Up / @@ -34,6 +34,7 @@ struct progress { @@
     	uint64_t total;
     	unsigned last_percent;
     	unsigned delay;
+    	unsigned sparse;
     	struct throughput *throughput;
     	uint64_t start_ns;
     };
@@ Expand Down Expand Up / @@ -194,7 +195,7 @@ int display_progress(struct progress *progress, uint64_t n) @@
     }
     static struct progress *start_progress_delay(const char *title, uint64_t total,
-    					     unsigned delay)
+    					     unsigned delay, unsigned sparse)
     {
     	struct progress *progress = malloc(sizeof(*progress));
     	if (!progress) {
@@ Expand All @@
     	progress->last_value = -1;
     	progress->last_percent = -1;
     	progress->delay = delay;
+    	progress->sparse = sparse;
     	progress->throughput = NULL;
     	progress->start_ns = getnanotime();
     	set_progress_signal();
@@ Expand All @@
     struct progress *start_delayed_progress(const char *title, uint64_t total)
     {
-    	return start_progress_delay(title, total, 2);
+    	return start_progress_delay(title, total, 2, 0);
     }
     struct progress *start_progress(const char *title, uint64_t total)
     {
-    	return start_progress_delay(title, total, 0);
+    	return start_progress_delay(title, total, 0, 0);
+    }
+    /*
+     * Here "sparse" means that the caller might use some sampling criteria to
+     * decide when to call display_progress() rather than calling it for every
+     * integer value in[0 .. total).  In particular, the caller might not call
+     * display_progress() for the last value in the range.
+     *
+     * When "sparse" is set, stop_progress() will automatically force the done
+     * message to show 100%.
+     */
+    struct progress *start_sparse_progress(const char *title, uint64_t total)
+    {
+    	return start_progress_delay(title, total, 0, 1);
+    }
+    struct progress *start_delayed_sparse_progress(const char *title,
+    					       uint64_t total)
+    {
+    	return start_progress_delay(title, total, 2, 1);
+    }
+    static void finish_if_sparse(struct progress *progress)
+    {
+    	if (progress &&
+    	    progress->sparse &&
+    	    progress->last_value != progress->total)
+    		display_progress(progress, progress->total);
     }
     void stop_progress(struct progress **p_progress)
     {
+    	finish_if_sparse(*p_progress);
     	stop_progress_msg(p_progress, _("done"));
     }
@@ Expand Down @@

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

multi-pack-index: fix verify on large repos #166

Diff view

Diff view

There are no files selected for viewing

gitgitgadget bot Mar 19, 2019

gitgitgadget bot Mar 18, 2019

gitgitgadget bot Mar 18, 2019

gitgitgadget bot Mar 18, 2019

gitgitgadget bot Mar 19, 2019

gitgitgadget bot Mar 19, 2019

gitgitgadget bot Mar 19, 2019

gitgitgadget bot Mar 19, 2019

gitgitgadget bot Mar 19, 2019

gitgitgadget bot Mar 19, 2019

multi-pack-index: fix verify on large repos #166

multi-pack-index: fix verify on large repos #166

Diff view

Diff view

There are no files selected for viewing

gitgitgadget bot Mar 19, 2019

Choose a reason for hiding this comment

gitgitgadget bot Mar 18, 2019

Choose a reason for hiding this comment

gitgitgadget bot Mar 18, 2019

Choose a reason for hiding this comment

gitgitgadget bot Mar 18, 2019

Choose a reason for hiding this comment

gitgitgadget bot Mar 19, 2019

Choose a reason for hiding this comment

gitgitgadget bot Mar 19, 2019

Choose a reason for hiding this comment

gitgitgadget bot Mar 19, 2019

Choose a reason for hiding this comment

gitgitgadget bot Mar 19, 2019

Choose a reason for hiding this comment

gitgitgadget bot Mar 19, 2019

Choose a reason for hiding this comment

gitgitgadget bot Mar 19, 2019

Choose a reason for hiding this comment