Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sparse Index: API protections #906

Conversation

derrickstolee
Copy link

@derrickstolee derrickstolee commented Mar 15, 2021

Here is the second patch series submission coming out of the sparse-index RFC [1].

[1] https://lore.kernel.org/git/pull.847.git.1611596533.gitgitgadget@gmail.com/

This is based on ds/sparse-index.

The point of this series is to insert protections for the consumers of the in-memory index to avoid unintended behavior change when using a sparse index versus a full one.

We mark certain regions of code as needing a full index, so we call ensure_full_index() to expand a sparse index to a full one, if necessary. These protections are inserted file-by-file in every loop over all cache entries. Well, "most" loops, because some are going to be handled in the very next series so I leave them out.

Many callers use index_name_pos() to find a path by name. In these cases, we can check if that position resolves to a sparse directory instance. In those cases, we just expand to a full index and run the search again.

The last few patches deal with the name-hash hashtable for doing O(1) lookups.

These protections don't do much right now, since the previous series created the_repository->settings.command_requires_full_index to guard all index reads and writes to ensure the in-memory copy is full for commands that have not been tested with the sparse index yet.

However, after this series is complete, we now have a straight-forward plan for making commands "sparse aware" one-by-one:

  1. Disable settings.command_requires_full_index to allow an in-memory sparse-index.
  2. Run versions of that command under a debugger, breaking on ensure_full_index().
  3. Examine the call stack to determine the context of that expansion, then implement the proper behavior in those locations.
  4. Add tests to ensure we are checking this logic in the presence of sparse directory entries.

I will admit that mostly it is the writing of the test cases that takes the most time in the conversions I've done so far.

Updates in v3

  • I updated based on Elijah's feedback.
  • One new patch splits out a change that Elijah (rightfully) pointed out did not belong with the patch it was originally in.

I gave it time to see if any other comments came in, but it looks like review stabilized. I probably waited a bit longer than I should have.

Updates in v2

  • Rebased onto v5 of ds/sparse-index
  • Updated the technical doc to describe how these protections are guards to keep behavior consistent between a sparse-index and a full index. Whether or not that behavior is "correct" can be interrogated later.
  • Calls to ensure_full_index() are marked with a TODO comment saying these calls should be audited later (with tests).
  • Fixed an incorrectly squashed commit message.
  • Dropped the diff-lib.c commit because it was erroneously included in v2.
  • Dropped the merge-ort.c commit because of conflicts with work in flight and a quick audit that it is not needed.
  • I reviewed the merge of this topic with mt/add-rm-in-sparse-checkout and found it equivalent to what I would have done.

Thanks,
-Stolee

cc: newren@gmail.com
cc: gitster@pobox.com
cc: Derrick Stolee stolee@gmail.com
cc: Matheus Tavares Bernardino matheus.bernardino@usp.br

@derrickstolee derrickstolee force-pushed the sparse-index/protections branch 5 times, most recently from f5d2940 to 40cfc59 Compare March 16, 2021 19:59
@derrickstolee
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 16, 2021

Submitted as pull.906.git.1615929435.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-906/derrickstolee/sparse-index/protections-v1

To fetch this version to local tag pr-906/derrickstolee/sparse-index/protections-v1:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-906/derrickstolee/sparse-index/protections-v1

@derrickstolee derrickstolee marked this pull request as ready for review March 17, 2021 12:47
@@ -141,6 +141,7 @@ static int renormalize_tracked_files(const struct pathspec *pathspec, int flags)
{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Elijah Newren wrote (reply to this):

On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Before iterating over all cache entries, ensure that a sparse index is
> expanded to a full index to avoid unexpected behavior.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/add.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/builtin/add.c b/builtin/add.c
> index ea762a41e3a2..ab2ef4a63530 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -141,6 +141,7 @@ static int renormalize_tracked_files(const struct pathspec *pathspec, int flags)
>  {
>         int i, retval = 0;
>
> +       ensure_full_index(&the_index);
>         for (i = 0; i < active_nr; i++) {
>                 struct cache_entry *ce = active_cache[i];

I'm not so sure about this one; from git-add(1):

       --renormalize
           Apply the "clean" process freshly to all tracked files to forcibly
           add them again to the index. This is useful after changing
           core.autocrlf configuration or the text attribute in order to
           correct files added with wrong CRLF/LF line endings. This option
           implies -u.

... to "all tracked files".  Should that be interpreted as all files
within the sparsity paths, especially considering that we're trying to
enable folks to work within the set of files of interest to them?  If
it doesn't imply that, wouldn't users want an extra option to be able
to behave that way?  And if we add an option, why are we adding a
special option for people wanting to behave sparsely in a
sparse-index-cone-mode-sparse-checkout rather than for the special
case of folks wanting to behave densely in such a setup?

The code below already has the following check below:

    if (!S_ISREG(ce->ce_mode) && !S_ISLNK(ce->ce_mode))
      continue; /* do not touch non blobs */

suggesting that it'd correctly skip over the sparse directories, so I
think this one just isn't needed.

However, on a tangent, that S_ISLNK looks rather suspicious to me.
Why would we renormalize symlinks?  I mean, sure, symlinks aren't
likely to have CRLF/LF characters in them, but if they did, wouldn't
it be wrong to renormalize them?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Matheus Tavares Bernardino wrote (reply to this):

On Wed, Mar 17, 2021 at 2:36 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > From: Derrick Stolee <dstolee@microsoft.com>
> >
> > Before iterating over all cache entries, ensure that a sparse index is
> > expanded to a full index to avoid unexpected behavior.
> >
> > Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> > ---
> >  builtin/add.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/builtin/add.c b/builtin/add.c
> > index ea762a41e3a2..ab2ef4a63530 100644
> > --- a/builtin/add.c
> > +++ b/builtin/add.c
> > @@ -141,6 +141,7 @@ static int renormalize_tracked_files(const struct pathspec *pathspec, int flags)
> >  {
> >         int i, retval = 0;
> >
> > +       ensure_full_index(&the_index);
> >         for (i = 0; i < active_nr; i++) {
> >                 struct cache_entry *ce = active_cache[i];
>
> I'm not so sure about this one; from git-add(1):
>
>        --renormalize
>            Apply the "clean" process freshly to all tracked files to forcibly
>            add them again to the index. This is useful after changing
>            core.autocrlf configuration or the text attribute in order to
>            correct files added with wrong CRLF/LF line endings. This option
>            implies -u.
>
> ... to "all tracked files".  Should that be interpreted as all files
> within the sparsity paths, especially considering that we're trying to
> enable folks to work within the set of files of interest to them?

I had the same question when working on the add/rm sparse-checkout
series. As seen at [1], --renormalize and --chmod are, currently, the
only options in git-add that do not honor the sparsity patterns and do
update SKIP_WORKTREE paths too.

But is this by design or possibly just an oversight? In my series I
ended up making both options honor the sparsity rules, with a warning
when requested to update any sparse path. (If we are going with this
approach, perhaps should we also amend the docs to remove any
ambiguity as for what "all tracked files" means in this case?)

> The code below already has the following check below:
>
>     if (!S_ISREG(ce->ce_mode) && !S_ISLNK(ce->ce_mode))
>       continue; /* do not touch non blobs */
>
> suggesting that it'd correctly skip over the sparse directories, so I
> think this one just isn't needed.

But if sparse index is not enabled, it does not skip over the sparse
files, right? So isn't the ensure_full_index() call here (and in
chmod_pathspec()) important to be consistent with the case when sparse
index is not in use? Or maybe Stolee could rebase this on top of my
series, where both these places already skip the sparse paths?
(Assuming that's the behavior we are looking for. If not, I can also
fix my series.)

[1]: https://lore.kernel.org/git/d93c3f51465a3e409819db63bd81802e7ef20ea9.1615588108.git.matheus.bernardino@usp.br/

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 3/17/2021 4:35 PM, Matheus Tavares Bernardino wrote:
> On Wed, Mar 17, 2021 at 2:36 PM Elijah Newren <newren@gmail.com> wrote:
>>
>> On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
>> <gitgitgadget@gmail.com> wrote:>> I'm not so sure about this one; from git-add(1):
>>
>>        --renormalize
>>            Apply the "clean" process freshly to all tracked files to forcibly
>>            add them again to the index. This is useful after changing
>>            core.autocrlf configuration or the text attribute in order to
>>            correct files added with wrong CRLF/LF line endings. This option
>>            implies -u.
>>
>> ... to "all tracked files".  Should that be interpreted as all files
>> within the sparsity paths, especially considering that we're trying to
>> enable folks to work within the set of files of interest to them?

I wrote in reply to another similar comment that I'm being overly
cautious in creating these protections. When I can come back later
with careful tests, we can ensure that everything behaves properly.

> I had the same question when working on the add/rm sparse-checkout
> series. As seen at [1], --renormalize and --chmod are, currently, the
> only options in git-add that do not honor the sparsity patterns and do
> update SKIP_WORKTREE paths too.
> 
> But is this by design or possibly just an oversight? In my series I
> ended up making both options honor the sparsity rules, with a warning
> when requested to update any sparse path. (If we are going with this
> approach, perhaps should we also amend the docs to remove any
> ambiguity as for what "all tracked files" means in this case?)

I'd be happy to see a resolution here, and it can happen in parallel
to what I'm doing here.

>> The code below already has the following check below:
>>
>>     if (!S_ISREG(ce->ce_mode) && !S_ISLNK(ce->ce_mode))
>>       continue; /* do not touch non blobs */
>>
>> suggesting that it'd correctly skip over the sparse directories, so I
>> think this one just isn't needed.
> 
> But if sparse index is not enabled, it does not skip over the sparse
> files, right? So isn't the ensure_full_index() call here (and in
> chmod_pathspec()) important to be consistent with the case when sparse
> index is not in use?

The tests I am adding in t1092-sparse-checkout-compatibility.sh are
focused on ensuring the same behavior across sparse-checkouts with
and without the sparse-index, and also a full checkout (when possible).

Since the sparse-index requires cone-mode sparse-checkout (currently),
and the sparse files to be within a directory (or no index change
happens), the tests you created in t3705-add-sparse-checkout.sh don't
seem to apply. I'll need to find some important scenarios to duplicate
in t1092 without the full depth of t3705.

> Or maybe Stolee could rebase this on top of my
> series, where both these places already skip the sparse paths?
> (Assuming that's the behavior we are looking for. If not, I can also> fix my series.)

I think they can proceed in parallel and then continue more carefully
in the future. The series _after_ this one makes 'git status' and
'git add' work with the sparse-index, so at that point we will be
removing these protections at the right time, with tests. In addition,
those tests will ensure that we don't expand the sparse-index unless
absolutely necessary.

If that work collides with your approach, then I'll rebase onto a
merge of that topic and this one.

> [1]: https://lore.kernel.org/git/d93c3f51465a3e409819db63bd81802e7ef20ea9.1615588108.git.matheus.bernardino@usp.br/
 
(I will go take a look over here. I've been ignoring the thread for
too long.)

Thanks,
-Stolee

@@ -119,6 +119,7 @@ static void checkout_all(const char *prefix, int prefix_length)
int i, errs = 0;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Elijah Newren wrote (reply to this):

On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Before we iterate over all cache entries, ensure that the index is not
> sparse. This loop in checkout_all() might be safe to iterate over a
> sparse index, but let's put this protection here until it can be
> carefully tested.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/checkout-index.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
> index 023e49e271c2..ba31a036f193 100644
> --- a/builtin/checkout-index.c
> +++ b/builtin/checkout-index.c
> @@ -119,6 +119,7 @@ static void checkout_all(const char *prefix, int prefix_length)
>         int i, errs = 0;
>         struct cache_entry *last_ce = NULL;
>
> +       ensure_full_index(&the_index);
>         for (i = 0; i < active_nr ; i++) {
>                 struct cache_entry *ce = active_cache[i];
>                 if (ce_stage(ce) != checkout_stage

With the caveat in the commit message, this change looks okay, but
checkout-index may be buggy regardless of the presence of
ensure_full_index().  If ensure_full_index() really is needed here
because it needs to operate on all SKIP_WORKTREE paths and not just
leading directories, that's because it's writing all those
SKIP_WORKTREE entries to the working tree.  When it writes them to the
working tree, is it clearing the SKIP_WORKTREE bit?  If not, we're in
a bit of a pickle...

Might be nice to add a
/* TODO: audit if this is needed; if it is, we may have other bugs... */
or something like that.  But then again, perhaps you're considering
all uses of ensure_full_index() to be need-to-be-reaudited codepaths?
If so, and we determine we really do need one and want to keep it
indefinitely, will we mark those with a comment about why it's
considered correct?

I just want a way to know what still needs to be audited and what
doesn't without doing a lot of history spelunking...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 3/17/2021 1:50 PM, Elijah Newren wrote:
> On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> With the caveat in the commit message, this change looks okay, but
> checkout-index may be buggy regardless of the presence of
> ensure_full_index().  If ensure_full_index() really is needed here
> because it needs to operate on all SKIP_WORKTREE paths and not just
> leading directories, that's because it's writing all those
> SKIP_WORKTREE entries to the working tree.  When it writes them to the
> working tree, is it clearing the SKIP_WORKTREE bit?  If not, we're in
> a bit of a pickle...

Perhaps I'm unclear in my intentions with this series: _every_
insertion of ensure_full_index() is intended to be audited with
tests in the future. Some might need behavior change, and others
will not. In this series, I'm just putting in the protections so
we don't accidentally trigger unexpected behavior.

Since tests take time to write and review, I was hoping that these
insertions were minimal enough to get us to a safe place where we
can remove the guards carefully.

So with that in mind...

> Might be nice to add a
> /* TODO: audit if this is needed; if it is, we may have other bugs... */
> or something like that.  But then again, perhaps you're considering
> all uses of ensure_full_index() to be need-to-be-reaudited codepaths?
> If so, and we determine we really do need one and want to keep it
> indefinitely, will we mark those with a comment about why it's
> considered correct?
> 
> I just want a way to know what still needs to be audited and what
> doesn't without doing a lot of history spelunking...

...every insertion "needs to be audited" in the future. That's a
big part of the next "phases" in the implementation plan.

As you suggest, it might be a good idea to add a comment to every
insertion, to mark it as un-audited, such as:

	/* TODO: test if ensure_full_index() is necessary */

We can come back later to delete the comment if it truly is
necessary (and add tests to guarantee correct behavior). We can
also remove the comment _and_ the call by modifying the loop
behavior to do the right thing in some cases.

Thanks,
-Stolee

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Elijah Newren wrote (reply to this):

On Wed, Mar 17, 2021 at 1:05 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 3/17/2021 1:50 PM, Elijah Newren wrote:
> > On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> > With the caveat in the commit message, this change looks okay, but
> > checkout-index may be buggy regardless of the presence of
> > ensure_full_index().  If ensure_full_index() really is needed here
> > because it needs to operate on all SKIP_WORKTREE paths and not just
> > leading directories, that's because it's writing all those
> > SKIP_WORKTREE entries to the working tree.  When it writes them to the
> > working tree, is it clearing the SKIP_WORKTREE bit?  If not, we're in
> > a bit of a pickle...
>
> Perhaps I'm unclear in my intentions with this series: _every_
> insertion of ensure_full_index() is intended to be audited with
> tests in the future. Some might need behavior change, and others
> will not. In this series, I'm just putting in the protections so
> we don't accidentally trigger unexpected behavior.

I think this may be part of my qualms -- what do you mean by not
accidentally triggering unexpected behavior?  In particular, does your
statement imply that whatever behavior you get after putting in
ensure_full_index() is "expected"?  I think I'm reading that
implication into it, and objecting that the behavior with the
ensure_full_index() still isn't expected.  You've only removed a
certain class of unexpected behavior, namely code that wasn't written
to expect tree entries that suddenly gets them.  You haven't handled
the class of "user wants to work with a subset of files, why are all
these unrelated files being munged/updated/computed/shown/etc."
unexpected behavior.

I'm worrying that expectations are being set up such that working with
just a small section of the code will be unusably hard.  There may be
several commands/flags where it could make sense to operate on either
(a) all files in the repo or (b) just on files within your sparse
paths.  If, though, folks interpret operate-on-all-files as the
"normal" mode (and history suggests they will), then people start
adding all kinds of --no-do-this-sparsely flags to each command, and
then users who want sparse operation have to remember to type such a
flag with each and every command they ever run -- despite having taken
at least three steps already to get a sparse-index.

I believe the extended discussions (for _months_!) on just grep & rm,
plus watching a --sparse patch being floated just in the last day for
ls-files suggest to me that this is a _very_ likely outcome and I'm
worried about it.

> Since tests take time to write and review, I was hoping that these
> insertions were minimal enough to get us to a safe place where we
> can remove the guards carefully.
>
> So with that in mind...
>
> > Might be nice to add a
> > /* TODO: audit if this is needed; if it is, we may have other bugs... */
> > or something like that.  But then again, perhaps you're considering
> > all uses of ensure_full_index() to be need-to-be-reaudited codepaths?
> > If so, and we determine we really do need one and want to keep it
> > indefinitely, will we mark those with a comment about why it's
> > considered correct?
> >
> > I just want a way to know what still needs to be audited and what
> > doesn't without doing a lot of history spelunking...
>
> ...every insertion "needs to be audited" in the future. That's a
> big part of the next "phases" in the implementation plan.
>
> As you suggest, it might be a good idea to add a comment to every
> insertion, to mark it as un-audited, such as:
>
>         /* TODO: test if ensure_full_index() is necessary */
>
> We can come back later to delete the comment if it truly is
> necessary (and add tests to guarantee correct behavior). We can
> also remove the comment _and_ the call by modifying the loop
> behavior to do the right thing in some cases.

If it's "needs to be audited for both performance reasons (can we
operate on fewer entries as an invisible doesn't-change-results
optimization) and correctness reasons (should we operate on fewer
entries and given a modified result within a sparse-index because
users would expect that, but maybe provide a special flag for the
users who want to operate on all files in the repo)" and there's also
an agreement that either audited or unaudited ones will be marked (or
both), then great, I'm happy.  If not, can we discuss which part of my
performance/correctness/marking we aren't in agreement on?


Thanks,
Elijah

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 3/17/2021 5:10 PM, Elijah Newren wrote:
> On Wed, Mar 17, 2021 at 1:05 PM Derrick Stolee <stolee@gmail.com> wrote:
>>
>> On 3/17/2021 1:50 PM, Elijah Newren wrote:
>>> On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
>>> <gitgitgadget@gmail.com> wrote:
>>> With the caveat in the commit message, this change looks okay, but
>>> checkout-index may be buggy regardless of the presence of
>>> ensure_full_index().  If ensure_full_index() really is needed here
>>> because it needs to operate on all SKIP_WORKTREE paths and not just
>>> leading directories, that's because it's writing all those
>>> SKIP_WORKTREE entries to the working tree.  When it writes them to the
>>> working tree, is it clearing the SKIP_WORKTREE bit?  If not, we're in
>>> a bit of a pickle...
>>
>> Perhaps I'm unclear in my intentions with this series: _every_
>> insertion of ensure_full_index() is intended to be audited with
>> tests in the future. Some might need behavior change, and others
>> will not. In this series, I'm just putting in the protections so
>> we don't accidentally trigger unexpected behavior.
> 
> I think this may be part of my qualms -- what do you mean by not
> accidentally triggering unexpected behavior?  In particular, does your
> statement imply that whatever behavior you get after putting in
> ensure_full_index() is "expected"?  I think I'm reading that
> implication into it, and objecting that the behavior with the
> ensure_full_index() still isn't expected.  You've only removed a
> certain class of unexpected behavior, namely code that wasn't written
> to expect tree entries that suddenly gets them.  You haven't handled
> the class of "user wants to work with a subset of files, why are all
> these unrelated files being munged/updated/computed/shown/etc."
> unexpected behavior.

My intention is to ensure that (at this moment) choosing to use
the on-disk sparse-index format does not alter Git's end-to-end
behavior.

I want to avoid as much as possible a state where enabling the
sparse-index can start changing how Git commands behave, perhaps
in destructive ways.

By adding these checks, we ensure the in-memory data structure
matches whatever a full index would have created, and then the
behavior matches what Git would do there. It might not be the
"correct" behavior, but it is _consistent_.

> I'm worrying that expectations are being set up such that working with
> just a small section of the code will be unusably hard.  There may be
> several commands/flags where it could make sense to operate on either
> (a) all files in the repo or (b) just on files within your sparse
> paths.  If, though, folks interpret operate-on-all-files as the
> "normal" mode (and history suggests they will), then people start
> adding all kinds of --no-do-this-sparsely flags to each command, and
> then users who want sparse operation have to remember to type such a
> flag with each and every command they ever run -- despite having taken
> at least three steps already to get a sparse-index.
> 
> I believe the extended discussions (for _months_!) on just grep & rm,
> plus watching a --sparse patch being floated just in the last day for
> ls-files suggest to me that this is a _very_ likely outcome and I'm
> worried about it.

It's these behavior changes that I would like to delay as much as
possible and focus on the format and making commands fast that don't
need a change in behavior.

(Yes, there will be exceptions, like when "git add" specifically
adds a file that is in a directory that should be out of the cone,
but the user added it anyway. Atypical behavior like that can be
slow for now.)

>> Since tests take time to write and review, I was hoping that these
>> insertions were minimal enough to get us to a safe place where we
>> can remove the guards carefully.
>>
>> So with that in mind...
>>
>>> Might be nice to add a
>>> /* TODO: audit if this is needed; if it is, we may have other bugs... */
>>> or something like that.  But then again, perhaps you're considering
>>> all uses of ensure_full_index() to be need-to-be-reaudited codepaths?
>>> If so, and we determine we really do need one and want to keep it
>>> indefinitely, will we mark those with a comment about why it's
>>> considered correct?
>>>
>>> I just want a way to know what still needs to be audited and what
>>> doesn't without doing a lot of history spelunking...
>>
>> ...every insertion "needs to be audited" in the future. That's a
>> big part of the next "phases" in the implementation plan.
>>
>> As you suggest, it might be a good idea to add a comment to every
>> insertion, to mark it as un-audited, such as:
>>
>>         /* TODO: test if ensure_full_index() is necessary */
>>
>> We can come back later to delete the comment if it truly is
>> necessary (and add tests to guarantee correct behavior). We can
>> also remove the comment _and_ the call by modifying the loop
>> behavior to do the right thing in some cases.
> 
> If it's "needs to be audited for both performance reasons (can we
> operate on fewer entries as an invisible doesn't-change-results
> optimization) and correctness reasons (should we operate on fewer
> entries and given a modified result within a sparse-index because
> users would expect that, but maybe provide a special flag for the
> users who want to operate on all files in the repo)" and there's also
> an agreement that either audited or unaudited ones will be marked (or
> both), then great, I'm happy.  If not, can we discuss which part of my
> performance/correctness/marking we aren't in agreement on?

I will mark all of the ones I'm inserting. My hope is to eventually
remove it entirely except for when disabling the sparse-index. That
is likely too far out to really hope for, but it is the direction I
am trying to go.

As I indicate that we should carefully test each of these instances
where ensure_full_index() _might_ be necessary before removing them,
it is even more important to test the scenarios where the behavior
changes from a full index with sparse-checkout. Preferably, we just
change the behavior under sparse-checkout and then the sparse-index
can match that (see "test_sparse_match" in t1092).

Thanks,
-Stolee

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Elijah Newren wrote (reply to this):

On Wed, Mar 17, 2021 at 2:33 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 3/17/2021 5:10 PM, Elijah Newren wrote:
> > On Wed, Mar 17, 2021 at 1:05 PM Derrick Stolee <stolee@gmail.com> wrote:
> >>
> >> On 3/17/2021 1:50 PM, Elijah Newren wrote:
> >>> On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
> >>> <gitgitgadget@gmail.com> wrote:
> >>> With the caveat in the commit message, this change looks okay, but
> >>> checkout-index may be buggy regardless of the presence of
> >>> ensure_full_index().  If ensure_full_index() really is needed here
> >>> because it needs to operate on all SKIP_WORKTREE paths and not just
> >>> leading directories, that's because it's writing all those
> >>> SKIP_WORKTREE entries to the working tree.  When it writes them to the
> >>> working tree, is it clearing the SKIP_WORKTREE bit?  If not, we're in
> >>> a bit of a pickle...
> >>
> >> Perhaps I'm unclear in my intentions with this series: _every_
> >> insertion of ensure_full_index() is intended to be audited with
> >> tests in the future. Some might need behavior change, and others
> >> will not. In this series, I'm just putting in the protections so
> >> we don't accidentally trigger unexpected behavior.
> >
> > I think this may be part of my qualms -- what do you mean by not
> > accidentally triggering unexpected behavior?  In particular, does your
> > statement imply that whatever behavior you get after putting in
> > ensure_full_index() is "expected"?  I think I'm reading that
> > implication into it, and objecting that the behavior with the
> > ensure_full_index() still isn't expected.  You've only removed a
> > certain class of unexpected behavior, namely code that wasn't written
> > to expect tree entries that suddenly gets them.  You haven't handled
> > the class of "user wants to work with a subset of files, why are all
> > these unrelated files being munged/updated/computed/shown/etc."
> > unexpected behavior.
>
> My intention is to ensure that (at this moment) choosing to use
> the on-disk sparse-index format does not alter Git's end-to-end
> behavior.
>
> I want to avoid as much as possible a state where enabling the
> sparse-index can start changing how Git commands behave, perhaps
> in destructive ways.
>
> By adding these checks, we ensure the in-memory data structure
> matches whatever a full index would have created, and then the
> behavior matches what Git would do there. It might not be the
> "correct" behavior, but it is _consistent_.

That sounds good.  Could this be included in
Documentation/technical/sparse-index.txt?  There's only an oblique
reference to it when talking about grep and rm, which incidentally
were already updated by Matheus.  Perhaps also a reference to the
warning in Documentation/git-sparse-checkout.txt would be worthwhile.

> > I'm worrying that expectations are being set up such that working with
> > just a small section of the code will be unusably hard.  There may be
> > several commands/flags where it could make sense to operate on either
> > (a) all files in the repo or (b) just on files within your sparse
> > paths.  If, though, folks interpret operate-on-all-files as the
> > "normal" mode (and history suggests they will), then people start
> > adding all kinds of --no-do-this-sparsely flags to each command, and
> > then users who want sparse operation have to remember to type such a
> > flag with each and every command they ever run -- despite having taken
> > at least three steps already to get a sparse-index.
> >
> > I believe the extended discussions (for _months_!) on just grep & rm,
> > plus watching a --sparse patch being floated just in the last day for
> > ls-files suggest to me that this is a _very_ likely outcome and I'm
> > worried about it.
>
> It's these behavior changes that I would like to delay as much as
> possible and focus on the format and making commands fast that don't
> need a change in behavior.

Delaying sounds great, just so long as that delay doesn't also cement
the behavior and confuse consistency for correctness.

I still think we don't have correct behavior for sparse-checkouts in
many cases (I mean, when "git reset --hard" throws errors about not
removing files and then removes them, we've obviously got some
problems), but we had a decade long cementing of sorts with
SKIP_WORKTREE and now a year-or-two long cementing since
sparse-checkout was introduced and we never went through and cleaned
up the commands. We should at some point, especially since we put the
huge scary warning in Documenation/git-sparse-checkout.txt expressly
for this purpose.

(I would have started this sooner, but trying to feed merge-ort and
keep up with patch review already keeps me at less time on non-git
projects than I think is expected for me.  Once merge-ort is done...)

> (Yes, there will be exceptions, like when "git add" specifically
> adds a file that is in a directory that should be out of the cone,
> but the user added it anyway. Atypical behavior like that can be
> slow for now.)

I agree that there will be exceptions where we can't make the behavior
be fast, but I disagree with that specific example.  "git add" should
just give a warning message and not add any file outside the cone, for
both sparse-index and sparse-checkout.  That can be done quickly.

I know you want to delay the discussion of behavior fixes for specific
commands somewhat, but this particular discussion has already been
ongoing for about 4-12 months now (started at [1] and [2]) and it's
reached a point where I've put my Reviewed-by on it; see [3] for the
git-add piece specifically.

[1] https://lore.kernel.org/git/9f2135f90ffea7f4ccb226f506bf554deab324cc.1605205427.git.matheus.bernardino@usp.br/
[2] https://lore.kernel.org/git/0b9b4c4b414a571877163667694afa3053bf8890.1585027716.git.matheus.bernardino@usp.br/
[3] https://lore.kernel.org/git/66d5c71182274c78e1fcfe84e77deb17e4f0d7e6.1615588109.git.matheus.bernardino@usp.br/

> >> Since tests take time to write and review, I was hoping that these
> >> insertions were minimal enough to get us to a safe place where we
> >> can remove the guards carefully.
> >>
> >> So with that in mind...
> >>
> >>> Might be nice to add a
> >>> /* TODO: audit if this is needed; if it is, we may have other bugs... */
> >>> or something like that.  But then again, perhaps you're considering
> >>> all uses of ensure_full_index() to be need-to-be-reaudited codepaths?
> >>> If so, and we determine we really do need one and want to keep it
> >>> indefinitely, will we mark those with a comment about why it's
> >>> considered correct?
> >>>
> >>> I just want a way to know what still needs to be audited and what
> >>> doesn't without doing a lot of history spelunking...
> >>
> >> ...every insertion "needs to be audited" in the future. That's a
> >> big part of the next "phases" in the implementation plan.
> >>
> >> As you suggest, it might be a good idea to add a comment to every
> >> insertion, to mark it as un-audited, such as:
> >>
> >>         /* TODO: test if ensure_full_index() is necessary */
> >>
> >> We can come back later to delete the comment if it truly is
> >> necessary (and add tests to guarantee correct behavior). We can
> >> also remove the comment _and_ the call by modifying the loop
> >> behavior to do the right thing in some cases.
> >
> > If it's "needs to be audited for both performance reasons (can we
> > operate on fewer entries as an invisible doesn't-change-results
> > optimization) and correctness reasons (should we operate on fewer
> > entries and given a modified result within a sparse-index because
> > users would expect that, but maybe provide a special flag for the
> > users who want to operate on all files in the repo)" and there's also
> > an agreement that either audited or unaudited ones will be marked (or
> > both), then great, I'm happy.  If not, can we discuss which part of my
> > performance/correctness/marking we aren't in agreement on?
>
> I will mark all of the ones I'm inserting. My hope is to eventually
> remove it entirely except for when disabling the sparse-index. That
> is likely too far out to really hope for, but it is the direction I
> am trying to go.
>
> As I indicate that we should carefully test each of these instances
> where ensure_full_index() _might_ be necessary before removing them,
> it is even more important to test the scenarios where the behavior
> changes from a full index with sparse-checkout. Preferably, we just
> change the behavior under sparse-checkout and then the sparse-index
> can match that (see "test_sparse_match" in t1092).

Makes sense.  I agree that it'd be nice to have the two generally
match, though I think we should be open to there being special cases
that differ.  The only one I can think of right now is `git ls-files`
(list the entries in the index) -- since there are tree entries in a
sparse-index, ls-files would naturally show the tree entries.
However, we can discuss that and any other cases -- if there are any
-- later.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 3/17/2021 6:36 PM, Elijah Newren wrote:
> On Wed, Mar 17, 2021 at 2:33 PM Derrick Stolee <stolee@gmail.com> wrote:
>>
>> On 3/17/2021 5:10 PM, Elijah Newren wrote:
>>> On Wed, Mar 17, 2021 at 1:05 PM Derrick Stolee <stolee@gmail.com> wrote:
>>>>
>>>> On 3/17/2021 1:50 PM, Elijah Newren wrote:
>>>>> On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
>>>>> <gitgitgadget@gmail.com> wrote:
>>>>> With the caveat in the commit message, this change looks okay, but
>>>>> checkout-index may be buggy regardless of the presence of
>>>>> ensure_full_index().  If ensure_full_index() really is needed here
>>>>> because it needs to operate on all SKIP_WORKTREE paths and not just
>>>>> leading directories, that's because it's writing all those
>>>>> SKIP_WORKTREE entries to the working tree.  When it writes them to the
>>>>> working tree, is it clearing the SKIP_WORKTREE bit?  If not, we're in
>>>>> a bit of a pickle...
>>>>
>>>> Perhaps I'm unclear in my intentions with this series: _every_
>>>> insertion of ensure_full_index() is intended to be audited with
>>>> tests in the future. Some might need behavior change, and others
>>>> will not. In this series, I'm just putting in the protections so
>>>> we don't accidentally trigger unexpected behavior.
>>>
>>> I think this may be part of my qualms -- what do you mean by not
>>> accidentally triggering unexpected behavior?  In particular, does your
>>> statement imply that whatever behavior you get after putting in
>>> ensure_full_index() is "expected"?  I think I'm reading that
>>> implication into it, and objecting that the behavior with the
>>> ensure_full_index() still isn't expected.  You've only removed a
>>> certain class of unexpected behavior, namely code that wasn't written
>>> to expect tree entries that suddenly gets them.  You haven't handled
>>> the class of "user wants to work with a subset of files, why are all
>>> these unrelated files being munged/updated/computed/shown/etc."
>>> unexpected behavior.
>>
>> My intention is to ensure that (at this moment) choosing to use
>> the on-disk sparse-index format does not alter Git's end-to-end
>> behavior.
>>
>> I want to avoid as much as possible a state where enabling the
>> sparse-index can start changing how Git commands behave, perhaps
>> in destructive ways.
>>
>> By adding these checks, we ensure the in-memory data structure
>> matches whatever a full index would have created, and then the
>> behavior matches what Git would do there. It might not be the
>> "correct" behavior, but it is _consistent_.
> 
> That sounds good.  Could this be included in
> Documentation/technical/sparse-index.txt?  There's only an oblique
> reference to it when talking about grep and rm, which incidentally
> were already updated by Matheus.  Perhaps also a reference to the
> warning in Documentation/git-sparse-checkout.txt would be worthwhile.

As I was thinking about how to make this more clear, I realized
that an update to that doc early in this series would be wise.
Thanks.

>>> I'm worrying that expectations are being set up such that working with
>>> just a small section of the code will be unusably hard.  There may be
>>> several commands/flags where it could make sense to operate on either
>>> (a) all files in the repo or (b) just on files within your sparse
>>> paths.  If, though, folks interpret operate-on-all-files as the
>>> "normal" mode (and history suggests they will), then people start
>>> adding all kinds of --no-do-this-sparsely flags to each command, and
>>> then users who want sparse operation have to remember to type such a
>>> flag with each and every command they ever run -- despite having taken
>>> at least three steps already to get a sparse-index.
>>>
>>> I believe the extended discussions (for _months_!) on just grep & rm,
>>> plus watching a --sparse patch being floated just in the last day for
>>> ls-files suggest to me that this is a _very_ likely outcome and I'm
>>> worried about it.
>>
>> It's these behavior changes that I would like to delay as much as
>> possible and focus on the format and making commands fast that don't
>> need a change in behavior.
> 
> Delaying sounds great, just so long as that delay doesn't also cement
> the behavior and confuse consistency for correctness.
> 
> I still think we don't have correct behavior for sparse-checkouts in
> many cases (I mean, when "git reset --hard" throws errors about not
> removing files and then removes them, we've obviously got some
> problems), but we had a decade long cementing of sorts with
> SKIP_WORKTREE and now a year-or-two long cementing since
> sparse-checkout was introduced and we never went through and cleaned
> up the commands. We should at some point, especially since we put the
> huge scary warning in Documenation/git-sparse-checkout.txt expressly
> for this purpose.
> 
> (I would have started this sooner, but trying to feed merge-ort and
> keep up with patch review already keeps me at less time on non-git
> projects than I think is expected for me.  Once merge-ort is done...)
> 
>> (Yes, there will be exceptions, like when "git add" specifically
>> adds a file that is in a directory that should be out of the cone,
>> but the user added it anyway. Atypical behavior like that can be
>> slow for now.)
> 
> I agree that there will be exceptions where we can't make the behavior
> be fast, but I disagree with that specific example.  "git add" should
> just give a warning message and not add any file outside the cone, for
> both sparse-index and sparse-checkout.  That can be done quickly.

That would be great! I would love behavior changes that make the
performance work I'm interested in easier.

> I know you want to delay the discussion of behavior fixes for specific
> commands somewhat, but this particular discussion has already been
> ongoing for about 4-12 months now (started at [1] and [2]) and it's
> reached a point where I've put my Reviewed-by on it; see [3] for the
> git-add piece specifically.
> 
> [1] https://lore.kernel.org/git/9f2135f90ffea7f4ccb226f506bf554deab324cc.1605205427.git.matheus.bernardino@usp.br/
> [2] https://lore.kernel.org/git/0b9b4c4b414a571877163667694afa3053bf8890.1585027716.git.matheus.bernardino@usp.br/
> [3] https://lore.kernel.org/git/66d5c71182274c78e1fcfe84e77deb17e4f0d7e6.1615588109.git.matheus.bernardino@usp.br/

I'm in full support of these ideas to change the behavior, when
possible. I would love to see that those changes make it really
easy to integrate the sparse-index into the commands.

>>>> Since tests take time to write and review, I was hoping that these
>>>> insertions were minimal enough to get us to a safe place where we
>>>> can remove the guards carefully.
>>>>
>>>> So with that in mind...
>>>>
>>>>> Might be nice to add a
>>>>> /* TODO: audit if this is needed; if it is, we may have other bugs... */
>>>>> or something like that.  But then again, perhaps you're considering
>>>>> all uses of ensure_full_index() to be need-to-be-reaudited codepaths?
>>>>> If so, and we determine we really do need one and want to keep it
>>>>> indefinitely, will we mark those with a comment about why it's
>>>>> considered correct?
>>>>>
>>>>> I just want a way to know what still needs to be audited and what
>>>>> doesn't without doing a lot of history spelunking...
>>>>
>>>> ...every insertion "needs to be audited" in the future. That's a
>>>> big part of the next "phases" in the implementation plan.
>>>>
>>>> As you suggest, it might be a good idea to add a comment to every
>>>> insertion, to mark it as un-audited, such as:
>>>>
>>>>         /* TODO: test if ensure_full_index() is necessary */
>>>>
>>>> We can come back later to delete the comment if it truly is
>>>> necessary (and add tests to guarantee correct behavior). We can
>>>> also remove the comment _and_ the call by modifying the loop
>>>> behavior to do the right thing in some cases.
>>>
>>> If it's "needs to be audited for both performance reasons (can we
>>> operate on fewer entries as an invisible doesn't-change-results
>>> optimization) and correctness reasons (should we operate on fewer
>>> entries and given a modified result within a sparse-index because
>>> users would expect that, but maybe provide a special flag for the
>>> users who want to operate on all files in the repo)" and there's also
>>> an agreement that either audited or unaudited ones will be marked (or
>>> both), then great, I'm happy.  If not, can we discuss which part of my
>>> performance/correctness/marking we aren't in agreement on?
>>
>> I will mark all of the ones I'm inserting. My hope is to eventually
>> remove it entirely except for when disabling the sparse-index. That
>> is likely too far out to really hope for, but it is the direction I
>> am trying to go.
>>
>> As I indicate that we should carefully test each of these instances
>> where ensure_full_index() _might_ be necessary before removing them,
>> it is even more important to test the scenarios where the behavior
>> changes from a full index with sparse-checkout. Preferably, we just
>> change the behavior under sparse-checkout and then the sparse-index
>> can match that (see "test_sparse_match" in t1092).
> 
> Makes sense.  I agree that it'd be nice to have the two generally
> match, though I think we should be open to there being special cases
> that differ.  The only one I can think of right now is `git ls-files`
> (list the entries in the index) -- since there are tree entries in a
> sparse-index, ls-files would naturally show the tree entries.
> However, we can discuss that and any other cases -- if there are any
> -- later.

Here, I at least want to make it very clear that the change is
happening and include updates to the documentation. This is a
case where it is unclear what the "promise" is to _external_
tools that depend on ls-files (somehow). What is their intended
use of ls-files, and how do those expectations change when in a
sparse-checkout?

It's questions like this where I at least want to be making a
clear change that motivates why a change in behavior is merited,
then test that behavior to demonstrate that we expect that to be
the same in perpetuity.

Thanks,
-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 17, 2021

On the Git mailing list, Elijah Newren wrote (reply to this):

On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> Here is the second patch series submission coming out of the sparse-index
> RFC [1].
>
> [1]
> https://lore.kernel.org/git/pull.847.git.1611596533.gitgitgadget@gmail.com/
>
> This is based on v3 of the format series [2].
>
> [2]
> https://lore.kernel.org/git/pull.883.v3.git.1615912983.gitgitgadget@gmail.com/
>
> The point of this series is to insert protections for the consumers of the
> in-memory index. We mark certain regions of code as needing a full index, so
> we call ensure_full_index() to expand a sparse index to a full one, if
> necessary. These protections are inserted file-by-file in every loop over
> all cache entries. Well, "most" loops, because some are going to be handled
> in the very next series so I leave them out.
>
> Many callers use index_name_pos() to find a path by name. In these cases, we
> can check if that position resolves to a sparse directory instance. In those
> cases, we just expand to a full index and run the search again.
>
> The last few patches deal with the name-hash hashtable for doing O(1)
> lookups.
>
> These protections don't do much right now, since the previous series created
> the_repository->settings.command_requires_full_index to guard all index
> reads and writes to ensure the in-memory copy is full for commands that have
> not been tested with the sparse index yet.
>
> However, after this series is complete, we now have a straight-forward plan
> for making commands "sparse aware" one-by-one:
>
>  1. Disable settings.command_requires_full_index to allow an in-memory
>     sparse-index.
>  2. Run versions of that command under a debugger, breaking on
>     ensure_full_index().
>  3. Examine the call stack to determine the context of that expansion, then
>     implement the proper behavior in those locations.
>  4. Add tests to ensure we are checking this logic in the presence of sparse
>     directory entries.

I started reading the series, then noticed I didn't like the first few
additions of ensure_full_index().  The first was because I thought it
just wasn't needed as per a few lines of code later, but the more
important one that stuck out to me was another where if the
ensure_full_index() call is needed to avoid the code blowing up, then
the code has a good chance that it is doing something inherently wrong
in a sparse-checkout/sparse-index anyway.

So I guess that brings me to the question I asked in 07/27 -- is the
presence of ensure_full_index() a note that the code needs to be later
audited and tweaked to work with sparse-indexes?  If so, then good,
this series may make sense.  However, will that always be the case?
If we think some of these will be left in the code, is there a plan to
annotate each one that has been audited and determined that it needs
to stay?  If not, then each ensure_full_index() might or might not
have been audited for correctness and it becomes a pain to know what's
left to fix.  If the plan is that these are to be audited, and to be
marked if they are truly deemed necessary, then the series makes sense
to me.  If not, then I'm confused about something with the series and
need some help with the goals and plans.

If I'm confused about the goals and the plans, then my reviews will
probably be less than helpful, so I'll suspend reading the series
until I understand the plan a little better.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 17, 2021

User Derrick Stolee <stolee@gmail.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 17, 2021

User Matheus Tavares Bernardino <matheus.bernardino@usp.br> has been added to the cc: list.

@@ -1350,6 +1350,7 @@ static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int q
int i;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Elijah Newren wrote (reply to this):

On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> When adjusting the sparse-checkout definition, ensure that a sparse
> index is expanded to a full one to avoid unexpected behavior. This is a
> top candidate for later integration with the sparse-index, but requires
> careful testing.


I was going to comment on most of these collectively, but your commit
message is completely divorced from the patch; it appears you either
place the wrong commit message here or copied it from the one for a
sparse-checkout.  The code modifies stash, though.

> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  builtin/stash.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/builtin/stash.c b/builtin/stash.c
> index ba774cce674f..b3d4fe814b6b 100644
> --- a/builtin/stash.c
> +++ b/builtin/stash.c
> @@ -1350,6 +1350,7 @@ static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int q
>                 int i;
>                 char *ps_matched = xcalloc(ps->nr, 1);
>
> +               ensure_full_index(&the_index);
>                 for (i = 0; i < active_nr; i++)
>                         ce_path_match(&the_index, active_cache[i], ps,
>                                       ps_matched);
> --
> gitgitgadget

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 3/18/2021 1:22 AM, Elijah Newren wrote:
> On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> When adjusting the sparse-checkout definition, ensure that a sparse
>> index is expanded to a full one to avoid unexpected behavior. This is a
>> top candidate for later integration with the sparse-index, but requires
>> careful testing.
> 
> 
> I was going to comment on most of these collectively, but your commit
> message is completely divorced from the patch; it appears you either
> place the wrong commit message here or copied it from the one for a
> sparse-checkout.  The code modifies stash, though.

Thanks for pointing this out. Must have been a bad squash during one
of many rebases.

-Stolee

diff-lib.c Outdated
@@ -102,6 +102,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Elijah Newren wrote (reply to this):

On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Before iterating over all cache entries, ensure that a sparse index is
> expanded to a full index to avoid unexpected behavior.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  diff-lib.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/diff-lib.c b/diff-lib.c
> index b73cc1859a49..41d6fcec1a81 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -102,6 +102,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>
>         if (diff_unmerged_stage < 0)
>                 diff_unmerged_stage = 2;
> +
>         entries = istate->cache_nr;
>         for (i = 0; i < entries; i++) {
>                 unsigned int oldmode, newmode;

I don't think adding a blank newline will ensure the index is expanded.  ;-)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 3/18/2021 1:24 AM, Elijah Newren wrote:
> On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> Before iterating over all cache entries, ensure that a sparse index is
>> expanded to a full index to avoid unexpected behavior.
>>
>> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>> ---
>>  diff-lib.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/diff-lib.c b/diff-lib.c
>> index b73cc1859a49..41d6fcec1a81 100644
>> --- a/diff-lib.c
>> +++ b/diff-lib.c
>> @@ -102,6 +102,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>>
>>         if (diff_unmerged_stage < 0)
>>                 diff_unmerged_stage = 2;
>> +
>>         entries = istate->cache_nr;
>>         for (i = 0; i < entries; i++) {
>>                 unsigned int oldmode, newmode;
> 
> I don't think adding a blank newline will ensure the index is expanded.  ;-)

Oops! This is one where I think we determined the loop doesn't
need the guard, but I didn't remove the newline and hence the
patch wasn't dropped.

Thanks,
-Stolee

merge-ort.c Outdated
@@ -3112,6 +3112,7 @@ static int record_conflicted_index_entries(struct merge_options *opt,
if (strmap_empty(conflicted))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Elijah Newren wrote (reply to this):

On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Before iterating over all cache entries, ensure that a sparse index is
> expanded to a full one to avoid unexpected behavior.
>
> This is a top candidate for updating later with a proper integration
> with the sparse index, since we will likely enable the ORT strategy by
> default when the sparse index is enabled. This appears to be the only

s/appears to be/is/   :-)

> place where the ORT strategy interacts with the index in such a global
> way, so that integration should be clear once the ORT strategy and the
> sparse index topics stabilize.

Right, there is one more patch that will touch this function -- patch
7 from the series that marks merge-ort stable over here:
https://lore.kernel.org/git/pull.905.v2.git.1616016485.gitgitgadget@gmail.com/

While I have more optimizations for merge-ort, none of them will touch
this function.

> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  merge-ort.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/merge-ort.c b/merge-ort.c
> index 603d30c52170..9f737212555d 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -3112,6 +3112,7 @@ static int record_conflicted_index_entries(struct merge_options *opt,
>         if (strmap_empty(conflicted))
>                 return 0;
>
> +       ensure_full_index(index);
>         original_cache_nr = index->cache_nr;
>
>         /* Put every entry from paths into plist, then sort */
> --
> gitgitgadget

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 3/18/2021 1:31 AM, Elijah Newren wrote:
> On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>>
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> Before iterating over all cache entries, ensure that a sparse index is
>> expanded to a full one to avoid unexpected behavior.
>>
>> This is a top candidate for updating later with a proper integration
>> with the sparse index, since we will likely enable the ORT strategy by
>> default when the sparse index is enabled. This appears to be the only
> 
> s/appears to be/is/   :-)
> 
>> place where the ORT strategy interacts with the index in such a global
>> way, so that integration should be clear once the ORT strategy and the
>> sparse index topics stabilize.
> 
> Right, there is one more patch that will touch this function -- patch
> 7 from the series that marks merge-ort stable over here:
> https://lore.kernel.org/git/pull.905.v2.git.1616016485.gitgitgadget@gmail.com/
> 
> While I have more optimizations for merge-ort, none of them will touch
> this function.

Yes, I noticed that that was one of the conflicts introduced with
this series.

I took a closer look at that patch and method, and it seems like
it only cares about paths with SKIP_WORKTREE _and_ is conflicted.
If it has a conflict, then I believe we won't collapse those
entries to a sparse-directory entry.

I think I will drop this patch for now, and rely on the fact that
I plan to enable merge-ORT directly if sparse-index is enabled.
At that time, I will carefully test this logic.

For now, the 'command_requires_full_index' guard will be
sufficient to preserve behavior.

Thanks,
-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 18, 2021

On the Git mailing list, Elijah Newren wrote (reply to this):

On Wed, Mar 17, 2021 at 11:03 AM Elijah Newren <newren@gmail.com> wrote:
>
> On Tue, Mar 16, 2021 at 2:17 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > The point of this series is to insert protections for the consumers of the
> > in-memory index. We mark certain regions of code as needing a full index, so
> > we call ensure_full_index() to expand a sparse index to a full one, if
> > necessary. These protections are inserted file-by-file in every loop over
> > all cache entries. Well, "most" loops, because some are going to be handled
> > in the very next series so I leave them out.
...
> > However, after this series is complete, we now have a straight-forward plan
> > for making commands "sparse aware" one-by-one:
> >
> >  1. Disable settings.command_requires_full_index to allow an in-memory
> >     sparse-index.
> >  2. Run versions of that command under a debugger, breaking on
> >     ensure_full_index().
> >  3. Examine the call stack to determine the context of that expansion, then
> >     implement the proper behavior in those locations.
> >  4. Add tests to ensure we are checking this logic in the presence of sparse
> >     directory entries.
>
...
> If I'm confused about the goals and the plans, then my reviews will
> probably be less than helpful, so I'll suspend reading the series
> until I understand the plan a little better.

Thanks for patiently responding to all my other queries.  I read
through more of the series today.  I inserted comments on a couple
specific patches, but most of patches 6-25 are all along the same
theme.

I have some overall comments on those patches in 6-25 (none of which
need to be addressed in this series, but are meant as hopefully
helpful guides about future work):

add/rm: both of these were just above loops that had a
skip-the-SKIP_WORKTREE entries, at least once Matheus' series is
merged (https://lore.kernel.org/git/cover.1615588108.git.matheus.bernardino@usp.br/).
I think the upshot is just that these become easier to convert.

checkout/commit -- I think these follow the add/rm model and likewise
become easy to convert.

ls-files - I commented elsewhere in this thread about how I think it'd
make sense to have it list the entries in the index, as always.
Obviously, that'd give different output than a full index, because
there are different entries present in the index when using a
sparse-index vs. a full one.

However, all of these, as you say, can wait until later.


I also noted that patches 26 and 27 were based on ones from the RFC
series that I reviewed before, and I noticed you fixed an issue or two
I flagged there, but you also made some other changes and it's too
late at night for my brain to continue thinking and try to compare
them; I may try again tomorrow.

Other than patch 26 which I'm too tired to think through right now,
and the patches I specifically commented on, the rest of the series
looks good to me.

@@ -718,7 +718,7 @@ static struct attr_stack *read_attr_from_file(const char *path, int macro_ok)
return res;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <dstolee@microsoft.com>
>
> Several methods specify that they take a 'struct index_state' pointer
> with the 'const' qualifier because they intend to only query the data,
> not change it. However, we will be introducing a step very low in the
> method stack that might modify a sparse-index to become a full index in
> the case that our queries venture inside a sparse-directory entry.
>
> This change only removes the 'const' qualifiers that are necessary for
> the following change which will actually modify the implementation of
> index_name_stage_pos().

This step has a bit of interaction with Matheus's "git add/rm" work
in sparse checkout (mt/add-rm-in-sparse-checkout), which I believe
is still in a bit of flux.  I didn't check potential conflicts the
remainder of the series may have with other in-flight topics.

So, I may throw review comments at the patches in this topic as if
they are standalone, but please do not be upset if it didn't appear
in the 'seen' topic.

Thanks.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 3/19/2021 5:01 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> Several methods specify that they take a 'struct index_state' pointer
>> with the 'const' qualifier because they intend to only query the data,
>> not change it. However, we will be introducing a step very low in the
>> method stack that might modify a sparse-index to become a full index in
>> the case that our queries venture inside a sparse-directory entry.
>>
>> This change only removes the 'const' qualifiers that are necessary for
>> the following change which will actually modify the implementation of
>> index_name_stage_pos().
> 
> This step has a bit of interaction with Matheus's "git add/rm" work
> in sparse checkout (mt/add-rm-in-sparse-checkout), which I believe
> is still in a bit of flux.  I didn't check potential conflicts the
> remainder of the series may have with other in-flight topics.
> 
> So, I may throw review comments at the patches in this topic as if
> they are standalone, but please do not be upset if it didn't appear
> in the 'seen' topic.

Thanks for the point. I need to reset my expectations about how
much I am stomping on other topics in flight. I rarely work so
widely across the source tree.

I could split this patch out on its own so it can fit into a
narrow window between other topics that might collide with it,
then rebase the rest of the series on top after things settle.

Thanks,
-Stolee

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Junio C Hamano <gitster@pobox.com> writes:

> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> Several methods specify that they take a 'struct index_state' pointer
>> with the 'const' qualifier because they intend to only query the data,
>> not change it. However, we will be introducing a step very low in the
>> method stack that might modify a sparse-index to become a full index in
>> the case that our queries venture inside a sparse-directory entry.
>>
>> This change only removes the 'const' qualifiers that are necessary for
>> the following change which will actually modify the implementation of
>> index_name_stage_pos().
>
> This step has a bit of interaction with Matheus's "git add/rm" work
> in sparse checkout (mt/add-rm-in-sparse-checkout), which I believe
> is still in a bit of flux.  I didn't check potential conflicts the
> remainder of the series may have with other in-flight topics.
>
> So, I may throw review comments at the patches in this topic as if
> they are standalone, but please do not be upset if it didn't appear
> in the 'seen' topic.

Tonight's pushout will have this topic in 'seen', but I consider the
branch a series of trial merges (there are other first-time topics).
The result seems to compile for me, but other than that, I have not
much confidence in conflict resolution.  Please give them an extra
set of eyeballs.

Thanks.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 3/19/2021 9:52 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>>> From: Derrick Stolee <dstolee@microsoft.com>
>>>
>>> Several methods specify that they take a 'struct index_state' pointer
>>> with the 'const' qualifier because they intend to only query the data,
>>> not change it. However, we will be introducing a step very low in the
>>> method stack that might modify a sparse-index to become a full index in
>>> the case that our queries venture inside a sparse-directory entry.
>>>
>>> This change only removes the 'const' qualifiers that are necessary for
>>> the following change which will actually modify the implementation of
>>> index_name_stage_pos().
>>
>> This step has a bit of interaction with Matheus's "git add/rm" work
>> in sparse checkout (mt/add-rm-in-sparse-checkout), which I believe
>> is still in a bit of flux.  I didn't check potential conflicts the
>> remainder of the series may have with other in-flight topics.
>>
>> So, I may throw review comments at the patches in this topic as if
>> they are standalone, but please do not be upset if it didn't appear
>> in the 'seen' topic.
> 
> Tonight's pushout will have this topic in 'seen', but I consider the
> branch a series of trial merges (there are other first-time topics).
> The result seems to compile for me, but other than that, I have not
> much confidence in conflict resolution.  Please give them an extra
> set of eyeballs.

I looked at your merge, and recreated and equivalent one in my own
merging. Thanks for working through all those details. I trust that
your rerere data will remember those changes and help avoid trouble
when replaying new versions.

Otherwise, I could create a new version of this patch with the
intention of playing directly on top of mt/add-rm-in-sparse-checkout,
then keep future versions of this series on top of a merge of that
series and ds/sparse-index.

I have a v2 of this series ready to send, but I'll give it a day to
see how things shake out on ds/sparse-index and if you have another
suggestion for this conflict.

Thanks,
-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 19, 2021

This branch is now known as ds/sparse-index-protections.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 19, 2021

This patch series was integrated into seen via git@8bc8e10.

@gitgitgadget gitgitgadget bot added the seen label Mar 19, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Mar 20, 2021

This patch series was integrated into seen via git@2969a5e.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 20, 2021

This patch series was integrated into seen via git@8a72c0f.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 20, 2021

There was a status update about the branch ds/sparse-index-protections on the Git mailing list:

Sparse-index technology demonstration.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 21, 2021

This patch series was integrated into seen via git@0720399.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 22, 2021

This patch series was integrated into seen via git@e5e6544.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 22, 2021

This patch series was integrated into seen via git@445a71d.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 22, 2021

This patch series was integrated into seen via git@c5e2d8a.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 23, 2021

This patch series was integrated into seen via git@6b9bb26.

A sparse-index loads the name-hash data for its entries, including the
sparse-directory entries. If a caller asks for a path that is contained
within a sparse-directory entry, we need to expand to a full index and
recalculate the name hash table before returning the result. Insert
calls to expand_to_path() to protect against this case.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
@derrickstolee derrickstolee changed the base branch from stolee/sparse-index/format to ds/sparse-index April 12, 2021 20:34
@derrickstolee
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 12, 2021

Submitted as pull.906.v3.git.1618261697.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-906/derrickstolee/sparse-index/protections-v3

To fetch this version to local tag pr-906/derrickstolee/sparse-index/protections-v3:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-906/derrickstolee/sparse-index/protections-v3

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 12, 2021

This patch series was integrated into seen via git@e9eb556.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 13, 2021

There was a status update in the "Cooking" section about the branch ds/sparse-index-protections on the Git mailing list:

Builds on top of the sparse-index infrastructure to mark operations
that are not ready to mark with the sparse index, causing them to
fall back on fully-populated index that they always have worked with.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 13, 2021

On the Git mailing list, Elijah Newren wrote (reply to this):

On Mon, Apr 12, 2021 at 2:08 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> Here is the second patch series submission coming out of the sparse-index
> RFC [1].
>
> [1]
> https://lore.kernel.org/git/pull.847.git.1611596533.gitgitgadget@gmail.com/
>
> This is based on ds/sparse-index.
>
> The point of this series is to insert protections for the consumers of the
> in-memory index to avoid unintended behavior change when using a sparse
> index versus a full one.
>
> We mark certain regions of code as needing a full index, so we call
> ensure_full_index() to expand a sparse index to a full one, if necessary.
> These protections are inserted file-by-file in every loop over all cache
> entries. Well, "most" loops, because some are going to be handled in the
> very next series so I leave them out.
>
> Many callers use index_name_pos() to find a path by name. In these cases, we
> can check if that position resolves to a sparse directory instance. In those
> cases, we just expand to a full index and run the search again.
>
> The last few patches deal with the name-hash hashtable for doing O(1)
> lookups.
>
> These protections don't do much right now, since the previous series created
> the_repository->settings.command_requires_full_index to guard all index
> reads and writes to ensure the in-memory copy is full for commands that have
> not been tested with the sparse index yet.
>
> However, after this series is complete, we now have a straight-forward plan
> for making commands "sparse aware" one-by-one:
>
>  1. Disable settings.command_requires_full_index to allow an in-memory
>     sparse-index.
>  2. Run versions of that command under a debugger, breaking on
>     ensure_full_index().
>  3. Examine the call stack to determine the context of that expansion, then
>     implement the proper behavior in those locations.
>  4. Add tests to ensure we are checking this logic in the presence of sparse
>     directory entries.
>
> I will admit that mostly it is the writing of the test cases that takes the
> most time in the conversions I've done so far.
>
>
> Updates in v3
> =============
>
>  * I updated based on Elijah's feedback.
>  * One new patch splits out a change that Elijah (rightfully) pointed out
>    did not belong with the patch it was originally in.
>
> I gave it time to see if any other comments came in, but it looks like
> review stabilized. I probably waited a bit longer than I should have.

This round looks good to me.

Reviewed-by: Elijah Newren <newren@gmail.com>

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 14, 2021

This patch series was integrated into seen via git@69245c1.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 14, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

Elijah Newren <newren@gmail.com> writes:

> This round looks good to me.
>
> Reviewed-by: Elijah Newren <newren@gmail.com>

Thanks; this kind of change inevitably would involve semantic
conflicts with topics in flight, but we've seen Derrick works well
together with others in such scenarios already, so let's run with
this version and see what happens.

Will merge to 'next' until there are new issues found, by the end of
the week at the latest.

Thanks.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 14, 2021

This patch series was integrated into seen via git@da665f1.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 15, 2021

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 4/14/2021 4:44 PM, Junio C Hamano wrote:
> Elijah Newren <newren@gmail.com> writes:
> 
>> This round looks good to me.
>>
>> Reviewed-by: Elijah Newren <newren@gmail.com>
> 
> Thanks; this kind of change inevitably would involve semantic
> conflicts with topics in flight, but we've seen Derrick works well
> together with others in such scenarios already, so let's run with
> this version and see what happens.

Thanks!

Semantic conflicts like the one in mt/add-rm-sparse-checkout won't
show up until I start relaxing command_requires_full_index, so I'll
need to be careful at those points. But I'll keep a lookout for
other issues contributors have when integrating across this change.

> Will merge to 'next' until there are new issues found, by the end of
> the week at the latest.

That timeline works for me.

-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 15, 2021

This patch series was integrated into seen via git@52ea535.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 16, 2021

There was a status update in the "Cooking" section about the branch ds/sparse-index-protections on the Git mailing list:

Builds on top of the sparse-index infrastructure to mark operations
that are not ready to mark with the sparse index, causing them to
fall back on fully-populated index that they always have worked with.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 16, 2021

This patch series was integrated into seen via git@4277a02.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 17, 2021

This patch series was integrated into seen via git@154c011.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 17, 2021

This patch series was integrated into seen via git@15ec733.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 17, 2021

This patch series was integrated into next via git@f1c40f8.

@gitgitgadget gitgitgadget bot added the next label Apr 17, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Apr 19, 2021

This patch series was integrated into seen via git@caa04bf.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 19, 2021

There was a status update in the "Cooking" section about the branch ds/sparse-index-protections on the Git mailing list:

Builds on top of the sparse-index infrastructure to mark operations
that are not ready to mark with the sparse index, causing them to
fall back on fully-populated index that they always have worked with.

Will merge to 'master'.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 21, 2021

This patch series was integrated into seen via git@7cd2085.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 28, 2021

This patch series was integrated into seen via git@5150518.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 29, 2021

There was a status update in the "Cooking" section about the branch ds/sparse-index-protections on the Git mailing list:

Builds on top of the sparse-index infrastructure to mark operations
that are not ready to mark with the sparse index, causing them to
fall back on fully-populated index that they always have worked with.

Will merge to 'master'.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 30, 2021

This patch series was integrated into seen via git@8e97852.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 30, 2021

This patch series was integrated into next via git@8e97852.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 30, 2021

This patch series was integrated into master via git@8e97852.

@gitgitgadget gitgitgadget bot added the master label Apr 30, 2021
@gitgitgadget gitgitgadget bot closed this Apr 30, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Apr 30, 2021

Closed via 8e97852.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant