Remove more index compatibility macros #830
/submit |
Submitted as pull.830.git.1609506428.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
On the Git mailing list, Elijah Newren wrote (reply to this):
On the Git mailing list, Eric Sunshine wrote (reply to this):
builtin/merge-index.c
@@ -1,23 +1,23 @@
#define USE_THE_INDEX_COMPATIBILITY_MACROS
On the Git mailing list, Alban Gruin wrote (reply to this):
Hi Derrick,
Le 01/01/2021 à 14:06, Derrick Stolee via GitGitGadget a écrit :
> From: Derrick Stolee <dstolee@microsoft.com>
>
> Replace uses of the old macros for the_index and instead pass around a
> 'struct index_state' pointer. This allows dropping the compatibility
> flag.
>
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
I already libified builtin/merge-index.c in ag/merge-strategies-in-c,
and as such dropped the_index. I modified merge_entry(), merge_one_path()
and merge_all() to take a callback, itself taking a repository. As
such, in my series, these functions take a `struct repository *' instead
of an index state.
I'm not sure how we should proceed with our respective patches.
Cheers,
Alban
On the Git mailing list, Derrick Stolee wrote (reply to this):
On 1/3/2021 6:31 PM, Alban Gruin wrote:
> Hi Derrick,
>
> Le 01/01/2021 à 14:06, Derrick Stolee via GitGitGadget a écrit :
>> From: Derrick Stolee <dstolee@microsoft.com>
>>
>> Replace uses of the old macros for the_index and instead pass around a
>> 'struct index_state' pointer. This allows dropping the compatibility
>> flag.
>>
>> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
>
> I already libified builtin/merge-index.c in ag/merge-strategies-in-c,
> and as such dropped the_index. I modified merge_entry(), merge_one_path()
> and merge_all() to take a callback, itself taking a repository. As
> such, in my series, these functions take a `struct repository *' instead
> of an index state.
>
> I'm not sure how we should proceed with our respective patches.
Hi Alban,
Sorry I didn't realize that. I'll drop this patch. Thanks for letting
me know!
Thanks,
-Stolee
On the Git mailing list, Derrick Stolee wrote (reply to this):
On the Git mailing list, Eric Sunshine wrote (reply to this):
c65b4a8
to
9d8e48b
Compare
The mv builtin uses the compatibility macros to interact with the index. Update these to use modern methods referring to a 'struct index_state' pointer. Several helper methods need to be updated to consider such a pointer, but the modifications are rudimentary. Two macros can be deleted from cache.h because these are the last uses. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
The rm builtin still uses the antiquated compatibility macros for interacting with the index. Update these to the more modern uses by passing around a 'struct index_state' pointer. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
9d8e48b
to
2b171a1
Compare
/submit |
Submitted as pull.830.v2.git.1609821783.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
On the Git mailing list, Derrick Stolee wrote (reply to this):
On the Git mailing list, Junio C Hamano wrote (reply to this):
On the Git mailing list, Derrick Stolee wrote (reply to this):
On the Git mailing list, Junio C Hamano wrote (reply to this):
@@ -3,7 +3,6 @@
 *
On the Git mailing list, Eric Sunshine wrote (reply to this):
On Mon, Jan 4, 2021 at 11:43 PM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
> In order to remove index compatibility macros cleanly, we relied upon
> static globals 'repo' and 'istate' to be pointers to the_repository and
> the_index, respectively. We remove these static globals inside the
> option parsing callbacks, which are the final uses in update-index.
>
> The callbacks cannot change their method signature, so we must use the
> value member of 'struct option', assigned in the array of option macros.
> There are several callback methods that require at least one of 'repo'
> and 'istate', but they use a variety of different data types for the
> callback value.
>
> Unify these callback methods to use a consistent 'struct callback_data'
> that contains 'repo' and 'istate', ready to use. This takes the place of
> the previous 'struct refresh_params' which served only to group the
> 'flags' and 'has_errors' ints. We also collect other one-off settings,
> but only those that require access to the index or repository in their
> operation.
Makes sense. The patch itself is necessarily a bit noisy, but there's
nothing particularly complicated in that noise.
> Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
> ---
> diff --git a/builtin/update-index.c b/builtin/update-index.c
> @@ -784,19 +784,21 @@ static int do_reupdate(struct repository *repo,
> -struct refresh_params {
> +struct callback_data {
> + struct repository *repo;
> + struct index_state *istate;
> +
> unsigned int flags;
> - int *has_errors;
> + unsigned int has_errors;
> + unsigned nul_term_line;
> + unsigned read_from_stdin;
> };
The only mildly unexpected thing here is that `has_errors` is now a
simple value rather than a pointer to a value, but you handle that
easily enough by always accessing `has_errors` directly from the
structure, even within the function in which `has_errors` used to be a
local variable. Fine.
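To make the mechanism concrete, here is a minimal sketch of how a fixed-signature option callback can recover its state through the `value` member. It assumes simplified stand-ins for `struct option`, `struct repository`, and `struct index_state` (Git's real definitions carry many more fields), and the flag bit and the "refresh failed" condition are made up for illustration. Only the layout of `struct callback_data` is taken from the quoted diff.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for Git's types; only the fields used here are modeled. */
struct index_state { unsigned int cache_changed; };
struct repository { struct index_state *index; };

/* Simplified stand-in for the parse-options 'struct option'. */
struct option {
	const char *long_name;
	void *value; /* opaque pointer handed back to the callback */
	int (*callback)(const struct option *opt, const char *arg, int unset);
};

/* Same shape as the struct in the quoted diff. */
struct callback_data {
	struct repository *repo;
	struct index_state *istate;

	unsigned int flags;
	unsigned int has_errors;
	unsigned nul_term_line;
	unsigned read_from_stdin;
};

#define SKETCH_REALLY_REFRESH 0x1u /* hypothetical flag bit */

/* The signature is fixed, so all state arrives through opt->value. */
static int really_refresh_sketch(const struct option *opt,
				 const char *arg, int unset)
{
	struct callback_data *cd = opt->value;

	(void)arg;
	(void)unset;
	assert(cd && cd->repo && cd->istate);

	cd->flags |= SKETCH_REALLY_REFRESH;
	/* Pretend the refresh failed when the index is marked dirty. */
	cd->has_errors |= cd->istate->cache_changed ? 1u : 0u;
	return 0;
}
```

Because `has_errors` lives in the struct rather than behind a pointer, the caller reads `cd.has_errors` after option parsing instead of keeping a separate local variable.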
> @@ -818,7 +820,7 @@ static int really_refresh_callback(const struct option *opt,
> static int chmod_callback(const struct option *opt,
> - const char *arg, int unset)
> + const char *arg, int unset)
> @@ -829,11 +831,12 @@ static int chmod_callback(const struct option *opt,
> static int resolve_undo_clear_callback(const struct option *opt,
> - const char *arg, int unset)
> + const char *arg, int unset)
A couple drive-by indentation fixes. Okay.
> @@ -1098,8 +1103,13 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
> - istate = repo->index;
> + cd.repo = repo;
> + cd.istate = istate = repo->index;
Will there ever be a case in which `cd.istate` will be different from
`cd.repo->index`? If not, then we could get by with having only
`cd.repo`; callers requiring access to `istate` can fetch it from
`cd.repo`. If, on the other hand, `cd.istate` can be different from
`cd.repo->index` -- or if that might become a possibility in the
future -- then having `cd.istate` makes sense. Not a big deal, though.
Just generally curious about it.
On the Git mailing list, Derrick Stolee wrote (reply to this):
On 1/7/2021 12:09 AM, Eric Sunshine wrote:
> On Mon, Jan 4, 2021 at 11:43 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> @@ -1098,8 +1103,13 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
>> - istate = repo->index;
>> + cd.repo = repo;
>> + cd.istate = istate = repo->index;
>
> Will there ever be a case in which `cd.istate` will be different from
> `cd.repo->index`? If not, then we could get by with having only
> `cd.repo`; callers requiring access to `istate` can fetch it from
> `cd.repo`. If, on the other hand, `cd.istate` can be different from
> `cd.repo->index` -- or if that might become a possibility in the
> future -- then having `cd.istate` makes sense. Not a big deal, though.
> Just generally curious about it.
I don't believe that 'istate' and 'repo->index' will ever be
different in this file. This includes the members of the
callback_data struct, but also the method parameters throughout.
Mostly, this could be seen as an artifact of how we got here:
1. References to the_index or other compatibility macros were
converted to use the static global 'istate'.
2. References to the static global 'istate' were replaced with
method parameters for everything except these callbacks.
3. These callbacks were updated to use 'cd.istate' instead of
the (now defunct) static global 'istate'.
It could be possible to replace all references to 'istate' with
'repo->index' but the patches get slightly more messy. I also
think the code looks messier, but you do make a good point that
there is no concrete reason to separate the two.
Thanks,
-Stolee
On the Git mailing list, Eric Sunshine wrote (reply to this):
On Thu, Jan 7, 2021 at 6:19 AM Derrick Stolee <stolee@gmail.com> wrote:
> On 1/7/2021 12:09 AM, Eric Sunshine wrote:
> > Will there ever be a case in which `cd.istate` will be different from
> > `cd.repo->index`? If not, then we could get by with having only
> > `cd.repo`; callers requiring access to `istate` can fetch it from
> > `cd.repo`. If, on the other hand, `cd.istate` can be different from
> > `cd.repo->index` -- or if that might become a possibility in the
> > future -- then having `cd.istate` makes sense. Not a big deal, though.
> > Just generally curious about it.
>
> I don't believe that 'istate' and 'repo->index' will ever be
> different in this file. This includes the members of the
> callback_data struct, but also the method parameters throughout.
>
> It could be possible to replace all references to 'istate' with
> 'repo->index' but the patches get slightly more messy. I also
> think the code looks messier, but you do make a good point that
> there is no concrete reason to separate the two.
I agree that it would make the code a bit noisier (to read) if
`istate` is eliminated from the callback structure, however, even
though I didn't originally feel strongly one way or the other about
having both `repo` and `istate` in the structure, I'm now leaning more
toward seeing `istate` eliminated. My one (big) concern with `istate`
is that it confuses readers into wondering whether `istate` and
`repo->index` will ever be different. One way to avoid such confusion
would be to leave a comment in the code stating that the two values
will always be the same. The other way, of course, is to eliminate
`istate` from the structure altogether. I don't want to make more work
for you, but the more I think about it, the more I feel that removing
`istate` is the sensible thing to do. (And it doesn't require an extra
patch -- it can just be how this patch is crafted -- without ever
introducing `istate` to the structure in the first place.)
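A minimal sketch of the alternative discussed here, assuming stand-in types and hypothetical helper names (`cd_index`, `cd_entry_count`): the structure stores only `repo`, and the index is derived at the point of use, so there is no second pointer that a reader could suspect of diverging.

```c
#include <stddef.h>

/* Minimal stand-ins for Git's types. */
struct index_state { unsigned int cache_nr; };
struct repository { struct index_state *index; };

/* Only 'repo' is stored; there is no second pointer that could drift. */
struct callback_data {
	struct repository *repo;
	unsigned int flags;
	unsigned int has_errors;
};

/* Hypothetical helper: derive the index at the point of use. */
static struct index_state *cd_index(const struct callback_data *cd)
{
	return cd->repo->index;
}

/* Hypothetical helper built on top of cd_index(). */
static unsigned int cd_entry_count(const struct callback_data *cd)
{
	return cd_index(cd)->cache_nr;
}
```

The cost is the slightly noisier `cd->repo->index` spelling at each use, which is the trade-off both sides of the thread acknowledge.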
On the Git mailing list, Junio C Hamano wrote (reply to this):
Eric Sunshine <sunshine@sunshineco.com> writes:
>> It could be possible to replace all references to 'istate' with
>> 'repo->index' but the patches get slightly more messy. I also
>> think the code looks messier, but you do make a good point that
>> there is no concrete reason to separate the two.
>
> I agree that it would make the code a bit noisier (to read) if
> `istate` is eliminated from the callback structure, however, even
> though I didn't originally feel strongly one way or the other about
> having both `repo` and `istate` in the structure, I'm now leaning more
> toward seeing `istate` eliminated. My one (big) concern with `istate`
> is that it confuses readers into wondering whether `istate` and
> `repo->index` will ever be different.
Some applications may want to work on more than one in-core index at
the same time (perhaps a merge strategy may want to keep a copy of
the original index and update a second copy with the result of the
merge), and it may be useful for such applications if 'repo->index'
does not have to be the in-core index instance to be worked on. So
things that go in libgit.a may want to allow such distinction.
But what goes in builtin/ is a different story. As long as this
application has no need for such a feature and will always work on
the primary in-core index, prepared for the in-core repository
structure for convenience, it may not be worth it to support such a
feature that no callers benefit from.
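The distinction can be sketched as follows, assuming stand-in types and hypothetical function names. The library-style functions accept an explicit `struct index_state *`, so a caller may hold two in-core indexes at once, while the builtin-style wrapper always uses the repository's primary index.

```c
#include <stddef.h>

/* Minimal stand-ins for Git's types. */
struct index_state { unsigned int cache_nr; };
struct repository { struct index_state *index; };

/* Library-style: works on whichever in-core index the caller names. */
static unsigned int lib_entry_count(const struct index_state *istate)
{
	return istate->cache_nr;
}

/* A caller holding two indexes at once, e.g. "before" and "after" a merge. */
static int lib_entries_added(const struct index_state *orig,
			     const struct index_state *result)
{
	return (int)lib_entry_count(result) - (int)lib_entry_count(orig);
}

/* Builtin-style: always the repository's primary index. */
static unsigned int builtin_entry_count(const struct repository *repo)
{
	return lib_entry_count(repo->index);
}
```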
Thanks.
On the Git mailing list, Derrick Stolee wrote (reply to this):
On 1/7/2021 2:57 PM, Junio C Hamano wrote:
> Eric Sunshine <sunshine@sunshineco.com> writes:
>
>>> It could be possible to replace all references to 'istate' with
>>> 'repo->index' but the patches get slightly more messy. I also
>>> think the code looks messier, but you do make a good point that
>>> there is no concrete reason to separate the two.
>>
>> I agree that it would make the code a bit noisier (to read) if
>> `istate` is eliminated from the callback structure, however, even
>> though I didn't originally feel strongly one way or the other about
>> having both `repo` and `istate` in the structure, I'm now leaning more
>> toward seeing `istate` eliminated. My one (big) concern with `istate`
>> is that it confuses readers into wondering whether `istate` and
>> `repo->index` will ever be different.
>
> Some applications may want to work on more than one in-core index at
> the same time (perhaps a merge strategy may want to keep a copy of
> the original index and update a second copy with the result of the
> merge), and it may be useful for such applications if 'repo->index'
> does not have to be the in-core index instance to be worked on. So
> things that go in libgit.a may want to allow such distinction.
>
> But what goes in builtin/ is a different story. As long as this
> application has no need for such a feature and will always work on
> the primary in-core index, prepared for the in-core repository
> structure for convenience, it may not be worth it to support such a
> feature that no callers benefit from.
I'll try to restructure my patches to do the following order:
1. replace compatibility macros with static global references, except
i. use 'istate' in the methods that don't need a repository.
ii. use 'repo->index' in the methods that need a repository.
2. replace static globals with method parameters.
i. drop 'istate' static global with method parameters. Methods that
have a repo will pass 'repo->index' to these methods.
ii. drop 'repo' static global with method parameters.
3. replace static globals in callback methods using 'repo->index',
where 'repo' is a member of the callback_data struct.
That should keep the structure as presented in v2 while also avoiding
this question of "can istate differ from repo->index?"
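The three stages of this plan can be sketched in one file, assuming stand-in types and hypothetical helper names. The `sketch_` objects stand in for `the_index` and `the_repository`. In the real series each stage replaces the previous one; they are shown side by side here only for comparison.

```c
#include <stddef.h>

/* Minimal stand-ins for Git's types. */
struct index_state { unsigned int cache_nr; };
struct repository { struct index_state *index; };

/* Stand-ins for the_index and the_repository. */
static struct index_state sketch_the_index;
static struct repository sketch_the_repository = { &sketch_the_index };

/* Step 1: file-scope pointers replace the macros; prototypes stay put. */
static struct repository *repo = &sketch_the_repository;
static struct index_state *istate = &sketch_the_index;

static unsigned int step1_count(void)
{
	return istate->cache_nr; /* helper needs no repository */
}

static unsigned int step1_report(void)
{
	return repo->index->cache_nr; /* helper needs a repository */
}

/* Step 2: the same helpers take parameters, so the globals can go. */
static unsigned int step2_count(const struct index_state *index)
{
	return index->cache_nr;
}

static unsigned int step2_report(const struct repository *r)
{
	return step2_count(r->index);
}

/* Step 3: callbacks reach the index through callback_data's 'repo'. */
struct callback_data { struct repository *repo; };

static unsigned int step3_callback(const struct callback_data *cd)
{
	return step2_count(cd->repo->index);
}
```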
Thanks,
-Stolee
2b171a1
to
5a21edc
Compare
To reduce the need for the index compatibility macros, we will replace their uses in update-index mechanically. This is the most interesting change, which creates global "repo" and "istate" pointers. The macros that expand to use the_index can then be mechanically replaced by references to the istate pointer. We will be careful to use "repo->index" over "istate" whenever repo is needed by a method. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Also use repo->index over istate, when possible. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Also use "repo->index" over "istate" when possible. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
This branch is now known as |
This patch series was integrated into seen via git@d9b4954. |
This patch series was integrated into seen via git@f13eaa8. |
This patch series was integrated into seen via git@8c2d7b3. |
This patch series was integrated into seen via git@d2e8dfc. |
This patch series was integrated into seen via git@1347110. |
On the Git mailing list, Junio C Hamano wrote (reply to this):
On the Git mailing list, Eric Sunshine wrote (reply to this):
On the Git mailing list, Derrick Stolee wrote (reply to this):
This patch series was integrated into seen via git@4182c5b. |
This patch series was integrated into seen via git@13d8b6f. |
This patch series was integrated into seen via git@cde3228. |
This patch series was integrated into seen via git@bf74db8. |
This patch series was integrated into seen via git@e4878a0. |
This patch series was integrated into seen via git@91defb9. |
This patch series was integrated into seen via git@b8d1e44. |
This patch series was integrated into seen via git@ab41db6. |
This patch series was integrated into seen via git@b8f05fd. |
This patch series was integrated into seen via git@1a9c87e. |
This patch series was integrated into seen via git@a5b9558. |
This patch series was integrated into seen via git@b1bdc6f. |
This patch series was integrated into seen via git@d726151. |
This patch series was integrated into seen via git@74b7b0a. |
This patch series was integrated into seen via git@8f33792. |
This patch series was integrated into seen via git@74b06e1. |
This patch series was integrated into seen via git@e6b2e66. |
closing to unblock other work. |
On the Git mailing list, Derrick Stolee wrote (reply to this):
UPDATE: this is now based on ag/merge-strategies-in-c to avoid conflicts in 'seen'. The changes in builtin/rm.c still conflict with mt/rm-sparse-checkout, but that branch seems to be waiting for a clearer plan on some corner cases. I thought about ejecting it, but 'rm' still uses ce_match_stat(), so just dropping the patch gives less of a final stake at the end of the series. (I'm still open to it, if necessary.)
I noticed that Duy's project around USE_THE_INDEX_COMPATIBILITY_MACROS has been on pause for a while. Here is my attempt to continue that project a little.
I started going through the builtins that still use cache_name_pos() and the first ones were easy: mv and rm. Then I hit update-index and it was a bit bigger.
My strategy for update-index was to create static globals "repo" and "istate" that point to the_repository and the_index, respectively. Then, I was able to remove macros one-by-one without changing method prototypes within the file. Then, these static globals were also removed by systematically updating the local method prototypes, plus some fancy structure stuff for the option parsing callbacks.
I had started trying to keep everything local to the method signatures, but I hit a snag when reaching the command-line parsing callbacks, whose call signature I could not modify. At that point, I had something that was already much more complicated than what I present now. Outside of the first update-index commit, everything was a mechanical find/replace.
In total, this allows us to remove four of the compatibility macros because they are no longer used.
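For readers unfamiliar with the pattern, here is a rough sketch of what a compatibility macro hides and what the converted call looks like. Everything prefixed with `sketch_` is a stand-in invented for this illustration. In particular, the lookup below is a simple linear scan that returns the position or -1, which is not how Git's index_name_pos() behaves.

```c
#include <string.h>

/* Simplified stand-ins; Git's real structs carry many more fields. */
struct cache_entry { const char *name; };
struct index_state {
	const struct cache_entry *cache;
	unsigned int cache_nr;
};

/*
 * Simplified lookup: linear scan returning the position or -1.
 * (Git's index_name_pos() uses a sorted array and a different
 * return convention; this sketch does not reproduce it.)
 */
static int sketch_index_name_pos(const struct index_state *istate,
				 const char *name)
{
	unsigned int i;

	for (i = 0; i < istate->cache_nr; i++)
		if (!strcmp(istate->cache[i].name, name))
			return (int)i;
	return -1;
}

/* Stand-in for the process-wide the_index. */
static struct index_state sketch_the_index;

/* Compatibility-macro style: the index is an invisible global argument. */
#define sketch_cache_name_pos(name) \
	sketch_index_name_pos(&sketch_the_index, (name))

/* Converted style: the caller says which index it means. */
static int has_path(const struct index_state *istate, const char *path)
{
	return sketch_index_name_pos(istate, path) >= 0;
}
```

Once no caller in a file uses the macro form, that file no longer needs to define USE_THE_INDEX_COMPATIBILITY_MACROS, and a macro with no remaining users anywhere can be deleted from the header.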
Updates in V3
Methods that know about the 'repo' pointer no longer also have an 'istate' pointer and instead prefer 'repo->index'
This includes the callback_data struct which only has a 'repo' member, no 'istate'.
Thanks,
-Stolee
Cc: pclouds@gmail.com
Cc: gitster@pobox.com
cc: Elijah Newren newren@gmail.com
cc: Eric Sunshine sunshine@sunshineco.com
cc: Alban Gruin alban.gruin@gmail.com
cc: Derrick Stolee stolee@gmail.com