Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove implicit dependency of the_repository #915

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

chinmoy12c
Copy link

@chinmoy12c chinmoy12c commented Mar 24, 2021

There are multiple files that try to reference the repository and the_index directly. To follow a more object-oriented
convention these references should be replaced with r and
index, or with istate->repo and passed through functions.

Signed-off-by: Chinmoy Chakraborty chinmoy12c@gmail.com

Related issue

#379

cc: Derrick Stolee stolee@gmail.com

Changes since v3

  • Used istate->repo instead of the_repository to prevent making changes in callers of the function.

cc: Derrick Stolee stolee@gmail.com

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 24, 2021

Welcome to GitGitGadget

Hi @chinmoy12c, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests.

Please make sure that your Pull Request has a good description, as it will be used as cover letter.

Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:

  • the lines should not exceed 76 columns,
  • the first line should be like a header and typically start with a prefix like "tests:" or "revisions:" to state which subsystem the change is about, and
  • the commit messages' body should be describing the "why?" of the change.
  • Finally, the commit messages should end in a Signed-off-by: line matching the commits' author.

It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code.

Contributing the patches

Before you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form /allow. A good way to find other contributors is to locate recent pull requests where someone has been /allowed:

Both the person who commented /allow and the PR author are able to /allow you.

An alternative is the channel #git-devel on the FreeNode IRC network:

<newcontributor> I've just created my first PR, could someone please /allow me? https://github.com/gitgitgadget/git/pull/12345
<veteran> newcontributor: it is done
<newcontributor> thanks!

Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment /submit.

If you want to see what email(s) would be sent for a /submit request, add a PR comment /preview to have the email(s) sent to you. You must have a public GitHub email address for this.

After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail).

If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the (raw) link), then import it into your mail program. If you use GMail, you can do this via:

curl -g --user "<EMailAddress>:<Password>" \
    --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt

To iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description):

Changes since v1:
- Fixed a typo in the commit message (found by ...)
- Added a code comment to ... as suggested by ...
...

To send a new iteration, just add another PR comment with the contents: /submit.

Need help?

New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join.

You may also be able to find help in real time in the developer IRC channel, #git-devel on Freenode. Remember that IRC does not support offline messaging, so if you send someone a private message and log out, they cannot respond to you. The scrollback of #git-devel is archived, though.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 24, 2021

There is an issue in commit 51ab74f:
The first line must be separated from the rest by an empty line

@dscho
Copy link
Member

dscho commented Mar 24, 2021

/allow

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 24, 2021

User chinmoy12c is now allowed to use GitGitGadget.

WARNING: chinmoy12c has no public email address set on GitHub

Copy link
Member

@dscho dscho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether there is another function that can be converted, one that does not require such extensive changes as cache_tree_update() (because some of its callers are outside of builtin/, and they, too, need to be converted not to use the_repository).

builtin/checkout.c Outdated Show resolved Hide resolved
cache-tree.h Outdated Show resolved Hide resolved
unpack-trees.c Outdated
@@ -1726,7 +1726,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
cache_tree_verify(the_repository, &o->result);
if (!cache_tree_fully_valid(o->result.cache_tree))
cache_tree_update(&o->result,
cache_tree_update(the_repository,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, too, we're outside of builtin/ and outside of t/helper/, so it is not really okay to introduce the_repository.

In this instance, we should consider putting it into struct unpack_trees_options, anyway, but that will require a much larger set of changes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made the changes in other files. Changing this would need changes to a lot of files. What would you suggest? Should I make the changes to those files?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is ultimately up to you: this is a rabbit hole, and it is your decision how far down you want to go.

In general, I would suggest to do the conversion one function signature at a time. All the callers that do not have an r to work with should get it via a (transitional!) struct repository *r = the_repository; at the beginning of the function, just like you did here: c977a2a#diff-eeb5ad245cf301bdd1dcd3253bf45156af5171a7dd82abf60ed241a044d704d4R65

Ultimately, we will want to have the entire code base converted, with cmd_*() functions in builtin/*.c getting their struct repository *r passed in as parameter.

But that is probably many, many years down the road. It's also a pretty disruptive set of changes, so it also makes sense to pay attention to what is happening in Git's seen branch, which is a branch collecting most of the patch series that are currently in flight. If your patches conflict with any of those, it is a safe bet that it would be wise to work on other parts of the code ;-)

Copy link
Author

@chinmoy12c chinmoy12c Mar 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, I would suggest to do the conversion one function signature at a time. All the callers that do not have an r to work with should get it via a (transitional!) struct repository *r = the_repository; at the beginning of the function, just like you did here: c977a2a#diff-eeb5ad245cf301bdd1dcd3253bf45156af5171a7dd82abf60ed241a044d704d4R65

Okay, I'll try to do this and convert them one at a time for this patch and thereby address individual callers in different patches.

But that is probably many, many years down the road. It's also a pretty disruptive set of changes, so it also makes sense to pay attention to what is happening in Git's seen branch, which is a branch collecting most of the patch series that are currently in flight. If your patches conflict with any of those, it is a safe bet that it would be wise to work on other parts of the code ;-)

Yes, I'll be on the lookout for those :). Thanks for helping. 😄

EDIT: Would it be fine to let the function unpack_trees work with struct repository *r = the_repository; and then pass r for now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is totally legitimate, as we want to convert the code base incrementally, to make it substantially easier to review.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 25, 2021

There is an issue in commit 970af5a:
Commit not signed off

@chinmoy12c chinmoy12c force-pushed the issue_379 branch 2 times, most recently from c977a2a to 764b1a2 Compare March 25, 2021 13:57
@chinmoy12c
Copy link
Author

Hey, @dscho could you please take a final look before I submit the patch :)

cache-tree.h Outdated
if (!the_index.cache_tree)
the_index.cache_tree = cache_tree();
return cache_tree_update(&the_index, flags);
return cache_tree_update(r, r->index, flags);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might need a bit of explanation... a casual reviewer might point out (and I almost did!) that r->index is easily deduced from r, so why not change the signature of cache_tree_update() to take a struct repository* instead of a struct index_state *?

The solution to that riddle is, of course, that you can have temporary indexes. Therefore, you might want to update the cache tree for an index other than r->index.

Most likely, information like this would be most welcome in the commit message.

Speaking of the commit message: maybe you want to imitate, say, 3a7a698's commit message? It talks a bit about what it does, but also a bit about "why?", and a bit about "what about ...?".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I'll update this with a proper commit message with that explanation :)

Copy link
Member

@dscho dscho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pretty good GSoC mini project, "to learn the ropes".

unpack-trees.c Outdated
@@ -1726,7 +1726,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
cache_tree_verify(the_repository, &o->result);
if (!cache_tree_fully_valid(o->result.cache_tree))
cache_tree_update(&o->result,
cache_tree_update(the_repository,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This is totally legitimate, as we want to convert the code base incrementally, to make it substantially easier to review.

@chinmoy12c
Copy link
Author

This is a pretty good GSoC mini project, "to learn the ropes".

Nice 😄 could you suggest a few more good issues to dive deeper into the code. I wanna understand the codebase a bit more properly before working on my proposal 😃

@dscho
Copy link
Member

dscho commented Mar 25, 2021

This is a pretty good GSoC mini project, "to learn the ropes".

Nice 😄 could you suggest a few more good issues to dive deeper into the code. I wanna understand the codebase a bit more properly before working on my proposal 😃

From https://github.com/gitgitgadget/git/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22, I would like to highlight #302 (to learn how to build/modify the documentation). It might not look like much, but the mini project really is more about proving that you can efficiently contribute to the Git project (whose contribution process is unfortunately not very welcoming).

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 25, 2021

Submitted as pull.915.git.1616701733901.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-915/chinmoy12c/issue_379-v1

To fetch this version to local tag pr-915/chinmoy12c/issue_379-v1:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-915/chinmoy12c/issue_379-v1

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 25, 2021

On the Git mailing list, Derrick Stolee wrote (reply to this):

n 3/25/2021 3:48 PM, Chinmoy via GitGitGadget wrote:
> From: Chinmoy Chakraborty <chinmoy12c@gmail.com>
> 
> This kills the_repository dependency in cache_tree_update(),
> but for unpack_trees(), they still assume the_repository
> (which also means the_index).

This is a worthwhile goal.
 
> Unfortunately the widespread use of unpack_trees() will make
> it hard to make the conversion now.

And this is true.

> The `update_main_cache_tree()` method uses `cache_tree_update(r, r->index, flags)`.
> `r->index` is easily deduced from `r` but the signature of `cache_tree_update()`
> is not changed to take `struct repository *` instead of `struct index_state *`
> because there can be temporary indexes. Therefore, one might want to update
> the cache tree for an index other than `r->index`.

Unfortunately, you're also hitting against a use of
cache_tree_update() that I'm introducing in my sparse-index
work [1], so that will cause a semantic, non-textual conflict.

[1] https://lore.kernel.org/git/pull.883.v4.git.1616507069.gitgitgadget@gmail.com/

There are a couple options here:

1. Wait a little while for that to settle and merge to master.
2. Rebase your work onto ds/sparse-index, so you also fix the
   new use in sparse-index.c.
3. Continue without it, and maybe Junio will update the merge
   with this diff:

diff --git a/sparse-index.c b/sparse-index.c
index e5712d98d8..585f269214 100644
--- a/sparse-index.c
+++ b/sparse-index.c
@@ -169,7 +169,7 @@ int convert_to_sparse(struct index_state *istate)
 		return -1;
 	}
 
-	if (cache_tree_update(istate, 0)) {
+	if (cache_tree_update(istate->repo, istate, 0)) {
 		warning(_("unable to update cache-tree, staying full"));
 		return -1;
 	}
@@ -183,7 +183,7 @@ int convert_to_sparse(struct index_state *istate)
 
 	/* Clear and recompute the cache-tree */
 	cache_tree_free(&istate->cache_tree);
-	cache_tree_update(istate, 0);
+	cache_tree_update(istate->repo, istate, 0);
 
 	istate->sparse_index = 1;
 	trace2_region_leave("index", "convert_to_sparse", istate->repo);

Finally, there is yet a fourth option: use istate->repo instead. In
1fd9ae51 (repository: add repo reference to index_state), I added a
'repo' member to struct index_state. This is intended for methods to
access a repository directly from the index.

However, this is slightly dangerous because there are some cases where
and index is loaded without even the_repository being populated (and
hence istate->repo can be NULL). They are rare, so perhaps
unpack_trees() is always used on an index_state with a populated repo.
If not, then a temporary fix can be done to set istate->repo to
the_repository at the start of the method.

This could help you avoid changing method prototypes, reducing the
conflicts in your patch.

> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index 2d6550bc3c86..3bc630ef64e7 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -684,6 +684,7 @@ static int merge_working_tree(const struct checkout_opts *opts,
>  	int ret;
>  	struct lock_file lock_file = LOCK_INIT;
>  	struct tree *new_tree;
> +	struct repository *r = the_repository;

I do appreciate you using this pattern so we could
possibly add 'r' as a method parameter later.

The rest of the patch looks pretty standard. I hope my comments
above are helpful in your direction here.

Thanks,
-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 25, 2021

User Derrick Stolee <stolee@gmail.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 26, 2021

On the Git mailing list, Chinmoy Chakraborty wrote (reply to this):


On 3/26/21 2:01 AM, Derrick Stolee wrote:
> n 3/25/2021 3:48 PM, Chinmoy via GitGitGadget wrote:
> 2. Rebase your work onto ds/sparse-index, so you also fix the
>     new use in sparse-index.c.

     I'll try to rebase my work on top of ds/sparse-index so that it's 
fixed too.

     Thanks, for the reply :)

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 26, 2021

On the Git mailing list, Chinmoy Chakraborty wrote (reply to this):


On 3/26/21 2:01 AM, Derrick Stolee wrote
> 2. Rebase your work onto ds/sparse-index, so you also fix the
>     new use in sparse-index.c.
Should I open a separate branch (and PR) on my repo, with the the 
updated ds/sparse-index and then submit it?

@chinmoy12c chinmoy12c force-pushed the issue_379 branch 3 times, most recently from fa0e766 to cfa9931 Compare March 26, 2021 12:54
@gitgitgadget
Copy link

gitgitgadget bot commented Mar 26, 2021

There is an issue in commit cfa9931:
The first line must be separated from the rest by an empty line

@chinmoy12c
Copy link
Author

@dscho, I made the required fixes on @derrickstolee 's patch. Should I open a PR with this on ds/sparse-index ?

@dscho
Copy link
Member

dscho commented Mar 26, 2021

@dscho, I made the required fixes on @derrickstolee 's patch. Should I open a PR with this on ds/sparse-index ?

I did not follow that discussion, so I do not know.

If the patch in question is a separate fix on top of ds/sparse-index, then yes, a new PR is appropriate.

If you want to extend this PR to cover a fix on top of ds/sparse-index, it might make more sense to re-target this PR (hit Edit next to the PR title, and you can change the target branch).

@chinmoy12c chinmoy12c changed the base branch from stolee/sparse-index/format to ds/sparse-index-wip March 26, 2021 14:53
@chinmoy12c chinmoy12c changed the base branch from ds/sparse-index-wip to ds/sparse-index March 26, 2021 14:54
@chinmoy12c
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 26, 2021

Submitted as pull.915.v2.git.1616772930098.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-915/chinmoy12c/issue_379-v2

To fetch this version to local tag pr-915/chinmoy12c/issue_379-v2:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-915/chinmoy12c/issue_379-v2

@chinmoy12c
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 3, 2021

Submitted as pull.915.v3.git.1617465421353.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-915/chinmoy12c/issue_379-v3

To fetch this version to local tag pr-915/chinmoy12c/issue_379-v3:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-915/chinmoy12c/issue_379-v3

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 4, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Chinmoy via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Chinmoy Chakraborty <chinmoy12c@gmail.com>
>
> This kills the_repository dependency in cache_tree_update(),

That part of the sentence is good.  But the next four lines are not.

> but for unpack_trees(), they still assume the_repository
> (which also means the_index).
>
> Unfortunately the widespread use of unpack_trees() will make
> it hard to make the conversion now.

As long as all the users of unpack_trees() want to work with the
primary in-core index instance, that is not a problem.  There is no
need to lament with a phrase like "unfortunately".  If you are not
touching unpack_trees(), but you needed to adjust its use of
cache_tree_update(), then none of the above needs to be said.  If
you change one API function (i.e. cache_tree_update()), all callers
need to be adjusted for that change, but if one caller only wants to
work in the primary repository and with its primary index, that is
fine.  You change it so that it passes the repository and the index
it wants the updated unpack_trees() to work on, and there is nothing
notable in it.

> The `update_main_cache_tree()` method uses `cache_tree_update(r, r->index, flags)`.
> `r->index` is easily deduced from `r` but the signature of `cache_tree_update()`
> is not changed to take `struct repository *` instead of `struct index_state *`
> because there can be temporary indexes. Therefore, one might want to update
> the cache tree for an index other than `r->index`.

The flow of explanation of the thought in this paragraph is probably
backwards for most people to understand what you wanted to say.

    The cache_tree_update() was originally designed to take an istate so
    that it can work on an instance of in-core index that is not the
    primary one for the repository, and we want to keep that property
    while teaching the subsystem how to deal with a repository other
    than the_repository.  Hence it can take a pointer to the repository
    and a pointer to istate separately.

Nothing else needs to be said.  The update-main-cache-tree is a thin
wrapper to work on cache tree for THE MAIN in-core index instance,
so it is natural if it needs to pass r and r->index, exactly because
the latter is THE MAIN in-core index instance for the repository.
That is too obvious that it is best left unsaid, for the same reason
why you shouldn't even mention what you did in updating the use of
update_cache_tree() inside unpack_trees().  There is nothing notable
in there.

cf. Documentation/SubmittingPatches
[[describe-changes]]
[[summary-section]]
[[meaningful-message]]
[[imperative-mood]]


> This commit also fixes the `sparse-index.c` file in which
> the `convert_to_sparse()` and `ensure_full_index()`
> method use `cache_tree_update()`.

What branch did you base this patch on?  There is no update_cache_tree()
call in sparse-index.c on 'master' (well, there is no sparse-index.c
in 'master' to begin with).  

While it is not prohibited to build on a commit that is not on
'master', you'd need to keep your dependencies to the minimum so
that the topic won't be taken hostage by other topics that do not
move quickly enough to your liking.  Building on 'next' or 'seen'
as a whole is a guarantee that the topic will never hit 'master',
as no tip of 'next' will ever be merged directly to 'master'.

And if you are building somewhere other than the tip of 'master', it
is necessary to say where, in order to save confusion and wasted
effort for your readers.  E.g.  "This patch was made on top of
merging topic 'X' and 'Y' into 'master'".  Then you'd only have to
wait for 'X' and 'Y' to graduate to 'master' before the topic can
proceed.

Thanks.


>  builtin/checkout.c              | 11 ++++++-----
>  cache-tree.c                    | 10 ++++++----
>  cache-tree.h                    |  6 ++++--
>  sequencer.c                     |  6 +++---
>  sparse-index.c                  | 13 ++++++++-----
>  t/helper/test-dump-cache-tree.c |  2 +-
>  unpack-trees.c                  | 19 ++++++++++---------
>  7 files changed, 38 insertions(+), 29 deletions(-)
>
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index 0e6639052001..33d762a7caf9 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -684,6 +684,7 @@ static int merge_working_tree(const struct checkout_opts *opts,
>  	int ret;
>  	struct lock_file lock_file = LOCK_INIT;
>  	struct tree *new_tree;
> +	struct repository *r = the_repository;
>  
>  	hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
>  	if (read_cache_preload(NULL) < 0)
> @@ -768,7 +769,7 @@ static int merge_working_tree(const struct checkout_opts *opts,
>  				return 1;
>  			old_tree = get_commit_tree(old_branch_info->commit);
>  
> -			if (repo_index_has_changes(the_repository, old_tree, &sb))
> +			if (repo_index_has_changes(r, old_tree, &sb))
>  				die(_("cannot continue with staged changes in "
>  				      "the following files:\n%s"), sb.buf);
>  			strbuf_release(&sb);
> @@ -787,9 +788,9 @@ static int merge_working_tree(const struct checkout_opts *opts,
>  			 */
>  
>  			add_files_to_cache(NULL, NULL, 0);
> -			init_merge_options(&o, the_repository);
> +			init_merge_options(&o, r);
>  			o.verbosity = 0;
> -			work = write_in_core_index_as_tree(the_repository);
> +			work = write_in_core_index_as_tree(r);
>  
>  			ret = reset_tree(new_tree,
>  					 opts, 1,
> @@ -822,9 +823,9 @@ static int merge_working_tree(const struct checkout_opts *opts,
>  	}
>  
>  	if (!cache_tree_fully_valid(active_cache_tree))
> -		cache_tree_update(&the_index, WRITE_TREE_SILENT | WRITE_TREE_REPAIR);
> +		cache_tree_update(r, r->index, WRITE_TREE_SILENT | WRITE_TREE_REPAIR);
>  
> -	if (write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
> +	if (write_locked_index(r->index, &lock_file, COMMIT_LOCK))
>  		die(_("unable to write new index file"));
>  
>  	if (!opts->discard_changes && !opts->quiet && new_branch_info->commit)
> diff --git a/cache-tree.c b/cache-tree.c
> index 11bf1fcae6e1..011cfd01c565 100644
> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -452,7 +452,7 @@ static int update_one(struct cache_tree *it,
>  	return i;
>  }
>  
> -int cache_tree_update(struct index_state *istate, int flags)
> +int cache_tree_update(struct repository *r, struct index_state *istate, int flags)
>  {
>  	int skip, i;
>  
> @@ -467,10 +467,10 @@ int cache_tree_update(struct index_state *istate, int flags)
>  		istate->cache_tree = cache_tree();
>  
>  	trace_performance_enter();
> -	trace2_region_enter("cache_tree", "update", the_repository);
> +	trace2_region_enter("cache_tree", "update", r);
>  	i = update_one(istate->cache_tree, istate->cache, istate->cache_nr,
>  		       "", 0, &skip, flags);
> -	trace2_region_leave("cache_tree", "update", the_repository);
> +	trace2_region_leave("cache_tree", "update", r);
>  	trace_performance_leave("cache_tree_update");
>  	if (i < 0)
>  		return i;
> @@ -654,12 +654,14 @@ static int write_index_as_tree_internal(struct object_id *oid,
>  					int flags,
>  					const char *prefix)
>  {
> +	struct repository *r = the_repository;
> +
>  	if (flags & WRITE_TREE_IGNORE_CACHE_TREE) {
>  		cache_tree_free(&index_state->cache_tree);
>  		cache_tree_valid = 0;
>  	}
>  
> -	if (!cache_tree_valid && cache_tree_update(index_state, flags) < 0)
> +	if (!cache_tree_valid && cache_tree_update(r, index_state, flags) < 0)
>  		return WRITE_TREE_UNMERGED_INDEX;
>  
>  	if (prefix) {
> diff --git a/cache-tree.h b/cache-tree.h
> index 8efeccebfc9f..80cc38f176c2 100644
> --- a/cache-tree.h
> +++ b/cache-tree.h
> @@ -33,7 +33,7 @@ void cache_tree_write(struct strbuf *, struct cache_tree *root);
>  struct cache_tree *cache_tree_read(const char *buffer, unsigned long size);
>  
>  int cache_tree_fully_valid(struct cache_tree *);
> -int cache_tree_update(struct index_state *, int);
> +int cache_tree_update(struct repository *, struct index_state *, int);
>  void cache_tree_verify(struct repository *, struct index_state *);
>  
>  /* bitmasks to write_index_as_tree flags */
> @@ -62,9 +62,11 @@ static inline int write_cache_as_tree(struct object_id *oid, int flags, const ch
>  
>  static inline int update_main_cache_tree(int flags)
>  {
> +	struct repository *r = the_repository;
> +
>  	if (!the_index.cache_tree)
>  		the_index.cache_tree = cache_tree();
> -	return cache_tree_update(&the_index, flags);
> +	return cache_tree_update(r, r->index, flags);
>  }
>  #endif
>  
> diff --git a/sequencer.c b/sequencer.c
> index d2332d3e1787..b0fad54ae5a4 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -677,10 +677,10 @@ static int do_recursive_merge(struct repository *r,
>  	return !clean;
>  }
>  
> -static struct object_id *get_cache_tree_oid(struct index_state *istate)
> +static struct object_id *get_cache_tree_oid(struct repository *r, struct index_state *istate)
>  {
>  	if (!cache_tree_fully_valid(istate->cache_tree))
> -		if (cache_tree_update(istate, 0)) {
> +		if (cache_tree_update(r, istate, 0)) {
>  			error(_("unable to update cache tree"));
>  			return NULL;
>  		}
> @@ -710,7 +710,7 @@ static int is_index_unchanged(struct repository *r)
>  	if (parse_commit(head_commit))
>  		return -1;
>  
> -	if (!(cache_tree_oid = get_cache_tree_oid(istate)))
> +	if (!(cache_tree_oid = get_cache_tree_oid(r, istate)))
>  		return -1;
>  
>  	return oideq(cache_tree_oid, get_commit_tree_oid(head_commit));
> diff --git a/sparse-index.c b/sparse-index.c
> index 95ea17174da3..e4323ffd81db 100644
> --- a/sparse-index.c
> +++ b/sparse-index.c
> @@ -128,12 +128,14 @@ int set_sparse_index_config(struct repository *repo, int enable)
>  int convert_to_sparse(struct index_state *istate)
>  {
>  	int test_env;
> +	struct repository *r = the_repository;
> +
>  	if (istate->split_index || istate->sparse_index ||
>  	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
>  		return 0;
>  
>  	if (!istate->repo)
> -		istate->repo = the_repository;
> +		istate->repo = r;
>  
>  	/*
>  	 * The GIT_TEST_SPARSE_INDEX environment variable triggers the
> @@ -161,7 +163,7 @@ int convert_to_sparse(struct index_state *istate)
>  		return -1;
>  	}
>  
> -	if (cache_tree_update(istate, 0)) {
> +	if (cache_tree_update(r, istate, 0)) {
>  		warning(_("unable to update cache-tree, staying full"));
>  		return -1;
>  	}
> @@ -175,7 +177,7 @@ int convert_to_sparse(struct index_state *istate)
>  
>  	/* Clear and recompute the cache-tree */
>  	cache_tree_free(&istate->cache_tree);
> -	cache_tree_update(istate, 0);
> +	cache_tree_update(r, istate, 0);
>  
>  	istate->sparse_index = 1;
>  	trace2_region_leave("index", "convert_to_sparse", istate->repo);
> @@ -216,12 +218,13 @@ void ensure_full_index(struct index_state *istate)
>  	int i;
>  	struct index_state *full;
>  	struct strbuf base = STRBUF_INIT;
> +	struct repository *r = the_repository;
>  
>  	if (!istate || !istate->sparse_index)
>  		return;
>  
>  	if (!istate->repo)
> -		istate->repo = the_repository;
> +		istate->repo = r;
>  
>  	trace2_region_enter("index", "ensure_full_index", istate->repo);
>  
> @@ -279,7 +282,7 @@ void ensure_full_index(struct index_state *istate)
>  
>  	/* Clear and recompute the cache-tree */
>  	cache_tree_free(&istate->cache_tree);
> -	cache_tree_update(istate, 0);
> +	cache_tree_update(r, istate, 0);
>  
>  	trace2_region_leave("index", "ensure_full_index", istate->repo);
>  }
> diff --git a/t/helper/test-dump-cache-tree.c b/t/helper/test-dump-cache-tree.c
> index 6a3f88f5f5d4..e6d57f9900f6 100644
> --- a/t/helper/test-dump-cache-tree.c
> +++ b/t/helper/test-dump-cache-tree.c
> @@ -64,6 +64,6 @@ int cmd__dump_cache_tree(int ac, const char **av)
>  		die("unable to read index file");
>  	istate = the_index;
>  	istate.cache_tree = another;
> -	cache_tree_update(&istate, WRITE_TREE_DRY_RUN);
> +	cache_tree_update(the_repository, &istate, WRITE_TREE_DRY_RUN);
>  	return dump_cache_tree(active_cache_tree, another, "");
>  }
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 0b888dab2246..e6bf576e174a 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1571,7 +1571,7 @@ static int verify_absent(const struct cache_entry *,
>   */
>  int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options *o)
>  {
> -	struct repository *repo = the_repository;
> +	struct repository *r = the_repository;
>  	int i, ret;
>  	static struct cache_entry *dfc;
>  	struct pattern_list pl;
> @@ -1581,10 +1581,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  		die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
>  
>  	trace_performance_enter();
> -	trace2_region_enter("unpack_trees", "unpack_trees", the_repository);
> +	trace2_region_enter("unpack_trees", "unpack_trees", r);
>  
> -	prepare_repo_settings(repo);
> -	if (repo->settings.command_requires_full_index) {
> +	prepare_repo_settings(r);
> +	if (r->settings.command_requires_full_index) {
>  		ensure_full_index(o->src_index);
>  		ensure_full_index(o->dst_index);
>  	}
> @@ -1662,9 +1662,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  		}
>  
>  		trace_performance_enter();
> -		trace2_region_enter("unpack_trees", "traverse_trees", the_repository);
> +		trace2_region_enter("unpack_trees", "traverse_trees", r);
>  		ret = traverse_trees(o->src_index, len, t, &info);
> -		trace2_region_leave("unpack_trees", "traverse_trees", the_repository);
> +		trace2_region_leave("unpack_trees", "traverse_trees", r);
>  		trace_performance_leave("traverse_trees");
>  		if (ret < 0)
>  			goto return_failed;
> @@ -1732,9 +1732,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  		move_index_extensions(&o->result, o->src_index);
>  		if (!ret) {
>  			if (git_env_bool("GIT_TEST_CHECK_CACHE_TREE", 0))
> -				cache_tree_verify(the_repository, &o->result);
> +				cache_tree_verify(r, &o->result);
>  			if (!cache_tree_fully_valid(o->result.cache_tree))
> -				cache_tree_update(&o->result,
> +				cache_tree_update(r,
> +						  &o->result,
>  						  WRITE_TREE_SILENT |
>  						  WRITE_TREE_REPAIR);
>  		}
> @@ -1750,7 +1751,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  done:
>  	if (free_pattern_list)
>  		clear_pattern_list(&pl);
> -	trace2_region_leave("unpack_trees", "unpack_trees", the_repository);
> +	trace2_region_leave("unpack_trees", "unpack_trees", r);
>  	trace_performance_leave("unpack_trees");
>  	return ret;
>  
>
> base-commit: c9e40ae8ec41c5566e5849a87c969fa81ef49fcd

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 4, 2021

On the Git mailing list, Chinmoy Chakraborty wrote (reply to this):


On 4/4/21 7:19 AM, Junio C Hamano wrote:
> Nothing else needs to be said.  The update-main-cache-tree is a thin
> wrapper to work on cache tree for THE MAIN in-core index instance,
> so it is natural if it needs to pass r and r->index, exactly because
> the latter is THE MAIN in-core index instance for the repository.
> That is too obvious that it is best left unsaid, for the same reason
> why you shouldn't even mention what you did in updating the use of
> update_cache_tree() inside unpack_trees().  There is nothing notable
> in there.
>
> cf. Documentation/SubmittingPatches
> [[describe-changes]]
> [[summary-section]]
> [[meaningful-message]]
> [[imperative-mood]]

Okay, I'll update the message. Thanks for the review.


>> This commit also fixes the `sparse-index.c` file in which
>> the `convert_to_sparse()` and `ensure_full_index()`
>> method use `cache_tree_update()`.
> What branch did you base this patch on?  There is no update_cache_tree()
> call in sparse-index.c on 'master' (well, there is no sparse-index.c
> in 'master' to begin with).

Actually the patch hit against a use of `cache_tree_update()` in

the branch 'ds/sparse-index', so I was suggested by Derrick Stolee

to rebase my work on to of his branch.

You may view the discussion here:

https://lore.kernel.org/git/f187df01-8e59-ac74-01e1-586a7a63fd4e@gmail.com/


> And if you are building somewhere other than the tip of 'master', it
> is necessary to say where, in order to save confusion and wasted
> effort for your readers.  E.g.  "This patch was made on top of
> merging topic 'X' and 'Y' into 'master'".  Then you'd only have to
> wait for 'X' and 'Y' to graduate to 'master' before the topic can
> proceed.
>
> Thanks.

Should this information also be added to the commit message

or just the cover letter of the patch?

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 4, 2021

On the Git mailing list, Chinmoy Chakraborty wrote (reply to this):


On 4/4/21 7:19 AM, Junio C Hamano wrote:
> Nothing else needs to be said.  The update-main-cache-tree is a thin
> wrapper to work on cache tree for THE MAIN in-core index instance,
> so it is natural if it needs to pass r and r->index, exactly because
> the latter is THE MAIN in-core index instance for the repository.
> That is too obvious that it is best left unsaid, for the same reason
> why you shouldn't even mention what you did in updating the use of
> update_cache_tree() inside unpack_trees().  There is nothing notable
> in there.
>
> cf. Documentation/SubmittingPatches
> [[describe-changes]]
> [[summary-section]]
> [[meaningful-message]]
> [[imperative-mood]]

Okay, I'll update the message. Thanks for the review.


>> This commit also fixes the `sparse-index.c` file in which
>> the `convert_to_sparse()` and `ensure_full_index()`
>> method use `cache_tree_update()`.
> What branch did you base this patch on?  There is no update_cache_tree()
> call in sparse-index.c on 'master' (well, there is no sparse-index.c
> in 'master' to begin with).

Actually the patch hit against a use of `cache_tree_update()` in

the branch 'ds/sparse-index', so I was suggested by Derrick Stolee

to rebase my work on to of his branch.

You may view the discussion here:

https://lore.kernel.org/git/f187df01-8e59-ac74-01e1-586a7a63fd4e@gmail.com/


> And if you are building somewhere other than the tip of 'master', it
> is necessary to say where, in order to save confusion and wasted
> effort for your readers.  E.g.  "This patch was made on top of
> merging topic 'X' and 'Y' into 'master'".  Then you'd only have to
> wait for 'X' and 'Y' to graduate to 'master' before the topic can
> proceed.
>
> Thanks.

Should this information be added to the commit message

or just the cover letter of the patch?

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 4, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

Chinmoy Chakraborty <chinmoy12c@gmail.com> writes:

>> And if you are building somewhere other than the tip of 'master', it
>> is necessary to say where, in order to save confusion and wasted
>> effort for your readers.  E.g.  "This patch was made on top of
>> merging topic 'X' and 'Y' into 'master'".  Then you'd only have to
>> wait for 'X' and 'Y' to graduate to 'master' before the topic can
>> proceed.
>>
>> Thanks.
>
> Should this information also be added to the commit message
>
> or just the cover letter of the patch?

We often see people write it in the cover letter for multi-patch
series.  For a single patch topic, it is written under the
three-dash line.

Thanks.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 4, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Chinmoy via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/sparse-index.c b/sparse-index.c
> index 95ea17174da3..e4323ffd81db 100644
> --- a/sparse-index.c
> +++ b/sparse-index.c
> @@ -128,12 +128,14 @@ int set_sparse_index_config(struct repository *repo, int enable)
>  int convert_to_sparse(struct index_state *istate)
>  {
>  	int test_env;
> +	struct repository *r = the_repository;
> +
>  	if (istate->split_index || istate->sparse_index ||
>  	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
>  		return 0;
>  
>  	if (!istate->repo)
> -		istate->repo = the_repository;
> +		istate->repo = r;
>  

I am not quite sure if this is a reasonable conversion.  Surely it
would not make the compiler barf, but are we sure that the caller of
convert_to_sparse() wants us to work on the_repository instance and
no other repository?  As an istate has a .repo member, it seems to
me that using istate->repo would be a lot saner approach than
assigning the_repository upfront to r.  It might be even needed, if
cache_tree_update() wants to take a repository instance, to ensure
that all callers to this helper sets istate->repo before they call
it so that the above "if not set yet, use the_repository" code does
not have to kick in.

>  	/*
>  	 * The GIT_TEST_SPARSE_INDEX environment variable triggers the
> @@ -161,7 +163,7 @@ int convert_to_sparse(struct index_state *istate)
>  		return -1;
>  	}
>  
> -	if (cache_tree_update(istate, 0)) {
> +	if (cache_tree_update(r, istate, 0)) {

And this looks like a bad conversion.  It may happen to do the same
thing, but the flow of the logic up to this point in the function
was to make sure istate->repo is not empty by filling it if it is
not yet set, and update the cache tree of that istate.  So, it seems
more logical if this call were like so, no?

	if (cache_tree_update(istate->repo, istate, 0)) {

In fact, in the world after 1fd9ae51 (repository: add repo reference
to index_state, 2021-01-23), it is dubious that this topic to teach
cache_tree_update() to take a repository pointer is sensible.  While
working on a single repository, we may create multiple in-core index
instances that represent temporary indices, but each of these in-core
index instances (i.e. istate) belong to a single repository.

And in a call to cache_tree_update(R, I, F), if I->repo is *NOT* R,
that must mean a bug.  Here is what 1fd9ae51 says on this point.

    repository: add repo reference to index_state

    It will be helpful to add behavior to index operations that might
    trigger an object lookup. Since each index belongs to a specific
    repository, add a 'repo' pointer to struct index_state that allows
    access to this repository.

    Add a BUG() statement if the repo already has an index, and the index
    already has a repo, but somehow the index points to a different repo.

    This will prevent future changes from needing to pass an additional
    'struct repository *repo' parameter and instead rely only on the 'struct
    index_state *istate' parameter.

Derrick, what's you thought on this?  The patch under discussion
looks to me a prime example of "future change(s)" needing "to pass
an additional 'struct repository *repo' parameter".

Thanks.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 5, 2021

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 4/4/2021 2:09 AM, Junio C Hamano wrote:
> "Chinmoy via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> diff --git a/sparse-index.c b/sparse-index.c
>> index 95ea17174da3..e4323ffd81db 100644
>> --- a/sparse-index.c
>> +++ b/sparse-index.c
>> @@ -128,12 +128,14 @@ int set_sparse_index_config(struct repository *repo, int enable)
>>  int convert_to_sparse(struct index_state *istate)
>>  {
>>  	int test_env;
>> +	struct repository *r = the_repository;
>> +
>>  	if (istate->split_index || istate->sparse_index ||
>>  	    !core_apply_sparse_checkout || !core_sparse_checkout_cone)
>>  		return 0;
>>  
>>  	if (!istate->repo)
>> -		istate->repo = the_repository;
>> +		istate->repo = r;
>>  
> 
> I am not quite sure if this is a reasonable conversion.  Surely it
> would not make the compiler barf, but are we sure that the caller of
> convert_to_sparse() wants us to work on the_repository instance and
> no other repository?  As an istate has a .repo member, it seems to
> me that using istate->repo would be a lot saner approach than
> assigning the_repository upfront to r.  It might be even needed, if
> cache_tree_update() wants to take a repository instance, to ensure
> that all callers to this helper sets istate->repo before they call
> it so that the above "if not set yet, use the_repository" code does
> not have to kick in.
> 
>>  	/*
>>  	 * The GIT_TEST_SPARSE_INDEX environment variable triggers the
>> @@ -161,7 +163,7 @@ int convert_to_sparse(struct index_state *istate)
>>  		return -1;
>>  	}
>>  
>> -	if (cache_tree_update(istate, 0)) {
>> +	if (cache_tree_update(r, istate, 0)) {
> 
> And this looks like a bad conversion.  It may happen to do the same
> thing, but the flow of the logic up to this point in the function
> was to make sure istate->repo is not empty by filling it if it is
> not yet set, and update the cache tree of that istate.  So, it seems
> more logical if this call were like so, no?
> 
> 	if (cache_tree_update(istate->repo, istate, 0)) {
> 
> In fact, in the world after 1fd9ae51 (repository: add repo reference
> to index_state, 2021-01-23), it is dubious that this topic to teach
> cache_tree_update() to take a repository pointer is sensible.  While
> working on a single repository, we may create multiple in-core index
> instances that represent temporary indices, but each of these in-core
> index instances (i.e. istate) belong to a single repository.
> 
> And in a call to cache_tree_update(R, I, F), if I->repo is *NOT* R,
> that must mean a bug.  Here is what 1fd9ae51 says on this point.
> 
>     repository: add repo reference to index_state
> 
>     It will be helpful to add behavior to index operations that might
>     trigger an object lookup. Since each index belongs to a specific
>     repository, add a 'repo' pointer to struct index_state that allows
>     access to this repository.
> 
>     Add a BUG() statement if the repo already has an index, and the index
>     already has a repo, but somehow the index points to a different repo.
> 
>     This will prevent future changes from needing to pass an additional
>     'struct repository *repo' parameter and instead rely only on the 'struct
>     index_state *istate' parameter.
> 
> Derrick, what's you thought on this?  The patch under discussion
> looks to me a prime example of "future change(s)" needing "to pass
> an additional 'struct repository *repo' parameter".

With your additional comments, I think it is clear that the "fourth
option" I mentioned earlier [1] is the way to go:

  Finally, there is yet a fourth option: use istate->repo instead. In
  1fd9ae51 (repository: add repo reference to index_state), I added a
  'repo' member to struct index_state. This is intended for methods to
  access a repository directly from the index.

[1] https://lore.kernel.org/git/f187df01-8e59-ac74-01e1-586a7a63fd4e@gmail.com/

So in this sense, we should always use istate->repo, but we might
still need the following guard in some places:

	if (!istate->repo)
		istate->repo = the_repository;

in case there are situations where the index is loaded before
the_repository is loaded. I have hit this in testing, but don't fully
understand the cases where this can happen.

The way it would change this patch is to drop the 'struct repository *r'
pointers and changes to the method signatures. Instead, keep the
methods only taking a 'struct index_state *istate' and use istate->repo
everywhere.

Thanks,
-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 5, 2021

User Derrick Stolee <stolee@gmail.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 5, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

Derrick Stolee <stolee@gmail.com> writes:

> With your additional comments, I think it is clear that the "fourth
> option" I mentioned earlier [1] is the way to go:
>
>   Finally, there is yet a fourth option: use istate->repo instead. In
>   1fd9ae51 (repository: add repo reference to index_state), I added a
>   'repo' member to struct index_state. This is intended for methods to
>   access a repository directly from the index.
>
> [1] https://lore.kernel.org/git/f187df01-8e59-ac74-01e1-586a7a63fd4e@gmail.com/

Thanks.  I wasn't following the earlier discussion closely, as the
topic seemed to be in the hands of good reviewers ;-)

> So in this sense, we should always use istate->repo, but we might
> still need the following guard in some places:
>
> 	if (!istate->repo)
> 		istate->repo = the_repository;
>
> in case there are situations where the index is loaded before
> the_repository is loaded. I have hit this in testing, but don't fully
> understand the cases where this can happen.

As a longer term goal, it may not be a bad idea to make sure that
anybody who passes an istate should not have to pass a repo.  I do
not think of a reason why, other than historical inertia, to do so
in post 1fd9ae51 world.

> The way it would change this patch is to drop the 'struct repository *r'
> pointers and changes to the method signatures. Instead, keep the
> methods only taking a 'struct index_state *istate' and use istate->repo
> everywhere.

Yes, and that would result in a patch that touches very limited
small parts of cache-tree.c without needing to touch any caller, I
would presume.

Thanks.

@chinmoy12c chinmoy12c changed the base branch from ds/sparse-index to master April 7, 2021 05:49
@gitgitgadget
Copy link

gitgitgadget bot commented Apr 7, 2021

The pull request has 283 commits. The max allowed is 30. Please split the patch series into multiple pull requests. Also consider squashing related commits.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 7, 2021

There is an issue in commit 84cf07d:
Commit not signed off

This kills the_repository dependency in cache_tree_update()
and prime_cache_tree().

Signed-off-by: Chinmoy Chakraborty <chinmoy12c@gmail.com>
@chinmoy12c
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 7, 2021

Submitted as pull.915.v4.git.1617778489719.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-915/chinmoy12c/issue_379-v4

To fetch this version to local tag pr-915/chinmoy12c/issue_379-v4:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-915/chinmoy12c/issue_379-v4

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 7, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Chinmoy via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Chinmoy Chakraborty <chinmoy12c@gmail.com>
>
> This kills the_repository dependency in cache_tree_update()
> and prime_cache_tree().
>
> Signed-off-by: Chinmoy Chakraborty <chinmoy12c@gmail.com>
> ---
>     Replace the_repository with r

Huh???

> diff --git a/cache-tree.c b/cache-tree.c
> index add1f0771317..4928a9f0f13b 100644
> --- a/cache-tree.c
> +++ b/cache-tree.c
> @@ -446,10 +446,10 @@ int cache_tree_update(struct index_state *istate, int flags)
>  		istate->cache_tree = cache_tree();
>  
>  	trace_performance_enter();
> -	trace2_region_enter("cache_tree", "update", the_repository);
> +	trace2_region_enter("cache_tree", "update", istate->repo);
>  	i = update_one(istate->cache_tree, istate->cache, istate->cache_nr,
>  		       "", 0, &skip, flags);
> -	trace2_region_leave("cache_tree", "update", the_repository);
> +	trace2_region_leave("cache_tree", "update", istate->repo);
>  	trace_performance_leave("cache_tree_update");
>  	if (i < 0)
>  		return i;
> @@ -746,13 +746,13 @@ void prime_cache_tree(struct repository *r,
>  		      struct index_state *istate,
>  		      struct tree *tree)
>  {
> -	trace2_region_enter("cache-tree", "prime_cache_tree", the_repository);
> +	trace2_region_enter("cache-tree", "prime_cache_tree", r);
>  	cache_tree_free(&istate->cache_tree);
>  	istate->cache_tree = cache_tree();
>  
>  	prime_cache_tree_rec(r, istate->cache_tree, tree);
>  	istate->cache_changed |= CACHE_TREE_CHANGED;
> -	trace2_region_leave("cache-tree", "prime_cache_tree", the_repository);
> +	trace2_region_leave("cache-tree", "prime_cache_tree", r);
>  }

The patch assumes that istate->repo will always set, but it does not
even try to justify why that assumption is safe to make (e.g. "the
entire codebase that leads to this function has been audited and
made sure istate at this point will always have its .repo member is
set" in the log message, if such an audit has actually been done,
may have been convincing), which I find quite troubling.

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 8, 2021

On the Git mailing list, Chinmoy Chakraborty wrote (reply to this):


On 4/8/21 4:33 AM, Junio C Hamano wrote:
> "Chinmoy via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Chinmoy Chakraborty <chinmoy12c@gmail.com>
>>
>> This kills the_repository dependency in cache_tree_update()
>> and prime_cache_tree().
>>
>> Signed-off-by: Chinmoy Chakraborty <chinmoy12c@gmail.com>
>> ---
>>      Replace the_repository with r
> Huh???


Sorry I forgot to change the cover letter.


> The patch assumes that istate->repo will always set, but it does not
> even try to justify why that assumption is safe to make (e.g. "the
> entire codebase that leads to this function has been audited and
> made sure istate at this point will always have its .repo member is
> set" in the log message, if such an audit has actually been done,
> may have been convincing), which I find quite troubling.


Is it safe to make this assumption? I mean to be completely

sure of this, one would have to track back to all the callers.

Should a check like:

     if(!istate->repo)

         istate->repo = the_repository;


be required?

@gitgitgadget
Copy link

gitgitgadget bot commented Apr 8, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

Chinmoy Chakraborty <chinmoy12c@gmail.com> writes:

> Is it safe to make this assumption?

It was the question I asked, and I didn't see a reason to believe it
is safe.

> I mean to be completely sure of this, one would have to track back
> to all the callers.

Yes.  If we audit the callers to make sure istate->repo always
points at the right repository, add missing assignment to
istate->repo as necessary, and add

	if (!istate->repo)
		BUG("caller of cache_tree_udpate() did not fill istate->repo");

then that would be an improvement.  But...

> Should a check like:
>
>     if(!istate->repo)
>
>         istate->repo = the_repository;
>
> be required?

... if we add such an "if the caller did not set istate->repo,
assume the_repository" code, then the resulting code explicitly
assumes that the istate the caller passed to us without setting
istate->repo belongs to the default repository.

I do not quite see the point of such a change---it is not all that
better than "implicit dependency on the_repository" the patch tries
to address, is it?

@chinmoy12c chinmoy12c changed the title Replace the_repository with r Remove implicit dependency of the_repository Apr 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants