Fix various issues around removal of untracked files/directories #1036

Closed
5 changes: 3 additions & 2 deletions Documentation/git-checkout.txt
@@ -118,8 +118,9 @@ OPTIONS
 -f::
 --force::
 When switching branches, proceed even if the index or the
-working tree differs from `HEAD`. This is used to throw away
-local changes.
+working tree differs from `HEAD`, and even if there are untracked
+files in the way. This is used to throw away local changes and
+any untracked files or directories that are in the way.
 +
 When checking out paths from the index, do not fail upon unmerged
 entries; instead, unmerged entries are ignored.
23 changes: 4 additions & 19 deletions Documentation/git-read-tree.txt
@@ -10,8 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git read-tree' [[-m [--trivial] [--aggressive] | --reset | --prefix=<prefix>]
-[-u [--exclude-per-directory=<gitignore>] | -i]]
-[--index-output=<file>] [--no-sparse-checkout]
+[-u | -i]] [--index-output=<file>] [--no-sparse-checkout]
 (--empty | <tree-ish1> [<tree-ish2> [<tree-ish3>]])


@@ -39,8 +38,9 @@ OPTIONS

 --reset::
 Same as -m, except that unmerged entries are discarded instead
-of failing. When used with `-u`, updates leading to loss of
-working tree changes will not abort the operation.
+of failing. When used with `-u`, updates leading to loss of
+working tree changes or untracked files or directories will not
+abort the operation.

 -u::
 After a successful merge, update the files in the work
@@ -88,21 +88,6 @@ OPTIONS
 The command will refuse to overwrite entries that already
 existed in the original index file.

---exclude-per-directory=<gitignore>::
-When running the command with `-u` and `-m` options, the
-merge result may need to overwrite paths that are not
-tracked in the current branch. The command usually
-refuses to proceed with the merge to avoid losing such a
-path. However this safety valve sometimes gets in the
-way. For example, it often happens that the other
-branch added a file that used to be a generated file in
-your branch, and the safety valve triggers when you try
-to switch to that branch after you ran `make` but before
-running `make clean` to remove the generated file. This
-option tells the command to read per-directory exclude
-file (usually '.gitignore') and allows such an untracked
-but explicitly ignored file to be overwritten.
-
 --index-output=<file>::
 Instead of writing the results out to `$GIT_INDEX_FILE`,
 write the resulting index in the named file. While the
3 changes: 2 additions & 1 deletion Documentation/git-reset.txt
@@ -69,7 +69,8 @@ linkgit:git-add[1]).

 --hard::
 Resets the index and working tree. Any changes to tracked files in the
-working tree since `<commit>` are discarded.
+working tree since `<commit>` are discarded. Any untracked files or
+directories in the way of writing any tracked files are simply deleted.

 --merge::
 Resets the index and updates the files in the working tree that are
3 changes: 2 additions & 1 deletion builtin/am.c
@@ -1917,7 +1917,8 @@ static int fast_forward_to(struct tree *head, struct tree *remote, int reset)
 opts.dst_index = &the_index;
 opts.update = 1;

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):


On Mon, Sep 27 2021, Elijah Newren via GitGitGadget wrote:

> From: Elijah Newren <newren@gmail.com>
>
> Currently, every caller of unpack_trees() that wants to ensure ignored
> files are overwritten by default needs to:
>    * allocate unpack_trees_options.dir
>    * flip the DIR_SHOW_IGNORED flag in unpack_trees_options.dir->flags
>    * call setup_standard_excludes
> AND then after the call to unpack_trees() needs to
>    * call dir_clear()
>    * deallocate unpack_trees_options.dir
> That's a fair amount of boilerplate, and every caller uses identical
> code.  Make this easier by instead introducing a new boolean value where
> the default value (0) does what we want so that new callers of
> unpack_trees() automatically get the appropriate behavior.  And move all
> the handling of unpack_trees_options.dir into unpack_trees() itself.
>
> While preserve_ignored = 0 is the behavior we feel is the appropriate
> default, we defer fixing commands to use the appropriate default until a
> later commit.  So, this commit introduces several locations where we
> manually set preserve_ignored=1.  This makes it clear where code paths
> were previously preserving ignored files when they should not have been;
> a future commit will flip these to instead use a value of 0 to get the
> behavior we want.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  builtin/am.c        |  3 +++
>  builtin/checkout.c  | 11 ++---------
>  builtin/clone.c     |  2 ++
>  builtin/merge.c     |  2 ++
>  builtin/read-tree.c | 13 +++----------
>  builtin/reset.c     |  2 ++
>  builtin/stash.c     |  3 +++
>  merge-ort.c         |  8 +-------
>  merge-recursive.c   |  8 +-------
>  merge.c             |  8 +-------
>  reset.c             |  2 ++
>  sequencer.c         |  2 ++
>  unpack-trees.c      | 10 ++++++++++
>  unpack-trees.h      |  1 +
>  14 files changed, 35 insertions(+), 40 deletions(-)
>
> diff --git a/builtin/am.c b/builtin/am.c
> index e4a0ff9cd7c..1ee70692bc3 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1918,6 +1918,9 @@ static int fast_forward_to(struct tree *head, struct tree *remote, int reset)
>  	opts.update = 1;
>  	opts.merge = 1;
>  	opts.reset = reset;
> +	if (!reset)
> +		/* FIXME: Default should be to remove ignored files */
> +		opts.preserve_ignored = 1;
>  	opts.fn = twoway_merge;
>  	init_tree_desc(&t[0], head->buffer, head->size);
>  	init_tree_desc(&t[1], remote->buffer, remote->size);
> diff --git a/builtin/checkout.c b/builtin/checkout.c
> index 5335435d616..5e7957dd068 100644
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -648,6 +648,7 @@ static int reset_tree(struct tree *tree, const struct checkout_opts *o,
>  	opts.skip_unmerged = !worktree;
>  	opts.reset = 1;
>  	opts.merge = 1;
> +	opts.preserve_ignored = 0;
>  	opts.fn = oneway_merge;
>  	opts.verbose_update = o->show_progress;
>  	opts.src_index = &the_index;
> @@ -746,11 +747,7 @@ static int merge_working_tree(const struct checkout_opts *opts,
>  				       new_branch_info->commit ?
>  				       &new_branch_info->commit->object.oid :
>  				       &new_branch_info->oid, NULL);
> -		if (opts->overwrite_ignore) {
> -			topts.dir = xcalloc(1, sizeof(*topts.dir));
> -			topts.dir->flags |= DIR_SHOW_IGNORED;
> -			setup_standard_excludes(topts.dir);
> -		}
> +		topts.preserve_ignored = !opts->overwrite_ignore;
>  		tree = parse_tree_indirect(old_branch_info->commit ?
>  					   &old_branch_info->commit->object.oid :
>  					   the_hash_algo->empty_tree);
> @@ -760,10 +757,6 @@ static int merge_working_tree(const struct checkout_opts *opts,
>  		init_tree_desc(&trees[1], tree->buffer, tree->size);
>  
>  		ret = unpack_trees(2, trees, &topts);
> -		if (topts.dir) {
> -			dir_clear(topts.dir);
> -			FREE_AND_NULL(topts.dir);
> -		}
>  		clear_unpack_trees_porcelain(&topts);
>  		if (ret == -1) {
>  			/*
> diff --git a/builtin/clone.c b/builtin/clone.c
> index ff1d3d447a3..be1c3840d62 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -687,6 +687,8 @@ static int checkout(int submodule_progress)
>  	opts.update = 1;
>  	opts.merge = 1;
>  	opts.clone = 1;
> +	/* FIXME: Default should be to remove ignored files */
> +	opts.preserve_ignored = 1;
>  	opts.fn = oneway_merge;
>  	opts.verbose_update = (option_verbosity >= 0);
>  	opts.src_index = &the_index;
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 3fbdacc7db4..1e5fff095fc 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -680,6 +680,8 @@ static int read_tree_trivial(struct object_id *common, struct object_id *head,
>  	opts.verbose_update = 1;
>  	opts.trivial_merges_only = 1;
>  	opts.merge = 1;
> +	/* FIXME: Default should be to remove ignored files */
> +	opts.preserve_ignored = 1;
>  	trees[nr_trees] = parse_tree_indirect(common);
>  	if (!trees[nr_trees++])
>  		return -1;
> diff --git a/builtin/read-tree.c b/builtin/read-tree.c
> index 73cb957a69b..443d206eca6 100644
> --- a/builtin/read-tree.c
> +++ b/builtin/read-tree.c
> @@ -201,11 +201,9 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
>  	if ((opts.update || opts.index_only) && !opts.merge)
>  		die("%s is meaningless without -m, --reset, or --prefix",
>  		    opts.update ? "-u" : "-i");
> -	if (opts.update && !opts.reset) {
> -		CALLOC_ARRAY(opts.dir, 1);
> -		opts.dir->flags |= DIR_SHOW_IGNORED;
> -		setup_standard_excludes(opts.dir);
> -	}
> +	if (opts.update && !opts.reset)
> +		opts.preserve_ignored = 0;
> +	/* otherwise, opts.preserve_ignored is irrelevant */
>  	if (opts.merge && !opts.index_only)
>  		setup_work_tree();
>  
> @@ -245,11 +243,6 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
>  	if (unpack_trees(nr_trees, t, &opts))
>  		return 128;
>  
> -	if (opts.dir) {
> -		dir_clear(opts.dir);
> -		FREE_AND_NULL(opts.dir);
> -	}
> -
>  	if (opts.debug_unpack || opts.dry_run)
>  		return 0; /* do not write the index out */
>  
> diff --git a/builtin/reset.c b/builtin/reset.c
> index 51c9e2f43ff..7f38656f018 100644
> --- a/builtin/reset.c
> +++ b/builtin/reset.c
> @@ -67,6 +67,8 @@ static int reset_index(const char *ref, const struct object_id *oid, int reset_t
>  	case KEEP:
>  	case MERGE:
>  		opts.update = 1;
> +		/* FIXME: Default should be to remove ignored files */
> +		opts.preserve_ignored = 1;
>  		break;
>  	case HARD:
>  		opts.update = 1;
> diff --git a/builtin/stash.c b/builtin/stash.c
> index 8f42360ca91..88287b890d5 100644
> --- a/builtin/stash.c
> +++ b/builtin/stash.c
> @@ -258,6 +258,9 @@ static int reset_tree(struct object_id *i_tree, int update, int reset)
>  	opts.merge = 1;
>  	opts.reset = reset;
>  	opts.update = update;
> +	if (update && !reset)
> +		/* FIXME: Default should be to remove ignored files */
> +		opts.preserve_ignored = 1;
>  	opts.fn = oneway_merge;
>  
>  	if (unpack_trees(nr_trees, t, &opts))
> diff --git a/merge-ort.c b/merge-ort.c
> index 35aa979c3a4..0d64ec716bd 100644
> --- a/merge-ort.c
> +++ b/merge-ort.c
> @@ -4045,11 +4045,7 @@ static int checkout(struct merge_options *opt,
>  	unpack_opts.quiet = 0; /* FIXME: sequencer might want quiet? */
>  	unpack_opts.verbose_update = (opt->verbosity > 2);
>  	unpack_opts.fn = twoway_merge;
> -	if (1/* FIXME: opts->overwrite_ignore*/) {
> -		CALLOC_ARRAY(unpack_opts.dir, 1);
> -		unpack_opts.dir->flags |= DIR_SHOW_IGNORED;
> -		setup_standard_excludes(unpack_opts.dir);
> -	}
> +	unpack_opts.preserve_ignored = 0; /* FIXME: !opts->overwrite_ignore*/
>  	parse_tree(prev);
>  	init_tree_desc(&trees[0], prev->buffer, prev->size);
>  	parse_tree(next);
> @@ -4057,8 +4053,6 @@ static int checkout(struct merge_options *opt,
>  
>  	ret = unpack_trees(2, trees, &unpack_opts);
>  	clear_unpack_trees_porcelain(&unpack_opts);
> -	dir_clear(unpack_opts.dir);
> -	FREE_AND_NULL(unpack_opts.dir);
>  	return ret;
>  }
>  
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 233d9f686ad..2be3f5d4044 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -411,9 +411,7 @@ static int unpack_trees_start(struct merge_options *opt,
>  	else {
>  		opt->priv->unpack_opts.update = 1;
>  		/* FIXME: should only do this if !overwrite_ignore */
> -		CALLOC_ARRAY(opt->priv->unpack_opts.dir, 1);
> -		opt->priv->unpack_opts.dir->flags |= DIR_SHOW_IGNORED;
> -		setup_standard_excludes(opt->priv->unpack_opts.dir);
> +		opt->priv->unpack_opts.preserve_ignored = 0;
>  	}
>  	opt->priv->unpack_opts.merge = 1;
>  	opt->priv->unpack_opts.head_idx = 2;
> @@ -428,10 +426,6 @@ static int unpack_trees_start(struct merge_options *opt,
>  	init_tree_desc_from_tree(t+2, merge);
>  
>  	rc = unpack_trees(3, t, &opt->priv->unpack_opts);
> -	if (opt->priv->unpack_opts.dir) {
> -		dir_clear(opt->priv->unpack_opts.dir);
> -		FREE_AND_NULL(opt->priv->unpack_opts.dir);
> -	}
>  	cache_tree_free(&opt->repo->index->cache_tree);
>  
>  	/*
> diff --git a/merge.c b/merge.c
> index 6e736881d90..2382ff66d35 100644
> --- a/merge.c
> +++ b/merge.c
> @@ -53,7 +53,6 @@ int checkout_fast_forward(struct repository *r,
>  	struct unpack_trees_options opts;
>  	struct tree_desc t[MAX_UNPACK_TREES];
>  	int i, nr_trees = 0;
> -	struct dir_struct dir = DIR_INIT;
>  	struct lock_file lock_file = LOCK_INIT;
>  
>  	refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
> @@ -80,11 +79,7 @@ int checkout_fast_forward(struct repository *r,
>  	}
>  
>  	memset(&opts, 0, sizeof(opts));
> -	if (overwrite_ignore) {
> -		dir.flags |= DIR_SHOW_IGNORED;
> -		setup_standard_excludes(&dir);
> -		opts.dir = &dir;
> -	}
> +	opts.preserve_ignored = !overwrite_ignore;
>  
>  	opts.head_idx = 1;
>  	opts.src_index = r->index;
> @@ -101,7 +96,6 @@ int checkout_fast_forward(struct repository *r,
>  		clear_unpack_trees_porcelain(&opts);
>  		return -1;
>  	}
> -	dir_clear(&dir);
>  	clear_unpack_trees_porcelain(&opts);
>  
>  	if (write_locked_index(r->index, &lock_file, COMMIT_LOCK))
> diff --git a/reset.c b/reset.c
> index 79310ae071b..41b3e2d88de 100644
> --- a/reset.c
> +++ b/reset.c
> @@ -56,6 +56,8 @@ int reset_head(struct repository *r, struct object_id *oid, const char *action,
>  	unpack_tree_opts.fn = reset_hard ? oneway_merge : twoway_merge;
>  	unpack_tree_opts.update = 1;
>  	unpack_tree_opts.merge = 1;
> +	/* FIXME: Default should be to remove ignored files */
> +	unpack_tree_opts.preserve_ignored = 1;
>  	init_checkout_metadata(&unpack_tree_opts.meta, switch_to_branch, oid, NULL);
>  	if (!detach_head)
>  		unpack_tree_opts.reset = 1;
> diff --git a/sequencer.c b/sequencer.c
> index 614d56f5e21..098566c68d9 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -3699,6 +3699,8 @@ static int do_reset(struct repository *r,
>  	unpack_tree_opts.fn = oneway_merge;
>  	unpack_tree_opts.merge = 1;
>  	unpack_tree_opts.update = 1;
> +	/* FIXME: Default should be to remove ignored files */
> +	unpack_tree_opts.preserve_ignored = 1;
>  	init_checkout_metadata(&unpack_tree_opts.meta, name, &oid, NULL);
>  
>  	if (repo_read_index_unmerged(r)) {
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 8ea0a542da8..1e4eae1dc7d 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1707,6 +1707,12 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  		ensure_full_index(o->dst_index);
>  	}
>  
> +	if (!o->preserve_ignored) {
> +		CALLOC_ARRAY(o->dir, 1);
> +		o->dir->flags |= DIR_SHOW_IGNORED;
> +		setup_standard_excludes(o->dir);
> +	}
> +
>  	if (!core_apply_sparse_checkout || !o->update)
>  		o->skip_sparse_checkout = 1;
>  	if (!o->skip_sparse_checkout && !o->pl) {
> @@ -1868,6 +1874,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  done:
>  	if (free_pattern_list)
>  		clear_pattern_list(&pl);
> +	if (o->dir) {
> +		dir_clear(o->dir);
> +		FREE_AND_NULL(o->dir);
> +	}
>  	trace2_region_leave("unpack_trees", "unpack_trees", the_repository);
>  	trace_performance_leave("unpack_trees");
>  	return ret;
> diff --git a/unpack-trees.h b/unpack-trees.h
> index 2d88b19dca7..f98cfd49d7b 100644
> --- a/unpack-trees.h
> +++ b/unpack-trees.h
> @@ -49,6 +49,7 @@ struct unpack_trees_options {
>  	unsigned int reset,
>  		     merge,
>  		     update,
> +		     preserve_ignored,
>  		     clone,
>  		     index_only,
>  		     nontrivial_merge,
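
The pattern the commit message describes can be sketched in isolation. The following is a minimal illustration only: `toy_dir`, `toy_opts`, and `toy_unpack()` are hypothetical stand-ins, not Git's real types or functions, and the "work" is reduced to reporting which policy is in effect. It shows a flag whose zero value selects the desired default, with the callee doing the allocation and cleanup that callers previously repeated.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not Git's real types. */
struct toy_dir {
	unsigned show_ignored;
};

struct toy_opts {
	unsigned preserve_ignored; /* 0 (default): ignored files may be overwritten */
	struct toy_dir *dir;       /* managed by toy_unpack(), never by callers */
};

/*
 * Returns 1 if ignored files may be overwritten, 0 if they are
 * preserved, and -1 if allocation fails.
 */
static int toy_unpack(struct toy_opts *o)
{
	int may_overwrite_ignored;

	assert(!o->dir); /* callers must not set up dir themselves */

	if (!o->preserve_ignored) {
		o->dir = calloc(1, sizeof(*o->dir));
		if (!o->dir)
			return -1;
		o->dir->show_ignored = 1;
	}

	/* Stand-in for the real work: report which policy is in effect. */
	may_overwrite_ignored = o->dir && o->dir->show_ignored;

	free(o->dir); /* free(NULL) is a no-op */
	o->dir = NULL;
	return may_overwrite_ignored;
}
```

In this sketch a caller that zero-initializes `struct toy_opts` gets the "overwrite ignored files" behavior without writing any setup or teardown code, and a caller that wants the other behavior sets a single field.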

I think getting rid of the boilerplate makes sense, but it doesn't sound
from the commit message like you've considered just making that "struct
dir_struct *" member a "struct dir_struct" instead.

That simplifies things a lot, i.e. we can just DIR_INIT it, and don't
need every caller to malloc/free it.

Sometimes a pointer makes sense, but in this case the "struct
unpack_trees_options" can just own it.
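
As a rough illustration of what "owning" the member by value looks like, here is a small self-contained sketch. All names (`toy_dir`, `toy_opts`, `TOY_OPTS_INIT`, and the helper functions) are hypothetical and only mirror the shape of the idea; they are not Git's actual definitions. The `scratch` member is an assumption added so that the cleanup function has something to release.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not Git's real definitions. */
#define TOY_SHOW_IGNORED (1u << 0)

struct toy_dir {
	unsigned flags;
	char *scratch; /* heap memory released by toy_dir_clear() */
};
#define TOY_DIR_INIT { 0 }

struct toy_opts {
	int update;
	struct toy_dir dir; /* embedded by value: toy_opts owns it */
};
#define TOY_OPTS_INIT { .dir = TOY_DIR_INIT }

static void toy_dir_clear(struct toy_dir *dir)
{
	free(dir->scratch);
	memset(dir, 0, sizeof(*dir));
}

/* Single cleanup entry point for everything toy_opts owns. */
static void toy_opts_clear(struct toy_opts *opts)
{
	toy_dir_clear(&opts->dir);
}

/* Callers configure the embedded member directly; no malloc/free needed. */
static void toy_opts_overwrite_ignored(struct toy_opts *opts)
{
	opts->dir.flags |= TOY_SHOW_IGNORED;
}
```

With this layout a caller writes `struct toy_opts opts = TOY_OPTS_INIT;`, adjusts `opts.dir` in place, and calls `toy_opts_clear(&opts)` once at the end. There is no pointer to NULL-check and no separate allocation to leak.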

I had already implemented that as part of some unsubmitted WIP leak
fixes; the patch follows below.

I think the part of it that deals with managing the "struct dir_struct"
is much nicer, but you might still want to keep the "preserve_ignored"
you've added.

Oh, and I noticed that I removed the dir_clear() here but didn't add it
to clear_unpack_trees_porcelain(). That also needs to be done (I did it
in a later fix that I should squash in), but I haven't redone the diff
below just for that, since the point is how we manage the struct itself
(the freeing is rather trivial...).

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 8c69dcdf72a..632da036717 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -747,9 +747,8 @@ static int merge_working_tree(const struct checkout_opts *opts,
 				       &new_branch_info->commit->object.oid :
 				       &new_branch_info->oid, NULL);
 		if (opts->overwrite_ignore) {
-			topts.dir = xcalloc(1, sizeof(*topts.dir));
-			topts.dir->flags |= DIR_SHOW_IGNORED;
-			setup_standard_excludes(topts.dir);
+			topts.dir.flags |= DIR_SHOW_IGNORED;
+			setup_standard_excludes(&topts.dir);
 		}
 		tree = parse_tree_indirect(old_branch_info->commit ?
 					   &old_branch_info->commit->object.oid :
diff --git a/builtin/read-tree.c b/builtin/read-tree.c
index 485e7b04794..6d529c77c49 100644
--- a/builtin/read-tree.c
+++ b/builtin/read-tree.c
@@ -53,20 +53,17 @@ static int index_output_cb(const struct option *opt, const char *arg,
 static int exclude_per_directory_cb(const struct option *opt, const char *arg,
 				    int unset)
 {
-	struct dir_struct *dir;
 	struct unpack_trees_options *opts;
 
 	BUG_ON_OPT_NEG(unset);
 
 	opts = (struct unpack_trees_options *)opt->value;
 
-	if (opts->dir)
+	if (opts->dir.exclude_per_dir)
 		die("more than one --exclude-per-directory given.");
 
-	dir = xcalloc(1, sizeof(*opts->dir));
-	dir->flags |= DIR_SHOW_IGNORED;
-	dir->exclude_per_dir = arg;
-	opts->dir = dir;
+	opts->dir.flags |= DIR_SHOW_IGNORED;
+	opts->dir.exclude_per_dir = arg;
 	/* We do not need to nor want to do read-directory
 	 * here; we are merely interested in reusing the
 	 * per directory ignore stack mechanism.
@@ -209,7 +206,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 	if ((opts.update || opts.index_only) && !opts.merge)
 		die("%s is meaningless without -m, --reset, or --prefix",
 		    opts.update ? "-u" : "-i");
-	if ((opts.dir && !opts.update))
+	if ((opts.dir.exclude_per_dir && !opts.update))
 		die("--exclude-per-directory is meaningless unless -u");
 	if (opts.merge && !opts.index_only)
 		setup_work_tree();
diff --git a/merge-ort.c b/merge-ort.c
index 35aa979c3a4..e526b78b88d 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -4021,9 +4021,8 @@ static int checkout(struct merge_options *opt,
 	/* Switch the index/working copy from old to new */
 	int ret;
 	struct tree_desc trees[2];
-	struct unpack_trees_options unpack_opts;
+	struct unpack_trees_options unpack_opts = UNPACK_TREES_OPTIONS_INIT;
 
-	memset(&unpack_opts, 0, sizeof(unpack_opts));
 	unpack_opts.head_idx = -1;
 	unpack_opts.src_index = opt->repo->index;
 	unpack_opts.dst_index = opt->repo->index;
@@ -4046,9 +4045,8 @@ static int checkout(struct merge_options *opt,
 	unpack_opts.verbose_update = (opt->verbosity > 2);
 	unpack_opts.fn = twoway_merge;
 	if (1/* FIXME: opts->overwrite_ignore*/) {
-		CALLOC_ARRAY(unpack_opts.dir, 1);
-		unpack_opts.dir->flags |= DIR_SHOW_IGNORED;
-		setup_standard_excludes(unpack_opts.dir);
+		unpack_opts.dir.flags |= DIR_SHOW_IGNORED;
+		setup_standard_excludes(&unpack_opts.dir);
 	}
 	parse_tree(prev);
 	init_tree_desc(&trees[0], prev->buffer, prev->size);
@@ -4057,8 +4055,6 @@ static int checkout(struct merge_options *opt,
 
 	ret = unpack_trees(2, trees, &unpack_opts);
 	clear_unpack_trees_porcelain(&unpack_opts);
-	dir_clear(unpack_opts.dir);
-	FREE_AND_NULL(unpack_opts.dir);
 	return ret;
 }
 
diff --git a/merge.c b/merge.c
index 6e736881d90..9cb32990dd9 100644
--- a/merge.c
+++ b/merge.c
@@ -50,10 +50,9 @@ int checkout_fast_forward(struct repository *r,
 			  int overwrite_ignore)
 {
 	struct tree *trees[MAX_UNPACK_TREES];
-	struct unpack_trees_options opts;
+	struct unpack_trees_options opts = UNPACK_TREES_OPTIONS_INIT;
 	struct tree_desc t[MAX_UNPACK_TREES];
 	int i, nr_trees = 0;
-	struct dir_struct dir = DIR_INIT;
 	struct lock_file lock_file = LOCK_INIT;
 
 	refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
@@ -79,11 +78,9 @@ int checkout_fast_forward(struct repository *r,
 		init_tree_desc(t+i, trees[i]->buffer, trees[i]->size);
 	}
 
-	memset(&opts, 0, sizeof(opts));
 	if (overwrite_ignore) {
-		dir.flags |= DIR_SHOW_IGNORED;
-		setup_standard_excludes(&dir);
-		opts.dir = &dir;
+		opts.dir.flags |= DIR_SHOW_IGNORED;
+		setup_standard_excludes(&opts.dir);
 	}
 
 	opts.head_idx = 1;
@@ -101,7 +98,6 @@ int checkout_fast_forward(struct repository *r,
 		clear_unpack_trees_porcelain(&opts);
 		return -1;
 	}
-	dir_clear(&dir);
 	clear_unpack_trees_porcelain(&opts);
 
 	if (write_locked_index(r->index, &lock_file, COMMIT_LOCK))
diff --git a/unpack-trees.c b/unpack-trees.c
index 8ea0a542da8..33a2dc23ffc 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -2081,7 +2081,7 @@ static int verify_clean_subdirectory(const struct cache_entry *ce,
 	 */
 	int namelen;
 	int i;
-	struct dir_struct d;
+	struct dir_struct d = DIR_INIT;
 	char *pathbuf;
 	int cnt = 0;
 
@@ -2132,9 +2132,7 @@ static int verify_clean_subdirectory(const struct cache_entry *ce,
 	 */
 	pathbuf = xstrfmt("%.*s/", namelen, ce->name);
 
-	memset(&d, 0, sizeof(d));
-	if (o->dir)
-		d.exclude_per_dir = o->dir->exclude_per_dir;
+	d.exclude_per_dir = o->dir.exclude_per_dir;
 	i = read_directory(&d, o->src_index, pathbuf, namelen+1, NULL);
 	if (i)
 		return add_rejected_path(o, ERROR_NOT_UPTODATE_DIR, ce->name);
@@ -2175,8 +2173,7 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 	if (ignore_case && icase_exists(o, name, len, st))
 		return 0;
 
-	if (o->dir &&
-	    is_excluded(o->dir, o->src_index, name, &dtype))
+	if (is_excluded(&o->dir, o->src_index, name, &dtype))
 		/*
 		 * ce->name is explicitly excluded, so it is Ok to
 		 * overwrite it.
diff --git a/unpack-trees.h b/unpack-trees.h
index 2d88b19dca7..6fa6a4dfc3e 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -5,6 +5,7 @@
 #include "strvec.h"
 #include "string-list.h"
 #include "tree-walk.h"
+#include "dir.h"
 
 #define MAX_UNPACK_TREES MAX_TRAVERSE_TREES
 
@@ -66,7 +67,7 @@ struct unpack_trees_options {
 		     dry_run;
 	const char *prefix;
 	int cache_bottom;
-	struct dir_struct *dir;
+	struct dir_struct dir;
 	struct pathspec *pathspec;
 	merge_fn_t fn;
 	const char *msgs[NB_UNPACK_TREES_WARNING_TYPES];
@@ -90,6 +91,9 @@ struct unpack_trees_options {
 	struct pattern_list *pl; /* for internal use */
 	struct checkout_metadata meta;
 };
+#define UNPACK_TREES_OPTIONS_INIT { \
+	.dir = DIR_INIT, \
+}
 
 int unpack_trees(unsigned n, struct tree_desc *t,
 		 struct unpack_trees_options *options);

On the Git mailing list, Elijah Newren wrote (reply to this):

On Wed, Sep 29, 2021 at 2:27 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Mon, Sep 27 2021, Elijah Newren via GitGitGadget wrote:
>
> > From: Elijah Newren <newren@gmail.com>
> >
> > Currently, every caller of unpack_trees() that wants to ensure ignored
> > files are overwritten by default needs to:
> >    * allocate unpack_trees_options.dir
> >    * flip the DIR_SHOW_IGNORED flag in unpack_trees_options.dir->flags
> >    * call setup_standard_excludes
> > AND then after the call to unpack_trees() needs to
> >    * call dir_clear()
> >    * deallocate unpack_trees_options.dir
> > That's a fair amount of boilerplate, and every caller uses identical
> > code.  Make this easier by instead introducing a new boolean value where
> > the default value (0) does what we want so that new callers of
> > unpack_trees() automatically get the appropriate behavior.  And move all
> > the handling of unpack_trees_options.dir into unpack_trees() itself.
> >
> > While preserve_ignored = 0 is the behavior we feel is the appropriate
> > default, we defer fixing commands to use the appropriate default until a
> > later commit.  So, this commit introduces several locations where we
> > manually set preserve_ignored=1.  This makes it clear where code paths
> > were previously preserving ignored files when they should not have been;
> > a future commit will flip these to instead use a value of 0 to get the
> > behavior we want.
> >
> > Signed-off-by: Elijah Newren <newren@gmail.com>
> > ---
> >  builtin/am.c        |  3 +++
> >  builtin/checkout.c  | 11 ++---------
> >  builtin/clone.c     |  2 ++
> >  builtin/merge.c     |  2 ++
> >  builtin/read-tree.c | 13 +++----------
> >  builtin/reset.c     |  2 ++
> >  builtin/stash.c     |  3 +++
> >  merge-ort.c         |  8 +-------
> >  merge-recursive.c   |  8 +-------
> >  merge.c             |  8 +-------
> >  reset.c             |  2 ++
> >  sequencer.c         |  2 ++
> >  unpack-trees.c      | 10 ++++++++++
> >  unpack-trees.h      |  1 +
> >  14 files changed, 35 insertions(+), 40 deletions(-)
> >
> > diff --git a/builtin/am.c b/builtin/am.c
> > index e4a0ff9cd7c..1ee70692bc3 100644
> > --- a/builtin/am.c
> > +++ b/builtin/am.c
> > @@ -1918,6 +1918,9 @@ static int fast_forward_to(struct tree *head, struct tree *remote, int reset)
> >       opts.update = 1;
> >       opts.merge = 1;
> >       opts.reset = reset;
> > +     if (!reset)
> > +             /* FIXME: Default should be to remove ignored files */
> > +             opts.preserve_ignored = 1;
> >       opts.fn = twoway_merge;
> >       init_tree_desc(&t[0], head->buffer, head->size);
> >       init_tree_desc(&t[1], remote->buffer, remote->size);
> > diff --git a/builtin/checkout.c b/builtin/checkout.c
> > index 5335435d616..5e7957dd068 100644
> > --- a/builtin/checkout.c
> > +++ b/builtin/checkout.c
> > @@ -648,6 +648,7 @@ static int reset_tree(struct tree *tree, const struct checkout_opts *o,
> >       opts.skip_unmerged = !worktree;
> >       opts.reset = 1;
> >       opts.merge = 1;
> > +     opts.preserve_ignored = 0;
> >       opts.fn = oneway_merge;
> >       opts.verbose_update = o->show_progress;
> >       opts.src_index = &the_index;
> > @@ -746,11 +747,7 @@ static int merge_working_tree(const struct checkout_opts *opts,
> >                                      new_branch_info->commit ?
> >                                      &new_branch_info->commit->object.oid :
> >                                      &new_branch_info->oid, NULL);
> > -             if (opts->overwrite_ignore) {
> > -                     topts.dir = xcalloc(1, sizeof(*topts.dir));
> > -                     topts.dir->flags |= DIR_SHOW_IGNORED;
> > -                     setup_standard_excludes(topts.dir);
> > -             }
> > +             topts.preserve_ignored = !opts->overwrite_ignore;
> >               tree = parse_tree_indirect(old_branch_info->commit ?
> >                                          &old_branch_info->commit->object.oid :
> >                                          the_hash_algo->empty_tree);
> > @@ -760,10 +757,6 @@ static int merge_working_tree(const struct checkout_opts *opts,
> >               init_tree_desc(&trees[1], tree->buffer, tree->size);
> >
> >               ret = unpack_trees(2, trees, &topts);
> > -             if (topts.dir) {
> > -                     dir_clear(topts.dir);
> > -                     FREE_AND_NULL(topts.dir);
> > -             }
> >               clear_unpack_trees_porcelain(&topts);
> >               if (ret == -1) {
> >                       /*
> > diff --git a/builtin/clone.c b/builtin/clone.c
> > index ff1d3d447a3..be1c3840d62 100644
> > --- a/builtin/clone.c
> > +++ b/builtin/clone.c
> > @@ -687,6 +687,8 @@ static int checkout(int submodule_progress)
> >       opts.update = 1;
> >       opts.merge = 1;
> >       opts.clone = 1;
> > +     /* FIXME: Default should be to remove ignored files */
> > +     opts.preserve_ignored = 1;
> >       opts.fn = oneway_merge;
> >       opts.verbose_update = (option_verbosity >= 0);
> >       opts.src_index = &the_index;
> > diff --git a/builtin/merge.c b/builtin/merge.c
> > index 3fbdacc7db4..1e5fff095fc 100644
> > --- a/builtin/merge.c
> > +++ b/builtin/merge.c
> > @@ -680,6 +680,8 @@ static int read_tree_trivial(struct object_id *common, struct object_id *head,
> >       opts.verbose_update = 1;
> >       opts.trivial_merges_only = 1;
> >       opts.merge = 1;
> > +     /* FIXME: Default should be to remove ignored files */
> > +     opts.preserve_ignored = 1;
> >       trees[nr_trees] = parse_tree_indirect(common);
> >       if (!trees[nr_trees++])
> >               return -1;
> > diff --git a/builtin/read-tree.c b/builtin/read-tree.c
> > index 73cb957a69b..443d206eca6 100644
> > --- a/builtin/read-tree.c
> > +++ b/builtin/read-tree.c
> > @@ -201,11 +201,9 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
> >       if ((opts.update || opts.index_only) && !opts.merge)
> >               die("%s is meaningless without -m, --reset, or --prefix",
> >                   opts.update ? "-u" : "-i");
> > -     if (opts.update && !opts.reset) {
> > -             CALLOC_ARRAY(opts.dir, 1);
> > -             opts.dir->flags |= DIR_SHOW_IGNORED;
> > -             setup_standard_excludes(opts.dir);
> > -     }
> > +     if (opts.update && !opts.reset)
> > +             opts.preserve_ignored = 0;
> > +     /* otherwise, opts.preserve_ignored is irrelevant */
> >       if (opts.merge && !opts.index_only)
> >               setup_work_tree();
> >
> > @@ -245,11 +243,6 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
> >       if (unpack_trees(nr_trees, t, &opts))
> >               return 128;
> >
> > -     if (opts.dir) {
> > -             dir_clear(opts.dir);
> > -             FREE_AND_NULL(opts.dir);
> > -     }
> > -
> >       if (opts.debug_unpack || opts.dry_run)
> >               return 0; /* do not write the index out */
> >
> > diff --git a/builtin/reset.c b/builtin/reset.c
> > index 51c9e2f43ff..7f38656f018 100644
> > --- a/builtin/reset.c
> > +++ b/builtin/reset.c
> > @@ -67,6 +67,8 @@ static int reset_index(const char *ref, const struct object_id *oid, int reset_t
> >       case KEEP:
> >       case MERGE:
> >               opts.update = 1;
> > +             /* FIXME: Default should be to remove ignored files */
> > +             opts.preserve_ignored = 1;
> >               break;
> >       case HARD:
> >               opts.update = 1;
> > diff --git a/builtin/stash.c b/builtin/stash.c
> > index 8f42360ca91..88287b890d5 100644
> > --- a/builtin/stash.c
> > +++ b/builtin/stash.c
> > @@ -258,6 +258,9 @@ static int reset_tree(struct object_id *i_tree, int update, int reset)
> >       opts.merge = 1;
> >       opts.reset = reset;
> >       opts.update = update;
> > +     if (update && !reset)
> > +             /* FIXME: Default should be to remove ignored files */
> > +             opts.preserve_ignored = 1;
> >       opts.fn = oneway_merge;
> >
> >       if (unpack_trees(nr_trees, t, &opts))
> > diff --git a/merge-ort.c b/merge-ort.c
> > index 35aa979c3a4..0d64ec716bd 100644
> > --- a/merge-ort.c
> > +++ b/merge-ort.c
> > @@ -4045,11 +4045,7 @@ static int checkout(struct merge_options *opt,
> >       unpack_opts.quiet = 0; /* FIXME: sequencer might want quiet? */
> >       unpack_opts.verbose_update = (opt->verbosity > 2);
> >       unpack_opts.fn = twoway_merge;
> > -     if (1/* FIXME: opts->overwrite_ignore*/) {
> > -             CALLOC_ARRAY(unpack_opts.dir, 1);
> > -             unpack_opts.dir->flags |= DIR_SHOW_IGNORED;
> > -             setup_standard_excludes(unpack_opts.dir);
> > -     }
> > +     unpack_opts.preserve_ignored = 0; /* FIXME: !opts->overwrite_ignore*/
> >       parse_tree(prev);
> >       init_tree_desc(&trees[0], prev->buffer, prev->size);
> >       parse_tree(next);
> > @@ -4057,8 +4053,6 @@ static int checkout(struct merge_options *opt,
> >
> >       ret = unpack_trees(2, trees, &unpack_opts);
> >       clear_unpack_trees_porcelain(&unpack_opts);
> > -     dir_clear(unpack_opts.dir);
> > -     FREE_AND_NULL(unpack_opts.dir);
> >       return ret;
> >  }
> >
> > diff --git a/merge-recursive.c b/merge-recursive.c
> > index 233d9f686ad..2be3f5d4044 100644
> > --- a/merge-recursive.c
> > +++ b/merge-recursive.c
> > @@ -411,9 +411,7 @@ static int unpack_trees_start(struct merge_options *opt,
> >       else {
> >               opt->priv->unpack_opts.update = 1;
> >               /* FIXME: should only do this if !overwrite_ignore */
> > -             CALLOC_ARRAY(opt->priv->unpack_opts.dir, 1);
> > -             opt->priv->unpack_opts.dir->flags |= DIR_SHOW_IGNORED;
> > -             setup_standard_excludes(opt->priv->unpack_opts.dir);
> > +             opt->priv->unpack_opts.preserve_ignored = 0;
> >       }
> >       opt->priv->unpack_opts.merge = 1;
> >       opt->priv->unpack_opts.head_idx = 2;
> > @@ -428,10 +426,6 @@ static int unpack_trees_start(struct merge_options *opt,
> >       init_tree_desc_from_tree(t+2, merge);
> >
> >       rc = unpack_trees(3, t, &opt->priv->unpack_opts);
> > -     if (opt->priv->unpack_opts.dir) {
> > -             dir_clear(opt->priv->unpack_opts.dir);
> > -             FREE_AND_NULL(opt->priv->unpack_opts.dir);
> > -     }
> >       cache_tree_free(&opt->repo->index->cache_tree);
> >
> >       /*
> > diff --git a/merge.c b/merge.c
> > index 6e736881d90..2382ff66d35 100644
> > --- a/merge.c
> > +++ b/merge.c
> > @@ -53,7 +53,6 @@ int checkout_fast_forward(struct repository *r,
> >       struct unpack_trees_options opts;
> >       struct tree_desc t[MAX_UNPACK_TREES];
> >       int i, nr_trees = 0;
> > -     struct dir_struct dir = DIR_INIT;
> >       struct lock_file lock_file = LOCK_INIT;
> >
> >       refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
> > @@ -80,11 +79,7 @@ int checkout_fast_forward(struct repository *r,
> >       }
> >
> >       memset(&opts, 0, sizeof(opts));
> > -     if (overwrite_ignore) {
> > -             dir.flags |= DIR_SHOW_IGNORED;
> > -             setup_standard_excludes(&dir);
> > -             opts.dir = &dir;
> > -     }
> > +     opts.preserve_ignored = !overwrite_ignore;
> >
> >       opts.head_idx = 1;
> >       opts.src_index = r->index;
> > @@ -101,7 +96,6 @@ int checkout_fast_forward(struct repository *r,
> >               clear_unpack_trees_porcelain(&opts);
> >               return -1;
> >       }
> > -     dir_clear(&dir);
> >       clear_unpack_trees_porcelain(&opts);
> >
> >       if (write_locked_index(r->index, &lock_file, COMMIT_LOCK))
> > diff --git a/reset.c b/reset.c
> > index 79310ae071b..41b3e2d88de 100644
> > --- a/reset.c
> > +++ b/reset.c
> > @@ -56,6 +56,8 @@ int reset_head(struct repository *r, struct object_id *oid, const char *action,
> >       unpack_tree_opts.fn = reset_hard ? oneway_merge : twoway_merge;
> >       unpack_tree_opts.update = 1;
> >       unpack_tree_opts.merge = 1;
> > +     /* FIXME: Default should be to remove ignored files */
> > +     unpack_tree_opts.preserve_ignored = 1;
> >       init_checkout_metadata(&unpack_tree_opts.meta, switch_to_branch, oid, NULL);
> >       if (!detach_head)
> >               unpack_tree_opts.reset = 1;
> > diff --git a/sequencer.c b/sequencer.c
> > index 614d56f5e21..098566c68d9 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -3699,6 +3699,8 @@ static int do_reset(struct repository *r,
> >       unpack_tree_opts.fn = oneway_merge;
> >       unpack_tree_opts.merge = 1;
> >       unpack_tree_opts.update = 1;
> > +     /* FIXME: Default should be to remove ignored files */
> > +     unpack_tree_opts.preserve_ignored = 1;
> >       init_checkout_metadata(&unpack_tree_opts.meta, name, &oid, NULL);
> >
> >       if (repo_read_index_unmerged(r)) {
> > diff --git a/unpack-trees.c b/unpack-trees.c
> > index 8ea0a542da8..1e4eae1dc7d 100644
> > --- a/unpack-trees.c
> > +++ b/unpack-trees.c
> > @@ -1707,6 +1707,12 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
> >               ensure_full_index(o->dst_index);
> >       }
> >
> > +     if (!o->preserve_ignored) {
> > +             CALLOC_ARRAY(o->dir, 1);
> > +             o->dir->flags |= DIR_SHOW_IGNORED;
> > +             setup_standard_excludes(o->dir);
> > +     }
> > +
> >       if (!core_apply_sparse_checkout || !o->update)
> >               o->skip_sparse_checkout = 1;
> >       if (!o->skip_sparse_checkout && !o->pl) {
> > @@ -1868,6 +1874,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
> >  done:
> >       if (free_pattern_list)
> >               clear_pattern_list(&pl);
> > +     if (o->dir) {
> > +             dir_clear(o->dir);
> > +             FREE_AND_NULL(o->dir);
> > +     }
> >       trace2_region_leave("unpack_trees", "unpack_trees", the_repository);
> >       trace_performance_leave("unpack_trees");
> >       return ret;
> > diff --git a/unpack-trees.h b/unpack-trees.h
> > index 2d88b19dca7..f98cfd49d7b 100644
> > --- a/unpack-trees.h
> > +++ b/unpack-trees.h
> > @@ -49,6 +49,7 @@ struct unpack_trees_options {
> >       unsigned int reset,
> >                    merge,
> >                    update,
> > +                  preserve_ignored,
> >                    clone,
> >                    index_only,
> >                    nontrivial_merge,
>
> I think getting rid of the boilerplate makes sense, but it doesn't sound
> from the commit message like you've considered just making that "struct
> dir*" member a "struct dir" instead.
>
> That simplifies things a lot, i.e. we can just DIR_INIT it, and don't
> need every caller to malloc/free it.

See the next patch in the series.  :-)

> Sometimes a pointer makes sense, but in this case the "struct
> unpack_trees_options" can just own it.

I did make it internal to unpack_trees_options in the next patch, but
kept it as a pointer just because that let me know whether it was used
or not.  I guess I could have added a boolean as well.  But I don't
actually allocate anything, because it's either a NULL pointer, or a
pointer to something on the stack.  So, I do get to just use DIR_INIT.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):


On Wed, Sep 29 2021, Elijah Newren wrote:

> On Wed, Sep 29, 2021 at 2:27 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> On Mon, Sep 27 2021, Elijah Newren via GitGitGadget wrote:
>>
>> > From: Elijah Newren <newren@gmail.com>
>> >
>> > Currently, every caller of unpack_trees() that wants to ensure ignored
>> > files are overwritten by default needs to:
>> >    * allocate unpack_trees_options.dir
>> >    * flip the DIR_SHOW_IGNORED flag in unpack_trees_options.dir->flags
>> >    * call setup_standard_excludes
>> > AND then after the call to unpack_trees() needs to
>> >    * call dir_clear()
>> >    * deallocate unpack_trees_options.dir
>> > That's a fair amount of boilerplate, and every caller uses identical
>> > code.  Make this easier by instead introducing a new boolean value where
>> > the default value (0) does what we want so that new callers of
>> > unpack_trees() automatically get the appropriate behavior.  And move all
>> > the handling of unpack_trees_options.dir into unpack_trees() itself.
>> >
>> > While preserve_ignored = 0 is the behavior we feel is the appropriate
>> > default, we defer fixing commands to use the appropriate default until a
>> > later commit.  So, this commit introduces several locations where we
>> > manually set preserve_ignored=1.  This makes it clear where code paths
>> > were previously preserving ignored files when they should not have been;
>> > a future commit will flip these to instead use a value of 0 to get the
>> > behavior we want.
>> >
>> > Signed-off-by: Elijah Newren <newren@gmail.com>
>> > ---
>> >  builtin/am.c        |  3 +++
>> >  builtin/checkout.c  | 11 ++---------
>> >  builtin/clone.c     |  2 ++
>> >  builtin/merge.c     |  2 ++
>> >  builtin/read-tree.c | 13 +++----------
>> >  builtin/reset.c     |  2 ++
>> >  builtin/stash.c     |  3 +++
>> >  merge-ort.c         |  8 +-------
>> >  merge-recursive.c   |  8 +-------
>> >  merge.c             |  8 +-------
>> >  reset.c             |  2 ++
>> >  sequencer.c         |  2 ++
>> >  unpack-trees.c      | 10 ++++++++++
>> >  unpack-trees.h      |  1 +
>> >  14 files changed, 35 insertions(+), 40 deletions(-)
>> >
>> > diff --git a/builtin/am.c b/builtin/am.c
>> > index e4a0ff9cd7c..1ee70692bc3 100644
>> > --- a/builtin/am.c
>> > +++ b/builtin/am.c
>> > @@ -1918,6 +1918,9 @@ static int fast_forward_to(struct tree *head, struct tree *remote, int reset)
>> >       opts.update = 1;
>> >       opts.merge = 1;
>> >       opts.reset = reset;
>> > +     if (!reset)
>> > +             /* FIXME: Default should be to remove ignored files */
>> > +             opts.preserve_ignored = 1;
>> >       opts.fn = twoway_merge;
>> >       init_tree_desc(&t[0], head->buffer, head->size);
>> >       init_tree_desc(&t[1], remote->buffer, remote->size);
>> > diff --git a/builtin/checkout.c b/builtin/checkout.c
>> > index 5335435d616..5e7957dd068 100644
>> > --- a/builtin/checkout.c
>> > +++ b/builtin/checkout.c
>> > @@ -648,6 +648,7 @@ static int reset_tree(struct tree *tree, const struct checkout_opts *o,
>> >       opts.skip_unmerged = !worktree;
>> >       opts.reset = 1;
>> >       opts.merge = 1;
>> > +     opts.preserve_ignored = 0;
>> >       opts.fn = oneway_merge;
>> >       opts.verbose_update = o->show_progress;
>> >       opts.src_index = &the_index;
>> > @@ -746,11 +747,7 @@ static int merge_working_tree(const struct checkout_opts *opts,
>> >                                      new_branch_info->commit ?
>> >                                      &new_branch_info->commit->object.oid :
>> >                                      &new_branch_info->oid, NULL);
>> > -             if (opts->overwrite_ignore) {
>> > -                     topts.dir = xcalloc(1, sizeof(*topts.dir));
>> > -                     topts.dir->flags |= DIR_SHOW_IGNORED;
>> > -                     setup_standard_excludes(topts.dir);
>> > -             }
>> > +             topts.preserve_ignored = !opts->overwrite_ignore;
>> >               tree = parse_tree_indirect(old_branch_info->commit ?
>> >                                          &old_branch_info->commit->object.oid :
>> >                                          the_hash_algo->empty_tree);
>> > @@ -760,10 +757,6 @@ static int merge_working_tree(const struct checkout_opts *opts,
>> >               init_tree_desc(&trees[1], tree->buffer, tree->size);
>> >
>> >               ret = unpack_trees(2, trees, &topts);
>> > -             if (topts.dir) {
>> > -                     dir_clear(topts.dir);
>> > -                     FREE_AND_NULL(topts.dir);
>> > -             }
>> >               clear_unpack_trees_porcelain(&topts);
>> >               if (ret == -1) {
>> >                       /*
>> > diff --git a/builtin/clone.c b/builtin/clone.c
>> > index ff1d3d447a3..be1c3840d62 100644
>> > --- a/builtin/clone.c
>> > +++ b/builtin/clone.c
>> > @@ -687,6 +687,8 @@ static int checkout(int submodule_progress)
>> >       opts.update = 1;
>> >       opts.merge = 1;
>> >       opts.clone = 1;
>> > +     /* FIXME: Default should be to remove ignored files */
>> > +     opts.preserve_ignored = 1;
>> >       opts.fn = oneway_merge;
>> >       opts.verbose_update = (option_verbosity >= 0);
>> >       opts.src_index = &the_index;
>> > diff --git a/builtin/merge.c b/builtin/merge.c
>> > index 3fbdacc7db4..1e5fff095fc 100644
>> > --- a/builtin/merge.c
>> > +++ b/builtin/merge.c
>> > @@ -680,6 +680,8 @@ static int read_tree_trivial(struct object_id *common, struct object_id *head,
>> >       opts.verbose_update = 1;
>> >       opts.trivial_merges_only = 1;
>> >       opts.merge = 1;
>> > +     /* FIXME: Default should be to remove ignored files */
>> > +     opts.preserve_ignored = 1;
>> >       trees[nr_trees] = parse_tree_indirect(common);
>> >       if (!trees[nr_trees++])
>> >               return -1;
>> > diff --git a/builtin/read-tree.c b/builtin/read-tree.c
>> > index 73cb957a69b..443d206eca6 100644
>> > --- a/builtin/read-tree.c
>> > +++ b/builtin/read-tree.c
>> > @@ -201,11 +201,9 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
>> >       if ((opts.update || opts.index_only) && !opts.merge)
>> >               die("%s is meaningless without -m, --reset, or --prefix",
>> >                   opts.update ? "-u" : "-i");
>> > -     if (opts.update && !opts.reset) {
>> > -             CALLOC_ARRAY(opts.dir, 1);
>> > -             opts.dir->flags |= DIR_SHOW_IGNORED;
>> > -             setup_standard_excludes(opts.dir);
>> > -     }
>> > +     if (opts.update && !opts.reset)
>> > +             opts.preserve_ignored = 0;
>> > +     /* otherwise, opts.preserve_ignored is irrelevant */
>> >       if (opts.merge && !opts.index_only)
>> >               setup_work_tree();
>> >
>> > @@ -245,11 +243,6 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
>> >       if (unpack_trees(nr_trees, t, &opts))
>> >               return 128;
>> >
>> > -     if (opts.dir) {
>> > -             dir_clear(opts.dir);
>> > -             FREE_AND_NULL(opts.dir);
>> > -     }
>> > -
>> >       if (opts.debug_unpack || opts.dry_run)
>> >               return 0; /* do not write the index out */
>> >
>> > diff --git a/builtin/reset.c b/builtin/reset.c
>> > index 51c9e2f43ff..7f38656f018 100644
>> > --- a/builtin/reset.c
>> > +++ b/builtin/reset.c
>> > @@ -67,6 +67,8 @@ static int reset_index(const char *ref, const struct object_id *oid, int reset_t
>> >       case KEEP:
>> >       case MERGE:
>> >               opts.update = 1;
>> > +             /* FIXME: Default should be to remove ignored files */
>> > +             opts.preserve_ignored = 1;
>> >               break;
>> >       case HARD:
>> >               opts.update = 1;
>> > diff --git a/builtin/stash.c b/builtin/stash.c
>> > index 8f42360ca91..88287b890d5 100644
>> > --- a/builtin/stash.c
>> > +++ b/builtin/stash.c
>> > @@ -258,6 +258,9 @@ static int reset_tree(struct object_id *i_tree, int update, int reset)
>> >       opts.merge = 1;
>> >       opts.reset = reset;
>> >       opts.update = update;
>> > +     if (update && !reset)
>> > +             /* FIXME: Default should be to remove ignored files */
>> > +             opts.preserve_ignored = 1;
>> >       opts.fn = oneway_merge;
>> >
>> >       if (unpack_trees(nr_trees, t, &opts))
>> > diff --git a/merge-ort.c b/merge-ort.c
>> > index 35aa979c3a4..0d64ec716bd 100644
>> > --- a/merge-ort.c
>> > +++ b/merge-ort.c
>> > @@ -4045,11 +4045,7 @@ static int checkout(struct merge_options *opt,
>> >       unpack_opts.quiet = 0; /* FIXME: sequencer might want quiet? */
>> >       unpack_opts.verbose_update = (opt->verbosity > 2);
>> >       unpack_opts.fn = twoway_merge;
>> > -     if (1/* FIXME: opts->overwrite_ignore*/) {
>> > -             CALLOC_ARRAY(unpack_opts.dir, 1);
>> > -             unpack_opts.dir->flags |= DIR_SHOW_IGNORED;
>> > -             setup_standard_excludes(unpack_opts.dir);
>> > -     }
>> > +     unpack_opts.preserve_ignored = 0; /* FIXME: !opts->overwrite_ignore*/
>> >       parse_tree(prev);
>> >       init_tree_desc(&trees[0], prev->buffer, prev->size);
>> >       parse_tree(next);
>> > @@ -4057,8 +4053,6 @@ static int checkout(struct merge_options *opt,
>> >
>> >       ret = unpack_trees(2, trees, &unpack_opts);
>> >       clear_unpack_trees_porcelain(&unpack_opts);
>> > -     dir_clear(unpack_opts.dir);
>> > -     FREE_AND_NULL(unpack_opts.dir);
>> >       return ret;
>> >  }
>> >
>> > diff --git a/merge-recursive.c b/merge-recursive.c
>> > index 233d9f686ad..2be3f5d4044 100644
>> > --- a/merge-recursive.c
>> > +++ b/merge-recursive.c
>> > @@ -411,9 +411,7 @@ static int unpack_trees_start(struct merge_options *opt,
>> >       else {
>> >               opt->priv->unpack_opts.update = 1;
>> >               /* FIXME: should only do this if !overwrite_ignore */
>> > -             CALLOC_ARRAY(opt->priv->unpack_opts.dir, 1);
>> > -             opt->priv->unpack_opts.dir->flags |= DIR_SHOW_IGNORED;
>> > -             setup_standard_excludes(opt->priv->unpack_opts.dir);
>> > +             opt->priv->unpack_opts.preserve_ignored = 0;
>> >       }
>> >       opt->priv->unpack_opts.merge = 1;
>> >       opt->priv->unpack_opts.head_idx = 2;
>> > @@ -428,10 +426,6 @@ static int unpack_trees_start(struct merge_options *opt,
>> >       init_tree_desc_from_tree(t+2, merge);
>> >
>> >       rc = unpack_trees(3, t, &opt->priv->unpack_opts);
>> > -     if (opt->priv->unpack_opts.dir) {
>> > -             dir_clear(opt->priv->unpack_opts.dir);
>> > -             FREE_AND_NULL(opt->priv->unpack_opts.dir);
>> > -     }
>> >       cache_tree_free(&opt->repo->index->cache_tree);
>> >
>> >       /*
>> > diff --git a/merge.c b/merge.c
>> > index 6e736881d90..2382ff66d35 100644
>> > --- a/merge.c
>> > +++ b/merge.c
>> > @@ -53,7 +53,6 @@ int checkout_fast_forward(struct repository *r,
>> >       struct unpack_trees_options opts;
>> >       struct tree_desc t[MAX_UNPACK_TREES];
>> >       int i, nr_trees = 0;
>> > -     struct dir_struct dir = DIR_INIT;
>> >       struct lock_file lock_file = LOCK_INIT;
>> >
>> >       refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
>> > @@ -80,11 +79,7 @@ int checkout_fast_forward(struct repository *r,
>> >       }
>> >
>> >       memset(&opts, 0, sizeof(opts));
>> > -     if (overwrite_ignore) {
>> > -             dir.flags |= DIR_SHOW_IGNORED;
>> > -             setup_standard_excludes(&dir);
>> > -             opts.dir = &dir;
>> > -     }
>> > +     opts.preserve_ignored = !overwrite_ignore;
>> >
>> >       opts.head_idx = 1;
>> >       opts.src_index = r->index;
>> > @@ -101,7 +96,6 @@ int checkout_fast_forward(struct repository *r,
>> >               clear_unpack_trees_porcelain(&opts);
>> >               return -1;
>> >       }
>> > -     dir_clear(&dir);
>> >       clear_unpack_trees_porcelain(&opts);
>> >
>> >       if (write_locked_index(r->index, &lock_file, COMMIT_LOCK))
>> > diff --git a/reset.c b/reset.c
>> > index 79310ae071b..41b3e2d88de 100644
>> > --- a/reset.c
>> > +++ b/reset.c
>> > @@ -56,6 +56,8 @@ int reset_head(struct repository *r, struct object_id *oid, const char *action,
>> >       unpack_tree_opts.fn = reset_hard ? oneway_merge : twoway_merge;
>> >       unpack_tree_opts.update = 1;
>> >       unpack_tree_opts.merge = 1;
>> > +     /* FIXME: Default should be to remove ignored files */
>> > +     unpack_tree_opts.preserve_ignored = 1;
>> >       init_checkout_metadata(&unpack_tree_opts.meta, switch_to_branch, oid, NULL);
>> >       if (!detach_head)
>> >               unpack_tree_opts.reset = 1;
>> > diff --git a/sequencer.c b/sequencer.c
>> > index 614d56f5e21..098566c68d9 100644
>> > --- a/sequencer.c
>> > +++ b/sequencer.c
>> > @@ -3699,6 +3699,8 @@ static int do_reset(struct repository *r,
>> >       unpack_tree_opts.fn = oneway_merge;
>> >       unpack_tree_opts.merge = 1;
>> >       unpack_tree_opts.update = 1;
>> > +     /* FIXME: Default should be to remove ignored files */
>> > +     unpack_tree_opts.preserve_ignored = 1;
>> >       init_checkout_metadata(&unpack_tree_opts.meta, name, &oid, NULL);
>> >
>> >       if (repo_read_index_unmerged(r)) {
>> > diff --git a/unpack-trees.c b/unpack-trees.c
>> > index 8ea0a542da8..1e4eae1dc7d 100644
>> > --- a/unpack-trees.c
>> > +++ b/unpack-trees.c
>> > @@ -1707,6 +1707,12 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>> >               ensure_full_index(o->dst_index);
>> >       }
>> >
>> > +     if (!o->preserve_ignored) {
>> > +             CALLOC_ARRAY(o->dir, 1);
>> > +             o->dir->flags |= DIR_SHOW_IGNORED;
>> > +             setup_standard_excludes(o->dir);
>> > +     }
>> > +
>> >       if (!core_apply_sparse_checkout || !o->update)
>> >               o->skip_sparse_checkout = 1;
>> >       if (!o->skip_sparse_checkout && !o->pl) {
>> > @@ -1868,6 +1874,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>> >  done:
>> >       if (free_pattern_list)
>> >               clear_pattern_list(&pl);
>> > +     if (o->dir) {
>> > +             dir_clear(o->dir);
>> > +             FREE_AND_NULL(o->dir);
>> > +     }
>> >       trace2_region_leave("unpack_trees", "unpack_trees", the_repository);
>> >       trace_performance_leave("unpack_trees");
>> >       return ret;
>> > diff --git a/unpack-trees.h b/unpack-trees.h
>> > index 2d88b19dca7..f98cfd49d7b 100644
>> > --- a/unpack-trees.h
>> > +++ b/unpack-trees.h
>> > @@ -49,6 +49,7 @@ struct unpack_trees_options {
>> >       unsigned int reset,
>> >                    merge,
>> >                    update,
>> > +                  preserve_ignored,
>> >                    clone,
>> >                    index_only,
>> >                    nontrivial_merge,
>>
>> I think getting rid of the boilerplate makes sense, but it doesn't sound
>> from the commit message like you've considered just making that "struct
>> dir*" member a "struct dir" instead.
>>
>> That simplifies things a lot, i.e. we can just DIR_INIT it, and don't
>> need every caller to malloc/free it.
>
> See the next patch in the series.  :-)

Ah!

>> Sometimes a pointer makes sense, but in this case the "struct
>> unpack_trees_options" can just own it.
>
> I did make it internal to unpack_trees_options in the next patch, but
> kept it as a pointer just because that let me know whether it was used
> or not.  I guess I could have added a boolean as well.  But I don't
> actually allocate anything, because it's either a NULL pointer, or a
> pointer to something on the stack.  So, I do get to just use DIR_INIT.

I think I'm probably missing something. I just made it allocated on the
stack by the caller using "struct unpack_trees_options", but then you
end up having a dir* in the struct, but that's only filled in as a
pointer to the stack variable? Maybe there's some subtlety I'm missing
here...

On the Git mailing list, Elijah Newren wrote (reply to this):

On Wed, Sep 29, 2021 at 11:32 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Wed, Sep 29 2021, Elijah Newren wrote:
>
> > On Wed, Sep 29, 2021 at 2:27 AM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >>
...
> >>
> >> I think getting rid of the boilerplate makes sense, but it doesn't sound
> >> from the commit message like you've considered just making that "struct
> >> dir*" member a "struct dir" instead.
> >>
> >> That simplifies things a lot, i.e. we can just DIR_INIT it, and don't
> >> need every caller to malloc/free it.
> >
> > See the next patch in the series.  :-)
>
> Ah!
>
> >> Sometimes a pointer makes sense, but in this case the "struct
> >> unpack_trees_options" can just own it.
> >
> > I did make it internal to unpack_trees_options in the next patch, but
> > kept it as a pointer just because that let me know whether it was used
> > or not.  I guess I could have added a boolean as well.  But I don't
> > actually allocate anything, because it's either a NULL pointer, or a
> > pointer to something on the stack.  So, I do get to just use DIR_INIT.
>
> I think I'm probably missing something. I just made it allocated on the
> stack by the caller using "struct unpack_trees_options", but then you
> end up having a dir* in the struct, but that's only filled in as a
> pointer to the stack variable? Maybe there's some subtlety I'm missing
> here...

As per the next patch:

int unpack_trees(..., struct unpack_trees_options *o)
{
    struct dir_struct dir = DIR_INIT;
    ...
    if (!o->preserve_ignored) {
        /* Setup 'dir', make o->dir point to it */
        ....
        o->dir = &dir;
    }
    ...
    if (o->dir)
        /* cleanup */
    ....
}

The caller doesn't touch o->dir (other than initializing it to zeros);
unpack_trees() is wholly responsible for it.  I'd kind of like to
entirely remove dir from unpack_trees_options(), but I need a way of
passing it down through all the other functions in unpack-trees.c, and
leaving it in unpack_trees_options seems the easiest way to do so.  So
I just marked it as "for internal use only".
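
The ownership pattern described above can be sketched as a small standalone
example. This is only an illustration under assumptions: toy_dir, toy_opts,
toy_unpack() and TOY_SHOW_IGNORED are hypothetical stand-ins, not git's real
types or functions; memset() stands in for dir_clear(), and the -1 return
stands in for a BUG() call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for dir_struct and unpack_trees_options. */
struct toy_dir {
    unsigned flags;
};
#define TOY_DIR_INIT { 0 }
#define TOY_SHOW_IGNORED 1u

struct toy_opts {
    unsigned preserve_ignored;
    struct toy_dir *dir; /* for internal use only */
};

/*
 * Returns 1 if ignored-file handling was active during the call,
 * 0 if it was not, and -1 if the caller wrongly pre-set o->dir.
 */
int toy_unpack(struct toy_opts *o)
{
    struct toy_dir dir = TOY_DIR_INIT; /* lives on this function's stack */
    int used;

    if (o->dir)
        return -1; /* callers must leave o->dir as NULL */

    if (!o->preserve_ignored) {
        /* Set up 'dir' and make o->dir point to it. */
        o->dir = &dir;
        o->dir->flags |= TOY_SHOW_IGNORED;
    }

    /* Helpers further down the call chain would test o->dir like this. */
    used = (o->dir != NULL);

    if (o->dir) {
        memset(o->dir, 0, sizeof(*o->dir)); /* stand-in for dir_clear() */
        o->dir = NULL; /* never let a pointer into our frame escape */
    }
    return used;
}
```

The point of the sketch is that the pointer doubles as an "in use" flag, and
that it must be reset before returning because it refers to a stack variable.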

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):


On Wed, Sep 29 2021, Elijah Newren wrote:

> On Wed, Sep 29, 2021 at 11:32 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> On Wed, Sep 29 2021, Elijah Newren wrote:
>>
>> > On Wed, Sep 29, 2021 at 2:27 AM Ævar Arnfjörð Bjarmason
>> > <avarab@gmail.com> wrote:
>> >>
> ...
>> >>
>> >> I think getting rid of the boilerplate makes sense, but it doesn't sound
>> >> from the commit message like you've considered just making that "struct
>> >> dir*" member a "struct dir" instead.
>> >>
>> >> That simplifies things a lot, i.e. we can just DIR_INIT it, and don't
>> >> need every caller to malloc/free it.
>> >
>> > See the next patch in the series.  :-)
>>
>> Ah!
>>
>> >> Sometimes a pointer makes sense, but in this case the "struct
>> >> unpack_trees_options" can just own it.
>> >
>> > I did make it internal to unpack_trees_options in the next patch, but
>> > kept it as a pointer just because that let me know whether it was used
>> > or not.  I guess I could have added a boolean as well.  But I don't
>> > actually allocate anything, because it's either a NULL pointer, or a
>> > pointer to something on the stack.  So, I do get to just use DIR_INIT.
>>
>> I think I'm probably missing something. I just made it allocated on the
>> stack by the caller using "struct unpack_trees_options", but then you
>> end up having a dir* in the struct, but that's only filled in as a
>> pointer to the stack variable? Maybe there's some subtlety I'm missing
>> here...
>
> As per the next patch:
>
> int unpack_trees(..., struct unpack_trees_options *o)
> {
>     struct dir_struct dir = DIR_INIT;
>     ...
>     if (!o->preserve_ignored) {
>         /* Setup 'dir', make o->dir point to it */
>         ....
>         o->dir = &dir;
>     }
>     ...
>     if (o->dir)
>         /* cleanup */
>     ....
> }
>
> The caller doesn't touch o->dir (other than initializing it to zeros);
> unpack_trees() is wholly responsible for it.  I'd kind of like to
> entirely remove dir from unpack_trees_options(), but I need a way of
> passing it down through all the other functions in unpack-trees.c, and
> leaving it in unpack_trees_options seems the easiest way to do so.  So
> I just marked it as "for internal use only".

I think I understand *how* it works; I'm puzzled by why you went for
this whole level of indirection when you're using a struct on the stack
in the end anyway. Why not just put that in "struct unpack_trees_options"?

Anyway, I see I have only myself to blame here, as you added these leak
fixes in the v2 in response to some of my offhand comments.

FWIW I then went on to do some deeper fixes not just on these leaks but
the surrounding leaks, which will be blocked by 2/11 & 05/11 of this
topic for a while. I suppose I only have myself to blame :)

Below is a patch-on-top that I think makes this whole thing much simpler
by doing away with the pointer entirely.

I suppose this is also a partial reply to
https://lore.kernel.org/git/CABPp-BG_qigBoirMGR-Yk9Niyxt0UmYCEqojsYxbSEarLAmraA@mail.gmail.com/;
but I quite dislike this pattern of including a pointer like this where
it's not needed just for the practicalities of memory management.

I.e. here you use DIR_INIT. In my local patches to fix up the wider
memory leaks in this area I've got DIR_INIT also using a STRBUF_INIT,
and DIR_INIT will in turn be referenced by an
UNPACK_TREES_OPTIONS_INIT. It's quite nice if, when you initialize with
"UNPACK_TREES_OPTIONS_INIT", that initialization works all the way down
the chain, so you don't need e.g. a manual strbuf_init(), dir_init()
etc.
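
To illustrate what "works all the way down the chain" means, here is a
minimal sketch of nested static initializer macros with an embedded (not
pointed-to) member. All names here (toy_buf, toy_dir, toy_opts and the
TOY_*_INIT macros) are hypothetical miniatures made up for the example; they
are not git's real strbuf, dir_struct or unpack_trees_options definitions.

```c
#include <stddef.h>

/* Innermost type: a tiny string buffer whose 'buf' is never NULL. */
struct toy_buf {
    size_t len;
    const char *buf;
};
#define TOY_BUF_INIT { .len = 0, .buf = "" }

/* Middle type embeds the buffer and reuses its initializer. */
struct toy_dir {
    unsigned flags;
    struct toy_buf basebuf;
};
#define TOY_DIR_INIT { .flags = 0, .basebuf = TOY_BUF_INIT }

/* Outer type embeds the middle type, so one macro initializes everything. */
struct toy_opts {
    unsigned preserve_ignored;
    struct toy_dir dir; /* embedded struct, not a pointer */
};
#define TOY_OPTS_INIT { .preserve_ignored = 0, .dir = TOY_DIR_INIT }

/* The only boilerplate left is on teardown: reset to the initial state. */
void toy_opts_release(struct toy_opts *o)
{
    struct toy_opts blank = TOY_OPTS_INIT;
    *o = blank;
}
```

A caller would write "struct toy_opts o = TOY_OPTS_INIT;" and get a fully
initialized nested buffer without any explicit init call. With a pointer
member that is not possible in a static initializer, which is the trade-off
being argued about here.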

I removed dir_init() in ce93a4c6127 (dir.[ch]: replace dir_init()
with DIR_INIT, 2021-07-01), but would probably need to bring it back.
Of course you need some "release()" method to go with
UNPACK_TREES_OPTIONS_INIT, which in turn needs to call dir_release()
(well, "dir_clear()" in that case), and that needs to call
"strbuf_release()". It's just nicer if that boilerplate is all on
destruction, and not also on struct/object setup.

We do need that setup in some cases (although a lot could just be
replaced by lazy initialization), but if we don't....

diff --git a/unpack-trees.c b/unpack-trees.c
index a7e1712d236..de5cc6cd025 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1694,15 +1694,12 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	static struct cache_entry *dfc;
 	struct pattern_list pl;
 	int free_pattern_list = 0;
-	struct dir_struct dir = DIR_INIT;
 
 	if (o->reset == UNPACK_RESET_INVALID)
 		BUG("o->reset had a value of 1; should be UNPACK_TREES_*_UNTRACKED");
 
 	if (len > MAX_UNPACK_TREES)
 		die("unpack_trees takes at most %d trees", MAX_UNPACK_TREES);
-	if (o->dir)
-		BUG("o->dir is for internal use only");
 
 	trace_performance_enter();
 	trace2_region_enter("unpack_trees", "unpack_trees", the_repository);
@@ -1718,9 +1715,8 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		BUG("UNPACK_RESET_OVERWRITE_UNTRACKED incompatible with preserved ignored files");
 
 	if (!o->preserve_ignored) {
-		o->dir = &dir;
-		o->dir->flags |= DIR_SHOW_IGNORED;
-		setup_standard_excludes(o->dir);
+		o->dir.flags |= DIR_SHOW_IGNORED;
+		setup_standard_excludes(&o->dir);
 	}
 
 	if (!core_apply_sparse_checkout || !o->update)
@@ -1884,10 +1880,7 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 done:
 	if (free_pattern_list)
 		clear_pattern_list(&pl);
-	if (o->dir) {
-		dir_clear(o->dir);
-		o->dir = NULL;
-	}
+	dir_clear(&o->dir);
 	trace2_region_leave("unpack_trees", "unpack_trees", the_repository);
 	trace_performance_leave("unpack_trees");
 	return ret;
@@ -2153,8 +2146,7 @@ static int verify_clean_subdirectory(const struct cache_entry *ce,
 	pathbuf = xstrfmt("%.*s/", namelen, ce->name);
 
 	memset(&d, 0, sizeof(d));
-	if (o->dir)
-		d.exclude_per_dir = o->dir->exclude_per_dir;
+	d.exclude_per_dir = o->dir.exclude_per_dir;
 	i = read_directory(&d, o->src_index, pathbuf, namelen+1, NULL);
 	if (i)
 		return add_rejected_path(o, ERROR_NOT_UPTODATE_DIR, ce->name);
@@ -2201,8 +2193,7 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 	if (ignore_case && icase_exists(o, name, len, st))
 		return 0;
 
-	if (o->dir &&
-	    is_excluded(o->dir, o->src_index, name, &dtype))
+	if (is_excluded(&o->dir, o->src_index, name, &dtype))
 		/*
 		 * ce->name is explicitly excluded, so it is Ok to
 		 * overwrite it.
diff --git a/unpack-trees.h b/unpack-trees.h
index 71ffb7eeb0c..a8afbb20170 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -5,6 +5,7 @@
 #include "strvec.h"
 #include "string-list.h"
 #include "tree-walk.h"
+#include "dir.h"
 
 #define MAX_UNPACK_TREES MAX_TRAVERSE_TREES
 
@@ -95,7 +96,7 @@ struct unpack_trees_options {
 	struct index_state result;
 
 	struct pattern_list *pl; /* for internal use */
-	struct dir_struct *dir; /* for internal use only */
+	struct dir_struct dir; /* for internal use only */
 	struct checkout_metadata meta;
 };
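
The effect of switching `dir` from a pointer to an embedded member can be sketched in isolation. The following is a minimal toy model, not git's code: `toy_dir`, `toy_opts`, `toy_unpack()` and `TOY_SHOW_IGNORED` are hypothetical stand-ins for `struct dir_struct`, `struct unpack_trees_options`, `unpack_trees()` and `DIR_SHOW_IGNORED`. It only illustrates why the `if (o->dir)` checks and the `o->dir = NULL` reset become unnecessary once the member is embedded by value.

```c
#include <assert.h>
#include <string.h>

#define TOY_SHOW_IGNORED 1u /* hypothetical flag, mirrors DIR_SHOW_IGNORED */

/* Hypothetical stand-in for struct dir_struct. */
struct toy_dir {
	unsigned flags;
	const char *exclude_per_dir;
};

/* Hypothetical stand-in for struct unpack_trees_options. */
struct toy_opts {
	int preserve_ignored;
	struct toy_dir dir; /* embedded by value, so it can never be NULL */
};

static void toy_dir_clear(struct toy_dir *dir)
{
	memset(dir, 0, sizeof(*dir));
}

/* Returns 1 if per-directory ignore rules were set up, 0 otherwise. */
int toy_unpack(struct toy_opts *o)
{
	int consulted;

	assert(o != NULL);
	if (!o->preserve_ignored) {
		/* No allocation and no NULL check: the member always exists. */
		o->dir.flags |= TOY_SHOW_IGNORED;
		o->dir.exclude_per_dir = ".gitignore";
	}
	consulted = (o->dir.exclude_per_dir != NULL);

	/* Unconditional cleanup; clearing an untouched, zeroed member is harmless. */
	toy_dir_clear(&o->dir);
	return consulted;
}
```

A caller zero-initializes `struct toy_opts`, sets `preserve_ignored` as needed, and calls `toy_unpack()`; the embedded member is back to all zeros afterwards.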
 

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):


On Sat, Oct 02 2021, Ævar Arnfjörð Bjarmason wrote:

> On Fri, Oct 01 2021, Elijah Newren wrote:
>
>> On Fri, Oct 1, 2021 at 1:47 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>>
>>> On Thu, Sep 30 2021, Elijah Newren wrote:
>>>
>>> > On Thu, Sep 30, 2021 at 7:15 AM Ævar Arnfjörð Bjarmason
>>> > <avarab@gmail.com> wrote:
>>> >>
>>> >> On Wed, Sep 29 2021, Elijah Newren wrote:
[...]
>>> > I might be going on a tangent here, but looking at that patch, I'm
>>> > worried that dir_init() was buggy and that you perpetuated that bug
>>> > with DIR_INIT.  Note that dir_struct has a struct strbuf basebuf
>>> > member, which neither dir_init() nor DIR_INIT initialize properly
>>> > (using either strbuf_init() or STRBUF_INIT).  As far as I can tell,
>>> > dir.c relies on either strbuf_add() calls to just happen to work with
>>> > this incorrectly initialized strbuf, or else use the strbuf_init()
>>> > call in prep_exclude() to do so, using the following snippet:
>>> >
>>> >     if (!dir->basebuf.buf)
>>> >         strbuf_init(&dir->basebuf, PATH_MAX);
>>> >
>>> > However, earlier in that same function we see
>>> >
>>> >     if (stk->baselen <= baselen &&
>>> >         !strncmp(dir->basebuf.buf, base, stk->baselen))
>>> >             break;
>>> >
>>> > So either that function can never have dir->basebuf.buf be NULL and
>>> > the strbuf_init() is dead code, or else it's possible for us to
>>> > trigger a segfault.  If it's the former, it may just be a ticking time
>>> > bomb that will transform into the latter with some other change,
>>> > because it's not at all obvious to me how dir->basebuf gets
>>> > initialized appropriately to avoid that strncmp call.  Perhaps there
>>> > is some invariant where exclude_stack is only set up by previous calls
>>> > to prep_exclude() and those won't set up exclude_stack until first
>>> > initializing basebuf.  But that really at least deserves a comment
>>> > about how we're abusing basebuf, and would probably be cleaner if we
>>> > initialized basebuf to STRBUF_INIT.
>>>
>>> ...because yes, I forgot about that when sending you the diff-on-top,
>>> sorry. Yes that's buggy with the diff-on-top I sent you.
>>
>> That bug didn't come from the diff-on-top you sent me, it came from
>> the commit already merged to master -- ce93a4c6127  (dir.[ch]: replace
>> dir_init() with DIR_INIT, 2021-07-01), merged as part of
>> ab/struct-init on Jul 16.
>
> Ah, I misunderstood you there. I'll look at that / fix it. Sorry.

Just to tie up this loose end: Yes, this control flow sucks, and I've got
some patches to unpack-trees.[ch] & dir.[ch] I'm about to submit to fix
it. But just to comment on the existing behavior of the code, i.e. your
point above:

    "So either that function can never have dir->basebuf.buf be NULL and
    the strbuf_init() is dead code, or else it's possible for us to
    trigger a segfault.".

I hadn't had time to look into it when I said I'd fix it, but now that I
have, I found that there's nothing to fix, and this code wasn't buggy
either before or after my ce93a4c6127 (dir.[ch]: replace dir_init() with
DIR_INIT, 2021-07-01). I.e. we do have the invariant you mentioned.

The dir.[ch] API has always relied on the "struct dir_struct" being
zero'd out. First with memset() before your eceba532141 (dir: fix
problematic API to avoid memory leaks, 2020-08-18), and after my
ce93a4c6127 with the DIR_INIT, which both amount to the same thing.

We both missed a caller that neither used dir_init() nor uses DIR_INIT
now, but it uses "{ 0 }", so it's always zero'd.

Now, of course it being zero'd *would* segfault if you feed
"dir->basebuf.buf" to strncmp() as you note above, but that code isn't
reachable. The structure of that function is (pseudocode):

void prep_exclude(...)
{
	struct exclude_stack *stk = NULL;
	[...]

	while ((stk = dir->exclude_stack) != NULL)
		/* the strncmp() against "dir->basebuf.buf" is here */

	/* maybe we'll early return here */

	if (!dir->basebuf.buf)
		strbuf_init(&dir->basebuf, PATH_MAX);

	/*
	 * Code that sets dir->exclude_stack to non-NULL for the first
	 * time follows...
	 */
}

I.e. dir->exclude_stack is *only* referenced in this function and
dir_clear() (where we also check it for NULL first).

It's state management between calls to prep_exclude(), so that initial
while-loop can only be entered on the second or a later call to
prep_exclude().

We'll then either have reached that strbuf_init() already, or if we took
an early return before the strbuf_init() we couldn't have set
dir->exclude_stack either. So that "dir->basebuf.buf" dereference is
safe in either case.
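
The invariant described above can be modelled with a small self-contained sketch. This is a toy under stated assumptions, not git's implementation: `toy_dir`, `toy_stack`, `toy_prep_exclude()`, `toy_dir_clear()` and `TOY_PATH_MAX` are hypothetical names, a plain `char *` stands in for the strbuf, and the `early_return` parameter simulates the "maybe we'll early return here" path. The point it illustrates is that the stack can only become non-NULL after the buffer has been allocated, so the read inside the loop never sees a NULL buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PATH_MAX 64 /* hypothetical stand-in for PATH_MAX */

/* Hypothetical stand-in for struct exclude_stack. */
struct toy_stack {
	struct toy_stack *prev;
	size_t baselen;
};

/* Hypothetical stand-in for struct dir_struct; callers zero it first. */
struct toy_dir {
	struct toy_stack *exclude_stack; /* NULL until the first push */
	char *basebuf;                   /* NULL until lazily allocated */
};

/*
 * Toy model of prep_exclude(). Returns how many times basebuf was read
 * inside the pop loop, or -1 on allocation failure.
 */
int toy_prep_exclude(struct toy_dir *dir, const char *base, int early_return)
{
	struct toy_stack *stk;
	int reads = 0;

	while ((stk = dir->exclude_stack) != NULL) {
		/* A non-NULL stack implies an earlier call allocated basebuf. */
		assert(dir->basebuf != NULL);
		reads++;
		if (!strncmp(dir->basebuf, base, stk->baselen))
			return reads; /* top entry still applies */
		dir->exclude_stack = stk->prev;
		free(stk);
	}

	if (early_return)
		return reads; /* nothing pushed; the stack is empty here */

	if (!dir->basebuf) {
		dir->basebuf = calloc(1, TOY_PATH_MAX);
		if (!dir->basebuf)
			return -1;
	}

	/* Only now, with basebuf allocated, may the stack become non-NULL. */
	stk = calloc(1, sizeof(*stk));
	if (!stk)
		return -1;
	snprintf(dir->basebuf, TOY_PATH_MAX, "%s", base);
	stk->baselen = strlen(dir->basebuf);
	stk->prev = dir->exclude_stack;
	dir->exclude_stack = stk;
	return reads;
}

/* Frees the stack and the buffer, leaving the struct zeroed again. */
void toy_dir_clear(struct toy_dir *dir)
{
	struct toy_stack *stk = dir->exclude_stack;

	while (stk) {
		struct toy_stack *prev = stk->prev;
		free(stk);
		stk = prev;
	}
	free(dir->basebuf);
	memset(dir, 0, sizeof(*dir));
}
```

In this model a first call that returns early leaves both fields NULL, so a later call still skips the loop; a first call that runs to the end allocates the buffer before pushing, so every later loop iteration reads an allocated buffer.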

On the Git mailing list, Elijah Newren wrote (reply to this):

On Sat, Oct 2, 2021 at 2:07 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> On Fri, Oct 01 2021, Elijah Newren wrote:
>
...
> > So maybe I'll submit some patches on top that rip these direct members
> > out of unpack_trees_options and push them inside some opaque
> > struct.
>
> Sure, that sounds good. I only had a mild objection to doing it in a way
> where you'll need that sort of code I removed in the linked commit in
> prep_exclude() because you were trying not to expose that at any cost,
> including via some *_INIT macro. I.e. if it's private we can just name
> it "priv_*" or have a:
>
>     struct dont_touch_this {
>         struct dir_struct dir;
>     };
>
> Which are both ways of /messaging/ that it's private, and since the
> target audience is just the rest of the git.git codebase I think that
> ultimately something that 1) sends the right message 2) makes accidents
> pretty much impossible suffices. I.e. you don't accidentally introduce a
> new API user accessing a field called "->priv_*" or
> "->private_*". Someone will review those patches...

An internal struct with all the members meant to be internal-only
provides nearly all the advantages that I was going for with the
opaque struct, while also being a smaller change than what I was
thinking of doing.  I like that idea; thanks for the suggestion.

On the Git mailing list, Elijah Newren wrote (reply to this):

On Sun, Oct 3, 2021 at 3:38 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
> On Sat, Oct 02 2021, Ævar Arnfjörð Bjarmason wrote:
>
> > On Fri, Oct 01 2021, Elijah Newren wrote:
> >
> >> On Fri, Oct 1, 2021 at 1:47 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> >>>
> >>> On Thu, Sep 30 2021, Elijah Newren wrote:
> >>>
> >>> > On Thu, Sep 30, 2021 at 7:15 AM Ævar Arnfjörð Bjarmason
> >>> > <avarab@gmail.com> wrote:
> >>> >>
> >>> >> On Wed, Sep 29 2021, Elijah Newren wrote:
> [...]
> >>> > I might be going on a tangent here, but looking at that patch, I'm
> >>> > worried that dir_init() was buggy and that you perpetuated that bug
> >>> > with DIR_INIT.  Note that dir_struct has a struct strbuf basebuf
> >>> > member, which neither dir_init() nor DIR_INIT initialize properly
> >>> > (using either strbuf_init() or STRBUF_INIT).  As far as I can tell,
> >>> > dir.c relies on either strbuf_add() calls to just happen to work with
> >>> > this incorrectly initialized strbuf, or else use the strbuf_init()
> >>> > call in prep_exclude() to do so, using the following snippet:
> >>> >
> >>> >     if (!dir->basebuf.buf)
> >>> >         strbuf_init(&dir->basebuf, PATH_MAX);
> >>> >
> >>> > However, earlier in that same function we see
> >>> >
> >>> >     if (stk->baselen <= baselen &&
> >>> >         !strncmp(dir->basebuf.buf, base, stk->baselen))
> >>> >             break;
> >>> >
> >>> > So either that function can never have dir->basebuf.buf be NULL and
> >>> > the strbuf_init() is dead code, or else it's possible for us to
> >>> > trigger a segfault.  If it's the former, it may just be a ticking time
> >>> > bomb that will transform into the latter with some other change,
> >>> > because it's not at all obvious to me how dir->basebuf gets
> >>> > initialized appropriately to avoid that strncmp call.  Perhaps there
> >>> > is some invariant where exclude_stack is only set up by previous calls
> >>> > to prep_exclude() and those won't set up exclude_stack until first
> >>> > initializing basebuf.  But that really at least deserves a comment
> >>> > about how we're abusing basebuf, and would probably be cleaner if we
> >>> > initialized basebuf to STRBUF_INIT.
> >>>
> >>> ...because yes, I forgot about that when sending you the diff-on-top,
> >>> sorry. Yes that's buggy with the diff-on-top I sent you.
> >>
> >> That bug didn't come from the diff-on-top you sent me, it came from
> >> the commit already merged to master -- ce93a4c6127  (dir.[ch]: replace
> >> dir_init() with DIR_INIT, 2021-07-01), merged as part of
> >> ab/struct-init on Jul 16.
> >
> > Ah, I misunderstood you there. I'll look at that / fix it. Sorry.
>
> Just to tie up this loose end: Yes, this control flow sucks, and I've got
> some patches to unpack-trees.[ch] & dir.[ch] I'm about to submit to fix
> it. But just to comment on the existing behavior of the code, i.e. your
> point above:
>
>     "So either that function can never have dir->basebuf.buf be NULL and
>     the strbuf_init() is dead code, or else it's possible for us to
>     trigger a segfault.".
>
> I hadn't had time to look into it when I said I'd fix it, but now that I
> have, I found that there's nothing to fix, and this code wasn't buggy
> either before or after my ce93a4c6127 (dir.[ch]: replace dir_init() with
> DIR_INIT, 2021-07-01). I.e. we do have the invariant you mentioned.
>
> The dir.[ch] API has always relied on the "struct dir_struct" being
> zero'd out. First with memset() before your eceba532141 (dir: fix
> problematic API to avoid memory leaks, 2020-08-18), and after my
> ce93a4c6127 with the DIR_INIT, which both amount to the same thing.
>
> We both missed a caller that neither used dir_init() nor uses DIR_INIT
> now, but it uses "{ 0 }", so it's always zero'd.
>
> Now, of course it being zero'd *would* segfault if you feed
> "dir->basebuf.buf" to strncmp() as you note above, but that code isn't
> reachable. The structure of that function is (pseudocode):
>
> void prep_exclude(...)
> {
>         struct exclude_stack *stk = NULL;
>         [...]
>
>         while ((stk = dir->exclude_stack) != NULL)
>                 /* the strncmp() against "dir->basebuf.buf" is here */
>
>         /* maybe we'll early return here */
>
>         if (!dir->basebuf.buf)
>                 strbuf_init(&dir->basebuf, PATH_MAX);
>
>         /*
>          * Code that sets dir->exclude_stack to non-NULL for the first
>          * time follows...
>          */
> }
>
> I.e. dir->exclude_stack is *only* referenced in this function and
> dir_clear() (where we also check it for NULL first).
>
> It's state management between calls to prep_exclude(), so that initial
> while-loop can only be entered on the second or a later call to
> prep_exclude().
>
> We'll then either have reached that strbuf_init() already, or if we took
> an early return before the strbuf_init() we couldn't have set
> dir->exclude_stack either. So that "dir->basebuf.buf" dereference is
> safe in either case.

Thanks for digging into this.  I wonder if dir_struct could also
benefit from moving its internal-only members into an embedded
internal struct, similar to our discussions with unpack_trees_options.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):


On Mon, Oct 04 2021, Elijah Newren wrote:

> On Sat, Oct 2, 2021 at 2:07 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>>
>> On Fri, Oct 01 2021, Elijah Newren wrote:
>>
> ...
>> > So maybe I'll submit some patches on top that rip these direct members
>> > out of unpack_trees_options and push them inside some opaque
>> > struct.
>>
>> Sure, that sounds good. I only had a mild objection to doing it in a way
>> where you'll need that sort of code I removed in the linked commit in
>> prep_exclude() because you were trying not to expose that at any cost,
>> including via some *_INIT macro. I.e. if it's private we can just name
>> it "priv_*" or have a:
>>
>>     struct dont_touch_this {
>>         struct dir_struct dir;
>>     };
>>
>> Which are both ways of /messaging/ that it's private, and since the
>> target audience is just the rest of the git.git codebase I think that
>> ultimately something that 1) sends the right message 2) makes accidents
>> pretty much impossible suffices. I.e. you don't accidentally introduce a
>> new API user accessing a field called "->priv_*" or
>> "->private_*". Someone will review those patches...
>
> An internal struct with all the members meant to be internal-only
> provides nearly all the advantages that I was going for with the
> opaque struct, while also being a smaller change than what I was
> thinking of doing.  I like that idea; thanks for the suggestion.

Yeah, just to provide an explicit example, something like the below. It
compiles to the same assembly (at least under -O3; I didn't exhaustively
try other optimization levels).

I'm rather "meh" on it vs. just prefixing the relevant member names
with "priv_" or "private_", but it results in the same semantics &
machine code, so it's effectively just a way of doing the labeling for
human consumption.

diff --git a/dir.c b/dir.c
index 39fce3bcba7..a714640e782 100644
--- a/dir.c
+++ b/dir.c
@@ -1533,12 +1533,12 @@ static void prep_exclude(struct dir_struct *dir,
 	 * which originate from directories not in the prefix of the
 	 * path being checked.
 	 */
-	while ((stk = dir->exclude_stack) != NULL) {
+	while ((stk = dir->private.exclude_stack) != NULL) {
 		if (stk->baselen <= baselen &&
 		    !strncmp(dir->basebuf.buf, base, stk->baselen))
 			break;
-		pl = &group->pl[dir->exclude_stack->exclude_ix];
-		dir->exclude_stack = stk->prev;
+		pl = &group->pl[dir->private.exclude_stack->exclude_ix];
+		dir->private.exclude_stack = stk->prev;
 		dir->pattern = NULL;
 		free((char *)pl->src); /* see strbuf_detach() below */
 		clear_pattern_list(pl);
@@ -1584,7 +1584,7 @@ static void prep_exclude(struct dir_struct *dir,
 						 base + current,
 						 cp - base - current);
 		}
-		stk->prev = dir->exclude_stack;
+		stk->prev = dir->private.exclude_stack;
 		stk->baselen = cp - base;
 		stk->exclude_ix = group->nr;
 		stk->ucd = untracked;
@@ -1605,7 +1605,7 @@ static void prep_exclude(struct dir_struct *dir,
 			    dir->pattern->flags & PATTERN_FLAG_NEGATIVE)
 				dir->pattern = NULL;
 			if (dir->pattern) {
-				dir->exclude_stack = stk;
+				dir->private.exclude_stack = stk;
 				return;
 			}
 		}
@@ -1662,7 +1662,7 @@ static void prep_exclude(struct dir_struct *dir,
 			invalidate_gitignore(dir->untracked, untracked);
 			oidcpy(&untracked->exclude_oid, &oid_stat.oid);
 		}
-		dir->exclude_stack = stk;
+		dir->private.exclude_stack = stk;
 		current = stk->baselen;
 	}
 	strbuf_setlen(&dir->basebuf, baselen);
@@ -3302,7 +3302,7 @@ void dir_clear(struct dir_struct *dir)
 	free(dir->ignored);
 	free(dir->entries);
 
-	stk = dir->exclude_stack;
+	stk = dir->private.exclude_stack;
 	while (stk) {
 		struct exclude_stack *prev = stk->prev;
 		free(stk);
diff --git a/dir.h b/dir.h
index 83f46c0fb4c..d30d294308d 100644
--- a/dir.h
+++ b/dir.h
@@ -209,6 +209,11 @@ struct untracked_cache {
  * record the paths discovered. A single `struct dir_struct` is used regardless
  * of whether or not the traversal recursively descends into subdirectories.
  */
+
+struct dir_struct_private {
+	struct exclude_stack *exclude_stack;
+};
+
 struct dir_struct {
 
 	/* The number of members in `entries[]` array. */
@@ -327,7 +332,7 @@ struct dir_struct {
 	 * (sub)directory in the traversal. Exclude points to the
 	 * matching exclude struct if the directory is excluded.
 	 */
-	struct exclude_stack *exclude_stack;
+	struct dir_struct_private private;
 	struct path_pattern *pattern;
 	struct strbuf basebuf;
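
One way to see why wrapping a single member in a nested struct should not change the generated code is to compare the two layouts directly. The sketch below is an illustration under assumptions, not part of the patch: `toy_flat`, `toy_wrapped`, `toy_private` and `toy_same_layout()` are hypothetical names, and the result relies on the platform ABI adding no padding around a struct that holds a single pointer, which holds on common ABIs but is not guaranteed by the C standard.

```c
#include <assert.h>
#include <stddef.h>

struct toy_stack; /* opaque; only pointers to it are used here */

/* Layout before: the member sits directly in the outer struct. */
struct toy_flat {
	int nr;
	struct toy_stack *exclude_stack;
	void *pattern;
};

/* Layout after: the same member wrapped in a one-member struct. */
struct toy_private {
	struct toy_stack *exclude_stack;
};

struct toy_wrapped {
	int nr;
	struct toy_private private;
	void *pattern;
};

/* Returns 1 if both layouts place every member at the same offset. */
int toy_same_layout(void)
{
	return sizeof(struct toy_flat) == sizeof(struct toy_wrapped) &&
	       offsetof(struct toy_flat, exclude_stack) ==
	       offsetof(struct toy_wrapped, private.exclude_stack) &&
	       offsetof(struct toy_flat, pattern) ==
	       offsetof(struct toy_wrapped, pattern);
}

/* Convenience wrapper for callers that want a hard failure instead. */
void toy_require_same_layout(void)
{
	assert(toy_same_layout());
}
```

When the offsets match, `dir->exclude_stack` and `dir->private.exclude_stack` resolve to the same address computation, which is consistent with the observation that the wrapper is purely a labeling device for human readers.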
 

On the Git mailing list, Elijah Newren wrote (reply to this):

On Mon, Oct 4, 2021 at 7:12 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
>
> On Mon, Oct 04 2021, Elijah Newren wrote:
>
> > On Sat, Oct 2, 2021 at 2:07 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> >>
> >> On Fri, Oct 01 2021, Elijah Newren wrote:
> >>
> > ...
> >> > So maybe I'll submit some patches on top that rip these direct members
> >> > out of unpack_trees_options and push them inside some opaque
> >> > struct.
> >>
> >> Sure, that sounds good. I only had a mild objection to doing it in a way
> >> where you'll need that sort of code I removed in the linked commit in
> >> prep_exclude() because you were trying not to expose that at any cost,
> >> including via some *_INIT macro. I.e. if it's private we can just name
> >> it "priv_*" or have a:
> >>
> >>     struct dont_touch_this {
> >>         struct dir_struct dir;
> >>     };
> >>
> >> Which are both ways of /messaging/ that it's private, and since the
> >> target audience is just the rest of the git.git codebase I think that
> >> ultimately something that 1) sends the right message 2) makes accidents
> >> pretty much impossible suffices. I.e. you don't accidentally introduce a
> >> new API user accessing a field called "->priv_*" or
> >> "->private_*". Someone will review those patches...
> >
> > An internal struct with all the members meant to be internal-only
> > provides nearly all the advantages that I was going for with the
> > opaque struct, while also being a smaller change than what I was
> > thinking of doing.  I like that idea; thanks for the suggestion.
>
> Yeah, just to provide an explicit example, something like the below. It
> compiles to the same assembly (at least under -O3; I didn't exhaustively
> try other optimization levels).
>
> I'm rather "meh" on it vs. just prefixing the relevant member names
> with "priv_" or "private_", but it results in the same semantics &
> machine code, so it's effectively just a way of doing the labeling for
> human consumption.
>
> diff --git a/dir.c b/dir.c
> index 39fce3bcba7..a714640e782 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1533,12 +1533,12 @@ static void prep_exclude(struct dir_struct *dir,
>          * which originate from directories not in the prefix of the
>          * path being checked.
>          */
> -       while ((stk = dir->exclude_stack) != NULL) {
> +       while ((stk = dir->private.exclude_stack) != NULL) {
>                 if (stk->baselen <= baselen &&
>                     !strncmp(dir->basebuf.buf, base, stk->baselen))
>                         break;
> -               pl = &group->pl[dir->exclude_stack->exclude_ix];
> -               dir->exclude_stack = stk->prev;
> +               pl = &group->pl[dir->private.exclude_stack->exclude_ix];
> +               dir->private.exclude_stack = stk->prev;
>                 dir->pattern = NULL;
>                 free((char *)pl->src); /* see strbuf_detach() below */
>                 clear_pattern_list(pl);
> @@ -1584,7 +1584,7 @@ static void prep_exclude(struct dir_struct *dir,
>                                                  base + current,
>                                                  cp - base - current);
>                 }
> -               stk->prev = dir->exclude_stack;
> +               stk->prev = dir->private.exclude_stack;
>                 stk->baselen = cp - base;
>                 stk->exclude_ix = group->nr;
>                 stk->ucd = untracked;
> @@ -1605,7 +1605,7 @@ static void prep_exclude(struct dir_struct *dir,
>                             dir->pattern->flags & PATTERN_FLAG_NEGATIVE)
>                                 dir->pattern = NULL;
>                         if (dir->pattern) {
> -                               dir->exclude_stack = stk;
> +                               dir->private.exclude_stack = stk;
>                                 return;
>                         }
>                 }
> @@ -1662,7 +1662,7 @@ static void prep_exclude(struct dir_struct *dir,
>                         invalidate_gitignore(dir->untracked, untracked);
>                         oidcpy(&untracked->exclude_oid, &oid_stat.oid);
>                 }
> -               dir->exclude_stack = stk;
> +               dir->private.exclude_stack = stk;
>                 current = stk->baselen;
>         }
>         strbuf_setlen(&dir->basebuf, baselen);
> @@ -3302,7 +3302,7 @@ void dir_clear(struct dir_struct *dir)
>         free(dir->ignored);
>         free(dir->entries);
>
> -       stk = dir->exclude_stack;
> +       stk = dir->private.exclude_stack;
>         while (stk) {
>                 struct exclude_stack *prev = stk->prev;
>                 free(stk);
> diff --git a/dir.h b/dir.h
> index 83f46c0fb4c..d30d294308d 100644
> --- a/dir.h
> +++ b/dir.h
> @@ -209,6 +209,11 @@ struct untracked_cache {
>   * record the paths discovered. A single `struct dir_struct` is used regardless
>   * of whether or not the traversal recursively descends into subdirectories.
>   */
> +
> +struct dir_struct_private {
> +       struct exclude_stack *exclude_stack;
> +};
> +
>  struct dir_struct {
>
>         /* The number of members in `entries[]` array. */
> @@ -327,7 +332,7 @@ struct dir_struct {
>          * (sub)directory in the traversal. Exclude points to the
>          * matching exclude struct if the directory is excluded.
>          */
> -       struct exclude_stack *exclude_stack;
> +       struct dir_struct_private private;
>         struct path_pattern *pattern;
>         struct strbuf basebuf;

Yeah, that doesn't help much at all, and I'd argue even makes things
worse, because you're just looking at a single member.  This subtly
implies that all the other private variables are public API.  The
dir.h portion of the patch should look more like this:

$ git diff -w dir.h
diff --git a/dir.h b/dir.h
index 83f46c0fb4..93a9f02688 100644
--- a/dir.h
+++ b/dir.h
@@ -214,14 +214,9 @@ struct dir_struct {
        /* The number of members in `entries[]` array. */
        int nr;

-       /* Internal use; keeps track of allocation of `entries[]` array.*/
-       int alloc;
-
        /* The number of members in `ignored[]` array. */
        int ignored_nr;

-       int ignored_alloc;
-
        /* bit-field of options */
        enum {

@@ -301,11 +296,19 @@ struct dir_struct {
         */
        const char *exclude_per_dir;

+       struct dir_struct_internal {
+               /* Keeps track of allocation of `entries[]` array.*/
+               int alloc;
+
+               /* Keeps track of allocation of `ignored[]` array. */
+               int ignored_alloc;
+
                /*
                 * We maintain three groups of exclude pattern lists:
                 *
                 * EXC_CMDL lists patterns explicitly given on the command line.
-        * EXC_DIRS lists patterns obtained from per-directory ignore files.
+                * EXC_DIRS lists patterns obtained from per-directory ignore
+                *          files.
                 * EXC_FILE lists patterns from fallback ignore files, e.g.
                 *   - .git/info/exclude
                 *   - core.excludesfile
@@ -340,6 +343,7 @@ struct dir_struct {
                /* Stats about the traversal */
                unsigned visited_paths;
                unsigned visited_directories;
+       } internal;
 };

 #define DIR_INIT { 0 }


The above change would make it clear that there are 12 variables meant
for use only within dir.c that external callers should not be
initializing or reading for output after the fact -- and only 6 that
are part of the public API that they need worry about.  It also makes
it easier for folks messing with dir.c to know which parts are just
internal state management, which I think would have made it easier to
understand the weird basebuf/exclude_stack stuff in prep_exclude()
that you nicely tracked down.  But overall, I'm really most happy
about the part of this patch that lets external callers realize they
only need to worry about 6 out of 18 fields and that they can ignore
the rest.

unpack_trees_options should have something similar done with it, and
maybe some others.
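
The grouping proposed above can be sketched on a much smaller struct. This is a toy with hypothetical names (`toy_dir`, `toy_dir_internal`, `TOY_DIR_INIT`, `toy_dir_add()`), assuming the same `{ 0 }` initializer convention as `DIR_INIT`; it only tracks counters and does not manage a real `entries[]` array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct dir_struct. */
struct toy_dir {
	/* Public API: callers may set or read these. */
	int nr;
	unsigned flags;
	const char *exclude_per_dir;

	/* Internal state: only the implementation should touch this. */
	struct toy_dir_internal {
		int alloc;              /* capacity bookkeeping */
		unsigned visited_paths; /* traversal statistics */
	} internal;
};

/* One zeroing initializer covers public and internal members alike. */
#define TOY_DIR_INIT { 0 }

/* Records one more entry and returns the new entry count. */
int toy_dir_add(struct toy_dir *dir)
{
	assert(dir != NULL);
	if (dir->nr == dir->internal.alloc)
		dir->internal.alloc = dir->internal.alloc ?
				      dir->internal.alloc * 2 : 4;
	dir->internal.visited_paths++;
	return ++dir->nr;
}
```

External callers in this model only ever write `dir.flags` or `dir.exclude_per_dir` and read `dir.nr`; any access spelled `dir.internal.*` outside the implementation stands out in review.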

 	opts.merge = 1;
-	opts.reset = reset;
+	opts.reset = reset ? UNPACK_RESET_PROTECT_UNTRACKED : 0;
+	opts.preserve_ignored = 0; /* FIXME: !overwrite_ignore */
 	opts.fn = twoway_merge;
 	init_tree_desc(&t[0], head->buffer, head->size);
 	init_tree_desc(&t[1], remote->buffer, remote->size);
10 changes: 4 additions & 6 deletions builtin/checkout.c
@@ -646,7 +646,9 @@ static int reset_tree(struct tree *tree, const struct checkout_opts *o,
 	opts.head_idx = -1;
 	opts.update = worktree;
 	opts.skip_unmerged = !worktree;
-	opts.reset = 1;
+	opts.reset = o->force ? UNPACK_RESET_OVERWRITE_UNTRACKED :
+			       UNPACK_RESET_PROTECT_UNTRACKED;
+	opts.preserve_ignored = (!o->force && !o->overwrite_ignore);
 	opts.merge = 1;
 	opts.fn = oneway_merge;
 	opts.verbose_update = o->show_progress;
@@ -746,11 +748,7 @@ static int merge_working_tree(const struct checkout_opts *opts,
 				       new_branch_info->commit ?
 				       &new_branch_info->commit->object.oid :
 				       &new_branch_info->oid, NULL);
-		if (opts->overwrite_ignore) {
-			topts.dir = xcalloc(1, sizeof(*topts.dir));
-			topts.dir->flags |= DIR_SHOW_IGNORED;
-			setup_standard_excludes(topts.dir);
-		}
+		topts.preserve_ignored = !opts->overwrite_ignore;
 		tree = parse_tree_indirect(old_branch_info->commit ?
 					   &old_branch_info->commit->object.oid :
 					   the_hash_algo->empty_tree);
1 change: 1 addition & 0 deletions builtin/clone.c
@@ -687,6 +687,7 @@ static int checkout(int submodule_progress)
 	opts.update = 1;
 	opts.merge = 1;
 	opts.clone = 1;
+	opts.preserve_ignored = 0;
 	opts.fn = oneway_merge;
 	opts.verbose_update = (option_verbosity >= 0);
 	opts.src_index = &the_index;
1 change: 1 addition & 0 deletions builtin/merge.c
@@ -680,6 +680,7 @@ static int read_tree_trivial(struct object_id *common, struct object_id *head,
 	opts.verbose_update = 1;
 	opts.trivial_merges_only = 1;
 	opts.merge = 1;
+	opts.preserve_ignored = 0; /* FIXME: !overwrite_ignore */
 	trees[nr_trees] = parse_tree_indirect(common);
 	if (!trees[nr_trees++])
 		return -1;
26 changes: 11 additions & 15 deletions builtin/read-tree.c
@@ -38,7 +38,7 @@ static int list_tree(struct object_id *oid)
 }
 
 static const char * const read_tree_usage[] = {
-	N_("git read-tree [(-m [--trivial] [--aggressive] | --reset | --prefix=<prefix>) [-u [--exclude-per-directory=<gitignore>] | -i]] [--no-sparse-checkout] [--index-output=<file>] (--empty | <tree-ish1> [<tree-ish2> [<tree-ish3>]])"),
+	N_("git read-tree [(-m [--trivial] [--aggressive] | --reset | --prefix=<prefix>) [-u | -i]] [--no-sparse-checkout] [--index-output=<file>] (--empty | <tree-ish1> [<tree-ish2> [<tree-ish3>]])"),
 	NULL
 };
 
@@ -53,24 +53,16 @@ static int index_output_cb(const struct option *opt, const char *arg,
 static int exclude_per_directory_cb(const struct option *opt, const char *arg,
 				    int unset)
 {
-	struct dir_struct *dir;
 	struct unpack_trees_options *opts;
 
 	BUG_ON_OPT_NEG(unset);
 
 	opts = (struct unpack_trees_options *)opt->value;
 
-	if (opts->dir)
-		die("more than one --exclude-per-directory given.");
-
-	dir = xcalloc(1, sizeof(*opts->dir));
-	dir->flags |= DIR_SHOW_IGNORED;
-	dir->exclude_per_dir = arg;
-	opts->dir = dir;
-	/* We do not need to nor want to do read-directory
-	 * here; we are merely interested in reusing the
-	 * per directory ignore stack mechanism.
-	 */
+	if (!opts->update)
+		die("--exclude-per-directory is meaningless unless -u");
+	if (strcmp(arg, ".gitignore"))
+		die("--exclude-per-directory argument must be .gitignore");
 	return 0;
 }
 
@@ -174,6 +166,9 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 	if (1 < opts.merge + opts.reset + prefix_set)
 		die("Which one? -m, --reset, or --prefix?");
 
+	if (opts.reset)
+		opts.reset = UNPACK_RESET_OVERWRITE_UNTRACKED;
+
 	/*
 	 * NEEDSWORK
 	 *
@@ -209,8 +204,9 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix)
 	if ((opts.update || opts.index_only) && !opts.merge)
 		die("%s is meaningless without -m, --reset, or --prefix",
 		    opts.update ? "-u" : "-i");
-	if ((opts.dir && !opts.update))
-		die("--exclude-per-directory is meaningless unless -u");
+	if (opts.update && !opts.reset)
+		opts.preserve_ignored = 0;
+	/* otherwise, opts.preserve_ignored is irrelevant */
 	if (opts.merge && !opts.index_only)
 		setup_work_tree();
 
10 changes: 8 additions & 2 deletions builtin/reset.c
@@ -67,12 +67,18 @@ static int reset_index(const char *ref, const struct object_id *oid, int reset_t
 	case KEEP:
 	case MERGE:
 		opts.update = 1;
+		opts.preserve_ignored = 0; /* FIXME: !overwrite_ignore */
 		break;
 	case HARD:
 		opts.update = 1;
-		/* fallthrough */
+		opts.reset = UNPACK_RESET_OVERWRITE_UNTRACKED;
+		break;
+	case MIXED:
+		opts.reset = UNPACK_RESET_PROTECT_UNTRACKED;
+		/* but opts.update=0, so working tree not updated */
+		break;
 	default:
-		opts.reset = 1;
+		BUG("invalid reset_type passed to reset_index");
 	}
 
 	read_cache_unmerged();
5 changes: 4 additions & 1 deletion builtin/stash.c
@@ -256,8 +256,10 @@ static int reset_tree(struct object_id *i_tree, int update, int reset)
 	opts.src_index = &the_index;
 	opts.dst_index = &the_index;
 	opts.merge = 1;
-	opts.reset = reset;
+	opts.reset = reset ? UNPACK_RESET_PROTECT_UNTRACKED : 0;
 	opts.update = update;
+	if (update)
+		opts.preserve_ignored = 0; /* FIXME: !overwrite_ignore */
 	opts.fn = oneway_merge;
 
 	if (unpack_trees(nr_trees, t, &opts))
@@ -1519,6 +1521,7 @@ static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int q
 	} else {
 		struct child_process cp = CHILD_PROCESS_INIT;
 		cp.git_cmd = 1;
+		/* BUG: this nukes untracked files in the way */
 		strvec_pushl(&cp.args, "reset", "--hard", "-q",
 			     "--no-recurse-submodules", NULL);
 		if (run_command(&cp)) {
4 changes: 4 additions & 0 deletions builtin/submodule--helper.c
@@ -3090,6 +3090,10 @@ static int add_submodule(const struct add_data *add_data)
 		prepare_submodule_repo_env(&cp.env_array);
 		cp.git_cmd = 1;
 		cp.dir = add_data->sm_path;
+		/*
+		 * NOTE: we only get here if add_data->force is true, so
+		 * passing --force to checkout is reasonable.
+		 */
 		strvec_pushl(&cp.args, "checkout", "-f", "-q", NULL);
 
 		if (add_data->branch) {
2 changes: 1 addition & 1 deletion contrib/rerere-train.sh
@@ -91,7 +91,7 @@ do
 		git checkout -q $commit -- .
 		git rerere
 	fi
-	git reset -q --hard
+	git reset -q --hard # Might nuke untracked files...
 done
 
 if test -z "$branch"
8 changes: 1 addition & 7 deletions merge-ort.c
Expand Up @@ -4045,20 +4045,14 @@ static int checkout(struct merge_options *opt,
unpack_opts.quiet = 0; /* FIXME: sequencer might want quiet? */
unpack_opts.verbose_update = (opt->verbosity > 2);
unpack_opts.fn = twoway_merge;
if (1/* FIXME: opts->overwrite_ignore*/) {
CALLOC_ARRAY(unpack_opts.dir, 1);
unpack_opts.dir->flags |= DIR_SHOW_IGNORED;
setup_standard_excludes(unpack_opts.dir);
}
unpack_opts.preserve_ignored = 0; /* FIXME: !opts->overwrite_ignore */
parse_tree(prev);
init_tree_desc(&trees[0], prev->buffer, prev->size);
parse_tree(next);
init_tree_desc(&trees[1], next->buffer, next->size);

ret = unpack_trees(2, trees, &unpack_opts);
clear_unpack_trees_porcelain(&unpack_opts);
dir_clear(unpack_opts.dir);
FREE_AND_NULL(unpack_opts.dir);
return ret;
}

5 changes: 4 additions & 1 deletion merge-recursive.c
@@ -408,8 +408,11 @@ static int unpack_trees_start(struct merge_options *opt,
memset(&opt->priv->unpack_opts, 0, sizeof(opt->priv->unpack_opts));
if (opt->priv->call_depth)
opt->priv->unpack_opts.index_only = 1;
else
else {
opt->priv->unpack_opts.update = 1;
/* FIXME: should only do this if !overwrite_ignore */
opt->priv->unpack_opts.preserve_ignored = 0;
}
opt->priv->unpack_opts.merge = 1;
opt->priv->unpack_opts.head_idx = 2;
opt->priv->unpack_opts.fn = threeway_merge;
8 changes: 1 addition & 7 deletions merge.c
@@ -53,7 +53,6 @@ int checkout_fast_forward(struct repository *r,
struct unpack_trees_options opts;
struct tree_desc t[MAX_UNPACK_TREES];
int i, nr_trees = 0;
struct dir_struct dir = DIR_INIT;
struct lock_file lock_file = LOCK_INIT;

refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
@@ -80,11 +79,7 @@ int checkout_fast_forward(struct repository *r,
}

memset(&opts, 0, sizeof(opts));
if (overwrite_ignore) {
dir.flags |= DIR_SHOW_IGNORED;
setup_standard_excludes(&dir);
opts.dir = &dir;
}
opts.preserve_ignored = !overwrite_ignore;

opts.head_idx = 1;
opts.src_index = r->index;
@@ -101,7 +96,6 @@ int checkout_fast_forward(struct repository *r,
clear_unpack_trees_porcelain(&opts);
return -1;
}
dir_clear(&dir);
clear_unpack_trees_porcelain(&opts);

if (write_locked_index(r->index, &lock_file, COMMIT_LOCK))
3 changes: 2 additions & 1 deletion reset.c
@@ -56,9 +56,10 @@ int reset_head(struct repository *r, struct object_id *oid, const char *action,
unpack_tree_opts.fn = reset_hard ? oneway_merge : twoway_merge;
unpack_tree_opts.update = 1;
unpack_tree_opts.merge = 1;
unpack_tree_opts.preserve_ignored = 0; /* FIXME: !overwrite_ignore */
init_checkout_metadata(&unpack_tree_opts.meta, switch_to_branch, oid, NULL);
if (!detach_head)
unpack_tree_opts.reset = 1;
unpack_tree_opts.reset = UNPACK_RESET_PROTECT_UNTRACKED;

if (repo_read_index_unmerged(r) < 0) {
ret = error(_("could not read index"));
1 change: 1 addition & 0 deletions sequencer.c
@@ -3699,6 +3699,7 @@ static int do_reset(struct repository *r,
unpack_tree_opts.fn = oneway_merge;
unpack_tree_opts.merge = 1;
unpack_tree_opts.update = 1;
unpack_tree_opts.preserve_ignored = 0; /* FIXME: !overwrite_ignore */
init_checkout_metadata(&unpack_tree_opts.meta, name, &oid, NULL);

if (repo_read_index_unmerged(r)) {
1 change: 1 addition & 0 deletions submodule.c
@@ -1908,6 +1908,7 @@ static void submodule_reset_index(const char *path)

strvec_pushf(&cp.args, "--super-prefix=%s%s/",
get_super_prefix_or_empty(), path);
/* TODO: determine if this might overwrite untracked files */
strvec_pushl(&cp.args, "read-tree", "-u", "--reset", NULL);

strvec_push(&cp.args, empty_tree_oid_hex());
1 change: 0 additions & 1 deletion t/t1013-read-tree-submodule.sh
@@ -6,7 +6,6 @@ test_description='read-tree can handle submodules'
. "$TEST_DIRECTORY"/lib-submodule-update.sh

KNOWN_FAILURE_DIRECTORY_SUBMODULE_CONFLICTS=1
KNOWN_FAILURE_SUBMODULE_OVERWRITE_IGNORED_UNTRACKED=1

test_submodule_switch_recursing_with_args "read-tree -u -m"
