Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multi-pack-index: use real paths for --object-dir #498

Merged

Conversation

derrickstolee
Copy link
Collaborator

This resolves #497.

Looking at this, it seems that VFS for Git users have not been getting all of the benefits of background maintenance for quite some time. This will probably justify a v2.36.0.vfs.0.1 release for those users. We should wait for upstream feedback, first. I will prepare an upstream patch soon.

This helper looks for a parsed multi-pack-index whose object directory
matches the given object_dir. Before going into the loop over the parsed
multi-pack-indexes, it calls find_odb() to ensure that the given
object_dir is actually a known object directory.

However, find_odb() uses real-path manipulations to compare the input to
the alternate directories. This same real-path comparison is not used in
the loop, leading to potential issues with the strcmp().

Update the method to use the real-path values instead.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
@derrickstolee derrickstolee self-assigned this Apr 21, 2022
Copy link
Collaborator

@vdye vdye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a (hopefully straightforward? and hopefully workable?) suggestion for how you might be able to centralize the call to normalize_object_dir, but otherwise this functionally looks correct.

@@ -127,6 +135,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv)

FREE_AND_NULL(options);

normalize_object_dir();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than calling this in each subcommand, you can possibly do it with OPT_CALLBACK:

static struct option common_opts[] = {
	OPT_CALLBACK(0, "object-dir", &opts.object_dir, N_("file"),
		     N_("object directory containing set of packfile and pack-index pairs"),
		     normalize_object_dir),
	OPT_END(),
};

The function signature of normalize_object_dir would need to change to:

static int normalize_object_dir(const struct option *opt, const char *arg, int unset)

...and you might need to copy some of the filename processing from parse-options.c:get_value() (link), but it would centralize the processing and avoid needing to make sure we have normalize_object_dir in every new subcommand.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea! That's much cleaner. The initialization to the .git/objects directory can happen before parsing.

I've got this sent upstream, too. Do you want to add that suggestion there, too?

@derrickstolee derrickstolee force-pushed the midx-normalize-object-dir branch 4 times, most recently from af83a7e to a2af0ee Compare April 25, 2022 17:01
The --object-dir argument to 'git multi-pack-index' allows a user to
specify an alternate to use instead of the local $GITDIR. This is used
by third-party tools like VFS for Git to maintain the pack-files in a
"shared object cache" used by multiple clones.

On Windows, the user can specify a path using a Windows-style file path
with backslashes such as "C:\Path\To\ObjectDir". This same path style is
used in the .git/objects/info/alternates file, so it already matches the
path of that alternate. However, find_odb() converts these paths to
real-paths for the comparison, which use forward slashes. As of the
previous change, lookup_multi_pack_index() uses real-paths, so it
correctly finds the target multi-pack-index when given these paths.

Some commands such as 'git multi-pack-index repack' call child processes
using the object_dir value, so it can be helpful to convert the path to
the real-path before sending it to those locations.

Add a callback to convert the real path immediately upon parsing the
argument. We need to be careful that we don't store the exact value out
of get_object_directory() and free it, or we could corrupt a later use
of the_repository->objects->odb->path.

We don't use get_object_directory() for the initial instantiation in
cmd_multi_pack_index() because we need 'git multi-pack-index -h' to work
without a Git repository.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
The get_object_directory() method returns the exact string stored at
the_repository->objects->odb->path. The return type of "char *" implies
that the caller must keep track of the buffer and free() it when
complete. This causes significant problems later when the ODB is
accessed.

Use "const char *" as the return type to avoid this confusion. There are
no current callers that care about the non-const definition.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
@derrickstolee
Copy link
Collaborator Author

@vdye: this made it to next upstream. Could you re-review this version so we have it in case of a minor release?

Copy link
Collaborator

@vdye vdye left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀

@derrickstolee derrickstolee merged commit 1ac48ab into microsoft:vfs-2.36.0 May 4, 2022
dscho pushed a commit that referenced this pull request Jun 17, 2022
…t-dir

This resolves #497.

Looking at this, it seems that VFS for Git users have not been getting all of the benefits of background maintenance for quite some time. This will probably justify a `v2.36.0.vfs.0.1` release for those users. We should wait for upstream feedback, first. I will prepare an upstream patch soon.
dscho pushed a commit that referenced this pull request Jun 17, 2022
…t-dir

This resolves #497.

Looking at this, it seems that VFS for Git users have not been getting all of the benefits of background maintenance for quite some time. This will probably justify a `v2.36.0.vfs.0.1` release for those users. We should wait for upstream feedback, first. I will prepare an upstream patch soon.
dscho pushed a commit that referenced this pull request Jun 17, 2022
…t-dir

This resolves #497.

Looking at this, it seems that VFS for Git users have not been getting all of the benefits of background maintenance for quite some time. This will probably justify a `v2.36.0.vfs.0.1` release for those users. We should wait for upstream feedback, first. I will prepare an upstream patch soon.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

multi-pack-index expire doesn't do anything if --objects-dir contains backward slashes
2 participants