Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clone: add remote.cloneDefault config option #710

Closed

Conversation

sjbarag
Copy link

@sjbarag sjbarag commented Aug 25, 2020

While the default remote name of origin can be overridden both
pre-clone (with git clone --origin foo) and post-clone (with git remote rename origin foo), it'd be handy to have a configurable
system-wide default for clones! This implements
remote.cloneDefault as a parallel to remote.pushDefault,
with prioritized name resolution:

  1. (Highest priority) git clone's -o option
  2. git config's remote.cloneDefault value
  3. origin

There should be no impact for existing users, as it's pretty unlikely
that anyone's already configured remote.cloneDefault and the porcelain
hasn't changed (as best I can tell at least!).

cc: Derrick Stolee stolee@gmail.com
cc: Sean Barag sean@barag.org

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 25, 2020

Welcome to GitGitGadget

Hi @sjbarag, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests.

Please make sure that your Pull Request has a good description, as it will be used as cover letter.

Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:

  • the lines should not exceed 76 columns,
  • the first line should be like a header and typically start with a prefix like "tests:" or "commit:", and
  • the commit messages' body should be describing the "why?" of the change.
  • Finally, the commit messages should end in a Signed-off-by: line matching the commits' author.

It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code.

Contributing the patches

Before you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form /allow. A good way to find other contributors is to locate recent pull requests where someone has been /allowed:

Both the person who commented /allow and the PR author are able to /allow you.

An alternative is the channel #git-devel on the FreeNode IRC network:

<newcontributor> I've just created my first PR, could someone please /allow me? https://github.com/gitgitgadget/git/pull/12345
<veteran> newcontributor: it is done
<newcontributor> thanks!

Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment /submit.

If you want to see what email(s) would be sent for a /submit request, add a PR comment /preview to have the email(s) sent to you. You must have a public GitHub email address for this.

After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail).

If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the (raw) link), then import it into your mail program. If you use GMail, you can do this via:

curl -g --user "<EMailAddress>:<Password>" \
    --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt

To iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description):

Changes since v1:
- Fixed a typo in the commit message (found by ...)
- Added a code comment to ... as suggested by ...
...

To send a new iteration, just add another PR comment with the contents: /submit.

Need help?

New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join.

You may also be able to find help in real time in the developer IRC channel, #git-devel on Freenode. Remember that IRC does not support offline messaging, so if you send someone a private message and log out, they cannot respond to you. The scrollback of #git-devel is archived, though.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 25, 2020

There are issues in commit 6fb2369:
Prefixed commit message must be in lower case: clone: Add remote.cloneDefault config option
Commit not signed off

@sjbarag sjbarag force-pushed the add-remote.cloneDefault-config branch from 6fb2369 to 1581587 Compare August 25, 2020 15:59
@gitgitgadget
Copy link

gitgitgadget bot commented Aug 25, 2020

There is an issue in commit 1581587:
Prefixed commit message must be in lower case: clone: Add remote.cloneDefault config option

@dscho
Copy link
Member

dscho commented Aug 25, 2020

/allow

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 25, 2020

User sjbarag is now allowed to use GitGitGadget.

WARNING: sjbarag has no public email address set on GitHub

@sjbarag
Copy link
Author

sjbarag commented Aug 25, 2020

Whoops! Email is now public.

Thanks @dscho!

While the default remote name of `origin` can be overridden both
pre-clone (with `git clone --origin foo`) and post-clone (with `git
remote rename origin foo`), it'd be handy to have a configurable
system-wide default for clones!  This commit implements
`remote.cloneDefault` as a parallel to `remote.pushDefault`,
with prioritized name resolution:

1. (Highest priority) `git clone`'s `-o` option
2. `git config`'s `remote.cloneDefault` value
3. `origin`

There should be no impact for existing users, as it's pretty unlikely
that anyone's already configured `remote.cloneDefault` and the porcelain
hasn't changed (as best I can tell at least!).

Signed-off-by: Sean Barag <sean@barag.org>
@sjbarag sjbarag force-pushed the add-remote.cloneDefault-config branch from 1581587 to 9cb1898 Compare August 25, 2020 20:01
@sjbarag sjbarag changed the title clone: Add remote.cloneDefault config option clone: add remote.cloneDefault config option Aug 25, 2020
@dscho
Copy link
Member

dscho commented Aug 26, 2020

Time for a /preview?

@sjbarag
Copy link
Author

sjbarag commented Aug 26, 2020

/preview

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 26, 2020

Preview email sent as pull.710.git.1598456141200.gitgitgadget@gmail.com

@sjbarag
Copy link
Author

sjbarag commented Aug 26, 2020

Preview looked sane, so I'm going to /submit 🤞

Thanks for all your help @dscho! You've made this very easy for a newcomer.

@sjbarag
Copy link
Author

sjbarag commented Aug 26, 2020

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 26, 2020

Submitted as pull.710.git.1598456751674.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-710/sjbarag/add-remote.cloneDefault-config-v1

To fetch this version to local tag pr-710/sjbarag/add-remote.cloneDefault-config-v1:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-710/sjbarag/add-remote.cloneDefault-config-v1

@dscho
Copy link
Member

dscho commented Aug 26, 2020

Preview looked sane, so I'm going to /submit 🤞

Thanks for all your help @dscho! You've made this very easy for a newcomer.

I'm glad!

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 26, 2020

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Sean Barag via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Sean Barag <sean@barag.org>
>
> While the default remote name of `origin` can be overridden both
> pre-clone (with `git clone --origin foo`) and post-clone (with `git
> remote rename origin foo`), it'd be handy to have a configurable
> system-wide default for clones!  

I doubt it is handy enough to deserve an explamation point.  Replace
it with a plain-vanilla full-stop instead.

I however tend to agree that, even evidenced by the manual page
description of "git clone", i.e.

    -o <name>::
    --origin <name>::
            Instead of using the remote name `origin` to keep track
            of the upstream repository, use `<name>`.

that it is understandable if many users and projects wish to call it
"upstream".

> This commit implements
> `remote.cloneDefault` as a parallel to `remote.pushDefault`,
> with prioritized name resolution:

I highly doubt that .cloneDefault is a good name.  After reading
only the title of the patch e-mail, i.e. when the only available
information on the change available to me was the name of the
configuration variable and the fact that it pertains to the command
"git clone", I thought it is to specify a URL, from which "git
clone" without the URL would clone from that single repository.

And the name will cause the same misunderstanding to normal users,
not just to reviewers of your patch, after this change hits a future
Git release.

Taking a parallel from init.defaultBranchName, I would probably call
it clone.defaultUpstreamName if I were writing this feature.

> diff --git a/builtin/clone.c b/builtin/clone.c
> index b087ee40c2..b0dbb848c6 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -941,6 +941,29 @@ static int path_exists(const char *path)
>  	return !stat(path, &sb);
>  }
>  
> +struct clone_default_info
> +{
> +	enum config_scope scope;
> +	struct strbuf remote_name;
> +	int linenr;
> +};
> +
> +static int config_read_clone_default(const char *key, const char *value,
> +	void *cb)
> +{
> +	struct clone_default_info* info = cb;
> +	if (strcmp(key, "remote.clonedefault") || !value) {
> +		return 0;
> +	}
> +
> +	info->scope = current_config_scope();
> +	strbuf_reset(&info->remote_name);
> +	strbuf_addstr(&info->remote_name, value);
> +	info->linenr = current_config_line();
> +
> +	return 0;
> +}

This feels way overkill and insufficient at the same time.  It does
not need scope, it does not need linenr, and we already have a place
to store end-user specified name for the upstream in the form of the
variable option_origin.  And the code is not diagnosing any error.

static int git_clone_config(const char *k, const char *v, void *cb)
{
	if (option_origin)
		return 0; /* ignore -- the user gave us an option */

	if (!strcmp(k, "clone.defaultupstreamname")) {
		if (!v)
			return config_error_nonbool(k);
		if (strchr(v, '/') || check_refname_format(v, REFNAME_ALLOW_ONELEVEL))
                	return error(_("invalid upstream name '%s'"), v);
		option_origin = xstrdup(v);
		return 0;
	}
	return 0;
}

would be sufficient, and at the same time makes sure it rejects
names like 'o..ri..gin', 'o/ri/gin', etc.

> @@ -992,8 +1015,15 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  		option_no_checkout = 1;
>  	}
>  
> -	if (!option_origin)
> -		option_origin = "origin";
> +	if (!option_origin) {
> +		struct clone_default_info clone_default = { CONFIG_SCOPE_UNKNOWN, STRBUF_INIT, -1 };
> +		git_config(config_read_clone_default, &clone_default);
> +		if (strcmp("", (const char*) clone_default.remote_name.buf))
> +			option_origin = clone_default.remote_name.buf;
> +		else
> +			option_origin = "origin";
> +	}
> +

It is somewhat sad that we have the git_config(git_default_config)
call so late in the control flow.  I wonder if we can update the
start-up sequence to match the usual flow, e.g. these things happen
in the following order in a normal program.

    - variables are given their default value.

    - call git_config(git_commandspecific_config, ...) early in the
      program.

      - git_commandspecific_config() interprets the command specific
        variables and passes everything else to git_default_config()

      variables like option_origin are given values of configuration
      variable at this point.

    - call parse_options() next, which may override the variables
      from the value on the command line.

    - main control flow uses the variable.  "Command-line wins over
      configuration which wins over the default" falls out naturally.

One oddity "git clone" has is that it wants to delay the reading of
configuration files (they are read only once, and second and
subsequent git_config() calls will reuse what was read before [*])
so that it can read what clone.c::write_config() wrote, so if we were
to "fix" the start-up sequence to match the usual flow, we need to
satisfy what that odd arrangement wanted to achieve in some other
way (e.g. feed what is in option_config to git_default_config
ourselves, without using git_config(), as part of the "main control
flow uses the variable" part), but it should be doable.

	[*Side note*].  The above means that this patch, even when
	the configuration variable does not give upstream name, may
	break the option_config feature by breaking the second call
	to git_config().  We need to have a test for that.

> diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
> index e69427f881..8aac67b385 100755
> --- a/t/t5606-clone-options.sh
> +++ b/t/t5606-clone-options.sh
> @@ -19,6 +19,20 @@ test_expect_success 'clone -o' '
>  
>  '
>  
> +test_expect_success 'clone respects remote.cloneDefault' '
> +
> +	git -c remote.cloneDefault=bar clone parent clone-config &&
> +	(cd clone-config && git rev-parse --verify refs/remotes/bar/master)
> +
> +'
> +
> +test_expect_success 'clone chooses correct remote name' '
> +
> +	git -c remote.cloneDefault=bar clone -o foo parent clone-o-and-config &&
> +	(cd clone-o-and-config && git rev-parse --verify refs/remotes/foo/master)
> +
> +'

These two are "showing off my shiny new toy" tests, which are
needed, but we also need negative tests where the shiny new toy does
not kick in when it should not.  For example

	git -c remote.cloneDefault="bad.../...name" clone parent

should fail, no?

Thanks.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 26, 2020

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 8/26/2020 2:46 PM, Junio C Hamano wrote:
> "Sean Barag via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> This commit implements
>> `remote.cloneDefault` as a parallel to `remote.pushDefault`,
>> with prioritized name resolution:
> 
> I highly doubt that .cloneDefault is a good name.  After reading
> only the title of the patch e-mail, i.e. when the only available
> information on the change available to me was the name of the
> configuration variable and the fact that it pertains to the command
> "git clone", I thought it is to specify a URL, from which "git
> clone" without the URL would clone from that single repository.
> 
> And the name will cause the same misunderstanding to normal users,
> not just to reviewers of your patch, after this change hits a future
> Git release.
> 
> Taking a parallel from init.defaultBranchName, I would probably call
> it clone.defaultUpstreamName if I were writing this feature.

I was thinking "clone.defaultRemoteName" makes it clear we are naming
the remote for the provided <url> in the command. Having

	clone.defaultUpstreamName = upstream

may look a bit confusing, while

	clone.defaultRemoteName = upstream

makes it a bit clearer that we will set

	remote.upstream.url = <url>

>> diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
>> index e69427f881..8aac67b385 100755
>> --- a/t/t5606-clone-options.sh
>> +++ b/t/t5606-clone-options.sh
>> @@ -19,6 +19,20 @@ test_expect_success 'clone -o' '
>>  
>>  '
>>  
>> +test_expect_success 'clone respects remote.cloneDefault' '
>> +
>> +	git -c remote.cloneDefault=bar clone parent clone-config &&
>> +	(cd clone-config && git rev-parse --verify refs/remotes/bar/master)
>> +
>> +'
>> +
>> +test_expect_success 'clone chooses correct remote name' '
>> +
>> +	git -c remote.cloneDefault=bar clone -o foo parent clone-o-and-config &&
>> +	(cd clone-o-and-config && git rev-parse --verify refs/remotes/foo/master)
>> +
>> +'
> 
> These two are "showing off my shiny new toy" tests, which are
> needed, but we also need negative tests where the shiny new toy does
> not kick in when it should not.  For example
> 
> 	git -c remote.cloneDefault="bad.../...name" clone parent
> 
> should fail, no?

This is an important suggestion.

Thanks,
-Stolee

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 26, 2020

User Derrick Stolee <stolee@gmail.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 26, 2020

On the Git mailing list, Junio C Hamano wrote (reply to this):

Derrick Stolee <stolee@gmail.com> writes:

> On 8/26/2020 2:46 PM, Junio C Hamano wrote:
>> "Sean Barag via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>> This commit implements
>>> `remote.cloneDefault` as a parallel to `remote.pushDefault`,
>>> with prioritized name resolution:
>> 
>> I highly doubt that .cloneDefault is a good name.  After reading
>> only the title of the patch e-mail, i.e. when the only available
>> information on the change available to me was the name of the
>> configuration variable and the fact that it pertains to the command
>> "git clone", I thought it is to specify a URL, from which "git
>> clone" without the URL would clone from that single repository.
>> 
>> And the name will cause the same misunderstanding to normal users,
>> not just to reviewers of your patch, after this change hits a future
>> Git release.
>> 
>> Taking a parallel from init.defaultBranchName, I would probably call
>> it clone.defaultUpstreamName if I were writing this feature.
>
> I was thinking "clone.defaultRemoteName" makes it clear we are naming
> the remote for the provided <url> in the command.

I 100% agree that defaultremotename is much better.

>> ...  For example
>> 
>> 	git -c remote.cloneDefault="bad.../...name" clone parent
>> 
>> should fail, no?
>
> This is an important suggestion.

To be fair, the current code does not handle the "--origin" command
line option not so carefully.

Back when the command was scripted, e.g. 47874d6d (revamp git-clone
(take #2)., 2006-03-21), had both ref-format check and */*
multi-level check, and these checks been retained throughout its
life until 8434c2f1 (Build in clone, 2008-04-27) rewrote the whole
thing while discarding these checks for --origin=bad.../...name

It would make an excellent #leftoverbits or #microproject.

Thanks.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 27, 2020

On the Git mailing list, Sean Barag wrote (reply to this):

> Derrick Stolee <stolee@gmail.com> writes:
> 
>> On 8/26/2020 2:46 PM, Junio C Hamano wrote:
>>> "Sean Barag via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>>> This commit implements
>>>> `remote.cloneDefault` as a parallel to `remote.pushDefault`,
>>>> with prioritized name resolution:
>>> 
>>> I highly doubt that .cloneDefault is a good name.  After reading
>>> only the title of the patch e-mail, i.e. when the only available
>>> information on the change available to me was the name of the
>>> configuration variable and the fact that it pertains to the command
>>> "git clone", I thought it is to specify a URL, from which "git
>>> clone" without the URL would clone from that single repository.
>>> 
>>> And the name will cause the same misunderstanding to normal users,
>>> not just to reviewers of your patch, after this change hits a future
>>> Git release.
>>> 
>>> Taking a parallel from init.defaultBranchName, I would probably call
>>> it clone.defaultUpstreamName if I were writing this feature.
>>
>> I was thinking "clone.defaultRemoteName" makes it clear we are naming
>> the remote for the provided <url> in the command.
>
>I 100% agree that defaultremotename is much better.

Perfect, I'll move forward with `clone.defaultRemoteName`.  Thanks for
the recommendation.  This would be the first config variable inside
the a "clone" subsection -- is there anything special that needs to
happen when a new subsection is added?

>>>> ...  For example
>>> 
>>> 	git -c remote.cloneDefault="bad.../...name" clone parent
>>> 
>>> should fail, no?
>>
>> This is an important suggestion.
>
> To be fair, the current code does not handle the "--origin" command
> line option not so carefully.

Agreed - I'm sorry for not including those tests.  They'll be present
in v2.  I'll be sure to include some validation for
`clone.defaultRemoteName` within `git_config` as well.

> It is somewhat sad that we have the git_config(git_default_config)
> call so late in the control flow.  I wonder if we can update the
> start-up sequence to match the usual flow
> ...
> One oddity "git clone" has is that it wants to delay the reading of
> configuration files (they are read only once, and second and
> subsequent git_config() calls will reuse what was read before [*]) so
> that it can read what clone.c::write_config() wrote, so if we were to
> "fix" the start-up sequence to match the usual flow, we need to
> satisfy what that odd arrangement wanted to achieve in some other way
> (e.g. feed what is in option_config to git_default_config ourselves,
> without using git_config(), as part of the "main control flow uses the
> variable" part), but it should be doable.

Sounds like a pretty big change! I'm willing to take a crack at it,
but given that this is my first patch I'm frankly a bit intimidated :)
How would you feel about that being a separate patch?

Thanks for all the guidance, folks.
Sean

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 27, 2020

User Sean Barag <sean@barag.org> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 27, 2020

On the Git mailing list, Junio C Hamano wrote (reply to this):

Sean Barag <sean@barag.org> writes:

>> It is somewhat sad that we have the git_config(git_default_config)
>> call so late in the control flow.  I wonder if we can update the
>> start-up sequence to match the usual flow
>> ...
>> One oddity "git clone" has is that it wants to delay the reading of
>> configuration files (they are read only once, and second and
>> subsequent git_config() calls will reuse what was read before [*]) so
>> that it can read what clone.c::write_config() wrote, so if we were to
>> "fix" the start-up sequence to match the usual flow, we need to
>> satisfy what that odd arrangement wanted to achieve in some other way
>> (e.g. feed what is in option_config to git_default_config ourselves,
>> without using git_config(), as part of the "main control flow uses the
>> variable" part), but it should be doable.
>
> Sounds like a pretty big change! I'm willing to take a crack at it,
> but given that this is my first patch I'm frankly a bit intimidated :)
> How would you feel about that being a separate patch?

It definitely needs to be a separate patch.  

In order to realize any new feature that needs to read the existing
(i.e. per-machine or per-user) configuration files to affect the
behaviour of "git clone", whether it is the "give default to
--origin option" or any other thing, first needs to fix the start-up
sequence so that the configuration is read once before we process
command line options, which is the norm.  Only after that is done,
we can build the clone.defaultRemoteName and other features that
would be affected by the settings of clone.* variables on top, and
it is not wise to mix the foundation with a new feature that uses
the foundation.  So I would imagine it would at least be 3-patch
series:

 - [1/3] would be to whip the start-up sequence into the normal
   order.  This may be the most tricky part.  I identified that the
   handling of option_config might need adjustment, but there may be
   others.  This may not need any new tests, but if the existing
   tests for "clone -c var=val" do not catch breakage when we
   naively move the git_config(git_default_config) call early
   without doing any other adjustment, we might need to add more
   tests to cover the option.  If we find things other than
   option_config needs adjustment, they also need test coverage.

 - [2/3] would be to tighten the error checking of option_origin
   that was lost when the command was reimplemented in C (already
   discussed).  This would need new tests to see "--origin $bogus"
   command line option is flagged as an error.

 - [3/3] would be to read option_origin from the configuration file.
   The start-up sequence fixed by [1/3] would allow us to write the
   config callback in a more natural way, compared to what you wrote
   and what I suggested to rewrite.  Error checking code in [2/3]
   would directly be reusable in the code added in this step.  We'd
   need tests similar to we add in [2/3] for the configuration, and
   combination of configuration and command line (you already wrote
   most of these).

Thanks.


@gitgitgadget
Copy link

gitgitgadget bot commented Aug 27, 2020

On the Git mailing list, Sean Barag wrote (reply to this):

> In order to realize any new feature that needs to read the existing
> (i.e. per-machine or per-user) configuration files to affect the
> behaviour of "git clone", whether it is the "give default to
> --origin option" or any other thing, first needs to fix the start-up
> sequence so that the configuration is read once before we process
> command line options, which is the norm.  Only after that is done,
> we can build the clone.defaultRemoteName and other features that
> would be affected by the settings of clone.* variables on top, and
> it is not wise to mix the foundation with a new feature that uses
> the foundation.

Makes sense!  Thanks for all your help -- I _really_ appreciate it.
I'll give it a try over the next few days.

@sjbarag
Copy link
Author

sjbarag commented Aug 27, 2020

Taking a different approach — will submit separate patch(es) later!

@sjbarag sjbarag closed this Aug 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants