Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Anonymize URLs in the reflog #797

Closed
wants to merge 1 commit into from
Closed

Conversation

dscho
Copy link
Member

@dscho dscho commented May 30, 2020

This came up in an internal audit, but we do not consider this to be a big deal: the reflog is local and not really shared with anybody.

Changes since v1:

  • Changed the if...else if...else cadence to move the die() to the last arm
  • Stopped the memory leak of display_repo (allocated by transport_anonymize_url())

@SHwareSystemsDev
Copy link

To me this should be a per-new-repo option, or if that's not enabled a per-commit-enabled-user option for those new repos, not a new behavior forced on all users after a client upgrade, to elect their commits get anonymized or not. For some the current behavior IS a feature, I'd think, not a bug.

@dscho
Copy link
Member Author

dscho commented May 30, 2020

You misunderstand. This is about the reflog. And about avoiding to write secrets to reflogs.

@dscho
Copy link
Member Author

dscho commented Jun 1, 2020

/submit

@gitgitgadget-git
Copy link

@gitgitgadget-git
Copy link

On the Git mailing list, Jeff King wrote (reply to this):

On Mon, Jun 01, 2020 at 07:20:02PM +0000, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> Even if we strongly discourage putting credentials into the URLs passed
> via the command-line, there _is_ support for that, and users _do_ do
> that.
> 
> Let's scrub them before writing them to the reflog.

Good idea.

>     This came up in an internal audit, but we do not consider this to be a
>     big deal: the reflog is local and not really shared with anybody.

Agreed.

>  builtin/clone.c            | 10 ++++++----
>  builtin/fetch.c            |  9 +++++++--
>  t/t5541-http-push-smart.sh | 15 +++++++++++++++

The patch itself looks very neatly done.

> @@ -993,11 +993,13 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  
>  	path = get_repo_path(repo_name, &is_bundle);
>  	if (path)
> -		repo = absolute_pathdup(repo_name);
> +		display_repo = repo = absolute_pathdup(repo_name);
>  	else if (!strchr(repo_name, ':'))
>  		die(_("repository '%s' does not exist"), repo_name);
> -	else
> +	else {
>  		repo = repo_name;
> +		display_repo = transport_anonymize_url(repo);
> +	}

Not introduced by your patch, but I had to read this a few times to make
sure we always end up with repo and display_repo set. IMHO it would be
easier to read as:

  if (this) {
     repo = ...;
     display_repo = ...;
  } else if (that) {
     repo = ...;
     display_repo = ...;
  } else {
     die(...);
  }

instead of sticking the die() in the middle.  Maybe just personal
preference, though. :)

> +	# should have been scrubbed down to vanilla URL
> +	git log -g master >reflog &&
> +	grep "$HTTPD_URL" reflog &&
> +	! grep "$HTTPD_URL_USER_PASS" reflog
> +'

And you make sure we retain the username. Nice.

-Peff

@gitgitgadget-git
Copy link

On the Git mailing list, Junio C Hamano wrote (reply to this):

Jeff King <peff@peff.net> writes:

> On Mon, Jun 01, 2020 at 07:20:02PM +0000, Johannes Schindelin via GitGitGadget wrote:
>
>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>> 
>> Even if we strongly discourage putting credentials into the URLs passed
>> via the command-line, there _is_ support for that, and users _do_ do
>> that.
>> 
>> Let's scrub them before writing them to the reflog.
>
> Good idea.
>
>>     This came up in an internal audit, but we do not consider this to be a
>>     big deal: the reflog is local and not really shared with anybody.
>
> Agreed.

Nice.

>>  builtin/clone.c            | 10 ++++++----
>>  builtin/fetch.c            |  9 +++++++--
>>  t/t5541-http-push-smart.sh | 15 +++++++++++++++
>
> The patch itself looks very neatly done.
>
>> @@ -993,11 +993,13 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>>  
>>  	path = get_repo_path(repo_name, &is_bundle);
>>  	if (path)
>> -		repo = absolute_pathdup(repo_name);
>> +		display_repo = repo = absolute_pathdup(repo_name);
>>  	else if (!strchr(repo_name, ':'))
>>  		die(_("repository '%s' does not exist"), repo_name);
>> -	else
>> +	else {
>>  		repo = repo_name;
>> +		display_repo = transport_anonymize_url(repo);
>> +	}
>
> Not introduced by your patch, but I had to read this a few times to make
> sure we always end up with repo and display_repo set. IMHO it would be
> easier to read as:
>
>   if (this) {
>      repo = ...;
>      display_repo = ...;
>   } else if (that) {
>      repo = ...;
>      display_repo = ...;
>   } else {
>      die(...);
>   }
>
> instead of sticking the die() in the middle.  Maybe just personal
> preference, though. :)

For a if/elseif cascade of few-liner blocks each, I do not think it
would matter, but if a block were larger, having the die() case at
the beginning or at the end would indeed make it easier to spot any
anomalies.

>> +	# should have been scrubbed down to vanilla URL
>> +	git log -g master >reflog &&
>> +	grep "$HTTPD_URL" reflog &&
>> +	! grep "$HTTPD_URL_USER_PASS" reflog
>> +'
>
> And you make sure we retain the username. Nice.

@gitgitgadget-git
Copy link

This branch is now known as js/reflog-anonymize-for-clone-and-fetch.

@gitgitgadget-git
Copy link

This patch series was integrated into pu via 9f4681d.

@gitgitgadget-git gitgitgadget-git bot added the pu label Jun 2, 2020
@gitgitgadget-git
Copy link

This patch series was integrated into pu via 64cc9df.

Even if we strongly discourage putting credentials into the URLs passed
via the command-line, there _is_ support for that, and users _do_ do
that.

Let's scrub them before writing them to the reflog.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
@gitgitgadget-git
Copy link

This patch series was integrated into pu via 4d21865.

@dscho
Copy link
Member Author

dscho commented Jun 4, 2020

/submit

@gitgitgadget-git
Copy link

@gitgitgadget-git
Copy link

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:

> diff --git a/builtin/clone.c b/builtin/clone.c
> index 1ad26f4d8c8..002d23ab0a2 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -939,7 +939,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  {
>  	int is_bundle = 0, is_local;
>  	const char *repo_name, *repo, *work_tree, *git_dir;
> -	char *path, *dir;
> +	char *path, *dir, *display_repo = NULL;
>  	int dest_exists;
>  	const struct ref *refs, *remote_head;
>  	const struct ref *remote_head_points_at;
> @@ -994,10 +994,11 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	path = get_repo_path(repo_name, &is_bundle);
>  	if (path)
>  		repo = absolute_pathdup(repo_name);
> -	else if (!strchr(repo_name, ':'))
> -		die(_("repository '%s' does not exist"), repo_name);
> -	else
> +	else if (strchr(repo_name, ':')) {
>  		repo = repo_name;
> +		display_repo = transport_anonymize_url(repo);
> +	} else
> +		die(_("repository '%s' does not exist"), repo_name);
>  
>  	/* no need to be strict, transport_set_option() will validate it again */
>  	if (option_depth && atoi(option_depth) < 1)
> @@ -1014,7 +1015,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  		die(_("destination path '%s' already exists and is not "
>  			"an empty directory."), dir);
>  
> -	strbuf_addf(&reflog_msg, "clone: from %s", repo);
> +	strbuf_addf(&reflog_msg, "clone: from %s",
> +		    display_repo ? display_repo : repo);
> +	free(display_repo);

The new patch is easier to see because display_repo becomes non NULL
only when anonymization was necessary and done.  Makes sense.

Will queue.  Thanks.

@gitgitgadget-git
Copy link

This patch series was integrated into pu via 93e9c45.

@gitgitgadget-git
Copy link

This patch series was integrated into pu via 9321da6.

@gitgitgadget-git
Copy link

This patch series was integrated into pu via 3228781.

@gitgitgadget-git
Copy link

This patch series was integrated into pu via 406df49.

@gitgitgadget-git
Copy link

This patch series was integrated into next via 2901ff9.

@gitgitgadget-git
Copy link

This patch series was integrated into pu via 7238482.

@gitgitgadget-git
Copy link

This patch series was integrated into pu via 524caf8.

@gitgitgadget-git
Copy link

This patch series was integrated into next via 524caf8.

@gitgitgadget-git
Copy link

This patch series was integrated into master via 524caf8.

@gitgitgadget-git
Copy link

Closed via 524caf8.

@dscho dscho deleted the anonymize-clone-reflog branch June 18, 2020 09:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants