-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GSOC][RFC] cat-file: reuse ref-filter logic #980
[GSOC][RFC] cat-file: reuse ref-filter logic #980
Conversation
/submit |
Submitted as pull.980.git.1623496458.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
@@ -144,6 +144,7 @@ enum atom_type { | |||
ATOM_BODY, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Christian Couder wrote (reply to this):
On Sat, Jun 12, 2021 at 1:14 PM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> Let `populate_value()`, `get_ref_atom_value()` and
> `format_ref_array_item()` get the return value of `get_value()`
> correctly. This can help us later let `cat-file --batch` get the
> correct error message and return value of `get_value()`.
Is it get_object() or get_value()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, ZheNing Hu wrote (reply to this):
Christian Couder <christian.couder@gmail.com> 于2021年6月13日周日 上午4:09写道:
>
> On Sat, Jun 12, 2021 at 1:14 PM ZheNing Hu via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Let `populate_value()`, `get_ref_atom_value()` and
> > `format_ref_array_item()` get the return value of `get_value()`
> > correctly. This can help us later let `cat-file --batch` get the
> > correct error message and return value of `get_value()`.
>
> Is it get_object() or get_value()?
Oh, it's get_object().
Thanks for pointing out:)
--
ZheNing Hu
0004d5b
to
cbf7d51
Compare
/submit |
Submitted as pull.980.v2.git.1623763746.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Junio C Hamano wrote (reply to this):
|
@@ -144,6 +144,7 @@ enum atom_type { | |||
ATOM_BODY, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):
"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: ZheNing Hu <adlternative@gmail.com>
>
> Let `populate_value()`, `get_ref_atom_value()` and
> `format_ref_array_item()` get the return value of `get_object()`
> correctly.
The "get" the value correctly, I think. What you are teaching them
is to pass the return value from get_object() through the callchain
to their callers.
The readers will be helped if you say what kind of errors
get_object() wants to tell its callers, not just "-1" is for error,
which is what populate_value() assumes to be sufficient. In other
words, which non-zero returns from get_object() are interesting and
why?
> @@ -1997,9 +1997,11 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
> static int get_ref_atom_value(struct ref_array_item *ref, int atom,
> struct atom_value **v, struct strbuf *err)
> {
> + int ret = 0;
> +
> if (!ref->value) {
> - if (populate_value(ref, err))
> - return -1;
> + if ((ret = populate_value(ref, err)))
> + return ret;
The new variable only needs to be in this scope, and does not have
to be shown to the entire function.
> @@ -2573,6 +2575,7 @@ int format_ref_array_item(struct ref_array_item *info,
> {
> const char *cp, *sp, *ep;
> struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
> + int ret = 0;
This is dubious...
> state.quote_style = format->quote_style;
> push_stack_element(&state.stack);
> @@ -2585,10 +2588,10 @@ int format_ref_array_item(struct ref_array_item *info,
> if (cp < sp)
> append_literal(cp, sp, &state);
> pos = parse_ref_filter_atom(format, sp + 2, ep, error_buf);
> - if (pos < 0 || get_ref_atom_value(info, pos, &atomv, error_buf) ||
> + if (pos < 0 || (ret = get_ref_atom_value(info, pos, &atomv, error_buf)) ||
Here, if "ret" gets assigned any non-zero value, the condition is
satisfied, and ...
> atomv->handler(atomv, &state, error_buf)) {
> pop_stack_element(&state.stack);
> - return -1;
> + return ret ? ret : -1;
... the control flow will leave this function. Therefore, ...
> }
> }
> if (*cp) {
> @@ -2610,7 +2613,7 @@ int format_ref_array_item(struct ref_array_item *info,
> }
> strbuf_addbuf(final_buf, &state.stack->output);
> pop_stack_element(&state.stack);
> - return 0;
> + return ret;
... at this point, "ret" can never be anything other than zero. Am
I misreading the patch?
If I am not misreading the patch, then "ret" does not have to be
globally visible in this function---it can have the same scope as
"pos".
> }
>
> void pretty_print_ref(const char *name, const struct object_id *oid,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, ZheNing Hu wrote (reply to this):
Junio C Hamano <gitster@pobox.com> 于2021年6月16日周三 下午3:36写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Let `populate_value()`, `get_ref_atom_value()` and
> > `format_ref_array_item()` get the return value of `get_object()`
> > correctly.
>
> The "get" the value correctly, I think. What you are teaching them
> is to pass the return value from get_object() through the callchain
> to their callers.
>
Yes, this is exactly what I meant.
> The readers will be helped if you say what kind of errors
> get_object() wants to tell its callers, not just "-1" is for error,
> which is what populate_value() assumes to be sufficient. In other
> words, which non-zero returns from get_object() are interesting and
> why?
>
As stated in 765337a, We can just print the error without exiting if the
return value of format_ref_array_item() is greater than 0. Therefore,
the current patch is to make get_object() return a value other than
-1 when an error occurs.
> > @@ -1997,9 +1997,11 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
> > static int get_ref_atom_value(struct ref_array_item *ref, int atom,
> > struct atom_value **v, struct strbuf *err)
> > {
> > + int ret = 0;
> > +
> > if (!ref->value) {
> > - if (populate_value(ref, err))
> > - return -1;
> > + if ((ret = populate_value(ref, err)))
> > + return ret;
>
> The new variable only needs to be in this scope, and does not have
> to be shown to the entire function.
>
Makes sense.
> > @@ -2573,6 +2575,7 @@ int format_ref_array_item(struct ref_array_item *info,
> > {
> > const char *cp, *sp, *ep;
> > struct ref_formatting_state state = REF_FORMATTING_STATE_INIT;
> > + int ret = 0;
>
> This is dubious...
>
> > state.quote_style = format->quote_style;
> > push_stack_element(&state.stack);
> > @@ -2585,10 +2588,10 @@ int format_ref_array_item(struct ref_array_item *info,
> > if (cp < sp)
> > append_literal(cp, sp, &state);
> > pos = parse_ref_filter_atom(format, sp + 2, ep, error_buf);
> > - if (pos < 0 || get_ref_atom_value(info, pos, &atomv, error_buf) ||
> > + if (pos < 0 || (ret = get_ref_atom_value(info, pos, &atomv, error_buf)) ||
>
> Here, if "ret" gets assigned any non-zero value, the condition is
> satisfied, and ...
>
> > atomv->handler(atomv, &state, error_buf)) {
> > pop_stack_element(&state.stack);
> > - return -1;
> > + return ret ? ret : -1;
>
> ... the control flow will leave this function. Therefore, ...
>
> > }
> > }
> > if (*cp) {
> > @@ -2610,7 +2613,7 @@ int format_ref_array_item(struct ref_array_item *info,
> > }
> > strbuf_addbuf(final_buf, &state.stack->output);
> > pop_stack_element(&state.stack);
> > - return 0;
> > + return ret;
>
> ... at this point, "ret" can never be anything other than zero. Am
> I misreading the patch?
>
> If I am not misreading the patch, then "ret" does not have to be
> globally visible in this function---it can have the same scope as
> "pos".
>
You are right, It is correct to only return 0 here at the moment.
Thanks.
--
ZheNing Hu
@@ -144,6 +144,7 @@ enum atom_type { | |||
ATOM_BODY, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):
"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: ZheNing Hu <adlternative@gmail.com>
>
> Introduce free_array_item_internal() for freeing ref_array_item value.
> It will be called internally by free_array_item(), and it will help
> `cat-file --batch` free ref_array_item's memory later.
As a file local static function, the horrible name free_array_item()
was tolerable. But before exposing a name like that to the outside
world, think twice if that is specific enough, and it is not. There
are 47 different kinds of "array"s we use in the system, but this
new helper function only works with ref_array_item and not on items
in any other kinds of arrays.
> -/* Free memory allocated for a ref_array_item */
> -static void free_array_item(struct ref_array_item *item)
> +void free_array_item_internal(struct ref_array_item *item)
> {
And "internal" is a horrible name to have as an external name. You
probably can come up with a more appropriate name when you imagine
yourself explaining to somebody who is relatively new to this part
of the codebase what the difference between free_array_item() and
this new helper is, where the difference comes from, why the symref
member (and no other member) is so special, etc.
I _think_ what is special is not the .symref but is the .value
field, IOW, you are trying to come up with an interface to free the
value part of ref_array_item without touching other things. But it
is not helpful at all to readers if you do not explain why you want
to do so. Why is the .value member so special? The ability to
clear only the .value member without touching other members is useful
because ...?
In any case, assuming that you'd establish why the .value member is
so special to deserve an externally callable function, when external
callers do not have to be able to free the item as a whole (i.e.
free_array_item() is still file-scope static), in the proposed log
message in an updated patch, I would imagine that
free_ref_array_item_value()
would be a more suitable name than the _internal thing. When it
happens, you might want to rename the static one to
free_ref_array_item() to match, even if it does not have external
callers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, ZheNing Hu wrote (reply to this):
Junio C Hamano <gitster@pobox.com> 于2021年6月16日周三 下午3:49写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > From: ZheNing Hu <adlternative@gmail.com>
> >
> > Introduce free_array_item_internal() for freeing ref_array_item value.
> > It will be called internally by free_array_item(), and it will help
> > `cat-file --batch` free ref_array_item's memory later.
>
> As a file local static function, the horrible name free_array_item()
> was tolerable. But before exposing a name like that to the outside
> world, think twice if that is specific enough, and it is not. There
> are 47 different kinds of "array"s we use in the system, but this
> new helper function only works with ref_array_item and not on items
> in any other kinds of arrays.
>
Well, this is indeed ignored by me.
So free_array_item() --> free_ref_array_item() and free_array_item()
--> free_ref_array_item().
> > -/* Free memory allocated for a ref_array_item */
> > -static void free_array_item(struct ref_array_item *item)
> > +void free_array_item_internal(struct ref_array_item *item)
> > {
>
> And "internal" is a horrible name to have as an external name. You
> probably can come up with a more appropriate name when you imagine
> yourself explaining to somebody who is relatively new to this part
> of the codebase what the difference between free_array_item() and
> this new helper is, where the difference comes from, why the symref
> member (and no other member) is so special, etc.
>
Yeah, "internal" is an incorrect naming. The main purpose here is
to only free a ref_array_item's value.
> I _think_ what is special is not the .symref but is the .value
> field, IOW, you are trying to come up with an interface to free the
> value part of ref_array_item without touching other things. But it
> is not helpful at all to readers if you do not explain why you want
> to do so. Why is the .value member so special? The ability to
> clear only the .value member without touching other members is useful
> because ...?
>
Because batch_object_write() use a ref_array_item which is allocated on the
stack, and it's member symref is not used at all. ref_array_item's symref and
itself do not need free. The original free_array_item() will free all
dynamically
allocated content in ref_array_item. It cannot meet our requirements
at this time:
only free the ref_array_item's value.
> In any case, assuming that you'd establish why the .value member is
> so special to deserve an externally callable function, when external
> callers do not have to be able to free the item as a whole (i.e.
> free_array_item() is still file-scope static), in the proposed log
> message in an updated patch, I would imagine that
>
> free_ref_array_item_value()
>
> would be a more suitable name than the _internal thing. When it
> happens, you might want to rename the static one to
> free_ref_array_item() to match, even if it does not have external
> callers.
Make sence.
--
ZheNing Hu
On the Git mailing list, ZheNing Hu wrote (reply to this):
|
@@ -1356,11 +1356,12 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):
On Sat, Jun 12 2021, ZheNing Hu via GitGitGadget wrote:
> From: ZheNing Hu <adlternative@gmail.com>
>
> Only tag and commit objects use `grab_sub_body_contents()` to grab
> object contents in the current codebase. We want to teach the
> function to also handle blobs and trees to get their raw data,
> without parsing a blob (whose contents looks like a commit or a tag)
> incorrectly as a commit or a tag.
>
> Skip the block of code that is specific to handling commits and tags
> early when the given object is of a wrong type to help later
> addition to handle other types of objects in this function.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: Hariom Verma <hariom18599@gmail.com>
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
> ---
> ref-filter.c | 24 +++++++++++++++---------
> 1 file changed, 15 insertions(+), 9 deletions(-)
>
> diff --git a/ref-filter.c b/ref-filter.c
> index 4db0e40ff4c6..5cee6512fbaf 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1356,11 +1356,12 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size
> }
>
> /* See grab_values */
> -static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
> +static void grab_sub_body_contents(struct atom_value *val, int deref, struct expand_data *data)
> {
> int i;
> const char *subpos = NULL, *bodypos = NULL, *sigpos = NULL;
> size_t sublen = 0, bodylen = 0, nonsiglen = 0, siglen = 0;
> + void *buf = data->content;
>
> for (i = 0; i < used_atom_cnt; i++) {
> struct used_atom *atom = &used_atom[i];
> @@ -1371,10 +1372,13 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
> continue;
> if (deref)
> name++;
> - if (strcmp(name, "body") &&
> - !starts_with(name, "subject") &&
> - !starts_with(name, "trailers") &&
> - !starts_with(name, "contents"))
> +
> + if ((data->type != OBJ_TAG &&
> + data->type != OBJ_COMMIT) ||
> + (strcmp(name, "body") &&
> + !starts_with(name, "subject") &&
> + !starts_with(name, "trailers") &&
> + !starts_with(name, "contents")))
We have 4 "real" object types, commit, tree, blob, tag. Do you really
mean "not tag or commit" here, don't you mean "is tree or blob" instead?
I.e. do we really want to pass OBJ_NONE etc. here?
> continue;
> if (!subpos)
> find_subpos(buf,
> @@ -1438,17 +1442,19 @@ static void fill_missing_values(struct atom_value *val)
> * pointed at by the ref itself; otherwise it is the object the
> * ref (which is a tag) refers to.
> */
> -static void grab_values(struct atom_value *val, int deref, struct object *obj, void *buf)
> +static void grab_values(struct atom_value *val, int deref, struct object *obj, struct expand_data *data)
> {
> + void *buf = data->content;
> +
> switch (obj->type) {
> case OBJ_TAG:
> grab_tag_values(val, deref, obj);
> - grab_sub_body_contents(val, deref, buf);
> + grab_sub_body_contents(val, deref, data);
> grab_person("tagger", val, deref, buf);
> break;
> case OBJ_COMMIT:
> grab_commit_values(val, deref, obj);
> - grab_sub_body_contents(val, deref, buf);
> + grab_sub_body_contents(val, deref, data);
> grab_person("author", val, deref, buf);
> grab_person("committer", val, deref, buf);
> break;
> @@ -1678,7 +1684,7 @@ static int get_object(struct ref_array_item *ref, int deref, struct object **obj
> return strbuf_addf_ret(err, -1, _("parse_object_buffer failed on %s for %s"),
> oid_to_hex(&oi->oid), ref->refname);
> }
> - grab_values(ref->value, deref, *obj, oi->content);
> + grab_values(ref->value, deref, *obj, oi);
> }
>
> grab_common_values(ref->value, deref, oi);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>> - if (strcmp(name, "body") &&
>> - !starts_with(name, "subject") &&
>> - !starts_with(name, "trailers") &&
>> - !starts_with(name, "contents"))
>> +
>> + if ((data->type != OBJ_TAG &&
>> + data->type != OBJ_COMMIT) ||
>> + (strcmp(name, "body") &&
>> + !starts_with(name, "subject") &&
>> + !starts_with(name, "trailers") &&
>> + !starts_with(name, "contents")))
>
> We have 4 "real" object types, commit, tree, blob, tag. Do you really
> mean "not tag or commit" here, don't you mean "is tree or blob" instead?
> I.e. do we really want to pass OBJ_NONE etc. here?
>
>> continue;
If somebody throws OBJ_NONE at us by mistake, we do not want to
handle such an object and try to extract the subject member from it
anyway, no?
The intent of the code here, before the patch, is that "what we do
after the control flow passes this point is about the body, subject,
trailers, and contents request, so everybody else should go to the
next iteration". The caller used to give us an object compatible
with these four types of requests, now the caller may throw others,
hence "by the way, we know these four kinds of requests make sense
only for tags and commits, so everybody else should go to the next
iteration, too" would be a natural thing to add. So in that sense,
I prefer it over "we know these four types of requests do not make
sense for blobs and trees".
User |
@@ -235,6 +235,15 @@ and `date` to extract the named component. For email fields (`authoremail`, | |||
without angle brackets, and `:localpart` to get the part before the `@` symbol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):
On Sat, Jun 12 2021, ZheNing Hu via GitGitGadget wrote:
> From: ZheNing Hu <adlternative@gmail.com>
>
> Add new formatting option `%(raw)`, which will print the raw
> object data without any changes. It will help further to migrate
> all cat-file formatting logic from cat-file to ref-filter.
Nice goal and feature to have.
> The raw data of blob, tree objects may contain '\0', but most of
> the logic in `ref-filter` depends on the output of the atom being
> text (specifically, no embedded NULs in it).
>
> E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
> add the data to the buffer. The raw data of a tree object is
> `100644 one\0...`, only the `100644 one` will be added to the buffer,
> which is incorrect.
>
> Therefore, add a new member in `struct atom_value`: `s_size`, which
> can record raw object size, it can help us add raw object data to
> the buffer or compare two buffers which contain raw object data.
Most of the functions that deal with this already use a strbuf in some
way, before we had a const char *, now there's a size_t to go along with
it, why not simply use a strbuf in the struct for the data? You'll then
get the size and \0 handling for free, and any functions to deal with
conversion can stick to the strbuf API, there seems to be a lot of back
and forth now.
> Beyond, `--format=%(raw)` cannot be used with `--python`, `--shell`,
> `--tcl`, `--perl` because if our binary raw data is passed to a variable
> in the host language, the host language may not support arbitrary binary
> data in the variables of its string type.
Perl at least deals with that just fine, and to the extent that it
doesn't any new problems here would have nothing to do with \0 being in
the data. Perl doesn't have a notion of "binary has \0 in it", it always
supports \0, it has a notion of "is it utf-8 or not?", so any encoding
problems wouldn't be new. I'd think that the same would be true of
Python, but I'm not sure.
> +test_expect_success 'basic atom: refs/tags/testtag *raw' '
> + git cat-file commit refs/tags/testtag^{} >expected &&
> + git for-each-ref --format="%(*raw)" refs/tags/testtag >actual &&
> + sanitize_pgp <expected >expected.clean &&
> + sanitize_pgp <actual >actual.clean &&
> + echo "" >>expected.clean &&
Just "echo" will do, ditto for the rest. Also odd to go back and forth
between populating expected.clean & actual.clean.
> +test_expect_success 'set up refs pointing to binary blob' '
> + printf "a\0b\0c" >blob1 &&
> + printf "a\0c\0b" >blob2 &&
> + printf "\0a\0b\0c" >blob3 &&
> + printf "abc" >blob4 &&
> + printf "\0 \0 \0 " >blob5 &&
> + printf "\0 \0a\0 " >blob6 &&
> + printf " " >blob7 &&
> + >blob8 &&
> + git hash-object blob1 -w | xargs git update-ref refs/myblobs/blob1 &&
> + git hash-object blob2 -w | xargs git update-ref refs/myblobs/blob2 &&
> + git hash-object blob3 -w | xargs git update-ref refs/myblobs/blob3 &&
> + git hash-object blob4 -w | xargs git update-ref refs/myblobs/blob4 &&
> + git hash-object blob5 -w | xargs git update-ref refs/myblobs/blob5 &&
> + git hash-object blob6 -w | xargs git update-ref refs/myblobs/blob6 &&
> + git hash-object blob7 -w | xargs git update-ref refs/myblobs/blob7 &&
> + git hash-object blob8 -w | xargs git update-ref refs/myblobs/blob8
Hrm, xargs just to avoid:
git update-ref ... $(git hash-object) ?
> +test_expect_success '%(raw) with --python must failed' '
> + test_must_fail git for-each-ref --format="%(raw)" --python
> +'
> +
> +test_expect_success '%(raw) with --tcl must failed' '
> + test_must_fail git for-each-ref --format="%(raw)" --tcl
> +'
> +
> +test_expect_success '%(raw) with --perl must failed' '
> + test_must_fail git for-each-ref --format="%(raw)" --perl
> +'
> +
> +test_expect_success '%(raw) with --shell must failed' '
> + test_must_fail git for-each-ref --format="%(raw)" --shell
> +'
> +
> +test_expect_success '%(raw) with --shell and --sort=raw must failed' '
> + test_must_fail git for-each-ref --format="%(raw)" --sort=raw --shell
> +'
s/must failed/must fail/, but see question above about encoding in these
languages...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Junio C Hamano wrote (reply to this):
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>> Beyond, `--format=%(raw)` cannot be used with `--python`, `--shell`,
>> `--tcl`, `--perl` because if our binary raw data is passed to a variable
>> in the host language, the host language may not support arbitrary binary
>> data in the variables of its string type.
>
> Perl at least deals with that just fine, and to the extent that it
> doesn't any new problems here would have nothing to do with \0 being in
> the data. Perl doesn't have a notion of "binary has \0 in it", it always
> supports \0, it has a notion of "is it utf-8 or not?", so any encoding
> problems wouldn't be new. I'd think that the same would be true of
> Python, but I'm not sure.
During an earlier iteration long time ago, as we knew Perl is
capable of handling sequence of bytes including NUL, it was decided
to start more strict by rejecting binary for all languages, which
can later be loosened, to limit the scope of the initial
implementation.
>> + git hash-object blob1 -w | xargs git update-ref refs/myblobs/blob1 &&
>> + git hash-object blob2 -w | xargs git update-ref refs/myblobs/blob2 &&
>> + git hash-object blob3 -w | xargs git update-ref refs/myblobs/blob3 &&
>> + git hash-object blob4 -w | xargs git update-ref refs/myblobs/blob4 &&
>> + git hash-object blob5 -w | xargs git update-ref refs/myblobs/blob5 &&
>> + git hash-object blob6 -w | xargs git update-ref refs/myblobs/blob6 &&
>> + git hash-object blob7 -w | xargs git update-ref refs/myblobs/blob7 &&
>> + git hash-object blob8 -w | xargs git update-ref refs/myblobs/blob8
>
> Hrm, xargs just to avoid:
>
> git update-ref ... $(git hash-object) ?
That's horrible. Thanks for noticing.
We'd want to catch segfaults from both hash-object and update-ref.
One way to do so may be
O=$(git hash-object -w blob1) &&
git update-ref refs/myblobs/blob1 "$O"
Thanks for a review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, ZheNing Hu wrote (reply to this):
Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2021年6月17日周四 下午3:16写道:
>
> > The raw data of blob, tree objects may contain '\0', but most of
> > the logic in `ref-filter` depends on the output of the atom being
> > text (specifically, no embedded NULs in it).
> >
> > E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
> > add the data to the buffer. The raw data of a tree object is
> > `100644 one\0...`, only the `100644 one` will be added to the buffer,
> > which is incorrect.
> >
> > Therefore, add a new member in `struct atom_value`: `s_size`, which
> > can record raw object size, it can help us add raw object data to
> > the buffer or compare two buffers which contain raw object data.
>
> Most of the functions that deal with this already use a strbuf in some
> way, before we had a const char *, now there's a size_t to go along with
> it, why not simply use a strbuf in the struct for the data? You'll then
> get the size and \0 handling for free, and any functions to deal with
> conversion can stick to the strbuf API, there seems to be a lot of back
> and forth now.
>
Yes, strbuf is a suitable choice when using <str,len> pair.
But if replace v->s with strbuf, the possible changes will be larger.
> > Beyond, `--format=%(raw)` cannot be used with `--python`, `--shell`,
> > `--tcl`, `--perl` because if our binary raw data is passed to a variable
> > in the host language, the host language may not support arbitrary binary
> > data in the variables of its string type.
>
> Perl at least deals with that just fine, and to the extent that it
> doesn't any new problems here would have nothing to do with \0 being in
> the data. Perl doesn't have a notion of "binary has \0 in it", it always
> supports \0, it has a notion of "is it utf-8 or not?", so any encoding
> problems wouldn't be new. I'd think that the same would be true of
> Python, but I'm not sure.
>
Not python safe. See [1].
Regarding the perl language, I support Junio's point of view: it can be
re-supported in the future.
>
> > +test_expect_success 'basic atom: refs/tags/testtag *raw' '
> > + git cat-file commit refs/tags/testtag^{} >expected &&
> > + git for-each-ref --format="%(*raw)" refs/tags/testtag >actual &&
> > + sanitize_pgp <expected >expected.clean &&
> > + sanitize_pgp <actual >actual.clean &&
> > + echo "" >>expected.clean &&
>
> Just "echo" will do, ditto for the rest. Also odd to go back and forth
> between populating expected.clean & actual.clean.
>
Are you saying that sanitize_pgp is not needed?
>
> > +test_expect_success 'set up refs pointing to binary blob' '
> > + printf "a\0b\0c" >blob1 &&
> > + printf "a\0c\0b" >blob2 &&
> > + printf "\0a\0b\0c" >blob3 &&
> > + printf "abc" >blob4 &&
> > + printf "\0 \0 \0 " >blob5 &&
> > + printf "\0 \0a\0 " >blob6 &&
> > + printf " " >blob7 &&
> > + >blob8 &&
> > + git hash-object blob1 -w | xargs git update-ref refs/myblobs/blob1 &&
> > + git hash-object blob2 -w | xargs git update-ref refs/myblobs/blob2 &&
> > + git hash-object blob3 -w | xargs git update-ref refs/myblobs/blob3 &&
> > + git hash-object blob4 -w | xargs git update-ref refs/myblobs/blob4 &&
> > + git hash-object blob5 -w | xargs git update-ref refs/myblobs/blob5 &&
> > + git hash-object blob6 -w | xargs git update-ref refs/myblobs/blob6 &&
> > + git hash-object blob7 -w | xargs git update-ref refs/myblobs/blob7 &&
> > + git hash-object blob8 -w | xargs git update-ref refs/myblobs/blob8
>
> Hrm, xargs just to avoid:
>
> git update-ref ... $(git hash-object) ?
>
I didn’t think about it, just for convenience.
> > +test_expect_success '%(raw) with --python must failed' '
> > + test_must_fail git for-each-ref --format="%(raw)" --python
> > +'
> > +
> > +test_expect_success '%(raw) with --tcl must failed' '
> > + test_must_fail git for-each-ref --format="%(raw)" --tcl
> > +'
> > +
> > +test_expect_success '%(raw) with --perl must failed' '
> > + test_must_fail git for-each-ref --format="%(raw)" --perl
> > +'
> > +
> > +test_expect_success '%(raw) with --shell must failed' '
> > + test_must_fail git for-each-ref --format="%(raw)" --shell
> > +'
> > +
> > +test_expect_success '%(raw) with --shell and --sort=raw must failed' '
> > + test_must_fail git for-each-ref --format="%(raw)" --sort=raw --shell
> > +'
>
> s/must failed/must fail/, but see question above about encoding in these
> languages...
[1]: https://lore.kernel.org/git/CAOLTT8QR_GRm4TYk0E_eazQ+unVQODc-3L+b4V5JUN5jtZR8uA@mail.gmail.com/
Thanks for a review.
--
ZheNing Hu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):
On Thu, Jun 17 2021, ZheNing Hu wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2021年6月17日周四 下午3:16写道:
>>
>> > The raw data of blob, tree objects may contain '\0', but most of
>> > the logic in `ref-filter` depends on the output of the atom being
>> > text (specifically, no embedded NULs in it).
>> >
>> > E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
>> > add the data to the buffer. The raw data of a tree object is
>> > `100644 one\0...`, only the `100644 one` will be added to the buffer,
>> > which is incorrect.
>> >
>> > Therefore, add a new member in `struct atom_value`: `s_size`, which
>> > can record raw object size, it can help us add raw object data to
>> > the buffer or compare two buffers which contain raw object data.
>>
>> Most of the functions that deal with this already use a strbuf in some
>> way, before we had a const char *, now there's a size_t to go along with
>> it, why not simply use a strbuf in the struct for the data? You'll then
>> get the size and \0 handling for free, and any functions to deal with
>> conversion can stick to the strbuf API, there seems to be a lot of back
>> and forth now.
>>
>
> Yes, strbuf is a suitable choice when using <str,len> pair.
> But if replace v->s with strbuf, the possible changes will be larger.
I for one would like to see it done that way, those changes are usually
easy to read. Also it seems a large part of 2/8 is extra new code
because we didn't do that, e.g. getting length differently if something
is a strbuf or not, passing char*/size_t pairs to new functions etc.
>> > Beyond, `--format=%(raw)` cannot be used with `--python`, `--shell`,
>> > `--tcl`, `--perl` because if our binary raw data is passed to a variable
>> > in the host language, the host language may not support arbitrary binary
>> > data in the variables of its string type.
>>
>> Perl at least deals with that just fine, and to the extent that it
>> doesn't any new problems here would have nothing to do with \0 being in
>> the data. Perl doesn't have a notion of "binary has \0 in it", it always
>> supports \0, it has a notion of "is it utf-8 or not?", so any encoding
>> problems wouldn't be new. I'd think that the same would be true of
>> Python, but I'm not sure.
>>
>
> Not python safe. See [1].
> Regarding the perl language, I support Junio's point of view: it can be
> re-supported in the future.
Ah, I'd missed that. Anyway, if it's easy it seems you discovered that
Perl deals with it correctly, so we could just have it support this.
>>
>> > +test_expect_success 'basic atom: refs/tags/testtag *raw' '
>> > + git cat-file commit refs/tags/testtag^{} >expected &&
>> > + git for-each-ref --format="%(*raw)" refs/tags/testtag >actual &&
>> > + sanitize_pgp <expected >expected.clean &&
>> > + sanitize_pgp <actual >actual.clean &&
>> > + echo "" >>expected.clean &&
>>
>> Just "echo" will do, ditto for the rest. Also odd to go back and forth
>> between populating expected.clean & actual.clean.
>>
>
> Are you saying that sanitize_pgp is not needed?
No that instead of:
echo "" >x
You can do:
echo >x
And also that going back and forth between populating different files is
confusing, i.e. this:
echo a >x
echo c >y
echo b >>x
is better as:
echo a >x
echo b >>x
echo c >y
>>
>> > +test_expect_success 'set up refs pointing to binary blob' '
>> > + printf "a\0b\0c" >blob1 &&
>> > + printf "a\0c\0b" >blob2 &&
>> > + printf "\0a\0b\0c" >blob3 &&
>> > + printf "abc" >blob4 &&
>> > + printf "\0 \0 \0 " >blob5 &&
>> > + printf "\0 \0a\0 " >blob6 &&
>> > + printf " " >blob7 &&
>> > + >blob8 &&
>> > + git hash-object blob1 -w | xargs git update-ref refs/myblobs/blob1 &&
>> > + git hash-object blob2 -w | xargs git update-ref refs/myblobs/blob2 &&
>> > + git hash-object blob3 -w | xargs git update-ref refs/myblobs/blob3 &&
>> > + git hash-object blob4 -w | xargs git update-ref refs/myblobs/blob4 &&
>> > + git hash-object blob5 -w | xargs git update-ref refs/myblobs/blob5 &&
>> > + git hash-object blob6 -w | xargs git update-ref refs/myblobs/blob6 &&
>> > + git hash-object blob7 -w | xargs git update-ref refs/myblobs/blob7 &&
>> > + git hash-object blob8 -w | xargs git update-ref refs/myblobs/blob8
>>
>> Hrm, xargs just to avoid:
>>
>> git update-ref ... $(git hash-object) ?
>>
>
> I didn’t think about it, just for convenience.
*nod*, Junio had a good suggestion.
>> > +test_expect_success '%(raw) with --python must failed' '
>> > + test_must_fail git for-each-ref --format="%(raw)" --python
>> > +'
>> > +
>> > +test_expect_success '%(raw) with --tcl must failed' '
>> > + test_must_fail git for-each-ref --format="%(raw)" --tcl
>> > +'
>> > +
>> > +test_expect_success '%(raw) with --perl must failed' '
>> > + test_must_fail git for-each-ref --format="%(raw)" --perl
>> > +'
>> > +
>> > +test_expect_success '%(raw) with --shell must failed' '
>> > + test_must_fail git for-each-ref --format="%(raw)" --shell
>> > +'
>> > +
>> > +test_expect_success '%(raw) with --shell and --sort=raw must failed' '
>> > + test_must_fail git for-each-ref --format="%(raw)" --sort=raw --shell
>> > +'
>>
>> s/must failed/must fail/, but see question above about encoding in these
>> languages...
>
>
> [1]: https://lore.kernel.org/git/CAOLTT8QR_GRm4TYk0E_eazQ+unVQODc-3L+b4V5JUN5jtZR8uA@mail.gmail.com/
>
> Thanks for a review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, ZheNing Hu wrote (reply to this):
Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2021年6月17日周四 下午10:45写道:
>
>
> On Thu, Jun 17 2021, ZheNing Hu wrote:
>
> >
> > Yes, strbuf is a suitable choice when using <str,len> pair.
> > But if replace v->s with strbuf, the possible changes will be larger.
>
> I for one would like to see it done that way, those changes are usually
> easy to read. Also it seems a large part of 2/8 is extra new code
> because we didn't do that, e.g. getting length differently if something
> is a strbuf or not, passing char*/size_t pairs to new functions etc.
>
After some refactoring, I found that there are two problems:
1. There are a lot of codes like this in ref-filter to fill v->s:
v->s = show_ref(...)
v->s = copy_email(...)
It is very difficult to modify here: We know that show_ref()
or copy_email() will allocate a block of memory to v->s, but
if v->s is a strbuf, what should we do? In copy_email(), we
can pass the v->s to copy_email() and use strbuf_add()/strbuf_addstr()
instead of xstrdup() and xmemdupz(). But show_ref() will call
external functions like shorten_unambiguous_ref(), we don’t know
whether it will return us NULL or a dynamically allocated memory.
If continue to pass v->s to the inner function, it is not a feasible
method. Or we can use strbuf_attach() + strlen(), I'm not sure
this is a good method.
2. See:
- for (i = 0; i < used_atom_cnt; i++) {
+ for (i = 0; i < used_atom_cnt; i++) {
struct atom_value *v = &ref->value[i];
- if (v->s == NULL && used_atom[i].source == SOURCE_NONE)
+ if (v->s.len == 0 && used_atom[i].source == SOURCE_NONE)
return strbuf_addf_ret(err, -1, _("missing
object %s for %s"),
oid_to_hex(&ref->objectname), ref->refname);
}
In the case of using strbuf, I don’t know how to distinguish between an empty
strbuf and NULL. It can be easily distinguished by using c-style "const char*".
> >
> > Not python safe. See [1].
> > Regarding the perl language, I support Junio's point of view: it can be
> > re-supported in the future.
>
> Ah, I'd missed that. Anyway, if it's easy it seems you discovered that
> Perl deals with it correctly, so we could just have it support this.
>
Well, it's ok, support for perl will be put in a separate commit.
> >>
> >> > +test_expect_success 'basic atom: refs/tags/testtag *raw' '
> >> > + git cat-file commit refs/tags/testtag^{} >expected &&
> >> > + git for-each-ref --format="%(*raw)" refs/tags/testtag >actual &&
> >> > + sanitize_pgp <expected >expected.clean &&
> >> > + sanitize_pgp <actual >actual.clean &&
> >> > + echo "" >>expected.clean &&
> >>
> >> Just "echo" will do, ditto for the rest. Also odd to go back and forth
> >> between populating expected.clean & actual.clean.
> >>
> >
> > Are you saying that sanitize_pgp is not needed?
>
> No that instead of:
>
> echo "" >x
>
> You can do:
>
> echo >x
>
> And also that going back and forth between populating different files is
> confusing, i.e. this:
>
>
> echo a >x
> echo c >y
> echo b >>x
>
> is better as:
>
> echo a >x
> echo b >>x
> echo c >y
>
>
Thanks, I get what you meant now.
> >>
> >> > +test_expect_success 'set up refs pointing to binary blob' '
> >> > + printf "a\0b\0c" >blob1 &&
> >> > + printf "a\0c\0b" >blob2 &&
> >> > + printf "\0a\0b\0c" >blob3 &&
> >> > + printf "abc" >blob4 &&
> >> > + printf "\0 \0 \0 " >blob5 &&
> >> > + printf "\0 \0a\0 " >blob6 &&
> >> > + printf " " >blob7 &&
> >> > + >blob8 &&
> >> > + git hash-object blob1 -w | xargs git update-ref refs/myblobs/blob1 &&
> >> > + git hash-object blob2 -w | xargs git update-ref refs/myblobs/blob2 &&
> >> > + git hash-object blob3 -w | xargs git update-ref refs/myblobs/blob3 &&
> >> > + git hash-object blob4 -w | xargs git update-ref refs/myblobs/blob4 &&
> >> > + git hash-object blob5 -w | xargs git update-ref refs/myblobs/blob5 &&
> >> > + git hash-object blob6 -w | xargs git update-ref refs/myblobs/blob6 &&
> >> > + git hash-object blob7 -w | xargs git update-ref refs/myblobs/blob7 &&
> >> > + git hash-object blob8 -w | xargs git update-ref refs/myblobs/blob8
> >>
> >> Hrm, xargs just to avoid:
> >>
> >> git update-ref ... $(git hash-object) ?
> >>
> >
> > I didn’t think about it, just for convenience.
>
> *nod*, Junio had a good suggestion.
>
ok.
Thanks.
--
ZheNing Hu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):
On Fri, Jun 18 2021, ZheNing Hu wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> 于2021年6月17日周四 下午10:45写道:
>>
>>
>> On Thu, Jun 17 2021, ZheNing Hu wrote:
>>
>> >
>> > Yes, strbuf is a suitable choice when using <str,len> pair.
>> > But if replace v->s with strbuf, the possible changes will be larger.
>>
>> I for one would like to see it done that way, those changes are usually
>> easy to read. Also it seems a large part of 2/8 is extra new code
>> because we didn't do that, e.g. getting length differently if something
>> is a strbuf or not, passing char*/size_t pairs to new functions etc.
>>
>
> After some refactoring, I found that there are two problems:
> 1. There are a lot of codes like this in ref-filter to fill v->s:
>
> v->s = show_ref(...)
> v->s = copy_email(...)
>
> It is very difficult to modify here: We know that show_ref()
> or copy_email() will allocate a block of memory to v->s, but
> if v->s is a strbuf, what should we do? In copy_email(), we
> can pass the v->s to copy_email() and use strbuf_add()/strbuf_addstr()
> instead of xstrdup() and xmemdupz(). But show_ref() will call
> external functions like shorten_unambiguous_ref(), we don’t know
> whether it will return us NULL or a dynamically allocated memory.
> If continue to pass v->s to the inner function, it is not a feasible
> method. Or we can use strbuf_attach() + strlen(), I'm not sure
> this is a good method.
>
> 2. See:
>
> - for (i = 0; i < used_atom_cnt; i++) {
> + for (i = 0; i < used_atom_cnt; i++) {
> struct atom_value *v = &ref->value[i];
> - if (v->s == NULL && used_atom[i].source == SOURCE_NONE)
> + if (v->s.len == 0 && used_atom[i].source == SOURCE_NONE)
> return strbuf_addf_ret(err, -1, _("missing
> object %s for %s"),
>
> oid_to_hex(&ref->objectname), ref->refname);
> }
>
> In the case of using strbuf, I don’t know how to distinguish between an empty
> strbuf and NULL. It can be easily distinguished by using c-style "const char*".
Yes, sometimes it's just too much of a hassle, looking at
shorten_unambiguous_ref() which returns a xstrdup()'d value that could
indeed be strbuf_attach'd. I haven't tried the conversion myself,
perhaps it's too much hassle.
Just a suggestion from reading your patch in isolation.
>> >
>> > Not python safe. See [1].
>> > Regarding the perl language, I support Junio's point of view: it can be
>> > re-supported in the future.
>>
>> Ah, I'd missed that. Anyway, if it's easy it seems you discovered that
>> Perl deals with it correctly, so we could just have it support this.
>>
>
> Well, it's ok, support for perl will be put in a separate commit.
>
>> >>
>> >> > +test_expect_success 'basic atom: refs/tags/testtag *raw' '
>> >> > + git cat-file commit refs/tags/testtag^{} >expected &&
>> >> > + git for-each-ref --format="%(*raw)" refs/tags/testtag >actual &&
>> >> > + sanitize_pgp <expected >expected.clean &&
>> >> > + sanitize_pgp <actual >actual.clean &&
>> >> > + echo "" >>expected.clean &&
>> >>
>> >> Just "echo" will do, ditto for the rest. Also odd to go back and forth
>> >> between populating expected.clean & actual.clean.
>> >>
>> >
>> > Are you saying that sanitize_pgp is not needed?
>>
>> No that instead of:
>>
>> echo "" >x
>>
>> You can do:
>>
>> echo >x
>>
>> And also that going back and forth between populating different files is
>> confusing, i.e. this:
>>
>>
>> echo a >x
>> echo c >y
>> echo b >>x
>>
>> is better as:
>>
>> echo a >x
>> echo b >>x
>> echo c >y
>>
>>
>
> Thanks, I get what you meant now.
>
>> >>
>> >> > +test_expect_success 'set up refs pointing to binary blob' '
>> >> > + printf "a\0b\0c" >blob1 &&
>> >> > + printf "a\0c\0b" >blob2 &&
>> >> > + printf "\0a\0b\0c" >blob3 &&
>> >> > + printf "abc" >blob4 &&
>> >> > + printf "\0 \0 \0 " >blob5 &&
>> >> > + printf "\0 \0a\0 " >blob6 &&
>> >> > + printf " " >blob7 &&
>> >> > + >blob8 &&
>> >> > + git hash-object blob1 -w | xargs git update-ref refs/myblobs/blob1 &&
>> >> > + git hash-object blob2 -w | xargs git update-ref refs/myblobs/blob2 &&
>> >> > + git hash-object blob3 -w | xargs git update-ref refs/myblobs/blob3 &&
>> >> > + git hash-object blob4 -w | xargs git update-ref refs/myblobs/blob4 &&
>> >> > + git hash-object blob5 -w | xargs git update-ref refs/myblobs/blob5 &&
>> >> > + git hash-object blob6 -w | xargs git update-ref refs/myblobs/blob6 &&
>> >> > + git hash-object blob7 -w | xargs git update-ref refs/myblobs/blob7 &&
>> >> > + git hash-object blob8 -w | xargs git update-ref refs/myblobs/blob8
>> >>
>> >> Hrm, xargs just to avoid:
>> >>
>> >> git update-ref ... $(git hash-object) ?
>> >>
>> >
>> > I didn’t think about it, just for convenience.
>>
>> *nod*, Junio had a good suggestion.
>>
>
> ok.
>
> Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Christian Couder wrote (reply to this):
On Fri, Jun 18, 2021 at 12:51 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> On Fri, Jun 18 2021, ZheNing Hu wrote:
> > After some refactoring, I found that there are two problems:
> > 1. There are a lot of codes like this in ref-filter to fill v->s:
> >
> > v->s = show_ref(...)
> > v->s = copy_email(...)
> >
> > It is very difficult to modify here: We know that show_ref()
> > or copy_email() will allocate a block of memory to v->s, but
> > if v->s is a strbuf, what should we do? In copy_email(), we
> > can pass the v->s to copy_email() and use strbuf_add()/strbuf_addstr()
> > instead of xstrdup() and xmemdupz(). But show_ref() will call
> > external functions like shorten_unambiguous_ref(), we don’t know
> > whether it will return us NULL or a dynamically allocated memory.
> > If continue to pass v->s to the inner function, it is not a feasible
> > method. Or we can use strbuf_attach() + strlen(), I'm not sure
> > this is a good method.
If you resend this patch, it might be a good idea to add a short
version of the above explanations into the commit message.
[...]
> > In the case of using strbuf, I don’t know how to distinguish between an empty
> > strbuf and NULL. It can be easily distinguished by using c-style "const char*".
Maybe this could also be part of the explanation.
> Yes, sometimes it's just too much of a hassle, looking at
> shorten_unambiguous_ref() which returns a xstrdup()'d value that could
> indeed be strbuf_attach'd. I haven't tried the conversion myself,
> perhaps it's too much hassle.
>
> Just a suggestion from reading your patch in isolation.
Yeah, thanks for the review anyway!
On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):
|
On the Git mailing list, ZheNing Hu wrote (reply to this):
|
User |
cbf7d51
to
86ac3bc
Compare
/submit |
Submitted as pull.980.v3.git.1624086181.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
@@ -226,6 +226,12 @@ newline. The available atoms are: | |||
after that first run of whitespace (i.e., the "rest" of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Christian Couder wrote (reply to this):
On Sat, Jun 19, 2021 at 9:03 AM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> In order to let cat-file use ref-filter logic, the following
> methods are used:
Maybe: s/the following methods are used/let's do the following/
> 1. Add `cat_file_mode` member in struct `ref_format`, this can
> help us reject atoms in verify_ref_format() which cat-file
> cannot use, e.g. `%(refname)`, `%(push)`, `%(upstream)`...
> 2. Change the type of member `format` in struct `batch_options`
> to `ref_format`, We can add format data in it.
Not sure what "We can add format data in it." means.
> 3. Let `batch_objects()` add atoms to format, and use
> `verify_ref_format()` to check atoms.
> 4. Use `has_object_file()` in `batch_one_object()` to check
> whether the input object exists.
> 5. Let get_object() return 1 and print "<oid> missing" instead
> of returning -1 and printing "missing object <oid> for <refname>",
> this can help `format_ref_array_item()` just report that the
> object is missing without letting Git exit.
> 6. Use `format_ref_array_item()` in `batch_object_write()` to
> get the formatted data corresponding to the object. If the
> return value of `format_ref_array_item()` is equals to zero,
> use `batch_write()` to print object data; else if the return
> value less than zero, use `die()` to print the error message
> and exit; else return value greater than zero, only print the
s/else return value greater/else if the return value is greater/
> error message, but not exit.
s/not exit/don't exit/
> 7. Use free_ref_array_item_value() to free ref_array_item's
> value.
That looks like a lot of changes in a single commit. I wonder if this
commit could be split.
> Most of the atoms in `for-each-ref --format` are now supported,
> such as `%(tree)`, `%(parent)`, `%(author)`, `%(tagger)`, `%(if)`,
> `%(then)`, `%(else)`, `%(end)`. But these atoms will be rejected:
> `%(refname)`, `%(symref)`, `%(upstream)`, `%(push)`, `%(worktreepath)`,
> `%(flag)`, `%(HEAD)`, because our objects don't have refname.
s/refname/a refname/
It might be worth talking a bit about possible performance changes.
[...]
> + ret = format_ref_array_item(&item, &opt->format, scratch, &err);
> + if (!ret) {
> + strbuf_addch(scratch, '\n');
> + batch_write(opt, scratch->buf, scratch->len);
> + } else if (ret < 0) {
> + die("%s\n", err.buf);
This if (ret < 0) could be checked first.
> + } else {
> + /* when ret > 0 , don't call die and print the err to stdout*/
I think it would be more helpful to tell what ret > 0 means, rather
than what we do below (which can easily be seen).
> + printf("%s\n", err.buf);
> + fflush(stdout);
> }
For example:
if (ret < 0) {
die("%s\n", err.buf);
if (ret) {
/* ret > 0 means ... */
printf("%s\n", err.buf);
fflush(stdout);
} else {
strbuf_addch(scratch, '\n');
batch_write(opt, scratch->buf, scratch->len);
}
> + free_ref_array_item_value(&item);
> + strbuf_release(&err);
> }
[...]
> +static int batch_objects(struct batch_options *opt, const struct option *options)
It's unfortunate that one argument is called "opt" and the other one
"options". I wonder if the first one could be called "batch" as it
seems to be called this way somewhere else.
> + if (!opt->format.format)
> + strbuf_addstr(&format, "%(objectname) %(objecttype) %(objectsize)");
> + else
> + strbuf_addstr(&format, opt->format.format);
If there is no reason for the condition to be (!X) over just (X), I
think the latter is a bit better.
> if (opt->print_contents)
> - data.info.typep = &data.type;
> + strbuf_addstr(&format, "\n%(raw)");
> + opt->format.format = format.buf;
I wonder if this should be:
opt->format.format = strbuf_detach(&format, NULL);
> + if (verify_ref_format(&opt->format))
> + usage_with_options(cat_file_usage, options);
[...]
> @@ -86,7 +87,7 @@ struct ref_format {
> int need_color_reset_at_eol;
> };
>
> -#define REF_FORMAT_INIT { NULL, NULL, 0, 0, -1 }
> +#define REF_FORMAT_INIT { NULL, NULL, 0, 0, 0, -1 }
Maybe this can already be changed to a designated initializer, like
Ævar suggested recently.
[...]
> +test_expect_success 'cat-file refs/heads/main refs/tags/testtag %(rest)' '
If this test is about checking that %(rest) works with both a branch
and a tag, it might be better to say it more explicitly.
> + cat >expected <<-EOF &&
> + 123 commit 123
> + 456 tag 456
> + EOF
> + git cat-file --batch-check="%(rest) %(objecttype) %(rest)" >actual <<-EOF &&
> + refs/heads/main 123
> + refs/tags/testtag 456
> + EOF
> + test_cmp expected actual
> +'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, ZheNing Hu wrote (reply to this):
Christian Couder <christian.couder@gmail.com> 于2021年6月21日周一 下午1:55写道:
>
> > 1. Add `cat_file_mode` member in struct `ref_format`, this can
> > help us reject atoms in verify_ref_format() which cat-file
> > cannot use, e.g. `%(refname)`, `%(push)`, `%(upstream)`...
> > 2. Change the type of member `format` in struct `batch_options`
> > to `ref_format`, We can add format data in it.
>
> Not sure what "We can add format data in it." means.
Well, there is something wrong with the expression here. What I want
to express is that we can fill its member "format" with the atoms like
"%(objectname) %(refname)", and then pass it to the ref-filter.
> > 7. Use free_ref_array_item_value() to free ref_array_item's
> > value.
>
> That looks like a lot of changes in a single commit. I wonder if this
> commit could be split.
>
Yeah, But I don’t know if I should take it apart step by step, If taken apart,
those intermediate commits are likely to fail the test.
> > Most of the atoms in `for-each-ref --format` are now supported,
> > such as `%(tree)`, `%(parent)`, `%(author)`, `%(tagger)`, `%(if)`,
> > `%(then)`, `%(else)`, `%(end)`. But these atoms will be rejected:
> > `%(refname)`, `%(symref)`, `%(upstream)`, `%(push)`, `%(worktreepath)`,
> > `%(flag)`, `%(HEAD)`, because our objects don't have refname.
>
> s/refname/a refname/
>
> It might be worth talking a bit about possible performance changes.
>
Makes sense. The performance has indeed deteriorated.
> [...]
>
> > + ret = format_ref_array_item(&item, &opt->format, scratch, &err);
> > + if (!ret) {
> > + strbuf_addch(scratch, '\n');
> > + batch_write(opt, scratch->buf, scratch->len);
> > + } else if (ret < 0) {
> > + die("%s\n", err.buf);
>
> This if (ret < 0) could be checked first.
Yes, it is better to put error checking first.
>
> > + } else {
> > + /* when ret > 0 , don't call die and print the err to stdout*/
>
> I think it would be more helpful to tell what ret > 0 means, rather
> than what we do below (which can easily be seen).
>
Ah, There is indeed only one situation for ret > 0 for the time being:
Show "<oid> missing" without exiting Git.
> > + printf("%s\n", err.buf);
> > + fflush(stdout);
> > }
>
> For example:
>
> if (ret < 0) {
> die("%s\n", err.buf);
> if (ret) {
> /* ret > 0 means ... */
> printf("%s\n", err.buf);
> fflush(stdout);
> } else {
> strbuf_addch(scratch, '\n');
> batch_write(opt, scratch->buf, scratch->len);
> }
>
Yes, this might be better.
> > + free_ref_array_item_value(&item);
> > + strbuf_release(&err);
> > }
>
> [...]
>
> > +static int batch_objects(struct batch_options *opt, const struct option *options)
>
> It's unfortunate that one argument is called "opt" and the other one
> "options". I wonder if the first one could be called "batch" as it
> seems to be called this way somewhere else.
>
OK.
> > + if (!opt->format.format)
> > + strbuf_addstr(&format, "%(objectname) %(objecttype) %(objectsize)");
> > + else
> > + strbuf_addstr(&format, opt->format.format);
>
> If there is no reason for the condition to be (!X) over just (X), I
> think the latter is a bit better.
>
Although I think it doesn’t matter which one to use first,
I will still follow your suggestions.
> > if (opt->print_contents)
> > - data.info.typep = &data.type;
> > + strbuf_addstr(&format, "\n%(raw)");
> > + opt->format.format = format.buf;
>
> I wonder if this should be:
>
> opt->format.format = strbuf_detach(&format, NULL);
>
No. here our opt->format.format will not be changed, it would
be better for us to use `strbuf_release(&format)` for resource
recovery. (strbuf_detach() will forced to let us free opt->format.format)
> > + if (verify_ref_format(&opt->format))
> > + usage_with_options(cat_file_usage, options);
>
> [...]
>
> > @@ -86,7 +87,7 @@ struct ref_format {
> > int need_color_reset_at_eol;
> > };
> >
> > -#define REF_FORMAT_INIT { NULL, NULL, 0, 0, -1 }
> > +#define REF_FORMAT_INIT { NULL, NULL, 0, 0, 0, -1 }
>
> Maybe this can already be changed to a designated initializer, like
> Ævar suggested recently.
>
I agree.
> [...]
>
> > +test_expect_success 'cat-file refs/heads/main refs/tags/testtag %(rest)' '
>
> If this test is about checking that %(rest) works with both a branch
> and a tag, it might be better to say it more explicitly.
>
OK.
Thanks.
--
ZheNing Hu
0bb7727
to
d1114a2
Compare
/submit |
Submitted as pull.980.v4.git.1624332054.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
@@ -144,6 +144,7 @@ enum atom_type { | |||
ATOM_BODY, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Bagas Sanjaya wrote (reply to this):
On 22/06/21 10.20, ZheNing Hu via GitGitGadget wrote:
> From: ZheNing Hu <adlternative@gmail.com>
>
> Since in the refactor of `git cat-file --batch` later,
> oid_object_info_extended() in get_object() will be used to obtain
> the info of an object with it's oid. When the object cannot be
> obtained in the git repository, `cat-file --batch` expects to output
> "<oid> missing" and continue the next oid query instead of letting
> Git exit. In other error conditions, Git should exit normally. So we
> can achieve this function by passing the return value of get_object().
>
s/Since/Because/
--
An old man doll... just what I always wanted! - Clara
@@ -428,6 +428,13 @@ static void batch_one_object(const char *obj_name, | |||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Bagas Sanjaya wrote (reply to this):
On 22/06/21 10.20, ZheNing Hu via GitGitGadget wrote:
> From: ZheNing Hu <adlternative@gmail.com>
>
> Since cat-file reuses ref-filter logic later will add the
> formal parameter "const struct option *options" to
> batch_objects(), the two synonymous parameters of "opt"
> and "options" may confuse readers, so change batch_options
> parameter of batch_objects() from "opt" to "batch".
>
Better say "Because later cat-file reuses ref-filter logic that will add
parameter ... ".
--
An old man doll... just what I always wanted! - Clara
In the original ref-filter design, it will copy the parsed atom's name and attributes to `used_atom[i].name` in the atom's parsing step, and use it again for string matching in the later specific ref attributes filling step. It use a lot of string matching to determine which atom we need. Introduce the enum "atom_type", each enum value is named as `ATOM_*`, which is the index of each corresponding valid_atom entry. In the first step of the atom parsing, `used_atom.atom_type` will record corresponding enum value from valid_atom entry index, and then in specific reference attribute filling step, only need to compare the value of the `used_atom[i].atom_type` to check the atom type. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Christian Couder <christian.couder@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --batch code to print an object assumes we found out the type of the object from calling oid_object_info_extended(). This is true for the default format, but even in a custom format, we manually modify the object_info struct to ask for the type. This assumption was broken by 845de33 (cat-file: avoid noop calls to sha1_object_info_extended, 2016-05-18). That commit skips the call to oid_object_info_extended() entirely when --batch-all-objects is in use, and the custom format does not include any placeholders that require calling it. Or when the custom format only include placeholders like %(objectname) or %(rest), oid_object_info_extended() will not get the type of the object. This results in an error when we try to confirm that the type didn't change: $ git cat-file --batch=batman --batch-all-objects batman fatal: object 0000239 changed type!? and also has other subtle effects (e.g., we'd fail to stream a blob, since we don't realize it's a blob in the first place). We can fix this by flipping the order of the setup. The check for "do we need to get the object info" must come _after_ we've decided whether we need to look up the type. Helped-by: Jeff King <peff@peff.net> Signed-off-by: ZheNing Hu <adlternative@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are two "if (opt->all_objects)" blocks next to each other, merge them into one to provide better readability. Helped-by: Jeff King <peff@peff.net> Signed-off-by: ZheNing Hu <adlternative@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Only tag and commit objects use `grab_sub_body_contents()` to grab object contents in the current codebase. We want to teach the function to also handle blobs and trees to get their raw data, without parsing a blob (whose contents looks like a commit or a tag) incorrectly as a commit or a tag. Skip the block of code that is specific to handling commits and tags early when the given object is of a wrong type to help later addition to handle other types of objects in this function. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Add new formatting option `%(raw)`, which will print the raw object data without any changes. It will help further to migrate all cat-file formatting logic from cat-file to ref-filter. The raw data of blob, tree objects may contain '\0', but most of the logic in `ref-filter` depends on the output of the atom being text (specifically, no embedded NULs in it). E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()` add the data to the buffer. The raw data of a tree object is `100644 one\0...`, only the `100644 one` will be added to the buffer, which is incorrect. Therefore, we need to find a way to record the length of the atom_value's member `s`. Although strbuf can already record the string and its length, if we want to replace the type of atom_value's member `s` with strbuf, many places in ref-filter that are filled with dynamically allocated mermory in `v->s` are not easy to replace. At the same time, we need to check if `v->s == NULL` in populate_value(), and strbuf cannot easily distinguish NULL and empty strings, but c-style "const char *" can do it. So add a new member in `struct atom_value`: `s_size`, which can record raw object size, it can help us add raw object data to the buffer or compare two buffers which contain raw object data. Note that `--format=%(raw)` cannot be used with `--python`, `--shell`, `--tcl`, and `--perl` because if the binary raw data is passed to a variable in such languages, these may not support arbitrary binary data in their string variable type. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Helped-by: Bagas Sanjaya <bagasdotme@gmail.com> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Felipe Contreras <felipe.contreras@gmail.com> Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk> Helped-by: Junio C Hamano <gitster@pobox.com> Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Because the perl language can handle binary data correctly, add the function perl_quote_buf_with_len(), which can specify the length of the data and prevent the data from being truncated at '\0' to help `--format="%(raw)"` re-support `--perl`. Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Use non-const ref_format in *_atom_parser(), which can help us modify the members of ref_format in *_atom_parser(). Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
In order to let "cat-file --batch=%(rest)" use the ref-filter interface, add %(rest) atom for ref-filter. "git for-each-ref", "git branch", "git tag" and "git verify-tag" will reject %(rest) by default. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Because in the refactor of `git cat-file --batch` later, oid_object_info_extended() in get_object() will be used to obtain the info of an object with it's oid. When the object cannot be obtained in the git repository, `cat-file --batch` expects to output "<oid> missing" and continue the next oid query instead of letting Git exit. In other error conditions, Git should exit normally. So we can achieve this function by passing the return value of get_object(). Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
When we use ref_array_item which is not dynamically allocated and want to free the space of its member "value" after the end of use, free_array_item() does not meet our needs, because it tries to free ref_array_item itself and its member "symref". Introduce free_ref_array_item_value() for freeing ref_array_item value. It will be called internally by free_array_item(), and it will help `cat-file --batch` free ref_array_item's value memory later. Helped-by: Junio C Hamano <gitster@pobox.com> Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Add `cat_file_mode` member in struct `ref_format`, when `cat-file --batch` use ref-filter logic later, it can help us reject atoms in verify_ref_format() which cat-file cannot use, e.g. `%(refname)`, `%(push)`, `%(upstream)`... Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Let get_object() return 1 and print "<oid> missing" instead of returning -1 and printing "missing object <oid> for <refname>" if oid_object_info_extended() unable to find the data corresponding to oid. When `cat-file --batch` use ref-filter logic later it can help `format_ref_array_item()` just report that the object is missing without letting Git exit. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Use `has_object_file()` in `batch_one_object()` to check whether the input object exists. This can help us reject the missing oid when we let `cat-file --batch` use ref-filter logic later. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Because later cat-file reuses ref-filter logic that will add parameter "const struct option *options" to batch_objects(), the two synonymous parameters of "opt" and "options" may confuse readers, so change batch_options parameter of batch_objects() from "opt" to "batch". Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
In order to let cat-file use ref-filter logic, let's do the following: 1. Change the type of member `format` in struct `batch_options` to `ref_format`, we will pass it to ref-filter later. 2. Let `batch_objects()` add atoms to format, and use `verify_ref_format()` to check atoms. 3. Use `format_ref_array_item()` in `batch_object_write()` to get the formatted data corresponding to the object. If the return value of `format_ref_array_item()` is equals to zero, use `batch_write()` to print object data; else if the return value is less than zero, use `die()` to print the error message and exit; else if return value is greater than zero, only print the error message, but don't exit. 4. Use free_ref_array_item_value() to free ref_array_item's value. Most of the atoms in `for-each-ref --format` are now supported, such as `%(tree)`, `%(parent)`, `%(author)`, `%(tagger)`, `%(if)`, `%(then)`, `%(else)`, `%(end)`. But these atoms will be rejected: `%(refname)`, `%(symref)`, `%(upstream)`, `%(push)`, `%(worktreepath)`, `%(flag)`, `%(HEAD)`, because these atoms are unique to those objects that pointed to by a ref, "for-each-ref"'s family can naturally use these atoms, but not all objects are pointed to be a ref, so "cat-file" will not be able to use them. The performance for `git cat-file --batch-all-objects --batch-check` on the Git repository itself with performance testing tool `hyperfine` changes from 669.4 ms ± 31.1 ms to 1.134 s ± 0.063 s. The performance for `git cat-file --batch-all-objects --batch >/dev/null` on the Git repository itself with performance testing tool `time` change from "27.37s user 0.29s system 98% cpu 28.089 total" to "33.69s user 1.54s system 87% cpu 40.258 total". Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Reuse the `err` buffer in batch_object_write(), as the buffer `scratch` does. This will reduce the overhead of multiple allocations of memory of the err buffer. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
After cat-file reuses the ref-filter logic, we re-implement the functions of --textconv and --filters options. Add members `use_textconv` and `use_filters` in struct `ref_format`, and use global variables `use_filters` and `use_textconv` in `ref-filter.c`, so that we can filter the content of the object in get_object(). Use `actual_oi` to record the real expand_data: it may point to the original `oi` or the `act_oi` processed by `textconv_object()` or `convert_to_working_tree()`. `grab_values()` will grab the contents of `actual_oi` and `grab_common_values()` to grab the contents of origin `oi`, this ensures that `%(objectsize)` still uses the size of the unfiltered data. In `get_object()`, we made an optimization: Firstly, get the size and type of the object instead of directly getting the object data. If using --textconv, after successfully obtaining the filtered object data, an extra oid_object_info_extended() will be skipped, which can reduce the cost of object data copy; If using --filter, the data of the object first will be getted first, and then convert_to_working_tree() will be used to get the filtered object data. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Because "atom_type == ATOM_OBJECTNAME" implies the condition of `starts_with(name, "objectname")`, "atom_type == ATOM_TREE" implies the condition of `starts_with(name, "tree")`, so the check for `starts_with(name, field)` in grab_oid() is redundant. So Remove the grab_oid() from ref-filter, to reduce repeated check. Mentored-by: Christian Couder <christian.couder@gmail.com> Mentored-by: Hariom Verma <hariom18599@gmail.com> Signed-off-by: ZheNing Hu <adlternative@gmail.com>
891d62f
to
c3b60b9
Compare
On the Git mailing list, Junio C Hamano wrote (reply to this):
|
This branch is now known as |
This patch series was integrated into seen via git@3c525b5. |
This patch series was integrated into seen via git@ce080dd. |
This patch series was integrated into seen via git@c8db213. |
On the Git mailing list, ZheNing Hu wrote (reply to this):
|
This patch series make cat-file reuse ref-filter logic.
Change from last version:
git for-each-ref.txt
.cc: Junio C Hamano gitster@pobox.com
cc: Christian Couder christian.couder@gmail.com
cc: Hariom Verma hariom18599@gmail.com
cc: Bagas Sanjaya bagasdotme@gmail.com
cc: Jeff King peff@peff.net
cc: Ævar Arnfjörð Bjarmason avarab@gmail.com