Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GSOC] ref-filter: add %(raw) atom #966

Conversation

adlternative
Copy link

@adlternative adlternative commented Jun 1, 2021

In order to make git cat-file --batch use ref-filter logic,
%(raw) atom is adding to ref-filter.

Change from last version:

  1. Change --<lang> and --format=%(raw)
    checkpoint to verify_ref_format(), which make
    it more scalable.
  2. Change grab_sub_body_contents() use
    struct expand_data *data instread of
    using obj,buf,buf_size to pass object info
    which can reduce the delivery of function
    parameters.

cc: Junio C Hamano gitster@pobox.com
cc: Christian Couder christian.couder@gmail.com
cc: Hariom Verma hariom18599@gmail.com
cc: Karthik Nayak karthik.188@gmail.com
cc: Felipe Contreras felipe.contreras@gmail.com
cc: Bagas Sanjaya bagasdotme@gmail.com
cc: Jeff King peff@peff.net
cc: Phillip Wood phillip.wood123@gmail.com

@adlternative
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jun 1, 2021

Submitted as pull.966.git.1622558243.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-966/adlternative/ref-filter-raw-atom-v4-v1

To fetch this version to local tag pr-966/adlternative/ref-filter-raw-atom-v4-v1:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-966/adlternative/ref-filter-raw-atom-v4-v1

@@ -1356,7 +1356,8 @@ static void append_lines(struct strbuf *out, const char *buf, unsigned long size
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  /* See grab_values */
> -static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
> +static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf,
> +				   struct object *obj)

Neither this step or the next change needs anything but type member
of the 'obj' (and 'buf' is coming from oi.content of the result of
asking about that same 'obj').

I wonder if we should do one of the following:

 (1) stop passing "void *buf" and instead "struct expand_data
     *data", and use "data->content" to access "buf", which would
     allow you to access "data->type" to perform the added check.

 (2) instead of adding "struct obj *obj" to the parameters, just add
     "enum object_type type", as that is the only thing you need.

Obviously (2) is with lessor impact, but if it can be done safely
without breaking the code [*], (1) would probably be a much more
preferrable direction to go in the longer term.

    Side note [*].  A caller is allowed to choose to feed "buf" that
    is different from "oi.content" (perhaps buf may sometimes want
    to be a utf-8 recoded version of oi.content for certain types of
    objects) with the current system, but if we pass expand_data
    throughout the callchain, such a caller is broken, for example.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, ZheNing Hu wrote (reply to this):

Junio C Hamano <gitster@pobox.com> 于2021年6月3日周四 上午10:10写道:
>
> "ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> >  /* See grab_values */
> > -static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
> > +static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf,
> > +                                struct object *obj)
>
> Neither this step or the next change needs anything but type member
> of the 'obj' (and 'buf' is coming from oi.content of the result of
> asking about that same 'obj').
>
> I wonder if we should do one of the following:
>
>  (1) stop passing "void *buf" and instead "struct expand_data
>      *data", and use "data->content" to access "buf", which would
>      allow you to access "data->type" to perform the added check.
>
>  (2) instead of adding "struct obj *obj" to the parameters, just add
>      "enum object_type type", as that is the only thing you need.
>
> Obviously (2) is with lessor impact, but if it can be done safely
> without breaking the code [*], (1) would probably be a much more
> preferrable direction to go in the longer term.
>

I agree with (1). In future versions of grab_sub_body_contents(), we will
use the content of "data" more frequently instead of using the
crude "obj". The type provided by "obj" can also be provided by
"data". So yes, I would be very willing to let grab_sub_body_contents()
only use "data". (delete "obj")

E.g.

static void grab_sub_body_contents(struct atom_value *val, int deref,
struct expand_data *data)

Using (2), we will need more parameters to pass other object info.

>     Side note [*].  A caller is allowed to choose to feed "buf" that
>     is different from "oi.content" (perhaps buf may sometimes want
>     to be a utf-8 recoded version of oi.content for certain types of
>     objects) with the current system, but if we pass expand_data
>     throughout the callchain, such a caller is broken, for example.
>

Just see the situation in front of us: grab_sub_body_contents()
have only one caller: grab_values(). If someone need a function like
grab_sub_body_contents() to grab another buf, they can use rewrite
a more universal function interface:

static void grab_sub_body_contents(struct atom_value *val, int deref,
struct expand_data *data)
{
   grab_sub_body_contents_internal(val, deref, data->content,
data->size, data->type);
}

static void grab_sub_body_contents_internal(struct atom_value *val,
int deref, void *buf,
                                           unsigned long buf_size,
enum object_type type)
{
...
}

But for the time being, the above one is sufficient.

Thanks.
--
ZheNing Hu

@@ -235,6 +235,15 @@ and `date` to extract the named component. For email fields (`authoremail`,
without angle brackets, and `:localpart` to get the part before the `@` symbol
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"ZheNing Hu via GitGitGadget" <gitgitgadget@gmail.com> writes:

> @@ -644,13 +664,6 @@ static int parse_ref_filter_atom(const struct ref_format *format,
>  		return strbuf_addf_ret(err, -1, _("malformed field name: %.*s"),
>  				       (int)(ep-atom), atom);
>  
> -	/* Do we have the atom already used elsewhere? */
> -	for (i = 0; i < used_atom_cnt; i++) {
> -		int len = strlen(used_atom[i].name);
> -		if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
> -			return i;
> -	}
> -
>  	/*
>  	 * If the atom name has a colon, strip it and everything after
>  	 * it off - it specifies the format for this entry, and
> @@ -660,6 +673,17 @@ static int parse_ref_filter_atom(const struct ref_format *format,
>  	arg = memchr(sp, ':', ep - sp);
>  	atom_len = (arg ? arg : ep) - sp;
>  
> +	if (format->quote_style && !strncmp(sp, "raw", 3) && !arg)
> +		return strbuf_addf_ret(err, -1, _("--format=%.*s cannot be used with"
> +				"--python, --shell, --tcl, --perl"), (int)(ep-atom), atom);
> +
> +	/* Do we have the atom already used elsewhere? */
> +	for (i = 0; i < used_atom_cnt; i++) {
> +		int len = strlen(used_atom[i].name);
> +		if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
> +			return i;
> +	}
> +

These two hunks

 - hoists up the code that sets 'arg' to optional string after
   "<atom>:" and counts how long the "<atom>" is in 'atom_len'; the
   change causes the counting done even when the same placeholder is
   already used elsewhere (in which case we do not have to do such
   counting);

 - inserts the early return to reject use of "raw" atom when
   language specific quoting is used.

It probably makes it easier to understand if the former is split
into a separate commit, but at the same time a series with too many
small steps is harder to manage, so let's keep them in a single
change.

But I do not think we want to add the new change at this location,
at least for two reasons:

 * The posted patch checks '!arg' to avoid rejecting "raw:size",
   which would not scale at all.  What if you wanted to later add
   "raw:upcase", which you must reject?

 * We do have enumerated constants for each atom types, but this
   early check and return does string comparison.

Where it belongs is either after "Is the atom a valid one?" loop
where 'atom_len' is used to locate the placeholder's atom in the
table of valid atoms [*], or inside raw_atom_parser().

    Side note: If you read the original code, you would notice that
    there already is a similar "this is a valid atom that appear in
    the valid_atom[] table, but unallowed in this situation" check
    done with .source != SOURCE_NONE conditional.  One downside is
    that until calling raw_atom_parser(), you do not know if
    RAW_BARE or RAW_LENGTH is requested.

If we do inside raw_atom_parser(), it would probably look like this:
 
+static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
+				const char *arg, struct strbuf *err)
+{
+	if (!arg)
+		atom->u.raw_data.option = RAW_BARE;
+	else if (!strcmp(arg, "size"))
+		atom->u.raw_data.option = RAW_LENGTH;
+	else
+		return strbuf_addf_ret(err, -1, _("unrecognized %%(raw) argument: %s"), arg);
+
+	if (atom->u.raw_data.option == RAW_BARE && format->quote_style)
+		return strbuf_addf_ret(err, -1,
+				       _("--format=%%(raw) cannot be used with ...")...);
+
+	return 0;
+}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, ZheNing Hu wrote (reply to this):

Junio C Hamano <gitster@pobox.com> 于2021年6月3日周四 上午10:38写道:
>
>
> These two hunks
>
>  - hoists up the code that sets 'arg' to optional string after
>    "<atom>:" and counts how long the "<atom>" is in 'atom_len'; the
>    change causes the counting done even when the same placeholder is
>    already used elsewhere (in which case we do not have to do such
>    counting);
>

I admit that I am doing repetitive work here.

>  - inserts the early return to reject use of "raw" atom when
>    language specific quoting is used.
>
> It probably makes it easier to understand if the former is split
> into a separate commit, but at the same time a series with too many
> small steps is harder to manage, so let's keep them in a single
> change.
>
> But I do not think we want to add the new change at this location,
> at least for two reasons:
>
>  * The posted patch checks '!arg' to avoid rejecting "raw:size",
>    which would not scale at all.  What if you wanted to later add
>    "raw:upcase", which you must reject?
>

Yeah, the code here makes "raw" lack of scalability. Especially we
want to add "%(raw:textconv)" and "%(raw:filter)" later.

>  * We do have enumerated constants for each atom types, but this
>    early check and return does string comparison.
>

Note that at this time we must compare strings... parse_ref_filter_atom()
passes string form of the atom. Code block A also requires comparing strings.

-------------------
Code block A:

        for (i = 0; i < used_atom_cnt; i++) {
               int len = strlen(used_atom[i].name);
               if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
                       return i;
       }
-------------------

All the following replies are based on such a fact:
We will reuse used atoms as much as possible.

Think about this situation:

$ git for-each-ref --format="%(raw)" --sort="raw" --python

Since we specified --sort="raw",`parse_sorting_atom()`
will be called in parse_opt_ref_sorting(), but at this time
we haven't parsed --<lang> yet. So format->quote_style == 0,
we cannot refuse  --<lang> at this time, and a "%(raw)" atom
item will be added to used_atom, when we use `verify_ref_format()`
to call `parse_sorting_atom()` for the second time, we already have
raw atom item in used_atom, in Code Block A we directly returned,
We can't refuse --<lang> too after Code Block A. So as a last solution,
we refuse --<lang> with "%(raw)" before Code Block A.

> Where it belongs is either after "Is the atom a valid one?" loop
> where 'atom_len' is used to locate the placeholder's atom in the
> table of valid atoms [*], or inside raw_atom_parser().
>
>     Side note: If you read the original code, you would notice that
>     there already is a similar "this is a valid atom that appear in
>     the valid_atom[] table, but unallowed in this situation" check
>     done with .source != SOURCE_NONE conditional.  One downside is
>     that until calling raw_atom_parser(), you do not know if
>     RAW_BARE or RAW_LENGTH is requested.
>

Yes, but we need to pay attention to it is below Code Block A.

> If we do inside raw_atom_parser(), it would probably look like this:
>
> +static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
> +                               const char *arg, struct strbuf *err)
> +{
> +       if (!arg)
> +               atom->u.raw_data.option = RAW_BARE;
> +       else if (!strcmp(arg, "size"))
> +               atom->u.raw_data.option = RAW_LENGTH;
> +       else
> +               return strbuf_addf_ret(err, -1, _("unrecognized %%(raw) argument: %s"), arg);
> +
> +       if (atom->u.raw_data.option == RAW_BARE && format->quote_style)
> +               return strbuf_addf_ret(err, -1,
> +                                      _("--format=%%(raw) cannot be used with ...")...);
> +
> +       return 0;
> +}

It's same, "*.parser()" is below Code Block A.

After all, the reason why this must be done here is the ref-filter
original logic
has not considered rejecting a format atom and an option.

Thanks.
--
ZheNing Hu

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, ZheNing Hu wrote (reply to this):

ZheNing Hu <adlternative@gmail.com> 于2021年6月3日周四 下午1:36写道:
>
> Junio C Hamano <gitster@pobox.com> 于2021年6月3日周四 上午10:38写道:
> > But I do not think we want to add the new change at this location,
> > at least for two reasons:
> >
> >  * The posted patch checks '!arg' to avoid rejecting "raw:size",
> >    which would not scale at all.  What if you wanted to later add
> >    "raw:upcase", which you must reject?
> >
>
> Yeah, the code here makes "raw" lack of scalability. Especially we
> want to add "%(raw:textconv)" and "%(raw:filter)" later.
>

Now I am building %(raw:textconv) and %(raw:filter), the code will be
very difficult to write:

        if (format->quote_style && !strncmp(sp, "raw", 3)
                                && ((!arg) || (!strncmp(arg,
":textconv", 9)) || (!strncmp(arg, ":filter", 7))))
                return strbuf_addf_ret(err, -1, _("--format=%.*s
cannot be used with"
                                "--python, --shell, --tcl, --perl"),
(int)(ep-atom), atom);

Is there any good way?

Thanks.
--
ZheNing Hu

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

ZheNing Hu <adlternative@gmail.com> writes:

>>  * We do have enumerated constants for each atom types, but this
>>    early check and return does string comparison.
>>
>
> Note that at this time we must compare strings... parse_ref_filter_atom()
> passes string form of the atom. Code block A also requires comparing strings.
>
> -------------------
> Code block A:
>
>         for (i = 0; i < used_atom_cnt; i++) {
>                int len = strlen(used_atom[i].name);
>                if (len == ep - atom && !memcmp(used_atom[i].name, atom, len))
>                        return i;
>        }
> -------------------

The point is that you can piggyback existing string comparison
(which is called "parsing") and use the parsed result (i.e. if you
can compare with ATOM_RAW instead of adding another strcmp(), that
can be a better solution).

> All the following replies are based on such a fact:
> We will reuse used atoms as much as possible.
>
> Think about this situation:
>
> $ git for-each-ref --format="%(raw)" --sort="raw" --python
>
> Since we specified --sort="raw",`parse_sorting_atom()`
> will be called in parse_opt_ref_sorting(), but at this time
> we haven't parsed --<lang> yet.

That only says using parse_sorting_atom() and relying on the check
in the function is still too early, and does not necessarily support
the posted patch that redundantly runs strcmp().

After parsing all the command line options, we have used_atom[]
fully populated and we know what host language we are quoting the
result for---and that makes a good place to check for comflicting
requests.

>> +static int raw_atom_parser(const struct ref_format *format, struct used_atom *atom,
>> +                               const char *arg, struct strbuf *err)
>> +{
>> +       if (!arg)
>> +               atom->u.raw_data.option = RAW_BARE;
>> +       else if (!strcmp(arg, "size"))
>> +               atom->u.raw_data.option = RAW_LENGTH;
>> +       else
>> +               return strbuf_addf_ret(err, -1, _("unrecognized %%(raw) argument: %s"), arg);
>> +
>> +       if (atom->u.raw_data.option == RAW_BARE && format->quote_style)
>> +               return strbuf_addf_ret(err, -1,
>> +                                      _("--format=%%(raw) cannot be used with ...")...);
>> +
>> +       return 0;
>> +}
>
> It's same, "*.parser()" is below Code Block A.
>
> After all, the reason why this must be done here is the ref-filter
> original logic
> has not considered rejecting a format atom and an option.

That is something you can fix to make the code easier to follow, no?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

ZheNing Hu <adlternative@gmail.com> writes:

> Now I am building %(raw:textconv) and %(raw:filter), the code will be
> very difficult to write:
>
>         if (format->quote_style && !strncmp(sp, "raw", 3)
>                                 && ((!arg) || (!strncmp(arg,
> ":textconv", 9)) || (!strncmp(arg, ":filter", 7))))
>                 return strbuf_addf_ret(err, -1, _("--format=%.*s
> cannot be used with"
>                                 "--python, --shell, --tcl, --perl"),
> (int)(ep-atom), atom);
>
> Is there any good way?

The problem you are having sounds like a natural consequence of
doing the check at the wrong place in the code, at least to me.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, ZheNing Hu wrote (reply to this):

Junio C Hamano <gitster@pobox.com> 于2021年6月4日周五 上午5:35写道:
>
> The point is that you can piggyback existing string comparison
> (which is called "parsing") and use the parsed result (i.e. if you
> can compare with ATOM_RAW instead of adding another strcmp(), that
> can be a better solution).
>
> That only says using parse_sorting_atom() and relying on the check
> in the function is still too early, and does not necessarily support
> the posted patch that redundantly runs strcmp().
>
> After parsing all the command line options, we have used_atom[]
> fully populated and we know what host language we are quoting the
> result for---and that makes a good place to check for comflicting
> requests.
>

Alright, I got it: we can perform related check in verify_ref_format(),
after parse_ref_filter_atom(), It is a good checkpoint.

> > After all, the reason why this must be done here is the ref-filter
> > original logic
> > has not considered rejecting a format atom and an option.
>
> That is something you can fix to make the code easier to follow, no?

You are right. ;-)

Thanks.
--
ZheNing Hu

@gitgitgadget
Copy link

gitgitgadget bot commented Jun 3, 2021

On the Git mailing list, Bagas Sanjaya wrote (reply to this):

Hi,

On 01/06/21 21.37, ZheNing Hu via GitGitGadget wrote:
> In order to make git cat-file --batch use ref-filter logic, I add %(raw)
> atom to ref-filter.
> 
> Change from last version:
> 
>   1. Change is_empty() logic.
>   2. Simplify memcasecmp().
>   3. rebase on zh/ref-filter-atom-type.
> 

I prefer no first-person pronouns (I and we) in patch cover letter and 
commit message, so better say:

"Add %(raw) atom to ref-filter to make git cat-file --batch use 
ref-filter logic."

-- 
An old man doll... just what I always wanted! - Clara

@gitgitgadget
Copy link

gitgitgadget bot commented Jun 3, 2021

On the Git mailing list, ZheNing Hu wrote (reply to this):

Bagas Sanjaya <bagasdotme@gmail.com> 于2021年6月3日周四 下午1:11写道:
>
> Hi,
>
> On 01/06/21 21.37, ZheNing Hu via GitGitGadget wrote:
> > In order to make git cat-file --batch use ref-filter logic, I add %(raw)
> > atom to ref-filter.
> >
> > Change from last version:
> >
> >   1. Change is_empty() logic.
> >   2. Simplify memcasecmp().
> >   3. rebase on zh/ref-filter-atom-type.
> >
>
> I prefer no first-person pronouns (I and we) in patch cover letter and
> commit message, so better say:
>
> "Add %(raw) atom to ref-filter to make git cat-file --batch use
> ref-filter logic."
>

Ok. I will change my way of narrating.

> --
> An old man doll... just what I always wanted! - Clara

Only tag and commit objects use `grab_sub_body_contents()` to grab
object contents in the current codebase.  We want to teach the
function to also handle blobs and trees to get their raw data,
without parsing a blob (whose contents looks like a commit or a tag)
incorrectly as a commit or a tag.

Skip the block of code that is specific to handling commits and tags
early when the given object is of a wrong type to help later
addition to handle other types of objects in this function.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
Add new formatting option `%(raw)`, which will print the raw
object data without any changes. It will help further to migrate
all cat-file formatting logic from cat-file to ref-filter.

The raw data of blob, tree objects may contain '\0', but most of
the logic in `ref-filter` depands on the output of the atom being
text (specifically, no embedded NULs in it).

E.g. `quote_formatting()` use `strbuf_addstr()` or `*._quote_buf()`
add the data to the buffer. The raw data of a tree object is
`100644 one\0...`, only the `100644 one` will be added to the buffer,
which is incorrect.

Therefore, add a new member in `struct atom_value`: `s_size`, which
can record raw object size, it can help us add raw object data to
the buffer or compare two buffers which contain raw object data.

Beyond, `--format=%(raw)` cannot be used with `--python`, `--shell`,
`--tcl`, `--perl` because if our binary raw data is passed to a variable
in the host language, the host language may not support arbitrary binary
data in the variables of its string type.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Helped-by: Felipe Contreras <felipe.contreras@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Based-on-patch-by: Olga Telezhnaya <olyatelezhnaya@gmail.com>
Signed-off-by: ZheNing Hu <adlternative@gmail.com>
@adlternative
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Jun 4, 2021

Submitted as pull.966.v2.git.1622808751.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-966/adlternative/ref-filter-raw-atom-v4-v2

To fetch this version to local tag pr-966/adlternative/ref-filter-raw-atom-v4-v2:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-966/adlternative/ref-filter-raw-atom-v4-v2

@gitgitgadget
Copy link

gitgitgadget bot commented Jun 4, 2021

On the Git mailing list, Christian Couder wrote (reply to this):

No need to resend as it's a cover letter, but just in case there is
another round and you copy things from this cover letter:

On Fri, Jun 4, 2021 at 2:12 PM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> In order to make git cat-file --batch use ref-filter logic, %(raw) atom is
> adding to ref-filter.

s/adding/added/

> Change from last version:
>
>  1. Change --<lang> and --format=%(raw) checkpoint to verify_ref_format(),
>     which make it more scalable.

s/make/makes/

>  2. Change grab_sub_body_contents() use struct expand_data *data instread of

s/use/to use/
s/instread/instead/

>     using obj,buf,buf_size to pass object info which can reduce the delivery
>     of function parameters.

@@ -235,6 +235,15 @@ and `date` to extract the named component. For email fields (`authoremail`,
without angle brackets, and `:localpart` to get the part before the `@` symbol
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Christian Couder wrote (reply to this):

On Fri, Jun 4, 2021 at 2:12 PM ZheNing Hu via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: ZheNing Hu <adlternative@gmail.com>
>
> Add new formatting option `%(raw)`, which will print the raw
> object data without any changes. It will help further to migrate
> all cat-file formatting logic from cat-file to ref-filter.
>
> The raw data of blob, tree objects may contain '\0', but most of
> the logic in `ref-filter` depands on the output of the atom being

s/depands/depends/

> text (specifically, no embedded NULs in it).

> diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
> index 2ae2478de706..8f8d8cd1e04f 100644
> --- a/Documentation/git-for-each-ref.txt
> +++ b/Documentation/git-for-each-ref.txt
> @@ -235,6 +235,15 @@ and `date` to extract the named component.  For email fields (`authoremail`,
>  without angle brackets, and `:localpart` to get the part before the `@` symbol
>  out of the trimmed email.
>
> +The raw data in a object is `raw`.

s/a object/an object/

> +
> +raw:size::
> +       The raw data size of the object.
> +
> +Note that `--format=%(raw)` can not be used with `--python`, `--shell`, `--tcl`,
> +`--perl` because the host language may not support arbitrary binary data in the
> +variables of its string type.





> @@ -1765,7 +1815,7 @@ static int populate_value(struct ref_array_item *ref, struct strbuf *err)
>                 int deref = 0;
>                 const char *refname;
>                 struct branch *branch = NULL;
> -
> +               v->s_size = ATOM_VALUE_S_SIZE_INIT;

It looks like a blank line was removed as you added the above new line.

>                 v->handler = append_atom;
>                 v->atom = atom;
>

@gitgitgadget
Copy link

gitgitgadget bot commented Jun 5, 2021

On the Git mailing list, ZheNing Hu wrote (reply to this):

Hi, Christian,

Christian Couder <christian.couder@gmail.com> 于2021年6月4日周五 下午8:53写道:
>
> No need to resend as it's a cover letter, but just in case there is
> another round and you copy things from this cover letter:
>

Sorry, what is the bad place in this cover letter I write? This
cover letter is also different from the last time ...

> On Fri, Jun 4, 2021 at 2:12 PM ZheNing Hu via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > In order to make git cat-file --batch use ref-filter logic, %(raw) atom is
> > adding to ref-filter.
>
> s/adding/added/
>
> > Change from last version:
> >
> >  1. Change --<lang> and --format=%(raw) checkpoint to verify_ref_format(),
> >     which make it more scalable.
>
> s/make/makes/
>
> >  2. Change grab_sub_body_contents() use struct expand_data *data instread of
>
> s/use/to use/
> s/instread/instead/
>
> >     using obj,buf,buf_size to pass object info which can reduce the delivery
> >     of function parameters.

Thanks for these grammatical corrections.
--
ZheNing Hu

@gitgitgadget
Copy link

gitgitgadget bot commented Jun 5, 2021

On the Git mailing list, Christian Couder wrote (reply to this):

On Sat, Jun 5, 2021 at 6:34 AM ZheNing Hu <adlternative@gmail.com> wrote:
>
> Hi, Christian,
>
> Christian Couder <christian.couder@gmail.com> 于2021年6月4日周五 下午8:53写道:
> >
> > No need to resend as it's a cover letter, but just in case there is
> > another round and you copy things from this cover letter:
> >
>
> Sorry, what is the bad place in this cover letter I write? This
> cover letter is also different from the last time ...

I was talking about the grammatical issues below in the cover letter.
Sometimes people copy things, for example a text explaining what the
patch series is about, from the cover letter of version N to the cover
letter of version N + 1, so I thought that telling you about
grammatical issues in this cover letter could perhaps help you if you
have to write another cover letter for another version of this patch
series.

@gitgitgadget
Copy link

gitgitgadget bot commented Jun 5, 2021

On the Git mailing list, ZheNing Hu wrote (reply to this):

Christian Couder <christian.couder@gmail.com> 于2021年6月5日周六 下午12:49写道:
>
> On Sat, Jun 5, 2021 at 6:34 AM ZheNing Hu <adlternative@gmail.com> wrote:
> >
> > Hi, Christian,
> >
> > Christian Couder <christian.couder@gmail.com> 于2021年6月4日周五 下午8:53写道:
> > >
> > > No need to resend as it's a cover letter, but just in case there is
> > > another round and you copy things from this cover letter:
> > >
> >
> > Sorry, what is the bad place in this cover letter I write? This
> > cover letter is also different from the last time ...
>
> I was talking about the grammatical issues below in the cover letter.
> Sometimes people copy things, for example a text explaining what the
> patch series is about, from the cover letter of version N to the cover
> letter of version N + 1, so I thought that telling you about
> grammatical issues in this cover letter could perhaps help you if you
> have to write another cover letter for another version of this patch
> series.

Ok, I get it.

I want to mention another question:
If I have a new patch series about %(rest) is based on the current %(raw)
patch series, should I submit it immediately?

Thanks.
--
ZheNing Hu

@gitgitgadget
Copy link

gitgitgadget bot commented Jun 5, 2021

On the Git mailing list, Christian Couder wrote (reply to this):

On Sat, Jun 5, 2021 at 7:42 AM ZheNing Hu <adlternative@gmail.com> wrote:

> I want to mention another question:
> If I have a new patch series about %(rest) is based on the current %(raw)
> patch series, should I submit it immediately?

Yeah, I think it's ok to send it as long as you explicitly specify
(using a link for example) the patch series on the mailing list it
depends on.

@gitgitgadget
Copy link

gitgitgadget bot commented Jun 5, 2021

On the Git mailing list, ZheNing Hu wrote (reply to this):

Christian Couder <christian.couder@gmail.com> 于2021年6月5日周六 下午2:45写道:
>
> On Sat, Jun 5, 2021 at 7:42 AM ZheNing Hu <adlternative@gmail.com> wrote:
>
> > I want to mention another question:
> > If I have a new patch series about %(rest) is based on the current %(raw)
> > patch series, should I submit it immediately?
>
> Yeah, I think it's ok to send it as long as you explicitly specify
> (using a link for example) the patch series on the mailing list it
> depends on.

Ok. Because %(raw:textconv) %(raw:filter) dependent on %(raw) and %(rest),
%(raw) seems to have experienced a long cycle. These codes about new atoms
seem to stay in my local git repo for a long time. It may be a good thing to
send them earlier. :)

Thanks.

@gitgitgadget
Copy link

gitgitgadget bot commented Jun 8, 2021

This patch series was integrated into seen via git@418aa44.

@gitgitgadget gitgitgadget bot added the seen label Jun 8, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Jun 9, 2021

This patch series was integrated into seen via git@c281d03.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 1, 2021

This patch series was integrated into seen via git@3c525b5.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 2, 2021

This patch series was integrated into seen via git@ce080dd.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 3, 2021

This patch series was integrated into seen via git@c8db213.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 6, 2021

This patch series was integrated into seen via git@2a51e28.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 7, 2021

This patch series was integrated into seen via git@4fd9b2e.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 8, 2021

This patch series was integrated into seen via git@d917617.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 8, 2021

This patch series was integrated into seen via git@ec1bc30.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 9, 2021

This patch series was integrated into seen via git@16f3dbf.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 14, 2021

This patch series was integrated into seen via git@697afb8.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 28, 2021

This patch series was integrated into seen via git@b14cd70.

@gitgitgadget
Copy link

gitgitgadget bot commented Jul 30, 2021

This patch series was integrated into seen via git@e8d57e1.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 2, 2021

This patch series was integrated into seen via git@f673156.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 4, 2021

This patch series was integrated into seen via git@340b8d6.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 4, 2021

This patch series was integrated into seen via git@db57ce3.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 4, 2021

This patch series was integrated into seen via git@e2bcb8c.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 4, 2021

This patch series was integrated into seen via git@01d69c2.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 4, 2021

This patch series was integrated into seen via git@e8a6112.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 4, 2021

This patch series was integrated into next via git@4c4529d.

@gitgitgadget gitgitgadget bot added the next label Aug 4, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Aug 5, 2021

This patch series was integrated into seen via git@72ee1a2.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 6, 2021

This patch series was integrated into seen via git@30b8edb.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 6, 2021

This patch series was integrated into seen via git@0588156.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 9, 2021

This patch series was integrated into seen via git@3cd6276.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 9, 2021

This patch series was integrated into seen via git@076856e.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 11, 2021

This patch series was integrated into seen via git@d342a15.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 11, 2021

This patch series was integrated into seen via git@18426b1.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 15, 2021

This patch series was integrated into seen via git@4c4529d.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 15, 2021

This patch series was integrated into seen via git@aa00842.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 16, 2021

This patch series was integrated into seen via git@884fcb3.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 19, 2021

This patch series was integrated into seen via git@1bd338f.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 20, 2021

This patch series was integrated into seen via git@50faa16.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 23, 2021

This patch series was integrated into seen via git@1b66e8e.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 23, 2021

This patch series was integrated into next via git@5c06fb4.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 24, 2021

This patch series was integrated into seen via git@bda891e.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 24, 2021

This patch series was integrated into next via git@bda891e.

@gitgitgadget
Copy link

gitgitgadget bot commented Aug 24, 2021

This patch series was integrated into master via git@bda891e.

@gitgitgadget gitgitgadget bot added the master label Aug 24, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Aug 24, 2021

Closed via bda891e.

@gitgitgadget gitgitgadget bot closed this Aug 24, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant