
pr-959/adlternative/ref-filter-contents-raw-v1

In a2f3241 ([GSOC] ref-filter: add contents:raw atom) I did not notice the
test breakage. While fixing it, I found a very serious problem: the object
data output by "git cat-file tree foo" or "git cat-file blob foo" may
contain '\0'. However, most of the logic in ref-filter depends on the atom
output not containing '\0'.

Therefore, we must carry out a series of repairs to ref-filter so that it
can support output of data containing '\0'.

In the first patch, I add the *_quote_buf_with_size() functions, which can
handle data containing '\0'.
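To illustrate why a size-aware quoting function matters, here is a minimal
sketch in C. It is not the code from this series: sq_quote_sized() is a
hypothetical name, it writes into a caller-supplied buffer rather than a
strbuf, and it only escapes single quotes. The point is that the loop is
bounded by an explicit length instead of stopping at the first '\0'.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not git's actual code: shell-style single-quote
 * quoting that copies exactly `len` bytes from `src`, so embedded '\0'
 * bytes survive. Returns the number of bytes written to `out`, or 0 if
 * `cap` is too small for the worst case.
 */
size_t sq_quote_sized(const char *src, size_t len, char *out, size_t cap)
{
	size_t n = 0;
	size_t i;

	assert(src != NULL && out != NULL);

	/* Worst case: every byte becomes 4 bytes, plus two enclosing quotes. */
	if (cap < 4 * len + 2)
		return 0;

	out[n++] = '\'';
	for (i = 0; i < len; i++) {
		if (src[i] == '\'') {
			/* Close the quote, emit an escaped quote, reopen. */
			out[n++] = '\'';
			out[n++] = '\\';
			out[n++] = '\'';
			out[n++] = '\'';
		} else {
			/* Any other byte, including '\0', is copied as-is. */
			out[n++] = src[i];
		}
	}
	out[n++] = '\'';
	return n;
}
```

Because the result may itself contain '\0', the caller has to keep the
returned length and must not call strlen() on the output.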

In the second patch, I add the member s_size to struct atom_value, which
protects the output of the atom from being truncated at '\0', and thereby
support %(contents) for blobs and trees.
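The effect of carrying a size next to the string can be sketched as
follows. This is an illustration under assumptions, not the patch itself:
struct value_sketch, SIZE_UNSPECIFIED and append_value() are hypothetical
names, and a plain char buffer stands in for a strbuf. Only the idea of an
s_size member comes from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sentinel meaning "size unknown, treat s as a NUL-terminated string". */
#define SIZE_UNSPECIFIED ((size_t)-1)

/* Hypothetical stand-in for an atom value; only s and s_size matter here. */
struct value_sketch {
	const char *s;
	size_t s_size;
};

/*
 * Append the value to `out` (capacity `cap`, `used` bytes already filled).
 * Returns the new fill level, or `used` unchanged if it would not fit.
 */
size_t append_value(char *out, size_t cap, size_t used,
		    const struct value_sketch *v)
{
	size_t len;

	assert(out != NULL && v != NULL && v->s != NULL && used <= cap);

	/* Fall back to strlen() only when no explicit size was recorded. */
	len = (v->s_size == SIZE_UNSPECIFIED) ? strlen(v->s) : v->s_size;
	if (len > cap - used)
		return used;
	memcpy(out + used, v->s, len);
	return used + len;
}
```

With the data "ab\0cd", a value whose size is unspecified contributes only
the two bytes before the '\0', while a value with s_size set to 5
contributes all five bytes.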

In the third patch, I add the %(contents:raw) atom, which prints the
original content of an object.

What needs to be reconsidered:

For a binary object (blob or tree),

git for-each-ref --format="%(contents)" --python refs/mytrees/first

will output a string processed by python_quote_buf_with_size(), which
contains '\0'. But binary data seems to be useless after quoting. Should
we allow such binary data to be output in the default way with
strbuf_add()? If so, we can remove the first patch.

ZheNing Hu (3):
  [GSOC] quote: add *.quote_buf_with_size functions
  [GSOC] ref-filter: support %(contents) for blob, tree
  [GSOC] ref-filter: add contents:raw atom

 Documentation/git-for-each-ref.txt |  19 ++-
 quote.c                            | 116 +++++++++++++++
 quote.h                            |   4 +
 ref-filter.c                       | 229 +++++++++++++++++++++--------
 t/t6300-for-each-ref.sh            | 214 ++++++++++++++++++++++++++-
 5 files changed, 511 insertions(+), 71 deletions(-)

base-commit: 97eea85a0a1ec66d356567808a1e4ca2367e0ce7

Submitted-As: https://lore.kernel.org/git/pull.959.git.1621763612.gitgitgadget@gmail.com