pr-1632/listx/trailer-api-refactor-part-1-v4
Tagged 2024-02-06 05:12 UTC
This patch series is the first 10 patches of a larger cleanup/bugfix series (henceforth "larger series") I've been working on. The main goal of this series is to begin the process of "libifying" the trailer API. By "API" I mean the interface exposed in trailer.h. The larger series brings a number of additional cleanups (exposing and fixing some bugs along the way), and builds on top of this series.

When the larger series is merged, we will be in a good state to additionally pursue the following goals:

1. "API reuse inside Git": make the API expressive enough to eliminate any need by other parts of Git to use the interpret-trailers builtin as a subprocess (instead they could just use the API directly);

2. "API stability": add unit tests to codify the expected behavior of API functions; and

3. "API documentation": create developer-focused documentation to explain how to use the API effectively, noting any API limitations or anti-patterns.

The larger series itself doesn't tackle these goals directly because I believe that API code should be thought through from the ground up with a libification-focused perspective. Some structs and functions exposed in the API today should probably not be libified (read: kept in trailer.h) as is. For example, the "trailer_iterator" struct has a "private" member and it feels wrong to allow API users to peek inside here (and take at face value our future API users' pinky promise that they won't depend on those private internals not meant for public consumption).

One pattern we could use here to cleanly separate "what is the API" (publicly exposed) and "what is the implementation" (private) is the pointer-to-implementation ("pimpl") idiom.
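As a rough illustration of the idiom (not code from this series), the single-file sketch below keeps the struct definition out of the "header" part, so callers can only go through functions. All names here (demo_block, demo_block_new(), count_lines_via_api(), and so on) are hypothetical and are not part of Git's trailer API. The three sections stand in for what would normally be a header, a caller, and an implementation file.

```c
/*
 * Sketch of the pimpl idiom in C. Every name below is hypothetical and
 * exists only for illustration; none of it is Git's actual trailer API.
 */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- What would live in demo.h (public) ---- */

struct demo_block; /* opaque: callers never see the members */

struct demo_block *demo_block_new(const char *text);
size_t demo_block_line_count(const struct demo_block *block);
void demo_block_free(struct demo_block *block);

/* ---- What an API user could write ---- */

size_t count_lines_via_api(const char *text)
{
	struct demo_block *block = demo_block_new(text);
	size_t n;

	if (!block)
		return 0;
	/*
	 * Writing "block->nr_lines" here would not compile, because
	 * struct demo_block is still an incomplete type at this point.
	 */
	n = demo_block_line_count(block);
	demo_block_free(block);
	return n;
}

/* ---- What would live in demo.c (private) ---- */

struct demo_block {
	char *text;
	size_t nr_lines;
};

struct demo_block *demo_block_new(const char *text)
{
	struct demo_block *block;
	size_t len;
	const char *p;

	if (!text)
		return NULL;
	block = malloc(sizeof(*block));
	if (!block)
		return NULL;
	len = strlen(text);
	block->text = malloc(len + 1);
	if (!block->text) {
		free(block);
		return NULL;
	}
	memcpy(block->text, text, len + 1);

	/* Count newline-terminated lines, plus an unterminated last line. */
	block->nr_lines = 0;
	for (p = text; *p; p++)
		if (*p == '\n')
			block->nr_lines++;
	if (len && text[len - 1] != '\n')
		block->nr_lines++;
	return block;
}

size_t demo_block_line_count(const struct demo_block *block)
{
	assert(block);
	return block->nr_lines;
}

void demo_block_free(struct demo_block *block)
{
	if (!block)
		return;
	free(block->text);
	free(block);
}
```

Because the caller only ever holds a pointer to an incomplete type, the members of the struct can be renamed or replaced without touching any caller, as long as the behavior of the exposed functions stays the same.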
There may be other appropriate patterns, but I've chosen this one because it's a simple, low-level concept (put structs in foo.c instead of foo.h), which has far-reaching high-level consequences (API users must rely exclusively on the API to make use of such private structs, via opaque pointers). The pimpl idiom for C comes from the book "C Interfaces and Implementations" (see patch "trailer: make trailer_info struct private").

The idea of turning a public struct into a private one is a fundamental question of libification because it forces us to reconsider all of the data structures we have and how they're actually used by already existing users. For the trailer API, those existing users are the "interpret-trailers" builtin command, and anything else that includes the "trailer.h" header file (e.g., sequencer.c).

One advantage of this idiom is that even the compiler understands it --- the compiler will loudly complain if you try to access the innards of a private struct through an opaque pointer.

Another advantage of this idiom is that it helps to reduce the probability of breaking changes in the API. Because a private struct's members are out of view from our users (they only know about opaque pointers to the private struct, not its members), we are free to modify the members of the struct at any time, as much as we like, as long as we don't break the semantics of the exposed API functions (which is why unit-testing these API functions will be crucial long-term).

If this pimpl idiom turns out to be a mistake, undoing it is easy --- just move the relevant struct definition from foo.c to the header file. So it's a great way to try things out without digging ourselves into a pit of despair that will be difficult to get out of.

With the libification-focused goals out of the way, let's turn to this patch series in more detail.
Currently, we have "process_trailers()" in trailer.h which does many different things (parse command-line arguments, create temporary files, etc.) that are independent of the concept of "trailers". Keeping this function as an API function would make unit-testing it difficult. While there is no technical reason why we couldn't write unit tests for the smaller functions that are called within process_trailers(), doing so would involve testing private ("static" in trailer.c) functions instead of API functions, which defeats the goal of "API stability" mentioned above.

As an alternative to how things are done in this patch series, we could keep trailer.h intact and decide to unit-test the existing "trailer_info_get()" function which does most of the trailer parsing work (and is used by sequencer.c). However, this function wouldn't be easy to test either, because the resulting "trailer_info" struct merely contains the unparsed "trailers" lines. So the unit test (if it wants to inspect the result of parsing these lines) would have to invoke additional parsing functions itself. And at that point it would no longer be a unit test in the traditional sense, because it would be invoking multiple functions at once.

In summary, this series breaks up "process_trailers()" into smaller pieces, exposing many of the parts relevant to trailer-related processing in trailer.h. This will force us to eventually introduce unit tests for these API functions, but that is a good thing for API stability.

In the future after libification is "complete", users external to Git will be able to use the same trailer processing API used by the interpret-trailers builtin. For example, a web server may want to parse trailers the same way that Git would parse them, without having to call interpret-trailers as a subprocess. This use case was the original motivation behind my work in this area.
Thanks to the aggressive refactoring in this series, I've been able to identify and fix several bugs in our existing implementation. Those fixes build on top of this series but were not included here, in order to keep this series small. Below is a "shortlog" of those fixes I have locally:

* "trailer: trailer replacement should not change its position"
  (If we found a trailer we'd like to replace, preserve its position relative to the other trailers found in the trailer block, instead of always moving it to the beginning or end of the entire trailer block.)

* "interpret-trailers: preserve trailers coming from the input"
  (Sometimes, the parsed trailers from the input will be formatted differently depending on whether we provide --only-trailers or not. Make the trailers that were not modified and which are coming directly from the input get formatted the same way, regardless of this flag.)

* "interpret-trailers: do not modify the input if NOP"
  (Refrain from subtracting or adding a newline around the patch divider "---" if we are not adding new trailers.)

* "trailer formatter: split up format_trailer() monolith"
  (Fix a bug in git-log where we still printed a blank newline even if we didn't want to format anything.)

* "interpret-trailers: fail if given unrecognized arguments"
  (E.g., for "--where", only accept recognized WHERE_* enum values. If we get something unrecognized, fail with an error instead of silently doing nothing. Ditto for "--if-exists" and "--if-missing".)

Notable changes in v4
=====================

* Patches 3, 4, 5, and 8 have been broken up into smaller steps. There are 28 instead of 10 patches now, but these 28 should be much easier to review than the (previously condensed) 10.
* NEW Patch 1: "trailer: free trailer_info _after_ all related usage" fixes awkward use-after-free coding style
* NEW Patch 2: "shortlog: add test for de-duplicating folded trailers" increases test coverage related to trailer iterators and "unfold_value()"
* NEW Patch 27: "trailer_set_*(): put out parameter at the end" is a small refactor to reorder parameters.
* Patches 5-16: These smaller patches make up Patch 3 from v3.
* Patches 17-18: These smaller patches make up Patch 4 from v3.
* Patches 19-20: These smaller patches make up Patch 5 from v3.
* Patches 23-26: These smaller patches make up Patch 8 from v3.
* Anonymize unambiguous parameters in <trailer.h>.

Notable changes in v3
=====================

* Squashed Patch 4 into Patch 3 ("trailer: unify trailer formatting machinery"), to avoid breaking the build ("-Werror=unused-function" violations)
* NEW (Patch 10): Introduce "trailer template" terminology for readability (no behavioral change)
* (API function) Rename default_separators() to trailer_default_separators()
* (API function) Rename new_trailers_clear() to free_trailer_templates()
* trailer.h: for single-parameter functions, anonymize the parameter name to reduce verbosity

Notable changes in v2
=====================

* (cover letter) Discuss goals of the larger series in more detail, especially the pimpl idiom
* (cover letter) List bug fixes pending in the larger series that depend on this series
* Reorder function parameters to have trailer options at the beginning (and out parameters toward the end)
* "sequencer: use the trailer iterator": prefer C string instead of strbuf for new "raw" field
* Patch 1 (was Patch 2) also renames ensure_configured() to trailer_config_init() (forgot to rename this one previously)

Linus Arver (28):
  trailer: free trailer_info _after_ all related usage
  shortlog: add test for de-duplicating folded trailers
  trailer: prepare to expose functions as part of API
  trailer: move interpret_trailers() to interpret-trailers.c
  trailer: start preparing for formatting unification
  trailer_info_get(): reorder parameters
  format_trailers(): use strbuf instead of FILE
  format_trailer_info(): move "fast path" to caller
  format_trailers_from_commit(): indirectly call trailer_info_get()
  format_trailer_info(): use trailer_item objects
  format_trailer_info(): drop redundant unfold_value()
  format_trailer_info(): append newline for non-trailer lines
  trailer: begin formatting unification
  format_trailer_info(): teach it about opts->trim_empty
  format_trailer_info(): avoid double-printing the separator
  trailer: finish formatting unification
  trailer: teach iterator about non-trailer lines
  sequencer: use the trailer iterator
  trailer: make trailer_info struct private
  trailer: retire trailer_info_get() from API
  trailer: spread usage of "trailer_block" language
  trailer: prepare to delete "parse_trailers_from_command_line_args()"
  trailer: add new helper functions to API
  trailer_add_arg_item(): drop new_trailer_item usage
  trailer: deprecate "new_trailer_item" struct from API
  trailer: unify "--trailer ..." arg handling
  trailer_set_*(): put out parameter at the end
  trailer: introduce "template" term for readability

 builtin/interpret-trailers.c | 189 ++++++--
 pretty.c                     |   2 +-
 ref-filter.c                 |   2 +-
 sequencer.c                  |  27 +-
 t/t4201-shortlog.sh          |  32 ++
 trailer.c                    | 811 +++++++++++++++++------------------
 trailer.h                    | 109 ++---
 7 files changed, 642 insertions(+), 530 deletions(-)

base-commit: a54a84b333adbecf7bc4483c0e36ed5878cac17b
Submitted-As: https://lore.kernel.org/git/pull.1632.v4.git.1707196348.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.1632.git.1704869487.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.1632.v2.git.1706308737.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.1632.v3.git.1706664144.gitgitgadget@gmail.com