pr-848/derrickstolee/chunk-format/refactor-v4
tagged this
18 Feb 14:07
This is a restart on the topic previously submitted [1] but dropped because ak/corrected-commit-date was still in progress. This version is based on that branch.

[1] https://lore.kernel.org/git/pull.804.git.1607012215.gitgitgadget@gmail.com/

This version also changes the approach to use a more dynamic interaction with a struct chunkfile pointer. This idea is credited to Taylor Blau [2], but I started again from scratch. I also go further to make struct chunkfile anonymous to API consumers. It is defined only in chunk-format.c, which should hopefully deter future users from interacting with that data directly.

[2] https://lore.kernel.org/git/X8%2FI%2FRzXZksio+ri@nand.local/

This combined API is beneficial to reduce duplicated logic. Or rather, to ensure that similar file formats have similar protections against bad data. The multi-pack-index code did not have as many guards as the commit-graph code did, but now they both share a common base that checks for things like duplicate chunks or offsets outside the size of the file.

Here are some stats for the end-to-end change:

 * 570 insertions(+), 456 deletions(-).
 * commit-graph.c: 107 insertions(+), 192 deletions(-)
 * midx.c: 164 insertions(+), 260 deletions(-)

While there is an overall increase to the code size, the consumers do get smaller. Boilerplate things like abstracting methods to match chunk_write_fn and chunk_read_fn make up a lot of these insertions. The "interesting" code gets a lot smaller and cleaner.

Updates in V4
=============

 * Out-of-date macros in commit-graph.c and midx.c are removed in their appropriate patches.
 * Documentation around the read API is improved.

Updates in V3
=============

 * API methods use better types and changed their order to match internal data more closely.
 * Use hashfile_total() instead of internal data values.
 * The implementation of pair_chunk() uses read_chunk().
 * init_chunkfile() has an in-code doc comment warning against using the same struct chunkfile for reads and writes.
 * More multiplications are correctly cast in midx.c.
 * The chunk-format technical docs are expanded.

Updates in V2
=============

 * The method pair_chunk() now automatically sets a pointer while read_chunk() uses the callback. This greatly reduces the code size.
 * Pointer casts are now implicit instead of explicit.
 * Extra care is taken to not overflow when verifying chunk sizes on write.

Thanks,
-Stolee

Derrick Stolee (17):
  commit-graph: anonymize data in chunk_write_fn
  chunk-format: create chunk format write API
  commit-graph: use chunk-format write API
  midx: rename pack_info to write_midx_context
  midx: use context in write_midx_pack_names()
  midx: add entries to write_midx_context
  midx: add pack_perm to write_midx_context
  midx: add num_large_offsets to write_midx_context
  midx: return success/failure in chunk write methods
  midx: drop chunk progress during write
  midx: use chunk-format API in write_midx_internal()
  chunk-format: create read chunk API
  commit-graph: use chunk-format read API
  midx: use chunk-format read API
  midx: use 64-bit multiplication for chunk sizes
  chunk-format: restore duplicate chunk checks
  chunk-format: add technical docs

 Documentation/technical/chunk-format.txt      | 116 +++++
 .../technical/commit-graph-format.txt         |   3 +
 Documentation/technical/pack-format.txt       |   3 +
 Makefile                                      |   1 +
 chunk-format.c                                | 179 ++++++++
 chunk-format.h                                |  68 +++
 commit-graph.c                                | 305 +++++-------
 midx.c                                        | 433 +++++++-----------
 t/t5318-commit-graph.sh                       |   2 +-
 t/t5319-multi-pack-index.sh                   |   6 +-
 10 files changed, 652 insertions(+), 464 deletions(-)
 create mode 100644 Documentation/technical/chunk-format.txt
 create mode 100644 chunk-format.c
 create mode 100644 chunk-format.h

base-commit: 5a3b130cad0d5c770f766e3af6d32b41766374c0
Submitted-As: https://lore.kernel.org/git/pull.848.v4.git.1613657259.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.848.git.1611676886.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.848.v2.git.1611759716.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.848.v3.git.1612535452.gitgitgadget@gmail.com
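
For readers who want a concrete picture of the guards and the read pattern the cover letter describes, here is a minimal standalone sketch. It is not the code in chunk-format.c: every toy_-prefixed name, the fixed-size chunk array, and the return values are assumptions made for illustration. It shows three ideas from the letter: rejecting duplicate chunk IDs and ranges that fall outside the file, building a pair_chunk()-style helper on top of a callback-based read_chunk()-style helper, and widening to 64 bits before multiplying to get a chunk size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; this is not Git's chunk-format API. */
#define TOY_MAX_CHUNKS 8
#define TOY_CHUNK_NOT_FOUND (-2)

struct toy_chunk {
	uint32_t id;
	const unsigned char *start;
	size_t size;
};

struct toy_chunkfile {
	struct toy_chunk chunks[TOY_MAX_CHUNKS];
	size_t nr;
};

typedef int (*toy_chunk_read_fn)(const unsigned char *start,
				 size_t size, void *data);

/*
 * Record one table-of-contents entry. Returns -1 for a duplicate ID,
 * a full table, or a byte range that does not fit inside the file.
 */
int toy_add_chunk(struct toy_chunkfile *cf, uint32_t id,
		  const unsigned char *file, size_t file_size,
		  uint64_t offset, uint64_t size)
{
	size_t i;

	assert(cf && file);

	if (cf->nr >= TOY_MAX_CHUNKS)
		return -1;
	for (i = 0; i < cf->nr; i++)
		if (cf->chunks[i].id == id)
			return -1; /* duplicate chunk */

	/* Written so that offset + size is never computed and cannot wrap. */
	if (offset > file_size || size > file_size - offset)
		return -1; /* range outside the file */

	cf->chunks[cf->nr].id = id;
	cf->chunks[cf->nr].start = file + (size_t)offset;
	cf->chunks[cf->nr].size = (size_t)size;
	cf->nr++;
	return 0;
}

/* Find a chunk by ID and hand it to the callback. */
int toy_read_chunk(const struct toy_chunkfile *cf, uint32_t id,
		   toy_chunk_read_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < cf->nr; i++)
		if (cf->chunks[i].id == id)
			return fn(cf->chunks[i].start,
				  cf->chunks[i].size, data);
	return TOY_CHUNK_NOT_FOUND;
}

/* Callback that only stores the chunk's start pointer. */
int toy_pair_chunk_fn(const unsigned char *start, size_t size, void *data)
{
	const unsigned char **p = data; /* implicit cast from void * */

	(void)size;
	*p = start;
	return 0;
}

/* The pointer-setting helper is just a read with the callback above. */
int toy_pair_chunk(const struct toy_chunkfile *cf, uint32_t id,
		   const unsigned char **p)
{
	return toy_read_chunk(cf, id, toy_pair_chunk_fn, p);
}

/* Widen before multiplying so a large table size does not wrap at 32 bits. */
uint64_t toy_table_size(uint32_t nr_entries, uint32_t entry_width)
{
	return (uint64_t)nr_entries * entry_width;
}
```

A caller would zero-initialize a struct toy_chunkfile, call toy_add_chunk() once per table-of-contents entry, and then use toy_pair_chunk() for chunks that only need a pointer or toy_read_chunk() for chunks that need parsing. Keeping the bounds and duplicate checks in one place is what lets two file formats share the same protections.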