[pull] master from git:master#19
Merged
pull[bot] merged 31 commits intoLadyK-21:masterfrom Aug 19, 2022
Merged
Conversation
Add the 'recursive' diff flag to the local changes reporting done by 'git checkout' in 'show_local_changes()'. Without the flag enabled, unexpanded sparse directories will not be recursed into to report the diff of each file's contents, resulting in the reported local changes including "modified" sparse directories. The same issue was found and fixed for 'git status' in 2c521b0 (status: fix nested sparse directory diff in sparse index, 2022-03-01) Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update 'do_oneway_diff()' to perform a 'diff_tree_oid()' on removed sparse directories, as it does for added or modified sparse directories (see 9eb00af (diff-lib: handle index diffs with sparse dirs, 2021-07-14)). At the moment, this update is unreachable code because 'unpack_trees()' (currently the only way 'oneway_diff()' can be called, via 'diff_cache()') will always traverse trees down to the individual removed files of a deleted sparse directory. A subsequent patch will change this to better preserve a sparse index in other uses of 'unpack_tree()', e.g. 'git reset --hard'. However, making that change without this patch would result in (among other issues) 'git status' printing only the name of a deleted sparse directory, not its contents. To avoid introducing that bug, 'do_oneway_diff()' is updated before modifying 'unpack_trees()'. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add 'index_name_pos_sparse()', which behaves the same as 'index_name_pos()', except that it does not expand a sparse index to search for an entry inside a sparse directory. 'index_entry_exists()' was originally implemented in 20ec2d0 (reset: make sparse-aware (except --mixed), 2021-11-29) as an alternative to 'index_name_pos()' to allow callers to search for an index entry without expanding a sparse index. However, that particular use case only required knowing whether the requested entry existed, so 'index_entry_exists()' does not return the index positioning information provided by 'index_name_pos()'. This patch implements 'index_name_pos_sparse()' to accommodate callers that need the positioning information of 'index_name_pos()', but do not want to expand the index. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If 'unpack_single_entry()' is unpacking a new directory tree (that is, one not already present in the index) into a sparse index, unpack the tree as a sparse directory rather than traversing its contents and unpacking each file individually. This helps keep the sparse index as collapsed as possible in cases such as 'git reset --hard' restoring a outside-of-cone directory removed with 'git rm -r --sparse'. Without this patch, 'unpack_single_entry()' will only unpack a directory into the index as a sparse directory (rather than traversing into it and unpacking its files one-by-one) if an entry with the same name already exists in the index. This patch allows sparse directory unpacking without a matching index entry when the following conditions are met: 1. the directory's path is outside the sparse cone, and 2. there are no children of the directory in the index If a directory meets these requirements (as determined by 'is_new_sparse_dir()'), 'unpack_single_entry()' unpacks the sparse directory index entry and propagates the decision back up to 'unpack_callback()' to prevent unnecessary tree traversal into the unpacked directory. Reported-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* vd/sparse-reset-checkout-fixes: unpack-trees: unpack new trees as sparse directories cache.h: create 'index_name_pos_sparse()' oneway_diff: handle removed sparse directories checkout: fix nested sparse directory diff in sparse index
Add tests for `git-rm`, make sure it behaves as expected when <pathspec> is both inside or outside of sparse-checkout definition. Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Method pathspec_needs_expanded_index() in reset.c from 4d1cfc1 (reset: make --mixed sparse-aware, 2021-11-29) is reusable when we need to verify if the index needs to be expanded when the command is utilizing a pathspec rather than a literal path. Move it to pathspec.h for reusability. Add a few items to the function so it can better serve its purpose as a standalone public function: * Add a check in front so if the index is not sparse, return early since no expansion is needed. * It now takes an arbitrary 'struct index_state' pointer instead of using `the_index` and `active_cache`. * Add documentation to the function. Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the `ensure_full_index()` method so `git-rm` does not always expand the index when the expansion is unnecessary, i.e. when <pathspec> does not have any possibilities to match anything outside of sparse-checkout definition. Expand the index when the <pathspec> needs an expanded index, i.e. the <pathspec> contains wildcard that may need a full-index or the <pathspec> is simply outside of sparse-checkout definition. Notice that the test 'rm pathspec expands index when necessary' in t1092 *is* testing this code change behavior, though it will be marked as 'test_expect_success' only in the next patch, where we officially mark `command_requires_full_index = 0`, so the index does not expand unless we tell it to do so. Notice that because we also want `ensure_full_index` to record the stdout and stderr from Git command, a corresponding modification is also included in this patch. The reason we want the "sparse-index-out" and "sparse-index-err", is that we need to make sure there is no error from Git command itself, so we can rely on the `test_region` result and determine if the index is expanded or not. Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Enable the sparse index within the `git-rm` command.
The `p2000` tests demonstrate a ~92% execution time reduction for
'git rm' using a sparse index.
Test HEAD~1 HEAD
--------------------------------------------------------------------------
2000.74: git rm ... (full-v3) 0.41(0.37+0.05) 0.43(0.36+0.07) +4.9%
2000.75: git rm ... (full-v4) 0.38(0.34+0.05) 0.39(0.35+0.05) +2.6%
2000.76: git rm ... (sparse-v3) 0.57(0.56+0.01) 0.05(0.05+0.00) -91.2%
2000.77: git rm ... (sparse-v4) 0.57(0.55+0.02) 0.03(0.03+0.00) -94.7%
----
Also, normalize a behavioral difference of `git-rm` under sparse-index.
See related discussion [1].
`git-rm` a sparse-directory entry within a sparse-index enabled repo
behaves differently from a sparse directory within a sparse-checkout
enabled repo.
For example, in a sparse-index repo, where 'folder1' is a
sparse-directory entry, `git rm -r --sparse folder1` provides this:
rm 'folder1/'
Whereas in a sparse-checkout repo *without* sparse-index, doing so
provides this:
rm 'folder1/0/0/0'
rm 'folder1/0/1'
rm 'folder1/a'
Because `git rm` a sparse-directory entry does not need to expand the
index, therefore we should accept the current behavior, which is faster
than "expand the sparse-directory entry to match the sparse-checkout
situation".
Modify a previous test so such difference is not considered as an error.
[1] ffyuanda#6 (comment)
Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bug report https://lore.kernel.org/git/AM0PR02MB56357CC96B702244F3271014E8DC9@AM0PR02MB5635.eurprd02.prod.outlook.com/ noted that a file containing /r/r/n needed renormalising twice. This is by design. Lone CR characters, not paired with an LF, are left unchanged. Note this limitation of the "clean" filter in the documentation. Renormalize was introduced at 9472935 (add: introduce "--renormalize", Torsten Bögershausen, 2017-11-16) Signed-off-by: Philip Oakley <philipoakley@iee.email> Reviewed-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name of the variable is wrong, and it can be set to anything, like 1. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Reviewed-by: Fernando Ramos <greenfoo@u92.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When vimdiff3 was added in 7c147b7 (mergetools: add vimdiff3 mode, 2014-04-20), the description made clear the intention: It's similar to the default, except that the other windows are hidden. This ensures that removed/added colors are still visible on the main merge window, but the other windows not visible. However, in 0041797 (vimdiff: new implementation with layout support, 2022-03-30) this was broken by generating a command that never creates windows, and therefore vim never shows the diff. The layout support implementation broke the whole purpose of vimdiff3, and simply shows MERGED, which is no different from simply opening the file with vim. In order to show the diff, the windows need to be created first, and then when they are hidden the diff remains (if hidenoff isn't set), but by setting the `hidden` option the initial buffers are marked as hidden thus making the feature work. Suggested-by: Fernando Ramos <greenfoo@u92.eu> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Reviewed-by: Fernando Ramos <greenfoo@u92.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using the single window mode we are greeted with the following warning: "./content_LOCAL_8975" 6L, 28B "./content_BASE_8975" 6 lines, 29 bytes "./content_REMOTE_8975" 6 lines, 29 bytes "content" 16 lines, 115 bytes Press ENTER or type command to continue every time. Silence that. Suggested-by: Fernando Ramos <greenfoo@u92.eu> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Reviewed-by: Fernando Ramos <greenfoo@u92.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When diffopt has hiddenoff set and there's only one window (as is the case in the single window mode) the diff mode is turned off. We don't want that, so turn that option off. Cc: Fernando Ramos <greenfoo@u92.eu> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Reviewed-by: Fernando Ramos <greenfoo@u92.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we treat tabs especially, the logic becomes much simpler. Cc: Fernando Ramos <greenfoo@u92.eu> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Reviewed-by: Fernando Ramos <greenfoo@u92.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Layouts with a single window other than "MERGED" do not work (e.g.
"LOCAL" or "MERGED+LOCAL").
This is because as the documentation of bufdo says:
The last buffer (or where an error occurred) becomes the current
buffer.
And we do always do bufdo the end.
Additionally, we do it only once, when it should be per tab.
Fix this by doing it once per tab right after it's created and before
any buffer is switched.
Cc: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we wrap the tabdo command there's no need for a separate command call. Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> Reviewed-by: Fernando Ramos <greenfoo@u92.eu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce the idea of bundle URIs to the Git codebase through an aspirational design document. This document includes the full design intended to include the feature in its fully-implemented form. This will take several steps as detailed in the Implementation Plan section. By committing this document now, it can be used to motivate changes necessary to reach these final goals. The design can still be altered as new information is discovered. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous change introduced the bundle URI design document. It creates a flexible set of options that allow bundle providers many ways to organize Git object data and speed up clones and fetches. It is particularly important that we have flexibility so we can apply future advancements as new ideas for efficiently organizing Git data are discovered. However, the design document does not provide even an example of how bundles could be organized, and that makes it difficult to envision how the feature should work at the end of the implementation plan. Add a section that details how a bundle provider could work, including using the Git server advertisement for multiple geo-distributed servers. This organization is based on the GVFS Cache Servers which have successfully used similar ideas to provide fast object access and reduced server load for very large repositories. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
When using init_tree_desc() and tree_entry() to iterate over a tree, we
always canonicalize the modes coming out of the tree. This is a good
thing to prevent bugs or oddities in normal code paths, but it's
counter-productive for tools like fsck that want to see the exact
contents.
We can address this by adding an option to avoid the extra
canonicalization. A few notes on the implementation:
- I've attached the new option to the tree_desc struct itself. The
actual code change is in decode_tree_entry(), which is in turn
called by the public update_tree_entry(), tree_entry(), and
init_tree_desc() functions, plus their "gently" counterparts.
By letting it ride along in the struct, we can avoid changing the
signature of those functions, which are called many times. Plus it's
conceptually simpler: you really want a particular iteration of a
tree to be "raw" or not, rather than individual calls.
- We still have to set the new option somewhere. The struct is
initialized by init_tree_desc(). I added the new flags field only to
the "gently" version. That avoids disturbing the much more numerous
non-gentle callers, and it makes sense that anybody being careful
about looking at raw modes would also be careful about bogus trees
(i.e., the caller will be something like fsck in the first place).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We use the normal tree_desc code to iterate over trees in fsck, meaning we only see the canonicalized modes it returns. And hence we'd never see anything unexpected, since it will coerce literally any garbage into one of our normal and accepted modes. We can use the new RAW_MODES flag to see the real modes, and then use the existing code to actually analyze them. The existing code is written as allow-known-good, so there's not much point in testing a variety of breakages. The one tested here should be S_IFREG but with nonsense permissions. Do note that the error-reporting here isn't great. We don't mention the specific bad mode, but just that the tree has one or more broken modes. But when you go to look at it with "git ls-tree", we'll report the canonicalized mode! This isn't ideal, but given that this should come up rarely, and that any number of other tree corruptions might force you into looking at the binary bytes via "cat-file", it's not the end of the world. And it's something we can improve on top later if we choose. Reported-by: Xavier Morel <xavier.morel@masklinn.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit un-broke the "badFileMode" check; before then it was literally testing nothing. And as far as I can tell, it has been so since the very initial version of fsck. The current severity of "badFileMode" is just "warning". But in the --strict mode used by transfer.fsckObjects, that is elevated to an error. This will potentially cause hassle for users, because historical objects with bad modes will suddenly start causing pushes to many server operators to be rejected. At the same time, these bogus modes aren't actually a big risk. Because we canonicalize them everywhere besides fsck, they can't cause too much mischief in the real world. The worst thing you can do is end up with two almost-identical trees that have different hashes but are interpreted the same. That will generally cause things to be inefficient rather than wrong, and is a bug somebody working on a Git implementation would want to fix, but probably not worth inconveniencing users by refusing to push or fetch. So let's downgrade this to "info" by default, which is our setting for "mention this when fscking, but don't ever reject, even under strict mode". If somebody really wants to be paranoid, they can still adjust the level using config. Suggested-by: Xavier Morel <xavier.morel@masklinn.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
The '--disk-usage' option for git-rev-list was introduced in 16950f8 (rev-list: add --disk-usage option for calculating disk usage, 2021-02-09). This is very useful for people inspect their git repo's objects usage infomation, but the resulting number is quit hard for a human to read. Teach git rev-list to output a human readable result when using '--disk-usage'. Signed-off-by: Li Linchao <lilinchao@oschina.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Documentation for "git add --renormalize" has been improved. * po/doc-add-renormalize: doc add: renormalize is not idempotent for CRCRLF
"vimdiff3" regression fix. * fc/vimdiff-layout-vimdiff3-fix: mergetools: vimdiff: simplify tabfirst mergetools: vimdiff: fix single window layouts mergetools: vimdiff: rework tab logic mergetools: vimdiff: fix for diffopt mergetools: vimdiff: silence annoying messages mergetools: vimdiff: make vimdiff3 actually work mergetools: vimdiff: fix comment
"git fsck" reads mode from tree objects but canonicalizes the mode before passing it to the logic to check object sanity, which has hid broken tree objects from the checking logic. This has been corrected, but to help exiting projects with broken tree objects that they cannot fix retroactively, the severity of anomalies this code detects has been demoted to "info" for now. * jk/fsck-tree-mode-bits-fix: fsck: downgrade tree badFilemode to "info" fsck: actually detect bad file modes in trees tree-walk: add a mechanism for getting non-canonicalized modes
The "bundle URI" design gets documented. * ds/bundle-uri-more: bundle-uri: add example bundle organization docs: document bundle URI standard
Fixes to sparse index compatibility work for "reset" and "checkout" commands. * vd/sparse-reset-checkout-fixes: unpack-trees: unpack new trees as sparse directories cache.h: create 'index_name_pos_sparse()' oneway_diff: handle removed sparse directories checkout: fix nested sparse directory diff in sparse index
"git rm" has become more aware of the sparse-index feature. * sy/sparse-rm: rm: integrate with sparse-index rm: expand the index only when necessary pathspec.h: move pathspec_needs_expanded_index() from reset.c to here t1092: add tests for `git-rm`
"git rev-list --disk-usage" learned to take an optional value "human" to show the reported value in human-readable format, like "3.40MiB". * ll/disk-usage-humanise: rev-list: support human-readable output for `--disk-usage`
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pull bot
pushed a commit
that referenced
this pull request
Jan 17, 2023
There is an out-of-bounds read possible when parsing gitattributes that
have an attribute that is 2^31+1 bytes long. This is caused due to an
integer overflow when we assign the result of strlen(3P) to an `int`,
where we use the wrapped-around value in a subsequent call to
memcpy(3P). The following code reproduces the issue:
blob=$(perl -e 'print "a" x 2147483649 . " attr"' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,$blob,.gitattributes
git check-attr --all file
AddressSanitizer:DEADLYSIGNAL
=================================================================
==8451==ERROR: AddressSanitizer: SEGV on unknown address 0x7f93efa00800 (pc 0x7f94f1f8f082 bp 0x7ffddb59b3a0 sp 0x7ffddb59ab28 T0)
==8451==The signal is caused by a READ memory access.
#0 0x7f94f1f8f082 (/usr/lib/libc.so.6+0x176082)
#1 0x7f94f2047d9c in __interceptor_strspn /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:752
#2 0x560e190f7f26 in parse_attr_line attr.c:375
#3 0x560e190f9663 in handle_attr_line attr.c:660
#4 0x560e190f9ddd in read_attr_from_index attr.c:769
#5 0x560e190f9f14 in read_attr attr.c:797
#6 0x560e190fa24e in bootstrap_attr_stack attr.c:867
#7 0x560e190fa4a5 in prepare_attr_stack attr.c:902
#8 0x560e190fb5dc in collect_some_attrs attr.c:1097
#9 0x560e190fb93f in git_all_attrs attr.c:1128
#10 0x560e18e6136e in check_attr builtin/check-attr.c:67
#11 0x560e18e61c12 in cmd_check_attr builtin/check-attr.c:183
#12 0x560e18e15993 in run_builtin git.c:466
#13 0x560e18e16397 in handle_builtin git.c:721
#14 0x560e18e16b2b in run_argv git.c:788
#15 0x560e18e17991 in cmd_main git.c:926
#16 0x560e190ae2bd in main common-main.c:57
#17 0x7f94f1e3c28f (/usr/lib/libc.so.6+0x2328f)
#18 0x7f94f1e3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#19 0x560e18e110e4 in _start ../sysdeps/x86_64/start.S:115
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/usr/lib/libc.so.6+0x176082)
==8451==ABORTING
Fix this bug by converting the variable to a `size_t` instead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pull bot
pushed a commit
that referenced
this pull request
Jan 17, 2023
It is possible to trigger an integer overflow when parsing attribute
names when there are more than 2^31 of them for a single pattern. This
can either lead to us dying due to trying to request too many bytes:
blob=$(perl -e 'print "f" . " a=" x 2147483649' | git hash-object -w --stdin)
git update-index --add --cacheinfo 100644,$blob,.gitattributes
git attr-check --all file
=================================================================
==1022==ERROR: AddressSanitizer: requested allocation size 0xfffffff800000032 (0xfffffff800001038 after adjustments for alignment, red zones etc.) exceeds maximum supported size of 0x10000000000 (thread T0)
#0 0x7fd3efabf411 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77
#1 0x5563a0a1e3d3 in xcalloc wrapper.c:150
#2 0x5563a058d005 in parse_attr_line attr.c:384
#3 0x5563a058e661 in handle_attr_line attr.c:660
#4 0x5563a058eddb in read_attr_from_index attr.c:769
#5 0x5563a058ef12 in read_attr attr.c:797
#6 0x5563a058f24c in bootstrap_attr_stack attr.c:867
#7 0x5563a058f4a3 in prepare_attr_stack attr.c:902
#8 0x5563a05905da in collect_some_attrs attr.c:1097
#9 0x5563a059093d in git_all_attrs attr.c:1128
#10 0x5563a02f636e in check_attr builtin/check-attr.c:67
#11 0x5563a02f6c12 in cmd_check_attr builtin/check-attr.c:183
#12 0x5563a02aa993 in run_builtin git.c:466
#13 0x5563a02ab397 in handle_builtin git.c:721
#14 0x5563a02abb2b in run_argv git.c:788
#15 0x5563a02ac991 in cmd_main git.c:926
#16 0x5563a05432bd in main common-main.c:57
#17 0x7fd3ef82228f (/usr/lib/libc.so.6+0x2328f)
==1022==HINT: if you don't care about these errors you may set allocator_may_return_null=1
SUMMARY: AddressSanitizer: allocation-size-too-big /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:77 in __interceptor_calloc
==1022==ABORTING
Or, much worse, it can lead to an out-of-bounds write because we
underallocate and then memcpy(3P) into an array:
perl -e '
print "A " . "\rh="x2000000000;
print "\rh="x2000000000;
print "\rh="x294967294 . "\n"
' >.gitattributes
git add .gitattributes
git commit -am "evil attributes"
$ git clone --quiet /path/to/repo
=================================================================
==15062==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000002550 at pc 0x5555559884d5 bp 0x7fffffffbc60 sp 0x7fffffffbc58
WRITE of size 8 at 0x602000002550 thread T0
#0 0x5555559884d4 in parse_attr_line attr.c:393
#1 0x5555559884d4 in handle_attr_line attr.c:660
#2 0x555555988902 in read_attr_from_index attr.c:784
#3 0x555555988902 in read_attr_from_index attr.c:747
#4 0x555555988a1d in read_attr attr.c:800
#5 0x555555989b0c in bootstrap_attr_stack attr.c:882
#6 0x555555989b0c in prepare_attr_stack attr.c:917
#7 0x555555989b0c in collect_some_attrs attr.c:1112
#8 0x55555598b141 in git_check_attr attr.c:1126
#9 0x555555a13004 in convert_attrs convert.c:1311
#10 0x555555a95e04 in checkout_entry_ca entry.c:553
#11 0x555555d58bf6 in checkout_entry entry.h:42
#12 0x555555d58bf6 in check_updates unpack-trees.c:480
#13 0x555555d5eb55 in unpack_trees unpack-trees.c:2040
#14 0x555555785ab7 in checkout builtin/clone.c:724
#15 0x555555785ab7 in cmd_clone builtin/clone.c:1384
#16 0x55555572443c in run_builtin git.c:466
#17 0x55555572443c in handle_builtin git.c:721
#18 0x555555727872 in run_argv git.c:788
#19 0x555555727872 in cmd_main git.c:926
#20 0x555555721fa0 in main common-main.c:57
#21 0x7ffff73f1d09 in __libc_start_main ../csu/libc-start.c:308
#22 0x555555723f39 in _start (git+0x1cff39)
0x602000002552 is located 0 bytes to the right of 2-byte region [0x602000002550,0x602000002552) allocated by thread T0 here:
#0 0x7ffff768c037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x555555d7fff7 in xcalloc wrapper.c:150
#2 0x55555598815f in parse_attr_line attr.c:384
#3 0x55555598815f in handle_attr_line attr.c:660
#4 0x555555988902 in read_attr_from_index attr.c:784
#5 0x555555988902 in read_attr_from_index attr.c:747
#6 0x555555988a1d in read_attr attr.c:800
#7 0x555555989b0c in bootstrap_attr_stack attr.c:882
#8 0x555555989b0c in prepare_attr_stack attr.c:917
#9 0x555555989b0c in collect_some_attrs attr.c:1112
#10 0x55555598b141 in git_check_attr attr.c:1126
#11 0x555555a13004 in convert_attrs convert.c:1311
#12 0x555555a95e04 in checkout_entry_ca entry.c:553
#13 0x555555d58bf6 in checkout_entry entry.h:42
#14 0x555555d58bf6 in check_updates unpack-trees.c:480
#15 0x555555d5eb55 in unpack_trees unpack-trees.c:2040
#16 0x555555785ab7 in checkout builtin/clone.c:724
#17 0x555555785ab7 in cmd_clone builtin/clone.c:1384
#18 0x55555572443c in run_builtin git.c:466
#19 0x55555572443c in handle_builtin git.c:721
#20 0x555555727872 in run_argv git.c:788
#21 0x555555727872 in cmd_main git.c:926
#22 0x555555721fa0 in main common-main.c:57
#23 0x7ffff73f1d09 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: heap-buffer-overflow attr.c:393 in parse_attr_line
Shadow bytes around the buggy address:
0x0c047fff8450: fa fa 00 02 fa fa 00 07 fa fa fd fd fa fa 00 00
0x0c047fff8460: fa fa 02 fa fa fa fd fd fa fa 00 06 fa fa 05 fa
0x0c047fff8470: fa fa fd fd fa fa 00 02 fa fa 06 fa fa fa 05 fa
0x0c047fff8480: fa fa 07 fa fa fa fd fd fa fa 00 01 fa fa 00 02
0x0c047fff8490: fa fa 00 03 fa fa 00 fa fa fa 00 01 fa fa 00 03
=>0x0c047fff84a0: fa fa 00 01 fa fa 00 02 fa fa[02]fa fa fa fa fa
0x0c047fff84b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff84c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff84d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff84e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff84f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==15062==ABORTING
Fix this bug by using `size_t` instead to count the number of attributes
so that this value cannot reasonably overflow without running out of
memory before already.
Reported-by: Markus Vervier <markus.vervier@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pull bot
pushed a commit
that referenced
this pull request
Jan 17, 2023
When using a padding specifier in the pretty format passed to git-log(1)
we need to calculate the string length in several places. These string
lengths are stored in `int`s though, which means that these can easily
overflow when the input lengths exceeds 2GB. This can ultimately lead to
an out-of-bounds write when these are used in a call to memcpy(3P):
==8340==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1ec62f97fe at pc 0x7f2127e5f427 bp 0x7ffd3bd63de0 sp 0x7ffd3bd63588
WRITE of size 1 at 0x7f1ec62f97fe thread T0
#0 0x7f2127e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
#1 0x5628e96aa605 in format_and_pad_commit pretty.c:1762
#2 0x5628e96aa7f4 in format_commit_item pretty.c:1801
#3 0x5628e97cdb24 in strbuf_expand strbuf.c:429
#4 0x5628e96ab060 in repo_format_commit_message pretty.c:1869
#5 0x5628e96acd0f in pretty_print_commit pretty.c:2161
#6 0x5628e95a44c8 in show_log log-tree.c:781
#7 0x5628e95a76ba in log_tree_commit log-tree.c:1117
#8 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508
#9 0x5628e922c35b in cmd_log_walk builtin/log.c:549
#10 0x5628e922f1a2 in cmd_log builtin/log.c:883
#11 0x5628e9106993 in run_builtin git.c:466
#12 0x5628e9107397 in handle_builtin git.c:721
#13 0x5628e9107b07 in run_argv git.c:788
#14 0x5628e91088a7 in cmd_main git.c:923
#15 0x5628e939d682 in main common-main.c:57
#16 0x7f2127c3c28f (/usr/lib/libc.so.6+0x2328f)
#17 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#18 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115
0x7f1ec62f97fe is located 2 bytes to the left of 4831838265-byte region [0x7f1ec62f9800,0x7f1fe62f9839)
allocated by thread T0 here:
#0 0x7f2127ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
#1 0x5628e98774d4 in xrealloc wrapper.c:136
#2 0x5628e97cb01c in strbuf_grow strbuf.c:99
#3 0x5628e97ccd42 in strbuf_addchars strbuf.c:327
#4 0x5628e96aa55c in format_and_pad_commit pretty.c:1761
#5 0x5628e96aa7f4 in format_commit_item pretty.c:1801
#6 0x5628e97cdb24 in strbuf_expand strbuf.c:429
#7 0x5628e96ab060 in repo_format_commit_message pretty.c:1869
#8 0x5628e96acd0f in pretty_print_commit pretty.c:2161
#9 0x5628e95a44c8 in show_log log-tree.c:781
#10 0x5628e95a76ba in log_tree_commit log-tree.c:1117
#11 0x5628e922bed5 in cmd_log_walk_no_free builtin/log.c:508
#12 0x5628e922c35b in cmd_log_walk builtin/log.c:549
#13 0x5628e922f1a2 in cmd_log builtin/log.c:883
#14 0x5628e9106993 in run_builtin git.c:466
#15 0x5628e9107397 in handle_builtin git.c:721
#16 0x5628e9107b07 in run_argv git.c:788
#17 0x5628e91088a7 in cmd_main git.c:923
#18 0x5628e939d682 in main common-main.c:57
#19 0x7f2127c3c28f (/usr/lib/libc.so.6+0x2328f)
#20 0x7f2127c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#21 0x5628e91020e4 in _start ../sysdeps/x86_64/start.S:115
SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
Shadow bytes around the buggy address:
0x0fe458c572a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0fe458c572b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0fe458c572c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0fe458c572d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0fe458c572e0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x0fe458c572f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa[fa]
0x0fe458c57300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0fe458c57310: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0fe458c57320: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0fe458c57330: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0x0fe458c57340: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==8340==ABORTING
The pretty format can also be used in `git archive` operations via the
`export-subst` attribute. So this is what in our opinion makes this a
critical issue in the context of Git forges which allow to download an
archive of user supplied Git repositories.
Fix this vulnerability by using `size_t` instead of `int` to track the
string lengths. Add tests which detect this vulnerability when Git is
compiled with the address sanitizer.
Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Original-patch-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Modified-by: Taylor Blau <me@ttalorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pull bot
pushed a commit
that referenced
this pull request
Jan 17, 2023
With the `%>>(<N>)` pretty formatter, you can ask git-log(1) et al to
steal spaces. To do so we need to look ahead of the next token to see
whether there are spaces there. This loop takes into account ANSI
sequences that end with an `m`, and if it finds any it will skip them
until it finds the first space. While doing so it does not take into
account the buffer's limits though and easily does an out-of-bounds
read.
Add a test that hits this behaviour. While we don't have an easy way to
verify this, the test causes the following failure when run with
`SANITIZE=address`:
==37941==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000000baf at pc 0x55ba6f88e0d0 bp 0x7ffc84c50d20 sp 0x7ffc84c50d10
READ of size 1 at 0x603000000baf thread T0
#0 0x55ba6f88e0cf in format_and_pad_commit pretty.c:1712
#1 0x55ba6f88e7b4 in format_commit_item pretty.c:1801
#2 0x55ba6f9b1ae4 in strbuf_expand strbuf.c:429
#3 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869
#4 0x55ba6f890ccf in pretty_print_commit pretty.c:2161
#5 0x55ba6f7884c8 in show_log log-tree.c:781
#6 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117
#7 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508
#8 0x55ba6f41035b in cmd_log_walk builtin/log.c:549
#9 0x55ba6f4131a2 in cmd_log builtin/log.c:883
#10 0x55ba6f2ea993 in run_builtin git.c:466
#11 0x55ba6f2eb397 in handle_builtin git.c:721
#12 0x55ba6f2ebb07 in run_argv git.c:788
#13 0x55ba6f2ec8a7 in cmd_main git.c:923
#14 0x55ba6f581682 in main common-main.c:57
#15 0x7f2d08c3c28f (/usr/lib/libc.so.6+0x2328f)
#16 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#17 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115
0x603000000baf is located 1 bytes to the left of 24-byte region [0x603000000bb0,0x603000000bc8)
allocated by thread T0 here:
#0 0x7f2d08ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
#1 0x55ba6fa5b494 in xrealloc wrapper.c:136
#2 0x55ba6f9aefdc in strbuf_grow strbuf.c:99
#3 0x55ba6f9b0a06 in strbuf_add strbuf.c:298
#4 0x55ba6f9b1a25 in strbuf_expand strbuf.c:418
#5 0x55ba6f88f020 in repo_format_commit_message pretty.c:1869
#6 0x55ba6f890ccf in pretty_print_commit pretty.c:2161
#7 0x55ba6f7884c8 in show_log log-tree.c:781
#8 0x55ba6f78b6ba in log_tree_commit log-tree.c:1117
#9 0x55ba6f40fed5 in cmd_log_walk_no_free builtin/log.c:508
#10 0x55ba6f41035b in cmd_log_walk builtin/log.c:549
#11 0x55ba6f4131a2 in cmd_log builtin/log.c:883
#12 0x55ba6f2ea993 in run_builtin git.c:466
#13 0x55ba6f2eb397 in handle_builtin git.c:721
#14 0x55ba6f2ebb07 in run_argv git.c:788
#15 0x55ba6f2ec8a7 in cmd_main git.c:923
#16 0x55ba6f581682 in main common-main.c:57
#17 0x7f2d08c3c28f (/usr/lib/libc.so.6+0x2328f)
#18 0x7f2d08c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#19 0x55ba6f2e60e4 in _start ../sysdeps/x86_64/start.S:115
SUMMARY: AddressSanitizer: heap-buffer-overflow pretty.c:1712 in format_and_pad_commit
Shadow bytes around the buggy address:
0x0c067fff8120: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
0x0c067fff8130: fd fd fa fa fd fd fd fd fa fa fd fd fd fa fa fa
0x0c067fff8140: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
0x0c067fff8150: fa fa fd fd fd fd fa fa 00 00 00 fa fa fa fd fd
0x0c067fff8160: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
=>0x0c067fff8170: fd fd fd fa fa[fa]00 00 00 fa fa fa 00 00 00 fa
0x0c067fff8180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff8190: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff81a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff81b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c067fff81c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Luckily enough, this would only cause us to copy the out-of-bounds data
into the formatted commit in case we really had an ANSI sequence
preceding our buffer. So this bug likely has no security consequences.
Fix it regardless by not traversing past the buffer's start.
Reported-by: Patrick Steinhardt <ps@pks.im>
Reported-by: Eric Sesterhenn <eric.sesterhenn@x41-dsec.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
pull bot
pushed a commit
that referenced
this pull request
Jan 15, 2026
It is possible to hit a memory leak when reading data from a submodule
via git-grep(1):
Direct leak of 192 byte(s) in 1 object(s) allocated from:
#0 0x55555562e726 in calloc (git+0xda726)
#1 0x555555964734 in xcalloc ../wrapper.c:154:8
#2 0x555555835136 in load_multi_pack_index_one ../midx.c:135:2
#3 0x555555834fd6 in load_multi_pack_index ../midx.c:382:6
#4 0x5555558365b6 in prepare_multi_pack_index_one ../midx.c:716:17
#5 0x55555586c605 in packfile_store_prepare ../packfile.c:1103:3
#6 0x55555586c90c in packfile_store_reprepare ../packfile.c:1118:2
#7 0x5555558546b3 in odb_reprepare ../odb.c:1106:2
#8 0x5555558539e4 in do_oid_object_info_extended ../odb.c:715:4
#9 0x5555558533d1 in odb_read_object_info_extended ../odb.c:862:8
#10 0x5555558540bd in odb_read_object ../odb.c:920:6
#11 0x55555580a330 in grep_source_load_oid ../grep.c:1934:12
#12 0x55555580a13a in grep_source_load ../grep.c:1986:10
#13 0x555555809103 in grep_source_is_binary ../grep.c:2014:7
#14 0x555555807574 in grep_source_1 ../grep.c:1625:8
#15 0x555555807322 in grep_source ../grep.c:1837:10
#16 0x5555556a5c58 in run ../builtin/grep.c:208:10
#17 0x55555562bb42 in void* ThreadStartFunc<false>(void*) lsan_interceptors.cpp.o
#18 0x7ffff7a9a979 in start_thread (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x9a979) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
#19 0x7ffff7b22d2b in __GI___clone3 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x122d2b) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
The root caues of this leak is the way we set up and release the
submodule:
1. We use `repo_submodule_init()` to initialize a new repository. This
repository is stored in `repos_to_free`.
2. We now read data from the submodule repository.
3. We then call `repo_clear()` on the submodule repositories.
4. `repo_clear()` calls `odb_free()`.
5. `odb_free()` calls `odb_free_sources()` followed by `odb_close()`.
The issue here is the 5th step: we call `odb_free_sources()` _before_ we
call `odb_close()`. But `odb_free_sources()` already frees all sources,
so the logic that closes them in `odb_close()` now becomes a no-op. As a
consequence, we never explicitly close sources at all.
Fix the leak by closing the store before we free the sources.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot]
Can you help keep this open source service alive? 💖 Please sponsor : )