-
Notifications
You must be signed in to change notification settings - Fork 134
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
scalar: avoid segfault in reconfigure --all #1724
scalar: avoid segfault in reconfigure --all #1724
Conversation
/submit |
Submitted as pull.1724.git.1714496333004.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
This branch is now known as |
This patch series was integrated into seen via git@de93395. |
This patch series was integrated into seen via git@85f6920. |
There was a status update in the "New Topics" section about the branch Scalar fix. Comments? source: <pull.1724.git.1714496333004.gitgitgadget@gmail.com> |
This comment was marked as off-topic.
This comment was marked as off-topic.
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Tue, Apr 30, 2024 at 04:58:52PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <stolee@gmail.com>
>
> During the latest v2.45.0 update, 'scalar reconfigure --all' started to
> segfault on my machine. Breaking it down via the debugger, it was
> faulting on a NULL reference to the_hash_algo, which is a macro pointing
> to the_repository->hash_algo.
>
> This NULL reference appears to be due to the way the loop is abusing the
> the_repository pointer, pointing it to a local repository struct after
> discovering that the current directory is a valid Git repository. This
> repo-swapping bit was in the original implementation from 4582676075
> (scalar: teach 'reconfigure' to optionally handle all registered
> enlistments, 2021-12-03), but only recently started segfaulting while
> trying to parse the HEAD reference. This also only happens on the
> _second_ repository in the list, so does not reproduce if there is only
> one registered repo.
Interesting. This also has some overlap with my patch series that aims
to drop the default-SHA1 fallback that we have in place for
`the_repository` [1].
> Add a test to t9210-scalar.sh to test 'scalar reconfigure --all' with
> multiple registered repos, as a precaution. Unfortunately, I was unable
> to reproduce the segfault using this test, so there is some coverage
> left to be desired. What exactly causes my setup to hit this bug but not
> this test structure? Unclear.
One way to trigger _a_ BUG is to use an "includeIf.onbranch:foobar.path"
config. This will cause Git to try and look up the "HEAD" reference, but
because we have a partially-configured repository, only, that will crash
with:
BUG: refs.c:2123: reference backend is unknown
Whether that bug is the one that you have seen I cannot tell. It at
least does not sound like it.
test_expect_success 'scalar reconfigure --all with includeIf.onbranch' '
repos="two three four" &&
for num in $repos
do
git init $num/src &&
scalar register $num/src &&
git -C $num/src config includeif."onbranch:foo".path something &&
git -C $num/src config core.preloadIndex false || return 1
done &&
scalar reconfigure --all &&
for num in $repos
do
test true = "$(git -C $num/src config core.preloadIndex)" || return 1
done
'
Another issue, which is likely the one you report here, is if you have a
repository with detached "HEAD". In that case we use `get_oid_hex()` to
parse "HEAD", which will implicitly use `the_hash_algo`. But because you
unset it in the second round this will fail with a segfault when you try
to access it:
./test-lib.sh: line 1069: 583995 Segmentation fault (core dumped) scalar reconfigure --all
This is the following testcase:
test_expect_success 'scalar reconfigure --all with detached HEADs' '
repos="two three four" &&
for num in $repos
do
rm -rf $num/src &&
git init $num/src &&
scalar register $num/src &&
git -C $num/src config core.preloadIndex false &&
test_commit -C $num/src initial &&
git -C $num/src switch --detach HEAD || return 1
done &&
scalar reconfigure --all &&
for num in $repos
do
test true = "$(git -C $num/src config core.preloadIndex)" || return 1
done
'
This issue should be fixed by my patch series in [1] because we start to
use `get_oid_hex_any()` to parse detached HEADs.
Anyway, your fix is indeed effective because with `repo_init()` we now
properly configure the repository.
[1]: https://lore.kernel.org/git/cover.1714371422.git.ps@pks.im/
Patrick |
User |
This patch series was integrated into seen via git@3a6be86. |
This patch series was integrated into seen via git@7885ec7. |
There was a status update in the "Cooking" section about the branch Scalar fix. Comments? source: <pull.1724.git.1714496333004.gitgitgadget@gmail.com> |
6f0f5fa
to
5be703b
Compare
On the Git mailing list, Derrick Stolee wrote (reply to this): On 5/2/24 2:46 AM, Patrick Steinhardt wrote:
> On Tue, Apr 30, 2024 at 04:58:52PM +0000, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <stolee@gmail.com>
>>
>> During the latest v2.45.0 update, 'scalar reconfigure --all' started to
>> segfault on my machine. Breaking it down via the debugger, it was
>> faulting on a NULL reference to the_hash_algo, which is a macro pointing
>> to the_repository->hash_algo.
>>
>> This NULL reference appears to be due to the way the loop is abusing the
>> the_repository pointer, pointing it to a local repository struct after
>> discovering that the current directory is a valid Git repository. This
>> repo-swapping bit was in the original implementation from 4582676075
>> (scalar: teach 'reconfigure' to optionally handle all registered
>> enlistments, 2021-12-03), but only recently started segfaulting while
>> trying to parse the HEAD reference. This also only happens on the
>> _second_ repository in the list, so does not reproduce if there is only
>> one registered repo.
> Interesting. This also has some overlap with my patch series that aims
> to drop the default-SHA1 fallback that we have in place for
> `the_repository` [1].
Thanks for this pointer. It indeed will help.
>> Add a test to t9210-scalar.sh to test 'scalar reconfigure --all' with
>> multiple registered repos, as a precaution. Unfortunately, I was unable
>> to reproduce the segfault using this test, so there is some coverage
>> left to be desired. What exactly causes my setup to hit this bug but not
>> this test structure? Unclear.
> One way to trigger _a_ BUG is to use an "includeIf.onbranch:foobar.path"
> config. This will cause Git to try and look up the "HEAD" reference, but
> because we have a partially-configured repository, only, that will crash
> with:
>
> BUG: refs.c:2123: reference backend is unknown
This is a good extra find. After your explanation for the second
test, I'm confident that I was hitting the detached HEAD case on
my machine.
I will shamelessly steal your tests in my v2.
> This issue should be fixed by my patch series in [1] because we start to
> use `get_oid_hex_any()` to parse detached HEADs.
>
> Anyway, your fix is indeed effective because with `repo_init()` we now
> properly configure the repository.
I appreciate that your series will fix this in a ref-focused way. I think
this change could prevent other bad interactions with the_repository in
the future.
Thanks,
-Stolee
|
/submit |
Submitted as pull.1724.v2.git.1714874298859.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Sun, May 05, 2024 at 01:58:18AM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <stolee@gmail.com>
>
> During the latest v2.45.0 update, 'scalar reconfigure --all' started to
> segfault on my machine. Breaking it down via the debugger, it was
> faulting on a NULL reference to the_hash_algo, which is a macro pointing
> to the_repository->hash_algo.
>
> This NULL reference appears to be due to the way the loop is abusing the
> the_repository pointer, pointing it to a local repository struct after
> discovering that the current directory is a valid Git repository. This
> repo-swapping bit was in the original implementation from 4582676075
> (scalar: teach 'reconfigure' to optionally handle all registered
> enlistments, 2021-12-03), but only recently started segfaulting while
> trying to parse the HEAD reference. This also only happens on the
> _second_ repository in the list, so does not reproduce if there is only
> one registered repo.
I think this explanation could use an update now that we have figured
out the root cause likely being a detached HEAD, where `get_oid_hex()`
will then access a NULL pointer.
> My first inclination was to try to refactor cmd_reconfigure() to execute
> 'git for-each-repo' instead of this loop. In addition to the difficulty
> of executing 'scalar reconfigure' within 'git for-each-repo', it would
> be difficult to perform the clean-up logic for non-existent repos if we
> relied on that child process.
>
> Instead, I chose to move the temporary repo to be within the loop and
> reinstate the_repository to its old value after we are done performing
> logic on the current array item.
>
> Add tests to t9210-scalar.sh to test 'scalar reconfigure --all' with
> multiple registered repos. There are two different ways that the old
> use of the_repository could trigger bugs. These issues are being solved
> independently to be more careful about the_repository being
> uninitialized, but the change in this patch around the use of
> the_repository is still a good safety precaution.
>
> Co-authored-by: Patrick Steinhardt <ps@pks.im>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> Signed-off-by: Derrick Stolee <stolee@gmail.com>
> ---
> scalar: avoid segfault in reconfigure --all
>
> Update 'scalar reconfigure --all' to be more careful with the_repository
> pointer, avoiding sudden halts in some scenarios.
>
> ------------------------------------------------------------------------
>
> I noticed this while validating the v2.45.0 release (specifically the
> microsoft/git fork, but this applies to the core project).
>
> Thanks, Patrick, for digging in and finding the critical reasons why
> this issue can happen. I've included Patrick's test cases and given him
> co-authorship. I forged his sign-off, so could you please ACK that
> sign-off, Patrick?
>
> -Stolee
Other than the above nit regarding the root cause analysis this patch
looks good to me, and I'm fine with the forged SOB. Thanks!
Patrick |
This patch series was integrated into seen via git@2754234. |
There was a status update in the "Cooking" section about the branch Scalar fix. Expecting a final (and hopefully small) reroll. cf. <Zjhufiq2DmBlVpIx@tanuki> source: <pull.1724.v2.git.1714874298859.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@7eebfa6. |
This patch series was integrated into seen via git@21e3564. |
During the latest v2.45.0 update, 'scalar reconfigure --all' started to segfault on my machine. Breaking it down via the debugger, it was faulting on a NULL reference to the_hash_algo, which is a macro pointing to the_repository->hash_algo. In my case, this is due to one of my repositories having a detached HEAD, which requires get_oid_hex() to parse that the HEAD reference is valid. Another way to cause a failure is to use the "includeIf.onbranch" config key, which will lead to a BUG() statement. My first inclination was to try to refactor cmd_reconfigure() to execute 'git for-each-repo' instead of this loop. In addition to the difficulty of executing 'scalar reconfigure' within 'git for-each-repo', it would be difficult to perform the clean-up logic for non-existent repos if we relied on that child process. Instead, I chose to move the temporary repo to be within the loop and reinstate the_repository to its old value after we are done performing logic on the current array item. Add tests to t9210-scalar.sh to test 'scalar reconfigure --all' with multiple registered repos. There are two different ways that the old use of the_repository could trigger bugs. These issues are being solved independently to be more careful about the_repository being uninitialized, but the change in this patch around the use of the_repository is still a good safety precaution. Co-authored-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Derrick Stolee <stolee@gmail.com>
5be703b
to
00d1019
Compare
/submit |
Submitted as pull.1724.v3.git.1715126749391.gitgitgadget@gmail.com To fetch this version into
To fetch this version to local tag
|
This patch series was integrated into seen via git@8ad0227. |
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Wed, May 08, 2024 at 12:05:49AM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <stolee@gmail.com>
>
> During the latest v2.45.0 update, 'scalar reconfigure --all' started to
> segfault on my machine. Breaking it down via the debugger, it was
> faulting on a NULL reference to the_hash_algo, which is a macro pointing
> to the_repository->hash_algo.
>
> In my case, this is due to one of my repositories having a detached HEAD,
> which requires get_oid_hex() to parse that the HEAD reference is valid.
> Another way to cause a failure is to use the "includeIf.onbranch" config
> key, which will lead to a BUG() statement.
>
> My first inclination was to try to refactor cmd_reconfigure() to execute
> 'git for-each-repo' instead of this loop. In addition to the difficulty
> of executing 'scalar reconfigure' within 'git for-each-repo', it would
> be difficult to perform the clean-up logic for non-existent repos if we
> relied on that child process.
>
> Instead, I chose to move the temporary repo to be within the loop and
> reinstate the_repository to its old value after we are done performing
> logic on the current array item.
>
> Add tests to t9210-scalar.sh to test 'scalar reconfigure --all' with
> multiple registered repos. There are two different ways that the old
> use of the_repository could trigger bugs. These issues are being solved
> independently to be more careful about the_repository being
> uninitialized, but the change in this patch around the use of
> the_repository is still a good safety precaution.
>
> Co-authored-by: Patrick Steinhardt <ps@pks.im>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> Signed-off-by: Derrick Stolee <stolee@gmail.com>
Thanks, this version looks good to me!
Patrick |
This patch series was integrated into seen via git@b368d89. |
This patch series was integrated into next via git@eca398f. |
There was a status update in the "Cooking" section about the branch Scalar fix. Will merge to 'master'. source: <pull.1724.v3.git.1715126749391.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@7b44f63. |
There was a status update in the "Cooking" section about the branch Scalar fix. Will merge to 'master'. source: <pull.1724.v3.git.1715126749391.gitgitgadget@gmail.com> |
This patch series was integrated into seen via git@1838ac7. |
This patch series was integrated into seen via git@260a891. |
This patch series was integrated into seen via git@0cff342. |
There was a status update in the "Cooking" section about the branch Scalar fix. Will merge to 'master'. source: <pull.1724.v3.git.1715126749391.gitgitgadget@gmail.com> |
See gitgitgadget#1724 for the upstream fix. This includes an extra detail around the context of that change that will require some reaction in microsoft/git. Please see the `fixup!` and merge commit messages for details. This merges in the upstream commit that is going into `master` soon.
This patch series was integrated into seen via git@1e00d22. |
This patch series was integrated into master via git@1e00d22. |
This patch series was integrated into next via git@1e00d22. |
Closed via 1e00d22. |
Update 'scalar reconfigure --all' to be more careful with the_repository pointer, avoiding sudden halts in some scenarios.
I noticed this while validating the v2.45.0 release (specifically the microsoft/git fork, but this applies to the core project).
Thanks, Patrick, for digging in and finding the critical reasons why this issue can happen. Patrick ACKed the sign-off in v2.
v3 includes an update to the commit message. Sorry that I missed this paragraph when updating for v2.
-Stolee
cc: Patrick Steinhardt ps@pks.im