Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dir: fix malloc of root untracked_cache_dir #884

Conversation

jeffhostetler
Copy link

@jeffhostetler jeffhostetler commented Feb 24, 2021

cc: Taylor Blau me@ttaylorr.com
cc: Jeff King peff@peff.net
cc: Jeff Hostetler git@jeffhostetler.com

dir.c Outdated
@@ -2731,7 +2731,7 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
}

if (!dir->untracked->root) {
const int len = sizeof(*dir->untracked->root);
const int len = st_add(sizeof(*dir->untracked->root), 1);
dir->untracked->root = xmalloc(len);
memset(dir->untracked->root, 0, len);
Copy link

@peff peff Feb 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the intent here is that the name flex-array would be the empty string (your "account for the flex array" confused me for a moment; I think the issue is that the array is supposed to be a NUL-terminated string, so we are really accounting for that).

So I think your "extend the memset" in the commit message is also "extend the malloc". We need a place to put that NUL byte (and what the patch is doing here is correct).

I'm likewise curious that this would show up on Windows and not elsewhere. It would be a non-problem on any system that defines FLEX_ARRAY to 1. But most compilers (especially modern Unix ones like gcc or clang) use the empty string, so I'd expect them to be vulnerable to the bug.

I do think this whole thing could be written as:

FLEX_ALLOC_STR(dir->untracked->root, name, "");

That IMHO shows the intent more clearly, as well as getting rid of the manual computation (which is the reason I added those macros in the first place; I didn't catch this in my audits because it erroneously wasn't doing any computation in the first place!). It also avoids the use of int for storing a computed length (it's OK here because we know the name field is the empty string, but it would be likely to come up in a grep-based audit).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I was hoping there was a better way to write that. (I must admit that after several hours of wandering aimlessly thru the debugger and finally finding the problem, I didn't think to look for a FLEX_ macro to do it better.)

@dscho
Copy link
Member

dscho commented Feb 24, 2021

Before /submiting, please edit the PR description (it can be empty) because it'll become the cover letter.

Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir`
for the root directory.  Get rid of unsafe code that might fail to
initialize the `name` field (if FLEX_ARRAY is not 1).  This will
make it clear that we intend to have a structure with an empty
string following it.

A problem was observed on Windows where the length of the memset() was
too short, so the first byte of the name field was not zeroed.  This
resulted in the name field having garbage from a previous use of that
area of memory.

The record for the root directory was then written to the untracked-cache
extension in the index.  This garbage would then be visible to future
commands when they reloaded the untracked-cache extension.

Since the directory record for the root directory had garbage in the
`name` field, the `t/helper/test-tool dump-untracked-cache` tool
printed this garbage as the path prefix (rather than '/') for each
directory in the untracked cache as it recursed.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
@jeffhostetler
Copy link
Author

/submit

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 24, 2021

Submitted as pull.884.git.1614177117508.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git pr-884/jeffhostetler/untracked-cache-corruption-v1

To fetch this version to local tag pr-884/jeffhostetler/untracked-cache-corruption-v1:

git fetch --no-tags https://github.com/gitgitgadget/git tag pr-884/jeffhostetler/untracked-cache-corruption-v1

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 24, 2021

On the Git mailing list, Taylor Blau wrote (reply to this):

On Wed, Feb 24, 2021 at 02:31:57PM +0000, Jeff Hostetler via GitGitGadget wrote:
> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir`
> for the root directory.  Get rid of unsafe code that might fail to
> initialize the `name` field (if FLEX_ARRAY is not 1).  This will
> make it clear that we intend to have a structure with an empty
> string following it.

I had to read this patch message a few times over to make sure that I
wasn't missing something in the patch itself, but it looks all good to
me.

  Reviewed-by: Taylor Blau <me@ttaylorr.com>

Thanks,
Taylor

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 24, 2021

User Taylor Blau <me@ttaylorr.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 24, 2021

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Jeff Hostetler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Jeff Hostetler <jeffhost@microsoft.com>
>
> Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir`
> for the root directory.  Get rid of unsafe code that might fail to
> initialize the `name` field (if FLEX_ARRAY is not 1).  This will
> make it clear that we intend to have a structure with an empty
> string following it.
>
> A problem was observed on Windows where the length of the memset() was
> too short, so the first byte of the name field was not zeroed.  This
> resulted in the name field having garbage from a previous use of that
> area of memory.
>
> The record for the root directory was then written to the untracked-cache
> extension in the index.  This garbage would then be visible to future
> commands when they reloaded the untracked-cache extension.
>
> Since the directory record for the root directory had garbage in the
> `name` field, the `t/helper/test-tool dump-untracked-cache` tool
> printed this garbage as the path prefix (rather than '/') for each
> directory in the untracked cache as it recursed.
>
> Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
> ---
>     dir: fix malloc of root untracked_cache_dir

Nicely spotted.

The problematic code was introduced in 2015, a year before these
FLEX_ALLOC_*() helpers were introduced.  The result is of course
correct and much easier to read, as the necessary "ask for a region
of calloc'ed memory with an additional byte for terminating NUL
beyond strlen()" is hidden in the helper.

Will queue; thanks.

> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-884%2Fjeffhostetler%2Funtracked-cache-corruption-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-884/jeffhostetler/untracked-cache-corruption-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/884
>
>  dir.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/dir.c b/dir.c
> index d153a63bbd14..fd8aa7c40faa 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -2730,11 +2730,8 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
>  		return NULL;
>  	}
>  
> -	if (!dir->untracked->root) {
> -		const int len = sizeof(*dir->untracked->root);
> -		dir->untracked->root = xmalloc(len);
> -		memset(dir->untracked->root, 0, len);
> -	}
> +	if (!dir->untracked->root)
> +		FLEX_ALLOC_STR(dir->untracked->root, name, "");
>  
>  	/* Validate $GIT_DIR/info/exclude and core.excludesfile */
>  	root = dir->untracked->root;
>
> base-commit: 966e671106b2fd38301e7c344c754fd118d0bb07

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 24, 2021

On the Git mailing list, Jeff King wrote (reply to this):

On Wed, Feb 24, 2021 at 12:08:42PM -0800, Junio C Hamano wrote:

> > Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir`
> > for the root directory.  Get rid of unsafe code that might fail to
> > initialize the `name` field (if FLEX_ARRAY is not 1).  This will
> > make it clear that we intend to have a structure with an empty
> > string following it.
> [...]
> The problematic code was introduced in 2015, a year before these
> FLEX_ALLOC_*() helpers were introduced.  The result is of course
> correct and much easier to read, as the necessary "ask for a region
> of calloc'ed memory with an additional byte for terminating NUL
> beyond strlen()" is hidden in the helper.

When I added the FLEX_ALLOC_* helpers, I audited for existing callers to
convert. But I did so by looking for places where we were doing manual
size computations. The bug here was that it was not doing any
computation at all (when it need to be doing "+1"). So that's my guess
why it got overlooked (which is not super important, but may give a hint
about how to look for similar bugs).

-Peff

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 24, 2021

User Jeff King <peff@peff.net> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 24, 2021

On the Git mailing list, Jeff Hostetler wrote (reply to this):



On 2/24/21 4:05 PM, Jeff King wrote:
> On Wed, Feb 24, 2021 at 12:08:42PM -0800, Junio C Hamano wrote:
> 
>>> Use FLEX_ALLOC_STR() to allocate the `struct untracked_cache_dir`
>>> for the root directory.  Get rid of unsafe code that might fail to
>>> initialize the `name` field (if FLEX_ARRAY is not 1).  This will
>>> make it clear that we intend to have a structure with an empty
>>> string following it.
>> [...]
>> The problematic code was introduced in 2015, a year before these
>> FLEX_ALLOC_*() helpers were introduced.  The result is of course
>> correct and much easier to read, as the necessary "ask for a region
>> of calloc'ed memory with an additional byte for terminating NUL
>> beyond strlen()" is hidden in the helper.
> 
> When I added the FLEX_ALLOC_* helpers, I audited for existing callers to
> convert. But I did so by looking for places where we were doing manual
> size computations. The bug here was that it was not doing any
> computation at all (when it need to be doing "+1"). So that's my guess
> why it got overlooked (which is not super important, but may give a hint
> about how to look for similar bugs).


There's another call to xmalloc() at [1] that does the st_add3()
thing that I didn't change.  It was correctly computing the size
so I didn't want to change it for no reason.  That and the 2 memcpy()s
made it feel like we should have a different FLEX_ macro for this
case -- more of a FLEX_DUP... or something.  But I didn't want to
distract from my bug fix here.

[1] https://github.com/git/git/blob/master/dir.c#L3290

Jeff

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 24, 2021

User Jeff Hostetler <git@jeffhostetler.com> has been added to the cc: list.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 24, 2021

On the Git mailing list, Jeff King wrote (reply to this):

On Wed, Feb 24, 2021 at 04:15:11PM -0500, Jeff Hostetler wrote:

> > When I added the FLEX_ALLOC_* helpers, I audited for existing callers to
> > convert. But I did so by looking for places where we were doing manual
> > size computations. The bug here was that it was not doing any
> > computation at all (when it need to be doing "+1"). So that's my guess
> > why it got overlooked (which is not super important, but may give a hint
> > about how to look for similar bugs).
> 
> There's another call to xmalloc() at [1] that does the st_add3()
> thing that I didn't change.  It was correctly computing the size
> so I didn't want to change it for no reason.  That and the 2 memcpy()s
> made it feel like we should have a different FLEX_ macro for this
> case -- more of a FLEX_DUP... or something.  But I didn't want to
> distract from my bug fix here.

Thanks for pointing it out. Even if we change it, I agree it's better to
keep it out of your existing bugfix patch.

But I do think that spot gets weird. We are copying all of the elements
of the struct _except_ the name field, which comes from somewhere else.
So it's tempting to do:

diff --git a/dir.c b/dir.c
index d153a63bbd..ab16db3f76 100644
--- a/dir.c
+++ b/dir.c
@@ -3287,9 +3287,9 @@ static int read_one_dir(struct untracked_cache_dir **untracked_,
 	if (!eos || eos == end)
 		return -1;
 
-	*untracked_ = untracked = xmalloc(st_add3(sizeof(*untracked), eos - data, 1));
+	FLEX_ALLOC_MEM(untracked, name, data, eos - data);
 	memcpy(untracked, &ud, sizeof(ud));
-	memcpy(untracked->name, data, eos - data + 1);
+	*untracked_ = untracked;
 	data = eos + 1;
 
 	for (i = 0; i < untracked->untracked_nr; i++) {

But that's wrong! The remaining memcpy using sizeof(ud) might or might
not touch the first byte of the name field, depending on struct padding
and the value of FLEX_ARRAY. And if it does, then we'd be overwriting
the early bytes of that field with whatever was in "ud" (which I guess
is uninitialized cruft, because that is a stack variable).

So it would have to be:

  memcpy(untracked, &ud, offsetof(struct untracked_cache_dir, name));

Quite subtle. Even with a comment, I think I prefer the existing code.
If we had a macro as you suggest for "allocate a flex struct, dup most of
the contents from this other struct, and then copy in the flex bytes
from this other spot", that would make it more readable. But I'm not
sure we'd find more than one spot to use such a macro. :)

-Peff

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 25, 2021

This branch is now known as jh/untracked-cache-fix.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 25, 2021

This patch series was integrated into seen via git@ed3bf13.

@gitgitgadget gitgitgadget bot added the seen label Feb 25, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Feb 26, 2021

This patch series was integrated into seen via git@23cc612.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 26, 2021

This patch series was integrated into seen via git@d975fa7.

@gitgitgadget
Copy link

gitgitgadget bot commented Feb 26, 2021

This patch series was integrated into next via git@79d1e40.

@gitgitgadget gitgitgadget bot added the next label Feb 26, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Feb 27, 2021

This patch series was integrated into seen via git@24f544c.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 1, 2021

This patch series was integrated into seen via git@9889cff.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 1, 2021

This patch series was integrated into next via git@9889cff.

@gitgitgadget
Copy link

gitgitgadget bot commented Mar 1, 2021

This patch series was integrated into master via git@9889cff.

@gitgitgadget gitgitgadget bot added the master label Mar 1, 2021
@gitgitgadget gitgitgadget bot closed this Mar 1, 2021
@gitgitgadget
Copy link

gitgitgadget bot commented Mar 1, 2021

Closed via 9889cff.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants