Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mi_rezalloc() zero initialization is unsafe #63

Closed
kickunderscore opened this issue Jun 29, 2019 · 5 comments
Closed

mi_rezalloc() zero initialization is unsafe #63

kickunderscore opened this issue Jun 29, 2019 · 5 comments

Comments

@kickunderscore
Copy link

kickunderscore commented Jun 29, 2019

In _mi_realloc_zero() and _mi_heap_malloc_zero() only the requested size and not the possibly larger allocated size is initialized to zero.

This may lead to uninitialized 'new' memory from the applications point of view in a following call to _mi_realloc_zero(), where the 'fits' test using mi_usable_size(p) and the memcpy(newp, p, ...) relies on a defined state of the previously allocated and not only the previously requested memory range.

Also take into account, that using mi_rezalloc() on a pointer from a call to mi_malloc() is legal but would lead to uninitialized memory in the range from the requested size to the allocated size too. It seems to me that the rezalloc feature requires to store the requested size or at least to zero out the implicit allocated range form requested size to allocated size in all allocations.

I wonder if this is why posix doesn't specify rezalloc().


On my windows machine this code sample can reproduce the problem:

int main()
{
  //  mi_rezalloc() is not safe!

  //  test precondition: first allocation on page!

  //  force allocating the last block in the page next, so it will be 'recycled' immediately...
  for (int idx = 0; idx < 127; ++idx)
    ::mi_free(::mi_malloc(128));

  //  force known state of 'uninitialized' memory (by mi_malloc debug initialization or explicit memset)
  auto uninitialized = static_cast< unsigned char* >(::mi_malloc(128));
  assert(mi_usable_size(uninitialized) == 128);
  ::memset(uninitialized, 0xD0, 128);
  ::mi_free(uninitialized);

  //  this initialized allocation recycles the uninitialized pointer (if not, adapt forcing loop above or use a different size class...)
  auto initialized = static_cast< unsigned char* >(::mi_rezalloc(nullptr, 121));
  assert(mi_usable_size(initialized) == 128);
  assert(uninitialized == initialized);
  assert(initialized[  0] ==    0); // access in requested range; mimalloc has initialized this!
  assert(initialized[120] ==    0); // access in requested range; mimalloc has initialized this!
  assert(initialized[121] == 0xd0); // access in verified usable range; mimalloc has not initialized this!
  initialized[  0] = 'A';
  initialized[120] = 'Z';

  //  grow with initialization! remember what the api documentation says:
  //  > If the newsize is larger than the original allocated size of p, the extra bytes are initialized to zero.
  initialized = static_cast< unsigned char* >(::mi_rezalloc(initialized, 122));
  assert(mi_usable_size(initialized) == 128);
  assert(uninitialized == initialized);
  assert(initialized[  0] ==  'A'); // access in requested old range; mimalloc has not re-initialized this!
  assert(initialized[120] ==  'Z'); // access in requested old range; mimalloc has not re-initialized this!
  assert(initialized[121] == 0x00); // access in requested initialized new memory; mimalloc has not initialized this!
}
@daanx
Copy link
Collaborator

daanx commented Jul 3, 2019

Ah this is a bit tricky: I now designed the zero interface, like mi_calloc , mi_zalloc , to only zero out the requested size, and for mi_rezalloc to only zero out the any additional requested size (if reallocating to a larger size).

A change we could make is to zero out always up to the useable size instead of requested size. I think that would be a good idea.

_It seems to me that the rezalloc feature requires to store the requested size or at least to zero out the implicit allocated range form requested size to allocated size in all allocations. _

It does do that already, zero out any additional space if reallocating to a larger area. Of course, the existing data should not zero out.

@kickunderscore
Copy link
Author

It's tricky indeed! Unfortunately, a change limited to the 'z' and 'c' allocation variants will not solve the problem. An example:

Initial allocation of 6 bytes by mi_malloc() (with debug initialization simulating an undefined state):
*p=xD0D0D0D0D0D0D0D0D0D0

The application copies its C string of length 5 plus a null byte as termination, i.e. it defines the state of the requested memory completely:
*p=x48656C6C6F00D0D0 ("Hello")

Another part of the application wants to append a sixth character to the C string and ensure termination by using mi_rezalloc(p, 7). The current implementation of _mi_realloc_zero() recognizes that the assigned block is large enough and delivers it unchanged. The application appends another character but no termination, since it used the initialization of the additional memory with zero. We end up with an unterminated C string with funny effects up to segmentation fault.
*p=x48656C6C6F21D0D0 ("Hello!ÐÐ...")

Can the implementation of mi_rezalloc() prevent this?
I don't see any way to do this without mi_rezalloc() knowing the size of the original allocation or initializing the extra memory available in all allocations (not just the 'z' and 'c' variants) to zero.

Note that even if the old block is copied to a larger new block, the undefined memory area is copied as well, since only the block size is known. The extension of the zero initialization by one word in the area of the old block is overwritten, so that the extended initialization remains ineffective! (see master commit 7b4f359; alloc.c; lines 336 and 339)


On possible solution without performance losses if the feature is not used would be the requirement to pass the originally requested size resp. count in mi_rezalloc() resp. mi_recalloc(). The application needs to know these values anyway in order to use the memory correctly, so why not make this missing information available? Aren't there even suggestions to call mi_free() itself with the requested size to allow better debug/consistency checks?

@daanx
Copy link
Collaborator

daanx commented Jul 4, 2019

Ah, nice example. The way it should be I think is that (a) calloc, zalloc always zero out all the useable memory. rezalloc and recalloc are only well-defined for a previous calloc,zalloc,rezalloc,recalloc, but not if the previous memory was alloc'd with malloc. That would fix this I think. Still, we cannot enforce this and perhaps we should do away with the rezalloc variantcs...

@kickunderscore
Copy link
Author

> ...and perhaps we should do away with the rezalloc variantcs...
👍

daanx added a commit that referenced this issue Jul 9, 2019
@daanx
Copy link
Collaborator

daanx commented Jul 9, 2019

removed rezalloc and variants in the latest dev branch. kept mi_recalloc as a Unix variant for convenience.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants