Unable to decompress - LZXpress Huffman code sizes are over-subscribed #6

Closed
scudette opened this issue Jul 22, 2020 · 8 comments

@scudette

Just linking this issue Velocidex/go-prefetch#4 which also seems to affect libscca.

I have reattached the problematic file.
DOWNLOADER.EXE-CAE991BA.pf.zip

I only tested with the pyscca version from pip (pip install libscca-python); maybe it is fixed in the latest version?

@joachimmetz
Member

@scudette thx for the report, could be some LZXpress Huffman edge-case

libfwnt_huffman_tree_build: code sizes are over-subscribed.
libfwnt_lzxpress_huffman_decompress_chunk: unable to build Huffman tree.
libfwnt_lzxpress_huffman_decompress: unable to decompress chunk.
libscca_compressed_block_read: unable to decompress compressed data.
libscca_io_handle_read_compressed_blocks: unable to read compressed block data.
libscca_file_open_read: unable to read compressed blocks.
libscca_file_open_file_io_handle: unable to read from file handle.
libscca_file_open: unable to open file: DOWNLOADER.EXE-CAE991BA.pf.
info_handle_open_input: unable to open input file.

@joachimmetz joachimmetz self-assigned this Jul 22, 2020
@joachimmetz joachimmetz changed the title Failed to open a prefetch file. Unable to decompress - LZXpress Huffman code sizes are over-subscribed Jul 22, 2020
@scudette
Author

Yes, it is definitely related to decompression. I used Francesco Picasso's script here https://gist.github.com/dfirfpi/113ff71274a97b489dfd to decompress using the API and it worked without problems. A binary diff shows a discrepancy starting at byte 79768 (after which the output is completely wrong), but I have no idea why.
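
The binary diff mentioned above can be reproduced with a few lines of Python. This is only a minimal sketch: `first_mismatch` is a hypothetical helper name, and the two inputs are assumed to be the decompressed output of the Windows API and of the library under test, already read into memory.

```python
def first_mismatch(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if both are equal."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One buffer is a prefix of the other: they differ where the shorter one ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


# Illustrative in-memory data, not taken from the prefetch file.
reference = b"\x00\x01\x02\x03\x04"
candidate = b"\x00\x01\x02\xff\x04"
print(first_mismatch(reference, candidate))
```

For real files, the two buffers would be read with `open(path, "rb").read()` before calling the helper.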

@joachimmetz
Member

I'll have a closer look later if time permits. I suspect the WINAPI decompression routines may be more error tolerant.

@joachimmetz
Member

For libfwnt, judging by the output of RtlDecompressBufferEx using https://github.com/libyal/assorted/blob/master/src/lzxpressdecompress.c, the corruption happens around 0x000137e0 in the decompressed data.

lzxpressdecompress -v -2 -o 8 -d 226264 -t test DOWNLOADER.EXE-CAE991BA.pf

libfwnt_lzxpress_huffman_decompress_chunk: compressed data offset       : 19492 (0x00004c24)
libfwnt_lzxpress_huffman_decompress_chunk: huffman symbol               : 0x0134
libfwnt_lzxpress_huffman_decompress_chunk: number of bits               : 22
libfwnt_lzxpress_huffman_decompress_chunk: compression offset           : 8
libfwnt_lzxpress_huffman_decompress_chunk: compression size             : 7
libfwnt_lzxpress_huffman_decompress_chunk: uncompressed data offset     : 79827 (0x000137d3)

libfwnt_lzxpress_huffman_decompress_chunk: compressed data offset       : 19492 (0x00004c24)
libfwnt_lzxpress_huffman_decompress_chunk: huffman symbol               : 0x0134
libfwnt_lzxpress_huffman_decompress_chunk: number of bits               : 18
libfwnt_lzxpress_huffman_decompress_chunk: compression offset           : 9
libfwnt_lzxpress_huffman_decompress_chunk: compression size             : 7
libfwnt_lzxpress_huffman_decompress_chunk: uncompressed data offset     : 79834 (0x000137da)

libfwnt_lzxpress_huffman_decompress_chunk: compressed data offset       : 19494 (0x00004c26)
libfwnt_lzxpress_huffman_decompress_chunk: huffman symbol               : 0x00c8
libfwnt_lzxpress_huffman_decompress_chunk: number of bits               : 25

Per the previous libfwnt debug output, this corresponds to approximately offset 0x00004c24 in the input data.

00004c00  09 98 50 0a 27 90 02 2b  a8 02 0b b8 d0 0c 2f 10  |..P.'..+....../.|
00004c10  03 33 c8 03 0d d8 50 0e  37 90 03 3f e8 3f b0 00  |.3....P.7..?.?..|
00004c20  40 ff 00 00 cd 0f 01 00  00 00 e6 ee de ee ea e9  |@...............|
00004c30  ee ee e6 ee ee ee e6 ee  ee ee e6 ee ed ee e6 ee  |................|

It looks like a new chunk of LZXpress-Huffman compressed data starts around 0x4c2a.

lzxpressdecompress -v -2 -o 8 -d 226264 -s $(( 0x4c2a - 8 )) -t test DOWNLOADER.EXE-CAE991BA.pf

This results in an uncompressed output of:

0001fff0  a0 86 00 10 05 ff ff ff  a0 86 00 10 05 ff ff ff  |................|
00020000  a0 86                                             |..|

lzxpressdecompress -v -2 -o $(( 0x4c2a )) -d 226264 -s $(( 0x7422 - 0x4c2a )) -t test DOWNLOADER.EXE-CAE991BA.pf

00000000  e8 87 00 00 05 ff 8f ff  f0 87 00 00 05 ff 8f ff  |................|
00000010  f8 87 00 00 05 ff 8f ff  00 88 00 00 05 ff 8f ff  |................|
...
00012c50  64 00 61 00 34 00 7d 00  5c 00 24 00 45 00 58 00  |d.a.4.}.\.$.E.X.|
00012c60  54 00 45 00 4e 00 44 00  00 00 00 00 00 00 00 00  |T.E.N.D.........|
00012c70  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

A first assumption is that chunks of 0x20000 bytes are used.

@joachimmetz
Member

joachimmetz commented Jul 23, 2020

> A first assumption is that chunks of 0x20000 bytes are used.

Ignore that, chunks are 0x10000 bytes in size.
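
With 0x10000-byte chunks, an offset in the decompressed data maps to a chunk index and an offset inside that chunk by simple division. The sketch below applies this to the byte 79768 where the binary diff first diverged. The name `locate` is hypothetical, and the code assumes chunks are counted from 0 and measured in uncompressed bytes.

```python
CHUNK_SIZE = 0x10000  # uncompressed chunk size stated in this thread


def locate(uncompressed_offset: int, chunk_size: int = CHUNK_SIZE):
    """Return (chunk index, offset within that chunk) for an uncompressed offset."""
    return divmod(uncompressed_offset, chunk_size)


chunk_index, offset_in_chunk = locate(79768)
print(chunk_index, hex(offset_in_chunk))  # prints: 1 0x3798
```

Under these assumptions the mismatch falls in the second chunk, not at a chunk boundary.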

libfwnt_lzxpress_huffman_decompress_chunk: compressed data offset       : 19479 (0x00004c17)
libfwnt_lzxpress_huffman_decompress_chunk: huffman symbol               : 0x01ff
libfwnt_lzxpress_huffman_decompress_chunk: number of bits               : 26
libfwnt_lzxpress_huffman_decompress_chunk: compression offset           : 2024
libfwnt_lzxpress_huffman_decompress_chunk: compression size             : 15
libfwnt_lzxpress_huffman_decompress_chunk: extended compression size    : 270 (offset: 0x00004c19)
libfwnt_lzxpress_huffman_decompress_chunk: extended compression size    : 0 (offset: 0x00004c1a)
libfwnt_lzxpress_huffman_decompress_chunk: compression offset           : 34792
libfwnt_lzxpress_huffman_decompress_chunk: compression size             : 3
libfwnt_lzxpress_huffman_decompress_chunk: uncompressed data offset     : 79760 (0x00013790)

The second extended compression size of 0 is the edge case and appears to have special behavior. It is read at offset 0x00004c1a in the compressed data, which corresponds to file offset 0x00004c1a + 8 = 0x00004c22.

00004c10  03 33 c8 03 0d d8 50 0e  37 90 03 3f e8 3f b0 00  |.3....P.7..?.?..|
00004c20  40 ff 00 00 cd 0f 01 00  00 00 e6 ee de ee ea e9  |@...............|

This looks like an extension for a 32-bit extended size.
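
The size escalation seen in the debug output can be sketched as follows. This is not the libfwnt implementation: `read_match_length` is a hypothetical name, the 32-bit little-endian read after a 16-bit value of 0 is an assumption based on the hexdump above, and the final `+ 3` assumes the usual LZXpress minimum match length. The sketch also skips validation of the 16-bit and 32-bit values.

```python
import struct


def read_match_length(data: bytes, pos: int, nibble: int):
    """Return (match length, new position) for a 4-bit size taken from the symbol."""
    length = nibble
    if nibble == 15:
        # First extension: a single byte added to the 4-bit maximum.
        length = data[pos] + 15
        pos += 1
        if length == 270:  # 255 + 15: the byte extension was saturated
            # Second extension: a 16-bit little-endian size.
            (length,) = struct.unpack_from("<H", data, pos)
            pos += 2
            if length == 0:
                # Assumed third extension: a 32-bit little-endian size.
                (length,) = struct.unpack_from("<I", data, pos)
                pos += 4
    return length + 3, pos


# Bytes at file offsets 0x4c21..0x4c27 from the hexdump: ff 00 00 cd 0f 01 00
sample = bytes.fromhex("ff0000cd0f0100")
print(read_match_length(sample, 0, 15))
```

Under these assumptions the `ff` byte yields 270, the following `00 00` triggers the 32-bit read, and `cd 0f 01 00` is interpreted as 0x00010fcd.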

@joachimmetz
Member

joachimmetz commented Jul 23, 2020

I will make the changes to libfwnt to add 32-bit extended size support. I wonder when this was introduced, and whether it also leads to issues for solutions using the WINAPI on older versions of Windows.

@joachimmetz
Member

Closing this in favor of libyal/libfwnt#8

@scudette
Author

scudette commented Jul 23, 2020 via email
