Corrupt input data #17

Open
kitsudog opened this issue Sep 30, 2020 · 3 comments
Labels
bug Something isn't working wontfix This will not be worked on

Comments

@kitsudog

kitsudog commented Sep 30, 2020

PARTS_01_SHIRTF.l2db.zip

CompressionHelper.py

  try:
      return dec.decompress(data[5:])
  except:
      return lzma.decompress(data)

With this change, it works.

@K0lb3
Owner

K0lb3 commented Oct 7, 2020

Thanks for the report.
The fix you proposed wouldn't fix the root of the issue,
which is probably a false flag detection.
I will look into it in the coming days.

@K0lb3
Owner

K0lb3 commented Oct 9, 2020

I noticed that the data in question is compressed via lzma.FORMAT_ALONE, which is the legacy .lzma format.

So I checked how the properties are encoded in that format and found the following code:

void DecodeProperties(const Byte *properties)
{
  unsigned d = properties[0];
  if (d >= (9 * 5 * 5))
    throw "Incorrect LZMA properties";
  lc = d % 9;
  d /= 9;
  pb = d / 5;
  lp = d % 5;
  dictSizeInProperties = 0;
  for (int i = 0; i < 4; i++)
    dictSizeInProperties |= (UInt32)properties[i + 1] << (8 * i);
  dictSize = dictSizeInProperties;
  if (dictSize < LZMA_DIC_MIN)
    dictSize = LZMA_DIC_MIN;
}

which is pretty much the same as the current Python code, apart from the minimum dict size.
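
For comparison, here is a rough Python sketch of the same property decoding. It is not the code from CompressionHelper.py; the function name decode_properties is made up for illustration, and the LZMA_DIC_MIN value of 1 << 12 is assumed from the reference LZMA specification.

```python
# Assumed minimum dictionary size (value taken from the reference LZMA spec).
LZMA_DIC_MIN = 1 << 12


def decode_properties(properties: bytes) -> tuple[int, int, int, int]:
    """Decode the 5-byte LZMA properties header into (lc, lp, pb, dict_size)."""
    if len(properties) < 5:
        raise ValueError("need at least 5 property bytes")
    d = properties[0]
    if d >= 9 * 5 * 5:
        raise ValueError("Incorrect LZMA properties")
    lc = d % 9          # literal context bits
    d //= 9
    pb = d // 5         # position bits
    lp = d % 5          # literal position bits
    # Bytes 1..4 hold the dictionary size as a little-endian 32-bit integer.
    dict_size = int.from_bytes(properties[1:5], "little")
    # Clamp to the minimum, which is the step the C++ code adds.
    return lc, lp, pb, max(dict_size, LZMA_DIC_MIN)


# The common property byte 0x5D with an 8 MiB dictionary.
print(decode_properties(bytes([0x5D, 0x00, 0x00, 0x80, 0x00])))
```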

So it's better to simply use lzma.decompress(data, format=lzma.FORMAT_ALONE).
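
A minimal round trip showing that call with the standard library. The payload here is made-up sample data, not the attached file.

```python
import lzma

# Made-up sample payload; a real input would be the raw file contents.
original = b"example payload " * 8

# Produce a legacy .lzma (FORMAT_ALONE) stream, then decode it explicitly.
legacy_blob = lzma.compress(original, format=lzma.FORMAT_ALONE)
restored = lzma.decompress(legacy_blob, format=lzma.FORMAT_ALONE)

print(restored == original)
```

Passing the format explicitly avoids relying on auto-detection, which is an advantage if the other inputs use a different container.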

I'm going to run some tests now to confirm this.

@K0lb3
Owner

K0lb3 commented Oct 9, 2020

Hm,
according to my tests it's not exactly the same.
I found a good way to identify the compression format tho.

When the ALONE format is used, data[5] equals 255;
when the specific compressor is used, it equals 0.
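
A sketch of how that check could be turned into a dispatcher. The names detect_format and decompress_auto are hypothetical, and custom_decoder stands in for whatever the existing specific-compressor path is. As background, in the .lzma header bytes 5 to 12 hold the uncompressed size, and all 0xFF means "unknown", which would explain the 255.

```python
import lzma
from typing import Callable


def detect_format(data: bytes) -> str:
    """Classify the input by data[5], following the observation above."""
    if len(data) < 6:
        raise ValueError("input too short to inspect data[5]")
    marker = data[5]
    if marker == 0xFF:
        return "alone"    # legacy .lzma stream
    if marker == 0x00:
        return "custom"   # the project's specific compressor
    raise ValueError(f"unrecognised marker byte: {marker}")


def decompress_auto(data: bytes, custom_decoder: Callable[[bytes], bytes]) -> bytes:
    """Dispatch to the matching decoder; custom_decoder is a placeholder."""
    if detect_format(data) == "alone":
        return lzma.decompress(data, format=lzma.FORMAT_ALONE)
    return custom_decoder(data)


# Usage with a stream produced by Python's own FORMAT_ALONE encoder.
alone_blob = lzma.compress(b"abc", format=lzma.FORMAT_ALONE)
print(detect_format(alone_blob))
```

This only covers the two marker values seen so far, so anything else raises instead of guessing.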

I'm going to run some further tests and see how it goes with that.

@K0lb3 K0lb3 added bug Something isn't working wontfix This will not be worked on labels Nov 25, 2022