[Request] Support games compressed in .xz format #1964

tony971 · 2017-06-07T16:01:24Z

PCSX2 has a shiny new .xz module. Any chance of users being able to shrink down their game library because of it?

gregory38 · 2017-06-08T09:24:18Z

I would love it.

Some info for a future implementer:

Index information is built into the xz format.
The xz-utils code can be used as an example of how to get the index information.

gregory38 · 2017-08-06T15:42:07Z

A future version of xz (5.3) adds a new API to parse block information: lzma_file_info_decoder. This function decodes all the header data and creates a structure listing the blocks (type lzma_index). Note that xz calls this block list an index.

Then you can create an iterator with lzma_index_iter_init. (Note that the next element can be fetched with lzma_index_iter_next, but I think that is useless here.)

You can go directly to the right block with lzma_index_iter_locate, which sets the iterator to the block containing the address you want to decode.

Summary:

lzma_file_info_decoder(&stream, &index, ...): out index
lzma_index_iter_init(&iter, index): out iterator
lzma_index_iter_locate(&iter, uncompressed_address): out correctly set iterator
This gives access to iter.block.compressed_file_offset, iter.block.uncompressed_file_offset and iter.block.uncompressed_size

The API is not clear to me. There are lzma_block, lzma_index and lzma_index_iter objects. It seems that the iterator above should be used to decode the block header info.

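while (!lzma_index_iter_next(&iter, LZMA_INDEX_ITER_BLOCK)) {
    lzma_block block;
    lzma_filter filters[LZMA_FILTERS_MAX + 1];
    uint8_t header[LZMA_BLOCK_HEADER_SIZE_MAX];
    // file is the open .xz FILE*; the first header byte encodes the header size
    fseek(file, iter.block.compressed_file_offset, SEEK_SET);
    fread(header, 1, 1, file);
    block.header_size = lzma_block_header_size_decode(header[0]);
    fread(header + 1, 1, block.header_size - 1, file);
    // version, check and filters must be set before decoding the header
    block.version = 0;
    block.check = iter.stream.flags->check;
    block.filters = filters;
    lzma_block_header_decode(&block, NULL, header);
    all_info_struct.push(block); // pseudocode: filters must outlive this iteration
}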

With the block and file info, you can directly use lzma_block_buffer_decode:

lzma_block_buffer_decode(lzma_block *block, const lzma_allocator *allocator,
		const uint8_t *in, size_t *in_pos, size_t in_size,
		uint8_t *out, size_t *out_pos, size_t out_size)

gregory38 · 2017-08-06T17:38:16Z

So I read the xz file format specification. Information is duplicated for redundancy and corruption checking.

So the full story is

A block contains a header plus data. The header contains the header size and may (depending on the compression flags) contain the compressed and uncompressed sizes of the block.
The index is a list of block records. Each record contains the unpadded size and the uncompressed size.

NOTE: I checked a binary on my computer and the sizes aren't present in the block header.

Conclusion: we first need to decode the index to get the offsets of the blocks. The iterator allows us to iterate over the block records.
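To make the conclusion above concrete, here is a minimal sketch of how offsets can be derived from index records that only store sizes. It does not use liblzma. The type and function names (xz_record, xz_block_pos, xz_build_positions, xz_locate) are hypothetical, and it assumes a single-stream file with no stream padding, so the first block starts right after the 12-byte stream header and each block is padded to a multiple of four bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* One index record: sizes only, no offsets. */
typedef struct {
    uint64_t unpadded_size;     /* block header + compressed data + check */
    uint64_t uncompressed_size; /* bytes produced when the block is decoded */
} xz_record;

/* Offsets derived from the records (hypothetical helper type). */
typedef struct {
    uint64_t compressed_offset;   /* where the block starts in the .xz file */
    uint64_t uncompressed_offset; /* where its data starts in the image */
    uint64_t uncompressed_size;
} xz_block_pos;

/* Assumption: single stream, so the first block follows the stream header. */
enum { XZ_STREAM_HEADER_SIZE = 12 };

/* Blocks are padded to a multiple of four bytes in the file. */
static uint64_t xz_padded_size(uint64_t unpadded_size)
{
    return (unpadded_size + 3) & ~(uint64_t)3;
}

/* Fill out[0..count-1] by accumulating sizes from the first block onward. */
static void xz_build_positions(const xz_record *rec, size_t count,
                               xz_block_pos *out)
{
    uint64_t c = XZ_STREAM_HEADER_SIZE;
    uint64_t u = 0;
    for (size_t i = 0; i < count; i++) {
        out[i].compressed_offset = c;
        out[i].uncompressed_offset = u;
        out[i].uncompressed_size = rec[i].uncompressed_size;
        c += xz_padded_size(rec[i].unpadded_size);
        u += rec[i].uncompressed_size;
    }
}

/* Binary search for the block holding uncompressed byte `target`.
 * Returns the block number, or -1 if target is past the end. */
static ptrdiff_t xz_locate(const xz_block_pos *pos, size_t count,
                           uint64_t target)
{
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (target < pos[mid].uncompressed_offset)
            hi = mid;
        else if (target >= pos[mid].uncompressed_offset
                               + pos[mid].uncompressed_size)
            lo = mid + 1;
        else
            return (ptrdiff_t)mid;
    }
    return -1;
}
```

In real code lzma_index_iter_locate does this lookup for you; the sketch only shows why the index has to be decoded before any random access is possible.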

gregory38 · 2017-08-06T19:08:16Z

@turtleli
I would need a newer version of xz to do some tests (unreleased actually). How can I sync https://github.com/PCSX2/xz.git with latest upstream git ?

Edit: actually don't bother I pulled something in a local branch. It should be enough.

avih · 2017-08-06T19:19:35Z

While xz is definitely popular, if possible, I'd suggest also examining/playing with newer compression formats like zstd and brotli, or maybe some of the LZ* family. In my experience with the gzip implementation, random access decompression speed is the key to avoiding a lot of pitfalls, workarounds and caches.

Also, it's best to avoid creating an index, and instead stick to formats/configurations which provide their own index as part of the standard, and require users to use only these configurations (I don't know how far this is true for the formats I mentioned).

gregory38 · 2017-08-07T22:25:28Z

So I'm rather close to a working prototype (based on the latest xz git). I managed to uncompress a couple of blocks, and the cdvd format seems to be detected correctly. But it fails later. Maybe an issue with block boundaries. I need to double check the logic.

gregory38 · 2017-08-08T20:08:09Z

Good news: I managed to boot a game. The issue was in the blocksize/blockcount management. Honestly the logic should be moved into the base class. Anyway, the XZ stuff is done 👍
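The thread does not show the actual fix. As an illustration of the kind of blocksize/blockcount bookkeeping involved, here is a hedged sketch of splitting one read into pieces that each stay inside a single block. The names (read_piece, split_read) are hypothetical, and it assumes every block has the same uncompressed size, which xz does not guarantee; with variable block sizes the block would have to be found through the index instead of by division.

```c
#include <stddef.h>
#include <stdint.h>

/* One piece of a read that lies entirely inside a single block. */
typedef struct {
    uint64_t block;   /* number of the block to decompress */
    uint64_t offset;  /* first wanted byte inside the decompressed block */
    uint64_t length;  /* number of bytes wanted from this block */
} read_piece;

/* Split the byte range [pos, pos + len) into per-block pieces.
 * Assumes a fixed uncompressed block_size (an assumption of this sketch).
 * Returns the number of pieces written, at most max_pieces. */
static size_t split_read(uint64_t pos, uint64_t len, uint64_t block_size,
                         read_piece *out, size_t max_pieces)
{
    size_t n = 0;
    if (block_size == 0)
        return 0;
    while (len > 0 && n < max_pieces) {
        uint64_t in_block = pos % block_size;   /* offset inside the block */
        uint64_t room = block_size - in_block;  /* bytes left in the block */
        uint64_t take = len < room ? len : room;
        out[n].block = pos / block_size;
        out[n].offset = in_block;
        out[n].length = take;
        n++;
        pos += take;
        len -= take;
    }
    return n;
}
```

A read that fits inside one block yields a single piece; a read that crosses a boundary yields one piece per block touched, so the caller can decompress each block once and copy out the matching slice.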

gregory38 · 2017-08-10T18:00:29Z

As a side note, xz could be a neat replacement for save states too. I saved 30% by repacking a savestate.

tony971 · 2017-09-14T10:23:50Z

Is this waiting on a new XZ release?

gregory38 · 2017-09-14T14:58:58Z

Yeah, a new XZ release would help. We would need to release 1.6 too. I don't want to require an alpha release of XZ for our release. I'm also waiting until I have free time to merge the code.

Zero3K · 2017-12-31T17:12:47Z

Any news regarding it?

MrCK1 · 2017-12-31T17:13:39Z

Nope, everything you see in the pull requests section is what's currently being worked on.

Zero3K · 2017-12-31T17:20:24Z

I hope that it gets added soon.

tony971 · 2018-01-05T15:34:32Z

I've just been checking for new XZ releases. This is the biggest gap between releases they've had in a long while, so hopefully it's soon.

tony971 · 2018-04-30T13:39:06Z

Looks like an alpha build of xz utils was published with the lzma_file_info_decoder() API

https://git.tukaani.org/?p=xz.git;a=blob_plain;f=NEWS;hb=114cab97af766b21e0fc8620479202fb1e7a5e41

Quest79 · 2020-09-18T08:28:11Z

What's happening (or happened) with this? I've just done a ton of tests compressing my PS2 library, and out of gz, zip, 7z, rar, cso and xz, xz had the smallest file size and did it suspiciously fast: 10MB/s with the strongest compression. Gz was doing about 2MB/s (fewer cores used). I'm using the one built into current 7z.

Space savings are roughly 8-15% over gz. That's several hundred GB saved for larger collections.
Theoretically~ if we keep implementing new, better compression every few years, in 200 years or so a massive PS2 collection will be under 10KB

refractionpcsx2 · 2020-09-18T08:32:18Z

Nothing has really happened I'm afraid. xz got added to PCSX2 for making GS dumps, but no support for loading games yet, it's not really been much of a priority.

> Theoretically~ if we keep implementing new, better compression every few years, in 200 years or so a massive PS2 collection will be under 10KB

That's big brain right there, but I don't think that's how compression works.

gregory38 · 2020-09-18T08:36:48Z

Btw, I found this codec recently. The promise is a compression ratio similar to lzma, but a much faster decompression speed.

https://github.com/richgel999/lzham_codec

However, I don't know if we can chunk the bitstream for random access.

tony971 · 2020-09-18T18:30:08Z

This is the most promising I've found.

https://github.com/aaru-dps/Aaru

lightningterror · 2022-05-10T10:29:51Z

Closing as trivial, and we support other formats.
A PR was opened to implement it in #2424; if someone wishes, they can resume work to get it into a mergeable state.

ssakash added the Feature request label Jun 7, 2017

gregory38 added the Core label Jun 8, 2017

MrCK1 added Enhancement / Feature Request and removed Feature request labels Oct 13, 2017

tony971 mentioned this issue Jan 4, 2019

[Request] .aif file support #2798

Closed

lightningterror closed this as completed May 10, 2022
