Skip to content

SCI: Merge and cleanup LZW decompressors #6550

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 10, 2025
Merged

Conversation

sluicebox
Copy link
Member

This PR ports the LZW decompressor from my SCI tools to ScummVM. This replaces two FreeSCI-era decompressors with one small documented one.

That's right, we're compressing the decompressors.

SCI has two LZW compression formats. They only have two minor differences. The first is the bit order, the second is an off-by-one bug. The bug is so common in LZW implementations that it has a name and made the LZW wiki page: "early change".

I don't think it's known to the SCI community that the formats only differ by these two things. I learned it the hard way when developing my decompressors. I think what usually happens is that the FreeSCI code for the second algorithm just gets ported around. It works, but it's based on disassembly and not very comprehensible. (As the FreeSCI comments acknowledged.)

This decompressor helps generate my sci-scripts repository, where it gets run on every script in almost every game/version, so it's well used. I've also ran it against every LZW-compressed resource in my corpus.

This affects all SCI0-SCI1 games, but it shouldn't change any behavior unless I've messed something up =)

@sluicebox sluicebox requested a review from bluegr April 9, 2025 03:11
Comment on lines +170 to +172
// SCI0: LSB-first.
// SCI01/1: MSB-first, code size is increased one code earlier than necessary.
// This is known as an "early change" bug in LZW implementations.
Copy link
Member

@bluegr bluegr Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit confused here. The "early change" bug was introduced in early implementations of the LZW algorithm, but here it's introduced in newer SCI versions. Was this introduced by accident, perhaps?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's very confusing that their second LZW implementation introduced this bug! I don't have any insight beyond that; it's a common mistake to make in LZW but it doesn't really matter as long as both the encoder and decoder behave the same.

My guess is that Sierra didn't know they introduced this bug in their second implementation. They already swapped the bit order, so it's not like any of the old stuff and new stuff would have been compatible. But it is funny that if it weren't for this, the only difference would be "flip the bits".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Odd indeed. Thanks for the explanation :)

@bluegr
Copy link
Member

bluegr commented Apr 10, 2025

Excellent work! Much cleaner and readable implementation

I've briefly tested this out, and it works great.

Thanks for your work!

@bluegr bluegr merged commit 6fff0ec into scummvm:master Apr 10, 2025
8 checks passed
@sluicebox sluicebox deleted the sci-lzw branch April 10, 2025 06:21
@Kawa-oneechan
Copy link
Contributor

That's right, we're compressing the decompressors.

Oh my wow 😹

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants