Open compressed files that were not made by godot #28999

Open
xelivous opened this issue May 19, 2019 · 14 comments

Comments

@xelivous
Contributor

Godot version: 33897d9

OS/device including version: Linux 4.14.113-1-MANJARO

Issue description:

I was trying to load a gzipped JSON file from an external web server, and I can't find a way to do so in Godot without making a GDNative plugin. File has the method open_compressed(), which supports gzip, but the format it expects for gzip has this bizarre GCPF magic header that encapsulates the format.
Screenshot_2019-05-19_05-54-52

I kind of expected that it would at the very least open compressed files that weren't made using the Godot editor, assuming encryption wasn't used, considering that at the end of the day it's just a bunch of strings or whatever other bytes in there. I'm not sure why the magic header exists in the compression format, but ideally it would fall back to not requiring the magic header at all and just open the file without it.

If I ever need to edit a gzipped file that Godot has made, I can't just make a quick change with any other archive tool. I have to open Godot, load the file, edit it, and then re-export it for Godot to keep being able to read it, which is kind of annoying.

Steps to reproduce:

  1. download a gzip'd file from anywhere or make one yourself using the command line
  2. try and load it into godot using open_compressed or any other manner

Minimal reproduction project:

test_compression.zip

  • There's a test.json.gz file at the root of the project that was made using gzip on the command line. It has the same contents as the text edit in the project.
  • If you click the "save gz" button, it will save a file to user://test.json.gz, which you can load with "load gz". You can use this to compare the output Godot expects with the output from command-line gzip.
  • You can try to load the command-line gzip file by clicking the button on the bottom of the text edit, and if it fails it will spit out the error below. It will generally always be error 15 until something changes.
  • There's also a checkbox to verify if the input is valid json or not.
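
The mismatch described above can be seen from the first bytes of each file. The following Python sketch is not part of the reproduction project; `sniff_container` and `make_gzip` are hypothetical helper names. It assumes only what this issue reports (Godot's files start with the ASCII magic GCPF) plus the standard gzip magic bytes 1f 8b.

```python
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # standard gzip member header (RFC 1952)
GODOT_MAGIC = b"GCPF"     # magic reported in this issue for Godot's container


def sniff_container(data: bytes) -> str:
    """Classify a file by its leading magic bytes."""
    if data.startswith(GODOT_MAGIC):
        return "godot-gcpf"
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    return "unknown"


def make_gzip(payload: bytes) -> bytes:
    """Produce a standard gzip stream, like command-line gzip would."""
    compressor = zlib.compressobj(wbits=31)  # 31 selects gzip framing
    return compressor.compress(payload) + compressor.flush()


if __name__ == "__main__":
    print(sniff_container(make_gzip(b'{"hello": "world"}')))  # prints: gzip
```

A file produced by command-line gzip would be classified as "gzip" here, which is the kind of input open_compressed() rejects with error 15 in this report.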

@Skaruts

Skaruts commented Sep 10, 2019

I was trying to load RexPaint images (.xp files) and it seems I'm having the same problem. Error 15, file unrecognized.

I'm ignorant when it comes to compression formats, but according to RexPaint's manual:

Appendix B: .xp Format Specification (and Import Libraries)
-----------------------------------------------------------------
    (...)
    The .xp files are deflated with zlib (specifically they are gzipped files created 
    via gzofstream); once decompressed the format is binary (...)

So I tried File.COMPRESSION_GZIP first, but then I also tried all the others and none worked. If I manually extract the file with 7z then it works fine, but I'd rather not have to be doing that...

@DrMoriarty
Contributor

I have the same problem. I am trying to pack some text resources and use them in my game. I tried zstd and gzip, but I cannot open these compressed files in GDScript because open_compressed returns error 15.

@R-033

R-033 commented Jan 26, 2020

To avoid error 15 you need to add the header yourself, sadly.
As I understand it, it goes as follows:

file.store_32(0x46504347) # magic: the bytes "GCPF" when stored little-endian
file.store_32(0x00000001) # compression method
file.store_32(0x00001000) # block size (4096 bytes)
file.store_32(decompressedSize)
file.store_32(compressedSize)

And at the end of the file, for safety:

file.store_32(0x00000000)
file.store_32(0x46504347)

I pack compressed data + header into temporary file with open() and then read that file with open_compressed()
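
For anyone preparing such files outside Godot, the layout above can be transcribed into a short Python sketch. This is only a restatement of the byte layout described in this comment, not something checked against Godot's source. `wrap_gcpf` is a hypothetical helper, and the parameter names `method` and `block_size` are assumptions. Since the header carries a single compressed size, it presumably only fits data that is stored as one block.

```python
import struct

GCPF_MAGIC = 0x46504347  # stored little-endian, the bytes spell "GCPF"


def wrap_gcpf(payload: bytes, decompressed_size: int,
              method: int = 1, block_size: int = 0x1000) -> bytes:
    """Wrap already-compressed bytes in the header/footer layout from this thread.

    payload           -- compressed data, unchanged
    decompressed_size -- expected size after decompression
    method            -- compression method field (1 in the comment above)
    block_size        -- the 0x00001000 field (name is an assumption)
    """
    # Five little-endian 32-bit fields, in the same order as the store_32 calls.
    header = struct.pack("<5I", GCPF_MAGIC, method, block_size,
                         decompressed_size, len(payload))
    # Footer as described: a zero word followed by the magic again.
    footer = struct.pack("<2I", 0, GCPF_MAGIC)
    return header + payload + footer
```

The result would be written to a temporary file and then read with open_compressed(), as described above.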

@Skaruts

Skaruts commented Feb 6, 2020

@R-033 where do compressedSize and decompressedSize come from?

@R-033

R-033 commented Feb 25, 2020

@Skaruts compressedSize is the size in bytes of the data between the header and the footer (the original file size); decompressedSize is the expected size of the data after it has been decompressed by the open_compressed() method

@Skaruts

Skaruts commented Feb 28, 2020

@R-033 I'm confused, though. At that point I don't know the size of the decompressed data, or what to expect the size to be.

@R-033

R-033 commented Feb 28, 2020

@Skaruts usually this value is used internally to create the output array, so maybe if you make it big enough there won't be a problem? I'm just guessing though, I didn't check the Godot sources
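
If the input is a standard gzip file, the decompressed size does not have to be guessed: the gzip format stores it in the last four bytes of the file as a little-endian 32-bit value (the ISIZE field of RFC 1952). A minimal Python sketch follows; `gzip_isize` is a hypothetical helper. Note that ISIZE is the size modulo 2^32 and, for files with several gzip members, only describes the last member.

```python
import struct
import zlib


def gzip_isize(data: bytes) -> int:
    """Return the uncompressed size recorded in a gzip trailer (mod 2**32)."""
    if len(data) < 18 or data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    # The trailer is CRC32 (4 bytes) followed by ISIZE (4 bytes), little-endian.
    (isize,) = struct.unpack("<I", data[-4:])
    return isize


if __name__ == "__main__":
    compressor = zlib.compressobj(wbits=31)  # 31 selects gzip framing
    gz = compressor.compress(b"x" * 1000) + compressor.flush()
    print(gzip_isize(gz))  # prints: 1000
```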

@facespkz

Ran into this today. Especially annoying since the compression formats already seem to use magic numbers...

@Calinou
Member

Calinou commented Apr 28, 2021

I was trying to load a gzipped json file from an external web server

Note that for this particular use case, transparent gzip compression is now supported in HTTPRequest in the master branch: #38944

@jamie-pate
Contributor

jamie-pate commented Nov 30, 2021

NOTE: this is also weird: every store_string() call will create a new gzip header! so you should pre-buffer your content before writing to a compressed file or you will bloat it with extra gzip block headers?

It looks like there is a maximum block size for compressed data, and the FileAccessCompressed class will write multiple gzip blocks with magic etc inside your compressed file.

@paulmiller

I wrote a little program to compress files outside of Godot, which can then be loaded normally inside Godot:
https://gist.github.com/paulmiller/a5e593eda3a14e3ffa9acd8f0a4fac4e

It also gets way better compression ratios for some files, with the downside that Godot will decompress the entire file at once.

@DanielKilgallon

DanielKilgallon commented Dec 18, 2023

I had to do some trial and error, but I was able to use the advice in this thread to successfully get Godot to read a file that was externally compressed.
https://gist.github.com/DanielKilgallon/5936bd6b5020202ce5dc61c0295ee10f

@Zylann
Contributor

Zylann commented Jan 8, 2024

I tried opening NBT files (in my case Minecraft schematics) which are documented as being some binary data wrapped in GZIP.
Unfortunately I hit this issue too, Godot can't recognize the file.

7zip can open the file just fine (it also indicates there is a 10-byte header), so there is surely a way to recognize its format.

So far I just see that FileAccess.open_compressed actually expects a custom, non-standard Godot format in any case, wrapping the actual format specified in compression_mode. So it looks like it would be quite annoying to make it support standard GZIP directly, but I'm not familiar with this code, so I don't know if there is a trick to do that cleanly.

After fiddling around, in the end it looks like this worked:

	var compressed_data := FileAccess.get_file_as_bytes(fpath)
	var decompressed_data := compressed_data.decompress_dynamic(-1, FileAccess.COMPRESSION_GZIP)

Of course, if the file can come from malicious players, you should specify max_size: even if the file is tiny, the uncompressed size can be as large as 4 GB.
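
The same precaution applies to any tool that handles untrusted gzip data, not just Godot. As an illustration in Python rather than GDScript, here is a minimal sketch of size-limited decompression using zlib's `max_length` argument. `gunzip_bounded` is a hypothetical helper, and the limit is whatever the application considers reasonable.

```python
import zlib


def gunzip_bounded(data: bytes, max_size: int) -> bytes:
    """Decompress a gzip stream, refusing to produce more than max_size bytes."""
    decompressor = zlib.decompressobj(wbits=31)  # 31 selects gzip framing
    # Ask for at most one byte more than allowed, so an oversized stream
    # is detected without inflating all of it.
    out = decompressor.decompress(data, max_size + 1)
    if len(out) > max_size:
        raise ValueError("decompressed data exceeds the allowed size")
    if not decompressor.eof:
        raise ValueError("gzip stream is truncated")
    return out
```

A few kilobytes of compressed zeros can expand to many megabytes, so the check runs during decompression instead of after it.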


The following is some fiddling around I did, before I noticed GZIP was just working (I was trying hard with DEFLATE instead but got nowhere).

I tried reading the GZIP format myself and handing the compressed data to PackedByteArray.decompress_dynamic.
I decoded 10 bytes of header (like I saw in 7zip) and decoded the uncompressed size in the footer (which matched what 7zip tells). CRC also matches what 7zip tells (though 7zip shows it in hexadecimal, careful :p).
But for some reason decompression keeps failing with the "incorrect header check" warning, I'm not sure why.

Here is the code I have so far:

static func open_gzip(fpath: String) -> PackedByteArray:
	var f := FileAccess.open(fpath, FileAccess.READ)
	if f == null:
		push_error("Could not open file ", fpath, ", error ", FileAccess.get_open_error())
		return PackedByteArray()
	
	# https://docs.fileformat.com/compression/gz/
	# Read 10-byte header
	var header := f.get_16() # 1f 8b
	var compression_method := f.get_8() # 08 for DEFLATE
	var file_flags := f.get_8()
	var timestamp := f.get_32()
	var compression_flags := f.get_8()
	var os_id := f.get_8()
	# Assuming no other extra header stuff, which my file doesn't have, 
	# but might need to be handled eventually
	
	var compressed_data_position := f.get_position()
	print("compressed_data_position ", compressed_data_position)
	var total_file_length := f.get_length()
	var footer_length := 8
	var compressed_data_size := total_file_length - compressed_data_position - footer_length
	var compressed_data := f.get_buffer(compressed_data_size)
	
	# Read footer
	var checksum_crc32 := f.get_32()
	var decompressed_data_size := f.get_32()
	print(f.get_position())
	print("decompressed_data_size ", decompressed_data_size)
	print("CRC32 ", checksum_crc32)
	
	f = null
	
	#compressed_data = FileAccess.get_file_as_bytes(fpath)
	
	# Decompress data ourselves. Usually the format is DEFLATE.
	# Godot allows to specify either GZIP or DEFLATE, but there is no difference in the 
	# implementation.
	var decompressed_data := compressed_data.decompress_dynamic(-1, FileAccess.COMPRESSION_DEFLATE)
	print("decompressed_data.size(): ", decompressed_data.size())
	
	return decompressed_data
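
One possible explanation for the "incorrect header check" warning, offered as an assumption and not verified against Godot's source: zlib distinguishes three framings for the same DEFLATE data (raw DEFLATE, zlib-wrapped, and gzip-wrapped), and "incorrect header check" is the message zlib gives when it expects a zlib wrapper and does not find one. If COMPRESSION_DEFLATE expects the zlib wrapper, the bare stream left after stripping the gzip header and footer would be rejected. The Python sketch below reproduces the three cases with the standard `zlib` module; `make_gzip` and `try_inflate` are hypothetical helpers.

```python
import zlib
from typing import Optional


def make_gzip(payload: bytes) -> bytes:
    """Build a gzip stream with a plain 10-byte header and 8-byte trailer."""
    compressor = zlib.compressobj(wbits=31)
    return compressor.compress(payload) + compressor.flush()


def try_inflate(data: bytes, wbits: int) -> Optional[bytes]:
    """Decompress with the given framing; return None if zlib rejects it.

    wbits=15  expects a zlib wrapper
    wbits=-15 expects raw DEFLATE with no wrapper
    wbits=31  expects a gzip wrapper
    """
    try:
        return zlib.decompress(data, wbits)
    except zlib.error:
        return None


if __name__ == "__main__":
    payload = b"hello godot " * 100
    gz = make_gzip(payload)
    raw = gz[10:-8]  # strip header and footer, as open_gzip() does above

    print(try_inflate(raw, 15) is None)      # zlib framing rejects raw DEFLATE
    print(try_inflate(raw, -15) == payload)  # raw framing accepts it
    print(try_inflate(gz, 31) == payload)    # gzip framing accepts the whole file
```

If this assumption holds, it would also explain why passing the whole, unmodified file with COMPRESSION_GZIP worked.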

And the file I'm testing with, just in case (inside zip)
train_bridge_x2.zip
