This repository has been archived by the owner on Jun 2, 2024. It is now read-only.

Binary file seems to be corrupted with method Deflated #378

Closed
mass10 opened this issue Jun 27, 2023 · 14 comments

@mass10

mass10 commented Jun 27, 2023

Summary

A binary file appears to be corrupted when it is written with zip::CompressionMethod::Deflated.

Code to reproduce

use std::io::{Read, Write};

/// Example of deflate.
fn main() -> Result<(), Box<dyn std::error::Error>> {
	// ARCHIVE
	{
		// START
		let w = std::fs::File::create("notepad.exe.zip")?;
		let mut archiver = zip::ZipWriter::new(w);

		let options = zip::write::FileOptions::default();
		// let options = options.compression_method(zip::CompressionMethod::Stored); // SAFE
		let options = options.compression_method(zip::CompressionMethod::Deflated); // CORRUPT🔥
		archiver.start_file("notepad.exe", options)?;

		// READ
		let mut stream = std::fs::File::open("C:\\Windows\\system32\\notepad.exe")?;
		loop {
			let mut buffer = [0; 1000];
			let bytes_read = stream.read(&mut buffer)?;
			if bytes_read == 0 {
				break;
			}

			// WRITE
			let write_buffer = &buffer[..bytes_read];
			archiver.write(&write_buffer)?;
		}

		// COMPLETE
		archiver.finish().unwrap();
	}

	// UNZIP
	{
		let command = std::process::Command::new("wsl.exe").args(&["unzip", "notepad.exe.zip"]).spawn()?.wait()?;
		println!("Command exited with: {}", command);
	}

	// ORIGINAL FILE SIZE
	{
		let meta = std::fs::metadata("C:\\Windows\\system32\\notepad.exe")?;
		println!("Original NOTEPAD is: {} bytes.", meta.len());
	}

	// EXTRACTED FILE SIZE
	{
		let meta = std::fs::metadata("notepad.exe")?;
		println!("Extracted NOTEPAD is: {} bytes.", meta.len());
	}

	return Ok(());
}

Result is

Original NOTEPAD is: 201216 bytes.
Extracted NOTEPAD is: 200903 bytes.

Thank you!

@mass10
Author

mass10 commented Jul 2, 2023

zip::CompressionMethod::Zstd sounds good.

@mass10
Author

mass10 commented Jul 3, 2023

But I found that none of the following support Zstandard:

  • 7-zip
  • explorer.exe
  • unzip

@milesj

milesj commented Jul 8, 2023

Just ran into this also. Have to use Stored for now...

@axnsan12

axnsan12 commented Jul 23, 2023

You are using archiver.write (an implementation of std::io::Write::write) and ignoring the return value. You should be calling write in a loop checking the count of bytes written, or use archiver.write_all.

https://doc.rust-lang.org/stable/std/io/trait.Write.html#tymethod.write

This function will attempt to write the entire contents of buf, but the entire write might not succeed, or the write may also generate an error. A call to write represents at most one attempt to write to any wrapped object.
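
The difference between write and write_all can be sketched with the standard library alone. ShortWriter below is a hypothetical stand-in that accepts at most `limit` bytes per call; it is not the zip crate's writer, and the 1000-byte chunk size simply mirrors the buffer in the reproduction code. The three helper functions are likewise made up for illustration.

```rust
use std::io::{self, Write};

/// Hypothetical writer that accepts at most `limit` bytes per `write` call,
/// imitating any writer that is allowed to perform short writes.
struct ShortWriter {
    data: Vec<u8>,
    limit: usize,
}

impl Write for ShortWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Accept only part of the buffer and report how much was taken.
        let n = buf.len().min(self.limit);
        self.data.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Bug pattern: one `write` call per chunk, returned count ignored.
fn copy_ignoring_count(input: &[u8], limit: usize) -> io::Result<usize> {
    let mut w = ShortWriter { data: Vec::new(), limit };
    for chunk in input.chunks(1000) {
        let _written = w.write(chunk)?; // bytes beyond `_written` are silently dropped
    }
    Ok(w.data.len())
}

/// Fix: `write_all` keeps calling `write` until the whole chunk is accepted.
fn copy_with_write_all(input: &[u8], limit: usize) -> io::Result<usize> {
    let mut w = ShortWriter { data: Vec::new(), limit };
    for chunk in input.chunks(1000) {
        w.write_all(chunk)?;
    }
    Ok(w.data.len())
}

/// Alternative: `std::io::copy` streams a reader into a writer and handles
/// short writes itself, so no manual read/write loop is needed.
fn copy_with_io_copy(mut input: &[u8], limit: usize) -> io::Result<usize> {
    let mut w = ShortWriter { data: Vec::new(), limit };
    io::copy(&mut input, &mut w)?;
    Ok(w.data.len())
}
```

With a 2500-byte input and a 700-byte limit, the first function stores only 1900 bytes, while the other two store all 2500. In the reproduction code, the corresponding change would be to call archiver.write_all(write_buffer) in place of archiver.write(&write_buffer).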

@mass10
Author

mass10 commented Jul 24, 2023

Oh, that fixed it!

Thank you!

@mass10 mass10 closed this as completed Jul 24, 2023

@milesj

milesj commented Jul 26, 2023

I'm using write_all and it still fails, here's my impl: https://github.com/moonrepo/starbase/blob/master/crates/archive/src/zip.rs

@mass10 mass10 reopened this Jul 27, 2023

@mass10
Author

mass10 commented Jul 29, 2023

@milesj I think fs::read_file_bytes() should be used instead of fs::read_file() in your code.

@milesj

milesj commented Jul 29, 2023

When I try that I get this error "stream did not contain valid UTF-8", which is the same as using fs::read_file(file)?.as_bytes().

Both are interesting, since these test cases are very simple.

I also tried this, which failed with the same error.

let mut buffer = Vec::new();
let mut stream = fs::open_file(file)?;
stream.read_to_end(&mut buffer).unwrap();

I feel like something else may be going on.

@mass10
Author

mass10 commented Jul 29, 2023

I tried this.

/// Succeeds🌔
fn diagnose_1(path: &str) {
    let data = std::fs::read(path).unwrap();
    println!("(1) {} is {} bytes.", path, data.len());
}

/// Succeeds🌔
fn diagnose_2(path: &str) {
    use std::io::Read;
    let mut file = std::fs::File::open(path).unwrap();
    let mut buffer = Vec::new();
    file.read_to_end(&mut buffer).unwrap();
    println!("(2) {} is {} bytes.", path, buffer.len());
}

/// Fails🌒
fn diagnose_3(path: &str) {
    let text = std::fs::read_to_string(path).unwrap(); // 🔥stream did not contain valid UTF-8
    let data = text.as_bytes();
    println!("(3) {} is {} bytes.", path, data.len());
}

pub fn main() {
    diagnose_1(r"C:\Windows\system32\notepad.exe"); //🌔
    diagnose_2(r"C:\Windows\system32\notepad.exe"); //🌔
    diagnose_3(r"C:\Windows\system32\notepad.exe"); //🌒
}

Result was

(1) C:\Windows\system32\notepad.exe is 201216 bytes.
(2) C:\Windows\system32\notepad.exe is 201216 bytes.
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { kind: InvalidData, message: "stream did not contain valid UTF-8" }', src\test2.rs:18:46
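
As a side note to the diagnosis above, here is a minimal sketch of why binary data should stay as bytes rather than pass through a String. The function names are made up for illustration, and the four sample bytes mentioned below ("MZ" followed by 0x90 0x00) are an assumption chosen to resemble the start of a Windows executable.

```rust
/// Returns true if the bytes could be held in a `String` without loss.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}

/// Length of the data after a lossy round trip through a string.
/// Each invalid sequence becomes U+FFFD, which is three bytes in UTF-8,
/// so the result can differ in length and content from the input.
fn lossy_round_trip_len(bytes: &[u8]) -> usize {
    String::from_utf8_lossy(bytes).as_bytes().len()
}
```

For the sample bytes [0x4D, 0x5A, 0x90, 0x00], is_valid_utf8 returns false and the lossy round trip yields 6 bytes instead of 4, so even a conversion that does not fail would change the file contents. std::fs::read or read_to_end avoid the problem because they never interpret the bytes as text.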

@milesj

milesj commented Aug 1, 2023

It's weird, because even if I do this it fails:

self.archive
    .start_file(name, options)
    .map_err(|error| ZipError::AddFailure {
        source: file.to_path_buf(),
        error,
    })?;

self.archive
    .write_all(&std::fs::read(file).unwrap())
    .map_err(|error| FsError::Write {
        path: file.to_path_buf(),
        error,
    })?;

I wonder if this abstraction is causing issues once compiled.

Example PR: moonrepo/starbase#33

@mass10
Author

mass10 commented Aug 5, 2023

@milesj What kind of file is broken or fails?

@milesj

milesj commented Aug 5, 2023

Not sure which file is the problem, but this is the fixture: https://github.com/moonrepo/starbase/tree/master/crates/archive/tests/__fixtures__/archives

They are very simple files.

@milesj

milesj commented Aug 7, 2023

I figured out the problem; it was my mistake. I was using by_index_raw instead of by_index when unzipping.

I'm using Copilot and I think it generated that 😞

@mass10
Author

mass10 commented Aug 8, 2023

That's good! Everything has become clear.

@mass10 mass10 closed this as completed Aug 9, 2023