Skip to content

zip: Entry.name(); Entry.extract_to() accepting user-provided writers #23991

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

l1yefeng
Copy link

std.zip users sometimes wish to extract content in memory rather than to the file system. This change adds a new extract_to as a more flexible alternative to the existing extract.

This partly addresses #21922.

I am a first-time contributor to zig. Comments and suggestions are greatly appreciated.

@l1yefeng l1yefeng marked this pull request as draft May 26, 2025 12:00
@l1yefeng l1yefeng marked this pull request as ready for review May 26, 2025 14:03
@andrewrk andrewrk self-requested a review May 26, 2025 18:55
Copy link
Contributor

@GalaxyShard GalaxyShard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disclaimer: I'm only a contributor, not a Zig core team member, so take my suggestions with a grain of salt.

lib/std/zip.zig Outdated
Comment on lines 537 to 538
var dir = try extractor.openDir(filename[0 .. filename.len - 1]);
dir.close();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose of this openDir function really isn't to open a directory, but to create directories to form that path, which is evident by the filesystem implementation of openDir below and the original code that was here before these changes.

There's also not much reason to have a close function; any extractor implementation could just call it before returning from it's openDir (or more accurately makePath) function if it was necessary. The filesystem implementation implements it as a no-op anyways.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this.

lib/std/zip.zig Outdated
Comment on lines 548 to 557
var file = try extractor.openFile(filename);
defer file.close();
const crc = try decompress(
self.compression_method,
self.uncompressed_size,
limited_reader.reader(),
out_file.writer(),
file.writer(),
);
if (limited_reader.bytes_left != 0)
return error.ZipDecompressTruncated;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought, but I do wonder if this could be factored out to remove the need for this extractor interface altogether, since all that's really needed here is a writer to decompress the data to.

For example, maybe have extract_to return a struct containing the filename/path, crc, and a "file or directory" enum, then this code that actually does the decompression and the openDir code above that creates the directory could be moved into extract, and the extractor interface can be removed.

Copy link
Author

@l1yefeng l1yefeng May 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decompression depends on the (generalized concept of) opened file and opening the file depends on the filename. Then decompression may have to be moved from extract_to to its caller, too. That's too much for callers I'm afraid.

lib/std/zip.zig Outdated
Comment on lines 587 to 596
fn openFile(self: @This(), name: []u8) !std.fs.File {
if (std.fs.path.dirname(name)) |dirname| {
var parent_dir = try self.dest.makeOpenPath(dirname, .{});
defer parent_dir.close();

const basename = std.fs.path.basename(name);
return try parent_dir.createFile(basename, .{ .exclusive = true });
}
return try self.dest.createFile(name, .{ .exclusive = true });
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this function creates a file and ensures it's parent directory exists, and doesn't open a file in the usual sense, it should probably be renamed as such. Of course if the extractor interface is removed (see comment above), this wouldn't be relevant.

@l1yefeng
Copy link
Author

l1yefeng commented May 28, 2025

After using zip.zig a bit more, I realize that what makes things easier is to have these methods

  • entry.name() that gets the filename. Often callers need the name first. Whether to extract and how to extract depends on the name. On the other hand, callers may also choose to extract_to straight away if they don't care about the name, being rid of the filename buffer.
  • entry.extract_to() that accepts a simple writer instead of extract_context above. Callers can init/deinit as they wish at call sites.
  • entry.extract() still wraps extract_to, roughly goes like
name <- self.name()
if (name ends with '/') {
  create dir
} else {
  file <- create file with name
  defer file.close()
  self.extract_to(..., file.writer())
}

@l1yefeng l1yefeng marked this pull request as draft May 28, 2025 09:57
@l1yefeng l1yefeng marked this pull request as ready for review May 28, 2025 13:26
@l1yefeng l1yefeng changed the title zip.Iterator(_).Entry.extract_to as a more flexible extract zip: Entry.name and Entry.extract_to that accepts user-provided writers Jun 1, 2025
@l1yefeng l1yefeng changed the title zip: Entry.name and Entry.extract_to that accepts user-provided writers zip: Entry.name(); Entry.extract_to() accepting user-provided writers Jun 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants