Skip to content

Two possible improvements in std.fs.File.readToEndAllocOptions #21547

@nunokaeru

Description

@nunokaeru

Zig Version

0.14.0-dev.1573+4d81e8ee9

Steps to Reproduce and Observed Behavior

The issues refers to the following function:

/// Reads all the bytes from the current position to the end of the file.
/// On success, caller owns returned buffer.
/// If the file is larger than `max_bytes`, returns `error.FileTooBig`.
/// If `size_hint` is specified the initial buffer size is calculated using
/// that value, otherwise an arbitrary value is used instead.
/// Allows specifying alignment and a sentinel value.
pub fn readToEndAllocOptions(
    self: File,
    allocator: Allocator,
    max_bytes: usize,
    size_hint: ?usize,
    comptime alignment: u29,
    comptime optional_sentinel: ?u8,
) !(if (optional_sentinel) |s| [:s]align(alignment) u8 else []align(alignment) u8) {
    // If no size hint is provided fall back to the size=0 code path
    const size = size_hint orelse 0;

    // The file size returned by stat is used as hint to set the buffer
    // size. If the reported size is zero, as it happens on Linux for files
    // in /proc, a small buffer is allocated instead.
    const initial_cap = (if (size > 0) size else 1024) + @intFromBool(optional_sentinel != null);
    var array_list = try std.ArrayListAligned(u8, alignment).initCapacity(allocator, initial_cap);
    defer array_list.deinit();

    self.reader().readAllArrayListAligned(alignment, &array_list, max_bytes) catch |err| switch (err) {
        error.StreamTooLong => return error.FileTooBig,
        else => |e| return e,
    };

    if (optional_sentinel) |sentinel| {
        return try array_list.toOwnedSliceSentinel(sentinel);
    } else {
        return try array_list.toOwnedSlice();
    }
}

Issue 1:

  • Sentinel value does not need to be added to initial cap because toOwnedSliceSentinel will create the +1 memory for it. In line: const initial_cap = (if (size > 0) size else 1024) + @intFromBool(optional_sentinel != null);

Issue 2:

  • The following comment might not be correct as we do not use ( try f.stat() ).size as an hint to the initial_cap.
    // The file size returned by stat is used as hint to set the buffer
    // size. If the reported size is zero, as it happens on Linux for files
    // in /proc, a small buffer is allocated instead.
    const initial_cap = (if (size > 0) size else 1024) + @intFromBool(optional_sentinel != null);
    var array_list = try std.ArrayListAligned(u8, alignment).initCapacity(allocator, initial_cap);

Interestingly enough fmtPathFile does exactly what the comment refers to.

fn fmtPathFile(
    fmt: *Fmt,
    file_path: []const u8,
    check_mode: bool,
    dir: fs.Dir,
    sub_path: []const u8,
) FmtError!void {
    const source_file = try dir.openFile(sub_path, .{});
    var file_closed = false;
    errdefer if (!file_closed) source_file.close();

    const stat = try source_file.stat();

    if (stat.kind == .directory)
        return error.IsDir;

    const gpa = fmt.gpa;
    const source_code = try std.zig.readSourceFileToEndAlloc(
        gpa,
        source_file,
        std.math.cast(usize, stat.size) orelse return error.FileTooBig,
    );
    defer gpa.free(source_code);

Expected Behavior

The inital size of the array list to match exactly the size of the file when allowed.

Metadata

Metadata

Assignees

No one assigned

    Labels

    standard libraryThis issue involves writing Zig code for the standard library.

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions