Writergate #24329
macOS uses the BSD definition of msghdr. All Linux architectures share a single msghdr definition. Many architectures had manually inserted padding fields that were endian-specific, and some had fields with different integer types. This unifies all architectures to use a single correct msghdr definition.
preparing to rearrange std.io namespace into an interface
how to upgrade:
* std.io.getStdIn() -> std.fs.File.stdin()
* std.io.getStdOut() -> std.fs.File.stdout()
* std.io.getStdErr() -> std.fs.File.stderr()
added adapter to AnyWriter and GenericWriter to help bridge the gap between old and new API
make std.testing.expectFmt work at compile-time
std.fmt no longer has a dependency on std.unicode. Formatted printing was never properly unicode-aware. Now it no longer pretends to be.
Breakage/deprecations:
* std.fs.File.reader -> std.fs.File.deprecatedReader
* std.fs.File.writer -> std.fs.File.deprecatedWriter
* std.io.GenericReader -> std.io.Reader
* std.io.GenericWriter -> std.io.Writer
* std.io.AnyReader -> std.io.Reader
* std.io.AnyWriter -> std.io.Writer
* std.fmt.format -> std.fmt.deprecatedFormat
* std.fmt.fmtSliceEscapeLower -> std.ascii.hexEscape
* std.fmt.fmtSliceEscapeUpper -> std.ascii.hexEscape
* std.fmt.fmtSliceHexLower -> {x}
* std.fmt.fmtSliceHexUpper -> {X}
* std.fmt.fmtIntSizeDec -> {B}
* std.fmt.fmtIntSizeBin -> {Bi}
* std.fmt.fmtDuration -> {D}
* std.fmt.fmtDurationSigned -> {D}
* {} -> {f} when there is a format method
* format method signature
  - anytype -> *std.io.Writer
  - inferred error set -> error{WriteFailed}
  - options -> (deleted)
* std.fmt.Formatted
  - now takes context type explicitly
  - no fmt string
behavior tests must not depend on std.io
for structs, enums, and unions. auto untagged unions are no longer printed as pointers; instead they are printed as "{ ... }". extern and packed untagged unions have each field printed, similar to what gdb does. also fix bugs in delimiter based reading
it didn't account for data.len can no longer be zero
now it avoids writing to buffer in the case of fixed
So that when returning from drain there is always capacity for at least one more byte.
splat is ignored here.
err: ?Error = null,

fn drain(w: *Writer, data: []const []const u8, splat: usize) Writer.Error!usize {
    _ = splat;
Suggested change:
-    _ = splat;
+    if (splat == 0 and data.len == 1)
+        return 0;
"A write will only be sent here if it could not fit into buffer, or during a "flush" operation." If data.len is 1 and splat is 0 then it would fit into the buffer. Flush sends an empty string for data[0].
Makes sense, thanks for the explanation
err: ?Error = null,

fn drain(w: *std.io.Writer, data: []const []const u8, splat: usize) std.io.Writer.Error!usize {
    _ = splat;
Suggested change:
-    _ = splat;
+    if (splat == 0 and data.len == 1)
+        return 0;
/// a success case.
///
/// Returns total number of bytes written to `w`.
pub fn streamRemaining(r: *Reader, w: *Writer) StreamRemainingError!usize {
This can loop infinitely - here's a reproduction:
// 0. $ touch big_file.txt
// 1. $ truncate -s 1G big_file.txt
// 2. $ zig run repro.zig
const std = @import("std");

pub fn main() !void {
    const file = try std.fs.cwd().openFile("big_file.txt", .{});
    defer file.close();

    var r_buffer: [256]u8 = undefined;
    var file_reader = file.reader(&r_buffer);
    const reader = &file_reader.interface;

    var w_buffer: [256]u8 = undefined;
    var discarding: std.io.Writer.Discarding = .init(&w_buffer);
    const writer = &discarding.writer;

    _ = try reader.streamRemaining(writer);
}
Reader.VTable says that the return value of stream "will be at minimum 0 and at most limit." This return value of course denotes the number of bytes streamed, yet "The number returned, including zero, does not indicate end of stream." - It appears that the stream implementation used here is returning 0 when it should actually be returning Error.EndOfStream.
I guess a further requirement of stream may be to guarantee Error.EndOfStream if it reaches the end of the stream.
What would cause this error on
Log is defined at the top:
const std = @import("std");
const log = std.log.scoped(.foo);
/// Asserts the provided buffer has total capacity enough for `len`.
///
/// Advances the buffer end position by `len`.
pub fn writableArray(w: *Writer, comptime len: usize) Error!*[len]u8 {
so there's missing doc comment on or decision needs to be made that it supports unbuffered
Is there something my user code is doing that could cause this? I think it's just calling the default log method in std?
I see, good find, yes defaultLog should provide a small buffer
try this patch please:
--- a/lib/std/log.zig
+++ b/lib/std/log.zig
@@ -147,16 +147,10 @@ pub fn defaultLog(
 ) void {
     const level_txt = comptime message_level.asText();
     const prefix2 = if (scope == .default) ": " else "(" ++ @tagName(scope) ++ "): ";
-    const stderr = std.fs.File.stderr().deprecatedWriter();
-    var bw = std.io.bufferedWriter(stderr);
-    const writer = bw.writer();
-
-    std.debug.lockStdErr();
-    defer std.debug.unlockStdErr();
-    nosuspend {
-        writer.print(level_txt ++ prefix2 ++ format ++ "\n", args) catch return;
-        bw.flush() catch return;
-    }
+    var buffer: [32]u8 = undefined;
+    const stderr = std.debug.lockStderrWriter(&buffer);
+    defer std.debug.unlockStderrWriter();
+    nosuspend stderr.print(level_txt ++ prefix2 ++ format ++ "\n", args) catch return;
 }
 
 /// Returns a scoped logging namespace that logs all messages using the scope
no need to rebuild the compiler
that fixes the
Where
const std = @import("std");
const print = std.debug.print;
--- a/lib/std/debug.zig
+++ b/lib/std/debug.zig
@@ -222,7 +222,8 @@ pub fn unlockStderrWriter() void {
 /// Print to stderr, unbuffered, and silently returning on failure. Intended
 /// for use in "printf debugging". Use `std.log` functions for proper logging.
 pub fn print(comptime fmt: []const u8, args: anytype) void {
-    const bw = lockStderrWriter(&.{});
+    var buffer: [32]u8 = undefined;
+    const bw = lockStderrWriter(&buffer);
     defer unlockStderrWriter();
     nosuspend bw.print(fmt, args) catch return;
 }
I can confirm that fixes the I actually didn't mean to print the
Thanks, I'll submit those fixes shortly.
Previous Scandal
Summary
Deprecates all existing std.io readers and writers in favor of the newly provided std.io.Reader and std.io.Writer which are non-generic and have the buffer above the vtable - in other words the buffer is in the interface, not the implementation. This means that although Reader and Writer are no longer generic, they are still transparent to optimization; all of the interface functions have a concrete hot path operating on the buffer, and only make vtable calls when the buffer is full.
I have a lot more changes to upstream but it was taking too long to finish them so I decided to do it more piecemeal. Therefore, I opened this tiny baby PR to get things started.
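As a rough illustration of the "buffer is in the interface" design described in the summary, here is a minimal sketch that writes into a caller-provided buffer through a non-generic writer. It assumes the `std.io.Writer.fixed` constructor and the `buffered()` accessor are available under these names (later Zig releases spell the namespace `std.Io`); the `render` helper is hypothetical and exists only for this example.

```zig
const std = @import("std");

// Hypothetical helper: formats a short message into `storage` and returns
// the bytes that have accumulated in the writer's buffer.
fn render(storage: []u8) std.io.Writer.Error![]const u8 {
    // Assumption: `fixed` builds a Writer whose buffer is `storage`.
    // The buffer lives in the interface value itself, so the writes below
    // go straight into it as long as there is room.
    var w: std.io.Writer = .fixed(storage);
    try w.writeAll("answer: ");
    try w.print("{d}", .{42});
    return w.buffered();
}

pub fn main() !void {
    var storage: [64]u8 = undefined;
    const text = try render(&storage);
    std.debug.print("{s}\n", .{text});
}
```

With a fixed buffer there is nowhere to drain to, so a write that does not fit is expected to fail with `error.WriteFailed` rather than grow the buffer.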
These changes are extremely breaking. I am sorry for that, but I have carefully examined the situation and acquired confidence that this is the direction that Zig needs to go. I hope you will strap in your seatbelt and come along for the ride; it will be worth it.
The breakage in this first PR mainly has to do with formatted printing.
Performance Data
Building Self-Hosted Compiler with Itself
Building My Music Player Project
source
Compiler Binary Size (ReleaseSmall)
C Backend Building the Zig Compiler
C Backend Building Hello World
ReleaseFast zig
Debug zig
Upgrade Guide
Turn on -freference-trace to help you find all the format string breakage.
"{f}" Required to Call format Methods
Example:
This will now cause a compile error:
Fixed by:
Motivation: eliminate these two footguns:
* Introducing a format method to a struct caused a bug if there was formatting code somewhere that prints with {} and then starts rendering differently.
* Removing a format method from a struct caused a bug if there was formatting code somewhere that prints with {} and is now changed without notice.
Now, introducing a format method will cause compile errors at all {} sites. In the future, it will have no effect.
Similarly, eliminating a format method will not change any sites that use {}.
Using {f} always tries to call a format method, causing a compile error if none exists.
Format Methods No Longer Have Format Strings or Options
⬇️
The deleted FormatOptions are now for numbers only.
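A minimal sketch of a format method in the new shape, based on the signature changes listed in this PR (concrete `*std.io.Writer`, explicit `error{WriteFailed}` error set via `std.io.Writer.Error`, no format string, no options). The `Point` type is hypothetical and used only for illustration.

```zig
const std = @import("std");

// Hypothetical type, used only for illustration.
const Point = struct {
    x: i32,
    y: i32,

    // New-style format method: it receives only the value and a concrete
    // writer. There is no format string and no options parameter.
    pub fn format(self: Point, writer: *std.io.Writer) std.io.Writer.Error!void {
        try writer.print("({d}, {d})", .{ self.x, self.y });
    }
};

pub fn main() void {
    const p: Point = .{ .x = 1, .y = 2 };
    // "{f}" explicitly asks for the format method to be called.
    std.debug.print("{f}\n", .{p});
}
```

Because the method no longer sees a format string, any per-call variation has to come from somewhere else, which is what the alternatives that follow address.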
For any state that you got from the format string, there are three suggested alternatives:
This can be called with "{f}", .{std.fmt.alt(Foo, .formatB)}.
std.fmt.Alt
This can be called with "{f}", .{foo.bar(1234)}.
{f}.
This can be called with "{f}", .{foo.bar(1234)}.
.Formatted Printing No Longer Deals with Unicode
If you were relying on alignment combined with Unicode codepoints, it is now ASCII/bytes only. The previous implementation was not fully Unicode-aware. If you want to align Unicode strings you need full Unicode support which the standard library does not provide.
Miscellaneous
These are deprecated but not deleted yet:
If you have an old stream and you need a new one, you can use adaptToNewApi() like this:
New API
Formatted Printing
@tagName() and @errorName()
formatNumber method.
std.io.Writer and std.io.Reader
These have a bunch of handy new APIs that are more convenient, perform better, and are not generic. For instance look at how reading until a delimiter works now.
These streams also feature some unique concepts compared with other languages' stream implementations:
std.fs.File.Reader
Memoizes key information about a file handle such as:
sendfile) versus plain variants (e.g. read).
Fulfills the std.io.Reader interface.
This API turned out to be super handy in practice. Having a concrete type to pass around that memoizes file size is really nice.
std.fs.File.Writer
Same idea but for writing.
What's NOT Included in this Branch
This is part of a series of changes leading up to "I/O as an Interface" and Async/Await Resurrection. However, this branch does not do any of that. It also does not do any of these things:
I have done all the above in a separate branch and plan to upstream them one at a time in follow-up PRs, eliminating dependencies on the old streaming APIs like a game of pick-up-sticks.
Merge Checklist: