Allow returning a value with an error #2647

Open
CurtisFenner opened this issue Jun 10, 2019 · 43 comments
@CurtisFenner

CurtisFenner commented Jun 10, 2019

Sometimes when a function fails, there is extra information that you have on hand that may help the caller respond to the problem or produce a diagnostic. For example, in the parseU64 example by andrewrk here,

const ParseError = error {
    InvalidChar,
    Overflow,
};

pub fn parseU64(buf: []const u8, radix: u8) ParseError!u64 {

it would be useful if the function could return the position of the invalid character so that the caller could produce a diagnostic message.

Because Zig treats error types specially, when using errors you get a bunch of nice features, such as ! error-set inference, try/catch, and errdefer; you currently lose these features if you want to return extra diagnostic information since that information is no longer an error type.

While something like index-of-bad-character is less useful for parsing an integer, getting "bad character" with no location when parsing a 2KiB JSON blob is very frustrating! -- this is the current state of the standard library's JSON parser.

Two workarounds exist today for getting this extra information out. Neither is very ergonomic, and both work against Zig's error types:

Workaround 1: Return a tagged union

You could explicitly return a tagged union that has the extra information:

const ParseError = error {
    Overflow,
};

const ParseResult = union(enum) {
    Result: u64,
    InvalidChar: usize,
};

pub fn parseU64(buf: []const u8, radix: u8) ParseError!ParseResult {

This is unfortunate in a number of ways. First, because InvalidChar is no longer an error, you cannot propagate/handle the failure with try/catch. Second, because the InvalidChar case is no longer an error, you cannot use errdefer to clean up partially constructed state in the parser. Finally, calling the function is made messy because it can fail in two separate ways -- either in the error union, or in the explicitly returned union. This means calls that distinguish different errors (as opposed to just propagating with try) need nested switches.
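To make the "two failure channels" point concrete, here is a minimal sketch of Workaround 1 in today's Zig. The body of parseU64 is an assumption written for illustration (only the signature appears above), and the code assumes a reasonably recent compiler (roughly Zig 0.11 or later).

```zig
const std = @import("std");

const ParseError = error{Overflow};

const ParseResult = union(enum) {
    Result: u64,
    InvalidChar: usize,
};

// Hypothetical body for the signature shown above.
pub fn parseU64(buf: []const u8, radix: u8) ParseError!ParseResult {
    var x: u64 = 0;
    var i: usize = 0;
    while (i < buf.len) : (i += 1) {
        // A bad character is reported through the union, not as an error.
        const digit = std.fmt.charToDigit(buf[i], radix) catch
            return ParseResult{ .InvalidChar = i };
        x = std.math.mul(u64, x, radix) catch return error.Overflow;
        x = std.math.add(u64, x, digit) catch return error.Overflow;
    }
    return ParseResult{ .Result = x };
}

test "the caller has to inspect two channels" {
    // Channel 1 is the error union, channel 2 is the returned tagged union.
    if (parseU64("12x4", 10)) |result| {
        switch (result) {
            .Result => return error.TestUnexpectedResult,
            .InvalidChar => |index| try std.testing.expectEqual(@as(usize, 2), index),
        }
    } else |err| switch (err) {
        error.Overflow => return err,
    }
}
```

Note that `try parseU64(...)` would only propagate Overflow here; an InvalidChar result would pass straight through `try` as if it were a success.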

Workaround 2: Write to an out parameter

You could also leave the error set alone, and instead expand the contract of parseU64 to write to an out parameter whenever it returns an InvalidChar error:

pub fn parseU64(buf: []const u8, radix: u8, invalid_char_index: *usize) ParseError!u64 {

However, this makes the function's interface much messier: it now includes mutation, and it becomes impossible to indicate that the function is being called in a way that cannot fail, since the pointer parameter is always required (previously a catch unreachable would have sufficed). Also, it won't be immediately obvious which out parameters are associated with which errors, especially if inferred error sets are being used. In particular, it gives library writers the opportunity to sometimes re-use out parameters (to keep function signatures from growing out of hand) and sometimes not (they cannot when the types differ).
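A minimal sketch of Workaround 2, again with a hypothetical function body (only the signature comes from the text above) and assuming roughly Zig 0.11 or later. It shows both drawbacks: the out parameter is only meaningful for one error, and a call that cannot fail still has to supply it.

```zig
const std = @import("std");

const ParseError = error{ InvalidChar, Overflow };

// Hypothetical body; the contract is "invalid_char_index is written
// only when error.InvalidChar is returned".
pub fn parseU64(buf: []const u8, radix: u8, invalid_char_index: *usize) ParseError!u64 {
    var x: u64 = 0;
    var i: usize = 0;
    while (i < buf.len) : (i += 1) {
        const digit = std.fmt.charToDigit(buf[i], radix) catch {
            invalid_char_index.* = i;
            return error.InvalidChar;
        };
        x = std.math.mul(u64, x, radix) catch return error.Overflow;
        x = std.math.add(u64, x, digit) catch return error.Overflow;
    }
    return x;
}

test "out parameter carries the position" {
    var bad_index: usize = undefined;
    try std.testing.expectError(error.InvalidChar, parseU64("12x4", 10, &bad_index));
    try std.testing.expectEqual(@as(usize, 2), bad_index);

    // A call that cannot fail must still pass a pointer it never reads.
    var unused: usize = undefined;
    const n = parseU64("1234", 10, &unused) catch unreachable;
    try std.testing.expectEqual(@as(u64, 1234), n);
}
```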

Proposal: Associate each error with a type

EDIT: Scroll down to a comment for a refreshed proposal. It looks essentially the same as here but with a bit more detail. The primary difference is not associating errors with value types, but an error within a particular error-set with a type. This means no changes to the anyerror type are necessary.

I propose allowing a type to be associated with each error:

const ParseError = error {
    InvalidChar: usize,
    Overflow, // equivalent to `Overflow: void`
};

pub fn parseU64(buf: []const u8, radix: u8) ParseError!u64 {
    ......
        if (digit >= radix) {
            return error.InvalidChar(index);
        }
    ......

The value returned would be available in switches:

if (parseU64(str, 10)) |number| {
	......
} else |err| switch (err) {
	error.Overflow => {
		......
	},
	error.InvalidChar => |index| {
		......
	}
}

This allows a function which can fail in multiple ways to associate different value types with different kinds of failures, or just return some plain errors that worked how they did before.

With this proposal, the caller can use inferred error sets to automatically propagate extra information, and the callsite isn't made messy with extra out-parameters/an extra non-error failure handling switch. In addition, all of the features special to errors, like errdefer and try/catch, continue to work.

Errors in the global set would now be associated with a type, so that the same error name assigned two different types would be given different error numbers.

I'm not sure what happens when you have an error set with the same name twice with different types. This could possibly be a limited case where "overloading" a single name is OK, since instantiating an error is always zero-cost, but I'll ask what others think.


I'm fairly new to Zig, so some of the details may not be quite right, but hopefully the overall concept and proposal make sense and aren't unfixably broken.

@hryx
Sponsor Contributor

hryx commented Jun 10, 2019

I see potential in that. A world where error sets are just regular unions, but given all the syntax-level amenities of today's errors.

// a regular-ass type
const InvalidChar = struct {
    pos: usize,
};

// an error set containing different types
const ParseError = error {
    InvalidChar: InvalidChar,
    Overflow, // void
};

// merge like ya would today
const Error = ParseError || error{OutOfMemory};

fn f() void {
    parse(something) catch |err| switch (err) {
        .InvalidChar => |e| warn("bad character at {}", e.pos),
        .Overflow => warn("overflow"),
        .OutOfMemory => warn("out of memory"),
    };
}

Taking it further, perhaps all today's good stuff about errors could be applied to any type, not just unions. Maybe the error keyword "taints" a type as an error type. (Although, making errors non-unions would probably have too many effects on the language.)

const SomeError1 = error struct {
    val: usize,
    reason: []const u8,
};

const SomeError2 = error union(enum) {
    OutOfOrder,
    OutOfBounds,
    OutOfIdeas,
};

// today's, which is sugar for above
const SomeError3 = error {
    ResourceExhausted,
    DeadlineExceeded,
};

Because you could now "bloat" an error set with types of larger size, this might affect how strongly use of the global error set is discouraged.

@daurnimator
Collaborator

I remember seeing this proposed before but I can't find the issue for it. Maybe it was only on IRC?

andrewrk added this to the 0.6.0 milestone Jun 10, 2019
andrewrk added the proposal label Jun 10, 2019
@andrewrk
Member

Thank you @CurtisFenner for a well written proposal

@shawnl
Contributor

shawnl commented Jun 10, 2019

This is just a tagged union.

And as they seem so useful, maybe we can add anonymous structs, so we can just use tagged unions instead of multiple return values.

Don't worry about the optimizations here. The compiler can handle that.

@ghost

ghost commented Jun 11, 2019

There's a previous issue here #572 (just for the record)

@emekoi
Contributor

emekoi commented Jun 11, 2019

because errors are assigned a unique value, how about allowing for tagged unions to use errors as the tag value? this would avoid adding new syntax to language and making this feature consistent with other constructs in the language. this tangentially relies on #1945.

/// stolen from above

const ParseError = union(error) {
    InvalidChar: usize,
    Overflow, // equivalent to `Overflow: void`
};

pub fn parseU64(buf: []const u8, radix: u8) ParseError!u64 {
    // ......
        if (digit >= radix) {
            return error{ .InvalidChar = index };
        }
    // ......
}

test "parseU64" {
	if (parseU64(str, 10)) |number| {
		// ......
	} else |err| switch (err) {
		error.Overflow => {
			// ......
		},
		error.InvalidChar => |index| {
			// ......
		}
	}
}

@shawnl
Contributor

shawnl commented Jun 11, 2019

Agreeing with @emekoi, I'd like some syntactic sugar for multiple arguments to an error switch, if the type is defined in the same tagged union:

/// stolen from above

const ParseError = union(error) {
    InvalidChar: InvalidCharStruct,
    Overflow, // equivalent to `Overflow: void`

    pub const InvalidCharStruct = struct {
        i: usize,
        o: bool,
    };
};

pub fn parseU64(buf: []const u8, radix: u8) ParseError!u64 {
    // ......
        if (digit >= radix) {
            return error{ .InvalidChar = .InvalidCharStruct{index, false} };
        }
    // ......
}

test "parseU64" {
	if (parseU64(str, 10)) |number| {
		// ......
	} else |err| switch (err) {
		error.Overflow => {
			// ......
		},
		error.InvalidChar => |index, boolish| {
			// ......
		}
	}
}

@CurtisFenner
Author

I think what @emekoi suggested is excellent, as it removes the need for extra syntax and sidesteps the issues of increasing the size of anyerror and dealing with error names assigned different types, while still enabling the core idea here!

@daurnimator
Collaborator

daurnimator commented Jun 13, 2019

return error{ .InvalidChar = index };

I assume this should be:

return ParseError{ .InvalidChar = index };

Otherwise I love the idea!

@emekoi
Contributor

emekoi commented Jun 15, 2019

that's what i wasn't sure about. would you still have to explicitly name the error even when using an inferred error set? or would you just use error as you normally would with an inferred error set?

@ghost

ghost commented Jul 20, 2019

Not a proposal, but something possible currently: here's a variation on OP's "Workaround 2" (the out parameter). A struct member instead of an "out" parameter. It's still not perfect, but this or Workaround 2 is still the most flexible as they make it possible to allocate memory for the error value (e.g. a formatted error message).

const Thing = struct {
    const ErrorInfo = struct {
        message: []u8,
    };

    error_info: ?ErrorInfo,

    // `allocator` could also be a parameter of an init function
    fn doSomething(self: *Thing, allocator: ...) !void {
        if (bad thing 1) {
            self.error_info = ErrorInfo {
                .message = try ...allocate a string...,
            };
            return error.ThingError;
        } else if (bad thing 2) {
            self.error_info = ErrorInfo {
                .message = try ...allocate a different string...,
            };
            return error.ThingError;
        } else {
            // happy
        }
    }
};

fn caller() void {
    var thing = Thing.init();
    defer thing.deinit(); // free allocated stuff in error_info if present

    thing.doSomething(some_allocator) catch |err| {
        switch (err) { 
            error.ThingError => {
                // this `.?` is the smelliest part of this idea
                std.debug.warn("error: {}\n", thing.error_info.?.message);
            },
            else => {
                // e.g. an OOM error from when we tried to alloc for the error message
                std.debug.warn("some other error\n");
            },
        }
        return;
    };

    std.debug.warn("success\n");
}

This might be a solution for std InStream and OutStream which currently have that annoying generic error parameter?


Also, for parsers and line numbers specifically, you don't need to include the line number in the error value itself. Just maintain it in a struct member and the caller can pull it out when catching. If these struct members aren't exclusive to failed states, then there's no smell at all here.

const Parser = struct {
    ...
    line_index: usize,

    fn parse(self: *Parser) !?Token {
        // continually update line_index, return a regular zig error if something goes wrong
    }
};
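The struct-member idea can be sketched as runnable code. Everything below is a hypothetical stand-in for the elided Parser above: the "tokens" are single ASCII digits, `next` plays the role of `parse`, and the names `input` and `pos` are invented for the example (assuming roughly Zig 0.11 or later).

```zig
const std = @import("std");

const Parser = struct {
    input: []const u8,
    pos: usize = 0,
    // Always kept up to date, so it is valid in success and failure alike.
    line_index: usize = 0,

    /// Returns the next digit, or null at end of input.
    /// Newlines are skipped and counted; anything else is an error.
    fn next(self: *Parser) error{InvalidChar}!?u8 {
        while (self.pos < self.input.len) {
            const c = self.input[self.pos];
            if (c == '\n') {
                self.line_index += 1;
                self.pos += 1;
                continue;
            }
            if (!std.ascii.isDigit(c)) return error.InvalidChar;
            self.pos += 1;
            return c;
        }
        return null;
    }
};

test "caller reads the line number from the parser after a plain error" {
    var parser = Parser{ .input = "12\n3x\n" };
    while (true) {
        const tok = parser.next() catch {
            // The error itself carries nothing; the location lives on the parser.
            try std.testing.expectEqual(@as(usize, 1), parser.line_index);
            return;
        };
        if (tok == null) break;
    }
    return error.TestExpectedError;
}
```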

@Tetralux
Contributor

Tetralux commented Jul 20, 2019

I like @emekoi's suggestion here, but I'll note that I'd like to be able to have parseU64 return !u64 and have the error type inferred, just as we do now, and still be able to do
return error{ .InvalidIndex = index };.

@Tetralux
Contributor

I guess it would actually be return error.InvalidChar{ .index = index }; - But that's still fine by me :)

@emekoi
Contributor

emekoi commented Jul 20, 2019

in your example doSomething can be cleaned up using errdefer

@marler8997
Contributor

I think the issue here can be summarized by noting that zig has 2 concepts that are tied together that probably don't need to be.

  1. Error Control Flow
  2. Error Codes

Zig has some nice constructs that make error control flow easy to work with (try, errdefer, catch, orelse, etc). However, the only way to use them is if you return "Error Codes". If Zig provides a way to enable "Error Control Flow" with more than just "Error Codes" then applications are free to choose the best type to return error information.

Maybe Zig should be able to infer an error set that includes any type, not just error codes?

fn foo() !void{
    if (...)
        return error SomeStruct.init(...);
    if (...)
        return error.SomeErrorCode;
}

@gggin

gggin commented Oct 12, 2019

This C++ proposal has a lot in common with this Zig proposal, so it may be worth considering:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0709r3.pdf

@gggin

gggin commented Oct 25, 2019

I found that the Boost library Outcome also supports custom types.
https://ned14.github.io/outcome/tutorial/advanced/payload/copy_file2/

@daurnimator
Collaborator

I additionally propose that coercing a union(error) to an error should be possible. That way you can still have a "returns all errors" function fn foo() anyerror!void but it only returns the error code, and not the error value.

@nodefish

Is this not typically a job for some kind of interfaces, i.e. allow anything that implements IError to be used for the error control flow syntax? This would play nicely with proposals for wrapped primitives.

@andrewrk
Member

andrewrk commented Feb 21, 2020

Here's a pattern that I would consider to be an alternative to this proposal:

zig/lib/std/target.zig

Lines 717 to 826 in 5709737

pub const ParseOptions = struct {
    /// This is sometimes called a "triple". It looks roughly like this:
    ///     riscv64-linux-gnu
    /// The fields are, respectively:
    /// * CPU Architecture
    /// * Operating System
    /// * C ABI (optional)
    arch_os_abi: []const u8,
    /// Looks like "name+a+b-c-d+e", where "name" is a CPU Model name, "a", "b", and "e"
    /// are examples of CPU features to add to the set, and "c" and "d" are examples of CPU features
    /// to remove from the set.
    cpu_features: []const u8 = "baseline",
    /// If this is provided, the function will populate some information about parsing failures,
    /// so that user-friendly error messages can be delivered.
    diagnostics: ?*Diagnostics = null,
    pub const Diagnostics = struct {
        /// If the architecture was determined, this will be populated.
        arch: ?Cpu.Arch = null,
        /// If the OS was determined, this will be populated.
        os: ?Os = null,
        /// If the ABI was determined, this will be populated.
        abi: ?Abi = null,
        /// If the CPU name was determined, this will be populated.
        cpu_name: ?[]const u8 = null,
        /// If error.UnknownCpuFeature is returned, this will be populated.
        unknown_feature_name: ?[]const u8 = null,
    };
};
pub fn parse(args: ParseOptions) !Target {
    var dummy_diags: ParseOptions.Diagnostics = undefined;
    var diags = args.diagnostics orelse &dummy_diags;
    var it = mem.separate(args.arch_os_abi, "-");
    const arch_name = it.next() orelse return error.MissingArchitecture;
    const arch = try Cpu.Arch.parse(arch_name);
    diags.arch = arch;
    const os_name = it.next() orelse return error.MissingOperatingSystem;
    const os = try Os.parse(os_name);
    diags.os = os;
    const abi_name = it.next();
    const abi = if (abi_name) |n| try Abi.parse(n) else Abi.default(arch, os);
    diags.abi = abi;
    if (it.next() != null) return error.UnexpectedExtraField;
    const all_features = arch.allFeaturesList();
    var index: usize = 0;
    while (index < args.cpu_features.len and
        args.cpu_features[index] != '+' and
        args.cpu_features[index] != '-')
    {
        index += 1;
    }
    const cpu_name = args.cpu_features[0..index];
    diags.cpu_name = cpu_name;
    const cpu: Cpu = if (mem.eql(u8, cpu_name, "baseline")) Cpu.baseline(arch) else blk: {
        const cpu_model = try arch.parseCpuModel(cpu_name);
        var set = cpu_model.features;
        while (index < args.cpu_features.len) {
            const op = args.cpu_features[index];
            index += 1;
            const start = index;
            while (index < args.cpu_features.len and
                args.cpu_features[index] != '+' and
                args.cpu_features[index] != '-')
            {
                index += 1;
            }
            const feature_name = args.cpu_features[start..index];
            for (all_features) |feature, feat_index_usize| {
                const feat_index = @intCast(Cpu.Feature.Set.Index, feat_index_usize);
                if (mem.eql(u8, feature_name, feature.name)) {
                    switch (op) {
                        '+' => set.addFeature(feat_index),
                        '-' => set.removeFeature(feat_index),
                        else => unreachable,
                    }
                    break;
                }
            } else {
                diags.unknown_feature_name = feature_name;
                return error.UnknownCpuFeature;
            }
        }
        set.populateDependencies(all_features);
        break :blk .{
            .arch = arch,
            .model = cpu_model,
            .features = set,
        };
    };
    var cross = Cross{
        .cpu = cpu,
        .os = os,
        .abi = abi,
    };
    return Target{ .Cross = cross };
}

I think it's quite reasonable.

Edit: here's usage example at the callsite:

var diags: std.Target.ParseOptions.Diagnostics = .{};
break :blk Target.parse(.{
    .arch_os_abi = zig_triple,
    .cpu_features = mcpu,
    .diagnostics = &diags,
}) catch |err| switch (err) {
    error.UnknownCpu => {
        std.debug.warn("Unknown CPU: '{}'\nAvailable CPUs for architecture '{}':\n", .{
            diags.cpu_name.?,
            @tagName(diags.arch.?),
        });
        for (diags.arch.?.allCpuModels()) |cpu| {
            std.debug.warn(" {}\n", .{cpu.name});
        }
        process.exit(1);
    },
    error.UnknownCpuFeature => {
        std.debug.warn(
            \\Unknown CPU feature: '{}'
            \\Available CPU features for architecture '{}':
            \\
        , .{
            diags.unknown_feature_name,
            @tagName(diags.arch.?),
        });
        for (diags.arch.?.allFeaturesList()) |feature| {
            std.debug.warn(" {}: {}\n", .{ feature.name, feature.description });
        }
        process.exit(1);
    },
    else => |e| return e,
};
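For readers who want the shape of this pattern without the Target details, here is a stripped-down sketch applied to the parseU64 example from the top of the thread. It is not code from the standard library: the `text`, `radix`, and `invalid_char_index` names are invented for the example, and it assumes roughly Zig 0.11 or later.

```zig
const std = @import("std");

const ParseOptions = struct {
    text: []const u8,
    radix: u8 = 10,
    /// Optional: callers that want details pass a pointer, others pass nothing.
    diagnostics: ?*Diagnostics = null,

    const Diagnostics = struct {
        /// Populated if error.InvalidChar is returned.
        invalid_char_index: ?usize = null,
    };
};

fn parseU64(options: ParseOptions) error{ InvalidChar, Overflow }!u64 {
    // Writing into a throwaway struct keeps the body free of null checks.
    var dummy: ParseOptions.Diagnostics = .{};
    const diags = options.diagnostics orelse &dummy;

    var x: u64 = 0;
    var i: usize = 0;
    while (i < options.text.len) : (i += 1) {
        const digit = std.fmt.charToDigit(options.text[i], options.radix) catch {
            diags.invalid_char_index = i;
            return error.InvalidChar;
        };
        x = std.math.mul(u64, x, options.radix) catch return error.Overflow;
        x = std.math.add(u64, x, digit) catch return error.Overflow;
    }
    return x;
}

test "diagnostics are opt-in" {
    var diags: ParseOptions.Diagnostics = .{};
    try std.testing.expectError(
        error.InvalidChar,
        parseU64(.{ .text = "12x4", .diagnostics = &diags }),
    );
    try std.testing.expectEqual(@as(?usize, 2), diags.invalid_char_index);

    // Callers that do not care about details simply omit the field.
    try std.testing.expectEqual(@as(u64, 1234), try parseU64(.{ .text = "1234" }));
}
```

Compared with a bare out parameter, the optional pointer lets a caller ignore diagnostics entirely, but the link between each error and the field it populates still lives only in doc comments.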

@shawnl
Contributor

shawnl commented Feb 21, 2020

The problem with returning a value with an error, is that it is the same as returning a polymorphic type, and defining that return type inside the function, instead of in the function signature. While I think we still need inferred return types (#447) for some things, this is an advanced feature, and fancier patterns, like the example, should be required in order to utilize these features.

We also need inferred types as a way of sneaking in multiple return values (through the anonymous structs we already have), which LLVM supports, but in C requires an ugly one-use-but-defined struct (and where the C ABI layout of that struct is ignored by the optimization pass).

@CurtisFenner
Author

CurtisFenner commented Feb 27, 2020

I think it's reasonable, but I think it could be better by making minimal changes to the language. Specifically, I think Zig is already expressive enough to more tightly reflect the function's interface in its type signature; we just need to apply the right already-existing features. My two main complaints with what you can currently achieve:

  • IDEs cannot help you associate returned errors to particular fields on the returned struct
  • You cannot use a tagged union, because you want to avoid the "happy path" fields being behind a superfluous .successful case. However, using a struct instead of a tagged union means
    • Understanding how the struct is populated requires reading the documentation (if it can be trusted) or otherwise the code, whereas a tagged-union is an established pattern that doesn't require you to look elsewhere
    • You don't get immediate runtime checks that you are only reading from the correct field (see above); this requires boilerplate for setting defaults or setting up undefined
    • You potentially waste memory in the struct layout for diagnostic fields that aren't populated in other error cases / in success

An error set like error { A, B } is essentially an enum.
An error union like A!B is already essentially a tagged union, union(enum) { success: B, err: A }.

This (modified) proposal is to optionally transform the "error set" concept into an "error union" concept -- noting that a tagged union with all void field types is essentially the same thing as an enum; i.e., we have just strengthened an existing concept (error sets) to another existing concept (tagged unions).

I don't think it's necessary to automatically generate the union, as emekoi suggested earlier -- we just use unions, but with tags that are errors instead of enums.

The example would look something like

pub const DiagnosticsErr = union(error) {
    UnknownCpuFeature: ?[]const u8,
    MissingArchitecture: void,
    // same as UnknownCpu: void
    UnknownCpu,
};

pub fn parse(args: ParseOptions) !Target { // infer DiagnosticsErr!Target
    ......
    // equivalent to `return DiagnosticsErr{.MissingArchitecture={}};`
    return error.MissingArchitecture;
    ......
    return DiagnosticsErr{.UnknownCpuFeature = feature_name};
}


var result = Target.parse(.......) catch |err| switch (err) {
    error.UnknownCpuFeature => |unknown_feature_name| {
        ......
    },
    else => |e| return e,
};

I think this is a relatively small change to the language (since it re-uses existing concepts and is totally backwards compatible) to make some situations much clearer without any additional runtime/compiletime cost.

@emekoi
Contributor

emekoi commented Apr 15, 2020

related: #786

motiejus added a commit to motiejus/zig that referenced this issue Jun 22, 2022
Over the last couple of weeks I needed to iterate over a
collection backwards at least twice. Do we want to have this in stdlib?
If yes, click "Merge" and start using today! Free shipping and returns
(before 1.0).

Why is this useful?
-------------------

I need this for building an error wrapper: errors are added in the
wrapper from "lowest" level to "highest" level, and then printed in
reverse order. Imagine `UpdateUsers` call, which needs to return
`error.InvalidInput` and a wrappable error context. In Go we would add a
context to the error when returning it:

    // if update_user fails, add context on which user we are operating
    if err := update_user(user); err != nil {
        return fmt.Errorf("user id=%d: %w", user.id, err)
    }

Since Zig cannot pass anything other than a u16 with an error (ziglang#2647), I
will pass an `err_ctx: *Err` to the callers, where they can, besides
returning an error, augment it with auxiliary data. `Err` is a
preallocated array that can add zero-byte-separated strings. For a
concrete example, imagine such a call graph:

    update_user(User, *Err) error{InvalidInput}!<...>
      validate_user([]const u8, *Err) error{InvalidInput}!<...>

Where `validate_user` would like, besides only the error, signal the
invalid field. And `update_user`, besides the error, would signal the
offending user id.

We also don't want the low-level functions to know in which context they
are operating to construct a meaningful error message: if validation
fails, they append their "context" to the buffer. To translate/augment
the Go example above:

    pub fn validate_user(err_ctx: *Err, user: User) error{InvalidInput}!void {
        const name = user.name;
        if (!ascii.isAlpha(name)) {
            err_ctx.print("name '{s}' must be ascii-letters only", .{name});
            return error.InvalidInput;
        }
        <...>
    }

    // update_user validates each user and does something with it.
    pub fn update_user(err_ctx: *Err, user: User) error{InvalidInput}!void {
        // validate the user before updating it
        validate_user(err_ctx, user) catch {
            err_ctx.print("user id={d}", .{user.id});
            return error.InvalidInput;
        };
        <...>
    }

Then the top-level function (in my case, CLI) will read the buffer
backwards (splitting on "\x00") and print:

    user id=123: name 'žemas' must be ascii-letters only

To read that buffer backwards, dear readers of this commit message, I
need `mem.split_rev`.
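
The buffer described in this commit message can be sketched as follows. This is not the author's `Err` type: `ErrCtx`, `add`, and `written` are hypothetical names, the 256-byte capacity is arbitrary, and the reverse walk uses `std.mem.splitBackwardsScalar`, which is assumed to be available (roughly Zig 0.11 or later) in place of the `mem.split_rev` mentioned above.

```zig
const std = @import("std");

// Hypothetical fixed-size context buffer holding zero-separated messages.
const ErrCtx = struct {
    buf: [256]u8 = undefined,
    len: usize = 0,

    /// Appends msg followed by a zero byte. Messages that do not fit are dropped.
    fn add(self: *ErrCtx, msg: []const u8) void {
        const needed = msg.len + 1;
        if (self.len + needed > self.buf.len) return;
        @memcpy(self.buf[self.len..][0..msg.len], msg);
        self.buf[self.len + msg.len] = 0;
        self.len += needed;
    }

    /// The stored bytes without the trailing separator.
    /// Splitting an empty result still yields one empty entry.
    fn written(self: *const ErrCtx) []const u8 {
        if (self.len == 0) return self.buf[0..0];
        return self.buf[0 .. self.len - 1];
    }
};

test "context is appended low-to-high and read high-to-low" {
    var ctx = ErrCtx{};
    ctx.add("name must be ascii-letters only"); // added by the low-level function
    ctx.add("user id=123"); // added by its caller

    var it = std.mem.splitBackwardsScalar(u8, ctx.written(), 0);
    try std.testing.expectEqualStrings("user id=123", it.next().?);
    try std.testing.expectEqualStrings("name must be ascii-letters only", it.next().?);
    try std.testing.expect(it.next() == null);
}
```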
motiejus added a commit to motiejus/zig that referenced this issue Jun 22, 2022
Over the last couple of weeks weeks I needed to iterate over a
collection backwards at least twice. Do we want to have this in stdlib?
If yes, click "Merge" and start using today! Free shipping and returns
(before 1.0).

Why is this useful?
-------------------

I need this for building an error wrapper: errors are added in the
wrapper from "lowest" level to "highest" level, and then printed in
reverse order. Imagine `UpdateUsers` call, which needs to return
`error.InvalidInput` and a wrappable error context. In Go we would add a
context to the error when returning it:

    // if update_user fails, add context on which user we are operating
    if err := update_user(user); err != nil {
        return fmt.Errorf("user id=%d: %w", user.id, err)
    }

Since Zig cannot pass anything else than u16 with an error (ziglang#2647), I
will pass a `err_ctx: *ErrCtx`, to the callers, where they can, besides
returning an error, augment it with auxiliary data. `ErrCtx` is a
preallocated array that can add zero-byte-separated strings. For a
concrete example, imagine such a call graph:

    update_user(User, *Err) error{InvalidInput}!<...>
      validate_user([]const u8, *Err) error{InvalidInput}!<...>

Where `validate_user` would like, besides only the error, signal the
invalid field. And `update_user`, besides the error, would signal the
offending user id.

We also don't want the low-level functions to know in which context they
are operating to construct a meaningful error message: if validation
fails, they append their "context" to the buffer. To translate/augment
the Go example above:

    pub fn validate_user(err_ctx: *Err, user: User) error{InvalidInput}!void {
        if (!ascii.isAlpha(name)) {
            err_ctx.print("name '{s}' must be ascii-letters only", .{name});
            return error.InvalidInput;
        }
        <...>
    }

    // update_user validates each user and does something with it.
    pub fn update_user(err_ctx: *Err, user: User) error{InvalidInput}!void {
        // validate the user before updating it
        validate_user(user) catch {
            err_ctx.print("user id={d}", .{user.id});
            return error.InvalidInput;
        };
        <...>
    }

Then the top-level function (in my case, CLI) will read the buffer
backwards (splitting on `"\x00"`) and print:

    user id=123: name 'žemas' must be ascii-letters only

To read that buffer backwards, dear readers of this commit message, I
need `mem.split_rev`.
motiejus added a commit to motiejus/zig that referenced this issue Jun 22, 2022
Over the last couple of weeks weeks I needed to iterate over a
collection backwards at least twice. Do we want to have this in stdlib?
If yes, click "Merge" and start using today! Free shipping and returns
(before 1.0).

Why is this useful?
-------------------

I need this for building an error wrapper: errors are added in the
wrapper from "lowest" level to "highest" level, and then printed in
reverse order. Imagine `UpdateUsers` call, which needs to return
`error.InvalidInput` and a wrappable error context. In Go we would add a
context to the error when returning it:

    // if update_user fails, add context on which user we are operating
    if err := update_user(user); err != nil {
        return fmt.Errorf("user id=%d: %w", user.id, err)
    }

Since Zig cannot pass anything else than u16 with an error (ziglang#2647), I
will pass a `err_ctx: *Err`, to the callers, where they can, besides
returning an error, augment it with auxiliary data. `Err` is a
preallocated array that can add zero-byte-separated strings. For a
concrete example, imagine such a call graph:

    update_user(User, *Err) error{InvalidInput}!<...>
      validate_user([]const u8, *Err) error{InvalidInput}!<...>

Where `validate_user` would like, besides only the error, signal the
invalid field. And `update_user`, besides the error, would signal the
offending user id.

We also don't want the low-level functions to know in which context they
are operating to construct a meaningful error message: if validation
fails, they append their "context" to the buffer. To translate/augment
the Go example above:

    pub fn validate_user(err_ctx: *Err, user: User) error{InvalidInput}!void {
        if (!ascii.isAlpha(name)) {
            err_ctx.print("name '{s}' must be ascii-letters only", .{name});
            return error.InvalidInput;
        }
        <...>
    }

    // update_user validates each user and does something with it.
    pub fn update_user(err_ctx: *Err, user: User) error{InvalidInput}!void {
        // validate the user before updating it
        validate_user(user) catch {
            err_ctx.print("user id={d}", .{user.id});
            return error.InvalidInput;
        };
        <...>
    }

Then the top-level function (in my case, CLI) will read the buffer
backwards (splitting on `"\x00"`) and print:

    user id=123: name 'žemas' must be ascii-letters only

To read that buffer backwards, dear readers of this commit message, I
need `mem.split_rev`.
motiejus added a commit to motiejus/zig that referenced this issue Jun 22, 2022
Over the last couple of weeks weeks I needed to iterate over a
collection backwards at least twice. Do we want to have this in stdlib?
If yes, click "Merge" and start using today! Free shipping and returns
(before 1.0).

Why is this useful?
-------------------

I need this for building an error wrapper: errors are added in the
wrapper from "lowest" level to "highest" level, and then printed in
reverse order. Imagine `UpdateUsers` call, which needs to return
`error.InvalidInput` and a wrappable error context. In Go we would add a
context to the error when returning it:

    // if update_user fails, add context on which user we are operating
    if err := update_user(user); err != nil {
        return fmt.Errorf("user id=%d: %w", user.id, err)
    }

Since Zig cannot pass anything else than u16 with an error (ziglang#2647), I
will pass a `err_ctx: *Err`, to the callers, where they can, besides
returning an error, augment it with auxiliary data. `Err` is a
preallocated array that can add zero-byte-separated strings. For a
concrete example, imagine such a call graph:

    update_user(User, *Err) error{InvalidInput}!<...>
      validate_user([]const u8, *Err) error{InvalidInput}!<...>

Where `validate_user` would like, besides only the error, signal the
invalid field. And `update_user`, besides the error, would signal the
offending user id.

We also don't want the low-level functions to know in which context they
are operating to construct a meaningful error message: if validation
fails, they append their "context" to the buffer. To translate/augment
the Go example above:

    pub fn validate_user(err_ctx: *Err, user: User) error{InvalidInput}!void {
        const name = user.name;
        if (!ascii.isAlpha(name)) {
            err_ctx.print("name '{s}' must be ascii-letters only", .{name});
            return error.InvalidInput;
        }
        <...>
    }

    // update_user validates each user and does something with it.
    pub fn update_user(err_ctx: *Err, user: User) error{InvalidInput}!void {
        // validate the user before updating it
        validate_user(err_ctx, user) catch {
            err_ctx.print("user id={d}", .{user.id});
            return error.InvalidInput;
        };
        <...>
    }

Then the top-level function (in my case, CLI) will read the buffer
backwards (splitting on `"\x00"`) and print:

    user id=123: name 'žemas' must be ascii-letters only

To read that buffer backwards, dear readers of this commit message, I
need `mem.split_rev`.
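
The error-context buffer described in the commit message above could look roughly like this. It is only a sketch under assumptions: the layout of `Err`, its 256-byte capacity, the `render` helper and the `User` struct are hypothetical and not taken from the commit. The backwards walk is written by hand so the example does not depend on which name the std function ended up with.

```zig
const std = @import("std");

/// Hypothetical error-context buffer: fixed storage that holds
/// zero-byte-separated messages, appended from the lowest level upward.
const Err = struct {
    buf: [256]u8 = undefined,
    len: usize = 0,

    /// Appends one formatted message followed by a 0 byte.
    /// A message that does not fit is dropped (an assumption of this sketch).
    fn print(self: *Err, comptime fmt: []const u8, args: anytype) void {
        const written = std.fmt.bufPrint(self.buf[self.len..], fmt, args) catch return;
        const end = self.len + written.len;
        if (end >= self.buf.len) return; // no room left for the separator
        self.buf[end] = 0;
        self.len = end + 1;
    }

    /// Joins the stored messages into `out`, newest first, separated by ": ".
    fn render(self: *const Err, out: []u8) []const u8 {
        var n: usize = 0;
        var end = self.len;
        while (end > 0) {
            const msg_end = end - 1; // index of the trailing 0 byte
            var start = msg_end;
            while (start > 0 and self.buf[start - 1] != 0) start -= 1;
            const msg = self.buf[start..msg_end];
            const sep: []const u8 = if (n == 0) "" else ": ";
            if (n + sep.len + msg.len > out.len) break;
            for (sep) |c| {
                out[n] = c;
                n += 1;
            }
            for (msg) |c| {
                out[n] = c;
                n += 1;
            }
            end = start;
        }
        return out[0..n];
    }
};

const User = struct { id: u32, name: []const u8 };

fn isAsciiLetter(c: u8) bool {
    return (c >= 'a' and c <= 'z') or (c >= 'A' and c <= 'Z');
}

fn validate_user(err_ctx: *Err, user: User) error{InvalidInput}!void {
    for (user.name) |c| {
        if (!isAsciiLetter(c)) {
            err_ctx.print("name '{s}' must be ascii-letters only", .{user.name});
            return error.InvalidInput;
        }
    }
}

fn update_user(err_ctx: *Err, user: User) error{InvalidInput}!void {
    // The low-level function adds its context first; we add ours on the way up.
    validate_user(err_ctx, user) catch {
        err_ctx.print("user id={d}", .{user.id});
        return error.InvalidInput;
    };
}

pub fn main() void {
    var ctx = Err{};
    update_user(&ctx, .{ .id = 123, .name = "žemas" }) catch {
        var out: [256]u8 = undefined;
        // prints: user id=123: name 'žemas' must be ascii-letters only
        std.debug.print("{s}\n", .{ctx.render(&out)});
    };
}
```

The error set stays a plain `error{InvalidInput}`, so `try`, `catch` and `errdefer` keep working; the cost is that every function on the path has to accept and forward the extra pointer.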
jedisct1 pushed a commit that referenced this issue Jun 29, 2022
* mem: refactor tests of split()

- add a few cases for .rest()
- use expectEqualSlices()

* mem: add splitBackwards

andrewrk pushed a commit that referenced this issue Jul 19, 2022
* mem: refactor tests of split()

- add a few cases for .rest()
- use expectEqualSlices()

* mem: add splitBackwards

wooster0 pushed a commit to wooster0/zig that referenced this issue Jul 24, 2022
* mem: refactor tests of split()

- add a few cases for .rest()
- use expectEqualSlices()

* mem: add splitBackwards

@Jarred-Sumner
Contributor

Some examples where returning a value with an error would be helpful:

If std.base64.Base64Decoder.decode returns an error, how do you tell the user how many bytes were successfully decoded?

zig/lib/std/base64.zig

Lines 189 to 226 in 94aeb6e

pub fn decode(decoder: *const Base64Decoder, dest: []u8, source: []const u8) Error!void {
    if (decoder.pad_char != null and source.len % 4 != 0) return error.InvalidPadding;
    var acc: u12 = 0;
    var acc_len: u4 = 0;
    var dest_idx: usize = 0;
    var leftover_idx: ?usize = null;
    for (source) |c, src_idx| {
        const d = decoder.char_to_index[c];
        if (d == invalid_char) {
            if (decoder.pad_char == null or c != decoder.pad_char.?) return error.InvalidCharacter;
            leftover_idx = src_idx;
            break;
        }
        acc = (acc << 6) + d;
        acc_len += 6;
        if (acc_len >= 8) {
            acc_len -= 8;
            dest[dest_idx] = @truncate(u8, acc >> acc_len);
            dest_idx += 1;
        }
    }
    if (acc_len > 4 or (acc & (@as(u12, 1) << acc_len) - 1) != 0) {
        return error.InvalidPadding;
    }
    if (leftover_idx == null) return;
    var leftover = source[leftover_idx.?..];
    if (decoder.pad_char) |pad_char| {
        const padding_len = acc_len / 2;
        var padding_chars: usize = 0;
        for (leftover) |c| {
            if (c != pad_char) {
                return if (c == Base64Decoder.invalid_char) error.InvalidCharacter else error.InvalidPadding;
            }
            padding_chars += 1;
        }
        if (padding_chars != padding_len) return error.InvalidPadding;
    }
}

If std.fs.File.writeAll returns an error, how do you tell the user how many bytes were written successfully?

If a developer writes try std.fs.openFileAbsolute(path, flags), how do you tell the user which file path failed to open? You'd have to wrap every call in another function that logs errors separately.

                var body_file = std.fs.openFileAbsoluteZ(absolute_path_, .{ .mode = .read_only }) catch |err| {
                    Output.printErrorln("<r><red>{s}<r> opening file {s}", .{ @errorName(err), absolute_path });
                    Global.exit(1);
                };
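
One way to get at "how many bytes were processed" today is an optional out-parameter that the function fills in before returning the error. The following is a rough sketch of that idea on a deliberately small hex decoder, not the std implementation: `Diagnostics`, `report` and `decodeHex` are hypothetical names invented for illustration.

```zig
const std = @import("std");

/// Hypothetical out-parameter carrying details an error code cannot.
const Diagnostics = struct {
    /// Number of bytes written to `dest` before the failure.
    bytes_written: usize = 0,
    /// Index in `source` of the offending character.
    bad_index: usize = 0,
};

const DecodeError = error{ InvalidCharacter, InvalidLength, NoSpaceLeft };

/// Records the failure position if the caller asked for diagnostics.
fn report(diag: ?*Diagnostics, written: usize, bad_index: usize) void {
    if (diag) |d| {
        d.bytes_written = written;
        d.bad_index = bad_index;
    }
}

/// Decodes pairs of hex digits from `source` into `dest` and returns the
/// number of bytes written. On failure, `diag` (if non-null) is filled in.
fn decodeHex(dest: []u8, source: []const u8, diag: ?*Diagnostics) DecodeError!usize {
    if (source.len % 2 != 0) return error.InvalidLength;
    var i: usize = 0;
    while (i < source.len) : (i += 2) {
        const out = i / 2;
        if (out >= dest.len) {
            report(diag, out, i);
            return error.NoSpaceLeft;
        }
        const hi = std.fmt.charToDigit(source[i], 16) catch {
            report(diag, out, i);
            return error.InvalidCharacter;
        };
        const lo = std.fmt.charToDigit(source[i + 1], 16) catch {
            report(diag, out, i + 1);
            return error.InvalidCharacter;
        };
        dest[out] = hi * 16 + lo;
    }
    return source.len / 2;
}

pub fn main() void {
    var dest: [8]u8 = undefined;
    var diag = Diagnostics{};
    const n = decodeHex(&dest, "0aff_z", &diag) catch |err| {
        // prints: InvalidCharacter at source index 4; 2 bytes decoded
        std.debug.print("{s} at source index {d}; {d} bytes decoded\n", .{
            @errorName(err), diag.bad_index, diag.bytes_written,
        });
        return;
    };
    std.debug.print("decoded {d} bytes\n", .{n});
}
```

Callers that do not care pass `null` and pay nothing extra, but nothing ties a particular error to the fields that are valid for it; that link exists only by convention.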

@scheibo
Sponsor Contributor

scheibo commented Feb 25, 2023

I appreciate that this is a difficult issue (Spec's explanation on the Zig Discord helped me understand the challenges), but I really feel the current state of things is unsatisfactory. I just spent way longer than I should have debugging why my project's build did not work on Windows, given that all I had to work with from the zig compiler was an error: AccessDenied and the build command that failed. When I finally gave up and switched to rewriting and then debugging things through Node, the error it returned was EBUSY together with the specific path that Windows considered to be busy, which made the problem actually tractable [1]. While the obvious answer here is "The Zig compiler is a work in progress and eventually we will improve our error messages using the diagnostic pattern proposed above..." (or perhaps that this is some Windows-specific issue, etc.), I think the fact that even the compiler can't consistently implement this pattern points to it perhaps being too manual/tedious/unergonomic/difficult to expect the Zig ecosystem at large to do the same.

(sorry for not being able to propose a concrete solution here, I just felt the need to bump this issue with a use case after a particularly unpleasant debugging experience)

Footnotes

  1. EDIT: from looking deeper at this I think maybe one problem is Zig mapping USER_MAPPED_FILE -> AccessDenied - I think if I had gotten the more specific error here I probably would have been able to solve this even without the filename, though the filename + the exact error code would obviously have been the best option

@karlseguin
Contributor

I think this has already been captured, but I'm dealing with error values that have resources that must be freed. I'm not sure how this would work with try.

Specifically, I've written a driver for duckdb. When failing to execute a query, duckdb provides a detailed error message. That message is valid until the underlying prepared statement is freed. Whether I dupe the error message or keep a reference to the prepared statement, I need to free something.

I'm using a Result / tagged union, but it's a bit cumbersome to consume.
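
For readers who have not seen this workaround, a Result-style tagged union whose error arm owns a message might look roughly like the sketch below. It is not the duckdb driver's API: `Failure`, `Result` and `runQuery` are hypothetical names, and the example assumes a Zig version where `std.mem.Allocator` is passed by value.

```zig
const std = @import("std");

/// Hypothetical error payload that owns a heap-allocated message.
const Failure = struct {
    code: anyerror,
    message: []const u8, // owned; released by deinit
    allocator: std.mem.Allocator,

    fn deinit(self: Failure) void {
        self.allocator.free(self.message);
    }
};

/// Either a value of type T or a Failure that the caller must deinit.
fn Result(comptime T: type) type {
    return union(enum) {
        ok: T,
        err: Failure,
    };
}

/// Stand-in for a driver call: an empty query fails with a detailed message.
fn runQuery(allocator: std.mem.Allocator, sql: []const u8) error{OutOfMemory}!Result(u32) {
    if (sql.len == 0) {
        const message = try allocator.dupe(u8, "empty query");
        return Result(u32){ .err = .{
            .code = error.InvalidQuery,
            .message = message,
            .allocator = allocator,
        } };
    }
    return Result(u32){ .ok = 1 };
}

pub fn main() !void {
    const allocator = std.heap.page_allocator;
    switch (try runQuery(allocator, "")) {
        .ok => |rows| std.debug.print("{d} rows\n", .{rows}),
        .err => |failure| {
            defer failure.deinit();
            // prints: InvalidQuery: empty query
            std.debug.print("{s}: {s}\n", .{ @errorName(failure.code), failure.message });
        },
    }
}
```

The cumbersome part shows up at the call site: the failure is an ordinary value, so `try` cannot propagate it, and every caller has to `switch` and remember to call `deinit`.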

@VisenDev

I agree that something to this effect would be great, particularly for failed parsing, writing to a file, etc.

I have also experienced similarly opaque error messages when parsing JSON, with no hint of where in the JSON file the error occurred.

@ni-vzavalishin

I was thinking along the lines of @marler8997's suggestions, but from the perspective that this approach can actually solve the problem of merging conflicting error sets (which is unsolved in the OP's proposal, and I find it a bit difficult to accept leaving it unsolved). Once we allow other types to be used as members of error sets, there is no problem with conflicts, since each type is globally unique in a natural way. The syntax could look, for example, as follows (I do not insist on this specific form, which is based on @marler8997's; it is just an example):

const MyErrorSet = error {
    SomeErrorId,
    AnotherErrorId,
    error ErrorIdWithPayload,
};
const ErrorIdWithPayload = struct {
    payload_field1: []const u8,
    payload_field2: usize,
};
fn fail() MyErrorSet!void {
    return error ErrorIdWithPayload{ .payload_field1 = "", .payload_field2 = 0 };
}

fn func() void {
    fail() catch |err| switch(err) {
        error.SomeErrorId, error.AnotherErrorId => {},
        error ErrorIdWithPayload => |payload| {
            std.debug.print("{s} {}\n", .{ payload.payload_field1, payload.payload_field2 });
        }
    };
}

Notice that, unlike SomeErrorId and AnotherErrorId, ErrorIdWithPayload is globally unique and won't clash with another identifier called ErrorIdWithPayload defined elsewhere, even when the error sets are merged. It is used in exactly the same way as any other identifier. So, if func were in another file, one would have to qualify it with a namespace:

const fail_example = @import("fail_example.zig");

fn func() void {
    fail_example.fail() catch |err| switch(err) {
        error.SomeErrorId, error.AnotherErrorId => {},
        error fail_example.ErrorIdWithPayload => |payload| {
            std.debug.print("{s} {}\n", .{ payload.payload_field1, payload.payload_field2 });
        }
    };
}
