error sets #632

Closed
andrewrk opened this issue Nov 29, 2017 · 32 comments · Fixed by #759
Labels
accepted: This proposal is planned.
breaking: Implementing this issue could cause existing code to no longer compile or have different behavior.
proposal: This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@andrewrk
Member

andrewrk commented Nov 29, 2017

See #632 (comment) for the current iteration of this proposal.

Everybody hates %. What can we do about it?

  • leave % alone as far as types go. %T is still how you make an error union on T.
  • replace %%x with tryfail x
  • replace %return x with tryret x
  • replace a %% b with a tryor b
  • replace %defer with trydefer
@andrewrk added the proposal label Nov 29, 2017
@andrewrk added this to the 0.3.0 milestone Nov 29, 2017
@PavelVozenilek

It would be handy for readability if these new keywords could be also written as try-fail, try-ret, ..., or at least as try_fail, try_ret, ...

@andrewrk
Member Author

Proposal iteration:

  • Keep a %% b because it mirrors a ?? b which works the same as C#.
    But change it to a !! b.
  • Remove %%x. Instead use x !! unreachable.

Remove the %T type. Instead, functions can fail:

fn foo(x: i32) fail -> i32 {
    if (x < 0)
        return error.ExpectedPositiveNumber;
    if (x % 2 == 0)
        return error.ExpectedOddNumber;
    return x - 1;
}

Here, we introduce the fail keyword. The compiler is able to determine the set of possible error values that foo can return.

@typeOf(foo) is: fn(i32) fail (ExpectedPositiveNumber, ExpectedOddNumber) -> i32

foo can be implicitly cast to fn(i32) fail -> i32, which allows any set of error codes.
foo cannot be implicitly cast to fn(i32) fail(OutOfMemory) -> i32, which is only allowed to fail with error.OutOfMemory.
foo can be implicitly cast to fn(i32) fail(OutOfMemory, ExpectedPositiveNumber, ExpectedOddNumber) -> i32, which is allowed to fail with a superset of the set of failures that foo can have.
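
The subset-to-superset rule described above can be sketched in the syntax of later Zig releases, which differs from the fail(...) notation proposed here. The names FooErrors, WiderErrors and fooWidened are made up for illustration; only the error names come from the proposal. This is a sketch of the rule, not the proposed implementation.

```zig
const std = @import("std");

// Hypothetical set names; the error names are the ones used in the proposal.
const FooErrors = error{ ExpectedPositiveNumber, ExpectedOddNumber };
const WiderErrors = error{ OutOfMemory, ExpectedPositiveNumber, ExpectedOddNumber };

fn foo(x: i32) FooErrors!i32 {
    if (x < 0) return error.ExpectedPositiveNumber;
    if (@rem(x, 2) == 0) return error.ExpectedOddNumber;
    return x - 1;
}

// Accepted: every error foo can produce is also a member of WiderErrors,
// so `try` can forward it unchanged.
fn fooWidened(x: i32) WiderErrors!i32 {
    return try foo(x);
}

pub fn main() void {
    const ok = fooWidened(3) catch |err| {
        std.debug.print("failed: {s}\n", .{@errorName(err)});
        return;
    };
    std.debug.print("fooWidened(3) = {d}\n", .{ok});
}
```

Going the other way, forwarding foo's errors from a function declared to return only error{OutOfMemory}, is rejected at compile time, which matches the second case in the list above.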

  • Change %defer expr; to fail defer expr;
  • Change %return expr to try expr
  • If we wanted to keep %%x it could be fail x.

Now there is no more % to mean error, and we only have the !! infix operator to match ??.

Now it becomes straightforward to allow multiple return values from a block:

fn div(a: i32, b: i32) fail -> i32, i32 { ... }

Now consider:

var xxx: error = undefined;
fn bar() fail {
    return xxx;
}

This is a compile error; Zig cannot determine the set of possible error codes. To fix:

var xxx: error = undefined;
fn bar() fail(OutOfMemory, SomeOtherError) {
    return xxx;
}

Now there is a runtime safety check that makes sure that bar only returns errors from the allowed set.

You might want to make the set of possible errors depend on some other function:

var xxx: error = undefined;
fn bar() fail(OutOfMemory, SomeOtherError, @errorsOf(foo)) {
    return xxx;
}

Now I'm thinking about how the error type might want to know the set of possible errors it could be.
We want to make it so that if you switch(err) we can give compile errors for missing possible error codes.
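
The exhaustive switch(err) idea can be sketched as follows, assuming the syntax of later Zig releases rather than the notation in this comment. BarErrors and describe are hypothetical names; the two error names are taken from the bar example above.

```zig
const std = @import("std");

// Hypothetical set name; the members mirror fail(OutOfMemory, SomeOtherError).
const BarErrors = error{ OutOfMemory, SomeOtherError };

fn describe(err: BarErrors) []const u8 {
    // There is no `else` branch, so deleting either case below
    // makes the switch non-exhaustive and the build fails.
    return switch (err) {
        error.OutOfMemory => "out of memory",
        error.SomeOtherError => "some other error",
    };
}

pub fn main() void {
    std.debug.print("{s}\n", .{describe(error.SomeOtherError)});
}
```

The compile error only works because the parameter type names a specific set; a value typed as the global error set would need an else branch.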

@thejoshwolfe
Contributor

+1 for separating the error channel from the return values.

We may want to avoid conflating the terms "error" and "fail". This gets especially confusing when writing test harnesses, so it'd be nice to stick to just one term. Other languages use the term "exception" for this kind of thing, or even "signal". I think "error" is a good word for this.

Thanks to error being a keyword, all of your example uses of fail can be replaced with error and we have no ambiguity, except for fail x, but I think we should remove that anyway.

I think the @errorsOf(foo) idea needs some work. Let's not include that in this proposal.

@PavelVozenilek

  1. If the compiler always forces one to handle returned errors (either "consuming" them or passing them up) then it would be redundant to annotate a function with fail. The user will always be reminded when they make a mistake.

    This detail plus list of all possible errors can be shown automatically by an IDE.

  2. The semantics of the fail(list of all errors) construct should be fail(list of errors invisible to the compiler which can also be returned). Listing all possible errors can range from tedious to almost impossible if the error-generating code is conditionally compiled.

    Example:

    var xxx: error = undefined; // can be ThisError or ThatError
    fn foo(comptime b: bool) fail(+ThisError, +ThatError)
    {
      if (...) return xxx;
      if (b) return OutOfMemoryError;
      else return SomeOtherError;
    }
    
    

@PavelVozenilek

Other syntax which eliminates fail(list) completely:

var xxx: error = undefined; 
fn foo() 
{
  if (...) return xxx <ThisError | ThatError>;
}

If and only if compiler cannot deduce which error is returned it has to be supplied manually by the programmer. Compiler would then check for obvious mistakes.

This moves the error list into the most relevant place. Local or non-local changes do not affect the function signature.


The fail annotation is fragile to changes in the environment.

fn foo(comptime  b : bool) fail???
{
  if (b) return SomeError;
}

Pessimization (when in doubt label it as fail) would result in awkward situations where function is annotated but the (smart enough) compiler refuses to accept any error handling when such function is called, because there's nothing to handle. Whole function chains could be subject to this.


If there's an annotation that may be useful and non-fragile, it would be no-fail for library APIs, giving the user assurance that whatever happens inside stays inside.

// no need to worry that something unexpected leaks out of this function
pub fn foo() no-fail
{
  ...
}

Exported functions should be no-fail by default.

@kyle-github

Personally I prefer using sum types (was enum, now maybe enum struct?) to handle this:

fn foo(aVal: bool) -> enum { i32, err1, err2, err3 ... }
{
... some code ...
    return err1; <-- needs to automatically set the result type.
...
}

In the calling code you would use switch to take apart the return type. The compiler can check all possible types returned from foo and that a caller handles all possible return types. No extra syntax at all as far as I can tell. I think this works now.

Note that this also covers the case that a function needs to return different types than just errors and one other type. The caller still has to switch to figure out what was returned.

This does not cover the !!/%% case but there is probably something to be done there that would be relatively simple.

@raulgrell
Contributor

As someone who actually likes the % sigils and status-quo error handling, I quite like the direction that @andrewrk's proposal is going, and I agree with @thejoshwolfe that we should stick with one keyword.

@thejoshwolfe: Could you elaborate on why you think it's particularly valuable to separate the return values from the error channel?

On the other hand, I share @kyle-github's appreciation of sum types. The error enum concept is nice, and if the main objective is to remove the sigils, we could just make them into actual keywords instead of changing the semantics of error handling.

A proposal with keyword based syntax:

a %% b becomes a error b

Remove %%a in favor of a error unreachable

Change %defer expr; to defer(error) expr;

Change %return expr to try expr;

Change %T to error T

fn foo(x: i32) -> error i32 {
    if (x < 0)
        return error.ExpectedPositiveNumber;
    if (x % 2 == 0)
        return error.ExpectedOddNumber;
    return x - 1;
}

Providing a set of valid errors:

fn bar() -> error(OutOfMemory) T {
    return allocate(T);
}

Basically, like in @kyle-github's example:

%T == error T == enum {T, error}
error(OutOfMemory) T == enum {T, error.OutOfMemory}

Now there is no more % to mean error, and everything still works the same way.

@kyle-github

I like @raulgrell's points here.

I am also a fan of the existing % syntax. But then I did a lot of Perl coding years ago so sigils in general are not a problem for me. That said, typing sigils can slow down coding if you are a very fast typist (as I am). They are not on the home row :-)

There are a few things that are really nice about the current system:

  • There is a very clear idiom for handling an error returned from a function. Or several idioms. These do not add to the expressiveness of the language but make common patterns really clean. I am thinking of the %return and %defer keywords.
  • There is a very clear way to introduce error values as a sort of pseudo enum. This is nice because it makes it possible to do this safely anywhere in the code and you do not need to worry about reusing values accidentally. Ever try adding something to errno's valid set? Yeah...

So, how does the addition of these keywords and removal of % keep these points?

Here is another approach to handling errors that is similar to what happens now, but also does not require extra sigils.

First, we make error a somewhat special enum that is automatically program-wide. You declare all the errors you need to add in the compilation unit. However, rather than just an enum, you can think of error as a special ordinal type and the type is error. IDEs can eventually help with suggestions etc. to help prevent accidental error additions due to typos.

...
error ENoMem;
error ENullPointer;
error ENegative;
...

fn foo(bar:i32) -> error | i32 {
    if(bar < 0) {
        return ENegative;
    }
...
    return bar+42;
}

In this code, you simply use return as normal and the compiler figures out what you want to do. You do need to declare that you return an error or an i32. I use the | syntax from some other languages to indicate type alternates. I.e. the return value type is either an error or an i32. This could be written as enum { error, i32 } today I think. I like using | for alternation and keeping enum for named constants. I can see a strong argument for using union too.

When using this you get:

...
var x = foo(y);
if (@typeOf(x) == error) { defer cleanupAfterFoo(x); }

I am conflating a few ideas I have with respect to defer here but they can be ignored for now. (I like the idea of providing the arguments to the deferred function so that you can easily implement things like error counters etc. OOB. This can also be extended to deferring a block instead of a function call though local functions would solve most of that too.)

Or you could use switch:

var x = foo(y);
switch(@typeOf(x)) {
     error => ... do something with the error like defer or return. ...
     i32 => ... do something with the value...
     ...
}

This shows a couple of things I was thinking about. While I really like the short syntax of the error defers and returns, either you need special syntax to support it or you should drop down to using existing keywords and syntax. One thing I am not sure about is how to avoid magic in handling the types. In my examples there is an implicit cast of x to the tested type. Perhaps that can be a special function:

if(@castTo(error, x)) | err | ...

But I am not sure I like that because it is a somewhat hacky version of allowing multiple return values.

Rust is approaching this problem by starting off with very few shortcuts and then adding them as the idioms become clear and accepted. It might make sense for Zig to do the same at first. Premature optimization is the root of all evil etc.

@PavelVozenilek

PavelVozenilek commented Dec 2, 2017

Ideal error handling for me

There is conceptually one global error variable in the program (well, one per thread).

This sets it:

return SomeError;

It's the simplest possible way.

Why not make error handling similarly easy? There's no need for ceremony: the compiler always knows what errors a function can return (with some help in edge cases), and will always be able to check that all these errors are (somehow) covered.

// this will ignore all errors
var x = foo(bar(), baz());
if (err) { // catches all possible errors from foo, bar and baz, 'err' is keyword
  // no-op, the x is known to be undefined at this point
} else {
  // x is valid here
}


//--------------------------------------------
// this will pass all errors up, they will be added to caller's interface
var x = foo(bar(), baz());
if (err) {
  return err;
}


//--------------------------------------------
// this will handle all errors, compiler will make sure all cases are covered
var x = foo(bar(), baz());
switch (err) {
ThisError => , // no-op = handled by ignoring
ThatError => return err; // passed up, ThatError becomes part of caller's interface
}


//--------------------------------------------
// some errors are handled, the rest is passed up
var x = foo(bar(), baz());
switch (err) {
ThisError => ,
else => return err;
}


//--------------------------------------------
// this will not compile because errors are not handled
fn f() 
{
  var x = foo(bar(), baz());
}


//--------------------------------------------
// handling all errors in one place is also possible
fn f() 
{
  var x = foo();
  // in normal flow x is valid now
  var y = bar(x);
  // in normal flow x and y are valid now
  var z = baz(y);

  if (err) { // it will jump here if there's error anywhere above
    // z is certainly not valid here, compiler should tag x, y as undefined too
  } else {
   // x, y, z are certainly valid
  }
}



//--------------------------------------------
// stepwise handling of errors
fn f() 
{
  var x = foo();
  if (err == ThisError) { // other errors go further down
    x = some substitute value
  }

  var y = bar(x);
  // catches anything but ThatError, may be risky, but the user explicitly chose to do so
  if (err != ThatError) { 
   y = ...
  }

  var z = baz(y);

  if (err) { // remaining uncaught errors (especially ThatError) are handled here
    ...
  }
}

  1. There's no ceremony. No need to annotate functions, no need for elaborate return types, no need for strange tricks to extract the errors. Only one new keyword (err) is needed.
  2. Every error must be handled somehow, and it is done in an intuitive way, without the need for special forms. The compiler ensures all valid errors are handled but no more.
  3. There will be no way to use undefined variables (because an error occurred) in the program, unless the programmer explicitly writes it that way (by ignoring the error explicitly and continuing as if nothing bad happened). The compiler may still warn in such cases.
  4. If a project has strict requirements it may force (via project settings) explicit individual handling of every possible error (not ignoring them en-masse or just passing them up).

@Dubhead
Contributor

Dubhead commented Dec 2, 2017

I don't hate the % sigil, but if we are to ditch it, I prefer keywords that are easy to remember.

Replace %T with T or error
Replace %%x with error(unreachable) x
Replace %return x with error(return) x
Replace a %% b with a error(or) b
Replace %defer with error(defer)

The mnemonic is: The thing inside parens is executed on error.

Also, Java's checked exceptions are a controversial feature, and I don't want Zig to go that way.

@andrewrk changed the title from "get rid of some % sigils with new keywords" to "failable functions instead of error union type" Dec 7, 2017
@andrewrk
Member Author

andrewrk commented Dec 8, 2017

Proposal iteration:

  • Remove error top level declarations.
  • Remove the error type.
  • Add the ability to create an "error set" type:
const Errors = error {
   OutOfMemory,
   InvalidInput,
};
  • In a function, declare and optionally document errors to return them:
fn foo(x: u32) error -> u32 {
    // This could be declared outside the function, or in, doesn't matter.
    const Errors = error {
        /// The input was the value 0, which is unsupported.
        Zero,
        /// Did you think 1 was ok? It's right out.
        One,
    };
    if (x == 0)
        return Errors.Zero;
    if (x == 1)
        return Errors.One;
    return x - 2;
}
  • error is the global error set. Every error set that is created, its entries get added to the global error set. The global error set does not allow field access, e.g. error.Foo is not allowed.
const Something = struct {
    condition1: bool,
    condition2: bool,
    err: error,  // Bad choice for the type of this field because now `bar`'s error set is the global error set.
};
fn bar(ptr: &Something) error -> u32 {
    if (ptr.condition2) {
        return ptr.err;
    }
    return 1234;
}
  • Each failable function creates an error set by the union of all the error sets of the possible return types.
const Something = struct {
    condition1: bool,
    condition2: bool,
    err: PossibleErrors, // better choice for the type of this field
};

const PossibleErrors = error {
    OutOfMemory,
    InvalidUserInput,
};
fn bar(ptr: &Something) error -> u32 {
    const Errors = error {Condition1WasTriggered};
    if (ptr.condition1) {
        return Errors.Condition1WasTriggered;
    }
    if (ptr.condition2) {
        return ptr.err;
    }
    return 1234;
}
  • @errors(bar) is error{OutOfMemory, InvalidUserInput, Condition1WasTriggered}
  • Error set declarations can inherit from other sets:
// MorePossibleErrors is all the errors from PossibleErrors, all the errors foo can return, and Derp.
// @errors(x) returns the error set of x
const MorePossibleErrors = error(PossibleErrors, @errors(foo)) {
    Derp,
};
const OtherErrorSet = error {
    OutOfMemory,
    Unique,
};

comptime {
    assert(MorePossibleErrors.Derp == error.Derp);
    assert(MorePossibleErrors.OutOfMemory == PossibleErrors.OutOfMemory);
    assert(MorePossibleErrors.OutOfMemory == OtherErrorSet.OutOfMemory);
    assert(MorePossibleErrors.Unique == OtherErrorSet.Unique); // compile error: error set MorePossibleErrors has no field Unique
}
  • A function can override the documentation comment for an error:
fn foo(x: u32) error -> u32 {
    const Fail = error {
        /// The user specified numbers that are not allowed because they are too big.
        InvalidInput,
    };
    if (x > @maxValue(u32) - 10) {
        return Fail.InvalidInput;
    }
    return (try bar(x)) + 10;
}
fn bar(x: u32) error -> u32 {
    const Fail = error {
        /// The user specified the number 0 which is not allowed.
        InvalidInput,
        OutOfMemory,
    };
    if (x == 0) {
        return Fail.InvalidInput;
    }
    return 1234;
}

The documentation for the InvalidInput error that can be returned for foo will be distinct from the documentation for the InvalidInput error that can be returned for bar.

Note also that bar reserves the right to return OutOfMemory even though it is not yet possible for that to happen. This is important for comptime branches.
@errors(foo) is error {InvalidInput, OutOfMemory} because the errors from bar are inherited from the try.
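
As a rough illustration of how the errors from bar are inherited through try, here is the foo/bar pair from this comment written in the syntax of later Zig releases, where leaving the error set off the return type asks the compiler to infer it. The classify helper is hypothetical, and the inline error set on bar replaces the local Fail declaration from the proposal.

```zig
const std = @import("std");

// bar reserves OutOfMemory even though it never returns it yet.
fn bar(x: u32) error{ InvalidInput, OutOfMemory }!u32 {
    if (x == 0) return error.InvalidInput;
    return 1234;
}

// `!u32` leaves the error set to be inferred from the body.
fn foo(x: u32) !u32 {
    if (x > std.math.maxInt(u32) - 10) return error.InvalidInput;
    return (try bar(x)) + 10;
}

// Hypothetical helper: the switch has no `else`, so it only compiles if the
// inferred set of foo is exactly {InvalidInput, OutOfMemory}.
fn classify(x: u32) []const u8 {
    _ = foo(x) catch |err| switch (err) {
        error.InvalidInput => return "invalid input",
        error.OutOfMemory => return "out of memory",
    };
    return "ok";
}

pub fn main() void {
    std.debug.print("{s}\n", .{classify(0)});
    std.debug.print("{s}\n", .{classify(5)});
}
```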

@andrewrk added the accepted label Dec 8, 2017
@andrewrk
Member Author

andrewrk commented Dec 8, 2017

One more optional thing:

Maybe make an error set with only 1 field implicitly cast to its value instantiation. So then you could do:

fn foo() error {
    return error{ItBroke};
}

Which both creates the ItBroke error and returns it. This reduces overhead of using errors, making people more likely to use them.

@andrewrk
Member Author

andrewrk commented Dec 8, 2017

Another adjustment: I ran into an issue with syntax.

We want function prototypes to be able to specify the error set, for example if you accept a function pointer:

const ErrorSet = error { A, B };
fn foo(func: fn() ErrorSet -> i32) {
    // the function passed can only return errors A and B
}

This means fn proto has an optional expression and then an optional -> expression and then a body which starts with {. This is ambiguous:

fn foo() {

}

Is the { ... } part of the error set or the function body?

Here's how I'm going to fix it, at least for now:

// function that can return an error or an i32
// the error set is determined automatically by the compiler
fn foo() -> i32 !! {

}
// function that can return an error or void
// the error set is determined automatically by the compiler
fn foo() !! {

}

// function that returns void
fn foo() {

}

// function that must return an error in ErrorSet or void
const ErrorSet = error { A, B };
fn foo() !! ErrorSet {

}

// function that must return an error in ErrorSet or i32
const ErrorSet = error { A, B };
fn foo() -> i32 !! ErrorSet {

}

// function that can return i32 or any error in the entire global error set
fn foo() -> i32 !! error {

}

This mirrors the way unwrapping a function call with an error works. const result = foo() !! default_value;
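
For comparison, unwrapping with a default value can be sketched in the syntax of later Zig releases, where the infix operator is spelled catch instead of !!. The parseDigit function and its NotADigit error are made up for this example.

```zig
const std = @import("std");

// Hypothetical function used only to have something that can fail.
fn parseDigit(c: u8) error{NotADigit}!u8 {
    if (c < '0' or c > '9') return error.NotADigit;
    return c - '0';
}

pub fn main() void {
    // On success the payload is used; on error the default after `catch` is used.
    const a = parseDigit('7') catch 0;
    const b = parseDigit('x') catch 0;
    std.debug.print("{d} {d}\n", .{ a, b });
}
```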

@Hejsil
Contributor

Hejsil commented Dec 8, 2017

I was actually thinking of a feature related to error sets. The idea was to be able to have error categories, so you could check if an error was part of a category like SDLError or something. With error sets this could be done by introducing a builtin function called @errorSetContains.

As far as I've read on other issues, we like to have non-pseudocode examples if possible, so here is an example based on C code I've had to write before using SDL.

// Let's pretend that we are able to import sdl headers
const sdl = @cImport(@cInclude("SDL2/SDL.h"));
const mixer = @cImport(@cInclude("SDL2/SDL_mixer.h"));

const SDLError = error {
    UnableToInitSDL,
    UnableCreateWindow
};

const MixerError = error {
    OpenAudioFailed,
    LoadMUSFailed
};

fn openAndLoadMusic() -> &mixer.Mix_Music !! MixerError {
    var status = mixer.Mix_OpenAudio(mixer.MIX_DEFAULT_FREQUENCY, mixer.MIX_DEFAULT_FORMAT, 2, 1024);
    if (status != 0) {
        return MixerError.OpenAudioFailed;
    }

    trydefer mixer.Mix_CloseAudio();
    if (mixer.Mix_LoadMUS("path/to/music.wav")) |music| {
        return music;
    } else {
        return MixerError.LoadMUSFailed;
    }
}

fn initSdlAndCreateWindow() -> &sdl.SDL_Window !! SDLError {
    var status = sdl.SDL_Init(sdl.SDL_INIT_VIDEO | sdl.SDL_INIT_AUDIO);
    if (status != 0) {
        return SDLError.UnableToInitSDL;
    }

    trydefer sdl.SDL_Quit();
    if (sdl.SDL_CreateWindow("Test", 
        sdl.SDL_WINDOWPOS_UNDEFINED, sdl.SDL_WINDOWPOS_UNDEFINED,
        640, 480, sdl.SDL_WINDOW_SHOWN)) |window| {
        return window;
    } else {
        return SDLError.UnableCreateWindow;
    }
}

const Game = struct {
    const Self = this;

    music: &mixer.Mix_Music,
    window: &sdl.SDL_Window,

    fn init() -> Self !! {
        Self {
            .window = tryreturn initSdlAndCreateWindow(),
            .music = tryreturn openAndLoadMusic(),
        }
    }

    fn update(game: &Self) {  
        // Updating!
    }

    fn draw(game: &Self) {
        // Drawing!
    }
};

pub fn main() !! {
    var game = if (Game.init()) |result| {
        result
    } else |err| {
        if (@errorSetContains(SDLError, err)) {
            var errMessage = sdl.SDL_GetError();
            // Print error
        } else if (@errorSetContains(MixerError, err)) {
            var errMessage = mixer.Mix_GetError();
            // Print error
        }

        return err;
    }

    while (true) {
        game.update();
        game.draw();
    }
}

@PavelVozenilek

@Hejsil: would it be enough to have structured error names, like MyErrorClass.SpecificError and then be able to use prefix part of the name, like if (err == MyErrorClass) ?

@Hejsil
Contributor

Hejsil commented Dec 8, 2017

@PavelVozenilek Whatever syntax works best, though I don't know if I want == to also mean "is part of" or "contained in" in certain contexts. That seems confusing for no real gain. Also, figuring out if an error is part of an error set would probably require some auto-generated function that is then called, which we could argue is a hidden function call, though not one the user wrote.

It could also be implemented in userspace if we are able to get a slice of all values in an enum/error set at runtime:

fn contains(comptime T: type, values: []T, value: T) -> bool {
    for (values) |v| {
        if (v == value) return true;
    }
    return false;
}

const Error1 = error {
    A,
    B
};

const Error2 = error {
    C,
    D
};

test "contains" {
    // Whatever syntax you would like goes here
    //                     VVVVVVVVVVVVVVVVVVV
    assert(contains(error, @getMembers(Error1), Error1.A));
    assert(contains(error, @getMembers(Error2), Error1.A)); // Fails
}

@andrewrk
Member Author

andrewrk commented Dec 8, 2017

@Hejsil this @errorSetContains feature as you have described it makes sense, and the use case is compelling. Thanks!

I think we might add a more generic reflection function for that. Maybe @memberIndexByTag which returns ?usize. So you could do @memberIndexByTag(MixerError, err) != null.

Either way, we'll make sure that this use case is covered.

Oh, and your userland solution would work, with the use of @memberCount and @memberName. We might need to add @memberTag for getting the tag value / error value of a member though.
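
Until a builtin or reflection function exists, the membership test from the SDL use case can also be written by hand. This is a sketch in the syntax of later Zig releases; isMixerError and isSdlError are hypothetical helpers, not the @errorSetContains or @memberIndexByTag functions discussed here, and the member lists have to be kept in sync with the sets manually.

```zig
const std = @import("std");

const SDLError = error{ UnableToInitSDL, UnableCreateWindow };
const MixerError = error{ OpenAudioFailed, LoadMUSFailed };

// Hand-written membership test over the global error set.
fn isMixerError(err: anyerror) bool {
    return switch (err) {
        error.OpenAudioFailed, error.LoadMUSFailed => true,
        else => false,
    };
}

fn isSdlError(err: anyerror) bool {
    return switch (err) {
        error.UnableToInitSDL, error.UnableCreateWindow => true,
        else => false,
    };
}

pub fn main() void {
    const err: anyerror = MixerError.LoadMUSFailed;
    if (isSdlError(err)) {
        std.debug.print("SDL error: {s}\n", .{@errorName(err)});
    } else if (isMixerError(err)) {
        std.debug.print("mixer error: {s}\n", .{@errorName(err)});
    }
}
```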

@thejoshwolfe
Contributor

@raulgrell asked: "Could you elaborate on why you think it's particularly valuable to separate the return values from the error channel?"

  1. Assigning to _ shouldn't be able to suppress errors. For example, if we change this to a %return and make the function failable, then we should get a compile error here until we %return there too.
  2. Multiple return values (just a proposal at this point) clashes with error union but harmonizes with error channel.
  3. There are weird cases where error unions get ugly:
fn HashMap(comptime K: type, comptime V: type) -> type {
    return struct {
        const Self = this;
        // ...
        fn get(self: &const Self, key: K) -> %V {
            if (something) return error.KeyNotFound; // supposed to be failure
            var actual_value: V = something;
            return actual_value; // supposed to be success
        }
    };
}

fn main() {
    var posix_error_codes = HashMap(i32, error).init();
    posix_error_codes.put(2, error.NotFound);
    posix_error_codes.put(5, error.IoError);
    // ...
    var posix_error: error = posix_error_codes.get(errno) %% |err| error.Unknown;
}

If your V is error, then -> %V means -> %error. When you get into that situation, it's literally impossible to return a "failure" error instead of a "success" error. The "supposed to be a failure" line above would actually behave like a success.

I propose we use a different keyword for returning something along the error channel. Instead of return error.Foo;, we should use something like raise error.Foo; or throw error.Foo;. I'm not happy with either of these keyword suggestions, but I think it needs to be a different statement than a normal-looking return.

@kyle-github

I am getting a bit confused (a normal state) by why this is being done...

If you want to move toward multiple return values, then separating errors out a la Go makes more sense to me and fits with "one obvious way to do it." Then you can use _ as the only real special magic:

fn foo(i:i32) -> i32, error {
   ...
   if(i == 0) {
       return _, error.DivByZero;
   }
   ...
   return 42/i, _;
}

If you want to move toward something like a union, why not just use a union?

const retType = union {
   err: error,
   val: i32,
};

fn foo(i:i32) -> retType {
   ...
   if(i == 0) {
      return retType{.err = error.DivByZero};
   }
   ...
   return retType{.val = (42/i)};
}

This feels a little like the "optimized" syntax is being discussed first without solving the underlying question of how errors need to be handled. Are errors integral values that are passed along a channel? Are they a sort of enum that is an alternate value (result is union-like)? Are they individual types in a union (also union-like but with the possibility of carrying more data)?

@thejoshwolfe, thanks for the example with returning %error... Hmm...

@andrewrk andrewrk modified the milestones: 0.3.0, 0.2.0 Dec 15, 2017
@andrewrk added the breaking label Dec 15, 2017
andrewrk added a commit that referenced this issue Dec 18, 2017
this will be fixed in a better way later by #632
@jido

jido commented Jan 3, 2018

I do not like the !! proposal. What makes it symmetrical? The object is different (value versus error type). Also, you introduce it as a way to remove the ambiguity between expression and function body, then add !! without an error type, which reintroduces the ambiguity.

Kyle's _ proposal is nice.

andrewrk added a commit that referenced this issue Jan 7, 2018
See #632

better fits the convention of using keywords for control flow
andrewrk added a commit that referenced this issue Jan 7, 2018
See #632

better fits the convention of using keywords for control flow
@andrewrk changed the title from "failable functions instead of error union type" to "error sets" Jan 7, 2018
andrewrk added a commit that referenced this issue Jan 7, 2018
@jido

jido commented Jan 8, 2018

Is there a new Issue about failable functions?

@andrewrk
Member Author

andrewrk commented Jan 8, 2018

Sorry, it's a bit disorganized. Here's what is actually accepted in this issue:

  • removing %% prefix operator
  • replacing %defer with errdefer
  • error sets
  • the status quo error union type integrates with error sets
  • a function whose return type has an implicitly determined error set (syntax to be determined)

Failable functions vs error union type I think will be a separate issue. I think this issue is complicated enough that I need to implement the stuff that I'm more confident about, and then see how it feels, and then iterate from there. Sorry for the instability. That's why we're only at 0.1.1. Some things, such as error sets, and concurrency, will be experimental for a little while longer while we narrow in on how to make zig the best language it can be.
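
Two of the accepted items, errdefer and dropping the %% prefix operator, can be sketched in the syntax of later Zig releases. The acquire, release and setup functions and the open_handles counter are invented for the example; the point is only that the errdefer line runs on the error path and is skipped on success.

```zig
const std = @import("std");

// Hypothetical resource counter standing in for a real handle.
var open_handles: u32 = 0;

fn acquire() void {
    open_handles += 1;
}

fn release() void {
    open_handles -= 1;
}

fn setup(should_fail: bool) error{SetupFailed}!void {
    acquire();
    // Runs only if this function returns an error after this point.
    errdefer release();
    if (should_fail) return error.SetupFailed;
}

pub fn main() void {
    // What used to be written %%x: assert that no error can happen here.
    setup(false) catch unreachable;
    // The failing call acquires and then releases again via errdefer.
    setup(true) catch {};
    std.debug.print("open handles: {d}\n", .{open_handles});
}
```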

andrewrk added a commit that referenced this issue Jan 9, 2018
See #632
closes #545
closes #510

this makes #651 higher priority
andrewrk added a commit that referenced this issue Jan 24, 2018
See #632

now we have 1 less sigil
@andrewrk
Member Author

andrewrk commented Jan 24, 2018

Here are error sets as I plan to implement them:

// void functions must be declared explicitly
fn foo() -> void {
  
}

// remove the -> since we always have return type
fn foo() void {
  
}

// error union looks like this
// this could be any error in the entire program
const x: error!i32 = 1234;

// declare an error set
const MyErrSet = error {OutOfMemory, FileNotFound};

// error union with an error set
const y: MyErrSet!i32 = 5678;
const z1: MyErrSet!i32 = MyErrSet.OutOfMemory;

// error set of size 1 implicitly casts to instance of
// any error set which contains it
const z2: MyErrSet!i32 = error{OutOfMemory};

// leave off the error set in a function return type to
// have it infer the error set
fn foo() !i32 {
    // this declares the ItFailed error
    return error{ItFailed};
}

// the error set of foo is foo.errors
const bar1: foo.errors!i32 = 42;
const bar2 = foo.errors.ItFailed;

// merge error sets
const ErrSetA = error{
    /// ErrSetA doc comment
    BadValue,
    Accident,
};
const ErrSetB = error{
    /// ErrSetB doc comment
    BadValue,
    Broken,
};
// doc comment of MergedErrSet.BadValue is "ErrSetA doc comment"
// MergedErrSet contains {BadValue, Accident, Broken}
const MergedErrSet = ErrSetA || ErrSetB;
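
For comparison, here is a minimal sketch of how the core of this plan reads in later Zig releases. It is not part of the comment above, and it rests on two assumptions about post-proposal syntax: error values are written error.Name rather than error{Name}, and the global error set is spelled anyerror rather than error. The function parsePositive is a hypothetical helper.

```zig
const std = @import("std");

// An explicit error set.
const MyErrSet = error{ OutOfMemory, FileNotFound };

// Leaving off the error set before `!` asks the compiler to infer it.
fn parsePositive(x: i32) !i32 {
    if (x < 0) return error.ExpectedPositiveNumber;
    return x;
}

test "error sets and error unions" {
    // Error union over the global set (any error in the program).
    const x: anyerror!i32 = 1234;
    // Error union over an explicit set, holding a payload or an error.
    const y: MyErrSet!i32 = 5678;
    const z: MyErrSet!i32 = error.OutOfMemory;

    try std.testing.expectEqual(@as(i32, 1234), try x);
    try std.testing.expectEqual(@as(i32, 5678), try y);
    try std.testing.expectError(error.OutOfMemory, z);
    try std.testing.expectError(error.ExpectedPositiveNumber, parsePositive(-1));
    try std.testing.expectEqual(@as(i32, 7), try parsePositive(7));
}
```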

andrewrk added a commit that referenced this issue Jan 25, 2018
The purpose of this is:

 * Only one way to do things
 * Changing a function with void return type to return a possible
   error becomes a 1 character change, subtly encouraging
   people to use errors.

See #632

Here are some imperfect sed commands for performing this update:

remove arrow:

```
sed -i 's/\(\bfn\b.*\)-> /\1/g' $(find . -name "*.zig")
```

add void:

```
sed -i 's/\(\bfn\b.*\))\s*{/\1) void {/g' $(find ../ -name "*.zig")
```

Some cleanup may be necessary, but this should do the bulk of the work.
@tjpalmer

I think losing -> makes code harder to read. Maybe that's because it is less like other languages in this syntax space, so it may just be conditioning on my part.

Also, putting error types before the main return type seems visually wrong.

And ErrSetA || ErrSetB goes against sum/union syntax in other languages. The || is normally for bools and short-circuits. The | is better for sets. If each error is both a value and a type, then using TypeA | TypeB is a meaningful general direction, which @kyle-github also mentioned earlier. It could be shorthand in some cases for your current auto-tagged unions, except, I guess, that it would expand sub-unions to the top level. (New to Zig, but I presume the current unions don't promote sub-unions to top level.)

I guess multiple of my comments just emphasize that looking/working like other languages can be good when there's no strong reason to break from them.

I very much agree that a syntax for inferring the error set would be good, if error sets really are needed at all, since you expect the compiler to report missed handling anyway. Generated docs would show them, too, I presume.

@tjpalmer

Maybe Type | error would be good syntax for inferring error types, since error is already a keyword.

@tjpalmer

Your suggestion of !Type as shorthand for Type | error seems reasonable, too, though. Going back to sets, since ! is normally a bool thing (vs the set thing ~), it wouldn't suggest inverse set, and that's good here.

And on -> being a "more than one way to do it," I'm not convinced, since fn blah()! { might be allowed but not fn blah() -> ! { if that's what you were getting at. I would require the type for fn blah() -> !void syntax. And I think fn blah()! { is a fine learnable shorthand for fn blah() -> !void just like !void is learnable for void | error.

You also have a handful of cases where you return error explicitly already. If error infers, these would change to inferring the types of errors contained. Maybe would need a different type to say "any error" if error infers.

Anyway, I've spammed enough and I'm just some new guy here, so I better stop for now.

@andrewrk
Member Author

@tjpalmer I think you raised some good points about syntax. Sounds like you're on board with the semantics. I'm in the middle of implementing this, and so I'll finish up and we can try out the semantics, and then make an adjustment to the syntax.

A couple of things to consider:

  • Using | has a problem that it's also used for bitwise OR. We could make it work based on the type system, but in general, less ambiguity is helpful. "communicate intent precisely"
  • Putting the type before or after: Currently we have ?T as well as &T, and this proposal replaces %T with error!T. Having the thing that decorates the type always on the left fixes the confusion caused by, e.g. C's spiral rule.

@tjpalmer

Thanks for the reply.

On semantics, are you okay with general union/sum types? I think going beyond just error types would be good here. General principles that happen to apply to errors would be nice. And you might be meaning this, but I'm not 100% sure.

On syntax, understood for the spiral rule, which is why I figured a shortcut modifier would still be a prefix, as you suggest. For TypeA | TypeB | TypeC, I would parse this according to standard expression parsing, not as a special case modifier. So order is irrelevant, and I suspect most people would prefer to put the primary type first in the case of Type | error.

And still I recommend familiarity where possible. I don't ever use or and and in C/C++, so I got surprised today on a learning Zig project I started, when && didn't work. So, I guess I'm still learning some of your preferences, despite having mostly gone through the documentation.

On set operators, bitwise operators are set operators. So, 0x3 | 0x9 -> 0xB is the same thing as (bits) {1, 2} | {1, 8} -> {1, 2, 8} (if I'm doing the bits right in my head). Python uses | and & for set operations (as well as bit operations), and TypeScript, Haskell, Ceylon, and more use | for sum/union types (noting that TypeScript doesn't have operator overloading for ordinary expressions). And even Java (again, no operator overloading) uses & for intersection types (in generics) and | for union types (in exception handling), too. (And sorry for being overly explicit on things like mentioning Java not having operator overloading. Just trying to make sure some key points are out.)
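
The bit arithmetic in that example can be checked with a short sketch, written here in later Zig syntax (an assumption). Each set bit is treated as a set member identified by its bit value, so OR is union and AND is intersection.

```zig
const std = @import("std");

test "bitwise OR and AND as set union and intersection" {
    const a: u8 = 0x3; // bit values {1, 2}
    const b: u8 = 0x9; // bit values {1, 8}
    // Union of the two sets: {1, 2, 8}.
    try std.testing.expectEqual(@as(u8, 0xB), a | b);
    // Intersection of the two sets: {1}.
    try std.testing.expectEqual(@as(u8, 0x1), a & b);
}
```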

I'm not sure if intersection types have any place in Zig ever, but using the common notation for union types could be handy. And I don't expect sets in core Zig, either. I just made the point about sets (including in Python) to point out that this notion of bitwise operators for sets is fairly pervasive.

Anyway, too much chatting on my part, again.

@tjpalmer

In my comments, I was forgetting that types appear as first-class values in standard expressions at compile time in Zig, but I think the intent would still be clear and well-aligned with other languages.

(As an aside, I'm here exploring Zig land because I think you've made a lot of great decisions. After years of not trying to make a language of my own, I'd started again recently, but it's hard to make progress in my spare time. I'm very impressed with what you've gotten done already, and I'm more interested in something in this space succeeding than in it being the exact language I'd design myself. The only things I'm inclined to push here are those that I think can work with your existing vision.)

@andrewrk
Member Author

> On semantics, are you okay with general union/sum types?

Have you looked at how unions work? http://ziglang.org/documentation/master/#union
This is intended for general use, whereas error unions have special semantics.

> And still I recommend familiarity where possible. I don't ever use or and and in C/C++, so I got surprised today on a learning Zig project I started, when && didn't work.

A precedent for and and or can be found in Python, one of the most popular programming languages. The suggestion to change from && and || was made by @ant6n, who pointed out that control flow is accomplished almost exclusively through keywords, and this would make that pattern more consistent. Further, and and or do work in C. See #272 for more details.

@tjpalmer

I did see the union thing previously. I was imagining this as a slight variation on that but with the same underlying and compatible mechanism (and for more than errors). But I also understand that you try to have one way rather than multiple ways to say the same thing (whereas having both | and union(enum) to express unions is multiple ways). Do as you see fit, of course.

And understood on and and or as well. It caught me by surprise is all. Maybe a well-targeted compiler error message could help there.

Anyway, thanks for reply again.

@thejoshwolfe
Contributor

A note about using || instead of | for unioning error sets:

You're absolutely right about bitwise | effectively being a set union operator. There is a subtle pattern so far of operators Zig borrows from Python, but due to the "no hidden allocations" design of Zig these operators can only work at comptime. These operators are Python's + for array concatenation, * for sequence repetition, and now | for set unioning. We wanted to make these "dynamically sized" operators in Zig look different from their normal integer meanings, so we went with ++, **, and ||. The pattern is "double operator means it operates on a different type than single operator, and the double operation only works at comptime."

As a bonus, ++ is the concatenation operator in Haskell. But as a negative bonus, ** is the exponentiation operator in many languages (JavaScript, Python), and || is short-circuit or in almost every language that uses that operator (C, Java, even Bash).

As nice as this "double operator" pattern is, I'm not sure how far it will go beyond ++, **, and ||, and it's already broken with operators like == and ?? (although ?? violates the "keywords for control flow" rule mentioned above as well, so perhaps it will be changed someday anyway). So it's not a completely arbitrary decision, but I'm not sure how solid it is either. And the confusion with the || operator from the C family is still a bit concerning.
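
The three "double operators" described above can be sketched side by side. This is a minimal example assuming later Zig syntax, with error set names borrowed from the proposal earlier in the thread. All three operations are resolved at compile time.

```zig
const std = @import("std");

const ErrSetA = error{ BadValue, Accident };
const ErrSetB = error{ BadValue, Broken };
// `||` merges two error sets into {BadValue, Accident, Broken}.
const MergedErrSet = ErrSetA || ErrSetB;

test "++, ** and || operate on comptime-known operands" {
    // `++` concatenates arrays whose lengths are known at comptime.
    const joined = [_]u8{ 1, 2 } ++ [_]u8{3};
    // `**` repeats an array a comptime-known number of times.
    const repeated = [_]u8{7} ** 3;

    try std.testing.expectEqualSlices(u8, &[_]u8{ 1, 2, 3 }, &joined);
    try std.testing.expectEqualSlices(u8, &[_]u8{ 7, 7, 7 }, &repeated);

    // A member of either original set is a member of the merged set.
    const from_a: MergedErrSet = error.Accident;
    const from_b: MergedErrSet = error.Broken;
    try std.testing.expect(from_a == error.Accident);
    try std.testing.expect(from_b == error.Broken);
}
```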


10 participants