Are the %%'s in front of functions necessary? #545

Closed
taoeffect opened this Issue Oct 18, 2017 · 33 comments

@taoeffect

taoeffect commented Oct 18, 2017

I couldn't really figure out from the guide why they were needed in front of many function calls.

They look like they might be unnecessary, and they also add to the visual confusion / verbosity / complexity / lack of clarity in the language (IMO). Is there a way the compiler can automate whatever it is they are there for?

@andrewrk

Member

andrewrk commented Oct 18, 2017

Have a look at http://ziglang.org/documentation/master/#errors

%%foo() is the same as foo() %% unreachable, that is, it asserts that there is no error.

@thejoshwolfe is working on a proposal to get rid of the %% prefix operator, but it makes the function calls that currently have %% in front of them even more verbose.

@thejoshwolfe

Member

thejoshwolfe commented Oct 18, 2017

Related: #510

There's currently a discussion going on to possibly remove the %% operator from the language entirely.

The operator means "the function i'm calling here is declared to possibly return an error (a % in the return type), but i promise that when i'm calling it here, it will never actually result in an error." It is a language-level assertion that there's no error.

There's an abundance of code in existence right now that overuses the %% prefix operator where it is inappropriate to do so, which is evidence that it should be removed from the language.

Is there a way the compiler can automate whatever it is they are there for?

No, and that's a feature of the language. They are there to acknowledge the possibility of an error, and there's no default for what to do with errors when they happen. The programmer must do something with an error wherever it can happen.

This design decision is to encourage code authors to think about error cases and handle them appropriately in version 0.0.0 of their code. It's always possible to explicitly ignore errors with foo() %% |_| {} or assert that errors will never happen with foo() %% |_| unreachable, but at least the programmer had to type that and take responsibility, and at least those cases are visible in the source code.
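For readers who know Rust better than Zig, the two explicit escape hatches described above can be sketched roughly as follows. This is an analogy rather than Zig: `parse_u64`, `parse_and_ignore`, and `parse_or_crash` are hypothetical names, and Rust's `unreachable!()` always panics, whereas Zig's `unreachable` is undefined behavior in ReleaseFast mode.

```rust
use std::num::ParseIntError;

/// Hypothetical stand-in for a Zig function declared as `-> %u64`.
fn parse_u64(text: &str) -> Result<u64, ParseIntError> {
    text.parse::<u64>()
}

/// Roughly `parse_u64(text) %% |_| {}`: the error is visibly discarded.
fn parse_and_ignore(text: &str) {
    let _ = parse_u64(text);
}

/// Roughly `parse_u64(text) %% |_| unreachable` (or the prefix form
/// `%%parse_u64(text)`): the caller asserts that no error can occur here.
fn parse_or_crash(text: &str) -> u64 {
    match parse_u64(text) {
        Ok(value) => value,
        // Reaching this arm means the caller's promise was wrong.
        Err(_) => unreachable!("caller promised the input was valid"),
    }
}
```

In both cases the decision is spelled out at the call site, which is the property the comment above is arguing for.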

@taoeffect

taoeffect commented Oct 18, 2017

Might'n't there be an even simpler solution by changing entirely how errors are handled in the language?

My intent here is to be a muse, not a rabble rouser. I'm very interested in safe, secure, performant languages, and am happy to see Zig existing and competing with the likes of Rust. I'm also interested in language design and in simplicity as a vehicle to security.

From my experience I am very cognizant of the importance of removing as much syntax from languages as possible. Less syntax = less room for errors, less to learn, and less cognitive overload, so if there's a way to rework the language error handling to just not need this frequent usage of %, it could lead to something very cool.

@andrewrk

Member

andrewrk commented Oct 18, 2017

Might'n't there be an even simpler solution by changing entirely how errors are handled in the language?

Do you have a proposal?

@taoeffect

taoeffect commented Oct 18, 2017

Do you have a proposal?

Well, before I could make a proposal I would have to study the language much more to figure out what purpose the operator is currently serving, and how error handling is handled in general.

Just thinking out loud here...

The docs say:

The %% prefix operator is equivalent to expression %% |err| return err. It unwraps an error union type, and panics in debug mode if the value was an error.

Um, so, could you design the language to just not do that? Or do it in another way if for some reason it's critical?

"panics in debug mode" sounds like some sort of exception handling mechanism.

"unwraps an error union type" sounds like a weird type system thing.

Earlier the docs say:

Maybe you know with complete certainty that an expression will never be an error. In this case you can do this:

const number = parseU64("1234", 10) %% unreachable;

Here we know for sure that "1234" will parse successfully. So we put the unreachable value on the right hand side. unreachable generates a panic in Debug and ReleaseSafe modes and undefined behavior in ReleaseFast mode. So, while we're debugging the application, if there was a surprise error here, the application would crash appropriately.

Like, is this syntax really necessary to cause the app to panic?

Why not just panic on unhandled errors like other languages do?

I can't really think why I would want to say that I am "absolutely sure this function will not generate an error". If I want a function to not generate an error, I will write it in such a way that it won't generate an error, and if the compiler needs to know that no error will be generated (for some reason) it can figure that out based on the implementation of the function.

@kyle-github

kyle-github commented Oct 18, 2017

It may be that '%%' is a case of slightly premature optimization. I happen to really like the idea that wrapping a type with % allows you a way to return errors without resorting to Go's multiple return value idea. The problem with Go's way is that 99% of the time, you are returning a value or an error but not both. So, you either use one of the return values or the other. The way that Zig works makes this extremely common case very, very clean.

Now with %% that might be too much. I can see keeping:

var x = %return foo(y);

Since that is a very common idiom and really easy to scan for when hardening code.

@taoeffect , I think you might be confusing Zig errors with exceptions. They are definitely not that. Think of it as a side channel on which you can get error values. It is almost identical in use to Go's multiple return values, but with a much more pleasing (IMO) flow to the code. Rust does something similar and it also works very nicely.
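The contrast between Go-style multiple return values and an error union can be sketched in Rust, which the comment above mentions as doing something similar. This is only an illustration: `DivError`, `divide_go_style`, and `divide` are made-up names, and Rust's `Result` stands in for Zig's `%T`.

```rust
#[derive(Debug, PartialEq)]
enum DivError {
    DivideByZero,
}

/// Go-style pair: both slots always exist, although only one is meaningful.
fn divide_go_style(a: u32, b: u32) -> (u32, Option<DivError>) {
    if b == 0 {
        (0, Some(DivError::DivideByZero))
    } else {
        (a / b, None)
    }
}

/// Error-union style: the result is either a value or an error, never both.
fn divide(a: u32, b: u32) -> Result<u32, DivError> {
    if b == 0 {
        Err(DivError::DivideByZero)
    } else {
        Ok(a / b)
    }
}
```

With the pair, nothing stops a caller from reading the placeholder `0` without checking the error slot. With the union, the value cannot be reached without first deciding what to do about the error case.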

@taoeffect

taoeffect commented Oct 18, 2017

I admit to not having dived into the details of Rust's error handling either, and Go code I've only skimmed, so perhaps that's part of the issue on my end.

Still, this sentence sounds weird to me:

Maybe you know with complete certainty that an expression will never be an error. In this case you can do this:

Why is it my business to "know with complete certainty that an expression will never be an error"? Seems to me that's clearly the compiler's area of concern, not mine.

So if the compiler knows with complete certainty there won't be an error, it shouldn't need me to tell it.

If functions return "error types" that need to be handled appropriately, well, the match control-flow operator from OCaml/Haskell (? - this too I only have cursory knowledge of, not practice using) or something similar can be used.

Errors can be considered simply causing a different "stream" of code to get triggered, and besides defining what those "stream paths" (aka branches) are, I don't see why the programmer needs to specify anything else.

In C, I handle these "stream paths" with a goto label plus macros. Results in very readable C code that handles errors flawlessly.

@kyle-github

kyle-github commented Oct 18, 2017

@taoeffect, look at more Zig code and it will become clear. A type that can also be a value can be split by if or switch. I think switch is the closest to match in Zig. You can think of a % type as a sum type of components Error and some specific value type. E.g.

fn foo() -> %i32

returns a 32-bit integer or an error.

Based on your example, you seem to want exceptions, even if they are local. Taking your example and rewriting in Zig (on the fly while I have no compiler in front of me, so please excuse the syntax errors!).

fn initKeychainAccess() -> %bool
{
    %return SecKeychainSetUserInteractionAllowed(TRUE);
    %return SecKeychainUnlock(gKeychain, 0, NULL, FALSE);
    %return SecKeychainAddCallback(MyKeychainCallback, kSecEveryEventMask, NULL);
    doThatFancyThingYouDo();
    return true;
}

Now, this is not what I think of as idiomatic Zig, just a line-by-line translation. I think there is probably a better way to return errors than this but hopefully you can get the idea.
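As a rough comparison, Rust's `?` operator plays the role that `%return` plays in the translation above: on an error it returns early with that error, otherwise it unwraps the value and continues. The sketch below is an assumption-laden stand-in, not the real Security framework API: the three helper functions and `KeychainError` are hypothetical, and each helper takes a `bool` purely to simulate success or failure.

```rust
#[derive(Debug, PartialEq)]
enum KeychainError {
    InteractionDenied,
    UnlockFailed,
    CallbackRejected,
}

// Hypothetical stand-ins for the three failable calls.
fn set_user_interaction_allowed(ok: bool) -> Result<(), KeychainError> {
    if ok { Ok(()) } else { Err(KeychainError::InteractionDenied) }
}

fn unlock(ok: bool) -> Result<(), KeychainError> {
    if ok { Ok(()) } else { Err(KeychainError::UnlockFailed) }
}

fn add_callback(ok: bool) -> Result<(), KeychainError> {
    if ok { Ok(()) } else { Err(KeychainError::CallbackRejected) }
}

/// Each `?` returns early with the error, like `%return` in the Zig version.
fn init_keychain_access(steps: [bool; 3]) -> Result<bool, KeychainError> {
    set_user_interaction_allowed(steps[0])?;
    unlock(steps[1])?;
    add_callback(steps[2])?;
    Ok(true)
}
```

For example, `init_keychain_access([true, false, true])` stops at the second step and yields `Err(KeychainError::UnlockFailed)`, so the third call never runs.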

@kyle-github

kyle-github commented Oct 18, 2017

There is some discussion about the use of defer that can also do some interesting things. I cannot remember the issue # though :-(

@thejoshwolfe

Member

thejoshwolfe commented Oct 19, 2017

One thing that the goto+macros solution has that the %return solution does not have is the log_err calls. Making %return more debugging-friendly is going to be an important issue soon in Zig development. There have been informal discussions of attaching stack traces to error "objects" (which are really just error codes at the moment) in debug mode, but these ideas are still pretty young.

Why is it my business to "know with complete certainty that an expression will never be an error"? Seems to me that's clearly the compiler's area of concern, not mine. So if the compiler knows with complete certainty there won't be an error, it shouldn't need me to tell it.

That's putting a little bit more faith in the compiler than Zig is comfortable doing right now.

The example of parseU64("1234", 10) might be a bit too contrived, since you could just type 1234 as a number literal. Take as a different example if the string was constructed with an algorithm that is guaranteed to never generate a malformed string, then you can assert that the parsing function will never fail. Or perhaps the well-formedness of the string was already checked by a separate validator, then you would know that the function would never fail at that point.

Generally, the case of asserting that a function will never return an error is pretty rare, which is why I would like to get rid of the %% prefix operator. Then the question of when you know with 100% certainty that a function will never return an error fits into the general situation of when you know something with 100% certainty. The general pattern is to use unreachable or assert() (which just uses unreachable) to communicate to the compiler that you know something with 100% certainty. Then in safety mode, the compiler will double check that assertion, and in release-fast mode, the optimizer will write as performant code as possible while trusting the assertion.
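The "already checked by a separate validator" case might look like this in Rust. It is a sketch under stated assumptions: `is_decimal` and `parse_validated` are hypothetical names, and the 19-digit limit is chosen here because every 19-digit decimal number fits in a `u64`. Unlike Zig's release-fast mode, Rust's `unreachable!()` still panics in optimized builds.

```rust
/// Hypothetical validator: 1 to 19 ASCII digits, which always fits in a u64.
fn is_decimal(text: &str) -> bool {
    !text.is_empty() && text.len() <= 19 && text.bytes().all(|b| b.is_ascii_digit())
}

/// Parses input that the caller has already validated with `is_decimal`.
fn parse_validated(text: &str) -> u64 {
    // Double-checked in debug builds, similar in spirit to a safety mode.
    debug_assert!(is_decimal(text));
    text.parse::<u64>()
        .unwrap_or_else(|_| unreachable!("validated input failed to parse"))
}
```

The assertion lives next to the one call where the programmer actually has the extra knowledge, instead of being a general-purpose prefix operator.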

@taoeffect

taoeffect commented Oct 19, 2017

That's putting a little bit more faith in the compiler than Zig is comfortable doing right now.

Is the compiler less intelligent than the Java compiler? In Java it can tell whether exceptions might be thrown, and if you choose not to catch them you must write throws SomethingException in the definition of the function.

Can't Zig work like this?

Then instead of saying "I know this won't cause an error", you match (or switch) on the return type of a function appropriately, and include an unhandled path as a catch-all for any error types you chose not to write explicit branches for?

@taoeffect

taoeffect commented Oct 19, 2017

OK, I think my previous comment might have missed the point of the %return operator... which is to make it possible to call functions one after another, similarly to my DO_FAILABLE macro.

If so, that is neat, and thanks @kyle-github for rewriting my code! That really helps me understand a lot better what's going on.

OK, now the previous comment from @kyle-github is starting to make sense to me.

Now with %% that might be too much. I can see keeping:

So %% and %return are the same thing?

If so, OK, this makes a lot more sense to me, and yes, I would be in favor of removing %% but keeping %return, but perhaps changing it to something else, as unexpected symbols like % (which means "mod" in many other languages) really confuses newcomers like me.

@taoeffect

taoeffect commented Oct 19, 2017

I will note, that in C, I wrote the DO_FAILABLE macro because all I had was C at my disposal, but logically what I was doing was creating an "and" chain.

i.e. in JavaScript, something like fn1() && fn2() && fn3().
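A short-circuiting "and" chain of this kind can be sketched in Rust as below. The names `and_chain`, `fn1`, `fn2`, and `fn3` are placeholders, and the `calls` log exists only to show which steps actually ran.

```rust
use std::cell::RefCell;

/// Runs up to three steps, stopping at the first one that reports failure.
/// `outcomes` simulates whether each step succeeds; `calls` records what ran.
fn and_chain(outcomes: [bool; 3], calls: &RefCell<Vec<&'static str>>) -> bool {
    let step = |name: &'static str, ok: bool| {
        calls.borrow_mut().push(name);
        ok
    };
    // `&&` short-circuits: later steps are skipped after the first `false`.
    step("fn1", outcomes[0]) && step("fn2", outcomes[1]) && step("fn3", outcomes[2])
}
```

Note that a plain boolean chain only reports that something failed, not which step or why, so the caller loses the error value that an error union would carry.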

So what are the desirable properties here?

For me it would be:

  • Make it easy to write a sequence of functions that can fail
  • Make it syntactically clear what's going on to language-newcomers
  • Make debugging simple by printing the name of the function that failed, and the filename.zig:(line #) / stacktrace
  • Make all the code typesafe so that the compiler can verify you explicitly handled (or didn't handle) all error cases properly

%return to me, just semantically, looks/reads weird.

In languages where the last expression is the value that's returned, the presence of the return keyword is strictly limited to mean "early return from function". It might help with onboarding new users if that semantic were left unchanged.

In LISP, the solution here would be to wrap all of the function calls in an (and [..]) form.

For me, it simply does not get more elegant syntactically than LISP, but I understand you're not going to rewrite this language into s-exprs.

Zig already has goto and defer, so why not add another, similar keyword, for handling the "desirable properties" mentioned above?

Maybe and { .. } as the C-equivalent of the LISP and form?

fn initKeychainAccess() -> %bool
{
  and {
    SecKeychainSetUserInteractionAllowed(TRUE);
    SecKeychainUnlock(gKeychain, 0, NULL, FALSE);
    SecKeychainAddCallback(MyKeychainCallback, kSecEveryEventMask, NULL);
    doThatFancyThingYouDo();
  }
}

NOTE: edited the above code

Or something like that?

Sidenote: For me, the meaning of %return would be clearer if it were an operator like return?, because the question-mark conveys uncertainty, whereas % doesn't really convey any existing meaning from English, and has existing meaning in C-like languages (mod).

@taoeffect

taoeffect commented Oct 19, 2017

Sorry for filling up this thread with comments. Just one last thought and a comment:

Comment

I edited the above code to just use and. Now it's:

fn initKeychainAccess() -> %bool
{
  and {
    SecKeychainSetUserInteractionAllowed(TRUE);
    SecKeychainUnlock(gKeychain, 0, NULL, FALSE);
    SecKeychainAddCallback(MyKeychainCallback, kSecEveryEventMask, NULL);
    doThatFancyThingYouDo();
  }
}

Last thought

The syntactic issue with C-based languages is that they create multiple namespaces unnecessarily.

For example, there's the namespace of "operators", which you'd better study and know by heart and not write them in the wrong place, and even though "operators" do basically the same thing as functions, they're separate in syntax and behavior from functions and the function namespace.

So programmers have to learn "two languages" (or more) when learning a "single" language.

Sexpr-based languages like LISP, Scheme and Clojure, simply do not have this problem.

You don't need to learn a new language to write an if statement, you just call the if function, and you learn to live without return by the correct application of the if, and, or, and begin functions/forms.

So the sample code above would simply be:

(define (initKeychainAccess)
  (and 
    (SecKeychainSetUserInteractionAllowed TRUE)
    (SecKeychainUnlock gKeychain 0 NULL FALSE)
    (SecKeychainAddCallback MyKeychainCallback kSecEveryEventMask NULL)
    (doThatFancyThingYouDo)))

We could add type hints/info to this too (see Typed Clojure (note) or Typed Racket for inspiration).

@kyle-github

kyle-github commented Oct 19, 2017

@taoeffect, I don't think you'll convert @andrewrk or @thejoshwolfe to Lisp-like syntax :-)

Or me for that matter. There is a balance between simplicity and being concise enough to express powerful thoughts/code in a clean and efficient way. There are always people that like the extremes (APL on one end and Forth/Lisp on the other)!

Personally I like the idea of having some higher level constructs, but more oriented toward parallel execution with extremely easy fork/join semantics.

With your macros, as @thejoshwolfe mentioned, the debugging output on error is quite nice. This is one area where Perl's approach can be kind of interesting:

open(my $fh, "<", $filename) or die "Oops, cannot open the file $filename";

That throws an exception which will be printed if not caught. I think in Zig this would be:

var fh = fopen(filename, mode) %% | err |  debug_print("Oops, cannot open the file {}! Error: {}",filename, err);

I assume here that debug_print will be a compile-time function that will print a stack trace in debug mode and panic or otherwise do something a little more dramatic in release mode.
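A comparable pattern in Rust attaches the context message at the call site. This is a sketch, not a translation of any existing Zig facility: `open_or_report` is a hypothetical helper, and instead of printing or aborting it hands the formatted message back to the caller as the error value.

```rust
use std::fs::File;

/// Opens a file, turning any I/O error into a message that names the file.
fn open_or_report(filename: &str) -> Result<File, String> {
    File::open(filename)
        .map_err(|err| format!("Oops, cannot open the file {}! Error: {}", filename, err))
}
```

The caller can then decide whether to print the message, propagate it, or abort, which keeps that policy out of the helper itself.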

@kyle-github

kyle-github commented Oct 19, 2017

@taoeffect, oops, forgot to mention that there already is a ? sigil used for nullable types.

@taoeffect

taoeffect commented Oct 19, 2017

@taoeffect, I don't think you'll convert @andrewrk or @thejoshwolfe to Lisp-like syntax :-)

Heh, yeah, that's fine, just wanted to mention it. :-)

The and operator above works fine in C-syntax though. ^_^

var fh = fopen(filename, mode) %% | err | debug_print("Oops, cannot open the file {}! Error: {}",filename, err);

The idea behind DO_FAILABLE was that I got tired of writing logging output for every single function I was calling. I could instead wrap function calls in DO_FAILABLE and it would do everything for me (print the file, line number, function name). The only thing it wouldn't do was print the arguments (because they might contain sensitive data).

Having "or"s after every function that might fail will just get tiresome, hence and { }, which, since you are developing a new language, can do fancy debugging for you if you'd like.
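The DO_FAILABLE idea (log the file, line, and failing call, then bail out, without printing the arguments' values) can be approximated with a Rust macro. The original C macro is not shown in this thread, so this is a guess at its behavior: `do_failable!`, `unlock`, `add_callback`, and `init` are all hypothetical names.

```rust
/// Evaluates a failable call; on error, logs where it happened and returns early.
macro_rules! do_failable {
    ($call:expr) => {
        match $call {
            Ok(value) => value,
            Err(err) => {
                // Logs the source text of the call, not the argument values.
                eprintln!("{}:{}: `{}` failed: {:?}", file!(), line!(), stringify!($call), err);
                return Err(err);
            }
        }
    };
}

// Hypothetical failable steps; the bool parameter simulates success or failure.
fn unlock(ok: bool) -> Result<(), String> {
    if ok { Ok(()) } else { Err("unlock failed".to_string()) }
}

fn add_callback(ok: bool) -> Result<u32, String> {
    if ok { Ok(7) } else { Err("callback rejected".to_string()) }
}

fn init(unlock_ok: bool, callback_ok: bool) -> Result<u32, String> {
    do_failable!(unlock(unlock_ok));
    let id = do_failable!(add_callback(callback_ok));
    Ok(id)
}
```

Because `stringify!` captures the call as written in the source, the log shows something like `unlock(unlock_ok)` rather than runtime values, which matches the stated goal of not leaking sensitive arguments.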

@pluto439

pluto439 commented Oct 19, 2017

Here is the best example of error handling I've seen. Had to save it, it's that pretty.

http://ibb.co/hQyz5m

Just add a bit of goto in there, to skip to the correct "destructor", and it's done. No code repeat.

It's kinda like defer, but I don't like defer very much, I think goto is more clear. I also want to have a programming language that I can use in a REPL, where there's no function end anywhere, so defer would never get executed.

I also want a language that supports Structured Exception Handling, even though I will probably only use 1 global handler instead of exception chain.

And about %%, I'm still trying to figure out what it is. I'm more of a fan of golang with its multiple return values. I'm hoping for that, but a bit less annoying (I never know if I should use err := or err =, always have to change code as I comment/uncomment code above it).

Just read this https://board.flatassembler.net/topic.php?t=20106

@thejoshwolfe

Member

thejoshwolfe commented Oct 19, 2017

The docs say:

The %% prefix operator is equivalent to expression %% |err| return err. It unwraps an error union type, and panics in debug mode if the value was an error.

Oops. That's a typo in the docs. %%expression is equivalent to expression %% unreachable. %return expression is equivalent to expression %% |err| return err.

@hasenj

hasenj commented Oct 20, 2017

If it's true that this expression:

%%printf("something ...", ....)

Is equivalent to this:

printf(.....) %% unreachable

This means if printf returns an error code, the program would crash!

I have to question why there's a need to crash the program if printf returned an error?

Why can't I just ignore it?

I don't even use the return value from printf for anything.

I like the go approach. Function returns multiple values, you can check the error value or explicitly ignore it.

Perhaps we can get another operator like %! to explicitly ignore errors instead of crashing on them.

@tiehuis

Member

tiehuis commented Oct 20, 2017

I think you are looking for the semantics of binding to an empty value to discard errors.

An example:

const printf = @import("std").io.stdout.printf;

error GenericError;
fn alwaysError() -> %void {
    error.GenericError
}

pub fn main() -> %void {
    _ = alwaysError();
    _ = printf("Hello\n");
}
@hasenj

hasenj commented Oct 20, 2017

In any case, my feeling is that code that can crash should be a bit more explicit. %%printf does not really stand out as an "I can crash!". I would prefer printf(...) %% unreachable to indicate that failure of this call is not acceptable. It's more explicit.

I do like the idea that the error wraps a value, so that you need to unwrap it before accessing the value, similar to a null check on nullable types.

But you shouldn't be required to handle every possible error that every function can return.

I actually think it's ok to just ignore errors silently too. After all, if you're not using the return value of a function, does it really matter that it returned an error?

@andrewrk andrewrk added this to the 0.3.0 milestone Oct 20, 2017

@basmith

basmith commented Oct 21, 2017

Using enums as a result type can avoid %'s a bit, though it's got its own bookkeeping. There might be a more ergonomic way to attempt this, not sure.

const io = @import("std").io;

error SpecificBadThing;

fn Result(comptime T: type) -> type {
  return enum {
    Ok: T,
    Err: error,
  }
}

const DoResult = Result(i32);

fn doSomethingOk() -> DoResult {
  DoResult.Ok { 42 }
}

fn somethingBad() -> Result(i32) {
  Result(i32).Err { error.SpecificBadThing }
}

pub fn main() -> %void {
  switch (doSomethingOk()) {
    // with interstitial type we don't have to mention type param here
    DoResult.Ok => |i|  %%io.stdout.printf("ok was ok! ({})\n", i),
    DoResult.Err => |e| %%io.stdout.printf("ok was bad... ({})\n", e),
  }

  switch (somethingBad()) {
    // without interstitial type we have to repeat type param
    // is this generating distinct types, or is there some unification/interning for reuse?
    Result(i32).Ok => |i|  %%io.stdout.printf("bad was ok... ({})\n", i),
    Result(i32).Err => |e| %%io.stdout.printf("bad was bad! ({})\n", e),
  }
}
@taoeffect

taoeffect commented Oct 22, 2017

Thinking on it more, my and { ... } syntax is effectively the same thing as a try { ... } exception handling mechanism, so yes, ultimately @kyle-github is correct, I believe I am advocating for a traditional exception handling mechanism.

I am also working on my own language (sexpr-based syntax) and traditional try/catch is what I'll be going with for that. That is what all other error handling techniques ultimately seem to boil down to, and it seems to be the simplest / purest form of error handling that I can think of.

@hasenj

hasenj commented Oct 23, 2017

I think the key issue I have is why consider "errors" as a special class of return values that must be handled otherwise the compiler complains? It should just be another value that the programmer is free to handle however he sees fit.

For example, if you call a function that performs some operation and returns a value, does the compiler force you to capture the return value and handle it?

If not, then error values should not be so special.

The only thing IMO the compiler should enforce is, when a function returns a union of types A | B, and the programmer wants to deal with the return as if it's type B, they must check so explicitly.

Otherwise it should be ok to ignore the return value.

@andrewrk

Member

andrewrk commented Oct 23, 2017

For example, if you call a function that performs some operation and returns a value, does the compiler force you to capture the return value and handle it?

Yes.

test "foo" {
    bar();
}

fn bar() -> bool {
    return true;
}
/home/andy/dev/zig/build/test.zig:2:8: error: expression value is ignored
    bar();
       ^
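Rust has a loosely comparable, opt-in mechanism, sketched here for comparison. It is not equivalent to the Zig behavior shown above: in Rust an ignored value is only flagged when the function or its return type is marked `#[must_use]`, and it is a warning unless the lint is raised to an error as done below. `bar` mirrors the Zig example; `caller` is a hypothetical name.

```rust
#[must_use]
fn bar() -> bool {
    true
}

#[deny(unused_must_use)]
fn caller() -> bool {
    // A bare `bar();` statement here would be rejected under the lint above.
    let _ = bar(); // explicitly discarding the value is accepted
    bar() // using the value is accepted
}
```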
@hasenj

hasenj commented Oct 23, 2017

I see, so like tiehuis mentioned earlier, my request is fulfilled by using:

_ = printf(....)

I just tried in my project to replace %%io.stdout.printf with _ = io.stdout.printf and it seems to work.

@1l0

1l0 commented Oct 23, 2017

You can do this:
io.stdout.printf("ignore error\n") %% |err| {}

@1l0

1l0 commented Oct 23, 2017

Related question. The following builds with no error. Is it a bug?

const io = @import("std").io;

pub fn main() -> %void {
    io.stdout.printf("no trailing semicolon\n")
}
@tiehuis

Member

tiehuis commented Oct 23, 2017

@1l0

That works because the return value of printf is %void and omitting the semicolon on the final statement in a block/function will treat that block/function as having that value.

This just returns the value returned by printf from the main function.
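Rust follows the same "final expression without a semicolon is the block's value" rule, so the behavior can be sketched there too. `print_line` is a hypothetical helper, and writing into a generic `Write` target instead of stdout is a choice made here so the output can be inspected.

```rust
use std::io::{self, Write};

/// The final expression has no trailing semicolon, so its `io::Result<()>`
/// becomes the function's return value and any write error reaches the caller.
fn print_line(out: &mut impl Write) -> io::Result<()> {
    writeln!(out, "no trailing semicolon")
}
```

The error is therefore not silently dropped: it is handed to whoever called the function, just as `main` receives the `%void` from `printf` in the Zig example.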

@1l0

1l0 commented Oct 23, 2017

@tiehuis Thanks, if it's not a bug it's ok.

@skyfex

skyfex commented Nov 30, 2017

Perhaps it would be a good idea to take some inspiration from Nim: I really like the "discard" statement to ignore a return value. For a touch typer, typing a word can be easier than a symbol, and for someone reading the code, it's a lot clearer.

In addition to copying the "discard" keyword, Zig could add something like "noerror" (or "ignore", "safe", ...)

noerror foobar()

var x = noerror foobar()

And to both discard and ignore the error:

discard noerror foobar()

@phase

phase commented Nov 30, 2017

@hasenj

I think the key issue I have is why consider "errors" as a special class of return values that must be handled otherwise the compiler complains? It should just be another value that the programmer is free to handle however he sees fit.

Programmers are forgetful, and if they forget to check an error then critical code can fail unexpectedly. This mechanism of "handle all errors" makes it so the programmer is required to make a conscious decision about how to handle any error that may occur in the program. Ignoring error values because you don't think they contain anything can lead to numerous errors.

But you shouldn't be required to handle every possible error that every function can return.

You're not, you're only required to handle the possibility of one error from failable functions. If you write code that doesn't cause errors, you don't need to handle them. Standard library calls like I/O always have a possibility of erroring, so we need to handle those. Having the program fail silently is bad for user experience.

andrewrk added a commit that referenced this issue Jan 7, 2018

@andrewrk andrewrk closed this in 3c09411 Jan 9, 2018
