Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make an RFC for multiple values #78

Open
sorawee opened this issue Jul 22, 2019 · 17 comments
Open

Make an RFC for multiple values #78

sorawee opened this issue Jul 22, 2019 · 17 comments

Comments

@sorawee
Copy link
Contributor

sorawee commented Jul 22, 2019

Do we need multiple values? If so, what's the best way to make it integrate well with the language? Right now, it's not.

As an example, if I have a function f that returns two values, and I want only the first returned value to get passed to function g, the best I can do is:

(let-values ([(x _) (f foo bar)])
  (g x))

Is there anything better that we can do?

@AlexKnauth
Copy link
Member

Can we have tuples instead?

@sorawee
Copy link
Contributor Author

sorawee commented Jul 23, 2019

Yeah, that's the idea. Multiple values, as I understand, is only needed for performance (https://www.cs.indiana.edu/~dyb/pubs/mrvs.pdf). But if we are going to have some kind of static analysis (see #57), it might be possible to make returning a tuple as efficient as multiple values.

@sorawee
Copy link
Contributor Author

sorawee commented Jul 23, 2019

How does Haskell deal with returning tuples?

@jeapostrophe
Copy link
Collaborator

My personal taste is that multiple values are theoretically beautiful and elegantly expose the ability of an advanced compiler to use a custom calling convention, but are extremely annoying to actually use in almost all circumstances in Scheme & Racket, for the reasons you mention.

@rocketnia
Copy link

I think Racket 2 would be simpler to use if it didn't require anyone to know about multiple-value return. It could still exist if it's needed for performance, but it can be relegated to a specialty thing.

Thinking about it in more of a positive brainstorming way, as in "Is there anything better that we can do?" I have several ideas.

To start with, procedures could have the same output convention as their input convention:

  • We could call values with keyword arguments.

  • We could call first-class continuations with keyword arguments.

  • We could use call-with-values to pass the keyword arguments from the result of an expression to the arguments of a procedure.

  • Things like define-values could use the same syntax as function parameter lists -- in particular, having support for keyword arguments.

To make this easier to work with, "multiple values (with possible keyword arguments)" (which we might dub a "values-and-keywords collection") could be the primary first-class value type of the language. Variables would abstract over that concept the same way expressions do:

  • We could refer to multiple values (with possible keyword arguments) using a single variable name.
; Bind `div-results` to `(values 5 4)`.
(define-values div-results (quotient/remainder 124 24))

; Bind `q` to `5` and `r` to `4`.
(define-values (q r) div-results)
  • Even within a values-and-keywords collection, each of the values and keyword bindings would themselves be a values-and-keywords collection. We'd need to watch out for infinite regress in the case of single-value collections, so the representation of at least those would need to be different at a low level.

  • The thing stored in a struct field would be a values-and-keywords collection.

  • The thing stored in a mutable storage location would be a values-and-keywords collection.

  • There could be read/write representations for values-and-keywords collections, and this notation could be used inside a quote as well.

With this infrastructure in place, some quality-of-life conveniences would suggest themselves:

  • Functions could use rest arguments that were bound to values-and-keywords collections rather than to lists.

  • There could be a variation of apply that expected to be given exactly two arguments: The function to call and the argument to pass to it. This argument could be multiple values (with possible keyword arguments), and all of them would be passed to the function.

  • The read/write/quote notation for values-and-keywords collections could replace lists as the primary s-expression data structure. A macro call would be a values-and-keywords collection where the first value was consulted to find the macro binding. A function call could evaluate each of the values-and-keywords entries, treat the first entry as a function, and pass in the rest of the entries as its arguments.

I have no idea what effect any of this would have on performance, nor how severe everyone's headaches would be as they tried to maintain each other's clever nested values-and-keywords data structures. :-p So I don't necessarily support it, but it's a fun thought experiment.

@jeapostrophe
Copy link
Collaborator

Check out this code from over nine years (don't actually know when I wrote it because I had this code outside of git back then): https://github.com/jeapostrophe/exp/blob/master/values.ss (plus the racket update: https://github.com/jeapostrophe/exp/blob/master/values.rkt ) --- It implements a bunch of these ideas, although not as consistently as you suggest.

I don't really support this idea either, but it is useful to write down and think about.

@rocketnia
Copy link

Whoa, nice! :)

@jackfirth
Copy link
Sponsor Collaborator

I think there's another cost to supporting multiple values: it encourages people to group values together in a semantics-free way, instead of creating a struct type with named fields that describe what the values are actually for. Semantics-free representations are great for generic code! They're terrible for everything else, since non-generic code tends to have, y'know, actual semantics associated with it.

@dedbox
Copy link

dedbox commented Jul 23, 2019

Do we need multiple values? If so, what's the best way to make it integrate well with the language? Right now, it's not.

Whoa, let's slow down a little.

Do we need multiple return values? Not to the extent that Racket wouldn't be Turing-complete if we removed them, but they are so convenient I consider them indispensable.

I'll outline just one way multiple return values make my life easier, but it's a big one.

Racket conceptualizes function invocation as the application of a procedure to an argument list. Since lists can have any number of elements, this perspective leads naturally to multivariate functions—such as thunks or functions with rest-args—and multivariate functions compose neatly with multiple return values.

There are useful symmetries between values and list; and between compose, curry, and apply. In combination, they are a potent DSL for multivariate function composition. It's no coincidence Algebraic Racket's prelude shortens most of these names to one or two characters.

Take the totem from my RacketCon talk slides, which looks something like this in Racket:

(compose (curry apply values) … list)

This pattern crops up everywhere when I'm using a compositional style. Algebraic Racket is designed around these functions, to support function composition as a deliberate programming style. My code is so much cleaner now, I'd need a good reason to go back.

Here's a real example, in the event monad:

(define (async-values-evt . evts)
  (if (null? evts)
      (return)
      (do let identified (φ e (fmap (>> :: e) e))
          ((e . x)) <- ($ choice-evt (map identified evts))
          xs <- ($ async-values-evt (remq e evts))
          ($ return (:: x xs)))))

A direct translation to Racket would look like this:

(define (async-values-evt . evts)
  (if (null? evts)
      (handle-evt always-evt (λ _ (values)))
      (let ([identified (λ (e) (handle-evt e (curry cons e)))])
        (replace-evt
         (apply choice-evt (map identified evts))
         (match-lambda
           [(cons e x)
            (replace-evt
             (apply async-values-evt (remq e evts))
             (λ xs (apply (λ ys (handle-evt always-evt (λ _ (apply values ys))))
                          (cons x xs))))])))))

Every line examines, generates, or brings together multiple arguments and multiple return values. Several lines do more than one of these things. Every (λ xs ...), for example, captures an indeterminate number of return values and then applys a function to them as its argument list. Each instance can be eliminated by composeing a curryed version of its body with list, and any list can be converted into multiple values by applying the values function to it.

More precisely, (curry apply values) and (curryr call-with-values list) are duals with respect to multivariate function composition, provided the multi-valued components of both can be wrapped in thunks.

As an example, if I have a function f that returns two values, and I want only the first returned value to get passed to function g, the best I can do is:

(let-values ([(x _) (f foo bar)])
  (g x))

Is there anything better that we can do?

For an expression this simple, let-values might be the best choice. For more complex expressions, here are a few other ways I might do it in Racket without going bananas:

(call-with-values (λ () (f foo bar)) (compose g car list))

(g ((compose car list (λ () (f foo bar)))))

Depending on the context, either of these might be "better" for some definition of "good."

Algebraic Racket makes this sort of multi-valued wizardry mundane, so I'd usually try a few and pick the simplest one. Directly translating the last two examples, I get:

(values-> (.. g car list) (f foo bar))

(g ((.. car list (>> f foo bar))))

And if we include the values monad, I have a few more options:

(fmap (.. g car list) (λ () (f foo bar)))

(>>= (λ () (f foo bar)) (.. g car list))

(lazy-do (x . _) <- (f foo bar)
         (thunk<- (g x)))

@rocketnia
Copy link

I'm glad you chimed in, @dedbox! It's good to hear from someone seriously using multiple-value return for point-free composition.

Based on your refactorings of this example, it looks like it could even be written ((compose g car list f) foo bar).

That said, in a language without multiple-value return, the design of f would be different, this example might simplify to (g (car (f foo bar))) (or even to use something with a more meaningful name than car), and I imagine you would still find it possible to use an arbitrary Racket procedure f in a point-free style by wrapping it up as (>> $ f)/(curry apply f). Are you thinking this wouldn't be as convenient as the techniques you're using now?

@sorawee
Copy link
Contributor Author

sorawee commented Jul 24, 2019

@dedbox

I concur with @rocketnia. In particular, every time you use list, you effectively switch from multiple values world into single value world, precisely because it's difficult to manipulate multiple values, and at that point, there's really no benefit in using values for performance.

Here's one way to translate code with values feature to one without it. Redefine values with list. Split call-with-values into two versions:

  1. (call-with-values-1 gen recv): operate on a function that returns "bare" value. Equivalent to (recv (gen)).

  2. (call-with-values* gen recv): operate on a function that returns multiple values (that is, list). Equivalent to (apply recv (gen)).

(also, if we wish, gen doesn't need to be thunk-ed anymore)

To lift from "bare" value to multiple values, simply apply values. To lift a function that returns a bare value to a function that returns multiple values, apply lift: (define (lift f) (lambda xs (values (apply f xs)))).

Assuming that compose works on functions that return multiple values. Here's one possible implementation: (define (compose f g) (lambda xs (apply f (apply g xs)))).

compose1 stays the same: (define (compose1 f g) (lambda (x) (f (g x)))).

The only thing that we will lose is the "implicit one value" ability. E.g., we won't be able to write (define (f x) (if x 1 (values 2 3))) anymore. Instead, we need to write (define (f x) (if x (values 1) (values 2 3))).

@dedbox
Copy link

dedbox commented Jul 24, 2019

@rocketnia,

Based on your refactorings of this example, it looks like it could even be written ((compose g car list f) foo bar).

Nice. It works!

That said, in a language without multiple-value return, the design of f would be different, this example might simplify to (g (car (f foo bar))) (or even to use something with a more meaningful name than car)

I don't follow. How does Racket know (f foo bar) is equivalent to (apply f (list foo bar)), and (g (car (f foo bar))) is equivalent to (apply g (list (car (f foo bar)))), but (car (f foo bar)) is equivalent to (apply car (f foo bar)) and not (apply car (list (f foo bar)))? What makes car special?

and I imagine you would still find it possible to use an arbitrary Racket procedure f in a point-free style by wrapping it up as (>> $ f)/(curry apply f).

Not a die-hard fan of point-free style, but I'm happy to use it when it gives good results.

Are you thinking this wouldn't be as convenient as the techniques you're using now?

I am pathologically pragmatic, so my techniques will adapt to whatever reality we find ourselves in. There are always trade-offs, so I'll delay speculating until I have a more concrete understanding of what you're proposing.

@sorawee,

every time you use list, you effectively switch from multiple values world into single value world

No, not really. Not in any meaningful sense.

precisely because it's difficult to manipulate multiple values,

Definitely not. It's usually because I want to map, filter, fold, or apply something across an argument list, as opposed to just one of its elements. It is an artifact of multiple arguments, not multiple return values. Would you feel different if we replaced list with + or void?

And besides, capturing multiple return values is as easy as (λ args ...), wherein I have the full force of Racket's list processing capabilities at my disposal. Generalizing args to a full-blown pattern makes the experience even better.

Here's one way to translate code with values feature to one without it. Redefine values with list. Split call-with-values into two versions:

1. `(call-with-values-1 gen recv)`: operate on a function that returns "bare" value. Equivalent to `(recv (gen))`.

2. `(call-with-values* gen recv)`: operate on a function that returns multiple values (that is, list). Equivalent to `(apply recv (gen))`.

I don't understand where you're going with this. The most prominent feature of multiple return values is that you can largely ignore them when you don't need them. If you're proposing every function have exactly one argument, you can already do that without any hardcore macro-fu. Go make a #lang univariate/racket/base for this perspective and tell me how it goes.

(also, if we wish, gen doesn't need to be thunk-ed anymore)

Not without some implicit notion of laziness, or am I missing something? If gen isn't a thunk, then what is it?

@rocketnia
Copy link

@dedbox,

That said, in a language without multiple-value return, the design of f would be different, this example might simplify to (g (car (f foo bar))) (or even to use something with a more meaningful name than car)

I don't follow. [...] What makes car special?

What I'm saying is that in a language without multiple-value return, whoever authored f couldn't have it return multiple values in the first place, so one likely possibility is that they'd have it return a list instead. That way we could just pass the list to car.

If it returned something more meaningful than a list (such as a struct, as @jackfirth is talking about), we could pass it to a more meaningful function than car.

@dedbox
Copy link

dedbox commented Jul 24, 2019

What I'm saying is that in a language without multiple-value return, whoever authored f couldn't have it return multiple values in the first place, so one likely possibility is that they
'd have it return a list instead. That way we could just pass the list to car.

But this doesn't eliminate the burden of discerning multiple arguments from a lone argument that happens to be a list. It merely shifts the burden away from the platform and toward the programmer in the form of ad hoc calling conventions and extra ``glue'' syntax.

One way to build a reference interpreter is to start with a single-argument lambda calculus with pairs, then use it to bootstrap into multiple arguments and more structured data. The tutorial series included with Algebraic Racket documents this technique. It even covers a syntax extension that exploits the ambiguity between list arguments and argument lists to provide first class pattern-based macros with the bare minimum parentheses.

All that is to say I have designed and implemented, on top of Racket, many languages with the sort of syntax you're proposing, and it isn't all roses. The devil is always in the details.

If it returned something more meaningful than a list (such as a struct, as @jackfirth is talking about), we could pass it to a more meaningful function than car.

While I'm generally sympathetic to Jack's desire to name all the things, I'm not buying the argument that structs with named fields are inherently more meaningful than lists. I mean, you can already do this today—with calling conventions and extra ``glue'' syntax. There's nothing stopping you from implementing a whole parallel universe where every function accepts and returns, say, a single hash. I studied a spectrum of variations on this theme throughout my CS Masters program. In practice, I think you'll find not everything needs a name, and too many names can be just as disorienting as too few.

@sorawee sorawee mentioned this issue Aug 3, 2019
@sorawee
Copy link
Contributor Author

sorawee commented Aug 4, 2019

Here's an optimization in Haskell:

If a function returns a tuple of values, and the caller immediately takes the tuple apart again, GHC will attempt to eliminate the tuple completely at the machine code level. Actually, this works for all types having exactly 1 constructor.

-- https://wiki.haskell.org/GHC_optimisations

@tgbugs
Copy link

tgbugs commented Aug 5, 2019

General principals question.

If the underlying runtime supports multiple values, or any language feature for that matter, should a language built on top of it have a construct that allows one to make use of that functionality?

If the answer to this question is no, then don't we still have to require that all compliant runtimes support that feature anyway due to of the requirement for interoperatilbity with existing Racket?

@jackfirth
Copy link
Sponsor Collaborator

If the underlying runtime supports multiple values, or any language feature for that matter, should a language built on top of it have a construct that allows one to make use of that functionality?

I don't think so, no. A language may do so, but I don't see any reason a language should or must.

If the answer to this question is no, then don't we still have to require that all compliant runtimes support that feature anyway due to of the requirement for interoperatilbity with existing Racket?

Yes. But for the most part, Racket2 RFCs aren't about the runtime. They're about the surface APIs. At least, that's been my assumption so far.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants