Allow defining multiple variables inside let ... in #1695

Open
giorgiga opened this issue Oct 21, 2023 · 10 comments

Comments

@giorgiga
Contributor

giorgiga commented Oct 21, 2023

Is your feature request related to a problem? Please describe.

I think most people will agree that this:

let
  foo = 1
in let
  bar = 2
in let
  baz = 3
in
  foo + bar + baz

is far less readable than:

let
  foo = 1,
  bar = 2,
  baz = 3
in
  foo + bar + baz

(plus it gets worse as expressions become more complicated)

Describe the solution you'd like

Allow declaring more than one variable inside let ... in, for example separating them with commas.

Describe alternatives you've considered

If commas pose some issue with the grammar (or are disliked), another option could be let { foo = 1, bar = 2 } in ... (but in this case one might want to add metadata to the variables, which IIUC let doesn't allow)

Additional context

I can work on a PR if you think the idea is viable.

(well... I think I can: I'm not really a rustacean, but this seems relatively simple to tackle as a first issue)

@yannham
Member

yannham commented Oct 22, 2023

Related: #504 (edit: #494). As you can see in the original issue, I'm rather in favor of this change 😛 but let's discuss that in the next weekly meeting, and in this issue, and see what comes out of it.

Another interesting aspect is that block let-bindings can have a performance impact. In a sequence, the environment is duplicated, then modified (insertion of the new binding), then duplicated again, and so on between each and every binding, because the interpreter has to assume that in let x = 1 in let y = _exp_ in body, _exp_ might depend on x (it could do additional analysis to rule that out, but that's not free).

If _exp_ doesn't in fact depend on x, this introduces a spurious dependency, which makes the environment less efficient (introducing sharing points by cloning after inserting turns the environment into something like a linked list instead of a hashmap, at least locally).

This might also inhibit reuse optimizations (currently, if you map on an array, and this array is not referenced by anyone else, you can map in place - but with spurious dependencies you might capture references through those let-bindings and unduly prevent reuse).

I honestly don't know if the impact is noticeable, especially now that we've got rid of the share normal form transformation (#1647), which was a huge source of sequences of independent bindings, but it's still an interesting semantic distinction: a let block expresses a form of independence and parallelism, a guarantee that might be used by the interpreter, while normal let-bindings must be assumed to be sequential and dependent.
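
To make the "linked list instead of a hashmap" remark concrete, here is a minimal sketch in Python. It is a toy model, not Nickel's actual environment implementation: the `Env` class and the helpers `sequential_lets` and `let_block` are hypothetical names introduced only for illustration. Each layer is an immutable dictionary with a link to its parent, and a lookup reports how many layers it had to visit.

```python
class Env:
    """One environment layer: a dict of bindings plus a link to the parent layer.

    Toy model only (an assumption for illustration), not Nickel's real structure.
    """

    def __init__(self, bindings, parent=None):
        self.bindings = dict(bindings)
        self.parent = parent

    def extend(self, new_bindings):
        # Never mutate an existing layer: other closures may still share it.
        return Env(new_bindings, self)

    def lookup(self, name):
        """Return (value, number of layers visited to find it)."""
        env, hops = self, 1
        while env is not None:
            if name in env.bindings:
                return env.bindings[name], hops
            env, hops = env.parent, hops + 1
        raise NameError(name)


def sequential_lets(env, bindings):
    # let foo = .. in let bar = .. in ...: one new layer per binding.
    for name, value in bindings:
        env = env.extend({name: value})
    return env


def let_block(env, bindings):
    # let foo = .., bar = .. in ...: a single layer holding every binding.
    return env.extend(dict(bindings))


root = Env({})
bindings = [("foo", 1), ("bar", 2), ("baz", 3)]
chained = sequential_lets(root, bindings)
block = let_block(root, bindings)

print(chained.lookup("foo"))  # (1, 3): walks three single-binding layers
print(block.lookup("foo"))    # (1, 1): found in the first layer
```

In this model the chained form degrades into a list walk for the oldest binding, while the block form resolves every name in one dictionary lookup. Whether the difference is measurable in the real interpreter is exactly the open question raised above.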

@giorgiga
Contributor Author

That's great @yannham

Another option just came to mind: one could simply drop the "intermediate" ins or replace them with commas (let a = 1 let b = 2 in a + b or let a = 1, let b = 2 in a + b)

For reference, let me name the ideas so far (and format them in a more concise way) - I'll follow up shortly with some additional notes.

1. strict let..in (the current syntax):

   let foo = 1
in let bar = 2
in let baz = 3
in     foo + bar + baz

2. let block:

let  foo = 1,
     bar = 2,
     baz = 3
 in foo + bar + baz

3. let list with commas:

let foo = 1,
let bar = 2,
let baz = 3
 in foo + bar + baz

4. let list with no commas:

let foo = 1
let bar = 2
let baz = 3
 in foo + bar + baz

@giorgiga
Contributor Author

The TLDR is "I think the let block option is the winner"; below is the long version, but it's probably not interesting enough to justify its length (since I've written it, I'll post it anyway in case anyone is curious).

Long version

For the sake of exploring how things may end up looking, below I'll pretend nickel has added haskell-like where and nix-like with. Please don't take this as me assuming the language will evolve in any specific way (or "at all"): it's just for brainstorming.

The examples below will not include expressions where there is no let (eg. nix's classic with pkgs; [ pkg1 pkg2 ])... this is not in-topic, but let me just note here that, in case one wants some sort of separator between with and the value expression (because with pkgs [ pkg1 pkg2 ] is difficult to parse, visually or programmatically), an option would be to exchange with with some imperative verbal form (say, use) and recycle let's in as a separator (ie: use pkgs in [ pkg1 pkg2 ]). Other words may also be exchanged for where, of course (for example with itself or -ing verbal form like using or letting).

Formatting will probably look unusual (it's in the style I use for SQL queries)... since we are talking about language features that don't exist, it's not like there's an established "idiomatic" style, so everyone will have to re-format according to their own style to get an idea for how things could look.

Also, I will not go into what is easy to write a parser for, in part because I don't know the capabilities of that lalrpop parser generator you are using, and in part because I feel technical compromises are best left for later.

1. strict let ... in (the current syntax):

  with std.number
  with std.string as str
   let foo = 1
in let bar = 2
in     (min foo bar) + (max baz qux)
 where baz = 3
 where qux = 4

I find this is verbose and distracting... most of all, the "main" point (the actual expression (min foo bar) + (max baz qux)) does not stand out from the boilerplate.

One-liners (eg. let a = 1 in let b = 2 in a + b) are not particularly easy to visually parse: when you see the first let your eye looks for the matching in and only when you realize it's an in let you iterate until you get to the "main" expression (it may be that that's just how I read things: IDK if people normally make sense of expressions top-down like I do or bottom-up... at least someone else does it like me, otherwise haskell would not have its where).

2. let block:

 with std.number,
      std.string as str
  let foo = 1,
      bar = 2
   in (min foo bar) + (max baz qux)
where baz = 3,
      qux = 4

This neatly divides the expression into "blocks" (similar to how SQL statements are structured).

The "main" expression still doesn't stand out very much (IMHO the bar = 2 stands out more, and you'll have to put each let definition on its own line with non-trivial expressions)... however, it's easy enough to find (you just have to look for the in).

Commas should be allowed after all lines (eg. after std.string as str, bar = 2 and possibly also in (min foo bar) + (max baz qux)).

3. let list (no commas):

 with std.number
 with std.string as str
  let foo = 1
  let bar = 2
  let baz = 3
   in (min foo bar) + (max baz qux)
where baz = 3
where qux = 4

The one above is not the best example for this, but this form (like the next one) works well if there are other "statements" that can be intermixed with let variable definitions (none of them comes to mind, so I didn't change the example).

This style is reminiscent of imperative languages and so it may prove more familiar/intuitive to those programmers (the majority?) who have only been exposed to imperative languages.

4. let list with commas:

 with std.number,
 with std.string as str,
  let foo = 1,
  let bar = 2,
  let baz = 3
   in (min foo bar) + (max baz qux)
where baz = 3
where qux = 4

This is a variant of the previous one so it shares most of its pros/cons.

Personally, I find that the comma helps in visual parsing, but it also looks a bit strange here, as it kind of wants to be seen as an imperative-style statement terminator in an expression language.

@jneem
Member

jneem commented Oct 25, 2023

👍 for "let blocks". I like to keep the fact that there is one "let" for every "in"

@giorgiga
Contributor Author

@yannham about my offer to submit a PR... it's gonna take a while (and you'll probably be better off having someone who is already familiar with the language/tooling/project do this) :(

From what I've seen in the code, implementing this is gonna require quite extensive changes, and doing that while also learning rust+lalrpop and maintaining the current (half?) implementation of rec and the (undocumented?) let patterns is bound to require much more effort than I anticipated... which also means several other tasks are bound to end up before this one in my TODO list.

@thufschmitt
Member

👍 also for “let blocks”. It also brings a nice syntactic symmetry between let blocks and records (Nix has that, and I find it quite practical when refactoring code, as changing a record to a set of let bindings, or the other way around, is somewhat frequent).

For symmetry’s sake, I'd also weakly suggest making these blocks recursive by default (like records are)

@yannham
Member

yannham commented Oct 28, 2023

> For symmetry’s sake, I'd also weakly suggest making these blocks recursive by default (like records are)

Ah, there has been some discussion about this in the original issue, see #494 (comment) for example. Although it's a bit less consistent syntax-wise, I think recursion by default is the right choice (i.e. the most common / natural thing to do) for records, but isn't for lets in general (and single lets aren't recursive by default, as mentioned in the comment, which would be a surprising difference).

IMHO the less evil/surprising inconsistency is to keep lets, including let-blocks, not recursive by default, and records recursive by default. They have different semantics and express a different intention after all, even if the syntax is similar. I rarely write recursive lets (including in other functional languages), but I reckon we often use recursion inside Nickel records. As always I'm happy to change my mind if there are strong arguments in the other direction!
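
The difference between a non-recursive and a recursive block can be sketched in Python as follows. This is only an illustrative model under stated assumptions: `let_nonrec` and `let_rec` are hypothetical helpers, definitions are modelled as functions of the environment they are allowed to see, and evaluation is eager, unlike Nickel's lazy evaluation.

```python
def let_nonrec(env, bindings, body):
    """Non-recursive block: definitions only see the enclosing environment."""
    new = {name: definition(env) for name, definition in bindings}
    return body({**env, **new})


def let_rec(env, bindings, body):
    """Recursive block (record-like): definitions see the block's own bindings."""
    rec_env = dict(env)
    for name, definition in bindings:
        # Each closure captures rec_env itself, so names added later in the
        # loop are still visible when the closure is eventually called.
        rec_env[name] = definition(rec_env)
    return body(rec_env)


# Two mutually recursive functions, written against an explicit environment.
bindings = [
    ("is_even", lambda env: lambda n: True if n == 0 else env["is_odd"](n - 1)),
    ("is_odd", lambda env: lambda n: False if n == 0 else env["is_even"](n - 1)),
]
body = lambda env: env["is_even"](10)

print(let_rec({}, bindings, body))  # True

try:
    let_nonrec({}, bindings, body)
except KeyError as err:
    # is_even was built against the empty outer environment, so is_odd is unbound.
    print("unbound in non-recursive block:", err)
```

With non-recursive semantics a name on the right-hand side always refers to the enclosing scope, which is what single lets do today. The recursive variant behaves like a record, where fields can refer to each other.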

@toastal

toastal commented Dec 6, 2023

What’s wrong with

let foo = 1 in
let bar = 2 in
let baz = 3 in
foo + bar + baz

?

Even in a more verbose case nothing wrong IMO with

let foo = 1 in
let bar = {
	qux = 1,
} 
in # could cuddle this up a line?
let baz = [
	0,
	1,
	2,
] 
in
foo + bar + baz

When you put in in the front and try to align everything all weird-like, of course it looks strange & verbose. There seems to be a lot of focus on alignment & symmetry that aren’t really useful if not aiming for that specific style as a goal. let … in for each new variable seems to be just fine for the Dhall & OCaml projects…

@yannham
Member

yannham commented Dec 6, 2023

> What’s wrong with

Well, first, nothing is wrong with that 🙂 note that it only works for one-line definitions, though. If your definition is too long to fit on one line, then nickel format will rather lay it out as in the original example of this issue.

Besides, the discussion revolves around the following points:

  • aesthetics: some people find it more pleasant/more readable to declare variables as:

    let
      foo = 1,
      bar = 2,
      baz = 3,
    in
    

    than

    let foo = 1 in
    let bar = 2 in
    let baz = 3 in
    
  • semantics: in a let-block, bindings would be guaranteed to be independent (which is, most of the time, what you want), while they are not in chains of let-bindings. That is, the behavior of

    let x = 1 in
    let y = x + 1 in
    ...
    

    is not the same as

    let
      x = 1,
      y = x + 1,
    in
    ...
    

    Although it's hard to say if it's noticeable, this might in fact have performance implications. Take let foo = <foo_def> in let bar = <bar_def> in <continuation>, where bar doesn't actually depend on foo. Currently, no analysis is performed to tell if <bar_def> uses foo or not, and because it might very well do so, <foo_def> and <bar_def> don't share the same environment (<bar_def> has the definition of foo added in it, and the <continuation> has both). Environment insertions have an impact on the structure of the environment, and thus on its performance (currently a long sequence of let-bindings produces an environment that looks more or less like a linked list, locally).

    On the other hand, in a let-block version let foo = <foo_def>, bar = <bar_def>, in <continuation>, 1. all definitions share the same original environment and 2. the <continuation> sees all the definitions at once in an environment layer, which is locally like a hashmap rather than a linked list.

    This is to be taken with a grain of salt, because some simple free variable analysis phase could, in the future, detect when there's no syntactic dependency between two consecutive lets, but it's still additional machinery. The data structure implementing the environment might change as well. But, in general, insertions/modifications on shared data will always have a potential cost for a persistent hashmap.

  • there's currently no direct way to declare a bunch of mutually recursive functions outside of a record. You can still do let {f, g} = {f = ..., g = ...}, which is an OK workaround (and, honestly, I haven't had the need for local mutually recursive definitions like this until now in Nickel). I still mention this for the sake of exhaustiveness.
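
The semantic difference described in the second point can be sketched with a tiny Python model. It is an illustration under stated assumptions, not the interpreter's code: `eval_let_chain` and `eval_let_block` are hypothetical helpers, each definition is modelled as a function of the environment it is evaluated in, and evaluation is eager whereas Nickel is lazy.

```python
def eval_let_chain(env, bindings, body):
    """let x = .. in let y = .. in body: each definition sees the previous ones."""
    for name, definition in bindings:
        env = {**env, name: definition(env)}
    return body(env)


def eval_let_block(env, bindings, body):
    """let x = .., y = .. in body: every definition sees only the original env."""
    new = {name: definition(env) for name, definition in bindings}
    return body({**env, **new})


# x = 1, y = x + 1, and the body is x + y.
bindings = [("x", lambda env: 1), ("y", lambda env: env["x"] + 1)]
body = lambda env: env["x"] + env["y"]

outer = {"x": 10}  # an x that already exists in the enclosing scope

print(eval_let_chain(outer, bindings, body))  # 3: y uses the new x (1 + 1)
print(eval_let_block(outer, bindings, body))  # 12: y uses the outer x (10 + 1)

try:
    eval_let_block({}, bindings, body)
except KeyError as err:
    # With no outer x, the block form cannot resolve x inside y's definition.
    print("unbound variable in let block:", err)
```

Because all definitions in the block are evaluated against the same original environment, the interpreter never needs an intermediate environment per binding, which is the source of the potential performance difference mentioned above.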

@yannham
Member

yannham commented Dec 6, 2023

> let … in for each new variable seem to be just fine for the Dhall & OCaml projects…

By the way, Dhall actually does have a simple form of let block (where the let is repeated but not the in: dhall-lang/dhall-lang#266). As do Nix and Elm. And Haskell. In fact, OCaml has a let .. and .. construct that has the exact same semantics as the let-blocks described in this issue but, for some reason, it's not used much outside of mutually recursive definitions.
