RecordDotSyntax language extension proposal #282

Merged
merged 53 commits into from May 3, 2020

@ghost ghost commented Oct 11, 2019

The proposal has been accepted; the following discussion is mostly of historic interest.


We propose a new language extension RecordDotSyntax that provides syntactic sugar to make the features introduced in the HasField proposal more accessible, improving the user experience.

Rendered

@ghost ghost closed this Oct 11, 2019

@simonpj simonpj commented Oct 11, 2019

The link is broken in the rendered proposal, where it says "This proposal is discussed at this pull request"


@simonpj simonpj commented Oct 11, 2019

I strongly support the direction of travel of this proposal. I've wanted to use dot-notation for record selection since forever; and this looks like a very plausible way to do so. The fact that it's been used extensively in a production context helps reassure me that there aren't unexpected consequences.

I'd like more clarity about white space.

  • f.x means getField @"x" f
  • f .x (note the space after f) means.... what? Perhaps f (\r -> r.x)?
  • f (.x) presumably really does mean f (\r -> r.x)
  • f(.x) presumably means the same thing.

Perhaps the right way to think about it is that .x is a postfix operator. You cannot put any white space after the dot, but you can always put as much whitespace as you like before the dot.

GHC already supports postfix operators: here is the manual section. It would be good to check that the proposal is compatible with treating .x as a postfix operator in the sense of that section.

@ghost ghost reopened this Oct 11, 2019
Author

@ghost ghost commented Oct 11, 2019

The link is broken in the rendered proposal, where it says "This proposal is discussed at this pull request"

So fast Simon! I was just fixing it 😉

Author

@ghost ghost commented Oct 11, 2019

You cannot put any white space after the dot, but you can always put as much whitespace as you like before the dot.

Yes, that's how it behaves.

Author

@ghost ghost commented Oct 11, 2019

I'd like more clarity about white space.

  • f.x means getField @"x" f

Yes.

  • f .x (note the space after f) means.... what? Perhaps f (\r -> r.x)?

f .x is the same as f.x (that is, getField @"x" f).

  • f (.x) presumably really does mean f (\r -> r.x)

Exactly.

  • f(.x) presumably means the same thing.

Yes.


@simonpj simonpj commented Oct 11, 2019

Thanks. Perhaps in due course update the proposal to make these points clear.
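
For reference, the desugarings discussed in the comments above can be written out by hand with GHC.Records, without any new syntax. The following is a minimal sketch, assuming a hypothetical record type Point with fields x and y (it is not taken from the thread); it spells out what r.x and the section (.x) are described as expanding to.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeApplications #-}

import GHC.Records (HasField (..))

-- Hypothetical record, used only for illustration.
data Point = Point { x :: Int, y :: Int }

-- What the thread says `p.x` (and `p .x`) stands for.
xOf :: Point -> Int
xOf p = getField @"x" p

-- What the thread says the section `(.x)` stands for: \r -> r.x
xsOf :: [Point] -> [Int]
xsOf = map (\r -> getField @"x" r)
```

GHC solves the HasField "x" Point Int constraint automatically because the field x is in scope, so no instance has to be written by hand.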


Below are some possible variations on this plan, but we advocate the choices made above:

* Should `RecordDotSyntax` imply `NoFieldSelectors`? They are often likely to be used in conjunction, but they aren't inseparable.
Contributor

@adamgundry adamgundry Oct 11, 2019

I'd argue that it should not. Nothing about this proposal requires that selectors not exist, it is merely helpful to avoid clashes (for which DuplicateRecordFields would work pretty much equivalently).

If a consensus emerges that RecordDotSyntax + NoFieldSelectors + ... is the right way forward, we could easily add a new extension Haskell2030Records that implies the conjunction.

Contributor

@adamgundry adamgundry commented Oct 11, 2019

In addition to the lack of type-changing updates, the HasField approach is limited compared to existing selector functions in that it cannot support higher-rank fields. (This is arguably more related to the HasField proposal, but perhaps worth flagging up in this context as well.) For example, given

data T = MkT { foo :: forall a . a -> a }

then foo (MkT id) is well-typed but getField @"foo" (MkT id) is not.

Perhaps this use is rare enough that users can re-enable selector function generation (or define their own selectors) in this case.
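
The selector half of this observation can be checked directly. The sketch below reuses the T type from the comment; the names usesSelector and fooOf are hypothetical helpers added for illustration. It shows the ordinary selector being used at two different types, plus a hand-written accessor of the kind suggested as a fallback. The getField @"foo" variant is left out because, per the comment above, it does not typecheck.

```haskell
{-# LANGUAGE RankNTypes #-}

data T = MkT { foo :: forall a. a -> a }

-- The generated selector keeps the field polymorphic,
-- so one record value can be used at Int and at Bool.
usesSelector :: (Int, Bool)
usesSelector = (foo t 1, foo t True)
  where
    t = MkT id

-- A hand-written accessor via pattern matching (hypothetical name).
fooOf :: T -> b -> b
fooOf (MkT f) = f
```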

Contributor

@phadej phadej commented Oct 11, 2019

{-# LANGUAGE RecordDotSyntax #-}

import qualified Foo

data Foo = Foo

instance HasField "name" Foo () where hasField = ...

something = ... Foo.name

is that Foo.name

  • a name from Foo module
  • or getField @"name" Foo

EDIT:

Foo.name  -- name from module Foo
Foo .name -- getField ...

Proposal should point this out.

Contributor

@nomeata nomeata commented Oct 11, 2019

Perhaps this use is rare enough that users can re-enable selector function generation (or define their own selectors) in this case.

It may be rare, but it turned out to be a blocker for various ways of fixing #216.


@parsonsmatt parsonsmatt commented Oct 11, 2019

The loss of polymorphic update is a huge problem, imo.

Author

@ghost ghost commented Oct 11, 2019

Foo.name

This parses as a qualified variable, not field selection.

Contributor

@int-index int-index commented Oct 11, 2019

The e{lbl * val} syntax is a bit perplexing to me. Firstly, it has no = sign, making it hard to recognize at first that there's an update going on here. Secondly, it makes commutative operators non-commutative. That is, the following two lines are not equivalent:

  1. c{taken.year + n}
  2. c{n + taken.year}

I would propose that we introduce a e{lbl * = val} syntax instead. The examples from the proposal would look like:

addYears :: Class -> Int -> Class
addYears c n = c{taken.year + = n} -- update via op

squareUnits :: Class -> Class
squareUnits c = c{units & = \x -> x * x} -- update via function

It would also nicely parallel the C++ syntax +=, -=, *=, etc, differing only in whitespace.
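
Read as ordinary Haskell, the two examples are presumably meant to behave like the hand-written updates below. This is a sketch under assumptions: the Date and Class types are guessed from the field names (the proposal's own definitions are not shown in this thread), and the bodies use only standard record syntax.

```haskell
-- Assumed shapes; the proposal's real types may differ.
data Date = Date { year :: Int } deriving (Eq, Show)
data Class = Class { taken :: Date, units :: Int } deriving (Eq, Show)

-- Update via operator: the field's current value is the left operand.
addYears :: Class -> Int -> Class
addYears c n = c { taken = (taken c) { year = year (taken c) + n } }

-- Update via function: apply the function to the field's current value.
squareUnits :: Class -> Class
squareUnits c = c { units = (\x -> x * x) (units c) }
```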

Contributor

@phadej phadej commented Oct 11, 2019

The getField part looks not like PostfixOperators but like TypeApplications

foo @Int .field1 @Bar .field2

I can imagine having an Overloaded:RecordDots option in overloaded plugin, as val .name could be desugared into whatever @"name" val, not only getLabel

So for me, the getField part of this proposal is simply introducing some new expression syntax. More things to play with.

Contributor

@phadej phadej commented Oct 11, 2019

In #282 (comment) the

f .x (note the space after f) means.... what? Perhaps f (\r -> r.x)?

f .x is the same as f.x (that is, getField @"x" f).

The proposal should somehow specify how juxtaposition works now, it looks like some will be more binding than another:

someFun record .field .field2 value

should probably be parsed the same as

someFun record.field.field2 value

i.e.

(someFun ((record.field).field2)) value

Should there be warnings for an omitted leading space? It feels like the "don't use tabs" thing.

Author

@ghost ghost commented Oct 11, 2019

Foo.name  -- name from module Foo
Foo .name -- getField ...

Proposal should point this out.

I shall take care of it.

Author

@ghost ghost commented Oct 11, 2019

The proposal should somehow specify how juxtaposition works now, it looks like some will be more binding than another:

someFun record .field .field2 value

should probably be parsed the same as

someFun record.field.field2 value

i.e.

(someFun ((record.field).field2)) value

As it's implemented in the prototype, function application is taking precedence over field projection so f a.foo.bar.baz.quux 12 parses as ((f a).foo.bar.baz.quux) 12. To treat the first argument to f as a projection of a, write f (a.foo.bar.baz.quux) 12 and f (a .foo .bar .baz .quux) 12 is equivalent. Update : #282 (comment)

Contributor

@phadej phadej commented Oct 11, 2019

As it's implemented in the prototype, function application is taking precedence over field projection

That definitely should be pointed out in the proposal. It's the opposite of what I thought.

Author

@ghost ghost commented Oct 11, 2019

That definitely should be pointed out in the proposal.

Maybe best to go to the unresolved questions section. Hard to call the "right" parse here.


@gbaz gbaz commented Oct 11, 2019

In my opinion, this proposal leaves at least one huge thing to be desired. It provides an "alternate route" to compositional projection (besides generating projection functions directly) by having the . denote a projection function. However it does not provide any route to compositional update. In our experience (at awake), this feature is key. In particular, having sugar for both makes the "stealing" of the syntax from idiomatic lens usage much less painful.

Here's one way to do it. Just as (.lbl) expands to (\x -> x.lbl), have:

(.=lbl) ==> (\x v -> x{lbl = v})

[of course, pick your exact syntax poison of choice -- &= is another good candidate].

One nice thing here is you can even parse the "assignment" as an infix operator if desired, so x .=lbl v reads as x with lbl assigned to value v.

This could also desugar to using SetField if you prefer. However, I confess by the way I don't understand why SetField is used at all in this proposal?

I.e., it has e{lbl = val} ==> setField @"lbl" e val. But why not just leave it as is? The subsequent (nested) desugaring e{lbl1.lbl2 = val} ==> e{lbl1 = (e.lbl1){lbl2 = val}} would still work correctly, no? And furthermore, if SetField isn't used, then doesn't polymorphic record update still work?
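
The nested desugaring and the update section suggested in this comment can be written by hand with plain record syntax. A minimal sketch follows, assuming hypothetical Outer and Inner types whose fields are named lbl1 and lbl2 to match the comment; the function names are likewise made up for illustration.

```haskell
-- Hypothetical types; only the field names come from the comment.
data Inner = Inner { lbl2 :: Int } deriving (Eq, Show)
data Outer = Outer { lbl1 :: Inner } deriving (Eq, Show)

-- Hand-written form of: e{lbl1.lbl2 = val}
setNested :: Outer -> Int -> Outer
setNested e val = e { lbl1 = (lbl1 e) { lbl2 = val } }

-- Hand-written form of the suggested section: \x v -> x{lbl1 = v}
setLbl1 :: Outer -> Inner -> Outer
setLbl1 = \x v -> x { lbl1 = v }
```

Because this version uses ordinary record update rather than a class method, it does not depend on SetField at all.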


@simonpj simonpj commented Oct 11, 2019

I shall take care of it.

It'd be great to update the proposal to cover all the syntactic questions here, so that it stands by itself without reading the discussion thread. For example

As it's implemented in the prototype, function application is taking precedence over field projection so f a.foo.bar.baz.quux 12 parses as ((f a).foo.bar.baz.quux) 12. To treat the first argument to f as a projection of a, write f (a.foo.bar.baz.quux) 12 and f (a .foo .bar .baz .quux) 12 is equivalent.

Make sure the proposal says all this!

Author

@ghost ghost commented Oct 11, 2019

I shall take care of it.

It'd be great to update the proposal to cover all the syntactic questions here, so that it stands by itself without reading the discussion thread. For example

As it's implemented in the prototype, function application is taking precedence over field projection so f a.foo.bar.baz.quux 12 parses as ((f a).foo.bar.baz.quux) 12. To treat the first argument to f as a projection of a, write f (a.foo.bar.baz.quux) 12 and f (a .foo .bar .baz .quux) 12 is equivalent.

Make sure the proposal says all this!

Understood Simon. On it... Done.


@cocreature cocreature commented Oct 11, 2019

This could also desugar to using SetField if you prefer. However, I confess by the way I don't understand why SetField is used at all in this proposal?

This proposal tries to solve two things at once (which could probably be pointed out more clearly in the proposal):

  1. The use of dot-syntax and together with that an easy way to deal with nested fields. This is independent of SetField.
  2. A better solution to colliding field names than DuplicateRecordFields. This depends on SetField.

Contributor

@phadej phadej commented Oct 11, 2019

@gbaz, one value of desugaring to something using a class is that one can write manual instances of the class. I.e. you can do "Classy Lenses" stuff. Compare with writing "foo" with and without {-# language OverloadedStrings #-}. Yet, {-# language OverloadedRecordDots #-} would be too much, i.e.

  • plain {-# language RecordDotSyntax #-} would desugar to selector functions and record updates and
  • {-# language RecordDotSyntax, OverloadedRecordDotSyntax #-} would desugar to type-class members

It's silly to be so granular here, but OTOH there are various things happening: new syntax, and overloading that syntax with existing HasField functionality.


@cocreature was first, and I agree

This proposal tries to solve two things at once (which could probably be pointed out more clearly in the proposal)

Contributor

@goldfirere goldfirere commented Oct 11, 2019

When RecordDotSyntax is in effect, the use of '.' to denote record field access is disambiguated from function composition by the absence of whitespace trailing the '.'.

In the language of https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0229-whitespace-bang-patterns.rst, . will be a prefix operator.


In the bit about update syntax of pbind, why does qvar appear in one production but just var in the others?


@parsonsmatt

The loss of polymorphic update is a huge problem, imo.

Why? Have you used polymorphic update? I agree that it's nice and compositional to have polymorphic update, but is it actually useful in practice?


For update sections (along the lines of #282 (comment)), I humbly submit (.lbl =) as the syntax. (Whitespace before the = not significant, but I like it.) The proposals from @gbaz above use .= and &=, which are both valid operators.


@fsoikin fsoikin commented Oct 11, 2019

I'm a bit worried about polymorphic fields. I agree it's not a frequent use case, but at least I'd very much like to have an escape hatch.

To that end, I'd like to clarify: in the worst case scenario, a manual accessor written via pattern matching would still work, right? E.g.

data T = T { x :: forall a. a -> a }

x :: T -> (forall a. a -> a)
x T { x = r } = r


@simonpj simonpj commented Apr 3, 2020

Friends

As the shepherd for this proposal, I'm happy to say that the GHC Steering Committee has, finally, come to a conclusion: we accept the proposal, subject to final revisions (see “What happens next”), with some additional specifics about syntax (see “Our conclusion”).

The process

  • The proposal is entirely about syntax; and specifically about introducing the form r.x for record field selection. No changes to the underlying type system, or any other aspect of the language, are proposed. The original proposal was more elaborate but was simplified to focus on the essentials.

  • Records constitute a particularly rich and complex design space, and elicit an unusually broad range of opinions. This proposal attracted over 500 comments, a record for a GHC proposal.

  • This diversity of opinion was reflected in the committee, as you can see if you read the committee's email discussion.

One possible reaction to a diversity of opinion is to do nothing and wait for clarity to emerge. That is often the right choice, but not (I believe strongly) in this case. We have waited a long time already -- I have been engaged in debate about this topic for over two decades -- and I think it's time to decide something. The details matter, though. So we did this:

  • We put together a list of choices, which you can find here, including "reject" (which some members argued for). Members were encouraged to add any choices they felt were missing, and clarify any choices on offer so that everyone understood them. (A note about the Elm “naked selector” possibility, where naked selectors are ordinary functions: no member asked to add this choice.)
  • Once that discussion had concluded, I called a vote. We voted using the Condorcet algorithm to take account not only of people's top choice, but of the ordering of their choices. This process allowed us to express our preferences, while respecting the fact that others might have a different view.

Our conclusion

Happily, there was a clear winner: choice (C2a) beats every other option by 7:4 or more. It is as follows:

  • The form .x or .x.y.z is a new lexeme, called a “naked selector”.

  • A naked selector is illegal, except when enclosed in parens: (.x.y.z) means (\r -> r.x.y.z); that is, it is a record selector. So you can write map (.x) rs to map a selector over a list of records.

  • “r.” with no immediately-following “x” is two lexemes “r” and “.”, i.e. function composition. So (r. x) means (r . x).

  • The form r.x (with no spaces on either side of the dot) is not treated as a naked record selector; instead it is treated as an atomic expression, very like a qualified name M.x. So f r.x means f (r.x).

  • The same holds if r is replaced by a parenthesised expression. So f (g 3).x means f ((g 3).x). Or, in general, any atomic expressions, such as [a,b].

Examples

f r.x        means       f (r.x)
f M.n.x      means       f (M.n.x)
f M.N.x      means       f (M.N.x)
f r .x       is illegal
f (g r).x    means       f ((g r).x)
f (g r) .x   is illegal

As you can see, it's a pretty conservative choice. We spent a lot of time discussing more expressive options. Especially, we considered naked selectors as postfix operators, so that

  f .x .y .z   means   f.x.y.z
  f .x 3 .y 4  means   ((f.x) 3).y 4

allowing whitespace between the selectors. You can see a number of variants of this idea among the choices the committee discussed. But we chose a more conservative path for now. By making naked record selectors illegal (except in parens) we leave those possibilities open for the future, as we get more experience. (This is one reason for not adopting Elm's use of naked record selectors without parens.)

Another principle was that it should be possible to replace a variable with the expression it is bound to, so that wherever you can write r.x you can also write (f 3).x.
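
The rules above can be tried out in a GHC that implements this design. The extension shipped in GHC 9.2 under the name OverloadedRecordDot; that name does not appear in this thread, and the Person and Company types below are hypothetical. A minimal sketch of the three legal forms:

```haskell
{-# LANGUAGE OverloadedRecordDot #-}

-- Hypothetical records, used only for illustration.
data Person = Person { name :: String, age :: Int }
data Company = Company { owner :: Person }

-- `succ c.owner.age` is read as `succ (c.owner.age)`.
nextOwnerAge :: Company -> Int
nextOwnerAge c = succ c.owner.age

-- A naked selector is legal once enclosed in parens.
names :: [Person] -> [String]
names = map (.name)

-- A variable can be replaced by a parenthesised expression.
ageOfNew :: Int -> Int
ageOfNew n = (Person "anon" n).age
```

Writing `succ c .owner.age` or `(Person "anon" n) .age`, with a space before the dot, falls under the "is illegal" rows of the table above.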

What happens next

I'd like to invite the authors to update their proposal to incorporate (C2a), and in particular

  • To write out the changes to the grammar, especially how to deal with specifying aexp2.x (see Note 5)
  • To address the question of what happens if a field name is an operator (see Note 6)
    data T = MkT { (&&&) :: Int -> Int, ... }
    

Then the committee can do a final review and sign it off.

Many people have contributed to this conversation -- thank you! The fact that we don't all agree 100% is OK -- people's judgements differ. Many, perhaps most, people (myself included) would have made a different choice left to themselves. But we have a way to resolve such differences of judgement, which we have executed, and I respect the conclusion.


@Tarmean Tarmean commented Apr 3, 2020

This solution seems to cover all important use cases while being the least surprising to new users. Seems like a great approach.

Question about the implementation: Many imperative languages support type-driven field name completion when the user types "name.". I am assuming ghci could add support for this without another proposal? Technically it's probably possible by parsing the :info output, but something less fragile would be nice.

Contributor

@ndmitchell ndmitchell commented Apr 3, 2020

As one of the authors, we'd like to thank @simonpj for all the work he's put into this shepherding process! We confirm that we, the authors, will follow the next steps, although note that due to a combination of health, childcare and work commitments it may take a few weeks. We're also happy with the choice of a conservative C2a.

@Tarmean using :instances I think you might already get the info, as this is just another instance. However, rather than putting this through a command line interpreter tool with a textual syntax, which is always going to be somewhat fragile, it would be much better put in an API. I hope something like Ghcide will support this feature as soon as it makes its way into GHC proper.


@ocharles ocharles commented Apr 3, 2020

Forgive me if this is noise, but I just wanted to express my support of the committee's decision. As a professional Haskell user, C2a will still greatly improve my QOL in my day to day job, and I agree it leaves the door open for future improvements. And I also want to say a huge thank you to everyone who was involved in this - the original authors, the committee, and everyone who has participated here. You're all amazing!

Contributor

@shayne-fletcher shayne-fletcher commented Apr 11, 2020

@simonpj

I'd like to invite the authors to update their proposal to incorporate (C2a)

We are happy to report that the proposal has been updated as per the committee's requests.

  • To write out the changes to the grammar, especially how to deal with specifying aexp2 (see Note 5)

See section 2.3.

  • To address the question of what happens if a field name is an operator (see Note 6)

See section 2.1.3.

Then the committee can do a final review and sign it off.

We should now like to formally request the committee's final review and sign off. We thank once again yourself and the committee for your time and consideration!

Co-Authored-By: Arnaud Spiwack <arnaud@spiwack.net>
@nomeata nomeata merged commit 6c52ff7 into ghc-proposals:master May 3, 2020

@TheMatten TheMatten commented May 8, 2020

What are current thoughts on polymorphic updates? Having classes like:

class GetField (s :: Symbol) a where
  type Field s a
  getField :: a -> Field s a

class SetField (s :: Symbol) x a where
  type Updated s x a
  type Updated s x a = a
  setField :: a -> x -> Updated s x a

class    (GetField s a, SetField s x a) => HasField s x a
instance (GetField s a, SetField s x a) => HasField s x a

It would be easy to implement/derive them; use cases with monomorphic fields would only need to mention the concrete field type in the instance head, and we could have read-only "virtual" fields constructed from values of other fields without having to think about setting in any way.


@mageshb mageshb commented May 9, 2020

Is it possible to use newlines to chain multiple selector applications?

let r = val
           .lengthyFieldName1
           .lengthyFieldName2

Would the above be equivalent to the following under this extension?

let r = val.lengthyFieldName1.lengthyFieldName2


@TheMatten TheMatten commented May 9, 2020

@mageshb Not yet - for now you'll have to do something like

let r = ((val
      ).lengthyFieldName1
      ).lengthyFieldName2

using parens to keep dot always next to expression on left side.

Contributor

@adamgundry adamgundry commented May 9, 2020

@TheMatten on the topic of enhancements to HasField, recent discussion has been on #286 regarding splitting it into two classes and (mostly but not exclusively) on #158 regarding polymorphic updates. The problem here is that there are a lot of variant designs, but a lack of consensus around which to choose. Perhaps a new proposal for adding polymorphic update might help things along.

As I've been implementing the HasField design from #158 I've been wondering if it would make sense to offer both mono-HasField and a generalised HasField' class supporting type-changing update. We could use the same underlying function for generating the dictionary in both cases, with the constraint solver specialising it as needed. Record dot syntax could then default to HasField but allow use of HasField' via a rebindable-syntax-like mechanism, and optics libraries would similarly have a choice of which to use.

I haven't yet been convinced that splitting HasField into two classes is worth doing, because we don't currently have a way to indicate that the automatically-generated instances should be get-only or set-only.


@effectfully effectfully commented May 10, 2020

@TheMatten

What are current thoughts on polymorphic updates?

I wrote a post on this topic some time ago. It describes all the known approaches (if something was left out, please let me know) and compares them. The one you've outlined is basically the worst (after having no polymorphic update at all).

@adamgundry

The problem here is that there are a lot of variant designs

There is the bad type family approach, the good functional dependencies approach and the novel SameModulo approach. I don't think anybody is going to commit to the novel approach, given that it doesn't offer much more than the functional dependencies approach, which is simpler and has been around for ages. So in my view there's only one serious contender: the FunDep approach.

offer both mono-HasField and a generalised HasField' class supporting type-changing update.

Please swap the names then, many people are used to Lens' being monomorphic and Lens being polymorphic.

Record dot syntax could then default to HasField but allow use of HasField' via a rebindable-syntax-like mechanism

Why?

I haven't yet been convinced that splitting HasField into two classes is worth doing, because we don't currently have a way to indicate that the automatically-generated instances should be get-only or set-only.

Having

data Person = Person
  { name :: String
  , age  :: Int
  } deriving Show
data Company = Company { owner :: Maybe Person }
  deriving Show

with monolithic HasLens:

incOwnerAge company = company{owner = fmap (\y -> y{age = succ y.age}) company.owner}

with split HasLens:

incOwnerAge company = company.just.owner { age = succ age }

(not sure if x.y.z { a = ... } syntax is supported though, haven't been paying attention)

because we don't currently have a way to indicate that the automatically-generated instances should be get-only or set-only.

Even just treating sum-types as set-only is already useful and doesn't require any indications. And having a way to indicate something is only a matter of settling on syntax: just about 500 comments on a proposal and you're done.

Contributor

@adamgundry adamgundry commented May 10, 2020

@effectfully

What are current thoughts on polymorphic updates?

I wrote a post on this topic some time ago. It describes all the known approaches (if something was left out, please let me know) and compares them.

Nice, thanks for this! I somehow lost track of it at the time, but this looks very comprehensive and is really helpful as a comparison of the approaches.

The problem here is that there are a lot of variant designs

There is the bad type family approach, the good functional dependencies approach and the novel SameModulo approach. I don't think anybody is going to commit to the novel approach, given that it doesn't offer much more than the functional dependencies approach, which is simpler and has been around for ages. So in my view there's only one serious contender: the FunDep approach.

I haven't thought about it as much recently, but I generally agree. Someone should write a proposal to use the FunDep approach. Any volunteers? Or maybe I'll get to it eventually...

Record dot syntax could then default to HasField but allow use of HasField' via a rebindable-syntax-like mechanism

Why?

Well, it's a complexity trade-off; dropping type-changing update gets you simpler inferred types. It's not completely obvious that having two HasField classes is better than just the more general option, but it should be relatively cheap to implement.

with split HasLens:

incOwnerAge company = company.just.owner { age = succ age }

(not sure if x.y.z { a = ... } syntax is supported though, haven't been paying attention)

I think you'd need general modification syntax for that (something like company { owner.just.age & succ }), which isn't part of the accepted RecordDotSyntax proposal. With the proposal as it stands you could have company { owner.just.age = succ _ } but then can't fill the hole as company.owner.just.age isn't gettable.

Or you could use optics:

incOwnerAge company = company & #owner % _Just % #age %~ succ

Once you get beyond simple (nested) field selection and update, I'd suggest using lenses/optics directly, rather than trying to extend record operators with pseudo-fields like just.

FWIW I consider HasField primarily an implementation detail of the various "syntaxes for record manipulation" provided by optics libraries and RecordDotSyntax, rather than something that end users would be expected to access directly or define APIs with. That's why I favour the s -> (b -> t, a) representation (simple to construct and to convert into a lens, even if it doesn't compose as well). But it also motivates keeping HasField relatively limited in scope: just the automatic constraint solving based on existing record definitions, rather than some more general notion of stringly-named field-like things.
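
The s -> (b -> t, a) representation and the functional-dependency approach mentioned in this discussion can be sketched together as follows. This is an illustration only: the class name FieldOf, its exact functional dependencies, and the Box type are assumptions made for this example, not the definitions from #158 or from the linked post.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeApplications #-}

import GHC.TypeLits (Symbol)

-- Hypothetical class (not GHC's HasField): field x of s has type a,
-- and replacing it with a b yields a t.
class FieldOf (x :: Symbol) s t a b
    | x s -> a, x t -> b, x s b -> t, x t a -> s where
  fieldOf :: s -> (b -> t, a)

-- Hypothetical record with a type-changing field.
data Box a = Box { contents :: a } deriving (Eq, Show)

instance FieldOf "contents" (Box a) (Box b) a b where
  fieldOf (Box v) = (Box, v)

-- Type-changing update: the setter half of the pair.
setContents :: Box Int -> String -> Box String
setContents box = fst (fieldOf @"contents" box)

-- Reading needs the result record type pinned down; here it is unchanged.
readContents :: Box Int -> Int
readContents box = snd (fieldOf @"contents" @(Box Int) @(Box Int) box)
```

The explicit type applications in readContents show one face of the complexity trade-off raised above: with type-changing update available, a plain read no longer determines every type parameter on its own.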
