From 03ef0a1026a5a736adfffc77bbf42cc5af4361e5 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Fri, 11 Oct 2019 02:18:36 -0400 Subject: [PATCH 01/49] Record dot syntax language extension proposal --- proposals/0000-record-dot-syntax.md | 292 ++++++++++++++++++++++++++++ 1 file changed, 292 insertions(+) create mode 100644 proposals/0000-record-dot-syntax.md diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md new file mode 100644 index 0000000000..b1bdc23a70 --- /dev/null +++ b/proposals/0000-record-dot-syntax.md @@ -0,0 +1,292 @@ +--- +author: Neil Mitchell and Shayne Fletcher +date-accepted: "" +proposal-number: "" +ticket-url: "" +implemented: "" +--- + +This proposal is [discussed at this pull request](https://github.com/ghc-proposals/ghc-proposals/pull/282). + +# Record Dot Syntax + +Records in Haskell are [widely recognised](https://www.yesodweb.com/blog/2011/09/limitations-of-haskell) as being under-powered, with duplicate field names being particularly troublesome. We propose a new language extension `RecordDotSyntax` that provides syntactic sugar to make the features introduced in [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst) more accessible, improving the user experience. + +## Motivation + +In almost every programming language we write `a.b` to mean the `b` field of the `a` record expression. In Haskell that becomes `b a`, and even then, only works if there is only one `b` in scope. Haskell programmers have struggled with this weakness, variously putting each record in a separate module and using qualified imports, or prefixing record fields with the type name. We propose bringing `a.b` to Haskell, which works regardless of how many `b` fields are in scope. Here's a simple example of what is on offer: + +```haskell +{-# LANGUAGE RecordDotSyntax #-} + +data Company = Company {name :: String, owner :: Person} +data Person = Person {name :: String, age :: Int} + +display :: Company -> String +display c = c.name ++ " is run by " ++ c.owner.name + +nameAfterOwner :: Company -> Company +nameAfterOwner c = c{name = c.owner.name ++ "'s Company"} +``` + +We declare two records both having `name` as a field label. The user may then write `c.name` and `c.owner.name` to access those fields. We can also write `c{name = x}` as a record update, which works even though `name` is no longer unique. Under the hood, we make use of `getField` and `setField` from [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst). + +An implementation of this proposal has been battle tested and hardened over 18 months in the enterprise environment as part of [Digital Asset](https://digitalasset.com/)'s [DAML](https://daml.com/) smart contract language (a Haskell derivative utilizing GHC in its implementation), and also in a [Haskell preprocessor and a GHC plugin](https://github.com/ndmitchell/record-dot-preprocessor/). When initially considering Haskell as a basis for DAML, the inadequacy of records was considered the most severe problem, and without devising the scheme presented here, we wouldn't be using Haskell. The feature enjoys universal popularity with users. + +## Proposed Change Specification + +For the specification we focus on the changes to the parsing rules, and the desugaring, with the belief the type checking and renamer changes required are an unambiguous consequences of those. To confirm these changes integrate as expected we have written [a prototype implementation](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax) that parses and desugars the forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. + +### `RecordDotSyntax` language extension + +This change adds a new language extension (enabled at source via `{-# LANGUAGE RecordDotSyntax #-}` or on the command line via the flag `-XRecordDotSyntax`). + +When `RecordDotSyntax` is in effect, the use of '.' to denote record field access is disambiguated from function composition by the absence of whitespace trailing the '.'. + +Suppose the following datatype declarations. + +```haskell +data Foo = Foo {foo :: Bar} +data Bar = Bar {bar :: Baz} +data Baz = Baz {baz :: Quux} +data Quux = Quux {quux :: Int} +``` + +The existence of the builtin `HasField` typeclass means that it is possible to write code for getting and setting record fields like this: + +```haskell +getQuux :: Foo -> Int +getQuux a = getField @"quux" (getField @"baz" (getField @"bar" (getField @"foo" a))) + +setQuux :: Foo -> Int -> Foo +setQuux a i = setField@"foo" a (setField@"bar" (getField @"foo" a) (setField@"baz" (getField @"bar" (getField @"foo" a)) (setField@"quux" (getField @"baz" (getField @"bar" (getField @"foo" a))) i))) +``` + +`RecordDotSyntax` enables new concrete syntax so that the following program is equivalent. + +```haskell +getQuux a = a.foo.bar.baz.quux +setQuux a i = a{foo.bar.baz.quux = i} +``` + +In the event the language extension is enabled: + +| Expression | Equivalent | +| -- | -- | +| `e.lbl` | `getField @"lbl" e` the `.` cannot have whitespace after | +| `e{lbl = val}` | `setField @"lbl" e val` | +| `(.lbl)` | `(\x -> x.lbl)` the `.` cannot have whitespace after | +| `e{lbl1.lbl2 = val}` | `e{lbl1 = (e.lbl1){lbl2 = val}}` performing a nested update | +| `e{lbl * val}` | `e{lbl = e.lbl * val}` where `*` can be any operator | +| `e{lbl1.lbl2}` | `e{lbl1.lbl2 = lbl2}` when punning is enabled | + +The above forms combine to provide these identities: + +| Expression | Identity +| -- | -- | +| `e.lbl1.lbl2` | `(e.lbl1).lbl2` | +| `(.lbl1.lbl2)` | `(\x -> x.lbl1.lbl2)` | +| `e.lbl1{lbl2 = val}` | `(e.lbl1){lbl2 = val}` | +| `e{lbl1 = val}.lbl2` | `(e{lbl1 = val}).lbl2` | +| `e{lbl1.lbl2 * val}` | `e{lbl1.lbl2 = e.lbl1.lbl2 * val}` | +| `e{lbl1 = val1, lbl2 = val2}` | `(e{lbl1 = val1}){lbl2 = val2}` | +| `e{lbl1.lbl2, ..}` | `e{lbl2=lbl1.lbl2, ..}` when record wild cards are enabled | + +### Lexer + +A new lexeme *fieldid* is introduced. +
+
*lexeme* → *qvarid* | *qconid* | *qvarsym* | *qconsym* +| *literal* | *special* | *reservedop* | *reservedid* | *fieldid* +
*fieldid* → *.varid* + +This specification results in the following. + +```haskell +-- Regular expressions +@fieldid = (\. @varid) +... +<0,option_prags> { + ... + @fieldid / {ifExtension RecordDotSyntaxBit} { idtoken fieldid } +} +... + +-- Token type +data Token + = ITas + | ... + | ITfieldid FastString + ... + +-- Lexer actions +fieldid :: StringBuffer -> Int -> Token +fieldid buf len = let (_dot, buf') = nextChar buf in ITfieldid $! lexemeToFastString buf' (len - 1) +``` + +Tokens of case `ITfieldid` may not be issued if `RecordDotSyntax` is not enabled. + +### Parser + +#### Field selections + +To support '.' field selection the *fexp* production is extended. +
+
*fexp* → [ *fexp* ] *aexp* | *fexp* *fieldid* + +The specification expresses like this. + +```haskell +%token + ... + FIELDID { L _ (ITfieldid _) } +%% + +... + +fexp :: { ECP } + : fexp aexp { ...} + | fexp FIELDID { ...} -- <- here + | ... +``` + +#### Field updates + +To support the new forms of '.' field update, the *aexp* production is extended. +
+
*aexp* → *aexp⟨qcon⟩* { *pbind* , … , *pbind* } +
*pbind* -> *qvar*=*exp* | *var* *fieldids*=*exp* | *var* *fieldids* *qop* *exp* | *var* [*fieldids*] +
*fieldids* -> *fieldids* *fieldid* + +In this table, the newly added cases are shown next to an example expression they enable: + +| Production | Example | Commentary | +| -- | -- | -- | +|*var* *fieldids*=*exp* | `a{foo.bar=2}` | the *var* is `foo`, `.bar` is a fieldid | +|*var* *fieldids* *qop* *exp* | `a{foo.bar * 12}` | update `a`'s `foo.bar` field to 12 times its initial value | +|*var* [*fieldids*] | `a{foo.bar}` | means `a{foo.bar = bar}` when punning is enabled | + +For example, support for expressions like `a{foo.bar.baz.quux=i}` can be had with one additional case: + +```haskell +aexp1 :: { ECP } + : aexp1 '{' fbinds '}' { ... } + | aexp1 '{' VARID fieldids '=' texp '}' {...} -- <- here + +fieldids :: {[FastString]} +fieldids + : fieldids FIELDID { getFIELDID $2 : $1 } + | FIELDID { [getFIELDID $1] } + +{ +getFIELDID (dL->L _ (ITfieldid x)) = x +} +``` + +An implementation of `RecordDotSyntax` will have to do more than this to incorporate all alternatives. + +#### Sections + +To support '.' sections (e.g. `(.foo.bar.baz)`), we generalize *aexp*. +
+
*aexp* → ( *infixexp* *qop* ) (left section) + | ( *qop* *infixexp* ) (right section) + | ( *fieldids* ) (projection (right) section) + +This specification implies the following additional case to `aexp2`. + +```haskell +aexp2 :: { ECP } + ... + | '(' texp ')' {...} + | '(' fieldids ')' {...} -- <- here +``` + +## Examples + +This is a record type with functions describing a study `Class` (*Oh! Pascal, 2nd ed. Cooper & Clancy, 1985*). + +```haskell +data Grade = A | B | C | D | E | F +data Quarter = Fall | Winter | Spring +data Status = Passed | Failed | Incomplete | Withdrawn + +data Taken = + Taken { year : Int + , term : Quarter + } + +data Class = + Class { hours : Int + , units : Int + , grade : Grade + , result : Status + , taken : Taken + } + +getResult :: Class -> Status +getResult c = c.result -- get + +setResult :: Class -> Status -> Class +setResult c r = c{result = r} -- update + +setYearTaken :: Class -> Int -> Class +setYearTaken c y = c{taken.year = y} -- nested update + +addYears :: Class -> Int -> Class +addYears c n = c{taken.year + n} -- update via op + +squareUnits :: Class -> Class +squareUnits c = c{units & (\x -> x * x)} -- update via function + +getResults :: [Class] -> [Status] +getResults = map (.result) -- section + +getTerms :: [Class] -> [Quarter] +getTerms = map (.taken.term) -- nested section +``` + +A full, rigorous set of examples (as tests) are available in the examples directory of [this repository](https://github.com/ndmitchell/record-dot-preprocessor). Those tests include infix applications, polymorphic data types, interoperation with other extensions and more. They follow the [specifications given earlier](#proposed-change-specification). + +## Effect and Interactions + +**Polymorphic updates:** When enabled, this extension takes the `a{b=c}` syntax and uses it to mean `setField`. The biggest difference a user is likely to experience is that the resulting type of `a{b=c}` is the same as the type `a` - you _cannot_ change the type of the record by updating its fields. The removal of polymorphism is considered essential to preserve decent type inference, and is the only option supported by [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst). + +**Stealing a.b syntax:** The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. + +**Rebindable syntax:** When `RebindableSyntax` is enabled the `getField`, `setField` and `modifyField` functions are those in scope, rather than those in `GHC.Records`. + +**Enabled extensions:** When `RecordDotSyntax` is enabled it should imply the `NoFieldSelectors` extension and allow duplicate record field labels. It would be possible for `RecordDotSyntax` to imply `DuplicateRecordFields`, but we suspect that if people become comfortable with `RecordDotSyntax` then there will be a desire to remove the `DuplicateRecordFields` extension, so we don't want to build on top of it. + +## Costs and Drawbacks + +The implementation of this proposal adds code to the compiler, but not a huge amount. Our [prototype implementation](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax) shows the essence of the parsing changes, which is the most complex part. + +If this proposal becomes widely used then it is likely that all Haskell users would have to learn that `a.b` is a record field selection. Fortunately, given how popular this syntax is elsewhere, that is unlikely to surprise new users. + +This proposal advocates a different style of writing Haskell records, which is distinct from the existing style. As such, it may lead to the bifurcation of Haskell styles, with some people preferring the lens approach, and some people preferring the syntax presented here. That is no doubt unfortunate, but hard to avoid - `a.b` really is ubiquitous in programming languages. We consider that any solution to the records problem _must_ cause some level of divergence, but note that this mechanism (as distinct from some proposals) localises that divergence in the implementation of a module - users of the module will not know whether its internals used this extension or not. + +## Alternatives + +The primary alternatives to the problem of records are: + +* Using the [`lens` library](https://hackage.haskell.org/package/lens). The concept of lenses is very powerful, but that power can be [complex to use](https://twitter.com/fylwind/status/549342595940237312?lang=en). In many ways lenses let you abstract over record fields, but Haskell has neglected the "unabstracted" case of concrete fields. +* The [`DuplicateRecordFields` extension](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#duplicate-record-fields) is designed to solve similar problems. We evaluated this extension as the basis for DAML, but found it lacking. The rules about what types must be inferred by what point are cumbersome and tricky to work with, requiring a clear understanding of at what stage a type is inferred by the compiler. +* Some style guidelines mandate that each record should be in a separate module. That works, but then requires qualified modules to access fields - e.g. `Person.name (Company.owner c)`. Forcing the structure of the module system to follow the records also makes circular dependencies vastly more likely, leading to complications such as boot files that are ideally avoided. +* Some style guidelines suggest prefixing each record field with the type name, e.g. `personName (companyOwner c)`. While it works, it isn't pleasant, and many libraries then abbreviate the types to lead to code such as `prsnName (coOwner c)`, which can increase confusion. +* There is a [GHC plugin and preprocessor](https://github.com/ndmitchell/record-dot-preprocessor) that both implement much of this proposal. While both have seen light use, their ergonomics are not ideal. The preprocessor struggles to give good location information given the necessary expansion of substrings. The plugin cannot support the full proposal and leads to error messages mentioning `getField`. Suggesting either a preprocessor or plugin to beginners is not an adequate answer. One of the huge benefits to the `a.b` style in other languages is support for completion in IDE's, which is quite hard to give for something not actually in the language. +* Continue to [vent](https://www.reddit.com/r/haskell/comments/vdg55/haskells_record_system_is_a_cruel_joke/) [about](https://bitcheese.net/haskell-sucks) [records](https://medium.com/@snoyjerk/least-favorite-thing-about-haskal-ef8f80f30733) [on](https://www.quora.com/What-are-the-worst-parts-about-using-Haskell) [social](http://www.stephendiehl.com/posts/production.html) [media](https://www.drmaciver.com/2008/02/tell-us-why-your-language-sucks/). + +All these approaches are currently used, and represent the "status quo", where Haskell records are considered not fit for purpose. + +## Unresolved Questions + +Below are some possible variations on this plan, but we advocate the choices made above: + +* Should `RecordDotSyntax` imply `NoFieldSelectors`? They are often likely to be used in conjunction, but they aren't inseparable. +* It seems appealing that `a{field += 1}` would be the syntax for incrementing a field. However, `+=` is a valid operator (would that be `a{field +== 1}`?) and for infix operators like `div` would that be \`div\`=? +* We do not extend pattern matching, although it would be possible for `P{foo.bar=Just x}` to be defined. + +## Implementation Plan + +If accepted, the proposal authors would be delighted to provide an implementation. Implementation depends on the implementation of [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst) and [the `NoFieldSelectors` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0160-no-toplevel-field-selectors.rst). From 8a80758f20b1e50c87fef9e3191e88ef388d2ad5 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Fri, 11 Oct 2019 12:59:12 -0400 Subject: [PATCH 02/49] Syntax clarifications and unresolved questions --- proposals/0000-record-dot-syntax.md | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index b1bdc23a70..ea5f275d4b 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -92,6 +92,25 @@ The above forms combine to provide these identities: | `e{lbl1 = val1, lbl2 = val2}` | `(e{lbl1 = val1}){lbl2 = val2}` | | `e{lbl1.lbl2, ..}` | `e{lbl2=lbl1.lbl2, ..}` when record wild cards are enabled | +### Syntax + +#### Whitespace + +- `f.x` means `getField @"lbl" f`; +- `f .x` (note the space) means the same thing; +- `f (.x)` means `f (\r -> r.x)`; +- `f(.x)` does too. + +#### Qualified names vs. field projections + +- `Foo.name` is a qualified variable; +- `Foo .name` is an attempt to project a name field from a constructor (will manifest as an error); +- `Foo(.name)` is the constructor `Foo` applied to the section `(.name)`. + +#### Precedence + +In the prototype, function application takes precedence over field projection so `f a.foo.bar.baz.quux 12` parses as `((f a).foo.bar.baz.quux) 12`. To treat the first argument to `f` as a projection of `a`, write `f (a.foo.bar.baz.quux) 12` and `f (a .foo .bar .baz .quux) 12` is equivalent. + ### Lexer A new lexeme *fieldid* is introduced. @@ -286,6 +305,7 @@ Below are some possible variations on this plan, but we advocate the choices mad * Should `RecordDotSyntax` imply `NoFieldSelectors`? They are often likely to be used in conjunction, but they aren't inseparable. * It seems appealing that `a{field += 1}` would be the syntax for incrementing a field. However, `+=` is a valid operator (would that be `a{field +== 1}`?) and for infix operators like `div` would that be \`div\`=? * We do not extend pattern matching, although it would be possible for `P{foo.bar=Just x}` to be defined. +* Should `f a.foo.bar.baz.quux 12` parse as `((f a).foo.bar.baz.quux) 12` or `f (a.foo.bar.baz.quux) 12`? ## Implementation Plan From 5ae8019504ab802feb528c1c71f94e6d95917377 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Fri, 11 Oct 2019 21:12:57 +0100 Subject: [PATCH 03/49] Update 0000-record-dot-syntax.md --- proposals/0000-record-dot-syntax.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index ea5f275d4b..41f1e07721 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -41,7 +41,7 @@ For the specification we focus on the changes to the parsing rules, and the desu This change adds a new language extension (enabled at source via `{-# LANGUAGE RecordDotSyntax #-}` or on the command line via the flag `-XRecordDotSyntax`). -When `RecordDotSyntax` is in effect, the use of '.' to denote record field access is disambiguated from function composition by the absence of whitespace trailing the '.'. +When `RecordDotSyntax` is in effect, the use of '.' to denote record field access is disambiguated from function composition by the absence of whitespace trailing the '.'. Concretely, `a.b` and `a .b` are record field accesses, while `a . b` is not, where `a` can be any atomic expression (e.g. `foo`, `foo.bar` or `(f x y)`), but not a module name (e.g. not `Foo.bar`). Suppose the following datatype declarations. @@ -269,7 +269,9 @@ A full, rigorous set of examples (as tests) are available in the examples direct ## Effect and Interactions -**Polymorphic updates:** When enabled, this extension takes the `a{b=c}` syntax and uses it to mean `setField`. The biggest difference a user is likely to experience is that the resulting type of `a{b=c}` is the same as the type `a` - you _cannot_ change the type of the record by updating its fields. The removal of polymorphism is considered essential to preserve decent type inference, and is the only option supported by [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst). +**Polymorphic updates:** When enabled, this extension takes the `a{b=c}` syntax and uses it to mean `setField`. The biggest difference a user is likely to experience is that the resulting type of `a{b=c}` is the same as the type `a` - you _cannot_ change the type of the record by updating its fields. The removal of polymorphism is considered essential to preserve decent type inference, and is the only option supported by [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst). Anyone wishing to use polymorphic updates can write `let Foo{..} = Foo{polyField=[], ..}` instead. + +**Higher-rank fields:** It is impossible to express `HasField` instances for data types such as `data T = MkT { foo :: forall a . a -> a}`, which means they can't have this syntax available. Users can still write their own selector functions using record puns if required. **Stealing a.b syntax:** The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. @@ -302,7 +304,7 @@ All these approaches are currently used, and represent the "status quo", where H Below are some possible variations on this plan, but we advocate the choices made above: -* Should `RecordDotSyntax` imply `NoFieldSelectors`? They are often likely to be used in conjunction, but they aren't inseparable. +* Should `RecordDotSyntax` imply `NoFieldSelectors`? Typically `RecordDotSyntax` will be used in conjunction with `NoFieldSelectors`, but `DuplicateRecordFields` would work too. * It seems appealing that `a{field += 1}` would be the syntax for incrementing a field. However, `+=` is a valid operator (would that be `a{field +== 1}`?) and for infix operators like `div` would that be \`div\`=? * We do not extend pattern matching, although it would be possible for `P{foo.bar=Just x}` to be defined. * Should `f a.foo.bar.baz.quux 12` parse as `((f a).foo.bar.baz.quux) 12` or `f (a.foo.bar.baz.quux) 12`? From ce3f0fb113a0084c7feef163e7aecc9b6065b560 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sat, 12 Oct 2019 09:09:47 -0400 Subject: [PATCH 04/49] Link precedence note to unresolved questions section --- proposals/0000-record-dot-syntax.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 41f1e07721..bf4e297693 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -109,7 +109,7 @@ The above forms combine to provide these identities: #### Precedence -In the prototype, function application takes precedence over field projection so `f a.foo.bar.baz.quux 12` parses as `((f a).foo.bar.baz.quux) 12`. To treat the first argument to `f` as a projection of `a`, write `f (a.foo.bar.baz.quux) 12` and `f (a .foo .bar .baz .quux) 12` is equivalent. +In the prototype, function application takes precedence over field projection so `f a.foo.bar.baz.quux 12` parses as `((f a).foo.bar.baz.quux) 12`. To treat the first argument to `f` as a projection of `a`, write `f (a.foo.bar.baz.quux) 12` and `f (a .foo .bar .baz .quux) 12` is equivalent (see the ["Unresolved Questions"](#unresolved-questions) section). ### Lexer From 037c140ebe0924e830ee400c30da888bc9a51caa Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Mon, 14 Oct 2019 17:23:10 +0100 Subject: [PATCH 05/49] Update 0000-record-dot-syntax.md --- proposals/0000-record-dot-syntax.md | 73 ++++++++++------------------- 1 file changed, 24 insertions(+), 49 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index bf4e297693..31969ce09c 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -35,49 +35,19 @@ An implementation of this proposal has been battle tested and hardened over 18 m ## Proposed Change Specification -For the specification we focus on the changes to the parsing rules, and the desugaring, with the belief the type checking and renamer changes required are an unambiguous consequences of those. To confirm these changes integrate as expected we have written [a prototype implementation](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax) that parses and desugars the forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. +For the specification we focus on the changes to the parsing rules, and the desugaring, with the belief the type checking and renamer changes required are an unambiguous consequences of those. ### `RecordDotSyntax` language extension -This change adds a new language extension (enabled at source via `{-# LANGUAGE RecordDotSyntax #-}` or on the command line via the flag `-XRecordDotSyntax`). - -When `RecordDotSyntax` is in effect, the use of '.' to denote record field access is disambiguated from function composition by the absence of whitespace trailing the '.'. Concretely, `a.b` and `a .b` are record field accesses, while `a . b` is not, where `a` can be any atomic expression (e.g. `foo`, `foo.bar` or `(f x y)`), but not a module name (e.g. not `Foo.bar`). - -Suppose the following datatype declarations. - -```haskell -data Foo = Foo {foo :: Bar} -data Bar = Bar {bar :: Baz} -data Baz = Baz {baz :: Quux} -data Quux = Quux {quux :: Int} -``` - -The existence of the builtin `HasField` typeclass means that it is possible to write code for getting and setting record fields like this: - -```haskell -getQuux :: Foo -> Int -getQuux a = getField @"quux" (getField @"baz" (getField @"bar" (getField @"foo" a))) - -setQuux :: Foo -> Int -> Foo -setQuux a i = setField@"foo" a (setField@"bar" (getField @"foo" a) (setField@"baz" (getField @"bar" (getField @"foo" a)) (setField@"quux" (getField @"baz" (getField @"bar" (getField @"foo" a))) i))) -``` - -`RecordDotSyntax` enables new concrete syntax so that the following program is equivalent. - -```haskell -getQuux a = a.foo.bar.baz.quux -setQuux a i = a{foo.bar.baz.quux = i} -``` - -In the event the language extension is enabled: +This change adds a new language extension `RecordDotSyntax`. In the event the language extension is enabled: | Expression | Equivalent | | -- | -- | -| `e.lbl` | `getField @"lbl" e` the `.` cannot have whitespace after | +| `e.lbl` | `getField @"lbl" e` the `.` cannot have whitespace either before or after | | `e{lbl = val}` | `setField @"lbl" e val` | | `(.lbl)` | `(\x -> x.lbl)` the `.` cannot have whitespace after | | `e{lbl1.lbl2 = val}` | `e{lbl1 = (e.lbl1){lbl2 = val}}` performing a nested update | -| `e{lbl * val}` | `e{lbl = e.lbl * val}` where `*` can be any operator | +| `e{lbl * val}` | `e{lbl = e.lbl * val}` where `*` can be any operator (can be optimised to `modifyField`) | | `e{lbl1.lbl2}` | `e{lbl1.lbl2 = lbl2}` when punning is enabled | The above forms combine to provide these identities: @@ -94,22 +64,19 @@ The above forms combine to provide these identities: ### Syntax -#### Whitespace +#### Record selection -- `f.x` means `getField @"lbl" f`; -- `f .x` (note the space) means the same thing; -- `f (.x)` means `f (\r -> r.x)`; -- `f(.x)` does too. +The expression: -#### Qualified names vs. field projections +> e.lbl -- `Foo.name` is a qualified variable; -- `Foo .name` is an attempt to project a name field from a constructor (will manifest as an error); -- `Foo(.name)` is the constructor `Foo` applied to the section `(.name)`. +means `getField @"lbl" f`, provided: -#### Precedence +- There is no whitespace either side of `.` +- That `lbl` is a valid variable name +- That `e` is an expression, but not a *conid* -In the prototype, function application takes precedence over field projection so `f a.foo.bar.baz.quux 12` parses as `((f a).foo.bar.baz.quux) 12`. To treat the first argument to `f` as a projection of `a`, write `f (a.foo.bar.baz.quux) 12` and `f (a .foo .bar .baz .quux) 12` is equivalent (see the ["Unresolved Questions"](#unresolved-questions) section). +Similarly, `e{lbl=val}` only applies if `e` is an expression, but not a *conid*. ### Lexer @@ -221,6 +188,12 @@ aexp2 :: { ECP } | '(' fieldids ')' {...} -- <- here ``` +### Prototype + +To confirm these changes integrate as expected we have written [a prototype implementation](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax) that parses and desugars the forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. + +In the prototype, function application incorrectly takes precedence over field projection so `f a.foo.bar.baz.quux 12` parses as `((f a).foo.bar.baz.quux) 12` - we consider that a bug. + ## Examples This is a record type with functions describing a study `Class` (*Oh! Pascal, 2nd ed. Cooper & Clancy, 1985*). @@ -271,7 +244,7 @@ A full, rigorous set of examples (as tests) are available in the examples direct **Polymorphic updates:** When enabled, this extension takes the `a{b=c}` syntax and uses it to mean `setField`. The biggest difference a user is likely to experience is that the resulting type of `a{b=c}` is the same as the type `a` - you _cannot_ change the type of the record by updating its fields. The removal of polymorphism is considered essential to preserve decent type inference, and is the only option supported by [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst). Anyone wishing to use polymorphic updates can write `let Foo{..} = Foo{polyField=[], ..}` instead. -**Higher-rank fields:** It is impossible to express `HasField` instances for data types such as `data T = MkT { foo :: forall a . a -> a}`, which means they can't have this syntax available. Users can still write their own selector functions using record puns if required. +**Higher-rank fields:** It is impossible to express `HasField` instances for data types such as `data T = MkT { foo :: forall a . a -> a}`, which means they can't have this syntax available. Users can still write their own selector functions using record puns if required. There is a possibility that with future types of impredicativity such `getField` expressions could be solved specially by the compiler. **Stealing a.b syntax:** The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. @@ -304,10 +277,12 @@ All these approaches are currently used, and represent the "status quo", where H Below are some possible variations on this plan, but we advocate the choices made above: -* Should `RecordDotSyntax` imply `NoFieldSelectors`? Typically `RecordDotSyntax` will be used in conjunction with `NoFieldSelectors`, but `DuplicateRecordFields` would work too. -* It seems appealing that `a{field += 1}` would be the syntax for incrementing a field. However, `+=` is a valid operator (would that be `a{field +== 1}`?) and for infix operators like `div` would that be \`div\`=? +* Should `RecordDotSyntax` imply `NoFieldSelectors`? Typically `RecordDotSyntax` will be used in conjunction with `NoFieldSelectors`, but `DuplicateRecordFields` would work too. Of those two, `DuplicateRecordFields` complicates GHC, while `NoFieldSelectors` conceptually simplifies it, so we prefer to bias the eventual outcome. +* It seems appealing that `a{field += 1}` would be the syntax for incrementing a field. However, `+=` is a valid operator (would that be `a{field +== 1}`?) and for infix operators like `div` would that be \`div\`=? One possibility is to use the syntax `a{field + = 1}`. +* There are no update sections. Should `({a=})`, `({a=b})` or `(.lbl=)` be an update section? * We do not extend pattern matching, although it would be possible for `P{foo.bar=Just x}` to be defined. -* Should `f a.foo.bar.baz.quux 12` parse as `((f a).foo.bar.baz.quux) 12` or `f (a.foo.bar.baz.quux) 12`? +* Will whitespace sensitivity become worse? We're not aware of qualified modules giving any problems, but it's adding whitespace sensitivity in one more place. +* One suggestion is that record updates remain as normal, but `a { .foo = 1 }` be used to indicate the new forms of updates. While possible, we believe that option leads to a confusing result, with two forms of update both of which fail in different corner cases. Instead, we recommend use of `C{foo}` as a pattern to extract fields if necessary. ## Implementation Plan From 5565fa11d4655e0e0282d9cc256f59fc65aab404 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Mon, 14 Oct 2019 16:29:01 -0400 Subject: [PATCH 06/49] Note projection takes precedence over application --- proposals/0000-record-dot-syntax.md | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 31969ce09c..a96688f2d6 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -72,9 +72,10 @@ The expression: means `getField @"lbl" f`, provided: -- There is no whitespace either side of `.` -- That `lbl` is a valid variable name -- That `e` is an expression, but not a *conid* +- There is no whitespace either side of `.`; +- That `lbl` is a valid variable name; +- That `e` is an expression, but not a *conid*; +- Precedence : `f a.foo.bar.baz.quux 12` parses as `f (a.foo.bar.baz.quux) 12`. Similarly, `e{lbl=val}` only applies if `e` is an expression, but not a *conid*. @@ -190,9 +191,7 @@ aexp2 :: { ECP } ### Prototype -To confirm these changes integrate as expected we have written [a prototype implementation](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax) that parses and desugars the forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. - -In the prototype, function application incorrectly takes precedence over field projection so `f a.foo.bar.baz.quux 12` parses as `((f a).foo.bar.baz.quux) 12` - we consider that a bug. +To confirm these changes integrate as expected we have written [a prototype implementation](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax) that parses and desugars the forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. Note that in the prototype, projection ([as proposed here](#syntax)), takes precedence over application so `f a.foo.bar.baz.quux 12` parses as `f (a.foo.bar.baz.quux) 12`. ## Examples From a78a0c2294697c79efa798167defe957437bacf1 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Thu, 17 Oct 2019 06:53:28 -0400 Subject: [PATCH 07/49] Typo --- proposals/0000-record-dot-syntax.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index a96688f2d6..ff2d28feb0 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -70,7 +70,7 @@ The expression: > e.lbl -means `getField @"lbl" f`, provided: +means `getField @"lbl" e`, provided: - There is no whitespace either side of `.`; - That `lbl` is a valid variable name; From d0b0952971d5379d064ee2e9350f711ae42d65d0 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Thu, 17 Oct 2019 06:55:05 -0400 Subject: [PATCH 08/49] Fix type-of --- proposals/0000-record-dot-syntax.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index ff2d28feb0..085d925b3f 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -203,16 +203,16 @@ data Quarter = Fall | Winter | Spring data Status = Passed | Failed | Incomplete | Withdrawn data Taken = - Taken { year : Int - , term : Quarter + Taken { year :: Int + , term :: Quarter } data Class = - Class { hours : Int - , units : Int - , grade : Grade - , result : Status - , taken : Taken + Class { hours :: Int + , units :: Int + , grade :: Grade + , result :: Status + , taken :: Taken } getResult :: Class -> Status From 3efa3578f46d596dab7c18390569455576e23239 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Fri, 18 Oct 2019 19:06:11 +0100 Subject: [PATCH 09/49] Remove punning for record fields --- proposals/0000-record-dot-syntax.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 085d925b3f..cde7d2e2eb 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -48,7 +48,6 @@ This change adds a new language extension `RecordDotSyntax`. In the event the la | `(.lbl)` | `(\x -> x.lbl)` the `.` cannot have whitespace after | | `e{lbl1.lbl2 = val}` | `e{lbl1 = (e.lbl1){lbl2 = val}}` performing a nested update | | `e{lbl * val}` | `e{lbl = e.lbl * val}` where `*` can be any operator (can be optimised to `modifyField`) | -| `e{lbl1.lbl2}` | `e{lbl1.lbl2 = lbl2}` when punning is enabled | The above forms combine to provide these identities: @@ -60,7 +59,6 @@ The above forms combine to provide these identities: | `e{lbl1 = val}.lbl2` | `(e{lbl1 = val}).lbl2` | | `e{lbl1.lbl2 * val}` | `e{lbl1.lbl2 = e.lbl1.lbl2 * val}` | | `e{lbl1 = val1, lbl2 = val2}` | `(e{lbl1 = val1}){lbl2 = val2}` | -| `e{lbl1.lbl2, ..}` | `e{lbl2=lbl1.lbl2, ..}` when record wild cards are enabled | ### Syntax @@ -269,6 +267,7 @@ The primary alternatives to the problem of records are: * Some style guidelines suggest prefixing each record field with the type name, e.g. `personName (companyOwner c)`. While it works, it isn't pleasant, and many libraries then abbreviate the types to lead to code such as `prsnName (coOwner c)`, which can increase confusion. * There is a [GHC plugin and preprocessor](https://github.com/ndmitchell/record-dot-preprocessor) that both implement much of this proposal. While both have seen light use, their ergonomics are not ideal. The preprocessor struggles to give good location information given the necessary expansion of substrings. The plugin cannot support the full proposal and leads to error messages mentioning `getField`. Suggesting either a preprocessor or plugin to beginners is not an adequate answer. One of the huge benefits to the `a.b` style in other languages is support for completion in IDE's, which is quite hard to give for something not actually in the language. * Continue to [vent](https://www.reddit.com/r/haskell/comments/vdg55/haskells_record_system_is_a_cruel_joke/) [about](https://bitcheese.net/haskell-sucks) [records](https://medium.com/@snoyjerk/least-favorite-thing-about-haskal-ef8f80f30733) [on](https://www.quora.com/What-are-the-worst-parts-about-using-Haskell) [social](http://www.stephendiehl.com/posts/production.html) [media](https://www.drmaciver.com/2008/02/tell-us-why-your-language-sucks/). +* Previous versions of this proposal proposed a punning syntax for records, that was widely viewed as being confusing. It has been removed. All these approaches are currently used, and represent the "status quo", where Haskell records are considered not fit for purpose. From 59eca76fcdbfb37dbfd947e26ac68883a367b196 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Sat, 19 Oct 2019 19:26:47 +0100 Subject: [PATCH 10/49] Update 0000-record-dot-syntax.md --- proposals/0000-record-dot-syntax.md | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index cde7d2e2eb..b1b9bf235e 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -257,6 +257,18 @@ If this proposal becomes widely used then it is likely that all Haskell users wo This proposal advocates a different style of writing Haskell records, which is distinct from the existing style. As such, it may lead to the bifurcation of Haskell styles, with some people preferring the lens approach, and some people preferring the syntax presented here. That is no doubt unfortunate, but hard to avoid - `a.b` really is ubiquitous in programming languages. We consider that any solution to the records problem _must_ cause some level of divergence, but note that this mechanism (as distinct from some proposals) localises that divergence in the implementation of a module - users of the module will not know whether its internals used this extension or not. +The use of `a.b` with no spaces on either side can make it harder to write expressions that span multiple lines. To split over two lines it is possible to do either of: + +``` +(myexpression.field1.field2.field3 + ).field4.field5 + +let temp = myexpression.field1.field2.field3 +in temp.field4.field5 +``` + +We prefer the former, but both are permissible. + ## Alternatives The primary alternatives to the problem of records are: From c2c4edee9e5fd6dc966de42c9e42d33235b2ae42 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sat, 19 Oct 2019 18:03:21 -0400 Subject: [PATCH 11/49] More remove punning for record fields --- proposals/0000-record-dot-syntax.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index b1b9bf235e..3533b22892 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -140,7 +140,7 @@ fexp :: { ECP } To support the new forms of '.' field update, the *aexp* production is extended.

*aexp* → *aexp⟨qcon⟩* { *pbind* , … , *pbind* } -
*pbind* -> *qvar*=*exp* | *var* *fieldids*=*exp* | *var* *fieldids* *qop* *exp* | *var* [*fieldids*] +
*pbind* -> *qvar*=*exp* | *var* *fieldids*=*exp* | *var* *fieldids* *qop* *exp*
*fieldids* -> *fieldids* *fieldid* In this table, the newly added cases are shown next to an example expression they enable: @@ -149,7 +149,6 @@ In this table, the newly added cases are shown next to an example expression the | -- | -- | -- | |*var* *fieldids*=*exp* | `a{foo.bar=2}` | the *var* is `foo`, `.bar` is a fieldid | |*var* *fieldids* *qop* *exp* | `a{foo.bar * 12}` | update `a`'s `foo.bar` field to 12 times its initial value | -|*var* [*fieldids*] | `a{foo.bar}` | means `a{foo.bar = bar}` when punning is enabled | For example, support for expressions like `a{foo.bar.baz.quux=i}` can be had with one additional case: From 8974c3144e6adb52d47781c7c415b2e01cb1a49e Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sat, 19 Oct 2019 18:06:29 -0400 Subject: [PATCH 12/49] Whitespace fix --- proposals/0000-record-dot-syntax.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 3533b22892..a4791628bf 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -147,7 +147,7 @@ In this table, the newly added cases are shown next to an example expression the | Production | Example | Commentary | | -- | -- | -- | -|*var* *fieldids*=*exp* | `a{foo.bar=2}` | the *var* is `foo`, `.bar` is a fieldid | +|*var* *fieldids*=*exp* | `a{foo.bar = 2}` | the *var* is `foo`, `.bar` is a fieldid | |*var* *fieldids* *qop* *exp* | `a{foo.bar * 12}` | update `a`'s `foo.bar` field to 12 times its initial value | For example, support for expressions like `a{foo.bar.baz.quux=i}` can be had with one additional case: From 9e3aa06e9527705da4c41e4f88bd453f49ce2150 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Thu, 24 Oct 2019 15:08:25 +0100 Subject: [PATCH 13/49] Update 0000-record-dot-syntax.md --- proposals/0000-record-dot-syntax.md | 20 +++++++++----------- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index a4791628bf..fffb88d431 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -47,7 +47,6 @@ This change adds a new language extension `RecordDotSyntax`. In the event the la | `e{lbl = val}` | `setField @"lbl" e val` | | `(.lbl)` | `(\x -> x.lbl)` the `.` cannot have whitespace after | | `e{lbl1.lbl2 = val}` | `e{lbl1 = (e.lbl1){lbl2 = val}}` performing a nested update | -| `e{lbl * val}` | `e{lbl = e.lbl * val}` where `*` can be any operator (can be optimised to `modifyField`) | The above forms combine to provide these identities: @@ -57,7 +56,6 @@ The above forms combine to provide these identities: | `(.lbl1.lbl2)` | `(\x -> x.lbl1.lbl2)` | | `e.lbl1{lbl2 = val}` | `(e.lbl1){lbl2 = val}` | | `e{lbl1 = val}.lbl2` | `(e{lbl1 = val}).lbl2` | -| `e{lbl1.lbl2 * val}` | `e{lbl1.lbl2 = e.lbl1.lbl2 * val}` | | `e{lbl1 = val1, lbl2 = val2}` | `(e{lbl1 = val1}){lbl2 = val2}` | ### Syntax @@ -140,7 +138,7 @@ fexp :: { ECP } To support the new forms of '.' field update, the *aexp* production is extended.

*aexp* → *aexp⟨qcon⟩* { *pbind* , … , *pbind* } -
*pbind* -> *qvar*=*exp* | *var* *fieldids*=*exp* | *var* *fieldids* *qop* *exp* +
*pbind* -> *qvar*=*exp* | *var* *fieldids*=*exp*
*fieldids* -> *fieldids* *fieldid* In this table, the newly added cases are shown next to an example expression they enable: @@ -148,7 +146,6 @@ In this table, the newly added cases are shown next to an example expression the | Production | Example | Commentary | | -- | -- | -- | |*var* *fieldids*=*exp* | `a{foo.bar = 2}` | the *var* is `foo`, `.bar` is a fieldid | -|*var* *fieldids* *qop* *exp* | `a{foo.bar * 12}` | update `a`'s `foo.bar` field to 12 times its initial value | For example, support for expressions like `a{foo.bar.baz.quux=i}` can be had with one additional case: @@ -222,10 +219,10 @@ setYearTaken :: Class -> Int -> Class setYearTaken c y = c{taken.year = y} -- nested update addYears :: Class -> Int -> Class -addYears c n = c{taken.year + n} -- update via op +addYears c n = c{taken.year = c.taken.year + n} -- update via op squareUnits :: Class -> Class -squareUnits c = c{units & (\x -> x * x)} -- update via function +squareUnits c = c{units = (\x -> x * x) c.units} -- update via function getResults :: [Class] -> [Status] getResults = map (.result) -- section @@ -244,7 +241,7 @@ A full, rigorous set of examples (as tests) are available in the examples direct **Stealing a.b syntax:** The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. -**Rebindable syntax:** When `RebindableSyntax` is enabled the `getField`, `setField` and `modifyField` functions are those in scope, rather than those in `GHC.Records`. +**Rebindable syntax:** When `RebindableSyntax` is enabled the `getField` and `setField` functions are those in scope, rather than those in `GHC.Records`. **Enabled extensions:** When `RecordDotSyntax` is enabled it should imply the `NoFieldSelectors` extension and allow duplicate record field labels. It would be possible for `RecordDotSyntax` to imply `DuplicateRecordFields`, but we suspect that if people become comfortable with `RecordDotSyntax` then there will be a desire to remove the `DuplicateRecordFields` extension, so we don't want to build on top of it. @@ -268,7 +265,7 @@ in temp.field4.field5 We prefer the former, but both are permissible. -## Alternatives +## Alternatives to this proposal The primary alternatives to the problem of records are: @@ -282,16 +279,17 @@ The primary alternatives to the problem of records are: All these approaches are currently used, and represent the "status quo", where Haskell records are considered not fit for purpose. -## Unresolved Questions +## Alternatives within this proposal Below are some possible variations on this plan, but we advocate the choices made above: * Should `RecordDotSyntax` imply `NoFieldSelectors`? Typically `RecordDotSyntax` will be used in conjunction with `NoFieldSelectors`, but `DuplicateRecordFields` would work too. Of those two, `DuplicateRecordFields` complicates GHC, while `NoFieldSelectors` conceptually simplifies it, so we prefer to bias the eventual outcome. -* It seems appealing that `a{field += 1}` would be the syntax for incrementing a field. However, `+=` is a valid operator (would that be `a{field +== 1}`?) and for infix operators like `div` would that be \`div\`=? One possibility is to use the syntax `a{field + = 1}`. -* There are no update sections. Should `({a=})`, `({a=b})` or `(.lbl=)` be an update section? +* Earlier versions of this proposal contained a modify field sytnax of the form `a{field * 2}`. While appealing, there is a lot of syntactic debate, with variously `a{field <- (*2)}`, `a{field * = 2}` and others being proposed. None of these syntax variations are immediately clear to someone not familiar with this proposal. To be conservative, we leave this feature out. +* There are no update sections. Should `({a=})`, `({a=b})` or `(.lbl=)` be an update section? While nice, we leave this feature out. * We do not extend pattern matching, although it would be possible for `P{foo.bar=Just x}` to be defined. * Will whitespace sensitivity become worse? We're not aware of qualified modules giving any problems, but it's adding whitespace sensitivity in one more place. * One suggestion is that record updates remain as normal, but `a { .foo = 1 }` be used to indicate the new forms of updates. While possible, we believe that option leads to a confusing result, with two forms of update both of which fail in different corner cases. Instead, we recommend use of `C{foo}` as a pattern to extract fields if necessary. +* For selector functions we have opted for `(.foo)`, but `.foo` and `_.foo` have both been proposed. We consider `_.foo` to not be very Haskelly, as it is similar to very different uses of underscore. For `.foo` we err on the side of caution, noting that it is possible to permit `.foo` at a later date, but harder to require backets in future. ## Implementation Plan From 3800f2a3360ae7cf3a8baeff02fa3bf2f8d3fef1 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Thu, 24 Oct 2019 15:39:37 +0100 Subject: [PATCH 14/49] Update 0000-record-dot-syntax.md --- proposals/0000-record-dot-syntax.md | 1 + 1 file changed, 1 insertion(+) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index fffb88d431..2a561b017a 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -290,6 +290,7 @@ Below are some possible variations on this plan, but we advocate the choices mad * Will whitespace sensitivity become worse? We're not aware of qualified modules giving any problems, but it's adding whitespace sensitivity in one more place. * One suggestion is that record updates remain as normal, but `a { .foo = 1 }` be used to indicate the new forms of updates. While possible, we believe that option leads to a confusing result, with two forms of update both of which fail in different corner cases. Instead, we recommend use of `C{foo}` as a pattern to extract fields if necessary. * For selector functions we have opted for `(.foo)`, but `.foo` and `_.foo` have both been proposed. We consider `_.foo` to not be very Haskelly, as it is similar to very different uses of underscore. For `.foo` we err on the side of caution, noting that it is possible to permit `.foo` at a later date, but harder to require backets in future. +* Originally this proposal included `a{foo.bar}` to mean `a{foo.bar = bar}`, but that seemed to confuse everyone, so has been removed. ## Implementation Plan From c8f6eee1540cba7c68331db73128b83bd510ccb6 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Fri, 25 Oct 2019 10:53:42 +0100 Subject: [PATCH 15/49] Update 0000-record-dot-syntax.md --- proposals/0000-record-dot-syntax.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 2a561b017a..598ff5f461 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -267,7 +267,7 @@ We prefer the former, but both are permissible. ## Alternatives to this proposal -The primary alternatives to the problem of records are: +Instead of this proposal, we could do any of the following: * Using the [`lens` library](https://hackage.haskell.org/package/lens). The concept of lenses is very powerful, but that power can be [complex to use](https://twitter.com/fylwind/status/549342595940237312?lang=en). In many ways lenses let you abstract over record fields, but Haskell has neglected the "unabstracted" case of concrete fields. * The [`DuplicateRecordFields` extension](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#duplicate-record-fields) is designed to solve similar problems. We evaluated this extension as the basis for DAML, but found it lacking. The rules about what types must be inferred by what point are cumbersome and tricky to work with, requiring a clear understanding of at what stage a type is inferred by the compiler. @@ -275,7 +275,6 @@ The primary alternatives to the problem of records are: * Some style guidelines suggest prefixing each record field with the type name, e.g. `personName (companyOwner c)`. While it works, it isn't pleasant, and many libraries then abbreviate the types to lead to code such as `prsnName (coOwner c)`, which can increase confusion. * There is a [GHC plugin and preprocessor](https://github.com/ndmitchell/record-dot-preprocessor) that both implement much of this proposal. While both have seen light use, their ergonomics are not ideal. The preprocessor struggles to give good location information given the necessary expansion of substrings. The plugin cannot support the full proposal and leads to error messages mentioning `getField`. Suggesting either a preprocessor or plugin to beginners is not an adequate answer. One of the huge benefits to the `a.b` style in other languages is support for completion in IDE's, which is quite hard to give for something not actually in the language. * Continue to [vent](https://www.reddit.com/r/haskell/comments/vdg55/haskells_record_system_is_a_cruel_joke/) [about](https://bitcheese.net/haskell-sucks) [records](https://medium.com/@snoyjerk/least-favorite-thing-about-haskal-ef8f80f30733) [on](https://www.quora.com/What-are-the-worst-parts-about-using-Haskell) [social](http://www.stephendiehl.com/posts/production.html) [media](https://www.drmaciver.com/2008/02/tell-us-why-your-language-sucks/). -* Previous versions of this proposal proposed a punning syntax for records, that was widely viewed as being confusing. It has been removed. All these approaches are currently used, and represent the "status quo", where Haskell records are considered not fit for purpose. From 20c36a6765b719f0fe80581c1a30711250e7a865 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Fri, 25 Oct 2019 14:49:13 +0100 Subject: [PATCH 16/49] Switch to .a for field selectors --- proposals/0000-record-dot-syntax.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 598ff5f461..88b0d77e00 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -45,7 +45,7 @@ This change adds a new language extension `RecordDotSyntax`. In the event the la | -- | -- | | `e.lbl` | `getField @"lbl" e` the `.` cannot have whitespace either before or after | | `e{lbl = val}` | `setField @"lbl" e val` | -| `(.lbl)` | `(\x -> x.lbl)` the `.` cannot have whitespace after | +| `.lbl` | `(\x -> x.lbl)` the `.` cannot have whitespace after | | `e{lbl1.lbl2 = val}` | `e{lbl1 = (e.lbl1){lbl2 = val}}` performing a nested update | The above forms combine to provide these identities: @@ -53,7 +53,7 @@ The above forms combine to provide these identities: | Expression | Identity | -- | -- | | `e.lbl1.lbl2` | `(e.lbl1).lbl2` | -| `(.lbl1.lbl2)` | `(\x -> x.lbl1.lbl2)` | +| `.lbl1.lbl2` | `(\x -> x.lbl1.lbl2)` | | `e.lbl1{lbl2 = val}` | `(e.lbl1){lbl2 = val}` | | `e{lbl1 = val}.lbl2` | `(e{lbl1 = val}).lbl2` | | `e{lbl1 = val1, lbl2 = val2}` | `(e{lbl1 = val1}){lbl2 = val2}` | @@ -225,10 +225,10 @@ squareUnits :: Class -> Class squareUnits c = c{units = (\x -> x * x) c.units} -- update via function getResults :: [Class] -> [Status] -getResults = map (.result) -- section +getResults = map .result -- selector getTerms :: [Class] -> [Quarter] -getTerms = map (.taken.term) -- nested section +getTerms = map .taken.term -- nested selector ``` A full, rigorous set of examples (as tests) are available in the examples directory of [this repository](https://github.com/ndmitchell/record-dot-preprocessor). Those tests include infix applications, polymorphic data types, interoperation with other extensions and more. They follow the [specifications given earlier](#proposed-change-specification). @@ -288,7 +288,7 @@ Below are some possible variations on this plan, but we advocate the choices mad * We do not extend pattern matching, although it would be possible for `P{foo.bar=Just x}` to be defined. * Will whitespace sensitivity become worse? We're not aware of qualified modules giving any problems, but it's adding whitespace sensitivity in one more place. * One suggestion is that record updates remain as normal, but `a { .foo = 1 }` be used to indicate the new forms of updates. While possible, we believe that option leads to a confusing result, with two forms of update both of which fail in different corner cases. Instead, we recommend use of `C{foo}` as a pattern to extract fields if necessary. -* For selector functions we have opted for `(.foo)`, but `.foo` and `_.foo` have both been proposed. We consider `_.foo` to not be very Haskelly, as it is similar to very different uses of underscore. For `.foo` we err on the side of caution, noting that it is possible to permit `.foo` at a later date, but harder to require backets in future. +* For selector functions we have opted for `.foo`, but `(.foo)` and `_.foo` have both been proposed. We consider `_.foo` to not be very Haskelly, as it is similar to very different uses of underscore. The `.foo` seems to be universally preferred. * Originally this proposal included `a{foo.bar}` to mean `a{foo.bar = bar}`, but that seemed to confuse everyone, so has been removed. ## Implementation Plan From 7f6cc558485d63947c89343a2e32a58dfdb0b4ab Mon Sep 17 00:00:00 2001 From: Paul Kapustin Date: Sun, 27 Oct 2019 16:13:56 +0100 Subject: [PATCH 17/49] Added more information related to lenses --- proposals/0000-record-dot-syntax.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 88b0d77e00..84c46f5738 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -239,7 +239,7 @@ A full, rigorous set of examples (as tests) are available in the examples direct **Higher-rank fields:** It is impossible to express `HasField` instances for data types such as `data T = MkT { foo :: forall a . a -> a}`, which means they can't have this syntax available. Users can still write their own selector functions using record puns if required. There is a possibility that with future types of impredicativity such `getField` expressions could be solved specially by the compiler. -**Stealing a.b syntax:** The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. +**Lenses and a.b syntax:** The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. Moreover, it has been discussed (e.g. [here](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-546159561)), that this proposal is actually very complimentary to lens and can be used very well together with lens (one of the most appealing features enabled by NoFieldSelectors is the ability to use the same field names for everything: dot notation, lens-y getting, lens-y modification, record updates, `Show/Generic`). **Rebindable syntax:** When `RebindableSyntax` is enabled the `getField` and `setField` functions are those in scope, rather than those in `GHC.Records`. @@ -269,7 +269,8 @@ We prefer the former, but both are permissible. Instead of this proposal, we could do any of the following: -* Using the [`lens` library](https://hackage.haskell.org/package/lens). The concept of lenses is very powerful, but that power can be [complex to use](https://twitter.com/fylwind/status/549342595940237312?lang=en). In many ways lenses let you abstract over record fields, but Haskell has neglected the "unabstracted" case of concrete fields. +* Using the [`lens` library](https://hackage.haskell.org/package/lens). While lenses help both with accessors and overloaded names (e.g. `makeFields`), one still needs to use one of the techniques mentioned below (or similar) to work around the problem of duplicate name selectors. In addition, lens-based syntax is more verbose, e.g. `f $ record ^. field` instead of possible `f record.field`. +More importantly, while the concept of lenses is very powerful, that power can be [complex to use](https://twitter.com/fylwind/status/549342595940237312?lang=en), and for many projects that complexity is undesirable. In many ways lenses let you abstract over record fields, but Haskell has neglected the "unabstracted" case of concrete fields. When this is said, we should mention again what has been discussed (e.g. [here](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-546159561)): this proposal is orthogonal but very complimentary to lens and can be used very well together with lens (one of the most appealing features enabled by NoFieldSelectors is the ability to use the same field names for everything: dot notation, lens-y getting, lens-y modification, record updates, `Show/Generic`). * The [`DuplicateRecordFields` extension](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#duplicate-record-fields) is designed to solve similar problems. We evaluated this extension as the basis for DAML, but found it lacking. The rules about what types must be inferred by what point are cumbersome and tricky to work with, requiring a clear understanding of at what stage a type is inferred by the compiler. * Some style guidelines mandate that each record should be in a separate module. That works, but then requires qualified modules to access fields - e.g. `Person.name (Company.owner c)`. Forcing the structure of the module system to follow the records also makes circular dependencies vastly more likely, leading to complications such as boot files that are ideally avoided. * Some style guidelines suggest prefixing each record field with the type name, e.g. `personName (companyOwner c)`. While it works, it isn't pleasant, and many libraries then abbreviate the types to lead to code such as `prsnName (coOwner c)`, which can increase confusion. From 07de5441fa7bc4f8f8e7102d6892bda89939c2a6 Mon Sep 17 00:00:00 2001 From: Paul Kapustin Date: Fri, 1 Nov 2019 23:36:22 +0100 Subject: [PATCH 18/49] Cleaned up the text a bit --- proposals/0000-record-dot-syntax.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 84c46f5738..dad569b40d 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -239,7 +239,8 @@ A full, rigorous set of examples (as tests) are available in the examples direct **Higher-rank fields:** It is impossible to express `HasField` instances for data types such as `data T = MkT { foo :: forall a . a -> a}`, which means they can't have this syntax available. Users can still write their own selector functions using record puns if required. There is a possibility that with future types of impredicativity such `getField` expressions could be solved specially by the compiler. -**Lenses and a.b syntax:** The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. Moreover, it has been discussed (e.g. [here](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-546159561)), that this proposal is actually very complimentary to lens and can be used very well together with lens (one of the most appealing features enabled by NoFieldSelectors is the ability to use the same field names for everything: dot notation, lens-y getting, lens-y modification, record updates, `Show/Generic`). +**Lenses and a.b syntax:** + The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. In addition, it has been discussed (e.g. [here](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-546159561)), that this proposal is complimentary to lens and can actually benefit lens users (as with NoFieldSelectors one can use the same field names for everything: dot notation, lens-y getting, lens-y modification, record updates, `Show/Generic`). **Rebindable syntax:** When `RebindableSyntax` is enabled the `getField` and `setField` functions are those in scope, rather than those in `GHC.Records`. @@ -270,7 +271,7 @@ We prefer the former, but both are permissible. Instead of this proposal, we could do any of the following: * Using the [`lens` library](https://hackage.haskell.org/package/lens). While lenses help both with accessors and overloaded names (e.g. `makeFields`), one still needs to use one of the techniques mentioned below (or similar) to work around the problem of duplicate name selectors. In addition, lens-based syntax is more verbose, e.g. `f $ record ^. field` instead of possible `f record.field`. -More importantly, while the concept of lenses is very powerful, that power can be [complex to use](https://twitter.com/fylwind/status/549342595940237312?lang=en), and for many projects that complexity is undesirable. In many ways lenses let you abstract over record fields, but Haskell has neglected the "unabstracted" case of concrete fields. When this is said, we should mention again what has been discussed (e.g. [here](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-546159561)): this proposal is orthogonal but very complimentary to lens and can be used very well together with lens (one of the most appealing features enabled by NoFieldSelectors is the ability to use the same field names for everything: dot notation, lens-y getting, lens-y modification, record updates, `Show/Generic`). +More importantly, while the concept of lenses is very powerful, that power can be [complex to use](https://twitter.com/fylwind/status/549342595940237312?lang=en), and for many projects that complexity is undesirable. In many ways lenses let you abstract over record fields, but Haskell has neglected the "unabstracted" case of concrete fields. Moreover, as it has been [previously mentioned](#Effect-and-Interactions), this proposal is orthogonal to lens and can actually benefit lens users. * The [`DuplicateRecordFields` extension](https://downloads.haskell.org/~ghc/latest/docs/html/users_guide/glasgow_exts.html#duplicate-record-fields) is designed to solve similar problems. We evaluated this extension as the basis for DAML, but found it lacking. The rules about what types must be inferred by what point are cumbersome and tricky to work with, requiring a clear understanding of at what stage a type is inferred by the compiler. * Some style guidelines mandate that each record should be in a separate module. That works, but then requires qualified modules to access fields - e.g. `Person.name (Company.owner c)`. Forcing the structure of the module system to follow the records also makes circular dependencies vastly more likely, leading to complications such as boot files that are ideally avoided. * Some style guidelines suggest prefixing each record field with the type name, e.g. `personName (companyOwner c)`. While it works, it isn't pleasant, and many libraries then abbreviate the types to lead to code such as `prsnName (coOwner c)`, which can increase confusion. From a061127972abaf9475231c48725dcab06d3122f1 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Wed, 6 Nov 2019 22:56:38 +0000 Subject: [PATCH 19/49] Do not imply NoFieldSelectors --- proposals/0000-record-dot-syntax.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index dad569b40d..835135d771 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -240,11 +240,11 @@ A full, rigorous set of examples (as tests) are available in the examples direct **Higher-rank fields:** It is impossible to express `HasField` instances for data types such as `data T = MkT { foo :: forall a . a -> a}`, which means they can't have this syntax available. Users can still write their own selector functions using record puns if required. There is a possibility that with future types of impredicativity such `getField` expressions could be solved specially by the compiler. **Lenses and a.b syntax:** - The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. In addition, it has been discussed (e.g. [here](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-546159561)), that this proposal is complimentary to lens and can actually benefit lens users (as with NoFieldSelectors one can use the same field names for everything: dot notation, lens-y getting, lens-y modification, record updates, `Show/Generic`). + The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. In addition, it has been discussed (e.g. [here](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-546159561)), that this proposal is complimentary to lens and can actually benefit lens users (as with `NoFieldSelectors` one can use the same field names for everything: dot notation, lens-y getting, lens-y modification, record updates, `Show/Generic`). **Rebindable syntax:** When `RebindableSyntax` is enabled the `getField` and `setField` functions are those in scope, rather than those in `GHC.Records`. -**Enabled extensions:** When `RecordDotSyntax` is enabled it should imply the `NoFieldSelectors` extension and allow duplicate record field labels. It would be possible for `RecordDotSyntax` to imply `DuplicateRecordFields`, but we suspect that if people become comfortable with `RecordDotSyntax` then there will be a desire to remove the `DuplicateRecordFields` extension, so we don't want to build on top of it. +**Enabled extensions:** When `RecordDotSyntax` is a distinct extension, implying no other extensions off or on. It is often likely to be used in conjunction with either the `NoFieldSelectors` extension or`DuplicateRecordFields`. ## Costs and Drawbacks @@ -284,7 +284,7 @@ All these approaches are currently used, and represent the "status quo", where H Below are some possible variations on this plan, but we advocate the choices made above: -* Should `RecordDotSyntax` imply `NoFieldSelectors`? Typically `RecordDotSyntax` will be used in conjunction with `NoFieldSelectors`, but `DuplicateRecordFields` would work too. Of those two, `DuplicateRecordFields` complicates GHC, while `NoFieldSelectors` conceptually simplifies it, so we prefer to bias the eventual outcome. +* Should `RecordDotSyntax` imply `NoFieldSelectors` or another extension? Typically `RecordDotSyntax` will be used in conjunction with `NoFieldSelectors`, but `DuplicateRecordFields` would work too. Of those two, `DuplicateRecordFields` complicates GHC, while `NoFieldSelectors` conceptually simplifies it, so we prefer to bias the eventual outcome. However, there are lots of balls in the air, and enabling `RecordDotSyntax` should ideally not break normal code, so we leave everything distinct (after [being convinced](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-547641588)). * Earlier versions of this proposal contained a modify field sytnax of the form `a{field * 2}`. While appealing, there is a lot of syntactic debate, with variously `a{field <- (*2)}`, `a{field * = 2}` and others being proposed. None of these syntax variations are immediately clear to someone not familiar with this proposal. To be conservative, we leave this feature out. * There are no update sections. Should `({a=})`, `({a=b})` or `(.lbl=)` be an update section? While nice, we leave this feature out. * We do not extend pattern matching, although it would be possible for `P{foo.bar=Just x}` to be defined. From 9b143fb23b410c88e3edc4adca53c01fa9316d60 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Wed, 6 Nov 2019 23:04:46 +0000 Subject: [PATCH 20/49] Update 0000-record-dot-syntax.md --- proposals/0000-record-dot-syntax.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 835135d771..2d16156b49 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -290,7 +290,7 @@ Below are some possible variations on this plan, but we advocate the choices mad * We do not extend pattern matching, although it would be possible for `P{foo.bar=Just x}` to be defined. * Will whitespace sensitivity become worse? We're not aware of qualified modules giving any problems, but it's adding whitespace sensitivity in one more place. * One suggestion is that record updates remain as normal, but `a { .foo = 1 }` be used to indicate the new forms of updates. While possible, we believe that option leads to a confusing result, with two forms of update both of which fail in different corner cases. Instead, we recommend use of `C{foo}` as a pattern to extract fields if necessary. -* For selector functions we have opted for `.foo`, but `(.foo)` and `_.foo` have both been proposed. We consider `_.foo` to not be very Haskelly, as it is similar to very different uses of underscore. The `.foo` seems to be universally preferred. +* For selector functions we have opted for `.foo`, but `(.foo)` and `_.foo` have both been proposed. We consider `_.foo` to not be very Haskelly, as it is similar to very different uses of underscore. This aspect is the most debated of the entire proposal (following [Wadler's law](https://wiki.haskell.org/Wadler's_Law)). The reasoning behind selecting `.foo` without brackets are that `.` is special syntax, so isn't realy a section. There is nothing else that `.foo` could reasonably mean. People can wrap it in brackets if they want. * Originally this proposal included `a{foo.bar}` to mean `a{foo.bar = bar}`, but that seemed to confuse everyone, so has been removed. ## Implementation Plan From 78823348e39506939d383ca1e87cb5b5521d5d5b Mon Sep 17 00:00:00 2001 From: Chris Smith Date: Wed, 13 Nov 2019 12:40:54 -0500 Subject: [PATCH 21/49] Reformat alternatives to make room for more detail --- proposals/0000-record-dot-syntax.md | 39 +++++++++++++++++++++++------ 1 file changed, 31 insertions(+), 8 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 2d16156b49..ab0fd49e81 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -284,14 +284,37 @@ All these approaches are currently used, and represent the "status quo", where H Below are some possible variations on this plan, but we advocate the choices made above: -* Should `RecordDotSyntax` imply `NoFieldSelectors` or another extension? Typically `RecordDotSyntax` will be used in conjunction with `NoFieldSelectors`, but `DuplicateRecordFields` would work too. Of those two, `DuplicateRecordFields` complicates GHC, while `NoFieldSelectors` conceptually simplifies it, so we prefer to bias the eventual outcome. However, there are lots of balls in the air, and enabling `RecordDotSyntax` should ideally not break normal code, so we leave everything distinct (after [being convinced](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-547641588)). -* Earlier versions of this proposal contained a modify field sytnax of the form `a{field * 2}`. While appealing, there is a lot of syntactic debate, with variously `a{field <- (*2)}`, `a{field * = 2}` and others being proposed. None of these syntax variations are immediately clear to someone not familiar with this proposal. To be conservative, we leave this feature out. -* There are no update sections. Should `({a=})`, `({a=b})` or `(.lbl=)` be an update section? While nice, we leave this feature out. -* We do not extend pattern matching, although it would be possible for `P{foo.bar=Just x}` to be defined. -* Will whitespace sensitivity become worse? We're not aware of qualified modules giving any problems, but it's adding whitespace sensitivity in one more place. -* One suggestion is that record updates remain as normal, but `a { .foo = 1 }` be used to indicate the new forms of updates. While possible, we believe that option leads to a confusing result, with two forms of update both of which fail in different corner cases. Instead, we recommend use of `C{foo}` as a pattern to extract fields if necessary. -* For selector functions we have opted for `.foo`, but `(.foo)` and `_.foo` have both been proposed. We consider `_.foo` to not be very Haskelly, as it is similar to very different uses of underscore. This aspect is the most debated of the entire proposal (following [Wadler's law](https://wiki.haskell.org/Wadler's_Law)). The reasoning behind selecting `.foo` without brackets are that `.` is special syntax, so isn't realy a section. There is nothing else that `.foo` could reasonably mean. People can wrap it in brackets if they want. -* Originally this proposal included `a{foo.bar}` to mean `a{foo.bar = bar}`, but that seemed to confuse everyone, so has been removed. +### Should `RecordDotSyntax` imply `NoFieldSelectors` or another extension? + +Typically `RecordDotSyntax` will be used in conjunction with `NoFieldSelectors`, but `DuplicateRecordFields` would work too. Of those two, `DuplicateRecordFields` complicates GHC, while `NoFieldSelectors` conceptually simplifies it, so we prefer to bias the eventual outcome. However, there are lots of balls in the air, and enabling `RecordDotSyntax` should ideally not break normal code, so we leave everything distinct (after [being convinced](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-547641588)). + +### Should a syntax be provided for modification? + +Earlier versions of this proposal contained a modify field sytnax of the form `a{field * 2}`. While appealing, there is a lot of syntactic debate, with variously `a{field <- (*2)}`, `a{field * = 2}` and others being proposed. None of these syntax variations are immediately clear to someone not familiar with this proposal. To be conservative, we leave this feature out. + +### Should there be update sections? + +There are no update sections. Should `({a=})`, `({a=b})` or `(.lbl=)` be an update section? While nice, we leave this feature out. + +### Should pattern matching be extended? + +We do not extend pattern matching, although it would be possible for `P{foo.bar=Just x}` to be defined. + +### Will whitespace sensitivity become worse? + +We're not aware of qualified modules giving any problems, but it's adding whitespace sensitivity in one more place. + +### Should a new update syntax be added? + +One suggestion is that record updates remain as normal, but `a { .foo = 1 }` be used to indicate the new forms of updates. While possible, we believe that option leads to a confusing result, with two forms of update both of which fail in different corner cases. Instead, we recommend use of `C{foo}` as a pattern to extract fields if necessary. + +### Which syntax should be chosen for selector functions? + +For selector functions we have opted for `.foo`, but `(.foo)` and `_.foo` have both been proposed. We consider `_.foo` to not be very Haskelly, as it is similar to very different uses of underscore. This aspect is the most debated of the entire proposal (following [Wadler's law](https://wiki.haskell.org/Wadler's_Law)). The reasoning behind selecting `.foo` without brackets are that `.` is special syntax, so isn't realy a section. There is nothing else that `.foo` could reasonably mean. People can wrap it in brackets if they want. + +### Should punning be extended to updates? + +Originally this proposal included `a{foo.bar}` to mean `a{foo.bar = bar}`, but that seemed to confuse everyone, so has been removed. ## Implementation Plan From 1fd5df9db4f2113ba2777ffbbf24763c3cb157f6 Mon Sep 17 00:00:00 2001 From: Chris Smith Date: Wed, 13 Nov 2019 13:15:37 -0500 Subject: [PATCH 22/49] Attempt to fairly represent both perspectives on selector syntax --- proposals/0000-record-dot-syntax.md | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index ab0fd49e81..ba628790ce 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -310,7 +310,19 @@ One suggestion is that record updates remain as normal, but `a { .foo = 1 }` be ### Which syntax should be chosen for selector functions? -For selector functions we have opted for `.foo`, but `(.foo)` and `_.foo` have both been proposed. We consider `_.foo` to not be very Haskelly, as it is similar to very different uses of underscore. This aspect is the most debated of the entire proposal (following [Wadler's law](https://wiki.haskell.org/Wadler's_Law)). The reasoning behind selecting `.foo` without brackets are that `.` is special syntax, so isn't realy a section. There is nothing else that `.foo` could reasonably mean. People can wrap it in brackets if they want. +Three syntax options have bee proposed for selector functions: `.foo`, `(.foo)`, and `_.foo`. This aspect is the most debated of the entire proposal (following [Wadler's law](https://wiki.haskell.org/Wadler's_Law)). We have opted for `.foo`. + +We consider `_.foo` to not be very Haskelly, as it is similar to very different uses of underscore. Therefore, we reject it. + +The decision between `.foo` and `(.foo)` comes down to both pragmatics, and a significant difference of perspective: + +* On the one hand, `x.foo` can be seen as the core syntax, and `(.foo)` as a section of that syntax. Note that this is NOT a section of `.` as a binary operator, but rather a section in the more general sense that it elides the one and only expression, and desugars to `\x -> x.foo`. This is very similar to `SignatureSections`, in that there is no operator or right-hand expression, so there are no separate left and right sections. `TupleSections` for 3-tuples and above are another example. But some would like to resist spreading this generalization further. +* On the other hand, some believe that `.foo` is better understood as the fundamental construct here, reminiscent of `#foo` from `OverloadedLabels`, and should desugar directly to a use of `getField`. Then `x.foo` is just a syntactic sugar for `.foo x`. Some have advocated that `.foo` should not even need to be a field accessor function at all, and might be any type in the same vein as `OverloadedLabels`, but that is beyond the scope of this proposal and has its own challenges around how to handle nested fields like `.foo.bar`, how to desugar `x {foo.bar = 42}` consistently, and so on. + +Independent of this difference, there are pragmatic concerns on both sides: + +* Some consider treating this as a section like `(.foo)` to be too verbose, and the extra level of parentheses a problem for readability. +* Some consider it acceptable (if unfortunate) that `a . b` and `a.b` have different meanings, but believe that assigning three distinct meanings to `a . b`, `a .b`, and `a.b` is just too confusing. ### Should punning be extended to updates? From c8ba999082866a08c3c84148bca7ea793ad73f06 Mon Sep 17 00:00:00 2001 From: Chris Smith Date: Wed, 13 Nov 2019 20:43:10 -0500 Subject: [PATCH 23/49] Reword and include a few more small points. Most of the changes are based on comments from others. - Make the two overall perspectives clear before diving into arguments over "is it a section". - Fill out the argument against sections a bit per @tysonzero. - Discuss sugar for nested field `.foo.bar` in the labels case and the interaction with RebindableSyntax. Drop the speculative IsLabel stuff. --- proposals/0000-record-dot-syntax.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index ba628790ce..a84dea3fd3 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -314,15 +314,17 @@ Three syntax options have bee proposed for selector functions: `.foo`, `(.foo)`, We consider `_.foo` to not be very Haskelly, as it is similar to very different uses of underscore. Therefore, we reject it. -The decision between `.foo` and `(.foo)` comes down to both pragmatics, and a significant difference of perspective: +The decision between `.foo` and `(.foo)` partially comes from a significant difference of perspective: -* On the one hand, `x.foo` can be seen as the core syntax, and `(.foo)` as a section of that syntax. Note that this is NOT a section of `.` as a binary operator, but rather a section in the more general sense that it elides the one and only expression, and desugars to `\x -> x.foo`. This is very similar to `SignatureSections`, in that there is no operator or right-hand expression, so there are no separate left and right sections. `TupleSections` for 3-tuples and above are another example. But some would like to resist spreading this generalization further. -* On the other hand, some believe that `.foo` is better understood as the fundamental construct here, reminiscent of `#foo` from `OverloadedLabels`, and should desugar directly to a use of `getField`. Then `x.foo` is just a syntactic sugar for `.foo x`. Some have advocated that `.foo` should not even need to be a field accessor function at all, and might be any type in the same vein as `OverloadedLabels`, but that is beyond the scope of this proposal and has its own challenges around how to handle nested fields like `.foo.bar`, how to desugar `x {foo.bar = 42}` consistently, and so on. +* On the one hand, `x.foo` can be seen, as described above, as a new syntax that desugars to `getField @"foo" x`, and `(.foo)` and `(.foo.bar)` as sections of that syntax, which themselves desugar to `\x -> x.foo` and `\x -> x.foo.bar`. Note that this is NOT a section of `.` as a binary operator, but rather a section in the more general sense that it elides the one and only subexpression and adds an implicit lambda. +* On the other hand, `.foo` can be seen as the more fundamental construct here, reminiscent of `#foo` from `OverloadedLabels`, and it can desugar directly to `getField @"foo"`. Then one can recover the rest of the syntax above by defining more syntactic sugar: `x.foo` desugars to `.foo x`, and `.foo.bar` desugars to `.bar . .foo` (or `\x -> .bar (.foo x)`, a distinction that only matters if `RebindableSyntax` is in play), and `x.foo.bar` to one of `.bar (.foo x)` or `.foo.bar x` (which are equivalent unless `.foo.bar` is changed by `RebindableSyntax`). + +Thus, it has been discussed at length whether field selection can be seen as just another section, or should be seen as something else that is not section-like. The `SignatureSections` and `TupleSections` extensions (especially for 3-tuples and larger) have already established that that section can be formed by various kinds of elided expressions, not just the operands of a binary operator. However, some would resist spreading this generalization further, and argue that `SignatureSections` and `TupleSections` are justified by looking "operator-like". That is, even though a single `,` in a 3-tuple and the `::` in a type annotation are not real operators, some feel that they at least look a little more like it because there is non-trivial grammar on both sides. Independent of this difference, there are pragmatic concerns on both sides: -* Some consider treating this as a section like `(.foo)` to be too verbose, and the extra level of parentheses a problem for readability. -* Some consider it acceptable (if unfortunate) that `a . b` and `a.b` have different meanings, but believe that assigning three distinct meanings to `a . b`, `a .b`, and `a.b` is just too confusing. +* Some consider the parentheses to be too verbose, and the extra level of parentheses a problem for readability. Even if one agrees that this is conceptually a section, this is the first type of section where parentheses are not actually needed for parsing, so omitting parentheses is still possible even if it loses a bit of consistency in favor of brevity. +* Some consider it acceptable (if unfortunate) that `a . b` and `a.b` have different meanings in this proposal, but believe that assigning three distinct meanings to `a . b`, `a .b`, and `a.b` is just too confusing. ### Should punning be extended to updates? From 0e45e6793faa5fc73974de197a53d2db078759c5 Mon Sep 17 00:00:00 2001 From: Chris Smith Date: Wed, 13 Nov 2019 22:28:01 -0500 Subject: [PATCH 24/49] Move RebindableSyntax concerns to the end. --- proposals/0000-record-dot-syntax.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index a84dea3fd3..92a506780f 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -317,7 +317,7 @@ We consider `_.foo` to not be very Haskelly, as it is similar to very different The decision between `.foo` and `(.foo)` partially comes from a significant difference of perspective: * On the one hand, `x.foo` can be seen, as described above, as a new syntax that desugars to `getField @"foo" x`, and `(.foo)` and `(.foo.bar)` as sections of that syntax, which themselves desugar to `\x -> x.foo` and `\x -> x.foo.bar`. Note that this is NOT a section of `.` as a binary operator, but rather a section in the more general sense that it elides the one and only subexpression and adds an implicit lambda. -* On the other hand, `.foo` can be seen as the more fundamental construct here, reminiscent of `#foo` from `OverloadedLabels`, and it can desugar directly to `getField @"foo"`. Then one can recover the rest of the syntax above by defining more syntactic sugar: `x.foo` desugars to `.foo x`, and `.foo.bar` desugars to `.bar . .foo` (or `\x -> .bar (.foo x)`, a distinction that only matters if `RebindableSyntax` is in play), and `x.foo.bar` to one of `.bar (.foo x)` or `.foo.bar x` (which are equivalent unless `.foo.bar` is changed by `RebindableSyntax`). +* On the other hand, `.foo` can be seen as the more fundamental construct here, reminiscent of `#foo` from `OverloadedLabels`, and it can desugar directly to `getField @"foo"`. Then one can recover the rest of the syntax above by desugaring: `x.foo` to `.foo x`, and `.foo.bar` to `.bar . .foo`. Thus, it has been discussed at length whether field selection can be seen as just another section, or should be seen as something else that is not section-like. The `SignatureSections` and `TupleSections` extensions (especially for 3-tuples and larger) have already established that that section can be formed by various kinds of elided expressions, not just the operands of a binary operator. However, some would resist spreading this generalization further, and argue that `SignatureSections` and `TupleSections` are justified by looking "operator-like". That is, even though a single `,` in a 3-tuple and the `::` in a type annotation are not real operators, some feel that they at least look a little more like it because there is non-trivial grammar on both sides. @@ -325,6 +325,7 @@ Independent of this difference, there are pragmatic concerns on both sides: * Some consider the parentheses to be too verbose, and the extra level of parentheses a problem for readability. Even if one agrees that this is conceptually a section, this is the first type of section where parentheses are not actually needed for parsing, so omitting parentheses is still possible even if it loses a bit of consistency in favor of brevity. * Some consider it acceptable (if unfortunate) that `a . b` and `a.b` have different meanings in this proposal, but believe that assigning three distinct meanings to `a . b`, `a .b`, and `a.b` is just too confusing. +* A minor point is that composition comes up explicitly in the nested selector desugaring for the `.foo` case. This raises the question of what happens with `RebindableSyntax`. Does `RebindableSyntax` chooses the `(.)` from local scope? If so, `(x.foo).bar` and `.foo.bar x` could have different values, and `x.foo.bar` might parse as either one. ### Should punning be extended to updates? From 6534817755f5b6acf932ef0270df7f2323a4617a Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Mon, 25 Nov 2019 14:11:43 +0000 Subject: [PATCH 25/49] Update 0000-record-dot-syntax.md --- proposals/0000-record-dot-syntax.md | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 2d16156b49..17da78daed 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -239,8 +239,7 @@ A full, rigorous set of examples (as tests) are available in the examples direct **Higher-rank fields:** It is impossible to express `HasField` instances for data types such as `data T = MkT { foo :: forall a . a -> a}`, which means they can't have this syntax available. Users can still write their own selector functions using record puns if required. There is a possibility that with future types of impredicativity such `getField` expressions could be solved specially by the compiler. -**Lenses and a.b syntax:** - The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. In addition, it has been discussed (e.g. [here](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-546159561)), that this proposal is complimentary to lens and can actually benefit lens users (as with `NoFieldSelectors` one can use the same field names for everything: dot notation, lens-y getting, lens-y modification, record updates, `Show/Generic`). +**Lenses and a.b syntax:** The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. In addition, it has been discussed (e.g. [here](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-546159561)), that this proposal is complimentary to lens and can actually benefit lens users (as with `NoFieldSelectors` one can use the same field names for everything: dot notation, lens-y getting, lens-y modification, record updates, `Show/Generic`). **Rebindable syntax:** When `RebindableSyntax` is enabled the `getField` and `setField` functions are those in scope, rather than those in `GHC.Records`. @@ -264,8 +263,6 @@ let temp = myexpression.field1.field2.field3 in temp.field4.field5 ``` -We prefer the former, but both are permissible. - ## Alternatives to this proposal Instead of this proposal, we could do any of the following: From ceb9f11694ece8c7c3478232b95bece6405e07a8 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Mon, 25 Nov 2019 14:18:08 +0000 Subject: [PATCH 26/49] Clarify `.` and rebindable syntax --- proposals/0000-record-dot-syntax.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 583457df05..6bb5185377 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -241,7 +241,7 @@ A full, rigorous set of examples (as tests) are available in the examples direct **Lenses and a.b syntax:** The `a.b` syntax is commonly used in conjunction with the `lens` library, e.g. `expr^.field1.field2`. Treating `a.b` without spaces as a record projection would break such code. The alternatives would be to use a library with a different lens composition operator (e.g. `optics`), introduce an alias in `lens` for `.` (perhaps `%`), write such expressions with spaces, or not enable this extension when also using lenses. While unfortunate, we consider that people who are heavy users of lens don't feel the problems of inadequate records as strongly, so the problems are lessened. In addition, it has been discussed (e.g. [here](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-546159561)), that this proposal is complimentary to lens and can actually benefit lens users (as with `NoFieldSelectors` one can use the same field names for everything: dot notation, lens-y getting, lens-y modification, record updates, `Show/Generic`). -**Rebindable syntax:** When `RebindableSyntax` is enabled the `getField` and `setField` functions are those in scope, rather than those in `GHC.Records`. +**Rebindable syntax:** When `RebindableSyntax` is enabled the `getField` and `setField` functions are those in scope, rather than those in `GHC.Records`. The `.` function (as used in the `a.b.c` desugaring) remains the `Prelude` version (we see the `.` as a syntactic shortcut for an explicit lambda, and believe that whether the implementation uses literal `.` or a lambda is an internal detail). **Enabled extensions:** When `RecordDotSyntax` is a distinct extension, implying no other extensions off or on. It is often likely to be used in conjunction with either the `NoFieldSelectors` extension or`DuplicateRecordFields`. @@ -322,7 +322,6 @@ Independent of this difference, there are pragmatic concerns on both sides: * Some consider the parentheses to be too verbose, and the extra level of parentheses a problem for readability. Even if one agrees that this is conceptually a section, this is the first type of section where parentheses are not actually needed for parsing, so omitting parentheses is still possible even if it loses a bit of consistency in favor of brevity. * Some consider it acceptable (if unfortunate) that `a . b` and `a.b` have different meanings in this proposal, but believe that assigning three distinct meanings to `a . b`, `a .b`, and `a.b` is just too confusing. -* A minor point is that composition comes up explicitly in the nested selector desugaring for the `.foo` case. This raises the question of what happens with `RebindableSyntax`. Does `RebindableSyntax` chooses the `(.)` from local scope? If so, `(x.foo).bar` and `.foo.bar x` could have different values, and `x.foo.bar` might parse as either one. ### Should punning be extended to updates? From 938cbe777696e5958498aa34be29c38ca587d300 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Mon, 25 Nov 2019 14:28:35 +0000 Subject: [PATCH 27/49] More notes about selector functions --- proposals/0000-record-dot-syntax.md | 1 + 1 file changed, 1 insertion(+) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 6bb5185377..03da983d38 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -322,6 +322,7 @@ Independent of this difference, there are pragmatic concerns on both sides: * Some consider the parentheses to be too verbose, and the extra level of parentheses a problem for readability. Even if one agrees that this is conceptually a section, this is the first type of section where parentheses are not actually needed for parsing, so omitting parentheses is still possible even if it loses a bit of consistency in favor of brevity. * Some consider it acceptable (if unfortunate) that `a . b` and `a.b` have different meanings in this proposal, but believe that assigning three distinct meanings to `a . b`, `a .b`, and `a.b` is just too confusing. +* Looking at that existing implementation of GHC, supporting `(.b)` is less changes that supporting `.b` alone. While the implementation complexity is not a reason for picking one over the other, the existing grammar of the compiler can give hints about what logically follows. ### Should punning be extended to updates? From 182e4c458f1fc774e9a7a0805a62d1c596f4f0a0 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Mon, 25 Nov 2019 14:37:13 +0000 Subject: [PATCH 28/49] Update 0000-record-dot-syntax.md --- proposals/0000-record-dot-syntax.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 03da983d38..a8678ddbc1 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -328,6 +328,10 @@ Independent of this difference, there are pragmatic concerns on both sides: Originally this proposal included `a{foo.bar}` to mean `a{foo.bar = bar}`, but that seemed to confuse everyone, so has been removed. +## Unresolved issues + +In this proposal we pick `.field` to be the syntax for selector functions, however, there are also good reasons (listed in this proposal) to require brackets, namely `(.field)`. While resolved, we consider it worth the committees deliberation as to which is preferable. Neither author is opposed to either outcome. + ## Implementation Plan If accepted, the proposal authors would be delighted to provide an implementation. Implementation depends on the implementation of [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst) and [the `NoFieldSelectors` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0160-no-toplevel-field-selectors.rst). From 4a2f4b31bfb1653ce21b4dae05fe16ccdccc0543 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Mon, 25 Nov 2019 16:30:20 -0500 Subject: [PATCH 29/49] Rework the formals to allow naked selectors --- proposals/0000-record-dot-syntax.md | 95 +++++++++++------------------ 1 file changed, 37 insertions(+), 58 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index a8678ddbc1..8a3a7d9724 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -81,13 +81,13 @@ A new lexeme *fieldid* is introduced.

*lexeme* → *qvarid* | *qconid* | *qvarsym* | *qconsym* | *literal* | *special* | *reservedop* | *reservedid* | *fieldid* -
*fieldid* → *.varid* +
*fieldid* → *.varid{.varid}* This specification results in the following. ```haskell -- Regular expressions -@fieldid = (\. @varid) +@fieldid = (\. @varid)+ ... <0,option_prags> { ... @@ -99,93 +99,72 @@ This specification results in the following. data Token = ITas | ... - | ITfieldid FastString + | ITfieldid [FastString] ... -- Lexer actions fieldid :: StringBuffer -> Int -> Token -fieldid buf len = let (_dot, buf') = nextChar buf in ITfieldid $! lexemeToFastString buf' (len - 1) +fieldid buf len = ITfieldid $! splitFields buf len + +-- Split a buffer with contents like '.foo.bar.baz' into components. +splitFields :: StringBuffer -> Int -> [FastString] +splitFields buf len = ... ``` Tokens of case `ITfieldid` may not be issued if `RecordDotSyntax` is not enabled. ### Parser -#### Field selections -To support '.' field selection the *fexp* production is extended. -
-
*fexp* → [ *fexp* ] *aexp* | *fexp* *fieldid* +#### Sections -The specification expresses like this. +To support sections (e.g. `.foo.bar.baz`), we generalize *aexp*. +
+
*aexp* → *fieldid* +This specification results in the following schematic. ```haskell %token ... - FIELDID { L _ (ITfieldid _) } + FIELDID { L _ (ITfieldid _) } + ... + %% -... +aexp :: { ECP } + ... -fexp :: { ECP } - : fexp aexp { ...} - | fexp FIELDID { ...} -- <- here - | ... -``` + | '\\' apat apats '->' exp {...} -#### Field updates + | FIELDID { + ... -- This case is only possible when 'RecordDotSyntax' is enabled. + } -- <- here -To support the new forms of '.' field update, the *aexp* production is extended. -
-
*aexp* → *aexp⟨qcon⟩* { *pbind* , … , *pbind* } -
*pbind* -> *qvar*=*exp* | *var* *fieldids*=*exp* -
*fieldids* -> *fieldids* *fieldid* + ... +``` -In this table, the newly added cases are shown next to an example expression they enable: +### Field selections -| Production | Example | Commentary | -| -- | -- | -- | -|*var* *fieldids*=*exp* | `a{foo.bar = 2}` | the *var* is `foo`, `.bar` is a fieldid | +To support field selections, the existing production *fexp* → *[fexp]* *aexp* is sufficient. -For example, support for expressions like `a{foo.bar.baz.quux=i}` can be had with one additional case: +### Field updates + +To support field updates, the *aexp* production is extended. +
+
*aexp* → *aexp⟨qcon⟩* { *pbind* , … , *pbind* } +
*pbind* -> *qvar*=*exp* | *var* *aexp*=*exp* ```haskell + aexp1 :: { ECP } : aexp1 '{' fbinds '}' { ... } - | aexp1 '{' VARID fieldids '=' texp '}' {...} -- <- here - -fieldids :: {[FastString]} -fieldids - : fieldids FIELDID { getFIELDID $2 : $1 } - | FIELDID { [getFIELDID $1] } - -{ -getFIELDID (dL->L _ (ITfieldid x)) = x -} -``` - -An implementation of `RecordDotSyntax` will have to do more than this to incorporate all alternatives. + | aexp1 '{' VARID aexp '=' texp '}' {...} -- <- here -#### Sections - -To support '.' sections (e.g. `(.foo.bar.baz)`), we generalize *aexp*. -
-
*aexp* → ( *infixexp* *qop* ) (left section) - | ( *qop* *infixexp* ) (right section) - | ( *fieldids* ) (projection (right) section) - -This specification implies the following additional case to `aexp2`. - -```haskell -aexp2 :: { ECP } - ... - | '(' texp ')' {...} - | '(' fieldids ')' {...} -- <- here ``` ### Prototype -To confirm these changes integrate as expected we have written [a prototype implementation](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax) that parses and desugars the forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. Note that in the prototype, projection ([as proposed here](#syntax)), takes precedence over application so `f a.foo.bar.baz.quux 12` parses as `f (a.foo.bar.baz.quux) 12`. +To gain confidence these changes integrate as expected [a prototype implementation](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax) was produced that parses and desugars forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. Note that in the prototype, projection ([as proposed here](#syntax)), takes precedence over application so `f a.foo.bar.baz.quux 12` parses as `f (a.foo.bar.baz.quux) 12`. Note also that the prototype does not follow the specification given above in that : `x .lbl` means `x.lbl` means `(.lbl) x`. ## Examples @@ -231,7 +210,7 @@ getTerms :: [Class] -> [Quarter] getTerms = map .taken.term -- nested selector ``` -A full, rigorous set of examples (as tests) are available in the examples directory of [this repository](https://github.com/ndmitchell/record-dot-preprocessor). Those tests include infix applications, polymorphic data types, interoperation with other extensions and more. They follow the [specifications given earlier](#proposed-change-specification). +A full, rigorous set of examples (as tests) are available in the examples directory of [this repository](https://github.com/ndmitchell/record-dot-preprocessor). Those tests include infix applications, polymorphic data types, interoperation with other extensions and more. ## Effect and Interactions @@ -330,7 +309,7 @@ Originally this proposal included `a{foo.bar}` to mean `a{foo.bar = bar}`, but t ## Unresolved issues -In this proposal we pick `.field` to be the syntax for selector functions, however, there are also good reasons (listed in this proposal) to require brackets, namely `(.field)`. While resolved, we consider it worth the committees deliberation as to which is preferable. Neither author is opposed to either outcome. +In this proposal we pick `.field` to be the syntax for selector functions, however, there are also good reasons (listed [in this proposal](#which-syntax-should-be-chosen-for-selector-functions)) to require brackets, namely `(.field)`. While resolved, we consider it worth the committee's deliberation as to which is preferable. Neither author is opposed to either outcome. ## Implementation Plan From 096904d7e2bf09046664c54b1b3e8bffc441d25f Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Wed, 11 Dec 2019 11:49:21 -0500 Subject: [PATCH 30/49] Start on updates --- proposals/0000-record-dot-syntax.md | 61 ++++++++++++++++++++++++----- 1 file changed, 51 insertions(+), 10 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index a8678ddbc1..947e4b0194 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -38,27 +38,68 @@ An implementation of this proposal has been battle tested and hardened over 18 m For the specification we focus on the changes to the parsing rules, and the desugaring, with the belief the type checking and renamer changes required are an unambiguous consequences of those. ### `RecordDotSyntax` language extension +This change adds a new language extension `RecordDotSyntax`. -This change adds a new language extension `RecordDotSyntax`. In the event the language extension is enabled: +#### Syntax +In the event the language extension is enabled: | Expression | Equivalent | | -- | -- | +| `.lbl` | `(\x -> x.lbl)` the `.` cannot have whitespace after | +| `.lbl1.lbl2` | `(\x -> x.lbl1.lbl2)` | | `e.lbl` | `getField @"lbl" e` the `.` cannot have whitespace either before or after | +| `e.lbl1.lbl2` | `(e.lbl1).lbl2` | | `e{lbl = val}` | `setField @"lbl" e val` | -| `.lbl` | `(\x -> x.lbl)` the `.` cannot have whitespace after | | `e{lbl1.lbl2 = val}` | `e{lbl1 = (e.lbl1){lbl2 = val}}` performing a nested update | - -The above forms combine to provide these identities: - -| Expression | Identity -| -- | -- | -| `e.lbl1.lbl2` | `(e.lbl1).lbl2` | -| `.lbl1.lbl2` | `(\x -> x.lbl1.lbl2)` | | `e.lbl1{lbl2 = val}` | `(e.lbl1){lbl2 = val}` | | `e{lbl1 = val}.lbl2` | `(e{lbl1 = val}).lbl2` | | `e{lbl1 = val1, lbl2 = val2}` | `(e{lbl1 = val1}){lbl2 = val2}` | -### Syntax +*[Note: `e{lbl=val}` is the syntax of a standard H98 record update. It's the nested form introduced by this proposal that is new : `e{lbl1.lbl2 = val}`. However, in the event `RecordDotSyntax` is in effect, we propose that `e{lbl = val}` desugar to `setField @"lbl" e a`]*. + +#### Precedence + +Regarding precedence, we propose that '`.`' should "bind more tightly" than function application thus, `f r.a.b` should parse as `f (r.a.b)`. + +### Definitions + +For clarity of terminology in what follows, we make the following informal definitions: +* A **field selector** is an expression like `.a` or `.a.b` preceded by white-space; +* A **field selection** is an expression like `r.a` or `(f x).a.b`, where the first dot is preceded by a close paren or a varid; +* A **field update** is an expression like `r{a = 12}` or `r{a.b = "foo"}`; +* A **punned field update** is an expression like `r{a}` or `r{a.b}` (where it is understood that `b` is a variable bound in the environment of the expression and only valid syntax if the `NamedFieldPuns` language extension is in effect). + +### Lexing and Parsing + +The intent of this section is **not** to recommend a particular parsing implementation. Rather, we aim to prove that lexing and parsing is feasible. We are open to better strategies for lexing and parsing but in their absence, the scheme presented here appears to have the required properties. + +#### Lexer + +A new lexeme `fieldid` is introduced. +```haskell +-- Regular expressions +@fieldid = (\. @varid)+ +... +<0,option_prags> { + ... + @fieldid / {ifExtension RecordDotSyntaxBit} { idtoken fieldid } +} +... + +-- Token type +data Token + = ITas + | ... + | ITfieldid [FastString] + ... +-- Lexer actions + +``` + + + + + #### Record selection From ecc84f8e3b10fa0b161578e3a2c06feebc8001a2 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Wed, 11 Dec 2019 13:30:16 -0500 Subject: [PATCH 31/49] Update specification and other improvements --- proposals/0000-record-dot-syntax.md | 203 ++++++++++++++-------------- 1 file changed, 104 insertions(+), 99 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 30570b32e7..fbbc2f5a7a 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -12,7 +12,7 @@ This proposal is [discussed at this pull request](https://github.com/ghc-proposa Records in Haskell are [widely recognised](https://www.yesodweb.com/blog/2011/09/limitations-of-haskell) as being under-powered, with duplicate field names being particularly troublesome. We propose a new language extension `RecordDotSyntax` that provides syntactic sugar to make the features introduced in [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst) more accessible, improving the user experience. -## Motivation +## 1. Motivation In almost every programming language we write `a.b` to mean the `b` field of the `a` record expression. In Haskell that becomes `b a`, and even then, only works if there is only one `b` in scope. Haskell programmers have struggled with this weakness, variously putting each record in a separate module and using qualified imports, or prefixing record fields with the type name. We propose bringing `a.b` to Haskell, which works regardless of how many `b` fields are in scope. Here's a simple example of what is on offer: @@ -33,14 +33,14 @@ We declare two records both having `name` as a field label. The user may then wr An implementation of this proposal has been battle tested and hardened over 18 months in the enterprise environment as part of [Digital Asset](https://digitalasset.com/)'s [DAML](https://daml.com/) smart contract language (a Haskell derivative utilizing GHC in its implementation), and also in a [Haskell preprocessor and a GHC plugin](https://github.com/ndmitchell/record-dot-preprocessor/). When initially considering Haskell as a basis for DAML, the inadequacy of records was considered the most severe problem, and without devising the scheme presented here, we wouldn't be using Haskell. The feature enjoys universal popularity with users. -## Proposed Change Specification +## 2. Proposed Change Specification For the specification we focus on the changes to the parsing rules, and the desugaring, with the belief the type checking and renamer changes required are an unambiguous consequences of those. -### `RecordDotSyntax` language extension +### 2.1 `RecordDotSyntax` language extension This change adds a new language extension `RecordDotSyntax`. -#### Syntax +#### 2.1.1 Syntax In the event the language extension is enabled: | Expression | Equivalent | @@ -57,11 +57,11 @@ In the event the language extension is enabled: *[Note: `e{lbl=val}` is the syntax of a standard H98 record update. It's the nested form introduced by this proposal that is new : `e{lbl1.lbl2 = val}`. However, in the event `RecordDotSyntax` is in effect, we propose that `e{lbl = val}` desugar to `setField @"lbl" e a`]*. -#### Precedence +#### 2.1.2 Precedence Regarding precedence, we propose that '`.`' should "bind more tightly" than function application thus, `f r.a.b` should parse as `f (r.a.b)`. -### Definitions +### 2.2 Definitions For clarity of terminology in what follows, we make the following informal definitions: * A **field selector** is an expression like `.a` or `.a.b` preceded by white-space; @@ -69,54 +69,11 @@ For clarity of terminology in what follows, we make the following informal defin * A **field update** is an expression like `r{a = 12}` or `r{a.b = "foo"}`; * A **punned field update** is an expression like `r{a}` or `r{a.b}` (where it is understood that `b` is a variable bound in the environment of the expression and only valid syntax if the `NamedFieldPuns` language extension is in effect). -### Lexing and Parsing +### 2.3 Lexing and Parsing -The intent of this section is **not** to recommend a particular parsing implementation. Rather, we aim to prove that lexing and parsing is feasible. We are open to better strategies for lexing and parsing but in their absence, the scheme presented here appears to have the required properties. +The intent of this section is **not** to recommend a particular parsing implementation. Rather, we aim only to show that lexing and parsing is feasible. We are open to learning of better strategies for the implementation of the lexing and parsing but in their absence, the scheme presented here appears to have the required properties. -#### Lexer - -A new lexeme `fieldid` is introduced. -```haskell --- Regular expressions -@fieldid = (\. @varid)+ -... -<0,option_prags> { - ... - @fieldid / {ifExtension RecordDotSyntaxBit} { idtoken fieldid } -} -... - --- Token type -data Token - = ITas - | ... - | ITfieldid [FastString] - ... --- Lexer actions - -``` - - - - - - -#### Record selection - -The expression: - -> e.lbl - -means `getField @"lbl" e`, provided: - -- There is no whitespace either side of `.`; -- That `lbl` is a valid variable name; -- That `e` is an expression, but not a *conid*; -- Precedence : `f a.foo.bar.baz.quux 12` parses as `f (a.foo.bar.baz.quux) 12`. - -Similarly, `e{lbl=val}` only applies if `e` is an expression, but not a *conid*. - -### Lexer +#### 2.3.1 Lexer A new lexeme *fieldid* is introduced.
@@ -124,7 +81,7 @@ A new lexeme *fieldid* is introduced. | *literal* | *special* | *reservedop* | *reservedid* | *fieldid*
*fieldid* → *.varid{.varid}* -This specification results in the following. +In terms of changes to GHC's `Lexer.x` we write the following. ```haskell -- Regular expressions @@ -152,62 +109,102 @@ splitFields :: StringBuffer -> Int -> [FastString] splitFields buf len = ... ``` -Tokens of case `ITfieldid` may not be issued if `RecordDotSyntax` is not enabled. +Note that tokens of case `ITfieldid` will never be issued if `RecordDotSyntax` is not enabled. -### Parser +In terms of changes to GHC's `Parser.y`, the new token is incorporated into the parser like so. +```haskell +%token + ... + FIELDID { L _ (ITfieldid _) } + ... +{ +... +getFIELDID (dL->L _ (ITfieldid x)) = x +... +} +``` + +#### 2.3.2 Parsing -#### Sections +##### 2.3.2.1 Parsing of field selectors -To support sections (e.g. `.foo.bar.baz`), we generalize *aexp*. +Supporting field selectors is achieved by extending the set of `aexp` productions.

*aexp* → *fieldid* -This specification results in the following schematic. ```haskell -%token - ... - FIELDID { L _ (ITfieldid _) } - ... - -%% - aexp :: { ECP } - ... + ... + | FIELDID { + ... + } -- <- here - | '\\' apat apats '->' exp {...} + ... +``` - | FIELDID { - ... -- This case is only possible when 'RecordDotSyntax' is enabled. - } -- <- here +##### 2.3.2.2 Parsing of field selections - ... +Supporting field selections does not require any new productions. The production *fexp -> fexp aexp* is sufficient, only, its semantic action needs to be updated to take `RecordDotSyntax` into account. +```haskell +fexp :: { ECP } + : fexp aexp + {%do + { + ; recordDotSyntax <- getBit RecordDotSyntaxBit + ; if not recordDotSyntax + then + ... do as we do today + else do { + ; lhs <- runECP_P $1 :: P (Located (HsExpr GhcPs)) + ; rhs <- runECP_P $2 :: P (Located (HsExpr GhcPs)) + ; if not (isFieldSelector rhs) + then + .... do as we do today + else + if (adjacent lhs rhs) + then + ... handle field selection (e.g. 'a.foo.bar') + else + ... handle an application on a field selector (e.g. 'f .foo.bar') + } + ... ``` +*[No doubt a real implementation can express this logic more elegantly - we present it in this way here to elucidate.]* -### Field selections +The key point to note is the disambiguation of a field selection from the application of a term to a field selector. That is, looking at white-space to distinguish between `f.x` and `f .x`. This is handled by the function `adjacent` which can be defined simply as: +```haskell +adjacent :: Located a -> Located b -> Bool +adjacent (L a _) (L b _) = isGoodSrcSpan a && srcSpanEnd a == srcSpanStart b +``` -To support field selections, the existing production *fexp* → *[fexp]* *aexp* is sufficient. +*[One thing to look out for in the implementation at this point is to carefully respect the precedence rule i.e. `f a.b` parse as `f (a.b)`.]* -### Field updates +##### 2.3.2.3 Parsing of field updates -To support field updates, the *aexp* production is extended. +Field updates and punned field updates are achieved by generalizing the `aexp` productions.

*aexp* → *aexp⟨qcon⟩* { *pbind* , … , *pbind* }
*pbind* -> *qvar*=*exp* | *var* *aexp*=*exp* +The existing rule is ```haskell - aexp1 :: { ECP } - : aexp1 '{' fbinds '}' { ... } - | aexp1 '{' VARID aexp '=' texp '}' {...} -- <- here - + : aexp1 '{' fbinds '}' { ...} ``` +It's easy enough to extend `aexp1` to handle simple cases of nested field updates and punned field updates like so: +```haskell +aexp1 :: { ECP } + : aexp1 '{' fbinds '}' { ... as we do today... } + | aexp1 '{' VARID FIELDID '=' texp '}' { ... } <- nested field update here + | aexp1 '{' VARID FIELDID '}' { ... } <- punned field update here here + ... +``` +*[As written, this of course means that `r{a = ...}` doesn't result in a `setField` expression whereas `r{a.b=...}` does. Further, `r{a.b = ..., c =}` (multiple updates) aren't handled. We are not endorsing either of those things, rather we are just demonstrating that implementation of this proposal will be achieved by careful generalization of `fbinds`.]* -### Prototype - -To gain confidence these changes integrate as expected [a prototype implementation](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax) was produced that parses and desugars forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. Note that in the prototype, projection ([as proposed here](#syntax)), takes precedence over application so `f a.foo.bar.baz.quux 12` parses as `f (a.foo.bar.baz.quux) 12`. Note also that the prototype does not follow the specification given above in that : `x .lbl` means `x.lbl` means `(.lbl) x`. +The prototype implements the parsing scheme present here. More information about the prototype is available in [this section](#91-prototype). -## Examples +## 3. Examples This is a record type with functions describing a study `Class` (*Oh! Pascal, 2nd ed. Cooper & Clancy, 1985*). @@ -253,7 +250,7 @@ getTerms = map .taken.term -- nested selector A full, rigorous set of examples (as tests) are available in the examples directory of [this repository](https://github.com/ndmitchell/record-dot-preprocessor). Those tests include infix applications, polymorphic data types, interoperation with other extensions and more. -## Effect and Interactions +## 4. Effect and Interactions **Polymorphic updates:** When enabled, this extension takes the `a{b=c}` syntax and uses it to mean `setField`. The biggest difference a user is likely to experience is that the resulting type of `a{b=c}` is the same as the type `a` - you _cannot_ change the type of the record by updating its fields. The removal of polymorphism is considered essential to preserve decent type inference, and is the only option supported by [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst). Anyone wishing to use polymorphic updates can write `let Foo{..} = Foo{polyField=[], ..}` instead. @@ -265,9 +262,9 @@ A full, rigorous set of examples (as tests) are available in the examples direct **Enabled extensions:** When `RecordDotSyntax` is a distinct extension, implying no other extensions off or on. It is often likely to be used in conjunction with either the `NoFieldSelectors` extension or`DuplicateRecordFields`. -## Costs and Drawbacks +## 5. Costs and Drawbacks -The implementation of this proposal adds code to the compiler, but not a huge amount. Our [prototype implementation](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax) shows the essence of the parsing changes, which is the most complex part. +The implementation of this proposal adds code to the compiler, but not a huge amount. Our [prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/tree/record-dot-syntax-alt) shows the essence of the parsing changes, which is the most complex part. If this proposal becomes widely used then it is likely that all Haskell users would have to learn that `a.b` is a record field selection. Fortunately, given how popular this syntax is elsewhere, that is unlikely to surprise new users. @@ -283,7 +280,7 @@ let temp = myexpression.field1.field2.field3 in temp.field4.field5 ``` -## Alternatives to this proposal +## 6. Alternatives to this proposal Instead of this proposal, we could do any of the following: @@ -297,35 +294,35 @@ More importantly, while the concept of lenses is very powerful, that power can b All these approaches are currently used, and represent the "status quo", where Haskell records are considered not fit for purpose. -## Alternatives within this proposal +## 7. Alternatives within this proposal Below are some possible variations on this plan, but we advocate the choices made above: -### Should `RecordDotSyntax` imply `NoFieldSelectors` or another extension? +### 7.1 Should `RecordDotSyntax` imply `NoFieldSelectors` or another extension? Typically `RecordDotSyntax` will be used in conjunction with `NoFieldSelectors`, but `DuplicateRecordFields` would work too. Of those two, `DuplicateRecordFields` complicates GHC, while `NoFieldSelectors` conceptually simplifies it, so we prefer to bias the eventual outcome. However, there are lots of balls in the air, and enabling `RecordDotSyntax` should ideally not break normal code, so we leave everything distinct (after [being convinced](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-547641588)). -### Should a syntax be provided for modification? +### 7.2 Should a syntax be provided for modification? Earlier versions of this proposal contained a modify field sytnax of the form `a{field * 2}`. While appealing, there is a lot of syntactic debate, with variously `a{field <- (*2)}`, `a{field * = 2}` and others being proposed. None of these syntax variations are immediately clear to someone not familiar with this proposal. To be conservative, we leave this feature out. -### Should there be update sections? +### 7.3 Should there be update sections? There are no update sections. Should `({a=})`, `({a=b})` or `(.lbl=)` be an update section? While nice, we leave this feature out. -### Should pattern matching be extended? +### 7.4 Should pattern matching be extended? We do not extend pattern matching, although it would be possible for `P{foo.bar=Just x}` to be defined. -### Will whitespace sensitivity become worse? +### 7.5 Will whitespace sensitivity become worse? We're not aware of qualified modules giving any problems, but it's adding whitespace sensitivity in one more place. -### Should a new update syntax be added? +### 7.6 Should a new update syntax be added? One suggestion is that record updates remain as normal, but `a { .foo = 1 }` be used to indicate the new forms of updates. While possible, we believe that option leads to a confusing result, with two forms of update both of which fail in different corner cases. Instead, we recommend use of `C{foo}` as a pattern to extract fields if necessary. -### Which syntax should be chosen for selector functions? +### 7.7 Which syntax should be chosen for selector functions? Three syntax options have bee proposed for selector functions: `.foo`, `(.foo)`, and `_.foo`. This aspect is the most debated of the entire proposal (following [Wadler's law](https://wiki.haskell.org/Wadler's_Law)). We have opted for `.foo`. @@ -344,14 +341,22 @@ Independent of this difference, there are pragmatic concerns on both sides: * Some consider it acceptable (if unfortunate) that `a . b` and `a.b` have different meanings in this proposal, but believe that assigning three distinct meanings to `a . b`, `a .b`, and `a.b` is just too confusing. * Looking at that existing implementation of GHC, supporting `(.b)` is less changes that supporting `.b` alone. While the implementation complexity is not a reason for picking one over the other, the existing grammar of the compiler can give hints about what logically follows. -### Should punning be extended to updates? +### 7.8 Should punning be extended to updates? Originally this proposal included `a{foo.bar}` to mean `a{foo.bar = bar}`, but that seemed to confuse everyone, so has been removed. -## Unresolved issues +## 8. Unresolved issues + +In this proposal we pick `.field` to be the syntax for selector functions, however, there are also good reasons (listed [in this proposal](#77-which-syntax-should-be-chosen-for-selector-functions)) to require brackets, namely `(.field)`. While resolved, we consider it worth the committee's deliberation as to which is preferable. Neither author is opposed to either outcome. + +## 9. Implementation Plan + +### 9.1 Prototype + +To gain confidence these changes integrate as expected [a prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/tree/record-dot-syntax-alt) was produced that parses and desugars forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. The prototype is sufficiently rich enough to "do the right thing" with [this test file](https://gitlab.haskell.org/shayne-fletcher-da/ghc/raw/record-dot-syntax-alt/record-dot-syntax-tests/Test.hs). -In this proposal we pick `.field` to be the syntax for selector functions, however, there are also good reasons (listed [in this proposal](#which-syntax-should-be-chosen-for-selector-functions)) to require brackets, namely `(.field)`. While resolved, we consider it worth the committee's deliberation as to which is preferable. Neither author is opposed to either outcome. +*[An earlier version of this proposal came with a different [prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax). That prototype differs from the current state of this proposal in that "naked field selectors" are deemed illegal and field selections with white-space are legal e.g. `f .x .y` is `f.x.y`. These differences lead to a somewhat different parsing scheme than the one presented here]* -## Implementation Plan +### 9.2 Who will provide an implementation? If accepted, the proposal authors would be delighted to provide an implementation. Implementation depends on the implementation of [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst) and [the `NoFieldSelectors` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0160-no-toplevel-field-selectors.rst). From d5f3b40ada28ef41ba04ddb1ad25bc5f8d29e19e Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Wed, 11 Dec 2019 14:01:09 -0500 Subject: [PATCH 32/49] Fix typo --- proposals/0000-record-dot-syntax.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index fbbc2f5a7a..57d9c3cf6b 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -200,7 +200,7 @@ aexp1 :: { ECP } | aexp1 '{' VARID FIELDID '}' { ... } <- punned field update here here ... ``` -*[As written, this of course means that `r{a = ...}` doesn't result in a `setField` expression whereas `r{a.b=...}` does. Further, `r{a.b = ..., c =}` (multiple updates) aren't handled. We are not endorsing either of those things, rather we are just demonstrating that implementation of this proposal will be achieved by careful generalization of `fbinds`.]* +*[As written, this of course means that `r{a = ...}` doesn't result in a `setField` expression whereas `r{a.b = ...}` does. Further, `r{a.b = ..., c = ...}` (multiple updates) aren't handled. We are not endorsing either of those things, rather we are just demonstrating that implementation of this proposal will be achieved by careful generalization of `fbinds`.]* The prototype implements the parsing scheme present here. More information about the prototype is available in [this section](#91-prototype). From 3d425767be62f67500564e95eb8774438997a4aa Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Wed, 11 Dec 2019 15:30:43 -0500 Subject: [PATCH 33/49] Delete clause about removing punned updates (maybe temporarily) --- proposals/0000-record-dot-syntax.md | 14 +++++--------- 1 file changed, 5 insertions(+), 9 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 57d9c3cf6b..b7035ef7ec 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -202,7 +202,7 @@ aexp1 :: { ECP } ``` *[As written, this of course means that `r{a = ...}` doesn't result in a `setField` expression whereas `r{a.b = ...}` does. Further, `r{a.b = ..., c = ...}` (multiple updates) aren't handled. We are not endorsing either of those things, rather we are just demonstrating that implementation of this proposal will be achieved by careful generalization of `fbinds`.]* -The prototype implements the parsing scheme present here. More information about the prototype is available in [this section](#91-prototype). +The prototype implements the parsing scheme present here. More information about the prototype is available in [this section](#81-prototype). ## 3. Examples @@ -341,22 +341,18 @@ Independent of this difference, there are pragmatic concerns on both sides: * Some consider it acceptable (if unfortunate) that `a . b` and `a.b` have different meanings in this proposal, but believe that assigning three distinct meanings to `a . b`, `a .b`, and `a.b` is just too confusing. * Looking at that existing implementation of GHC, supporting `(.b)` is less changes that supporting `.b` alone. While the implementation complexity is not a reason for picking one over the other, the existing grammar of the compiler can give hints about what logically follows. -### 7.8 Should punning be extended to updates? - -Originally this proposal included `a{foo.bar}` to mean `a{foo.bar = bar}`, but that seemed to confuse everyone, so has been removed. - -## 8. Unresolved issues +## 7. Unresolved issues In this proposal we pick `.field` to be the syntax for selector functions, however, there are also good reasons (listed [in this proposal](#77-which-syntax-should-be-chosen-for-selector-functions)) to require brackets, namely `(.field)`. While resolved, we consider it worth the committee's deliberation as to which is preferable. Neither author is opposed to either outcome. -## 9. Implementation Plan +## 8. Implementation Plan -### 9.1 Prototype +### 8.1 Prototype To gain confidence these changes integrate as expected [a prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/tree/record-dot-syntax-alt) was produced that parses and desugars forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. The prototype is sufficiently rich enough to "do the right thing" with [this test file](https://gitlab.haskell.org/shayne-fletcher-da/ghc/raw/record-dot-syntax-alt/record-dot-syntax-tests/Test.hs). *[An earlier version of this proposal came with a different [prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax). That prototype differs from the current state of this proposal in that "naked field selectors" are deemed illegal and field selections with white-space are legal e.g. `f .x .y` is `f.x.y`. These differences lead to a somewhat different parsing scheme than the one presented here]* -### 9.2 Who will provide an implementation? +### 8.2 Who will provide an implementation? If accepted, the proposal authors would be delighted to provide an implementation. Implementation depends on the implementation of [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst) and [the `NoFieldSelectors` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0160-no-toplevel-field-selectors.rst). From 9d89fee4143dceacc5a00a68150fe86c292503d9 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Wed, 11 Dec 2019 15:32:44 -0500 Subject: [PATCH 34/49] Remove properly --- proposals/0000-record-dot-syntax.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index b7035ef7ec..4c72bab0fe 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -202,7 +202,7 @@ aexp1 :: { ECP } ``` *[As written, this of course means that `r{a = ...}` doesn't result in a `setField` expression whereas `r{a.b = ...}` does. Further, `r{a.b = ..., c = ...}` (multiple updates) aren't handled. We are not endorsing either of those things, rather we are just demonstrating that implementation of this proposal will be achieved by careful generalization of `fbinds`.]* -The prototype implements the parsing scheme present here. More information about the prototype is available in [this section](#81-prototype). +The prototype implements the parsing scheme present here. More information about the prototype is available in [this section](#91-prototype). ## 3. Examples @@ -341,18 +341,18 @@ Independent of this difference, there are pragmatic concerns on both sides: * Some consider it acceptable (if unfortunate) that `a . b` and `a.b` have different meanings in this proposal, but believe that assigning three distinct meanings to `a . b`, `a .b`, and `a.b` is just too confusing. * Looking at that existing implementation of GHC, supporting `(.b)` is less changes that supporting `.b` alone. While the implementation complexity is not a reason for picking one over the other, the existing grammar of the compiler can give hints about what logically follows. -## 7. Unresolved issues +## 8. Unresolved issues In this proposal we pick `.field` to be the syntax for selector functions, however, there are also good reasons (listed [in this proposal](#77-which-syntax-should-be-chosen-for-selector-functions)) to require brackets, namely `(.field)`. While resolved, we consider it worth the committee's deliberation as to which is preferable. Neither author is opposed to either outcome. -## 8. Implementation Plan +## 9. Implementation Plan -### 8.1 Prototype +### 9.1 Prototype To gain confidence these changes integrate as expected [a prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/tree/record-dot-syntax-alt) was produced that parses and desugars forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. The prototype is sufficiently rich enough to "do the right thing" with [this test file](https://gitlab.haskell.org/shayne-fletcher-da/ghc/raw/record-dot-syntax-alt/record-dot-syntax-tests/Test.hs). *[An earlier version of this proposal came with a different [prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax). That prototype differs from the current state of this proposal in that "naked field selectors" are deemed illegal and field selections with white-space are legal e.g. `f .x .y` is `f.x.y`. These differences lead to a somewhat different parsing scheme than the one presented here]* -### 8.2 Who will provide an implementation? +### 9.2 Who will provide an implementation? If accepted, the proposal authors would be delighted to provide an implementation. Implementation depends on the implementation of [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst) and [the `NoFieldSelectors` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0160-no-toplevel-field-selectors.rst). From f1a74da45d05805e591f2ca909362f8d7b152a32 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Wed, 11 Dec 2019 16:39:28 -0500 Subject: [PATCH 35/49] Typo --- proposals/0000-record-dot-syntax.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 4c72bab0fe..d6b6c9c2d5 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -202,7 +202,7 @@ aexp1 :: { ECP } ``` *[As written, this of course means that `r{a = ...}` doesn't result in a `setField` expression whereas `r{a.b = ...}` does. Further, `r{a.b = ..., c = ...}` (multiple updates) aren't handled. We are not endorsing either of those things, rather we are just demonstrating that implementation of this proposal will be achieved by careful generalization of `fbinds`.]* -The prototype implements the parsing scheme present here. More information about the prototype is available in [this section](#91-prototype). +The prototype implements the parsing scheme presented here. More information about the prototype is available in [this section](#91-prototype). ## 3. Examples From b80ff3a4cedddcfaf00e434dcefa98dad9a2aa06 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Tue, 17 Dec 2019 10:35:52 -0500 Subject: [PATCH 36/49] Section 2.1.3 Fields whose names are operator symbols --- proposals/0000-record-dot-syntax.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index d6b6c9c2d5..7257af0bdd 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -61,6 +61,12 @@ In the event the language extension is enabled: Regarding precedence, we propose that '`.`' should "bind more tightly" than function application thus, `f r.a.b` should parse as `f (r.a.b)`. +#### 2.1.3 Fields whose names are operator symbols + +We propose that dot notation be not available for fields whose names are operator symbols (for example, `+`, `.&.` and so on). + +*[Note : This does not preclude the use of explicit `getField` expressions in such cases (e.g. `getField@".+." r`)]*. + ### 2.2 Definitions For clarity of terminology in what follows, we make the following informal definitions: From 9b7a80015224ee3183e909db77df0856aca81fc4 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Tue, 17 Dec 2019 17:08:01 -0500 Subject: [PATCH 37/49] Add note to 2.1.2 "Precedence" --- proposals/0000-record-dot-syntax.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 7257af0bdd..a02389f74b 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -61,6 +61,8 @@ In the event the language extension is enabled: Regarding precedence, we propose that '`.`' should "bind more tightly" than function application thus, `f r.a.b` should parse as `f (r.a.b)`. +*[Note : As normal, explicit parenthesization can be used to "force" the desired evaluation e.g. `(f r).a.b`]*. + #### 2.1.3 Fields whose names are operator symbols We propose that dot notation be not available for fields whose names are operator symbols (for example, `+`, `.&.` and so on). From 40b93b26df602037134a773ff5da4550ddeebed8 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Tue, 17 Dec 2019 17:43:44 -0500 Subject: [PATCH 38/49] Consistent styling of notes --- proposals/0000-record-dot-syntax.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index a02389f74b..b7fb3ffeda 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -67,7 +67,7 @@ Regarding precedence, we propose that '`.`' should "bind more tightly" than func We propose that dot notation be not available for fields whose names are operator symbols (for example, `+`, `.&.` and so on). -*[Note : This does not preclude the use of explicit `getField` expressions in such cases (e.g. `getField@".+." r`)]*. +*[Note : This does not preclude the use of explicit `getField` expressions in such cases (e.g. `getField@".&." r`)]*. ### 2.2 Definitions @@ -178,7 +178,7 @@ fexp :: { ECP } } ... ``` -*[No doubt a real implementation can express this logic more elegantly - we present it in this way here to elucidate.]* +*[Note : No doubt a real implementation can express this logic more elegantly - we present it in this way here to elucidate.]* The key point to note is the disambiguation of a field selection from the application of a term to a field selector. That is, looking at white-space to distinguish between `f.x` and `f .x`. This is handled by the function `adjacent` which can be defined simply as: ```haskell @@ -186,7 +186,7 @@ adjacent :: Located a -> Located b -> Bool adjacent (L a _) (L b _) = isGoodSrcSpan a && srcSpanEnd a == srcSpanStart b ``` -*[One thing to look out for in the implementation at this point is to carefully respect the precedence rule i.e. `f a.b` parse as `f (a.b)`.]* +*[Note : One thing to look out for in the implementation at this point is to carefully respect the precedence rule i.e. `f a.b` parse as `f (a.b)`.]* ##### 2.3.2.3 Parsing of field updates @@ -208,7 +208,7 @@ aexp1 :: { ECP } | aexp1 '{' VARID FIELDID '}' { ... } <- punned field update here here ... ``` -*[As written, this of course means that `r{a = ...}` doesn't result in a `setField` expression whereas `r{a.b = ...}` does. Further, `r{a.b = ..., c = ...}` (multiple updates) aren't handled. We are not endorsing either of those things, rather we are just demonstrating that implementation of this proposal will be achieved by careful generalization of `fbinds`.]* +*[Note : As written, this of course means that `r{a = ...}` doesn't result in a `setField` expression whereas `r{a.b = ...}` does. Further, `r{a.b = ..., c = ...}` (multiple updates) aren't handled. We are not endorsing either of those things, rather we are just demonstrating that implementation of this proposal will be achieved by careful generalization of `fbinds`.]* The prototype implements the parsing scheme presented here. More information about the prototype is available in [this section](#91-prototype). @@ -359,7 +359,7 @@ In this proposal we pick `.field` to be the syntax for selector functions, howev To gain confidence these changes integrate as expected [a prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/tree/record-dot-syntax-alt) was produced that parses and desugars forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. The prototype is sufficiently rich enough to "do the right thing" with [this test file](https://gitlab.haskell.org/shayne-fletcher-da/ghc/raw/record-dot-syntax-alt/record-dot-syntax-tests/Test.hs). -*[An earlier version of this proposal came with a different [prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax). That prototype differs from the current state of this proposal in that "naked field selectors" are deemed illegal and field selections with white-space are legal e.g. `f .x .y` is `f.x.y`. These differences lead to a somewhat different parsing scheme than the one presented here]* +*[Note : An earlier version of this proposal came with a different [prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax). That prototype differs from the current state of this proposal in that "naked field selectors" are deemed illegal and field selections with white-space are legal e.g. `f .x .y` is `f.x.y`. These differences lead to a somewhat different parsing scheme than the one presented here]*. ### 9.2 Who will provide an implementation? From 09b6bbb759f3765f7beffefc66a6eff889fb52f8 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Mon, 27 Jan 2020 15:02:50 +0000 Subject: [PATCH 39/49] Update 0000-record-dot-syntax.md --- proposals/0000-record-dot-syntax.md | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index b7fb3ffeda..af5d3b7c43 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -260,7 +260,7 @@ A full, rigorous set of examples (as tests) are available in the examples direct ## 4. Effect and Interactions -**Polymorphic updates:** When enabled, this extension takes the `a{b=c}` syntax and uses it to mean `setField`. The biggest difference a user is likely to experience is that the resulting type of `a{b=c}` is the same as the type `a` - you _cannot_ change the type of the record by updating its fields. The removal of polymorphism is considered essential to preserve decent type inference, and is the only option supported by [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst). Anyone wishing to use polymorphic updates can write `let Foo{..} = Foo{polyField=[], ..}` instead. +**Polymorphic updates:** When enabled, this extension takes the `a{b=c}` syntax and uses it to mean `setField`. The biggest difference a user is likely to experience is that the resulting type of `a{b=c}` is the same as the type `a` - you _cannot_ change the type of the record by updating its fields. The removal of polymorphism is considered essential to preserve decent type inference, and is the only option supported by [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst). Anyone wishing to use polymorphic updates can write `let Foo{..} = a in Foo{polyField=[], ..}` instead. **Higher-rank fields:** It is impossible to express `HasField` instances for data types such as `data T = MkT { foo :: forall a . a -> a}`, which means they can't have this syntax available. Users can still write their own selector functions using record puns if required. There is a possibility that with future types of impredicativity such `getField` expressions could be solved specially by the compiler. @@ -268,7 +268,7 @@ A full, rigorous set of examples (as tests) are available in the examples direct **Rebindable syntax:** When `RebindableSyntax` is enabled the `getField` and `setField` functions are those in scope, rather than those in `GHC.Records`. The `.` function (as used in the `a.b.c` desugaring) remains the `Prelude` version (we see the `.` as a syntactic shortcut for an explicit lambda, and believe that whether the implementation uses literal `.` or a lambda is an internal detail). -**Enabled extensions:** When `RecordDotSyntax` is a distinct extension, implying no other extensions off or on. It is often likely to be used in conjunction with either the `NoFieldSelectors` extension or`DuplicateRecordFields`. +**Enabled extensions:** The `RecordDotSyntax` extension does not imply enabling/disabling any other extensions. It is often likely to be used in conjunction with either the `NoFieldSelectors` extension or`DuplicateRecordFields`. ## 5. Costs and Drawbacks @@ -328,7 +328,7 @@ We're not aware of qualified modules giving any problems, but it's adding whites ### 7.6 Should a new update syntax be added? -One suggestion is that record updates remain as normal, but `a { .foo = 1 }` be used to indicate the new forms of updates. While possible, we believe that option leads to a confusing result, with two forms of update both of which fail in different corner cases. Instead, we recommend use of `C{foo}` as a pattern to extract fields if necessary. +One suggestion is that record updates remain as normal, but `a { .foo = 1 }` be used to indicate the new forms of updates. While possible, we believe that option leads to a confusing result, with two forms of update both of which fail in different corner cases. Instead, we recommend use of `C{foo}` as a pattern (with `-XNamedFieldPuns`) to extract fields if necessary. ### 7.7 Which syntax should be chosen for selector functions? @@ -349,9 +349,20 @@ Independent of this difference, there are pragmatic concerns on both sides: * Some consider it acceptable (if unfortunate) that `a . b` and `a.b` have different meanings in this proposal, but believe that assigning three distinct meanings to `a . b`, `a .b`, and `a.b` is just too confusing. * Looking at that existing implementation of GHC, supporting `(.b)` is less changes that supporting `.b` alone. While the implementation complexity is not a reason for picking one over the other, the existing grammar of the compiler can give hints about what logically follows. +### 7.8 What should naked selectors mean? + +While the meaning of `a.b` is obvious (record field projection), and of `a . b` (function composition), the meaning of `a .b` remains under debate. There are three choices: + +1. A naked selector is illegal. That choice is conservative; we can decide later. However `(.x)` would have to be allowed as a new syntactic production, meaning `\r -> r.x`. +2. A naked selector is not a selector at all, and thus remains a slight-unusual function composition, as it does now. +3. `.x` means `\r -> r.x`, thus resolving the selector function debate above. +4. A naked selector is a postfix operator, binding less tightly than function application. So `(.x)` naturally means the section `\r -> r.x`, and `r .x` means `(.x) r`. This approach would allow a chain of applications to be expressed cleanly. + +Of these, 1, 2 and 3 are all local operations. In contrast, option 4 gives the authors difficulty producing parse trees in their head, and thus increases the total cost of learning the extension more than is considered desirable. + ## 8. Unresolved issues -In this proposal we pick `.field` to be the syntax for selector functions, however, there are also good reasons (listed [in this proposal](#77-which-syntax-should-be-chosen-for-selector-functions)) to require brackets, namely `(.field)`. While resolved, we consider it worth the committee's deliberation as to which is preferable. Neither author is opposed to either outcome. +In this proposal we pick `.field` to be the syntax for selector functions, however, there are also good reasons (listed [in this proposal](#77-which-syntax-should-be-chosen-for-selector-functions)) to require brackets, namely `(.field)`. While resolved, we consider it worth the committee's deliberation as to which is preferable. Neither author is opposed to either outcome. Assuming `.field` is _not_ chosen as the selector function, then the meaning of that construct needs to be nailed down from the 4 options in Section 7.8. ## 9. Implementation Plan From 6ef807f0886730786bcc151a736cf79afc4f73b3 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sat, 11 Apr 2020 09:08:28 -0400 Subject: [PATCH 40/49] Updates in light of committee feedback --- proposals/0000-record-dot-syntax.md | 268 ++++++++-------------------- 1 file changed, 77 insertions(+), 191 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index af5d3b7c43..299a525374 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -31,10 +31,9 @@ nameAfterOwner c = c{name = c.owner.name ++ "'s Company"} We declare two records both having `name` as a field label. The user may then write `c.name` and `c.owner.name` to access those fields. We can also write `c{name = x}` as a record update, which works even though `name` is no longer unique. Under the hood, we make use of `getField` and `setField` from [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst). -An implementation of this proposal has been battle tested and hardened over 18 months in the enterprise environment as part of [Digital Asset](https://digitalasset.com/)'s [DAML](https://daml.com/) smart contract language (a Haskell derivative utilizing GHC in its implementation), and also in a [Haskell preprocessor and a GHC plugin](https://github.com/ndmitchell/record-dot-preprocessor/). When initially considering Haskell as a basis for DAML, the inadequacy of records was considered the most severe problem, and without devising the scheme presented here, we wouldn't be using Haskell. The feature enjoys universal popularity with users. +An implementation of this proposal has been battle tested and hardened over two years in the enterprise environment as part of [Digital Asset](https://digitalasset.com/)'s [DAML](https://daml.com/) smart contract language (a Haskell derivative utilizing GHC in its implementation), and also in a [Haskell preprocessor and a GHC plugin](https://github.com/ndmitchell/record-dot-preprocessor/). When initially considering Haskell as a basis for DAML, the inadequacy of records was considered the most severe problem, and without devising the scheme presented here, wouldn't be using Haskell. The feature enjoys universal popularity with users. ## 2. Proposed Change Specification - For the specification we focus on the changes to the parsing rules, and the desugaring, with the belief the type checking and renamer changes required are an unambiguous consequences of those. ### 2.1 `RecordDotSyntax` language extension @@ -43,174 +42,102 @@ This change adds a new language extension `RecordDotSyntax`. #### 2.1.1 Syntax In the event the language extension is enabled: -| Expression | Equivalent | -| -- | -- | -| `.lbl` | `(\x -> x.lbl)` the `.` cannot have whitespace after | -| `.lbl1.lbl2` | `(\x -> x.lbl1.lbl2)` | -| `e.lbl` | `getField @"lbl" e` the `.` cannot have whitespace either before or after | -| `e.lbl1.lbl2` | `(e.lbl1).lbl2` | -| `e{lbl = val}` | `setField @"lbl" e val` | -| `e{lbl1.lbl2 = val}` | `e{lbl1 = (e.lbl1){lbl2 = val}}` performing a nested update | -| `e.lbl1{lbl2 = val}` | `(e.lbl1){lbl2 = val}` | -| `e{lbl1 = val}.lbl2` | `(e{lbl1 = val}).lbl2` | -| `e{lbl1 = val1, lbl2 = val2}` | `(e{lbl1 = val1}){lbl2 = val2}` | +| Expression | Equivalent | +| -- | -- | +| `(.lbl)` | `(\e -> e.lbl)` | +| `(.lbl₁.lbl₂)` | `(\e -> e.lbl₁.lbl₂)` | +| `e.lbl` | `getField @"lbl" e` | +| `e.lbl₁.lbl₂` | `(e.lbl₁).lbl₂` | +| `e{lbl = val}` | `setField @"lbl" e val` | +| `e{lbl₁.lbl₂ = val}` | `e{lbl₁ = (e.lbl₁){lbl₂ = val}}` | +| `e.lbl₁{lbl₂ = val}` | `(e.lbl₁){lbl₂ = val}` | +| `e{lbl₁ = val₁}.val₂` | `(e{lbl₁ = val₁}).val₂` | -*[Note: `e{lbl=val}` is the syntax of a standard H98 record update. It's the nested form introduced by this proposal that is new : `e{lbl1.lbl2 = val}`. However, in the event `RecordDotSyntax` is in effect, we propose that `e{lbl = val}` desugar to `setField @"lbl" e a`]*. +*[Note: `e{lbl=val}` is the syntax of a standard H98 record update. It's the nested form introduced by this proposal that is new : `e{lbl1.lbl2 = val}`. However, in the event `RecordDotSyntax` is in effect, note that we propose that `e{lbl = val}` desugar to `setField @"lbl" e val`]*. #### 2.1.2 Precedence +We propose that '`.`' "bind more tightly" than function application thus, `f r.a.b` parses as `f (r.a.b)`. -Regarding precedence, we propose that '`.`' should "bind more tightly" than function application thus, `f r.a.b` should parse as `f (r.a.b)`. - -*[Note : As normal, explicit parenthesization can be used to "force" the desired evaluation e.g. `(f r).a.b`]*. +| Expression | Interpretation | +| -- | -- | +| `f r.x` | means `f (r.x)` | +| `f M.n.x` | means `f (M.n.x)` | +| `f M.N.x` | means `f (M.N.x)` | +| `f r .x` | is illegal | +| `f (g r).x` | `f ((g r).x)` | +| `f (g r) .x` | is illegal | #### 2.1.3 Fields whose names are operator symbols +We propose that dot notation isn't available for fields whose names are operator symbols (for example, `+`, `.+.` and so on). -We propose that dot notation be not available for fields whose names are operator symbols (for example, `+`, `.&.` and so on). - -*[Note : This does not preclude the use of explicit `getField` expressions in such cases (e.g. `getField@".&." r`)]*. +*[Note : For fields whose names are operator symbols, one can still write `getField` expressions (e.g. `getField @".+." r`)]*. ### 2.2 Definitions - -For clarity of terminology in what follows, we make the following informal definitions: -* A **field selector** is an expression like `.a` or `.a.b` preceded by white-space; -* A **field selection** is an expression like `r.a` or `(f x).a.b`, where the first dot is preceded by a close paren or a varid; +For what follows, we use these informal definitions: +* A **field selector** is an expression like `.a` or `.a.b`; +* A **field selection** is an expression like `r.a` or `(f x).a.b`; * A **field update** is an expression like `r{a = 12}` or `r{a.b = "foo"}`; -* A **punned field update** is an expression like `r{a}` or `r{a.b}` (where it is understood that `b` is a variable bound in the environment of the expression and only valid syntax if the `NamedFieldPuns` language extension is in effect). +* A **punned field update** is an expression like `r{a}` or `r{a.b}` (here it is understood that `b` is a variable bound in the environment of the expression and only valid syntax if the `NamedFieldPuns` language extension is in effect). ### 2.3 Lexing and Parsing -The intent of this section is **not** to recommend a particular parsing implementation. Rather, we aim only to show that lexing and parsing is feasible. We are open to learning of better strategies for the implementation of the lexing and parsing but in their absence, the scheme presented here appears to have the required properties. +The prototype implements the parsing scheme presented here. More information about the prototype is available in [this section](#91-prototype). #### 2.3.1 Lexer +A new token case `ITproj Bool` is introduced. When the extension is enabled occurences of operator '`.`' are classified using the whitespace sensitive operator mechanism from [this (accepted) GHC proposal](https://github.com/ghc-proposals/ghc-proposals/pull/229). The rules are: -A new lexeme *fieldid* is introduced. -
-
*lexeme* → *qvarid* | *qconid* | *qvarsym* | *qconsym* -| *literal* | *special* | *reservedop* | *reservedid* | *fieldid* -
*fieldid* → *.varid{.varid}* +| Occurence | Token | Means | Example | +| -- | -- | -- | -- | +| prefix | `ITproj True` | field selector | `.x` | +| tight infix | `ITproj False` | field selection | `r.x` | +| suffix | `ITdot` | function composition | `f. g` | +| loose infix | `ITdot` | function composition | `f . g` | -In terms of changes to GHC's `Lexer.x` we write the following. - -```haskell --- Regular expressions -@fieldid = (\. @varid)+ -... -<0,option_prags> { - ... - @fieldid / {ifExtension RecordDotSyntaxBit} { idtoken fieldid } -} -... - --- Token type -data Token - = ITas - | ... - | ITfieldid [FastString] - ... - --- Lexer actions -fieldid :: StringBuffer -> Int -> Token -fieldid buf len = ITfieldid $! splitFields buf len - --- Split a buffer with contents like '.foo.bar.baz' into components. -splitFields :: StringBuffer -> Int -> [FastString] -splitFields buf len = ... -``` - -Note that tokens of case `ITfieldid` will never be issued if `RecordDotSyntax` is not enabled. - -In terms of changes to GHC's `Parser.y`, the new token is incorporated into the parser like so. -```haskell -%token - ... - FIELDID { L _ (ITfieldid _) } - ... - -{ -... -getFIELDID (dL->L _ (ITfieldid x)) = x -... -} -``` +No `ITproj` tokens will ever be issued if `RecordDotSyntax` is not enabled. #### 2.3.2 Parsing +The Haskell grammar is extended with the following productions. We use these notations: -##### 2.3.2.1 Parsing of field selectors +| Symbol | Occurence | +|--------|-------------| +| *.ᴾ* | prefix | +| *.ᵀ* | tight-infix | -Supporting field selectors is achieved by extending the set of `aexp` productions. +###### 2.3.2.1 +[Field] +
+     *field* -> *varid* | *qvarid*
-
*aexp* → *fieldid* - -```haskell -aexp :: { ECP } - ... - | FIELDID { - ... - } -- <- here - - ... -``` - -##### 2.3.2.2 Parsing of field selections - -Supporting field selections does not require any new productions. The production *fexp -> fexp aexp* is sufficient, only, its semantic action needs to be updated to take `RecordDotSyntax` into account. -```haskell -fexp :: { ECP } - : fexp aexp - {%do - { - ; recordDotSyntax <- getBit RecordDotSyntaxBit - ; if not recordDotSyntax - then - ... do as we do today - else do { - ; lhs <- runECP_P $1 :: P (Located (HsExpr GhcPs)) - ; rhs <- runECP_P $2 :: P (Located (HsExpr GhcPs)) - ; if not (isFieldSelector rhs) - then - .... do as we do today - else - if (adjacent lhs rhs) - then - ... handle field selection (e.g. 'a.foo.bar') - else - ... handle an application on a field selector (e.g. 'f .foo.bar') - } - ... -``` -*[Note : No doubt a real implementation can express this logic more elegantly - we present it in this way here to elucidate.]* - -The key point to note is the disambiguation of a field selection from the application of a term to a field selector. That is, looking at white-space to distinguish between `f.x` and `f .x`. This is handled by the function `adjacent` which can be defined simply as: -```haskell -adjacent :: Located a -> Located b -> Bool -adjacent (L a _) (L b _) = isGoodSrcSpan a && srcSpanEnd a == srcSpanStart b -``` - -*[Note : One thing to look out for in the implementation at this point is to carefully respect the precedence rule i.e. `f a.b` parse as `f (a.b)`.]* -##### 2.3.2.3 Parsing of field updates +###### 2.3.2.2 +[Field to update] +
+     *fieldToUpdate* -> *fieldToUpdate* *.ᵀ* *field* | *field* +
-Field updates and punned field updates are achieved by generalizing the `aexp` productions. +###### 2.3.2.3 +[Field selectors] +
+     *aexp* → *( projection )* +
+     *projection* → *.ᴾ* *field* | *projection* *.ᵀ* *field*
-
*aexp* → *aexp⟨qcon⟩* { *pbind* , … , *pbind* } -
*pbind* -> *qvar*=*exp* | *var* *aexp*=*exp* -The existing rule is -```haskell -aexp1 :: { ECP } - : aexp1 '{' fbinds '}' { ...} -``` -It's easy enough to extend `aexp1` to handle simple cases of nested field updates and punned field updates like so: -```haskell -aexp1 :: { ECP } - : aexp1 '{' fbinds '}' { ... as we do today... } - | aexp1 '{' VARID FIELDID '=' texp '}' { ... } <- nested field update here - | aexp1 '{' VARID FIELDID '}' { ... } <- punned field update here here - ... -``` -*[Note : As written, this of course means that `r{a = ...}` doesn't result in a `setField` expression whereas `r{a.b = ...}` does. Further, `r{a.b = ..., c = ...}` (multiple updates) aren't handled. We are not endorsing either of those things, rather we are just demonstrating that implementation of this proposal will be achieved by careful generalization of `fbinds`.]* +###### 2.3.2.4 +[Field selection] +
+     *fexp* → *fexp* *.ᵀ* *field* +
-The prototype implements the parsing scheme presented here. More information about the prototype is available in [this section](#91-prototype). +###### 2.3.2.5 +[Field update] +
+     *aexp* → *{* *pbind₁* *,* ... *}* +
+     *pbind* → *field* *.ᵀ* *fieldToUpdate* *=* *exp* +
+     *pbind* → *field* *.ᵀ* *fieldToUpdate* +
## 3. Examples @@ -243,20 +170,14 @@ setResult c r = c{result = r} -- update setYearTaken :: Class -> Int -> Class setYearTaken c y = c{taken.year = y} -- nested update -addYears :: Class -> Int -> Class -addYears c n = c{taken.year = c.taken.year + n} -- update via op - -squareUnits :: Class -> Class -squareUnits c = c{units = (\x -> x * x) c.units} -- update via function - getResults :: [Class] -> [Status] -getResults = map .result -- selector +getResults = map (.result) -- selector getTerms :: [Class] -> [Quarter] -getTerms = map .taken.term -- nested selector +getTerms = map (.taken.term) -- nested selector ``` -A full, rigorous set of examples (as tests) are available in the examples directory of [this repository](https://github.com/ndmitchell/record-dot-preprocessor). Those tests include infix applications, polymorphic data types, interoperation with other extensions and more. +Further examples [accompany the prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/-/blob/f74bb04d850c53e4b35eeba53052dd4b407fd60b/record-dot-syntax-tests/Test.hs) and yet more (as tests) are available in the examples directory of [this repository](https://github.com/ndmitchell/record-dot-preprocessor). Those tests include infix applications, polymorphic data types, interoperation with other extensions and more. ## 4. Effect and Interactions @@ -272,14 +193,13 @@ A full, rigorous set of examples (as tests) are available in the examples direct ## 5. Costs and Drawbacks -The implementation of this proposal adds code to the compiler, but not a huge amount. Our [prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/tree/record-dot-syntax-alt) shows the essence of the parsing changes, which is the most complex part. +The implementation of this proposal adds code to the compiler, but not a huge amount. Our [prototype](#91-prototype) shows the essence of the parsing changes, which is the most complex part. If this proposal becomes widely used then it is likely that all Haskell users would have to learn that `a.b` is a record field selection. Fortunately, given how popular this syntax is elsewhere, that is unlikely to surprise new users. This proposal advocates a different style of writing Haskell records, which is distinct from the existing style. As such, it may lead to the bifurcation of Haskell styles, with some people preferring the lens approach, and some people preferring the syntax presented here. That is no doubt unfortunate, but hard to avoid - `a.b` really is ubiquitous in programming languages. We consider that any solution to the records problem _must_ cause some level of divergence, but note that this mechanism (as distinct from some proposals) localises that divergence in the implementation of a module - users of the module will not know whether its internals used this extension or not. -The use of `a.b` with no spaces on either side can make it harder to write expressions that span multiple lines. To split over two lines it is possible to do either of: - +The use of `a.b` with no spaces on either side can make it harder to write expressions that span multiple lines. To split over two lines it is possible to use the `&` function from `Base` or do either of: ``` (myexpression.field1.field2.field3 ).field4.field5 @@ -304,8 +224,6 @@ All these approaches are currently used, and represent the "status quo", where H ## 7. Alternatives within this proposal -Below are some possible variations on this plan, but we advocate the choices made above: - ### 7.1 Should `RecordDotSyntax` imply `NoFieldSelectors` or another extension? Typically `RecordDotSyntax` will be used in conjunction with `NoFieldSelectors`, but `DuplicateRecordFields` would work too. Of those two, `DuplicateRecordFields` complicates GHC, while `NoFieldSelectors` conceptually simplifies it, so we prefer to bias the eventual outcome. However, there are lots of balls in the air, and enabling `RecordDotSyntax` should ideally not break normal code, so we leave everything distinct (after [being convinced](https://github.com/ghc-proposals/ghc-proposals/pull/282#issuecomment-547641588)). @@ -330,47 +248,15 @@ We're not aware of qualified modules giving any problems, but it's adding whites One suggestion is that record updates remain as normal, but `a { .foo = 1 }` be used to indicate the new forms of updates. While possible, we believe that option leads to a confusing result, with two forms of update both of which fail in different corner cases. Instead, we recommend use of `C{foo}` as a pattern (with `-XNamedFieldPuns`) to extract fields if necessary. -### 7.7 Which syntax should be chosen for selector functions? - -Three syntax options have bee proposed for selector functions: `.foo`, `(.foo)`, and `_.foo`. This aspect is the most debated of the entire proposal (following [Wadler's law](https://wiki.haskell.org/Wadler's_Law)). We have opted for `.foo`. - -We consider `_.foo` to not be very Haskelly, as it is similar to very different uses of underscore. Therefore, we reject it. - -The decision between `.foo` and `(.foo)` partially comes from a significant difference of perspective: - -* On the one hand, `x.foo` can be seen, as described above, as a new syntax that desugars to `getField @"foo" x`, and `(.foo)` and `(.foo.bar)` as sections of that syntax, which themselves desugar to `\x -> x.foo` and `\x -> x.foo.bar`. Note that this is NOT a section of `.` as a binary operator, but rather a section in the more general sense that it elides the one and only subexpression and adds an implicit lambda. -* On the other hand, `.foo` can be seen as the more fundamental construct here, reminiscent of `#foo` from `OverloadedLabels`, and it can desugar directly to `getField @"foo"`. Then one can recover the rest of the syntax above by desugaring: `x.foo` to `.foo x`, and `.foo.bar` to `.bar . .foo`. - -Thus, it has been discussed at length whether field selection can be seen as just another section, or should be seen as something else that is not section-like. The `SignatureSections` and `TupleSections` extensions (especially for 3-tuples and larger) have already established that that section can be formed by various kinds of elided expressions, not just the operands of a binary operator. However, some would resist spreading this generalization further, and argue that `SignatureSections` and `TupleSections` are justified by looking "operator-like". That is, even though a single `,` in a 3-tuple and the `::` in a type annotation are not real operators, some feel that they at least look a little more like it because there is non-trivial grammar on both sides. - -Independent of this difference, there are pragmatic concerns on both sides: - -* Some consider the parentheses to be too verbose, and the extra level of parentheses a problem for readability. Even if one agrees that this is conceptually a section, this is the first type of section where parentheses are not actually needed for parsing, so omitting parentheses is still possible even if it loses a bit of consistency in favor of brevity. -* Some consider it acceptable (if unfortunate) that `a . b` and `a.b` have different meanings in this proposal, but believe that assigning three distinct meanings to `a . b`, `a .b`, and `a.b` is just too confusing. -* Looking at that existing implementation of GHC, supporting `(.b)` is less changes that supporting `.b` alone. While the implementation complexity is not a reason for picking one over the other, the existing grammar of the compiler can give hints about what logically follows. - -### 7.8 What should naked selectors mean? - -While the meaning of `a.b` is obvious (record field projection), and of `a . b` (function composition), the meaning of `a .b` remains under debate. There are three choices: - -1. A naked selector is illegal. That choice is conservative; we can decide later. However `(.x)` would have to be allowed as a new syntactic production, meaning `\r -> r.x`. -2. A naked selector is not a selector at all, and thus remains a slight-unusual function composition, as it does now. -3. `.x` means `\r -> r.x`, thus resolving the selector function debate above. -4. A naked selector is a postfix operator, binding less tightly than function application. So `(.x)` naturally means the section `\r -> r.x`, and `r .x` means `(.x) r`. This approach would allow a chain of applications to be expressed cleanly. - -Of these, 1, 2 and 3 are all local operations. In contrast, option 4 gives the authors difficulty producing parse trees in their head, and thus increases the total cost of learning the extension more than is considered desirable. - ## 8. Unresolved issues -In this proposal we pick `.field` to be the syntax for selector functions, however, there are also good reasons (listed [in this proposal](#77-which-syntax-should-be-chosen-for-selector-functions)) to require brackets, namely `(.field)`. While resolved, we consider it worth the committee's deliberation as to which is preferable. Neither author is opposed to either outcome. Assuming `.field` is _not_ chosen as the selector function, then the meaning of that construct needs to be nailed down from the 4 options in Section 7.8. +None. ## 9. Implementation Plan ### 9.1 Prototype -To gain confidence these changes integrate as expected [a prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/tree/record-dot-syntax-alt) was produced that parses and desugars forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. The prototype is sufficiently rich enough to "do the right thing" with [this test file](https://gitlab.haskell.org/shayne-fletcher-da/ghc/raw/record-dot-syntax-alt/record-dot-syntax-tests/Test.hs). - -*[Note : An earlier version of this proposal came with a different [prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/commits/record-dot-syntax). That prototype differs from the current state of this proposal in that "naked field selectors" are deemed illegal and field selections with white-space are legal e.g. `f .x .y` is `f.x.y`. These differences lead to a somewhat different parsing scheme than the one presented here]*. +To gain confidence these changes integrate as expected [a prototype](https://gitlab.haskell.org/shayne-fletcher-da/ghc/-/tree/record-dot-syntax-4.1) was produced that parses and desugars forms directly in the parser. For confirmation, we _do not_ view desugaring in the parser as the correct implementation choice, but it provides a simple mechanism to pin down the changes without going as far as adding additional AST nodes or type checker rules. The prototype is sufficiently rich enough to "do the right thing" with [this test file](https://gitlab.haskell.org/shayne-fletcher-da/ghc/-/blob/f74bb04d850c53e4b35eeba53052dd4b407fd60b/record-dot-syntax-tests/Test.hs). ### 9.2 Who will provide an implementation? From 17855d34379f2f46b3c4e10c1754cf08d9ec5d22 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sat, 11 Apr 2020 10:01:21 -0400 Subject: [PATCH 41/49] Fix typo --- proposals/0000-record-dot-syntax.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 299a525374..ea7764eccf 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -132,7 +132,7 @@ The Haskell grammar is extended with the following productions. We use these not ###### 2.3.2.5 [Field update]
-     *aexp* → *{* *pbind₁* *,* ... *}* +     *aexp* → *aexp* *{* *pbind₁* *,* ... *}*
     *pbind* → *field* *.ᵀ* *fieldToUpdate* *=* *exp*
From af35cc0eab045d724630bb781dc4a8d59a353ef0 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sat, 11 Apr 2020 12:21:47 -0400 Subject: [PATCH 42/49] Fix field update productions --- proposals/0000-record-dot-syntax.md | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index ea7764eccf..ad85eac624 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -132,11 +132,9 @@ The Haskell grammar is extended with the following productions. We use these not ###### 2.3.2.5 [Field update]
-     *aexp* → *aexp* *{* *pbind₁* *,* ... *}* +     *fbind*  →  *field* *.ᵀ* *fieldToUpdate* *=* *exp*
-     *pbind* → *field* *.ᵀ* *fieldToUpdate* *=* *exp* -
-     *pbind* → *field* *.ᵀ* *fieldToUpdate* +     *fbind* → *field* *.ᵀ* *fieldToUpdate*
## 3. Examples From 62820d0cf34c3b8c998c9bf1580b6b0efe924c69 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sat, 11 Apr 2020 12:27:33 -0400 Subject: [PATCH 43/49] Whitespace --- proposals/0000-record-dot-syntax.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index ad85eac624..ecc9ea721c 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -53,7 +53,7 @@ In the event the language extension is enabled: | `e.lbl₁{lbl₂ = val}` | `(e.lbl₁){lbl₂ = val}` | | `e{lbl₁ = val₁}.val₂` | `(e{lbl₁ = val₁}).val₂` | -*[Note: `e{lbl=val}` is the syntax of a standard H98 record update. It's the nested form introduced by this proposal that is new : `e{lbl1.lbl2 = val}`. However, in the event `RecordDotSyntax` is in effect, note that we propose that `e{lbl = val}` desugar to `setField @"lbl" e val`]*. +*[Note: `e{lbl = val}` is the syntax of a standard H98 record update. It's the nested form introduced by this proposal that is new : `e{lbl1.lbl2 = val}`. However, in the event `RecordDotSyntax` is in effect, note that we propose that `e{lbl = val}` desugar to `setField @"lbl" e val`]*. #### 2.1.2 Precedence We propose that '`.`' "bind more tightly" than function application thus, `f r.a.b` parses as `f (r.a.b)`. From d091670f64c7f73e9800033f5d8671dc75c0f3a6 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sat, 11 Apr 2020 12:29:11 -0400 Subject: [PATCH 44/49] Whitespace --- proposals/0000-record-dot-syntax.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index ecc9ea721c..8e55768002 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -84,7 +84,7 @@ For what follows, we use these informal definitions: The prototype implements the parsing scheme presented here. More information about the prototype is available in [this section](#91-prototype). #### 2.3.1 Lexer -A new token case `ITproj Bool` is introduced. When the extension is enabled occurences of operator '`.`' are classified using the whitespace sensitive operator mechanism from [this (accepted) GHC proposal](https://github.com/ghc-proposals/ghc-proposals/pull/229). The rules are: +A new token case `ITproj Bool` is introduced. When the extension is enabled occurences of operator `.` are classified using the whitespace sensitive operator mechanism from [this (accepted) GHC proposal](https://github.com/ghc-proposals/ghc-proposals/pull/229). The rules are: | Occurence | Token | Means | Example | | -- | -- | -- | -- | From d5f3f79956547a9225d5d20d1eb35783dfb1d830 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sat, 11 Apr 2020 12:33:27 -0400 Subject: [PATCH 45/49] Whitepsace --- proposals/0000-record-dot-syntax.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 8e55768002..5696655a37 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -106,13 +106,13 @@ The Haskell grammar is extended with the following productions. We use these not ###### 2.3.2.1 [Field]
-     *field* -> *varid* | *qvarid* +     *field* -> *varid*   |   *qvarid*
###### 2.3.2.2 [Field to update]
-     *fieldToUpdate* -> *fieldToUpdate* *.ᵀ* *field* | *field* +     *fieldToUpdate* -> *fieldToUpdate* *.ᵀ* *field*   |   *field*
###### 2.3.2.3 @@ -120,7 +120,7 @@ The Haskell grammar is extended with the following productions. We use these not
     *aexp* → *( projection )*
-     *projection* → *.ᴾ* *field* | *projection* *.ᵀ* *field* +     *projection* → *.ᴾ* *field*   |   *projection* *.ᵀ* *field*
###### 2.3.2.4 From 7e6a4b56e6e03a13d52242b2c42adb503384eace Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sat, 11 Apr 2020 12:38:50 -0400 Subject: [PATCH 46/49] Fix arrows --- proposals/0000-record-dot-syntax.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 5696655a37..a2fbebbb88 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -106,35 +106,35 @@ The Haskell grammar is extended with the following productions. We use these not ###### 2.3.2.1 [Field]
-     *field* -> *varid*   |   *qvarid* +     *field*   →   *varid*   |   *qvarid*
###### 2.3.2.2 [Field to update]
-     *fieldToUpdate* -> *fieldToUpdate* *.ᵀ* *field*   |   *field* +     *fieldToUpdate*   →   *fieldToUpdate* *.ᵀ* *field*   |   *field*
###### 2.3.2.3 [Field selectors]
-     *aexp* → *( projection )* +     *aexp*   →   *( projection )*
-     *projection* → *.ᴾ* *field*   |   *projection* *.ᵀ* *field* +     *projection*   →   *.ᴾ* *field*   |   *projection* *.ᵀ* *field*
###### 2.3.2.4 [Field selection]
-     *fexp* → *fexp* *.ᵀ* *field* +     *fexp*   →   *fexp* *.ᵀ* *field*
###### 2.3.2.5 [Field update]
-     *fbind*  →  *field* *.ᵀ* *fieldToUpdate* *=* *exp* +     *fbind*   →    *field* *.ᵀ* *fieldToUpdate* *=* *exp*
-     *fbind* → *field* *.ᵀ* *fieldToUpdate* +     *fbind*   →   *field* *.ᵀ* *fieldToUpdate*
## 3. Examples From 901e4dc2dd00a0181748206c2fd732ceeeaee46f Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sat, 11 Apr 2020 12:40:43 -0400 Subject: [PATCH 47/49] Remove quotes around dot. --- proposals/0000-record-dot-syntax.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index a2fbebbb88..bd10be2f89 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -56,7 +56,7 @@ In the event the language extension is enabled: *[Note: `e{lbl = val}` is the syntax of a standard H98 record update. It's the nested form introduced by this proposal that is new : `e{lbl1.lbl2 = val}`. However, in the event `RecordDotSyntax` is in effect, note that we propose that `e{lbl = val}` desugar to `setField @"lbl" e val`]*. #### 2.1.2 Precedence -We propose that '`.`' "bind more tightly" than function application thus, `f r.a.b` parses as `f (r.a.b)`. +We propose that `.` "bind more tightly" than function application thus, `f r.a.b` parses as `f (r.a.b)`. | Expression | Interpretation | | -- | -- | From 2e50de80bd357bf4302a6a546a9b6eb3a4cf3fa2 Mon Sep 17 00:00:00 2001 From: Shayne Fletcher Date: Sat, 11 Apr 2020 12:44:11 -0400 Subject: [PATCH 48/49] Does not depend on no record field selectors proposal --- proposals/0000-record-dot-syntax.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index bd10be2f89..4b77c15a07 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -258,4 +258,4 @@ To gain confidence these changes integrate as expected [a prototype](https://git ### 9.2 Who will provide an implementation? -If accepted, the proposal authors would be delighted to provide an implementation. Implementation depends on the implementation of [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst) and [the `NoFieldSelectors` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0160-no-toplevel-field-selectors.rst). +If accepted, the proposal authors would be delighted to provide an implementation. Implementation depends on the implementation of [the `HasField` proposal](https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0158-record-set-field.rst). From a2f5c4c9fabfaf3ef1e3bfda1b3b4dcc46602b40 Mon Sep 17 00:00:00 2001 From: Neil Mitchell Date: Wed, 22 Apr 2020 11:56:22 +0100 Subject: [PATCH 49/49] Update proposals/0000-record-dot-syntax.md Co-Authored-By: Arnaud Spiwack --- proposals/0000-record-dot-syntax.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/0000-record-dot-syntax.md b/proposals/0000-record-dot-syntax.md index 4b77c15a07..875578738f 100644 --- a/proposals/0000-record-dot-syntax.md +++ b/proposals/0000-record-dot-syntax.md @@ -228,7 +228,7 @@ Typically `RecordDotSyntax` will be used in conjunction with `NoFieldSelectors`, ### 7.2 Should a syntax be provided for modification? -Earlier versions of this proposal contained a modify field sytnax of the form `a{field * 2}`. While appealing, there is a lot of syntactic debate, with variously `a{field <- (*2)}`, `a{field * = 2}` and others being proposed. None of these syntax variations are immediately clear to someone not familiar with this proposal. To be conservative, we leave this feature out. +Earlier versions of this proposal contained a modify field syntax of the form `a{field * 2}`. While appealing, there is a lot of syntactic debate, with variously `a{field <- (*2)}`, `a{field * = 2}` and others being proposed. None of these syntax variations are immediately clear to someone not familiar with this proposal. To be conservative, we leave this feature out. ### 7.3 Should there be update sections?