Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OverloadedRecordFields #6

Merged
merged 15 commits into from Feb 4, 2017

Conversation

@adamgundry
Copy link
Contributor

@adamgundry adamgundry commented Aug 23, 2016

The proposal has been accepted; the following discussion is mostly of historic interest.


This is a proposal to introduce a new extension, OverloadedRecordFields, to make it easier to work with record datatypes that reuse the same field names. It also makes some changes to the existing OverloadedLabels extension for consistency.

Thanks to all those who have provided feedback on previous iterations of this work, in particular @ekmett, whose helpful criticism prompted some further refinements to the design. Further comments/complaints/sponsorship offers are welcome!

Rendered.

OverloadedLabels extension
~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``IsLabel`` class defined in ``GHC.OverloadedLabels`` is changed

This comment has been minimized.

@Gabriel439

Gabriel439 Sep 5, 2016

Minor suggestion: perhaps also include the old definition for IsLabel inline for ease of comparison in the proposal

This comment has been minimized.

@adamgundry

adamgundry Sep 5, 2016
Author Contributor

Done, thanks.

@int-index
Copy link
Contributor

@int-index int-index commented Sep 5, 2016

May I suggest making HasField poly-kinded instead of using Symbol? That would allow using empty data types (e.g. data FieldName) as field names, which have better behavior with regards to name-spacing: data FieldName from one module is different than data FieldName from another module; symbol "FieldName" is the same everywhere.

@chrisdone
Copy link

@chrisdone chrisdone commented Sep 5, 2016

@int-index The idea to have a data structure containing two fields of the same name but from different modules seems like a use-case that needs motivating, as by itself just seems potential for confusing people.

@int-index
Copy link
Contributor

@int-index int-index commented Sep 5, 2016

@chrisdone

One library defines a function:

f :: HasField "foo" r Int => r -> IO ()

Another library defines a function:

g :: HasField "foo" r Int => r -> IO ()

There's no guarantee that both of those "foo" are semantically identical. f expects "foo" to mean "how much kittens to save" and g expects "foo" to mean "how much missiles to launch". I want to call f x but accidentally call g x - not the result I wanted.

Using symbols as field names introduces spurious equalities (like above, "foo" from one lib = "foo" from another lib). Now, I understand than not everyone shares the same concerns and many people actually like their Symbols because they don't want to be explicit about equalities.

Being poly-kinded would allow using both Symbols and Types when appropriate.

@Hrothen
Copy link

@Hrothen Hrothen commented Sep 5, 2016

If I'm reading this right, code using the new syntax would everywhere use #foo instead of foo to access the associated field. I don't think this would be better than the current record field readability situation. The # character is already strongly associated with unboxed types, so proliferation of its use would result in code becoming harder to scan. Furthermore it's really ugly (which I suspect might be why it was chosen to mark unboxed types).

Also to be really nitpicky, if you must use the # prefix to get this behavior, then it's not really "overloaded" record fields.

@chrisdone
Copy link

@chrisdone chrisdone commented Sep 5, 2016

@Hrothen

Also to be really nitpicky, if you must use the # prefix to get this behavior, then it's not really "overloaded" record fields.

The overloaded part is that #foo comes from being translated to something that makes use of the class constraint HasField "foo" r. This is basically the same as OverloadedStrings. The meaning of #foo is overloaded.

The # character is already strongly associated with unboxed types, so proliferation of its use would result in code becoming harder to scan. Furthermore it's really ugly (which I suspect might be why it was chosen to mark unboxed types).

The ship has already sailed on the # syntax, after some years of proposals and discussion, and it's already released in GHC 8 via the OverloadedLabels extension. Use of #foo for accessing fields has precedent in the Hugs TRex extension (see here), and also in related libraries like HaskellDB's record field access (via #).

@Hrothen
Copy link

@Hrothen Hrothen commented Sep 5, 2016

The ship has already sailed on the # syntax, after some years of proposals and discussion, and it's already released in GHC 8 via the OverloadedLabels extension. Use of #foo for accessing fields has precedent in the Hugs TRex extension (see here), and also in related libraries like HaskellDB's record field access (via #).

"We already have the ugly thing, so we should double down on it" is not a great argument.

@int-index
Copy link
Contributor

@int-index int-index commented Sep 5, 2016

@Hrothen

"We already have the ugly thing, so we should double down on it" is not a great argument.

Let me be polite and just state that not everybody shares your opinion on # being ugly.

If you want a piece of advice, try to care less about syntax and more about semantics. The color of the bike shed will disappoint someone out there regardless of the choice.

@Hrothen
Copy link

@Hrothen Hrothen commented Sep 5, 2016

The whole point of overloaded record fields is just to be able to avoid having to add prefixes to record field names, so this is all bikeshedding already.

@int-index
Copy link
Contributor

@int-index int-index commented Sep 5, 2016

@Hrothen

The whole point of overloaded record fields is just to be able to avoid having to add prefixes to record field names

This is not true, however, since OverloadedRecordFields has more expressive power than prefixes: you can now be polymorphic in the record type. So far it only allows reading from the record, but with field updates in place (which are planned for the future) it will provide record polymorphism for Haskell.

@Gabriel439
Copy link

@Gabriel439 Gabriel439 commented Sep 5, 2016

@Hrothen Is there a way to do this without some sort of reserved prefix? How would the compiler know that some token like personId (without the #) represents a field that needs to be desugared to fromLabel @"personID"?

@int-index
Copy link
Contributor

@int-index int-index commented Sep 5, 2016

@Gabriel439

One could define

personId = fromLabel @"personId"

for all fields in the module where OverloadedRecordFields is enabled.

@Hrothen
Copy link

@Hrothen Hrothen commented Sep 5, 2016

@Gabriel439

I don't know. I wouldn't even have commented if it weren't for the worry that I'll end up having to read code using this extension in the future.

The compiler should already know if something is a function or a record field, but obviously if that were actually true Adam wouldn't have had to do all this work.

@chrisdone
Copy link

@chrisdone chrisdone commented Sep 5, 2016

#foo (OverloadedLabels) is a very valuable extension irrespective of data-declared records. It's very useful for anonymous records a la PureScript, which I'm using for a data analysis demonstration. The main technical barrier blocking people from using Haskell for exploratory data work is that you have to declare all your data types ahead of time, meaning you have to choose between boilerplate or dynamic types. Demonstration: https://github.com/chrisdone/labels/tree/master/labels-csv#labels-cassava Consider also the database use-case of doing a relational join (HaskellDB demonstrates this), which requires constructing new anonymous records on the fly.

@adamgundry
Copy link
Contributor Author

@adamgundry adamgundry commented Sep 5, 2016

Thanks everyone for your feedback so far!

@int-index

May I suggest making HasField poly-kinded instead of using Symbol? That would allow using empty data types (e.g. data FieldName) as field names
...
Being poly-kinded would allow using both Symbols and Types when appropriate.

I appreciate your concerns about the dangers of semantically-empty strings, but I would suggest the proper solution for that is for libraries not to expose interfaces that use HasField.

Unfortunately I don't see how making HasField poly-kinded gains you anything. HasField constraints are typically introduced by overloaded labels (which are just strings) and typically eliminated by the magic constraint solving behaviour (which deals only with strings). So the only uses for HasField at other kinds would be in code that defines its own accessors and HasField instances, at which point it might as well use another class altogether.

One could define

personId = fromLabel @"personId"

for all fields in the module where OverloadedRecordFields is enabled.

We did consider something like this previously, requiring no new syntax and interpreting occurrences of field names as overloaded fields automatically. The trouble is:

  • This isn't a conservative extension: turning on OverloadedRecordFields would break existing code using field selectors (because a previously monomorphic occurrence of personId would become polymorphic).
  • There is no guarantee that the record type at which personId is ultimately used corresponds to the type to which the particular field in scope belongs.
  • More generally, this introduces the arbitrary requirement to have a (possibly completely unused) field called personId in scope when using personId as an overloaded selector, merely to change the name resolution rules.

In contrast, the overloaded labels approach is simple: #personId always means the overloaded field, and personId the non-overloaded selector.

Given that we want a syntactic distinction, the question is what syntax to use. The OverloadedLabels syntax isn't perfect, but Haskell doesn't have much syntactic wiggle room left!

@int-index
Copy link
Contributor

@int-index int-index commented Sep 5, 2016

@adamgundry

I would suggest the proper solution for that is for libraries not to expose interfaces that use HasField

I doubt you can stop library authors from doing so given the convenience HasField provides.

So the only uses for HasField at other kinds would be in code that defines its own accessors and HasField instances, at which point it might as well use another class altogether.

That another class would benefit from being poly-kinded as well, in case someone wants to use a closed kind for field types:

data PersonField = PersonName | PersonSurname | PersonAddress

At this point HasField is just a kind-restricted version and we're merely fragmenting the ecosystem.

If you doubt anyone would want that, someone is already doing exactly that, e.g. VinylRecords/Vinyl#80

@chrisdone
Copy link

@chrisdone chrisdone commented Sep 5, 2016

I have a wrapper library for demonstration purposes based on the
labels library and conduit which looks like this:

{-# LANGUAGE TemplateHaskell, OverloadedStrings, ScopedTypeVariables,
OverloadedLabels, TypeOperators, DataKinds, FlexibleContexts #-}
import Data.ByteString (ByteString)import Labels.Explore
main = do
  count <-
    explore $
    fileSource "demo.csv" .|
    fromCsvConduit .|
    filterConduit
      ((row :: ("ORIGIN" := ByteString)) ->
         get $("ORIGIN") row == "RKS") .>
    countSink
  print (count :: Int)

(The library I whipped up making is like this:
https://github.com/chrisdone/labels/tree/master/labels-explore)

You can read it as a UNIX pipeline:

  • Create a conduit source from “demo.csv”.
  • Convert from ByteString to ("ORIGIN" := ByteString).
  • Filter in rows for ORIGIN.
  • Pour those rows down the sink which counts them all.

This outputs:

155
  12,704,802,968 bytes allocated in the heap
     457,278,008 bytes copied during GC
       2,216,824 bytes maximum residency (141 sample(s))
         377,616 bytes maximum slop
              11 MB total memory in use (0 MB lost due to fragmentation)


                                 Tot time (elapsed)  Avg pause  Max pause

  Gen  0     24002 colls, 24002 par    3.220s   0.770s     0.0000s    0.0013s
  Gen  1       141 colls,   140 par    0.767s   0.181s     0.0013s    0.0606s

  Parallel GC work balance: 10.21% (serial 0%, perfect 100%)

  TASKS: 18 (1 bound, 17 peak workers (17 total), using -N8)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.001s  (  0.022s elapsed)
  MUT     time    6.275s  (  4.373s elapsed)
  GC      time    3.987s  (  0.951s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time   10.266s  (  5.347s elapsed)

  Alloc rate    2,024,529,772 bytes per MUT second

  Productivity  61.2% of total user, 117.4% of total elapsed

11MB total memory usage is not bad. That’s because of the constant
memory usage imposed by conduit. But the 10s performance is quite
slow. That’s likely due to all the allocation (12GB in total) and the
granularity; rather than loading in one row at a time, it should load
in e.g. 10 or 1000 into a vector and then fold them in one go. By
experience I know that’d be more efficient. Use of Conduit itself
probably adds about 2GB.

However, like Iavor demonstrated; it’s easy to optimize. Writing a
more efficient parser, with fewer intermediate data structures.


Users may define their own instances of ``HasField``, subject to the
usual rules about overlapping and incoherent instances. This allows
"virtual" record fields to be defined for datatypes that do not

This comment has been minimized.

@mpickering

mpickering Sep 5, 2016
Contributor

But you couldn't use the virtual name in record update syntax for example?

This comment has been minimized.

@PkmX

PkmX Sep 6, 2016

I suppose you can also have:

class SetField (l :: Symbol) s t b | s l b -> t, t l -> b where
    setField :: s -> b -> t

so that record updates like

f r = r { #a = True }

would be translated to:

f :: SetField "a" s t Bool => s -> t
f r = setField @"a" r True

This comment has been minimized.

@adamgundry

adamgundry Sep 7, 2016
Author Contributor

Right, this proposal doesn't cover updates, but something along the lines of SetField is a plausible next step. "Virtual" fields would have to define get and set behaviour separately.

We could introduce syntactic sugar like r { #a = True }, but it's not clear to me what the translation should be for multiple updates. They could be desugared to a series of single updates, but that's less expressive than traditional Haskell record update in the unambiguous case.

For example, this is well-typed only if you apply both updates at once:

data T a = MkT { x :: a, y :: a }

f :: T a -> T Bool
f t = t { x = True, y = True }

This comment has been minimized.

@PkmX

PkmX Sep 9, 2016

I think you should just ask users to handle multiple updates explicitly:

data HList (xs :: [*]) where
  HNil :: List '[]
  HCons :: x -> HList xs -> HList (x ': xs)

class SetFields (ls :: [Symbol]) s t (bs :: [*]) | s ls bs -> t, t ls -> bs where
    setFields :: s -> HList bs -> t
f t = t { x = True, y = False }

will desugar to:

f t :: SetFields '["a", "b"] (T a) (T bool) '[Bool, Bool] => T a -> T Bool
f t = setFields @'["a", "b"] t (True `HCons` (False `HCons` HNil))

For reference, this is very similar what my rawr library does (:<= is the record update operator):

type T a = R ( "x" := a, "y" := a )

f :: T a -> T Bool
f t = t :<= R ( #x := True, #y := True )

g :: T a -> T Bool
g t = t :<= R ( #x := True ) -- type error

This comment has been minimized.

@adamgundry

adamgundry Sep 13, 2016
Author Contributor

Thanks for pointing me to rawr, I wasn't aware of it before. I'm not sure how easy it will be to specify and implement the constraint solver behaviour for SetFields. Checking traditional record update is hard enough! Anyway, this is beyond the scope of the current proposal.

@AntC2
Copy link
Contributor

@AntC2 AntC2 commented Sep 12, 2016

@int-index
Just to say I agree that it's not a good move going from field labels being too 'private' (a single Record type) to being entirely 'public' (personId visible everywhere). This was all debated way back in the original design.

Library developers working independently are very likely to come up with the same field labels for unrelated purposes. @adamgundry suggesting those interfaces be hidden seems to rather defeat the purpose of having interfaces and/or require too much co-ordination between library writers.

I see the overall effect as not exactly breaking existing code; but meaning that existing libraries with their current 'private' labels can never be upgraded to OverloadedRecordFields for fear that will introduce a clash with some other library's labels.

@AntC2
Copy link
Contributor

@AntC2 AntC2 commented Sep 12, 2016

We are losing a 'featurette' of H98 field labels: all fields with a given label must be the same type. (This is of limited usefulness, because it only applies within the fields of the data constructors within a given type.)

For example my library wants all personIds to be Int.
But with labels as global Symbols, I can't enforce that. Another library wants personId to be String.

I really feel that rather than an evolution from H98 records we're getting an orthogonal records system. So people who want tight name/scope control over their labels will continue to use their favourite record-like system. (I see the promos above.) And we'll continue with a fragmented ecosystem.

@spl
Copy link
Contributor

@spl spl commented Sep 12, 2016

@adamgundry:

Unfortunately I don't see how making HasField poly-kinded gains you anything. HasField constraints are typically introduced by overloaded labels (which are just strings) and typically eliminated by the magic constraint solving behaviour (which deals only with strings). So the only uses for HasField at other kinds would be in code that defines its own accessors and HasField instances, at which point it might as well use another class altogether.

I'm not sure I understand this argument against a poly-kinded HasField as suggested by @int-index. To summarize the rest of my comment, why is it the case that “at which point it might as well use another class altogether”?

You say that “HasField constraints are typically introduced by overloaded labels,” but the following text from the proposal practically encourages authors to define their own instances:

Users may define their own instances of HasField, subject to the usual rules about overlapping and incoherent instances. This allows "virtual" record fields to be defined for datatypes that do not otherwise have them. For example, an anonymous records library could provide HasField instances and thus be compatible with the polymorphic record selectors introduced by OverloadedRecordFields.

Is there a slight contradiction here?

I could imagine a library with a common set of “virtual” record field data types, instead of Symbols, since date type names are clearly unique, unlike strings. Also, as noted in the PersonField example, kind constructors can be kind-restricted, unlike strings. And it seems useful to have a common interface for polymorphic record selector abstraction over one specialized to that particular library, for all the reasons that make standardization a good thing. (Note that having a common interface does not preclude a specialized interface.)

So, assuming that (1) we are not restricted from writing HasField instances (but are actually encouraged to write them in some cases), (2) having unique data types is better than strings, and (3) it is useful to have standardized polymorphic record selectors for those types, it seems like a net positive change to make the class poly-kinded.

I see the gain, but I'm not sure what the harm is.

@AntC2
Copy link
Contributor

@AntC2 AntC2 commented Sep 12, 2016

@adamgundry
I'm having a hard time piecing together how all these bits of proposals hang together. (It's been over 4 years!)

This OverloadedRecordFields work is Part 3: "Magic type classes" of the ORF suite, right? Except it's only dealing with get field, not update field(?)

The proposal seems to be saying that Part 2: "OverloadedLabels" is now not the best way to go(?) I won't shed a tear over losing the Proxy# x business. I thought, though, that OverloadedLabels had merits on its own, arising from its similarities to ImplicitParameters(?)

I'm finding the write-up for https://github.com/adamgundry/ghc-proposals/blob/overloaded-record-fields/proposals/0000-overloaded-record-fields.rst#multiple-versions-of-fromlabel quite confusing. Could I suggest you include a table:

  • What are all the valid combinations of flags?
  • Which flags are automatically switched on by others appearing?
  • (IOW if I switch on FlagA and say NoFlagB that's going to be rejected.)
  • For each valid combination, what's generated for a given data type decl?
  • For each valid combination, how do I go about defining my own field accessors?

And if I set one valid combination in this module, but it's imported into that module where a different combination is set (possibly for same-named fields in different record types), does it just work? Or do I get perplexing type errors?

A serious limitation of the Haskell record system is the inability to
overload field names in record types: for example, if the data types

::

This comment has been minimized.

@adamgundry

adamgundry Sep 12, 2016
Author Contributor

Use .. code-block:: haskell syntax for highlighting code blocks!

this leads to less clear code and obfuscates relationships between
fields of different records. Qualified names can be used to
distinguish record selectors from different modules, but using one
module per record is often impractical.

This comment has been minimized.

@adamgundry

adamgundry Sep 12, 2016
Author Contributor

Yuras comments on reddit that this section could do with further justification.

@adamgundry
Copy link
Contributor Author

@adamgundry adamgundry commented Sep 12, 2016

@spl

I'm not sure I understand this argument against a poly-kinded HasField as suggested by @int-index. To summarize the rest of my comment, why is it the case that “at which point it might as well use another class altogether”?

My point was that none of the built-in machinery would get any benefit from the poly-kinded version, so the poly-kinded version could live in a package other than base. But you make good points about the value of a single standard class. I had worried about the type inference implications of extra polymorphism, but actually I think it won't be too bad - HasField Foo ... might as well be an unsolved constraint as a kind error. So I'm inclined to switch to the more general version.

@spl
Copy link
Contributor

@spl spl commented Sep 12, 2016

@adamgundry:

I had worried about the type inference implications of extra polymorphism, but actually I think it won't be too bad - HasField Foo ... might as well be an unsolved constraint as a kind error.

I guess you mean that since the type of fromLabel depends on a type application (e.g. fromLabel @Foo), it might be problematic or won't be that useful. But I think any alternative would also need type application or a type Proxy, so that seems fine.

========================

This is a proposal to introduce a new extension,
``OverloadedRecordFields``, to make it easier to work with record

This comment has been minimized.

@adamgundry

adamgundry Sep 13, 2016
Author Contributor

See criticism of the extension name from @Hrothen. I'm rather attached to the current colour of the bikeshed, but perhaps there is an argument to change it.

@adamgundry
Copy link
Contributor Author

@adamgundry adamgundry commented Sep 13, 2016

@AntC2

This OverloadedRecordFields work is Part 3: "Magic type classes" of the ORF suite, right? Except it's only dealing with get field, not update field(?)

Yes. Parts 1 and 2 are implemented. This is half of part 3; I left out updates in order to get something simpler and more manageable to implement first. I tried to make this proposal self-contained, rather than referring back to the (long and complicated) history. If there are places where you think knowledge of the history is needed to understand the proposal text, please flag them up so I can improve the proposal.

The proposal seems to be saying that Part 2: "OverloadedLabels" is now not the best way to go(?) I won't shed a tear over losing the Proxy# x business. I thought, though, that OverloadedLabels had merits on its own, arising from its similarities to ImplicitParameters(?)

I'm proposing some changes to OverloadedLabels, but with the core functionality remaining the same, as it has its own use cases independent of ORF.

I'm finding the write-up for https://github.com/adamgundry/ghc-proposals/blob/overloaded-record-fields/proposals/0000-overloaded-record-fields.rst#multiple-versions-of-fromlabel quite confusing. Could I suggest you include a table:

You mean something like this?

ORF OL RS Desguaring of #foo
On Off Off GHC.Records.fromLabel @"foo" :: HasField "foo" r a => r -> a
On/Off On Off GHC.OverloadedLabels.fromLabel @"foo" :: IsLabel "foo" t => t
On/Off On/Off On fromLabel @"foo" using in-scope fromLabel
  • What are all the valid combinations of flags?

All combinations are valid. Some flags override others, as shown above.

  • Which flags are automatically switched on by others appearing?

I originally proposed that OverloadedRecordFields would imply DuplicateRecordFields, but I now wonder if they should be kept separate, as they are more-or-less orthogonal. There are no other implications.

(IOW if I switch on FlagA and say NoFlagB that's going to be rejected.)

That's not how GHC flags work. Even if ORF implies DRF, it will be perfectly fine to say OverloadedRecordFields NoDuplicateRecordFields.

  • For each valid combination, what's generated for a given data type decl?

We don't ever change what is generated for a data type decl from the normal selector functions. The extensions are purely about how names and labels are resolved in expressions.

  • For each valid combination, how do I go about defining my own field accessors?

Sorry, I don't understand the question.

And if I set one valid combination in this module, but it's imported into that module where a different combination is set (possibly for same-named fields in different record types), does it just work? Or do I get perplexing type errors?

The extensions affect only a single module, and control how name/label resolution works in that module. It doesn't matter which extensions are in use in the defining module, only the client module. Imports are relevant only when using RebindableSyntax.

@edsko
Copy link

@edsko edsko commented Feb 6, 2017

Let me just for the record register my vote against making raw van Laarhoven lenses the default in the core Haskell libraries. I understand that they are popular, but I personally consider them unidiomatic Haskell. I like Haskell because I like types. Types express intent, both to the compiler, automatically catching mistakes, but at least equally important they express intent to the human reader. Raw van Laarhoven lenses express no such intent; most of the types in the lens library are difficult to understand (unless one is an expert in free theorems, I suppose). The argument that "it's convenient, we can just use an x where it was expecting a y" is exactly the argument proponents of dynamically typed languages use; if that argument is convincing, perhaps we should switch to Ruby :)

@mchakravarty
Copy link

@mchakravarty mchakravarty commented Feb 6, 2017

I have to say, @kosmikus and @edsko are making a very good point.

@mchakravarty
Copy link

@mchakravarty mchakravarty commented Feb 6, 2017

@bgamari I like to suggest that we make the issue wrt to lenses a separate proposal. After all, this discussion is still ongoing and it doesn't seem reasonable that such a rather important user-facing feature is accepted before that aspect has a fixed design.

@ocharles
Copy link

@ocharles ocharles commented Feb 6, 2017

I think that argument is almost a straw man @edsko. Lenses obviously do not give up types, and the types given in van Laarhoven lenses express what they do. I'm assuming what you really mean is the sub-typing can sometimes be surprising - for example "viewing" a traversal actually takes a monoidal summary of all values reached in the traversal, rather than viewing a single value. I agree this is a subtly, but it's hardly comparable to dynamic typing - and it is in the types, as viewing requires Const, and Applicative instances for Const require a Monoid.

@int-index
Copy link
Contributor

@int-index int-index commented Feb 6, 2017

and the types given in van Laarhoven lenses express what they do

Not really, since you also have additional lens laws not enforced by the type.

@chrisdone
Copy link

@chrisdone chrisdone commented Feb 6, 2017

I'd echo @edsko and @kosmikus's disagreement about adding unwrapped van Laarhoven lenses to the core Haskell libraries. The types of those form of lenses are difficult to follow. The type errors are confusing, type inference is disappointing (leading to helpers like copyLens). In my personal Haskell experience, I don't need to use lenses most of the time.

As far as syntactic overhead goes: For the basic lens, I see two of the "killer features": first-class accessors, and nested updates. My use of them is:

  1. Nested updates: In my labels package (which uses overloaded labels) I expose a lens function which lets you write over (lens #foo . lens #bar). Given that I do nested updates maybe one or two times per codebase, that's personally speaking a low overhead for me.
  2. First-class accessors: Meanwhile, I do also sometimes have a function that accepts a lens over some parametric state s, and that is a useful first-class accessor. Again, lens #foo isn't a big overhead for me.
@kosmikus
Copy link

@kosmikus kosmikus commented Feb 6, 2017

@ocharles I think it's a bit unfair to accuse @edsko of using a straw man here. There's certainly an amount of subjectivity concerned regarding the lens types. Yes, of course, a type such as forall f . Functor f => (a -> f b) -> s -> f t is not lying, and in a way still says exactly what the thing does. However, I'd argue it's still misleading to the uninitiated. It suggests that it is a function that should be applied to one or two arguments, when in most cases it's something we want to compose as a whole, or feed to higher-order functions as a whole. The relation between the functor constraint and the getting/setting aspect is, I'd argue, quite non-obvious to someone who has not already been explained "look what happens if you plug in Identity or Const". Furthermore, if you consider a function like view :: MonadReader s m => Getting a s a -> m a, then it is not at all obvious that these two things fit together in any way. So yes, the types are all there and of course correct, and it's quite marvellous that it all works, but they're not exactly self-explanatory.

@ghost
Copy link

@ghost ghost commented Feb 6, 2017

It suggests that it is a function that should be applied to one or two arguments, when in most cases it's something we want to compose as a whole, or feed to higher-order functions as a whole.

I would argue that it's a different style of functional programming that makes sense once you're exposed to it. Nothing in the type restricts you from composing or feeding it to a higher-order function.

@bgamari
Copy link
Contributor

@bgamari bgamari commented Feb 6, 2017

@mchakravarty, a new proposal for discussing the default instance is probably a good idea at this point. It seems unlikely that this point will be resolved in the immediate future.

@simonpj
Copy link
Contributor

@simonpj simonpj commented Feb 7, 2017

This is a LONG thread!

Adam, it would be good to summarise and draw it together somehow.

My (not very well informed) thoughts are:

  • For 8.2 it would be good to move ahead with HasField (with its built-in-solving rules) and the change to IsLabel.

  • We should refrain from adding the (->) instance for now, since there is no consensus, and not doing so is not a regression... we just move forward a little bit at a time, and gain experience.

I have always been unsettled by the super-polymorphic type for lenses. I would far prefer them to have an proper abstract type (Lens s t ab). But the implicit subtyping we get through exposing the representation is truly amazing and truly useful. I am uneasy about the tension between abstraction and utility, and I hope we may ultimately figure out a better way to reconcile them. Are Reid's suggestions above a start?

@rrnewton
Copy link

@rrnewton rrnewton commented Feb 7, 2017

@adamgundry - do you have a pointer to any discussion of the performance implications of working with overloaded record fields? I'm wondering if there are any benchmarks of different record libraries floating around, and if so it would be nice to link to them from this (mega) thread. In fact, I'd rather link to something than extend this thread further ;-).

It looks like an unoptimized implementation of these overloaded types would end up passing dictionaries around, and I'd like to know how well the optimizer is working for code like this (plus, the effect on compile time).

@adamgundry
Copy link
Contributor Author

@adamgundry adamgundry commented Feb 7, 2017

@simonpj I think we're in agreement regarding the plan for 8.2. (If anyone disagrees, please shout!)

The questions about lens representations are interesting, but I think the best way to follow up on them is to experiment in libraries outside of GHC, rather than anything involving the GHC proposal process. We can put together a proposal for the setter counterpart to HasField without taking a position on the question of the IsLabel instance for (->).

@rrnewton unfortunately I'm not aware of such a discussion nor of benchmarks of record libraries. My uninformed presumption would be that provided code is not polymorphic in HasField (or any such polymorphism is specialised away), inlining would mean that the runtime performance would not differ from code using the selector function directly. If code is polymorphic in HasField and isn't specialised, I guess passing HasField dictionaries is likely to be a significant performance hit.

@ocharles
Copy link

@ocharles ocharles commented Feb 7, 2017

I'm not sure if it's been mentioned, but gi-gtk (and friends) are also using the `->' instance very different to lenses: https://github.com/haskell-gi/haskell-gi/wiki/Overloading. Labels themselves are callable as IO actions that actually go and do Gtk calls. Just wanted to provide one more data point for how people are using OverloadedLabels right now.

@adamgundry
Copy link
Contributor Author

@adamgundry adamgundry commented Feb 8, 2017

@ocharles thanks for flagging up the gi-gtk use case. I'm not convinced it's a good idea. In particular, it will break if we introduce any kind of IsLabel instance for (->).

@danidiaz
Copy link

@danidiaz danidiaz commented Feb 8, 2017

I have a doubt about

A field must be in scope for the corresponding HasField constraint to be solved.

What happens if a module that doesn't "see" a certain HasField constraint declares an instance of its own, with the same symbol and source/target types? Or will that be forbidden?

@adamgundry
Copy link
Contributor Author

@adamgundry adamgundry commented Feb 10, 2017

@danidiaz good question! It isn't entirely clear from the proposal text, but the rules for legal HasField instances laid out at the end of the Virtual record fields section apply regardless of whether the fields are in scope. Thus, a conflicting instance is forbidden (even though the constraint will not be solved automatically either) and coherence is maintained.

@garetxe
Copy link

@garetxe garetxe commented Feb 12, 2017

@adamgundry Why should the instance in gi-gtk break? The instance is basically something like

instance (r ~ IO ()) => IsLabel "show" (Widget -> r) where
    fromLabel _ = widgetShow

So it defines a virtual record field of type IO (). With this proposal the instance should be for HasField instead, I suppose, but otherwise what gi-gtk and friends do should be within the envisioned design, no?

I hadn't chimed in before because I thought from reading the proposal that what gi-gtk does would still be supported (modulo some IsLabel -> HasField refactoring), but the comment above makes me worry that perhaps I misunderstood something.

@adamgundry
Copy link
Contributor Author

@adamgundry adamgundry commented Feb 14, 2017

@garetxe it's clear that those instances will conflict with any IsLabel instance for (->) introduced in base. There are a couple of potential difficulties with refactoring gi-gtk to use HasField, depending on which instance for (->) we choose (see discussion above):

  • If we use the unwrapped lens instance, as @mboes and others are suggesting, this simply doesn't work.
  • If we use the selector function instance, as originally proposed, it may or may not be possible to satisfy the functional dependency. I haven't investigated gi-gtk thoroughly, but the crucial question is whether MethodInfo can be given a functional dependency to determine its result type.
ghc-mirror pushed a commit to ghc/ghc that referenced this pull request Feb 14, 2017
This implements automatic constraint solving for the new HasField class
and modifies the existing OverloadedLabels extension, as described in
the GHC proposal
(ghc-proposals/ghc-proposals#6). Per the current
form of the proposal, it does *not* currently introduce a separate
`OverloadedRecordFields` extension.

This replaces D1687.

The users guide documentation still needs to be written, but I'll do
that after the implementation is merged, in case there are further
design changes.

Test Plan: new and modified tests in overloadedrecflds

Reviewers: simonpj, goldfire, dfeuer, bgamari, austin, hvr

Reviewed By: bgamari

Subscribers: maninalift, dfeuer, ysangkok, thomie, mpickering

Differential Revision: https://phabricator.haskell.org/D2708
@garetxe
Copy link

@garetxe garetxe commented Feb 14, 2017

@adamgundry Yes, indeed, the current instances will certainly break. My worry was whether I would be able to modify gi-gtk somehow to keep the functionality.

Playing around with it today, it seems like adding HasField instances won't really work (for the case of the selector function instance, I haven't really tried the unwrapped lens instance), since many of the virtual records that we want to generate are higher rank. At least I could not find a way, and it sounds like the same issue as in the limitations section in the proposal.

But simply adding an OVERLAPPING pragma to the existing instance, like so

instance {-# OVERLAPPING #-} (r ~ ...) => IsLabel (Widget -> r) where
     ...

seems to work both with the proposed selector function and lens instances, since it is strictly more specific than either of the proposed alternatives. So either way the final decision goes I can easily keep the exposed interface in gi-gtk and friends, I think.

Anyway, thanks a lot for the work on the overloaded record fields! Whether the overloaded methods stuff in gi-gtk survives in the final design of OverloadedLabels or not, the other places where I used IsLabel instances already lead to a much nicer API in my opinion.

@taktoa taktoa mentioned this pull request Jul 11, 2018
3 of 6 tasks complete
nomeata pushed a commit that referenced this pull request May 21, 2020
Update the LocalDo proposal to use a qualified do
JakobBruenker added a commit to JakobBruenker/ghc-proposals that referenced this pull request Sep 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Linked issues

Successfully merging this pull request may close these issues.

None yet