Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Separate HasField into GetField and SetField #286

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

tysonzero
Copy link

@tysonzero tysonzero commented Oct 17, 2019

There are a variety of useful instances of getField that do not have a
corresponding implementation of setField.

These include records with invariants that setField could violate, like
TimeOfDay or AsciiString or Crossword. As well as various uses of
virtual fields that are completely impossible to set.

Due to dot-notation likely being built on top of getField, supporting this
increased flexibility is of significant importance.

Rendered

@simonpj
Copy link
Contributor

simonpj commented Oct 17, 2019

(The lnk "This proposal is discussed at this pull request." is broken.)

To me this proposal looks sensible. I too am keen on virtual fields.

I like the fact that SetField has a method hasField which is enough to do everything, but also specialised methods for setField and updateField. This is a bit like Ord, which has <= as well as the more general compare.

I'd like to see default methods for all three methods, and a MINIMAL pragma.

However I'm conscious that all this was discussed in the original SetField proposal. I have not re-read that disussion, but someone should, so that we don't reinvent the wheel here.

@effectfully
Copy link

effectfully commented Oct 17, 2019

Why not what I described here? I like this proposal more than the previous one, but it's just one tiny step away from supporting general setters that can look under Maybe, set multiple fields at once, etc.

@simonpj

However I'm conscious that all this was discussed in the original SetField proposal. I have not re-read that disussion, but someone should, so that we don't reinvent the wheel here.

I did that and I didn't find the counterarguments convincing. See my comment.

@ndmitchell
Copy link
Contributor

For the last proposal the biggest counter arguments against this idea were that the get and set could not be proven to be equivalent (much like Eq/Ord have a stated but unproven consistency condition).

I'm happy either way, assuming the question about when such an instance is in scope can be resolved. Note that making a record field allow both of them to exist would be compatible with today, so might be not necessary to do better.

@int-index
Copy link
Contributor

The proposal nicely includes various methods in the SetField class:

    setField :: a -> r -> r
    updateField :: (a -> a) -> r -> r
    hasField r = (\a -> setField @x a r, getField @x r)

I would propose adding one more, for effectful modification:

    updateFieldF :: Functor f => (a -> f a) -> r -> f r

With f = Identity, it gives back updateField. But it also can be used with f = Maybe, f = IO, etc.

Copy link
Contributor

@adamgundry adamgundry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm generally in support of this proposal.

Why not what I described here? I like this proposal more than the previous one, but it's just one tiny step away from supporting general setters that can look under Maybe, set multiple fields at once, etc.

@effectfully do I understand correctly that the change you'd like to see to this proposal is removing the GetField superclass of SetField, to permit set-only "fields" (which could include traversing Maybe)? We could keep GetField/SetField separate and (re)introduce another constraint HasField to represent having both, without too much difficulty. But given that such general setters don't arise from Haskell records (excluding partial fields, perhaps), I wonder if one is better off using a lens/optics library for such cases?

proposals/separate-get-set-field.md Outdated Show resolved Hide resolved
Should any of `hasField`, `setField`, `updateField` be dropped from `SetField`?
They are all mentioned currently for performance reasons, as I can imagine
situations where it seems like you need all 3 for optimal perf. But they can
all be defined in terms of each other, so perhaps some should be dropped.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a very strong opinion about which representation for setters we use (cunningly-disguised van Laarhoven lenses included). The situations in which there is a runtime performance difference are fairly arcane.

But I am slightly concerned that if the SetField class is too large, the compiler might need to do a lot of work generating the dictionaries, reducing compile-time performance. Bear in mind that we need to do this for all fields of all record types GHC compiles. @simonpj is that a reasonable concern?

Copy link
Contributor

@phadej phadej Oct 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation wise dictionaries could be created at use-site on the fly (current HasField), or at record definition site (cf. selector functions).

Maybe even something one can control with some -f switch: if you go all in for overloaded fields machinery, at record definition site would be better. But for occasional use at use-site is probably fine to generate dictionaries on the fly.

proposals/separate-get-set-field.md Outdated Show resolved Hide resolved
## Alternatives

Separate out further into "get-only", "set-only", "get-0-or-more",
"get-and-set" etc. instead of only "get-only" and "get-and-set".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do people feel about doing something like this snippet or #158 (comment) to better support partial fields? The cost is a bit more implementation complexity including an extra type parameter on the classes, and GetField dictionaries no longer perfectly matching selector functions in the partial case, but the benefit is that we can define a sum-of-records and use it without the possibility of runtime crashes.

I generally agree that the GHC.Records classes should be tied fairly closely to Haskell's records, rather than supporting more general optics, as it is easy enough to convert into a lens/optics representation when more elaborate constructions are needed. But as partial selector functions are in Haskell 98 (and are a bit of a wart, arguably), perhaps it is worth improving their treatment here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd raise a flag, that a time example used would like to have both, set-only and get-only fields.

You can get day of a month from Day, and you can modify them in two ways (clip or rollover).

So with virtual fields IMO both read-only and modify-only are useful. There are cases (as above), where these cannot be combined into get-and-set, but then they should have different names: problem solved.

Note: get-only fields is not really any closer to "Haskell records" than set-only fields, IMHO.

Copy link
Author

@tysonzero tysonzero Oct 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I think sum-of-records as currently implemented is a huge wart and I am ok with not really supporting it.

I would expect any reasonable implementation of sum-of-records to support the following:

data FooBar
    = Foo { foo :: Int, baz :: Int }
    | Bar { bar :: Int, qux :: Int }

fooBar :: FooBar -> Int
fooBar (Foo f) = f.foo + f.baz
fooBar (Bar b) = b.bar - b.qux

I am hoping that long term the above will be supported once anonymous rows/records come around. So in the short term I would personally prefer not to design features that would not be compatible with this approach.

sum-of-records will still be supported via NamedFieldPuns / pattern matching, which is compatible with having overlapping labels.

I would want support for sum-of-products to be in the form of some sort of proposal for sum's combined with this proposal, rather than trying to support them in this proposal.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phadej

Could you fully explain what set-only fields you would expect for Day, ideally in the form of working code, because it is very much not clear to me how you would end up with a legal set-only field that couldn't be made into a get-and-set field.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am hoping that long term the above will be supported once anonymous rows/records come around. So in the short term I would personally prefer not to design features that would not be compatible with this approach.

I don't think partial fields cause any additional problems here, do they? Even with a single constructor / total field there would potentially be an odd mismatch between

foobar :: Foo -> Int
foobar (Foo f) = f.foo + f.bar -- possible future

foobar f = f.foo + f.bar -- with HasField as currently envisaged

although in principle I think this is all solvable if we have a separate type for anonymous records, because we could imagine something like

data FooBar
    = Foo (Record { foo :: Int, baz :: Int })
    | Bar (Record { bar :: Int, qux :: Int })

foobar (Foo f) = f.foo + f.bar -- works using HasField "foo" (Record {...}) Int Total
foobar f       = f.foo + f.bar -- works using HasField "foo" FooBar Int Partial

Copy link
Author

@tysonzero tysonzero Oct 18, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was envisioning a separate type, so both of the first examples would work.

-- works using HasField "foo" FooBar Int Partial

Hmm I'm not so sure how that would work, since Record is an actual standalone type now, and thus it should be treated like any other type.

data FooBar
    = Foo FooType
    | Bar BarType

foobar :: FooBar -> ...
foobar f = f.foo

Under what circumstances would the above typecheck? You can't really do an "or" when it comes to instances, particularly not a clever "or" that gives you a different type when both match.

If the above was somehow made to work, it still seems to me as though the "sum" side of the "sum-of-records" should be handled using a sum-type proposal, and the "record" side would be handled by this proposal without consideration for the separately-handled "sum" side.

On a side note I think the complexity alone to support what I personally think is a substantial wart is enough for me to be against it.

Co-Authored-By: Adam Gundry <adam@well-typed.com>
@effectfully
Copy link

do I understand correctly that the change you'd like to see to this proposal is removing the GetField superclass of SetField, to permit set-only "fields" (which could include traversing Maybe)?

Yes.

But given that such general setters don't arise from Haskell records (excluding partial fields, perhaps)

  1. they do arise for records when several fields are of the same type
  2. partial fields are also important
  3. you can implement things like each and probably at and others

With general setters we get a quite more expressive system. My only concern is that some people might consider it too expressive.

We could keep GetField/SetField separate and (re)introduce another constraint HasField to represent having both, without too much difficulty.

Yes.

I wonder if one is better off using a lens/optics library for such cases?

Well, the user probably doesn't want to use the record syntax and switch to lenses (and rewrite everything) once a single Maybe is added.

@maralorn
Copy link
Contributor

do I understand correctly that the change you'd like to see to this proposal is removing the GetField superclass of SetField, to permit set-only "fields" (which could include traversing Maybe)?

Yes.

We could keep GetField/SetField separate and (re)introduce another constraint HasField to represent having both, without too much difficulty.

Yes.

I wonder if one is better off using a lens/optics library for such cases?

Well, the user probably doesn't want to use the record syntax and switch to lenses (and rewrite everything) once a single Maybe is added.

I agree, that having the possibility to implement _Just like accessors, would be really nice.

So this would mean GetField with getField and SetField with setField and updateField?

And then for performance reasons GetField, SetField => HasField? Which would have an empty MINIMAL statement but gives the possibility to supply a more performant hasField?

@simonpj
Copy link
Contributor

simonpj commented Oct 17, 2019

But I am slightly concerned that if the SetField class is too large, the compiler might need to do a lot of work generating the dictionaries, reducing compile-time performance. Bear in mind that we need to do this for all fields of all record types GHC compiles. @simonpj is that a reasonable concern?

Generally I'm not too bothered about this. Typically there will be one, fixed dictionary for each top-level declared record type/field combination. Unlike Eq a => Eq [a] there will be little or no dictionary transformation, construction. Just some fixed dictionaries.

@effectfully
Copy link

So this would mean GetField with getField and SetField with setField and updateField?

Just update, because updateField suggests only a single field can be updated, while it's the case that update can update multiple fields.

And then for performance reasons GetField, SetField => HasField? Which would have an empty MINIMAL statement but gives the possibility to supply a more performant hasField?

When it comes to pure updates, update is already performant. For effectful updates (i.e. the lens use case) we probably do need an additional class.

@maralorn
Copy link
Contributor

maralorn commented Oct 18, 2019

So this would mean GetField with getField and SetField with setField and updateField?

Just update, because updateField suggests only a single field can be updated, while it's the case that update can update multiple fields.

I‘d really want this feature, but I think it shouldn‘t come at the cost of the default use case. Here I assume that the default use case is someone implementing anything which they think of as a "real" field by implementing a setField and getField.
If the SetField class does not rely on GetField then I don‘t see how it could implement a default implementation of update.
So as a user of these type classes I would probably feel a bit confused, that first I don‘t need to provide GetField for SetField but will probably end up using getField for my update definition anyways, which I wouldn’t even want to implement because I have probably already defined setField and getField.

Any ideas how to work around this?

@effectfully
Copy link

effectfully commented Oct 18, 2019

@maralorn

I‘d really want this feature, but I think it shouldn‘t come at the cost of the default use case. Here I assume that the default use case is someone implementing anything which they think of as a "real" field by implementing a setField and getField.
If the SetField class does not rely on GetField then I don‘t see how it could implement a default implementation of update.

Via default, see the code from this comment.

@effectfully
Copy link

effectfully commented Oct 18, 2019

I‘d really want this feature, but I think it shouldn‘t come at the cost of the default use case. Here I assume that the default use case is someone implementing anything which they think of as a "real" field by implementing a setField and getField.

But note that this is not the primary use case. The primary use case is to derive efficient getField, setField and update automatically. And even when you want to write your own getter and setter with the same name for a field (I don't think there were examples of that in this and the previous threads), defining getField and update is better performance-wise than defining getField and setField.

@tysonzero
Copy link
Author

tysonzero commented Oct 18, 2019

@effectfully

  1. they do arise for records when several fields are of the same type

I don't personally see multiple fields of the same type as a "field", and GHC is also not going to derive these, so they would have to solely be virtual.

This seems like something for a lens library/class to deal with, and not a (Set|Get)Field class.

  1. partial fields are also important

Are they? I can't imagine all that many are used, since for decades now they have generated top level functions that simply crash on incorrect input. Personally I am hoping long term that sum-of-records is done using a sum of anonymous records, which would bring us back to having total fields.

  1. you can implement things like each and probably at and others

Similar to above, seems like more of a lens thing.

To be clear I am not in any way against lens, i'm a big user of it. It just doesn't seem like these proposals are about embedding lenses into GHC, but about making interaction with records/fields nicer.

Particularly since all these proposals will make things nicer for the existing lens libraries, now that they don't have top level generated functions to avoid name collisions with. They can also utilize these classes for convenient field lenses that don't have to be defined in advance (often with TemplateHaskell).

Well, the user probably doesn't want to use the record syntax and switch to lenses (and rewrite everything) once a single Maybe is added.

It seems like you would still have to do that regardless once a single Either is added (or any other non-trivial sum type), as how would you decide which constructor to tunnel into?

Tunneling through multiple layers where some of them are sum types seems like a job for lens or other combinators, and not the goal of a SetField class.

@maralorn

I agree, that having the possibility to implement _Just like accessors, would be really nice.

I agree, but I don't see it as something we should be asking GHC.Records to implement. I see it as something we should be asking GHC.Variants to implement via a proposal aimed at tackling variants and constructor name collisions.

I am fleshing out such a proposal, as it would be nice to have (although in our codebase records and field name collisions are far more frequent), it will probably involve OverloadedLabels amongst other things.

@effectfully
Copy link

effectfully commented Oct 18, 2019

I don't personally see multiple fields of the same type as a "field"

Me neither.

This seems like something for a lens library/class to deal with, and not a (Set|Get)Field class.

I do not propose to have a SetField class.

partial fields are also important

Are they?

Yes, see the Maybe use case.

It seems like you would still have to do that regardless once a single Either is added (or any other non-trivial sum type), as how would you decide which constructor to tunnel into?

You can specify the constructor explicitly as in

instance s ~ s' => HasUpdate "just" (Maybe s) s' where
    update _ = fmap

To be clear I am not in any way against lens, i'm a big user of it. It just doesn't seem like these proposals are about embedding lenses into GHC, but about making interaction with records/fields nicer.

What I propose makes interaction with records/fields nicer and costs nearly nothing. So do you want to avoid the additional expressiveness just for the sake of it?

@tysonzero
Copy link
Author

tysonzero commented Oct 18, 2019

I do not propose to have a SetField class.

Hmm, well given that this is in the GHC.Records module and all these classes are going to be derived for Haskell records. I think it makes sense to have the naming and design be based around Field, even if some choose to use them for other purposes.

I also think Set is pretty reasonable even for updating, particularly since that is how lens does it.

What I propose makes interaction with records/fields nicer and costs nearly nothing. So do you want to avoid the additional expressiveness just for the sake of it?

I mean if the ask is solely to have:

class GetField (x :: k) r a | x r -> a where
    getField :: r -> a

class SetField (x :: k) r a | x r -> a where
    setField :: a -> r -> r
    updateField :: (a -> a) -> r -> r

class (GetField x r a, SetField x r a) => HasField (x :: k) r a | x r -> a where
    hasField :: r -> (a -> r, a)

Then I am not particularly opposed to it. I personally will continue using lens/optics for _Just and _Left and so on.

With that said I definitely do not want to use the complex Partial/Total stuff. It's cool, but it also seems far too complex for the payoff.

I would personally love to have classes based around lens/optics, where the s and the optic name gives you the appropriate optic. But it doesn't seem like something that GHC.Records or even GHC.Variants would be in charge of.

@effectfully
Copy link

I mean if the ask is solely to have: ...

Yes, just rename SetField to HasUpdate, because this general update is inevitably going to be used for updating several fields, since the type signature doesn't preclude this use case. Or at least a weirdly sounding HasSet. SetField is a misleading name for something that can be a traversal.

@tysonzero
Copy link
Author

Since this is module GHC.Records and it's specifically designed around interacting with records fields, I really do prefer SetField.

I also think HasUpdate doesn't clearly convey the idea that this is for some specific "field name", instead of some single un-named update for the whole type. In other words I would be expecting something like:

class HasUpdate a where
    type UpdateT a
    update :: UpdateT a -> a -> a

For consistency reasons alone GetField + HasUpdate + HasField would bother me a lot.

HasNamedGetter + HasNamedSetter + HasNamedField` would be consistent and avoid any ambiguity about being named, but it's kind of a mouthful.

@effectfully
Copy link

I also think HasUpdate doesn't clearly convey the idea that this is for some specific "field name", instead of some single un-named update for the whole type.

Yes, because it can be used to update the whole type. It's what the type signature of update says.

class HasUpdate a where
    type UpdateT a
    update :: UpdateT a -> a -> a

This doesn't look composable to me.

I do not care much about naming. HasGetter & HasSetter, Gettable & Settable or even the misleading GetField & SetField all would be fine to me as long as general setters are supported.

@maralorn
Copy link
Contributor

I mean if the ask is solely to have:

class GetField (x :: k) r a | x r -> a where
    getField :: r -> a

class SetField (x :: k) r a | x r -> a where
    setField :: a -> r -> r
    updateField :: (a -> a) -> r -> r

class (GetField x r a, SetField x r a) => HasField (x :: k) r a | x r -> a where
    hasField :: r -> (a -> r, a)

Then I am not particularly opposed to it.

Well then at least three people in this thread could agree on that.

The question is if there is a better possibility to achieve the usage obtained by using updateField for something like Maybe. I don‘t see SetField implementations for sum-types in base. So if one wants to use it, there will be orphan instances for SetField, which seems like something that wouldn’t find widespread adaptation.

On the other hand I don‘t see any huge cost related to the extended flexibility.

@tysonzero
Copy link
Author

The question is if there is a better possibility to achieve the usage obtained by using updateField for something like Maybe. I don‘t see SetField implementations for sum-types in base. So if one wants to use it, there will be orphan instances for SetField, which seems like something that wouldn’t find widespread adaptation.

That's why I think a separate proposal should be built around sum/variant-types and constructors, leaving this one focused on records/fields.

I could imagine wanting things like # syntax for constructors, and various prism-like features.

I could even imagine a third proposal based around the inherent duality of records and variants, with things like deriving the ability to exhaustively pattern match on a variant using a record of the same structure.

On the other hand I don‘t see any huge cost related to the extended flexibility.

Yeah I think set-only fields will have minimal benefit, but since the cost is low, I am personally open to the idea if that's what the community wants.

@maralorn
Copy link
Contributor

So, what’s the way forward with this proposal? Leave it as is or change it to decouple SetField and GetField? I am a still a bit hazy on what updateField is supposed to mean in the absence of a GetField instance. Is it something like "The update function will only be called when the implementation can give a useful previous value of the virtual field"?

@tysonzero
Copy link
Author

tysonzero commented Oct 28, 2019

@maralorn

I'm honestly not super sure.

If these classes were built on top of a shared type family as follows:

type family FieldType (x :: k) r

class GetField (x :: k) r where
    getField :: r -> FieldType x r

class SetField (x :: k) r where
    setField :: FieldType x r -> r -> r

I would feel a lot more comfortable with separate get and set classes. (It would also get rid of my need for UndecidableInstances and TypeSynonymInstances in various places, but that's a different discussion).

This would also make it much nicer to build new hierarchies alongside the existing ghc-assisted classes, as you can be sure that all the new classes agree on the field types, regardless of the differences in what functionality is supported by them.

It also seems like the above would make it much easier to derive lenses/optics on the fly, as you wouldn't have to worry what kind of traversal/lens to derive if GetField and SetField disagreed on the underlying type.

As it stands I'm still not a fan of the idea of splitting up the classes, as the complexity and risk of divergence out-weigh the benefit which seems to be limited.

@effectfully
Copy link

@tysonzero

If these classes were built on top of a shared type family as follows: <...>

Looks great to me. I like this version more than mine. So the updated code is (not tested)

type family FieldType (x :: k) r

class GetField x s a where
    getField :: FieldType x s -> s -> a

class HasUpdate x s where
    update :: FieldType x s ~ a => Proxy# x -> (a -> a) -> s -> s
    default update :: (GetField x s, FieldType x s ~ a) => Proxy# x -> (a -> a) -> s -> s
    update x f s = setField x (f $ getField x s) s

    setField :: Proxy# x -> FieldType x s -> s -> s
    setField x = update x . const

type instance FieldType "just" (Maybe s) = s

-- No 'getField' possible.
instance HasUpdate "just" (Maybe s) where
    update _ = fmap

data Twice a = Twice a a

type instance FieldType "twice" (Twice a) = a

-- No 'getField' possible.
-- (We could have a canonical instance for any 'Functor' and use @DerivingVia@)
-- (Or even just define generic @instance Functor f => HasUpdate "mapped" (f a)@)
instance HasUpdate "twice" (Twice a) where
    update _ f (Twice x y) = Twice (f x) (f y)

(although I'd rename FieldType to ElemType)

As it stands I'm still not a fan of the idea of splitting up the classes, as the complexity and risk of divergence out-weigh the benefit which seems to be limited.

Well, that proposal is literally called "Separate HasField into GetField and SetField". The benefits are substantial to me.

risk of divergence

What do you mean?

@tysonzero
Copy link
Author

tysonzero commented Oct 29, 2019

@effectfully

(although I'd rename FieldType to ElemType)

Hmm interesting.

One thing I was planning on deferring until later proposals is some stuff related to prisms and constructors. But it is seeming quite relevant to the current discussion, so I will bring it up now.

It seems clear to me that overloaded product-types / fields / lenses are useful, it also seems clear to me that overloaded sum-types / constructors / prisms are useful.

From there we have at least two choices:

Unified hierarchy

In this case we have a single type family that is used for both constructors and fields:

type family OpticType (x :: Symbol) s

type instance OpticType "_1" (a, b) = a
type instance OpticType "_2" (a, b) = b
type instance OpticType "just" (Maybe a) = a
type instance OpticType "left" (Either a b) = a
type instance OpticType "right" (Either a b) = b

class GetField (x :: k) s where
    getField :: s -> OpticType x s

class CallConstructor (x :: k) s where
    construct :: OpticType x s -> s

One advantage of this is that it is more unified. Constructors are reversed fields and vice versa. So x.foo goes in one direction, and #foo x goes in other other direction.

However the naming might be hard when something is both a field and a constructor (newtypes/isomorphisms), such as x.identity and #identity x vs x.runIdentity and #runIdentity x.

Stratified hierarchy

In this case we have two type families, one for constructors and one for fields:

type family FieldType (x :: Symbol) s

type instance FieldType "_1" (a, b) = a
type instance FieldType "_2" (a, b) = b

type family ConstructorType (x :: Symbol) s

type instance ConstructorType "just" (Maybe a) = a
type instance ConstructorType "left" (Either a b) = a
type instance ConstructorType "right" (Either a b) = b

class GetField (x :: k) s where
    getField :: s -> FieldType x s

class CallConstructor (x :: k) s where
    construct :: ConstructorType x s -> s

One advantage of this is it allows for variants to define passthrough instances for fields, which would otherwise cause problems with overlapping type families:

There is the following style of set-only fields that only set when the constructor matches:

instance SetField x a => SetField (Maybe a)

There is also the following style with get-and-set fields that work when multiple constructors share a field:

data Cat = Cat
    { name :: String
    , ...
    }

data Dog = Dog
    { name :: String
    , ...
    }

data Animal = Cat Cat | Dog Dog

instance GetField "name" Animal

instance (GetField x a, GetField x b) => GetField x (Either a b)

You could also do the dual of the above (records that define pass through instances for constructors), although it doesn't seem as useful.


Both of these options have significant merit, and it's not a trivial decision.

I probably lean towards stratified, since most will probably consider the "foo field" and the "foo constructor" of a type to be different, and this is really all about naming and namespacing. Plus we have two different syntaxes for the two things, so two different hierarchies seems to fit pretty seamlessly into that, .foo for field stuff and #foo for constructor stuff.

I would classify current Haskell as stratified, due to all constructors having an uppercase first letter and all fields having a lowercase first letter.

@tysonzero
Copy link
Author

tysonzero commented Nov 3, 2019

risk of divergence

What do you mean?

I would like to rule out the possibility of GetField and SetField diverging on what they believe a given field type to be, with separate FD-based classes there is nothing preventing them from diverging.

With a single type family this problem is avoided. This should also simplify things like building lenses and such on top of them, as well as various user-defined expansions to the hierarchy.

Some might argue that they intentionally want divergence to increase flexibility, such as getting a field returning it in some context, (IO Int instead of Int for a mutable type perhaps). But if we are going to go down that route, we need even more flexibility:

type family GetField (x :: k) r
type family SetFieldInput (x :: k) r
type family SetFieldOutput (x :: k) r

to support:

getMutableFoo :: FooBar -> IO Foo
setMutableFoo :: Foo -> FooBar -> IO ()

Personally this causes me to want to stick to focusing on things that act like pure records, and thus keep the field types used in GetField and SetField in sync.

With the above said I think I am on the side of splitting up GetField and SetField as long as there is a type family involved to keep them together. I have thought of some types that might benefit from set-only fields:

newtype Set a = Set
    { tree :: AVLTree a -- set only
    }

newtype CaseInsensitive = CaseInsensitive
    { text :: Text -- set only
    }

data User = User
    { passwordHash :: Hash
    ...
    }

instance SetField "password" User where
    type FieldType "password" User = Text
    ...

For get-only fields it was about protecting invariants. For set-only fields it is about protecting information that would break the abstraction (or that is impossible to retrieve).

It's fine to convert a (valid) AVL tree into a Set, but it exposes implementation details to let you convert back out. In the Person case you physically cannot retrieve the field that you set, because it has already been hashed away.

@effectfully
Copy link

I would like to rule out the possibility of GetField and SetField diverging on what they believe a given field type to be, with separate FD-based classes there is nothing preventing them from diverging.

With a single type family this problem is avoided.

Ah. You elaborated that in the previous message, but then ended it with

As it stands I'm still not a fan of the idea of splitting up the classes, as the complexity and risk of divergence out-weigh the benefit which seems to be limited.

so I thought that this "still not a fan" meant there were additional concerns.

For with get-only fields it was about protecting invariants. For set-only fields it is about protecting information that would break the abstraction (or that is impossible to retrieve).

It's fine to convert a (valid) AVL tree into a Set, but it exposes implementation details to let you convert back out. In the Person case you physically cannot retrieve the field that you set, because it has already been hashed away.

Great examples.

I do like the idea of using a single type family to retrieve the type of a field in both a getter and a setter. It might prove restrictive, but in this case we always can directly extend the functionality later.

@tysonzero
Copy link
Author

@effectfully

One slight complication is that the above examples only support set and not update. Typically SetField/Setter support both update and set, to fit with Functor in the hierarchy.

@effectfully
Copy link

effectfully commented Nov 5, 2019

@tysonzero good point. I wouldn't be against having separate Getter, Setter and Updater classes with Updater having a default implementation in terms of Getter and Setter; and Setter having a default implementation in terms of Updater. We already have in Prelude weird all-in-one classes like Num, so I don't think that some granularity would be a bad thing.

Some fields are get-only, some fields are set-only, some fields can be updated. Those are distinct use cases and it makes sense to me that distinct use cases are handled by distinct type classes.

@tysonzero
Copy link
Author

@effectfully

Would that be:

class GetField
class SetField
class (GetField, SetField) => HasField

or:

class GetField
class SetField
class SetField => UpdateField
class (GetField, UpdateField) => HasField

The former seems like it might be more practical?

Unfortunately the former also means essentially skipping over Setter in lens and Functor, as you no longer have the ability to update/map without also needing to support get.

@effectfully
Copy link

@tysonzero I'd opt for 2 as it introduces a class that handles those problems that the other classes don't. I think it would be a good design (modulo the fact that I do not consider any solution that doesn't have a story for polymorphism a real solution).

@ndmitchell
Copy link
Contributor

None of the steps seem unreasonable, but I'm a little concerned we've invented a big tower of type classes, without anyone saying they actually intend to use them. There are lens people, who don't want to use them because they don't express prisms. There is @effectfully who seems reluctant because of polymorphism. The two variant doesn't seem unreasonable, and would be used by @tysonzero. The hasField partial efficient update isn't going to be exposed by the record proposal as it stands - do we imagine someone else is going to use it? What about the Update thing?

I'm concerned about making something complex, people only using the simpler variant, but those people who implement or use the simpler variant having to grapple with additional complexity. I'd really like to hear from someone saying "I would actually use UpdateField in my code all the time."

@tysonzero
Copy link
Author

tysonzero commented Nov 7, 2019

@ndmitchell

It's unfortunate that Haskell (along with every other mainstream language) makes it so hard to make changes like this in a fully backwards compatible manner.

I would prefer to just start out with the type family stuff and Get + Get => Set stuff, but I understand if @effectfully would be skeptical due to the possibility that it stops us from going further due to backwards compatibility.

The . notation is so nice that I want to be able to use it in as many appropriate situations as possible, so getField being as general as possible is of high importance to me. Similarly I wish number desugaring has it's own class separate from the other operations, and very much like that string desugaring does have its own class.

@ndmitchell
Copy link
Contributor

@tysonzero it seems a sad argument that we must go for maximum generality, not because we think it's useful, but because we can't prove its not useful and we only get one go. (Not saying that's a counter argument, merely that it seems unfortunate.)

@tysonzero
Copy link
Author

@ndmitchell

Honestly I really don't have a strong opinion on how much further to break up the hierarchy, or if the one I proposed is good enough. I adjusted the proposal to use a type family to make things a little nicer and to make future extensible / adjustments cleaner.

What is the best thing to do at this point? Could I defer the decision about this to the committee, or perhaps we can shoot for the proposal as is and decide later whether or not to further split things up?

@adamgundry
Copy link
Contributor

I adjusted the proposal to use a type family to make things a little nicer and to make future extensible / adjustments cleaner.

In general, I feel that the proposal could do with more motivation or justification for the proposed design, and explanation of the trade-offs involved. Using a type family instead of a functional dependency is not a trivial change, and merits justification in the text. In particular see the discussion starting here on the previous proposal for an exploration of the issues involved, which resulted in a decision to keep using functional dependencies. It's conceivable that the decision should be changed, but at the very least the proposal should be explicit about what is being changed and why.

This change is also more-or-less orthogonal to splitting HasField into two classes. While I'm not sure making yet another separate proposal would really help matters, it should at least be clear that the committee could reasonably accept one change without the other.

@tysonzero
Copy link
Author

tysonzero commented Dec 8, 2019

Using a type family instead of a functional dependency is not a trivial change, and merits justification in the text.

Fair enough!

Part of the motivation is that it allows library code to re-use this notion of the type of a field without having to declare GetField as a superclass. This way if someone wants to support something like IO-based fields or Maybe-based fields, they can still re-use FieldType.

The other motivation is that when playing around I have found FunctionalDependencies quite annoying and weird to work with, particularly when using more interesting field types.

Something quite uncontroversial and clearly terminating in TypeFamily form like:

type instance FieldType x (Lenses r f) = LensLike' f r (FieldType x r)

becomes the following:

instance GetField x r a => GetField x (Lenses r f) (LensLike' f r a)

Which requires TypeSynonymInstances, FlexibleInstances and UndecidableInstances, despite the fact that it's really not trying to encode anything particularly fancy. It's also not obvious at a glance that this code is reasonable to write, as it would be extremely unreasonable to write were it not for the FunctionalDependency.

The general rule of "instance heads (type family or class) should be simple and non-overlapping, outputs of type families and instance constraints can be complicated and overlap" no longer applies.

As I brought up in #279 FlexibleInstances can lead to incoherence without warning even with -Worphan enabled, due to the complexity of defining "ownership" of complicated instance heads. So I am a big fan of changes that reduce the complexity of instance heads, and push it elsewhere into safer locations (like the output of a TypeFamily).

@Jashweii
Copy link

Jashweii commented Apr 5, 2020

I don't know if this is a new idea, but as an alternative, we can get rid of SetField completely, and use GetField as SetField, by using a wrapper type whose virtual fields are partially applied setters.
It would need some sort of convention for obtaining the wrapper, e.g. a special (but not magic) getField parameter. For generic set, you constrain that it has this wrapper virtual field, and that the wrapper virtual field has the setter you want. You can write read only fields and set only fields.
If we could have polymorphic functions on the RHS of fundeps, you could use it for polymorphic update as well without changing the class.

-- OverloadedLabels, fromLabel = getField for comments
data Person a = Person String Int a deriving (Show)
instance HasField "name" (Person a) String where getField (Person s _ _) = s
instance HasField "age"  (Person a) Int    where getField (Person _ i _) = i
instance HasField "data" (Person a) a      where getField (Person _ _ d) = d
-- #name p
newtype Person_With a = Person_With (Person a)
instance HasField "name" (Person_With a) (String -> Person a) where getField (Person_With (Person _ i d)) s = Person s i d
instance HasField "age"  (Person_With a) (Int    -> Person a) where getField (Person_With (Person s _ d)) i = Person s i d
instance HasField "data" (Person_With a) (a      -> Person a) where getField (Person_With (Person s i _)) d = Person s i d
-- #name (Person_With p) "new name"
instance HasField "with" (Person a) (Person_With a) where getField = Person_With
-- #name (#with p) "new name"
-- instance HasField "data" (Person_With a) (forall b . b -> Person b) where getField (Person_With (Person s i _)) d = Person s i d
-- -- what we really want

The same idea could be used with other things, e.g. field names, metadata, lenses, traversals, whatever. Edit - this would also work:

data Setter a
instance HasField (Setter "name") (Person a) (String -> Person a) where getField (Person _ i d) s = Person s i d
-- Setter but you know the type you're supplying
data Setter' a b
instance HasField (Setter' "data" b) (Person a) (b -> Person b) where getField (Person s i _) d = Person s i d
-- getField @(Setter "name")
-- setField @x = getField @(Setter x)
-- setField' @x (y :: a) r = getField @(Setter' x a) r y

@adamgundry
Copy link
Contributor

@tysonzero what's the status of this proposal?

The current plan for HasField has run into some compile-time performance problems (https://gitlab.haskell.org/ghc/ghc/-/merge_requests/3257#note_312597). Partly for performance reasons, @arybczak and I have been discussing splitting it into two classes roughly as in this proposal, although my inclination would be to go for functional dependencies over type families, and independent single-method classes for getting and setting. I could draft a new proposal for this, or would you be interested in us somehow reviving the current proposal?

@tysonzero
Copy link
Author

tysonzero commented Nov 14, 2020

I don't really have a strong stance on whether to keep it as is or to make GetField and SetField independent. Besides that I support this proposal as-is. So I guess the status is "waiting on someone with more authority than me to approve it or suggest changes".

I still would much prefer TypeFamilies. The LensLike example I gave isn't a contrived one, it's one I was planning on using in production. I also stand by my statement that instance heads should be simple and clearly non-overlapping, which the FD approach makes impossible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants