Add GHC.Variants module to mirror GHC.Records #510

Open
wants to merge 2 commits into
base: master

tysonzero

@tysonzero tysonzero commented May 24, 2022

Recently the GHC.Records module was introduced for interacting with records without
the problem of field name conflicts.

Since Variants and Records are duals, it is natural to have a matching GHC.Variants
module for interacting with variants without the problem of constructor name conflicts.

Rendered

@tbidne

tbidne commented May 24, 2022

As mentioned this prevents #foo from being used as a field selector, with .foo being the preferred alternative.

Just to be clear, we would not be able to say both view #runIdentity (Identity 7) and preview #Just (Just True) in the same module?

I ask because I am a big fan of optics + OverloadedLabels + DuplicateRecordFields to solve the problem of identical field names, and I have long wanted the equivalent for constructor names. From your statement, it seems we cannot have both at the same time?

I understand if there is a fundamental limitation here. I just want to make sure I am not missing something.

@tysonzero
Author

@tbidne

This proposal only affects the (->) instance of IsLabel, so it does not shut the door on that possibility.

If you want to re-use the existing auto-derived instance instead of declaring any yourself, then you will have to come up with some strategy for delegating between GetField and HasConstructor. One option could be having the instance used depend on whether the first character is uppercase or lowercase.

Personally I would prefer to go in a different direction that would give us view .runIdentity (Identity 7) and preview #Just (Just True):

  • Have .foo/(.foo) resolve to a class equivalent to IsLabel instead of being hardcoded to (\x -> x.foo)

  • Have the (->) instance of this new class resolve to GetField for full backwards compatibility.

  • Allow naked fields without parentheses.

@goldfirere
Contributor

Interesting proposal. One aspect I like about this is that it uses a type family to retrieve the constructor type, instead of relying on a functional dependency.

I'm not convinced about the IsLabel instance -- seems to squat in too contentious a space. For example, it's easy to imagine an author of some embedded DSL to want to use a similar instance for different purposes. Furthermore, I don't see concretely what it gains, given that #Left could always just be written as Left instead. (Actually, it's worse than that: what if two Lefts are in scope?)

@phadej
Contributor

phadej commented May 25, 2022

The OverloadedLabels parser doesn't allow labels that start with an uppercase letter.

% ghci-9.2.2
GHCi, version 9.2.2: https://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from /home/phadej/.ghci
Prelude> :set -XOverloadedLabels 
Prelude> :t #Foo

<interactive>:1:1: error: parse error on input ‘#’

Relaxing this would make sense, and probably can be justified standalone (i.e. optics or lens could benefit from it today), i.e. in a separate proposal.

@phadej
Contributor

phadej commented May 25, 2022

I think this proposal is a step in the wrong direction.

Look at LabelOptic class, for example, its monomorphic version for simplicity:

class LabelOptic name k s a | name s -> k a where
   ...

Here the structure s and the name together determine what kind of optic k the name refers to.
It can be a lens (HasField), a prism (variant), just a getter, etc.

I don't see any use case for allowing the same name to act as field and constructor,
as this proposal would allow. (Also GHC Haskell tries to get rid of other kinds of name punning).

So I'd rather prefer seeing a proposal extending HasField representation
to allow getters, lenses, prisms, affine traversals etc structures.
(HasField won't be a good name if prisms are included, but that's a small thing).

@tysonzero
Author

tysonzero commented May 25, 2022

@goldfirere

Personally I like TypeFamilies more than FunDeps in general, hence why I'm trying to make HasField/GetField also be TypeFamily based, justified by my comment here.

As for the IsLabel instance, I actually think this would be nice for EDSL developers, as now they get constructor instances for free, but this proposal doesn't stop them from writing their own instances for HasConstructor S and such to get the exact same thing they got before with IsLabel (A -> S). The only case that this would get in the way of is if they created a type with a Foo constructor but wanted a "Foo" instance that didn't call Foo. This is the exact same as the situation with .foo from OverloadedRecordDot.

As for two Lefts being in scope, that's actually one of the primary things this proposal is addressing. The following would "just work":

foo :: Either Int Bool
foo = #Left 3

bar :: Either3 Int Bool Char
bar = #Left 5

Whereas before you'd need Left3 or similar.
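
A minimal sketch of how the two definitions above could typecheck, assuming a type-family-based `HasConstructor` class in the spirit of the proposal. The class, the `ConstructorType` family, the `Either3` type and all instances here are hand-written stand-ins, not the proposal's actual definitions, and `construct @"Left"` is used in place of `#Left` because uppercase labels do not parse on the GHC version shown earlier in the thread.

```haskell
{-# LANGUAGE AllowAmbiguousTypes   #-}
{-# LANGUAGE DataKinds             #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE KindSignatures        #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeApplications      #-}
{-# LANGUAGE TypeFamilies          #-}

import GHC.TypeLits (Symbol)

-- Hypothetical stand-ins for the proposal's type family and class.
type family ConstructorType (x :: Symbol) s

class HasConstructor (x :: Symbol) s where
  construct :: ConstructorType x s -> s

-- A three-way sum; today its constructors need a suffix to avoid clashes.
data Either3 a b c = Left3 a | Middle3 b | Right3 c
  deriving (Eq, Show)

-- Hand-written instances; the proposal would have GHC solve these.
type instance ConstructorType "Left" (Either a b) = a
instance HasConstructor "Left" (Either a b) where
  construct = Left

type instance ConstructorType "Left" (Either3 a b c) = a
instance HasConstructor "Left" (Either3 a b c) where
  construct = Left3

-- The result type selects the instance, so both uses of "Left" coexist.
foo :: Either Int Bool
foo = construct @"Left" 3

bar :: Either3 Int Bool Char
bar = construct @"Left" 5
```

The type signature on each binding is what disambiguates the constructor; without it, `construct @"Left" 3` alone would be ambiguous.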

For the specific case of things like Just and Left I doubt people would choose to use the new syntax, since they are such common concepts that they deserve a top level name, and lens and similar already generate prisms for you.

However if you look at the situations mentioned in the motivation section (or just at our company's codebase lol), there are a whole bunch of variant/enum-y types that would normally have short constructor/enum names (data Medium = Phone | Email | ...) but end up with ugly longer ones (data Medium = MediumPhone | MediumEmail | ...) to avoid colliding, and in the process usually uglifying their serialization if you don't manually define the instances.

@tysonzero
Author

@phadej

I'll copy my comment from here weighing those two options:


It seems clear to me that overloaded product-types / fields / lenses are useful, it also seems clear to me that overloaded sum-types / constructors / prisms are useful.

From there we have at least two choices:

Unified hierarchy

In this case we have a single type family that is used for both constructors and fields:

type family OpticType (x :: Symbol) s

type instance OpticType "_1" (a, b) = a
type instance OpticType "_2" (a, b) = b
type instance OpticType "just" (Maybe a) = a
type instance OpticType "left" (Either a b) = a
type instance OpticType "right" (Either a b) = b

class GetField (x :: k) s where
    getField :: s -> OpticType x s

class CallConstructor (x :: k) s where
    construct :: OpticType x s -> s

One advantage of this is that it is more unified. Constructors are reversed fields and vice versa. So x.foo goes in one direction, and #foo x goes in the other direction.

However the naming might be hard when something is both a field and a constructor (newtypes/isomorphisms), such as x.identity and #identity x vs x.runIdentity and #runIdentity x.

Stratified hierarchy

In this case we have two type families, one for constructors and one for fields:

type family FieldType (x :: Symbol) s

type instance FieldType "_1" (a, b) = a
type instance FieldType "_2" (a, b) = b

type family ConstructorType (x :: Symbol) s

type instance ConstructorType "just" (Maybe a) = a
type instance ConstructorType "left" (Either a b) = a
type instance ConstructorType "right" (Either a b) = b

class GetField (x :: k) s where
    getField :: s -> FieldType x s

class CallConstructor (x :: k) s where
    construct :: ConstructorType x s -> s

One advantage of this is it allows for variants to define passthrough instances for fields, which would otherwise cause problems with overlapping type families:

There is the following style of set-only fields that only set when the constructor matches:

instance SetField x a => SetField x (Maybe a)

There is also the following style with get-and-set fields that work when multiple constructors share a field:

data Cat = Cat
    { name :: String
    , ...
    }

data Dog = Dog
    { name :: String
    , ...
    }

data Animal = Cat Cat | Dog Dog

instance GetField "name" Animal

instance (GetField x a, GetField x b) => GetField x (Either a b)

You could also do the dual of the above (records that define pass through instances for constructors), although it doesn't seem as useful.


Both of these options have significant merit, and it's not a trivial decision.

I probably lean towards stratified, since most will probably consider the "foo field" and the "foo constructor" of a type to be different, and this is really all about naming and namespacing. Plus we have two different syntaxes for the two things, so two different hierarchies seems to fit pretty seamlessly into that, .foo for field stuff and #foo for constructor stuff.

I would classify current Haskell as stratified, due to all constructors having an uppercase first letter and all fields having a lowercase first letter.

@phadej
Contributor

phadej commented May 25, 2022

My previous comment was misunderstood, but I don't know how to clarify. Feel free to ignore.

The only clue I can add: think about setters. Both prisms (constructors) and lenses (fields) have an update operation. How, in either of the approaches you propose, could GHC pick which class to use (instance resolution doesn't backtrack)?

update :: (??? name s a) => (a -> a) -> s -> s

@tysonzero
Author

tysonzero commented May 25, 2022

@phadej

In the case of the stratified hierarchy you wouldn't have update, you'd have updateField and updateConstructor. Now you could always manually define SetField instances for constructors and call updateField on them, but it wouldn't be automatic. You could also create a third party HasUpdate class and define helpers for converting from SetField and SetConstructor to it.

The benefit you get from not allowing such an update, is that you can easily get things like the following:

updateField @"name" ("Dr " <>) (Just $ Person "bob") = Just $ Person "Dr bob"

getField @"name" (Left $ Person "bob" :: Either Person Group) = "bob"

Which you will not get without a manual instance in the unified hierarchy case, and I assume not in the case of whatever you had in mind (that seemingly isn't the same as the unified hierarchy?)

You also avoid confusing things like @"runIdentity" and @"Identity" being literally the same, despite foo.Identity and #runIdentity # 5 both looking very out of place.


As for the unified hierarchy that function is easy to give a type sig for:

update :: HasSetter name s => (OpticType name s -> OpticType name s) -> s -> s

I realize I used GetField and CallConstructor above so the name HasSetter may seem out of place, I'd probably go with HasGetter and HasConstructor instead in the unified hierarchy case.

You would also get instances of HasSetter freely for both fields and constructors, since they would never collide by virtue of the casing of the first character.

@phadej
Contributor

phadej commented May 25, 2022

How

getField @"name" (Left $ Person "bob" :: Either Person Group) = "bob"

work? Is there some manual HasField instance for Either a b? Requiring that is an intrusive change.
In lens / optics world we'd use a combinator in between, i.e. view (both1 % #name) (Left $ Person "bob" :: Either Person Group) and it would work.

I'm still not convinced that what you propose is in the right direction.

@phadej
Contributor

phadej commented May 25, 2022

I also think that open type families are a misfeature. If the choice is made to move to use TFs, they should be in a type-class. In stratified variant that is straightforward, in unified one there should be a superclass.

If the constrained type families paper were implemented in GHC, the type error for an undefined instance would mention the non-existent instance, and not a stuck ConstructorType inequality somewhere.

This is ATM my argument against type families: open type families should always be constrained, and GHC should be taught to report the unresolved instance and not mention the stuck type family. Otherwise the errors are terrible.

@tysonzero
Author

tysonzero commented May 25, 2022

@phadej

Yeah it would involve a manual Either a b instance, which is of course not in the proposal itself. It was more just highlighting that such a thing was possible with stratified, while fully ruled out with unified.
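
A rough sketch of what such a manual pass-through instance could look like under the stratified design. `FieldType` and `GetField` are hand-written stand-ins for the classes discussed above, and the record selectors `personName` and `groupName` are hypothetical names chosen for illustration. `UndecidableInstances` is enabled defensively because the instance context mentions a type family.

```haskell
{-# LANGUAGE AllowAmbiguousTypes   #-}
{-# LANGUAGE DataKinds             #-}
{-# LANGUAGE FlexibleContexts      #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE KindSignatures        #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables   #-}
{-# LANGUAGE TypeApplications      #-}
{-# LANGUAGE TypeFamilies          #-}
{-# LANGUAGE TypeOperators         #-}
{-# LANGUAGE UndecidableInstances  #-}

import GHC.TypeLits (Symbol)

-- Stand-ins for the stratified field hierarchy sketched in the thread.
type family FieldType (x :: Symbol) s

class GetField (x :: Symbol) s where
  getField :: s -> FieldType x s

-- Hypothetical record types; the selector names are illustrative only.
newtype Person = Person { personName :: String }
newtype Group  = Group  { groupName  :: String }

type instance FieldType "name" Person = String
instance GetField "name" Person where
  getField = personName

type instance FieldType "name" Group = String
instance GetField "name" Group where
  getField = groupName

-- Pass-through: a field of `Either a b` is the field shared by both sides.
type instance FieldType x (Either a b) = FieldType x a

instance (GetField x a, GetField x b, FieldType x a ~ FieldType x b)
      => GetField x (Either a b) where
  getField (Left a)  = getField @x a
  getField (Right b) = getField @x b

bobName :: String
bobName = getField @"name" (Left (Person "bob") :: Either Person Group)

staffName :: String
staffName = getField @"name" (Right (Group "staff") :: Either Person Group)
```

In a unified family, a catch-all instance like `FieldType x (Either a b)` would overlap with entries such as `"left"` for `Either`, which is the conflict the stratified design avoids.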

To be honest I'm flipping back and forth a bit on the unified vs stratified. Although I'd absolutely take either over nothing.

The Identity/runIdentity (newtypes in general that is) example is weirding me out but I guess you could resolve it mentally by thinking of the following four situations:

x.foo -- calling the foo field
x.Foo -- going under the Foo constructor
#foo x -- reversing the foo field
#Foo x -- calling the Foo constructor

My motivation on the stratified side is the idea that I was assuming people mentally think of constructors and fields in different namespaces, with different syntax .foo vs #Foo to highlight that difference in namespace. However with the above mental model I could be sold on going the unified route.

I don't have a super strong preference on associated vs unassociated open type families, I just really don't like FunDeps because they make instance heads way too complex and require UndecidableInstances in perfectly reasonable situations (that I want to use in prod, not contrived stuff). So I guess I'm curious what others think.

Were you thinking to more or less replicate a good chunk of the lens hierarchy with the unified route? That's more or less what I was thinking, although with some mostly-naming tweaks to make it more appropriate, for example I'd prefer HasConstructor or similar to HasReview, and probably something different than HasFold for a MatchConstructor equivalent.

@adamgundry
Contributor

Thanks for the proposal and for prompting this discussion @tysonzero!

I've been thinking about HasField on and off for a while, and gradually working on a very long and very drafty proposal. My current position, summarised:

  • The definition of HasField should mostly remain as-is; new features should be in separate classes.
  • Getters and setters should be orthogonal (i.e. no superclasses).
  • The existing restrictions on which instances of HasField are allowed to be user-defined should be relaxed (@ocharles and I have been working on a proposal to this effect). A consequence of this is that HasField can in general admit incoherent instances.
  • HasField should support record pattern synonyms directly (at least for fields appearing in a single pattern synonym; fields in multiple pattern synonyms are trickier). This also requires incoherence.
  • We should make SetField permit type-changing update, using "dysfunctional" dependencies (see DYSFUNCTIONAL per-instance pragma for selective lifting of the coverage condition #374).
  • We should use functional dependencies over type families because they give simpler types/errors, and they allow the previous points (incoherent instances and dysfunctional dependencies) without the wheel falling off. With type families it is unsound to admit these features. We need type inference hints; we don't need a type family determining the one true type of a given field in a given record.
  • A potential small additional benefit is that with functional dependencies we can support unlifted fields fairly cheaply; with type families the cost is much greater.
  • I don't have a clear position on syntax for overloaded updates/prisms, and am increasingly wondering whether OverloadedRecordUpdate is really the right approach (Syntactic distinction for OverloadedRecordUpdate? #506). I'm primarily motivated by adding classes that would be useful for optics.
  • I'm not very keen on a new IsLabel instance; using an explicit operator to apply an overloaded constructor seems preferable.
  • OverloadedLabels should support uppercase labels using the existing IsLabel class.

I don't yet feel that overloaded constructors are as compelling as overloaded fields, given that we don't have an equivalent of DuplicateRecordFields for constructors, so you can always disambiguate using the module system. I'd be a bit worried that heavy use of overloaded constructors would lead to poor type inference.

Prompted by this proposal, I sketched out the following design:

  class HasField x r (a :: TYPE l) | x r -> a l where
    getField :: r -> a

  class HasSetter x s t (a :: TYPE l) (b :: TYPE l) | x t -> s b l, x s -> t a l where
    setter :: (a -> b) -> s -> t

  class HasConstructor x t (b :: TYPE l) | x t -> b l where
    construct :: b -> t

  class HasMatcher x s t a | x s -> t a where
    match :: s -> Either t a -- Or we could use (# t | a #) if we wanted to allow unlifted a...

The idea is that GHC would solve HasField and HasSetter constraints for record fields, and HasSetter, HasConstructor and HasMatcher constraints for data constructors. Please critique at your leisure. 😄

@phadej
Contributor

phadej commented May 25, 2022

@adamgundry Is it on purpose that the classes are not related? I can see a motivation, but this can be a source of incoherence: e.g. HasMatcher and HasConstructor can do different things, and HasConstructor + HasMatcher can do updates differently from HasSetter.

If you are going to propose that, I'd expect a rationale for it. (I think it's OK if only GHC solves these instances, but if users are allowed to write them as well, that will be a potential source of mistakes.)

Note: that HasConstructor and HasField are essentially the same class, if there aren't any laws tying the hierarchy together. That said, I'm not a fan of laws referring to classes which are not part of the formal hierarchy (e.g. Enum referring to Bounded methods).
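
To make the concern concrete, here is a simplified sketch of how an update can be derived from a matcher plus a constructor, independently of any HasSetter instance. It drops the levity polymorphism from the sketched design and restricts the matcher to the type-preserving case; `overConstructor` and the `Maybe` instances are hypothetical, written by hand for illustration.

```haskell
{-# LANGUAGE AllowAmbiguousTypes    #-}
{-# LANGUAGE DataKinds              #-}
{-# LANGUAGE FlexibleContexts       #-}
{-# LANGUAGE FlexibleInstances      #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE KindSignatures         #-}
{-# LANGUAGE ScopedTypeVariables    #-}
{-# LANGUAGE TypeApplications       #-}

import GHC.TypeLits (Symbol)

-- Simplified, lifted-only versions of two of the sketched classes.
class HasConstructor (x :: Symbol) t b | x t -> b where
  construct :: b -> t

class HasMatcher (x :: Symbol) s t a | x s -> t a where
  match :: s -> Either t a

-- Hand-written instances for Maybe's Just constructor.
instance HasConstructor "Just" (Maybe b) b where
  construct = Just

instance HasMatcher "Just" (Maybe a) (Maybe a) a where
  match (Just a) = Right a
  match Nothing  = Left Nothing

-- An update built only from match and construct (hypothetical helper).
-- Nothing ties its behaviour to a separately defined setter.
overConstructor
  :: forall x s a. (HasMatcher x s s a, HasConstructor x s a)
  => (a -> a) -> s -> s
overConstructor f s = either id (construct @x . f) (match @x s)

bumped :: Maybe Int
bumped = overConstructor @"Just" (+ 1) (Just 1)

untouched :: Maybe Int
untouched = overConstructor @"Just" (+ 1) Nothing
```

Since the classes share no superclass, a user-written setter for the same name could behave differently from `overConstructor`, and only documented laws would rule that out.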

@tysonzero
Author

@phadej

Note that HasConstructor and HasGetter need to be different classes because the type inference goes in opposite directions:

getField @"foo" (x :: MyType) :: _ -- will infer
getField @"foo" (x :: _) :: MyType -- ambiguous

construct @"foo" (x :: MyType) :: _ -- ambiguous
construct @"foo" (x :: _) :: MyType -- will infer

The type variables on the input of the TF / left hand of the FD are swapped.
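
A small sketch of the two inference directions, written with functional dependencies. The class names follow the thread, but the definitions, the `Wrapper` type and its instances are assumptions made for illustration.

```haskell
{-# LANGUAGE AllowAmbiguousTypes    #-}
{-# LANGUAGE DataKinds              #-}
{-# LANGUAGE FlexibleInstances      #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE KindSignatures         #-}
{-# LANGUAGE TypeApplications       #-}

import GHC.TypeLits (Symbol)

-- Name and record type determine the field type.
class HasGetter (x :: Symbol) r a | x r -> a where
  getField :: r -> a

-- Name and result type determine the argument type.
class HasConstructor (x :: Symbol) t b | x t -> b where
  construct :: b -> t

-- Hypothetical newtype used only for this example.
newtype Wrapper = Wrapper { unwrap :: Int }
  deriving (Eq, Show)

instance HasGetter "unwrap" Wrapper Int where
  getField = unwrap

instance HasConstructor "Wrapper" Wrapper Int where
  construct = Wrapper

-- Input known, output inferred: `show` is resolved at Int via the fundep.
shown :: String
shown = show (getField @"unwrap" (Wrapper 5))

-- Output known, input inferred: `read` is resolved at Int via the fundep.
built :: Wrapper
built = construct @"Wrapper" (read "5")
```

Merging the two classes would force a single dependency direction, so one of these two definitions would become ambiguous.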

@adamgundry
Contributor

I think HasField and HasSetter should not be connected by a superclass; I don't have a clear opinion yet on the others. They should have laws though; what's the downside of laws connecting classes not in a hierarchy?

In general, I am inclined to worry less about the case of users defining their own instances, which I see as rare and for users who know what they are doing. That's another reason I favour fundep magic over type family magic: the complexity in defining the instances isn't usually visible, while the simplification affects use sites.

@AntC2
Contributor

AntC2 commented May 25, 2022

Sorry, as written this makes no sense to me. Do I need to read a load of other proposals first? You've linked to some, but they're mostly still in motion.

In particular, where does the # prefix come from, and what does it do/mean? How do these twoFields differ, and why would I want to use the more verbose one?

twoFields = #TwoFields (2, True)
twoFields = TwoFields 2 True

(Perhaps change this Proposal to status Draft until it is more coherently stand-alone?)

@tysonzero
Author

tysonzero commented May 25, 2022

@AntC2

The # prefix is OverloadedLabels, it's been a part of GHC for several years.

The two twoFields differ because the second fails in the presence of overlapping constructor names, whereas the former works just fine, assuming it is not used in a type-ambiguous way. It's basically an exact mirror of the OverloadedRecordDot situation.

@tysonzero
Author

tysonzero commented May 25, 2022

@adamgundry

All this incoherence stuff is making me nervous, does this mean that doing anything Set-like where you rely on instance coherence for correctness would be discouraged for these classes?

It makes me particularly nervous because FlexibleInstances is incoherent right now, with it being very easy to create broken Sets without a single Incoherent or even Overlapping pragma, even with -Worphans on (and no warnings), and few people seem particularly concerned, see: #279.


All the FD stuff I'll have to look into more, I don't know enough to have an opinion at this time. My one question would be if all that stuff enables me to do:

data Lenses r f = Lenses
l = Lenses
instance SetField x r a => GetField x (Lenses r f) (LensLike' f r a)
person & l.name .~ "foo"

Without UndecidableInstances?

Ideally in a way such that accidentally flipping the last two GetField arguments fails to compile, given how unreasonable it is.


I'm not very keen on a new IsLabel instance; using an explicit operator to apply an overloaded constructor seems preferable.

Part of the reason I like this is it finally gives a non-overlapping non-incoherent instance for ->, as otherwise it's just sitting there as a trap begging people to use it in overlapping/incoherent ways, as library devs have. If I define IsLabel x (a -> MyType) and someone else defines IsLabel x (TheirType -> b) then we are in trouble. With this approach, my instance just needs to be changed to a HasConstructor instance and theirs should use OverloadedRecordDot instead.

There are really two key forms of type directed name resolution in Haskell right now, dots and labels, dots have field-like things handled, which leaves labels for constructor-like things. That way we don't need yet another TDNR syntax.
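
One way such a single `(->)` instance could be written is sketched below. This is an assumption about its shape, not the proposal's actual text: the class, the type family and the `Medium` type are hand-written stand-ins, the instance is an orphan in this standalone form, and lowercase labels are used so the example parses on GHC versions that reject `#Foo`.

```haskell
{-# LANGUAGE AllowAmbiguousTypes   #-}
{-# LANGUAGE DataKinds             #-}
{-# LANGUAGE FlexibleContexts      #-}
{-# LANGUAGE FlexibleInstances     #-}
{-# LANGUAGE KindSignatures        #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedLabels      #-}
{-# LANGUAGE ScopedTypeVariables   #-}
{-# LANGUAGE TypeApplications      #-}
{-# LANGUAGE TypeFamilies          #-}
{-# LANGUAGE TypeOperators         #-}
{-# LANGUAGE UndecidableInstances  #-}

import GHC.OverloadedLabels (IsLabel (..))
import GHC.TypeLits (Symbol)

-- Hypothetical stand-ins for the proposal's type family and class.
type family ConstructorType (x :: Symbol) s

class HasConstructor (x :: Symbol) s where
  construct :: ConstructorType x s -> s

-- The one instance for functions: every label at a function type
-- delegates to HasConstructor on the result type.
instance (HasConstructor x b, a ~ ConstructorType x b) => IsLabel x (a -> b) where
  fromLabel = construct @x

-- Illustrative type with the short constructor names from the thread.
data Medium = Phone String | Email String
  deriving (Eq, Show)

type instance ConstructorType "phone" Medium = String
instance HasConstructor "phone" Medium where
  construct = Phone

type instance ConstructorType "left" (Either a b) = a
instance HasConstructor "left" (Either a b) where
  construct = Left

viaPhone :: Medium
viaPhone = #phone "555-0100"

leftInt :: Either Int Bool
leftInt = #left 3
```

Because the instance head is the fully general `a -> b`, any other `IsLabel` instance at a function type would overlap with it, which is the squatting concern raised earlier in the thread.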


I don't yet feel that overloaded constructors are as compelling as overloaded fields, given that we don't have an equivalent of DuplicateRecordFields for constructors, so you can always disambiguate using the module system. I'd be a bit worried that heavy use of overloaded constructors would lead to poor type inference.

I'd take a look at the linked stripe code for a decent motivating example. It's the most heinous thing the world has ever seen, and it's truly embarrassing to have it in our codebase when onboarding new devs and trying to convince them that Haskell is better than what they're used to.

This is half about aesthetics (the worst looking thing in our codebase right now is the prefixes from lack of dot-syntax, the second worst looking thing in our codebase right now is the prefixes from lack of something like this), and half about stock/generic generated instances (since I don't want those prefixes in the DB or in our publicly-available JSON APIs or binary serializations or similar).

We have a ton of types like data PaymentMethod = PaymentMethodCard (..) | PaymentMethodBank (..) | ... and data Region = RegionCity (..) | RegionCounty (..) | RegionState (..) | ... that we'd love to drop the prefixes on but can't, which also means we can't use the generically-derived instances.

Using qualified imports doesn't work super well because there are a variety of helper re-exporting modules to avoid lots of import noise, so we'd then have to define a bunch of constructor aliases in that re-exporting module with the prefixes added, which is possible but again gets back to the whole verbosity/ugliness thing.


Also small thing but if we're doing HasSetter can we do HasGetter instead of HasField for consistency?

HasField could perhaps just be a class with superclasses HasGetter and HasSetter and a single instance that applies to all types with both.

@AntC2
Contributor

AntC2 commented May 26, 2022

The # prefix is OverloadedLabels, it's been a part of GHC for several years.

It's only used currently for record fields, which must be lower-case. I was rather hoping now we have NoFieldSelectors I could in future look to upper-case record labels. (Because I think of record labels as being more like constructors than functions.)

So I think the Proposal should say you're blocking off other opportunities. It does seem weird to have # sometimes heralding 'here comes a component-accessor' and other times heralding 'here comes a structure'.

Can I partially apply a hash-prefixed constructor?

oneOfTwoFields :: Bool -> MyType
oneOfTwoFields = #TwoFields 5

If not, are you offering some other way to disambiguate multiply-defined constructors, so they can appear in all the other use-cases for constructors?

The two twoFields differ because ...

Yeah I'd guessed something like that. But the proposal doesn't actually say so. Having to put the fields inside a tuple is clunky. What if I want record syntax for my data constructor (and to use labels which might be same-named as other types) -- esp where there are heaps of fields of the same type:

data MyType
    = NoFields
    | OneField Char
    | TwoFields { theInt :: Int, theBool :: Bool }

twoOrOneFields = #TwoFields { theInt = 5 }                   -- ??

And does that create twoOrOneFields with theBool uninitialised, or partially applied?

@AntC2
Contributor

AntC2 commented May 26, 2022

We have a ton of types like data PaymentMethod = PaymentMethodCard (..) | PaymentMethodBank (..) | ...

For the record (hah!) the reason this is already a mess in Haskell is lack of support for 'Row types' -- see for example the Introduction here, recently, and all its citations to the theory. (A Haskell with the beginnings of Row types was available from 1996.) Row types don't have 'private' labels/all labels are global; the label name alone doesn't tell anything about the field's type or the structure it's a field within. From a Row type you produce either a Sum aka Variant or Product aka anonymous Record.

So your claim "Since Variants and Records are duals, ..." is true of Row types, not of Haskell ADTs. I fear this Proposal is making the already-mess into more of a mess, because it has to wrangle the existing namespace disciplines. Consider a hypothetical

type PaymentMethod = Var{Card :: ..., Bank :: ..., ... }   -- the braces { ... } has type-of-type Row
                                                           -- Var(iant) is a type constructor :: Row -> Type  -- that is, produces a Sum type
                                                           -- note upper-case labels are allowed
data Customer = MkCust{ ..., payMeth :: PaymentMethod, ... }

myCust = MkCust{ ..., payMeth = {Card = ...}, ... }        -- no constructor prefixing the {Card = ...},
                                                           -- so this is fixing which Variant

chargeTo MkCust{ ..., payMeth = {Card = cardDetails}, ...} = ...

(At the point of parsing {Card = ...}, the compiler can't tell whether that's constructing a Var or a singleton Rec. The distinction doesn't matter: type inference from payMeth = will resolve.)

That example hasn't gone so far as making Customer a (Product) type synonym for Rec{ ..., payMeth :: PaymentMethod, ... }. That would be appropriate, presuming there's only one constructor.

@tysonzero
Author

tysonzero commented May 26, 2022

@AntC2

It's only used currently for record fields

I don't really agree with that. In fact using it directly for record fields isn't even possible without defining an orphan instance for -> that will collide with other (also orphan) instances that exist in the wild, such as in named. You could use it indirectly for record fields via a newtype such as Optic.

Going forward I imagine it will be used little for record fields outside of perhaps the Optic case, due to RecordDotSyntax giving much nicer syntax.

So I think the Proposal should say you're blocking off other opportunities.

Most of the proposal is adding a new module and generation for instances of a class in that module. The only potential opportunity being taken off the table is having the IsLabel x (a -> b) instance go in the other direction, in other words having x a -> b rather than x b -> a. Given that OverloadedRecordDot already gives you exactly that same functionality I did not consider that a non-trivial downside.

It does seem weird to have # sometimes heralding 'here comes a component-accessor' and other times heralding 'here comes a structure'.

As per above it's currently not possible for it to mean "component-accessor" without dangerous orphans, you can have it be an optic/lens, but you can't have it be a component-accessor.

Yeah I'd guessed something like that. But the proposal doesn't actually say so.

I mean the motivation talks about naming collisions in just about every paragraph, and the idea that collisions aren't a problem is a direct consequence of the classes specified, but I guess I could have added an extra statement or two making that extra clear.

I do more or less assume a fair amount of familiarity with OverloadedRecordFields and OverloadedRecordDot, since they are both fully implemented and the former is linked at the top, and the latter is mentioned.

[Non-trivial constructor uses]

#Foo just becomes a simple HasConstructor "foo" x => ConstructorType "foo" x -> x, so it's limited to the things you can do with something of that type signature. There is no special magic treating it as a record constructor or anything like that.

Note that once rows and anonymous records are a thing, the record examples can be made to work just fine, as a record within a variant would just be an anonymous record instead of the weird magic thing it is now.

I currently am not really prioritizing working with records in variants as they are largely a wart in most current Haskell, with them generally generating partial functions outside of NoFieldSelectors, but yes this is fully future compatible with proper anonymous records.

[Row types]. So your claim "Since Variants and Records are duals, ..." is true of Row types, not of Haskell ADTs. I fear this Proposal is making the already-mess into more of a mess, because it has to wrangle the existing namespace disciplines. Consider a hypothetical

Yes I am well aware of a lot of the existing row type discussions and several of the papers relating to them, and this proposal is extremely future compatible with all of them. It's intentionally very simple and doesn't concern itself with some of the cases you just mentioned partly for that exact reason. Any future extensible rows/records/variants implementation would be able to easily define instances for the introduced classes.

You are correct that the duality is stronger with row-based variants and records than with Haskell ADTs, but if you see row types as the future, that should make you support this proposal more: it actively encourages moving in that direction by being so fully future compatible.

It's also worth noting that even with anonymous extensible rows records and variants, classes like this and the duals in GHC.Records will still be needed so that you can seamlessly work with both anonymous structural types and nominal types that wrap them without needing two different sets of syntax, or having to worry about collisions for the latter.

I also know that realistically the timeline for extensible anonymous rows records and variants is probably on the order of several years, and I'd like to be able to improve the uglier parts of our company's codebase before then.

@phadej
Contributor

phadej commented May 26, 2022

@adamgundry

what's the downside of laws connecting classes not in a hierarchy?

Disconnection. Orphans also allow different people to define unlawful instances very much accidentally. If Haskell were able to encode the laws as members of the class(es), laws that reach outside the hierarchy would be out of place.

@phadej
Contributor

phadej commented May 26, 2022

@adamgundry

I think HasField and HasSetter should not be connected by a superclass; I don't have a clear opinion yet on the others.

I agree. There could be just getters, and just setters. But then there are issues I mention above. Therefore I like single-class-to-rule-them-all (like LabelOptic) designs.

@AntC2
Contributor

AntC2 commented May 26, 2022

extensible anonymous rows records and variants ... on the order of several years ...

You're being wildly optimistic. It's been over seven years to get what little we have in 9.2, and more like twelve years since NoFieldSelectors was first mooted. (That's why I put in the piece on Rows/Variants/Records as a hypothetical: I doubt it'll happen in my lifetime -- I'm about the same age as SimonPJ.)

@AntC2
Contributor

AntC2 commented May 26, 2022

the uglier parts of our company's codebase

Aww everybody's baby is beautiful for them. I presumed the examples in the Proposal were made up. If your codebase is heavy on commercial data structures that (presumably) get held in a flattish database/not syntax trees or highly nested structures, I'm wondering why your company is using Haskell? I think it's not the right tool for that sort of application.

@googleson78
Contributor

I think it's not the right tool for that sort of application.

Although I entirely disagree with your sentiment, I believe it also contains motivation for the proposal - maybe the author is pushing in the direction of making Haskell more pleasant to use for these scenarios.

@phadej
Contributor

phadej commented May 26, 2022

@goldfirere

I'm not convinced about the IsLabel instance -- seems to squat in too contentious a space. For example, it's easy to imagine an author of some embedded DSL to want to use a similar instance for different purposes.

You are being polite. IMHO that instance is a show stopper. The https://hackage.haskell.org/package/generic-lens-2.2.1.0/docs/Data-Generics-Labels.html is an example of IsLabel instance for -> in the wild. And it's "better" than what GHC can offer now or even with this proposal.

@tysonzero
Author

tysonzero commented May 26, 2022

@AntC2 No wonder Haskell is less popular than most languages, one point of ugliness and people suggest you throw the whole language out. We still love having far more type safety, less verbosity, easier refactoring, and more composability than alternative languages. The impedance mismatch with relational databases is arguably less bad than it is in an OOP language anyway.

@googleson78 I mean I want Haskell to be as pleasant to use in as many scenarios as possible without sacrificing its core values, and a couple of classes to make dealing with fairly boring situations like overlapping fields and records much less painful would be fantastic.

@phadej That instance is completely insane to me. To me instances are about canonicity, otherwise you should be using just a function/value. Just because van Laarhoven lenses happen to have a -> at the outermost layer of their type does not make them feel even slightly like the "canonical function". Don't get me wrong, I like van Laarhoven lenses, hence why I have my above example of using dot syntax to create them on the fly, without truly wacky non-canonical instances. The Optic newtype approach is another very reasonable way to get # syntax without orphans, let alone strange non-canonical orphans.

@AntC2
Contributor

AntC2 commented May 26, 2022

one point of ugliness and people suggest you throw the whole language out. We still love having far more type safety, ...

(This is getting way off the point, I'll start a Discourse thread, but ...)

Haskell's (lack of) a well-founded records/rows system is many points of ugliness. Which this message seems to confirm. If you look back at the history, the H98 design was always a stopgap, and there were many papers/proposals in the early 2000's; so nobody was proud of it then. But no proposal 'stuck', so starting ~2012, people started putting lipstick on the pig. There's by now many layers of lipstick.

In nearly every other part of Haskell design, a proposal is not entertained unless it has a sound theoretical basis. (That's why Overlapping Instances or FunDeps don't get much love at GHC HQ.) For records/rows there are theoretically sound approaches; purescript for example never went near H98-style records; and already has a Variants module built on better-founded rows/records. But Haskell/GHC stumbles on with the lipstick.

In my experience, database-intensive commercial applications with mostly flattish files aren't using so much higher-order fandango/don't need to push the limits of a language's type system. (In the applications @tysonzero is talking about, perhaps the commercial-looking bits are a thin layer on top of other complexity.)

Haskell programmers are hard to find, hard to retain; it's lengthy to cross-train programmers from other languages. If your application mostly uses familiar commercial data processing, use a familiar language.

@tysonzero
Author

@AntC2

I'm sorry but all of this unsolicited advice about how our company should be run is really not even slightly relevant nor is it particularly appreciated. We are very happy with the language we have chosen, we just want to clean up some of the ugly bits. Particularly when such cleanup fits extremely naturally with the existing GHC.Records module and OverloadedRecordDot extension.

I appreciate other discussion about unified vs stratified and so on, and I could definitely see some of these counterproposals being improvements on what I have, but saying this code should just keep being ugly and you should switch languages is just about the least useful thing possible. Particularly since just about any large software project will have a variety of different things going on, so throwing up your hands the second you have a boring business-y enum or have to interact with stripe is ridiculous.

I agree with the lipstick on a pig stuff, but that's why I like this proposal (and most of the similar counterproposals), because it is fully future compatible with every reasonable implementation of anonymous extensible rows/records/variants.

For now I'm mostly curious to see what @adamgundry says to my questions, as it seems like he has made some promising developments in this area, but I also do have some skepticism about certain aspects of it.

@arybczak

The OverloadedLabels parser doesn't allow labels starting with an uppercase letter.

% ghci-9.2.2
GHCi, version 9.2.2: https://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from /home/phadej/.ghci
Prelude> :set -XOverloadedLabels 
Prelude> :t #Foo

<interactive>:1:1: error: parse error on input ‘#’

Relaxing this would make sense, and probably can be justified standalone (i.e. optics or lens could benefit from it today), i.e. in a separate proposal.

@phadej such a proposal was written and accepted a long time ago (#170), but in the process it was extended (unnecessarily?) so that the original author lost interest (ghc/ghc#192 (comment)).

@adamgundry
Contributor

(apologies that I'm currently swamped and therefore not responding as quickly or coherently as I would like)

@tysonzero

All this incoherence stuff is making me nervous, does this mean that doing anything Set-like where you rely on instance coherence for correctness would be discouraged for these classes?

Yes. That is, we should explicitly document that HasField and similar are not expected to be coherent, and that it is not safe to rely on this.

It makes me particularly nervous because FlexibleInstances is incoherent right now, with it being very easy to create broken Sets without a single Incoherent or even Overlapping pragma, even with -Worphans on (and no warnings), and few people seem particularly concerned, see: #279.

Indeed it would be nice if we had a tighter story about coherence for "normal" classes and the ability to enforce it more strictly. But I still think it is reasonable to have some classes that are explicitly marketed as allowing incoherence, where they are used for type inference but not intended to emulate a poor-man's module system (ala Ord for Set).

All the FD stuff I'll have to look into more, I don't know enough to have an opinion at this time. My one question would be if all that stuff enables me to do:

data Lenses r f = Lenses
l = Lenses
instance SetField x r a => GetField x (Lenses r f) (LensLike' f r a)
person & l.name .~ "foo"

Without UndecidableInstances?

Ideally in a way such that accidentally flipping the last two GetField arguments fails to compile, given how unreasonable it is.

I think excluding UndecidableInstances is an unreasonably strong restriction. After all, we don't require our value-level programs to pass a termination checker! This seems feasible with UndecidableInstances enabled. And this isn't the kind of instance I would expect most user code to need to write.

I'm not very keen on a new IsLabel instance; using an explicit operator to apply an overloaded constructor seems preferable.

Part of the reason I like this is it finally gives a non-overlapping non-incoherent instance for ->, as otherwise it's just sitting there as a trap begging people to use it in overlapping/incoherent ways, as library devs have. If I define IsLabel x (a -> MyType) and someone else defines IsLabel x (TheirType -> b) then we are in trouble. With the approach my instance just needs to be changed to a HasConstructor instance and theirs should use OverloadedRecordDot instead.

There are really two key forms of type directed name resolution in Haskell right now, dots and labels, dots have field-like things handled, which leaves labels for constructor-like things. That way we don't need yet another TDNR syntax.

Well, we could choose IsLabel x (a -> b) to be constructors, but it feels somewhat arbitrary to choose them over fields or lenses or nothing at all. Since bare labels can be used for optics (without orphans even if you take the optics approach), I'm not wholly convinced by the argument that we have . for fields and # for constructors. One could equally well adapt your Lenses trick and use c.MkT as sugar for construct @"MkT" (okay, so perhaps that needs OverloadedRecordDot to support uppercase "field names"...).

I don't yet feel that overloaded constructors are as compelling as overloaded fields, given that we don't have an equivalent of DuplicateRecordFields for constructors, so you can always disambiguate using the module system. I'd be a bit worried that heavy use of overloaded constructors would lead to poor type inference.

I'd take a look at the linked stripe code for a decent motivating example. It's the most heinous thing the world has ever seen, and it's truly embarrassing to have it in our codebase when onboarding new devs and trying to convince them that Haskell is better than what they're used to.

This is half about aesthetics (the worst looking thing in our codebase right now is the prefixes from lack of dot-syntax, the second worst looking thing in our codebase right now is the prefixes from lack of something like this), and half about stock/generic generated instances (since I don't want those prefixes in the DB or in our publicly-available JSON api's or binary serializations or similar).

We have a ton of types like data PaymentMethod = PaymentMethodCard (..) | PaymentMethodBank (..) | ... and data Region = RegionCity (..) | RegionCounty (..) | RegionState (..) | ... that we'd love to drop the prefixes on but can't, which also means we can't use the generically-derived instances.

Using qualified imports doesn't work super well because there are a variety of helper re-exporting modules to avoid lots of import noise, so we'd then have to define a bunch of constructor aliases in that re-exporting module with the prefixes added, which is possible but again gets back to the whole verbosity/ugliness thing.

This motivates being able to use multiple constructors with the same name, I agree. But does that necessarily mean type-based overloading? Or could it be something like local modules (#283)? That might reduce the pain of qualified imports and juggling constructor aliases, while still being resolved in the renamer rather than relying on type inference to figure out which constructor you mean.

Moreover, we don't currently have a story for overloaded constructors in patterns. It seems like one will want that pretty quickly if you have a codebase that has many non-unique constructor names.

Also small thing but if we're doing HasSetter can we do HasGetter instead of HasField for consistency?

HasField could perhaps just be a class with superclasses HasGetter and HasSetter and a single instance that applies to all types with both.

I've gone back and forth on these naming questions. Now that HasField is in GHC 9.2 my inclination is to avoid breaking code that defines instances of it. But it's a fine judgement call as to whether more symmetric naming is worth the backwards-compat cost. In any case I agree we might as well have a constraint synonym / single-instance class that allows get+set.
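
A minimal sketch of the get+set arrangement discussed above, for illustration only. The classes `HasGetter`, `HasSetter` and the local `HasField`, along with the `Person` type and the `shout` helper, are hypothetical and defined in the snippet itself. This is not the `GHC.Records` API, and the instances are written by hand rather than solved by GHC.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeApplications #-}
{-# LANGUAGE UndecidableInstances #-}

import GHC.TypeLits (Symbol)

-- Hypothetical classes, local to this sketch (not GHC.Records).
class HasGetter (x :: Symbol) r a | x r -> a where
  getField :: r -> a

class HasSetter (x :: Symbol) r a | x r -> a where
  setField :: a -> r -> r

-- Get+set as a single-instance class: any type with both gets it for free.
class (HasGetter x r a, HasSetter x r a) => HasField (x :: Symbol) r a
instance (HasGetter x r a, HasSetter x r a) => HasField x r a

-- Hypothetical record with hand-written getter and setter instances.
newtype Person = Person String deriving (Eq, Show)

instance HasGetter "name" Person String where
  getField (Person n) = n

instance HasSetter "name" Person String where
  setField n _ = Person n

-- Needs both directions, so it asks for the combined constraint.
shout :: HasField "name" r String => r -> r
shout r = setField @"name" (getField @"name" r ++ "!") r

main :: IO ()
main = print (shout (Person "Ann"))
```

The catch-all instance is what requires `UndecidableInstances` here; a constraint synonym (`type HasField x r a = (HasGetter x r a, HasSetter x r a)` with `ConstraintKinds`) would avoid that at the cost of not being a real class.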

@adamgundry
Contributor

@phadej

I think HasField and HasSetter should not be connected by a superclass; I don't have a clear opinion yet on the others.

I agree. There could be just getters, and just setters. But then there are issues I mention above. Therefore I like single-class-to-rule-them-all (like LabelOptic) designs.

Yes, I see your point, it is a nice property of LabelOptic that a user defining an instance for a "virtual optic" is forced to determine the optic kind fully, rather than e.g. giving a getter instance and perhaps someone else later giving a setter instance. I hadn't previously appreciated this.

What I'm not sure about is how to bring that into GHC itself. What would it look like to have a LabelOptic class whose instances could be generated by GHC? I see HasField/HasSetter/whatever as providing a source of better generic implementations of LabelOptic; but it still feels like LabelOptic itself belongs in a separate library. Perhaps it is enough to discourage users from giving instances of HasField etc. in normal usage?

@tysonzero
Author

tysonzero commented Jun 1, 2022

@adamgundry

Sorry, I didn't mean to imply any hastiness, I just wanted to make sure that my questions for you didn't get lost!

Yes. That is, we should explicitly document that HasField and similar are not expected to be coherent, and that it is not safe to rely on this.

Is there a real use case where someone actively wants incoherence here? Or is it more of a case of being hard to enforce coherence even if it just about always happens in practice? I can't imagine wanting x.foo for a given type of x to mean two different things, but maybe I'm not being creative enough.

I would imagine things like manual serialization instances would rely on these instances, and coherence of serialization instances is of course very important, so would these be in jeopardy as well, or fine as long as you are careful to make sure the serializer and deserializer are always defined in the same module?

But I still think it is reasonable to have some classes that are explicitly marketed as allowing incoherence, where they are used for type inference but not intended to emulate a poor-man's module system (ala Ord for Set).

Whilst I do like to use Set as a concrete example of where coherence is crucial, to be honest I get the most value out of coherence from just peace of mind of knowing that for any given function with a given set of types the output is consistent everywhere in the codebase. It's why I don't like things like ImplicitParams and even RecordWildCards, too much automagic stuff without solid foundations.

I would really prefer if x.foo always meant the same thing for a given type x, but I'm assuming there is a more technical reason why you foresee that as being impractical.

I think excluding UndecidableInstances is an unreasonably strong restriction. After all, we don't require our value-level programs to pass a termination checker! This seems feasible with UndecidableInstances enabled. And this isn't the kind of instance I would expect most user code to need to write.

Fair point. I more or less see UndecidableInstances as a symptom of the true underlying cause which is that complicated instance heads are messy, but complex TypeFamily outputs are not. Similar to how the left hand side of function definitions are generally much simpler than the right hand sides. You get issues with overlapping and coherence and decidability and so on that you just don't get even a fraction as often with complicated TypeFamily outputs.

I'm not wholly convinced by the argument that we have . for fields and # for constructors. One could equally well adapt your Lenses trick and use c.MkT as sugar for construct @"MkT"

I suppose it's a matter of taste yeah. Personally I think the argument that you should use a helper term or a newtype for lenses is more compelling than the argument that a simple constructor needs one. Otherwise you're stuck with either c.Foo bar or #Foo # bar for the minimal way to call a constructor.

Realistically the c.Foo example would likely have to be defined per-project too, or more verbose than just c, as I can't imagine a top level c term being defined in base for such a purpose.

For the latter example you end up with pretty gnarly error messages to throw at a beginner in the fairly simple situation of calling a constructor wrong, so you can't just start people out with that right off the bat.

You also theoretically could have the HasConstructor instance for (->) create a lens, but it does get a bit weird when you want to handle prisms and reviews, as you'd need to suppress the default HasConstructor instances for Tagged and Market.

(okay, so perhaps that needs OverloadedRecordDot to support uppercase "field names"...).

I wouldn't be against allowing this, although it seems like you may have to tread pretty carefully with how it interacts with module syntax, particularly with extensions like the one you linked. Although in a more dependent Haskell you could imagine modules and records being more unified, so it seems like it shouldn't be a deep and fundamental problem.

Or could it be something like local modules

Local modules does seem like a solid proposal (assuming it works smoothly with all this RecordDotSyntax and similar stuff), and it would definitely alleviate some of the pain. The stripe example still seems pretty unchanged though, as codegen'd code with long types still involves typing out those long types, since you can't rely on the type system to save you.

I also think on a more fundamental level that Haskell should have a more first-class way of interacting with fields and constructors as concepts, even without looking as much at the sugar side.

Moreover, we don't currently have a story for overloaded constructors in patterns. It seems like one will want that pretty quickly if you have a codebase that has many non-unique constructor names.

Yeah I tried to keep the proposal simpler and closer to a dual of GHC.Records stuff, but I can see a variety of ways to go about that.

I think the dual aspect between records and variants can be further utilized to have pattern matching piggyback off of record creation. Given the following type:

data FooBarBaz
    = Foo Bool
    | Bar Char
    | Baz Text

You can pattern match on it via the following function:

matchFooBarBaz
    :: (HasField "Foo" r, HasField "Bar" r, HasField "Baz" r)
    => (FieldType "Foo" r ~ (Bool -> a), FieldType "Bar" r ~ (Char -> a), FieldType "Baz" r ~ (Text -> a))
    => FooBarBaz -> r -> a

Which you can of course further put behind a typeclass and give a concise name like (~>) and case_ (for case and \case respectively) to get:

handleFooBarBaz :: FooBarBaz -> Bool
handleFooBarBaz = case_ $ FooBarBaz'
    { Foo b = b
    , Bar c = c == ' '
    , Baz t = null t
    }
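
A rough sketch of the same idea that compiles against today's `GHC.Records`, under some stated assumptions: it uses the three-parameter `HasField` with a functional dependency instead of the `FieldType` family above, `String` stands in for `Text`, and because record fields cannot currently start with an uppercase letter, the `"Foo"`/`"Bar"`/`"Baz"` labels are supplied as hand-written virtual fields on a hypothetical `Handlers` record. The `case_` class and the `FooBarBaz' { Foo b = ... }` syntax are not modelled.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeApplications #-}

import GHC.Records (HasField (..))

data FooBarBaz
  = Foo Bool
  | Bar Char
  | Baz String

-- One handler per constructor, looked up by the constructor's name.
matchFooBarBaz
  :: ( HasField "Foo" r (Bool -> a)
     , HasField "Bar" r (Char -> a)
     , HasField "Baz" r (String -> a)
     )
  => FooBarBaz -> r -> a
matchFooBarBaz v r = case v of
  Foo b -> getField @"Foo" r b
  Bar c -> getField @"Bar" r c
  Baz t -> getField @"Baz" r t

-- Hypothetical handler record with ordinary lowercase fields.
data Handlers a = Handlers
  { onFoo :: Bool -> a
  , onBar :: Char -> a
  , onBaz :: String -> a
  }

-- Virtual fields exposing the handlers under the constructor names.
instance HasField "Foo" (Handlers a) (Bool -> a) where getField = onFoo
instance HasField "Bar" (Handlers a) (Char -> a) where getField = onBar
instance HasField "Baz" (Handlers a) (String -> a) where getField = onBaz

handleFooBarBaz :: FooBarBaz -> Bool
handleFooBarBaz v = matchFooBarBaz v handlers
  where
    handlers = Handlers { onFoo = id, onBar = (== ' '), onBaz = null }

main :: IO ()
main = print (map handleFooBarBaz [Foo True, Bar 'x', Baz ""])
-- prints [True,False,True]
```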

Now I realize there are a variety of things going on above so I didn't want to bundle all of them into a single proposal, as there may be various sub-pieces that people don't agree with. I figured something like the proposal as stated, or a similarly-broad counterproposal, would be easier to reach consensus on.

[breaking ghc-9.2 HasField]

Isn't there technically already an accepted breaking change to it via adding setting to the class? But yeah I suppose if a separating get-and-set proposal was accepted before that change was implemented then you might be able to cut out that breaking change. Assuming you don't switch to TypeFamilies or change HasField in other ways. Personally I think it's worth trying to make something fairly nice here whilst seemingly not too much real world code relies on it (particularly on the instance defining side, most uses should "just work").

I see HasField/HasSetter/whatever as providing a source of better generic implementations of LabelOptic; but it still feels like LabelOptic itself belongs in a separate library.

I'm with you on that one. I am a big fan of lenses but I'm not sold they should be baked in this deeply to such a core part of Haskell that tons of beginners will end up touching fairly directly. I would definitely plan to use various lenses built on top of these constructions myself.

I also see lenses as being more likely to be future-limiting, as we can always add more subclasses and such, but if something fits outside of a reasonable optic it seems you are in trouble with a single-class approach. For example record extension seems like it may be a challenge. As far as I can tell it doesn't fit anywhere into the current lens hierarchy.

Which reminds me that in your previous comment you used s -> Either t a instead of my s -> Maybe b. I intentionally left out type changing pattern matching as it's the dual of record extension (as opposed to the dual of something trivial) which GHC.Records currently does not support.

If we go that route I'd suggest to consider also adding the dual:

class HasExtender x s t a | x s -> t a where
    extend :: a -> t -> s
-- or TypeFamily equivalent
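
To make the shape of a manual instance concrete, here is a small sketch under stated assumptions: the class is copied from above with the label's kind made explicit, and `Person`, `Contact` and the `"email"` label are hypothetical nominal types invented for the example. Nothing here is generated by GHC.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeApplications #-}

import GHC.TypeLits (Symbol)

-- The class from the comment, with the label's kind written out.
class HasExtender (x :: Symbol) s t a | x s -> t a where
  extend :: a -> t -> s

-- Hypothetical nominal records: Contact is Person plus an "email" field.
data Person = Person String deriving (Eq, Show)
data Contact = Contact String String deriving (Eq, Show)

-- Manual instance: extending a Person with an email yields a Contact.
instance HasExtender "email" Contact Person String where
  extend email (Person name) = Contact name email

main :: IO ()
main = print (extend @"email" "ann@example.com" (Person "Ann") :: Contact)
```

With the dependency `x s -> t a`, the result type has to be known (here via the annotation) before the smaller record and the field type can be inferred.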

I realize that GHC would not be able to create instances of this automatically with the current non-extensible approach to records, in the same way that HasMatcher is not fully being automatically utilized as auto-instances will always have the same s and t.

However it'd be nice to have a canonical class to manually define instances for and for libraries to build off of, allowing any future extensible records library to fit seamlessly with any library-level extensible records solutions and similar in the meantime.

@adamgundry
Contributor

Yes. That is, we should explicitly document that HasField and similar are not expected to be coherent, and that it is not safe to rely on this.

Is there a real use case where someone actively wants incoherence here? Or is it more of a case of being hard to enforce coherence even if it just about always happens in practice? I can't imagine wanting x.foo for a given type of x to mean two different things, but maybe I'm not being creative enough.

I think it isn't that we want incoherence per se. Rather, coherence prevents two things we do want:

I agree that it would be better if x.foo always meant the same thing for a single type of x. But I don't see how to absolutely enforce that without preventing the above two features.

I would imagine things like manual serialization instances would rely on these instances, and coherence of serialization instances is of course very important, so would these be in jeopardy as well, or fine as long as you are careful to make sure the serializer and deserializer are always defined in the same module?

I'm wondering why a HasField constraint would occur in the context of a serialization instance? I can see it is useful to have HasField in contexts for things like LabelOptic, so I wouldn't rule them out altogether (ala implicit parameters), but it seems fishy to me in general to have them in "normal" instance contexts.

@phadej
Contributor

phadej commented Jun 1, 2022

@adamgundry

but it seems fishy to me in general to have them in "normal" instance contexts.

I agree, I'm not a fan of structural typing. Writing HasField polymorphic (instances') code is a bad practice in my books, but I get a feeling that e.g. things like HasField "id" r Int => ... will be common, if not already (instead of defining a HasIdentifier class, with default or DerivingVia implementations).

The correct (or at least more Haskell) solution is harder in this case... (and you couldn't use dot syntax if the context is HasIdentifier).

@tysonzero
Author

tysonzero commented Jun 1, 2022

I agree that it would be better if x.foo always meant the same thing for a single type of x. But I don't see how to absolutely enforce that without preventing the above two features.

It seems like an annotation either per-field or per-type to not automatically generate an instance would be a pretty reasonable thing to do.

Although I am not particularly convinced that having record pattern syntax that deviates from record dot syntax is intuitive or desirable. I'd expect getBar (Foo {bar = x}) = x to behave the same as getBar foo = foo.bar and I'd like to in future see getBar {bar = x} = x to be valid syntax, which would be even more confusing if it differed from the former.

You can always use a positional constructor in those cases, and in fringe cases where the internal structure is a large complicated set of different fields, I'd argue for something like:

data FooInternal = FooInternal { .. }
-- dot syntax uses the actual internal fields

newtype Foo = Foo FooInternal
instance HasField x Foo a
-- dot syntax uses the virtual fields
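
A concrete sketch of that layout, assuming GHC 9.2 or later with `OverloadedRecordDot`. The field names `width` and `height` and the virtual field `area` are hypothetical, chosen only for illustration, and the instance is given for one fixed label rather than a variable `x`.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedRecordDot #-}

import GHC.Records (HasField (..))

-- Hypothetical internal representation with ordinary fields.
data FooInternal = FooInternal { width :: Int, height :: Int }

-- The public type has no record fields of its own,
-- so GHC generates no automatic HasField instances for it.
newtype Foo = Foo FooInternal

-- Dot syntax on Foo only ever sees this hand-written virtual field.
instance HasField "area" Foo Int where
  getField (Foo i) = i.width * i.height

main :: IO ()
main = do
  let foo = Foo (FooInternal { width = 3, height = 4 })
  print foo.area  -- prints 12
```

Because `Foo` carries no fields itself, there is only one candidate meaning for `foo.area`, which sidesteps the manual-versus-automatic question for this type.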

I'd argue the currently considered solution of just having incoherence is pretty weird. If I use dot syntax within the module the type is defined, would I expect it to use the manual instance or the automatic instance? The former seems like it'd feel inconsistent and sketchy, and if the latter then why even generate the automatic instance at all.

[HasField in "normal" instance contexts.]

To be clear I didn't mean the instance would be exposed like that. I just meant that the instance would be used in the literal sense of there being some foo.bar stuff inside the instance body:

instance Serialize Person where
    serialize person = person.firstName <> " " <> person.lastName

Even with just the above coherence is still a concern. For example if you defined one of the serialization directions in one module and the other direction in a different module.

I'd mostly agree with the two of you that true structural typing of that form is undesirable. The main "structural typing" I want is things like extending and contracting records/variants, as opposed to things like renderNameAndDescription foo = foo.name <> ": " <> foo.description.

@Ericson2314
Contributor

I think all the type-directed overloaded name stuff in Haskell today is shoddy and we are not in a position to make it not shoddy any time soon, so the important thing is instead to make the quantification-disambiguated stuff good. I am not sure @adamgundry is as pessimistic as I am about type-directed things, but we both agree on wanting the quantification-directed stuff to be better.

I would propose a moratorium on this sort of thing until we have settled on proposals for making the name-directed stuff good --- those proposals are much simpler and there is far less to bikeshed. Then, with a solid baseline established, we can return to these fancy bikesheddable type-directed things.

@tysonzero
Author

tysonzero commented Jun 15, 2022

@Ericson2314

Strongly disagree. RecordDotSyntax will easily lead to the single largest improvement in aesthetics, simplicity and verbosity in our codebase of any extension of the last decade and it's just not close. I also do not think it's shoddy, symbols in typeclasses is a very natural extension of things like the Num type class.

It's also worth noting that nothing is actually being added to the language itself to support this; the only part that has any future-compatibility concerns is the IsLabel instance, and I'd still take this proposal (or equivalent) without that part over nothing. So it's not like we are painting ourselves into a corner by filling up gaps in the language with this stuff.

@AntC2
Contributor

AntC2 commented Jun 15, 2022

... shoddy ...

improvement in ... verbosity [??]
of the last decade

Please can we quit with the apocalyptic language (I acknowledge I go in for some of that myself); and with the hype.

The last decade introduced Pattern Synonyms, by the way.

@tysonzero: is it possible you can simulate the 'compiler magic' you want via Template Haskell? At least you could substantiate the claim for improvement in verbosity -- and aesthetics.

@tysonzero
Author

tysonzero commented Jun 15, 2022

Please can we quit with the apocalyptic language (I acknowledge I go in for some of that myself); and with the hype.

I'm genuinely not exaggerating.

The last decade introduced Pattern Synonyms, by the way.

Sure, which affects maybe a couple dozen lines of our codebase. RecordDotSyntax will affect tens of thousands of lines.

is it possible you can simulate the 'compiler magic' you want via Template Haskell

The only part of this proposal that is compiler magic is the resolving instances part, which cannot be done with TH. The rest can be done in library code, although it would have problems like orphans (IsLabel) and non-canonicity (the GHC.Variants classes).

@AntC2
Contributor

AntC2 commented Jun 16, 2022

RecordDotSyntax will effect[sic] tens of thousands of lines.

(Not my experience but ...) This is a proposal not for RecordDotSyntax, but for Variant types. It might be an interesting exercise to implement proper Variants into Haskell, as the dual of Product types. But:

  • Haskell doesn't have 'proper' Product types (anonymous records) anyway;
  • Haskell's ADTs are a reasonable approximation to Variant types -- or at least a better approximation to Variants than Haskell records are to Product types;
  • The proposal here is a kludgy imitation of Variant types -- I'd say worse than the Haskell's ADTs we already have.

So the motivation is left with the allegation that in @tysonzero's codebase, there are some clashes of desired constructor names. My reaction on first reading this proposal was: 'sfunny, I've never wanted to name a data constructor same as a type. (Neither have I wanted to name a constructor for one type same as the constructor for a different type. [**] Whereas I often want to name a field the same in two different datatypes -- because I have many data structures pointing to the same courseId, for example.) I ran a quick straw poll: my contacts are in favour of LongAndExplicitNamesForEverything, so would never expect to name a constructor same as a type. To take:

   data Region = City City | State State | ...

The Citys would be shared across many other entities in my application, so defined in a dedicated place. (Not all Citys are also Regions.) Same for States. So I'd expect the Region entity to merely point to those other decls:

data Region = City CityId | State StateId | ...

For example a DB column flagging which child table points to it

Then the DB column is not holding the User or Group content (which might be at type User, Group), but only a reference (FromUser, FromGroup). And I'm having a hard time understanding why a parent data structure would hold the type of its (only?) child structure: wouldn't the parent merely be a parent, and get the child(ren) pointing to it? Is this database structure (presumably external to the Haskell program) actually designed to industry best practice wrt normalisation?

Addit: There's a growing trend to name data constructors as MkCity, MkState, .... That also should reduce name clashes.

[**] You're of course liable to get name-clashes amongst constructors in separate modules, from different authors/different packages. That's what the module naming system is for.

   data Month = Jan | Feb | ... | Oct | Nov | Dec
   ...
   data NumBase = Bin | Oct | Dec | Hex

@tysonzero
Author

This is a proposal not for RecordDotSyntax, but for Variant types.

Of course, but I was very specifically addressing the comment that said "all the type-directed overloaded name stuff in Haskell today", so I talked about RecordDotSyntax, which is a "type-directed overloaded name stuff in Haskell today".

Haskell's ADTs are a reasonable approximation to Variant types -- or at least a better approximation to Variants than Haskell records are to Product types;

What's your model? Haskell ADTs are no closer to anonymous extensible variants than Haskell Records are to anonymous extensible records. Both suffer from lack of extension/contraction, structural abstractions, anonymity etc.

The proposal here is a kludgy imitation of Variant types -- I'd say worse than the Haskell's ADTs we already have.

I'm sorry but what? This proposal is not at all an imitation of variant types, nor does it attempt to replace existing Haskell ADTs. It is a series of helper classes and instances to help you work with Haskell ADTs, that would be completely future compatible with any reasonable implementation of true anonymous extensible variant types, such as those I've seen in a variety of papers. I don't get how to make this any clearer.
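To make "helper classes and instances" concrete, here is a rough sketch of what a label-indexed constructor class could look like. Only the class name `HasConstructor` comes from this thread; the method names `construct` and `match`, the functional dependency, and the hand-written instances are all assumptions, and the proposal's actual definition may differ.

```haskell
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeApplications #-}

import GHC.TypeLits (Symbol)

-- Hypothetical shape, NOT the proposal's definition: a constructor is
-- identified by a type-level label x on a sum type s with payload a.
class HasConstructor (x :: Symbol) s a | x s -> a where
  construct :: a -> s        -- build a value with the labelled constructor
  match     :: s -> Maybe a  -- extract the payload if the constructor matches

-- Hand-written instances for ordinary ADTs; the ADTs themselves are unchanged.
instance HasConstructor "Just" (Maybe a) a where
  construct = Just
  match = id

instance HasConstructor "Left" (Either a b) a where
  construct = Left
  match = either Just (const Nothing)
```

With these instances, `match @"Just" (Just True)` yields `Just True`, and `match @"Left"` applied to a `Right` value yields `Nothing`. The label lives at the type level, so it does not occupy the module's constructor namespace.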

Random off-the-cuff codebase suggestions

None of those would work at all to achieve our desired goals, but I don't really have the patience to actually go through and explain why, so I won't bother for now.

@AntC2
Contributor

AntC2 commented Jun 16, 2022

... codebase suggestions

I think it's reasonable to ask for use cases. The proposal has been up for a couple of weeks; nobody else has come to say: yes I have a real-life use for this. I think it's reasonable to ask whether a given proposal is the only or the best approach for a set of requirements/the name clashes you're experiencing.

This proposal is not at all an imitation of variant types, ...

Then perhaps it shouldn't have 'Variants' in the title? (I'm also rather dubious how much the purescript-variant library is "an imitation of variant types". I guess the 'theory' is rather diverse.)

would be completely future compatible with any reasonable implementation of true anonymous extensible variant types ...

PureScript at least is ahead of Haskell in having anonymous records, as a "reasonable implementation". (I'd say that Variant library is "compatible" in that it looks like an unconnected feature, rather than 'orthogonal' in the sense of Records/Variants using the same underlying abstraction of label-value pairs.)

I agree with @Ericson2314 that there's already enough parts in motion around "name-directed stuff"; we should wait first to see how that settles.

compiler magic is the resolving instances part

Have you reviewed what the HList team did? That is, without compiler magic, in 2004. Their Appendix C 'Type-Indexed Co-products'. They had only FunDeps available at the time, but I think that stuff's been translated to use Type Families.
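For a flavour of a type-indexed co-product built purely from library-level code, here is a small sketch in modern GHC. It is not the HList paper's encoding: the `Sum` GADT, the `Member` class and its `inject`/`project` methods are names invented here, and instance resolution relies on an ordinary overlapping-instance pragma rather than anything specific to variants.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)

-- An open sum whose alternatives are indexed by their types.
data Sum (ts :: [Type]) where
  Here  :: t -> Sum (t ': ts)
  There :: Sum ts -> Sum (t ': ts)

-- Membership of a type in the list gives injection and projection.
class Member t (ts :: [Type]) where
  inject  :: t -> Sum ts
  project :: Sum ts -> Maybe t

-- The wanted type is at the head of the list.
instance {-# OVERLAPPING #-} Member t (t ': ts) where
  inject = Here
  project (Here x)  = Just x
  project (There _) = Nothing

-- Otherwise, keep looking in the tail.
instance Member t ts => Member t (u ': ts) where
  inject = There . inject
  project (Here _)  = Nothing
  project (There s) = project s
```

For example, with `s = inject True :: Sum '[Int, Bool]`, `project s :: Maybe Bool` is `Just True` while `project s :: Maybe Int` is `Nothing`.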

@Ericson2314
Contributor

@tysonzero the issues with record dot are:

  1. No plan to get enough "full" impredicative types to work with poly-typed fields.
  2. Violates hygiene by allowing one to observe that completely unrelated records happen to reuse field names.
  3. Nothing to help with variants of sums without running into the same issues --- what you propose here cannot handle constructors with polytypes.
  4. No ability to work with patterns (as @adamgundry mentioned) and thus is an API compat foot-gun. We have too many such foot guns already!
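One reading of the second point, sketched with the existing `GHC.Records.HasField` class. The `Person` and `Company` types and the `describe` function are hypothetical, introduced only for illustration: a function constrained solely on a field label accepts any record that happens to use that label, however unrelated the types are.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DuplicateRecordFields #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeApplications #-}

import GHC.Records (HasField (..))

-- Two unrelated records that happen to share a field name.
data Person = Person { name :: String }
data Company = Company { name :: String }

-- The constraint mentions only the label "name", not any particular type,
-- so the coincidence of field names becomes observable to callers.
describe :: HasField "name" r String => r -> String
describe r = "name = " ++ getField @"name" r
```

Both `describe (Person "Ada")` and `describe (Company "Acme")` typecheck, even though nothing else relates the two types.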

And there are no plans from the record dot crowd to fix the first issue, no recognition of the second issue from just about anyone but me, and I am not sure anyone besides @adamgundry was interested in fixing the 4th issue.

As such, record dot and overloaded labels are an 80% solution that is simply not up to par with the older parts of Haskell, and no one knows how to actually fix them. And yet it still consumes tons of time as we debate what the maimed optics classes look like.

The biggest issue with pre-record-dot records is not verbose syntax, but verbose identifiers from humans hand-rolling namespacing with field prefixes. I'll be the first to admit it is complete garbage that we ever had to do that in Haskell, and that that part of Haskell 98 records was ever considered acceptable.
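The hand-rolled prefixing in question looks roughly like this. The record types and the `greeting` function are made up for illustration; the point is only that each field carries its type's name as a prefix so that selectors do not clash within a module.

```haskell
-- Haskell 98 style: every field is prefixed with its record's name
-- because selectors share a single namespace per module.
data Person = Person
  { personName :: String
  , personAge  :: Int
  }

data Company = Company
  { companyName :: String
  , companyCity :: String
  }

greeting :: Person -> Company -> String
greeting p c = personName p ++ " works at " ++ companyName c
```

Here `greeting (Person "Ada" 36) (Company "Acme" "Paris")` gives `"Ada works at Acme"`; the prefixes do the namespacing that the language does not.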

Finishing up name-resolution based fields and variant work will fix that most egregious issue with a 95% solution that always works, no matter how complex the constructor and field types are. (The last 5% is data families, but I am happy to just say data families are in the wrong for not allowing one to provide a type constructor name.)

With that fixed, we can have a more sober conversation where record dot just makes a few bits of syntax less verbose, and the main issue of avoiding manual namespacing is already solved.

@goldfirere
Contributor

I personally think that we here can sometimes suffer too much from letting the perfect become the enemy of the good. I acknowledge that OverloadedRecordDot is imperfect, and it's helpful to have a catalog of its imperfections. But, for me, that neither means that we shouldn't have it nor means that we shouldn't build more features like it. It's genuinely useful, as people say to us many times!

I think it comes down to a question of what we're trying to achieve here: do we prioritize an elegantly designed language, where e.g. everything is first class and all the features mesh with each other even in dark corner cases? or are we trying to build a language useful in common applications? If we wanted the first, I have a lambda calculus I'd like to sell you. But we don't program real programs in a bare lambda calculus, because doing so is painful for practitioners. So I favor offering well thought out, practical conveniences -- even if those conveniences don't work in every case. Others might reasonably feel differently, but a frequent frustration I have heard (and sometimes share) about Haskell is that it's impractical -- let's fight that by accepting proposals that people find useful, even if they are imperfect.

To be clear, this isn't an impassioned plea to accept the current proposal, which I remain unconvinced about. But, to me, saying it's like OverloadedRecordDot is a point in its favor, not against it.
