Prefix form for MkSolo# (amendment of #475: Non-punning list and tuple syntax) #638

Merged (3 commits) on Apr 14, 2024


@int-index int-index commented Feb 21, 2024

This amendment to proposal #475 has been accepted; the following discussion is mostly of historic interest.


At the moment, the proposal defines

type Unit# :: TYPE (TupleRep [])
data Unit# = (# #)

type Solo# :: TYPE rep -> TYPE (TupleRep [rep])
data Solo# a = (# a #)

Alright, so what are the prefix forms of those constructors? The data constructor of Unit# is (# #), and the data constructor of Solo# is... also (# #)! This is an ambiguity.

We can resolve this ambiguity by renaming the data constructor of Solo# to MkSolo#, retaining the mixfix (# a #) syntax of course.
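To make the two mixfix forms concrete, here is a minimal sketch using syntax that GHC's UnboxedTuples extension already accepts. The function names (`unit`, `wrap`, `unwrap`) are illustrative and not part of the proposal, and the sketch deliberately avoids writing the prefix name MkSolo#, since whether that name is in scope depends on the GHC version.

```haskell
{-# LANGUAGE UnboxedTuples #-}
module Main where

-- Unboxed 0-tuple: both the type and its only value are written (# #).
-- Top-level bindings of unlifted type are not allowed, hence the function.
unit :: () -> (# #)
unit () = (# #)

-- Unboxed 1-tuple built with the mixfix syntax (# a #).
wrap :: Int -> (# Int #)
wrap x = (# x #)

-- Pattern matching uses the same mixfix syntax.
unwrap :: (# Int #) -> Int
unwrap (# x #) = x

main :: IO ()
main =
  case unit () of
    (# #) -> print (unwrap (wrap 41) + 1)  -- prints 42
```

With the brackets empty, `(# #)` can only name the 0-tuple constructor, which is why the unapplied 1-tuple constructor needs a separate name.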

Please review: @goldfirere, @tek

@tek

tek commented Feb 21, 2024

It seems like a definite oversight to me; I can't find any arguments for not having MkSolo#.


@goldfirere goldfirere left a comment


I think we need to fix this. But I'm a little worried about programmer confusion here. I think that the pretty-printer should continue to print applied unary tuples using mixfix syntax. Is there ever a case where we have to print an un-applied constructor where the user didn't write this? If not, then no problem. If there is, we should be careful in an error message to tell the user of the relationship between MkSolo# x and (# x #).

#. The name of the data constructor for unboxed 1-tuples is ``MkSolo#`` rather
than ``(# #)`` to distinguish it from the data constructor for 0-tuples.
The ambiguity only arises when the name is used unapplied.
When applied to an argument, ``MkSolo# a`` can be written and pretty-printed as ``(# a #)``.
Contributor

"can be" or "is" (at least for printing)? That is, I would want the prefix MkSolo# to be a rare sight for programmers.

Contributor Author

Updated the text to reflect this preference

@int-index

I think that the pretty-printer should continue to print applied unary tuples using mixfix syntax.

It was not my intent to change this. GHC should indeed continue to print MkSolo# a as (# a #); the problem only arises for unapplied occurrences.

Is there ever a case where we have to print an un-applied constructor where the user didn't write this?

I've only seen it happen in pretty-printed Core where left-hand sides of case branches are printed in prefix form. I can't say with any certainty whether it also might happen in user-facing GHC output, but if that's so, it would indeed be nice to mention how MkSolo# relates to (# a #).
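The printing rule under discussion can be sketched with a toy pretty-printer. This is not GHC's pretty-printer or AST: the `Expr` type and the `pretty` and `prettyArg` functions are hypothetical, and only illustrate "mixfix when applied, prefix name when unapplied".

```haskell
module Main where

-- Toy expression type, invented for this sketch (not GHC's AST).
data Expr
  = Var String
  | Con String
  | App Expr Expr

-- Applied MkSolo# is printed in mixfix form; a bare MkSolo# keeps its name.
pretty :: Expr -> String
pretty (Var v) = v
pretty (Con c) = c
pretty (App (Con "MkSolo#") arg) = "(# " ++ pretty arg ++ " #)"
pretty (App f arg) = pretty f ++ " " ++ prettyArg arg

-- Parenthesise application arguments, except mixfix tuples,
-- which already carry their own brackets.
prettyArg :: Expr -> String
prettyArg e@(App (Con "MkSolo#") _) = pretty e
prettyArg e@(App _ _) = "(" ++ pretty e ++ ")"
prettyArg e = pretty e

main :: IO ()
main = do
  putStrLn (pretty (App (Con "MkSolo#") (Var "x")))  -- (# x #)
  putStrLn (pretty (Con "MkSolo#"))                  -- MkSolo#
  putStrLn (pretty (App (Var "f") (Con "MkSolo#")))  -- f MkSolo#
```

Only the last two cases expose the prefix name, matching the observation that the ambiguity arises solely for unapplied occurrences.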

@int-index

@adamgundry Let's submit this for committee consideration.

@adamgundry added the "Pending shepherd recommendation" label (the shepherd needs to evaluate the proposal and make a recommendation) on Mar 6, 2024
@simonpj

simonpj commented Mar 11, 2024

Reading this amendment makes me wonder:

  1. The canonical names for the tuple type constructors are Tuple2, Tuple3, ... and Tuple2#, Tuple3#, ...
  2. The canonical names for the tuple data constructors are (,), (,,), ... and (#,#), (#,,#), ...
  3. The proposal (amended) adds MkSolo and MkSolo# as the canonical names for unit tuples. Some change is forced, since the existing canonical names don't work.
  4. The proposal also adds Unit, Solo and Unit#, Solo# as the canonical names for the type constructors, with Tuple0, Tuple1, Tuple0# and Tuple1# as synonyms. This is an un-forced change.

Concerning (3), an alternative could be to stick with Tuple0, Tuple1, Tuple0# and Tuple1# as the canonical names for the type constructors, with (if you like) Unit and Solo as type synonyms. Maybe that would be more uniform? Thus

type Solo = Tuple1;  data Tuple1 a = MkSolo a

rather than

type Tuple1 = Solo; data Solo a = MkSolo a

Concerning (4), we are free to invent different canonical names for our unit data constructors. Rather than MkSolo we could have (_) for example, and similarly (#_#). As others have said, they are usually printed distfix. Ah: the distfix form of (_) x would naturally be ( x ), and that's ambiguous. Bother. Maybe that's the main argument for MkSolo. And if MkSolo then clearly MkSolo#.

I am not advocating strongly here for any one path. This is detail stuff. But we will be stuck with this forever; I hadn't seen the non-uniformity so clearly before; and uniformity is a virtue.

@tek

tek commented Mar 11, 2024

Concerning (3), an alternative could be to stick with Tuple0, Tuple1, Tuple0# and Tuple1# as the canonical names for the type constructors, with (if you like) Unit and Solo as type synonyms. Maybe that would be more uniform?

Compelling idea. Note, however, that the new names are already in GHC 9.8.

@int-index

@simonpj We have already transitioned to MkSolo for the boxed singleton tuple, so I decided to follow the precedent with MkSolo#.

I'm not opposed to other names in principle, though I don't believe (_) or (#_#) could work because they'd be lexed as three separate lexemes, ( _ ) and (# _ #) respectively. Unless, of course, we want to add even more whitespace-sensitivity to the language.
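As a small illustration of the lexing point (this example is not from the thread, and the function names are made up): today `(_)` is already valid Haskell, namely a wildcard pattern in parentheses, and it means the same thing with or without spaces between the lexemes.

```haskell
module Main where

-- `(_)` is an ordinary wildcard pattern wrapped in parentheses.
ignoreTight :: Int -> String
ignoreTight (_) = "wildcard in parentheses"

-- Adding whitespace between the three lexemes changes nothing.
ignoreSpaced :: Int -> String
ignoreSpaced ( _ ) = "wildcard in parentheses"

main :: IO ()
main = print (ignoreTight 1 == ignoreSpaced 2)  -- prints True
```

Giving `(_)` a second meaning as a constructor name would therefore require the lexer to treat the tight and spaced spellings differently.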

Regarding the canonical names, i.e. the choice between Unit and Tuple0, the reason to prefer Unit is that it's very difficult to know when to expand type synonyms and when not to, so the canonical name should be the one that we want the user to see. I'd much prefer to see Solo and Unit in error messages and Haddock-generated documentation, hence the decision to make them canonical. Of course, we could also add special logic just for those two types to prefer type synonyms in user-facing compiler output.

@tek

tek commented Mar 11, 2024

Of course, we could also add special logic just for those two types to prefer type synonyms in user-facing compiler output.

FWIW, the tuple printing logic is as specialized as it can get already.

@int-index

What about class instances? Say I want to declare

instance C Unit where
  ...

instance C (Solo a) where
  ...

Do I have to turn on TypeSynonymInstances?
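A sketch of what that would look like if Unit and Solo were synonyms, using hypothetical stand-in names (`MyTuple0`, `MyTuple1`, `MyUnit`, `MySolo`, `Describe`) so it does not depend on how any particular GHC version defines the real types. It assumes Haskell 2010 mode, where instance heads that mention type synonyms need TypeSynonymInstances; that extension is implied by FlexibleInstances and included in GHC2021.

```haskell
{-# LANGUAGE Haskell2010 #-}
{-# LANGUAGE TypeSynonymInstances #-}
module Main where

-- Hypothetical stand-ins for the alternative layout:
-- the data types are canonical, the short names are synonyms.
data MyTuple0 = MkMyUnit
data MyTuple1 a = MkMySolo a

type MyUnit = MyTuple0
type MySolo = MyTuple1

class Describe a where
  describe :: a -> String

-- Both instance heads mention a synonym, which plain Haskell 2010
-- rejects unless TypeSynonymInstances is enabled.
instance Describe MyUnit where
  describe MkMyUnit = "unit"

instance Describe (MySolo a) where
  describe (MkMySolo _) = "solo"

main :: IO ()
main = do
  putStrLn (describe MkMyUnit)        -- unit
  putStrLn (describe (MkMySolo 'x'))  -- solo
```

Removing the TypeSynonymInstances pragma while keeping Haskell2010 should make GHC reject both instances, which is the cost the question points at.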

@simonpj

simonpj commented Mar 12, 2024

As I say, I'm asking, not recommending. It doesn't look as if the case for change (wrt the proposal) is compelling.

@Tritlo

Tritlo commented Apr 14, 2024

This proposed amendment has been accepted. The alternative described by Simon is left for a future amendment to #475. Thanks for your hard work @int-index!

@Tritlo added the "Accepted" label (the committee has decided to accept the proposal) and removed the "Pending shepherd recommendation" label (the shepherd needs to evaluate the proposal and make a recommendation) on Apr 14, 2024
@Tritlo Tritlo merged commit 23ef22d into ghc-proposals:master Apr 14, 2024