Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow case expressions without branches #2922

Open
rightfold opened this issue Jun 1, 2017 · 25 comments
Open

Allow case expressions without branches #2922

rightfold opened this issue Jun 1, 2017 · 25 comments

Comments

@rightfold
Copy link
Contributor

rightfold commented Jun 1, 2017

Can be used for empty data decls such as Void. Proposed syntax is to leave off the of keyword.

data Void

absurd :: forall a. Void -> a
absurd = case _

Type inference may suffer. What would the inferred type of absurd be if no explicit type was given?

Issue about exhaustiveness checker exists elsewhere.

Not a breaking change, but existing code that uses data X as a short-hand for foreign import data X :: Type must be modified to use the latter; an empty data declaration is now truly an empty type.

@paf31 paf31 added this to the Approved milestone Jun 1, 2017
@paf31
Copy link
Contributor

paf31 commented Jun 1, 2017

This is mostly an exhaustivity checker issue right now. Type checking should do the right thing.

Note you can always use absurd v = absurd v.

@paf31 paf31 changed the title Allow case expressions without brnaches Allow case expressions without branches Jun 1, 2017
@hdgarrood
Copy link
Contributor

I feel like we could sidestep the need for this by either encouraging people to use absurd v = absurd v as suggested, or if that's too icky, defining the empty data type in question as a newtype around Void, as suggested in purescript/purescript-generics-rep#10, so that you can provide an absurd for your type with absurd (X v) = Data.Void.absurd v. I think e.g.

newtype Z = Z Void
newtype S n = S Void

is potentially clearer in intent than

data Z
data S n

anyway, and switching between these is a non-breaking change as long as you don't export the constructor. The newtype approach does admittedly introduce the data constructors into the local scope, though, which might not be as desirable.

@hdgarrood
Copy link
Contributor

I’m actually 👎 on this, personally. The issue of what type we ascribe an empty case declaration seems unnecessarily thorny, since we can already just use Void (or newtypes around it). As already noted, it is also a breaking change since there are a bunch of places people do use data Whatever for data types which are inhabited but where all the functions for introducing values come from the FFI (eg ReactComponent).

@rhendric
Copy link
Member

Does that 👎 extend to changes to the exhaustivity checker? I don't often need truly empty case expressions, but I have found myself in places where I'd much rather write

f :: Either Int (Either (Tuple Void String) (Tuple Boolean Void)) -> Int
f = case _ of
  Left i -> i

than

f = case _ of
  Left i -> i
  Right x -> either (absurd <<< fst) (absurd <<< snd) x

or

f = case _ of
  Left i -> i
  Right (Left (Tuple v _)) -> absurd v
  Right (Right (Tuple _ v)) -> absurd v

The data Whatever objection would apply here too, but I tentatively think that pattern should be explicitly deprecated. It does seem like lying to the compiler, relative to the more honest foreign import data.

@kl0tl
Copy link
Member

kl0tl commented Dec 26, 2020

We can sidestep the issue of what type to ascribe an empty case expression by raising an error, like GHC does with -Wincomplete-patterns and -Werror:

{-# LANGUAGE EmptyCase #-}
{-# LANGUAGE LambdaCase #-}

module Example where

example = \case {}
Example.hs:6:11: error: [-Wincomplete-patterns, -Werror=incomplete-patterns]
    Pattern match(es) are non-exhaustive
    In a case alternative: Patterns not matched: _ :: p
  |
6 | example = \case {}
  |

Also I don’t think that foreign types which are inhabited but declared with an empty data declarations are a strong argument against empty case expressions since we can already elimitate such types with unsafeCoerce anyway, empty case expressions would only make for more convenient and semantic eliminators.

There’s some unfortunate overlap between both declarations but this is something we should be able to correct when working on promotion, at least this is something we’ll have to think about.

@hdgarrood
Copy link
Contributor

Presumably example would start compiling if you provided it with a signature like forall a. Void -> a? I think having an expression turn from an error to a successful compilation just by adding an explicit type signature would be really quite ugly.

@hdgarrood
Copy link
Contributor

You can already do everything with unsafeCoerce, but that’s not persuasive to me because being able to do something safely rather than unsafely is significant.

@hdgarrood
Copy link
Contributor

hdgarrood commented Dec 26, 2020

FWIW I also would like to deprecate the usage of data Whatever for inhabited data types but I don’t see a good way of doing that, since it’s going to be very difficult for the compiler to distinguish a legitimate use of an empty data declaration from a deprecated one.

@kl0tl
Copy link
Member

kl0tl commented Dec 26, 2020

Presumably example would start compiling if you provided it with a signature like forall a. Void -> a?

Yeah exactly.

I think having an expression turn from an error to a successful compilation just by adding an explicit type signature would be really quite ugly.

Is that really different from monomorphic bindings

import Prelude
import Data.Tuple (Tuple(..))

example :: Boolean -> String -> Tuple Unit Unit
example x y = Tuple (f x) (f y) where
  f = const unit
Error found:
in module Example
at Example.purs:7:30 - 7:31 (line 7, column 30 - line 7, column 31)

  Could not match type
          
    String
          
  with type
           
    Boolean
           

while checking that type String
  is at least as general as type Boolean
while checking that expression y
  has type Boolean
in value declaration example

See https://github.com/purescript/documentation/blob/master/errors/TypesDoNotUnify.md for more information,
or to contribute content related to this error.
 module Example where

 import Prelude
 import Data.Tuple (Tuple(..))

 example :: Boolean -> String -> Tuple Unit Unit
 example x y = Tuple (f x) (f y) where
+  f :: forall a. a -> Unit
   f = const unit

or instances resolution?

module Example where

import Prelude

import Data.Map (Map, toUnfoldable)
import Foreign.Object (Object, fromFoldable)

example :: forall a. Map String a -> Object a
example = toUnfoldable >>> fromFoldable
Error found:
in module Example
at Example.purs:10:11 - 10:23 (line 10, column 11 - line 10, column 23)

  No type class instance was found for
                                 
    Data.Unfoldable.Unfoldable t3
                                 
  The instance head contains unknown type variables. Consider adding a type annotation.

while checking that type forall (f :: Type -> Type) (k :: Type) (v :: Type). Unfoldable f => Map k v -> f (Tuple k v)
  is at least as general as type t0 t1 t2
while checking that expression toUnfoldable
  has type t0 t1 t2
in value declaration example

where t0 is an unknown type
      t1 is an unknown type
      t2 is an unknown type
      t3 is an unknown type

See https://github.com/purescript/documentation/blob/master/errors/NoInstanceFound.md for more information,
or to contribute content related to this error.
 module Example where

 import Prelude

 import Data.Map (Map, toUnfoldable)
 import Foreign.Object (Object, fromFoldable)

 example :: forall a. Map String a -> Object a
-example = toUnfoldable >>> fromFoldable
+example = (toUnfoldable :: _ -> Array _) >>> fromFoldable

being able to do something safely rather than unsafely is significant.

I agree, and I think this is actually a point in favor of empty case expressions since we cannot safely eliminate Void otherwise.

@hdgarrood
Copy link
Contributor

I think there is an important difference from those examples because in those examples you’re providing a type annotation to a subexpression, not the expression itself. I think it’s currently true that in PureScript, if

val = [expr]

can ever compile successfully, then it will compile successfully without a type annotation of the form val :: T. (I think that this might not be true in the presence of rank N types but other than that I’m fairly sure it is true). This is a useful property because, eg, it means that if you comment out a type annotation on a value declaration which was causing a compile error, and after commenting out the type annotation you’re still seeing an error, you know that the problem is in the expression, not the type annotation.

@hdgarrood
Copy link
Contributor

I think absurd should be considered a safe function which is implemented using unsafe means. From this perspective we are talking about being able to remove about 3 lines of unsafe code, which to me is a fairly negligible impact.

@hdgarrood
Copy link
Contributor

To clarify that: the way I see it, the goal here is to allow people to define uninhabited types with eliminators and without having to resort to unsafe code. For me, recommending newtypes over Void satisfies this requirement, because absurd is safe.

@hdgarrood
Copy link
Contributor

Sorry for quadruple-posting but I just realised I still haven’t really addressed the issue of extending the exhaustivity checker. I’m a bit hesitant of that too because it means that having no constructors becomes part of a module’s public API, even if you don’t want it to be; adding a constructor to a previously empty data type becomes a breaking change if you can write downstream code which made use of the fact that the type previously had no constructors, even if you don’t export the new constructor.

@rhendric
Copy link
Member

adding a constructor to a previously empty data type becomes a breaking change

Yeah, I think making an uninhabited type inhabited ought to be a breaking change, especially if our current guidance for handling uninhabited types is to roll your own absurd. I'd much rather have the compiler throw an error about a case expression that is no longer exhaustive than have my program silently loop forever on input it never expected to receive. I get how the no-exported-constructors thing might be unintuitive behavior, but to me it's clearly more useful behavior.

If the above argument doesn't move you, though, a compromise where one has to explicitly export Void(..) to get the exhaustivity checker to consider Void to be uninhabited would be fine with me. As an opt-in enhancement, we wouldn't have to worry about whether or how to deprecate data Whatever in that case.

@hdgarrood
Copy link
Contributor

Ah ok, to clarify I mean that in the case where we implement this, adding a constructor to a previously uninhabited data type becomes a breaking change in a sneaky way that package authors are likely to miss, I think, so without tooling which tells you that you need to release this as a breaking change I think it’s a bit risky (I’d still love to have this tooling but I don’t think we can assume it’ll happen any time soon).

@hdgarrood
Copy link
Contributor

If we don’t implement this, the only way a type T is observably uninhabited is via a function forall a. T -> a, whose implementation should be safe anyway (the only absurd function with an unsafe implementation should be Void’s). So if Alice maintains a library which provides an uninhabited type T along with an elimination function, then when she adds a new constructor to that type, the elimination function will stop compiling and she’ll be forced to remove it, which is obviously a breaking change.

@rhendric
Copy link
Member

I was ready with a reply until that last comment, which has left me feeling rather confused. I'm probably being slow but could you please explain further? Are you presupposing that any such T would be a newtype wrapping Void, or is there some way to implement forall a. T -> a for data T other than absurdT t = absurdT t that I don't know about?

@hdgarrood
Copy link
Contributor

Oh sorry, I should have been clearer! I’m imagining that the recommended way of defining an uninhabited type would be

newtype T = T Void
absurdT (T x) = absurd x

so that the author of the package containing T doesn’t need to do anything dodgy like defining absurdT x = absurdT x, and so that if T later becomes inhabited then the compiler will require the package author to remove the absurdT function.

@rhendric
Copy link
Member

Gotcha, okay. So that's a fine pattern for making sure package consumers can always write safe code for handling T, no question there. My problem is that it doesn't do anything about either (absurdT <<< snd) (absurdT <<< fst) gymnastics, which is certainly a lesser concern than being able to write safe code, but still seems like a deficiency in the exhaustiveness checker.

Your objection to enhancing the exhaustiveness checker was that adding a constructor to an uninhabited data type would become a breaking change, and in a sneaky way that library authors aren't necessarily going to think about. But the point I was trying to make is that it's already a sneaky breaking change to do so, if downstream authors have written code that is unsafe if and only if the type in question is inhabited (such as rolling their own absurd). The only change that enhancing the exhaustiveness checker would make is moving some of those downstream breaks from run time to compile time; it changes nothing for the library author.

(If library authors do what you want and wrap Void, and the exhaustiveness checker is enhanced, then everybody wins: I don't have to write code for cases that will never run, library authors get dinged by the compiler when they add constructors, and people who like writing ad-absurdum proofs have that option as well without having to do anything unsafe.)

@MonoidMusician
Copy link
Contributor

In total languages you can still define an empty type via data Void = Void Void and then the eliminator looks safe: absurd (Void v) = absurd v (which is structural recursion). It's like type Void = Mu Identity and then absurd = cata (\(Identity a) -> a), just to prove that it's total.

So maybe that could be the recommended way of safely making an empty data type?

@hdgarrood
Copy link
Contributor

But the point I was trying to make is that it's already a sneaky breaking change to do so, if downstream authors have written code that is unsafe if and only if the type in question is inhabited (such as rolling their own absurd

I don't find that line of argument particularly convincing personally, because I think if downstream users have already rolled their own absurd via unsafe means then they can't really complain when it breaks, especially if the data type in question is not explicitly documented as being uninhabited. (Emphasis on "via unsafe means".) If a data type happens to be uninhabited but that fact isn't observable via an absurd-like function and the type isn't documented as being uninhabited, then you can't really argue that the fact that it is uninhabited is part of the public API, so downstream code that assumes that it is uninhabited and then uses unsafe constructs to avoid handling the case where it is inhabited is wrong. I think that being in a situation where downstream code which doesn't make use of any unsafe constructs can break (so, that includes things like unsafeCrashWith, unsafePartial, and absurdT x = absurdT x), is very different from being in a situation where downstream code can break but only if it does use those things; when you use unsafe things, you're saying "this is now my responsibility if the assumptions which led me to think this was okay turn out not to hold".

@hdgarrood
Copy link
Contributor

a compromise where one has to explicitly export Void(..) to get the exhaustivity checker to consider Void to be uninhabited would be fine with me.

I think I could get on board with this. How exactly would the compiler decide whether or not a data type is uninhabited, then? Would we build knowledge of Data.Void.Void into the compiler?

@rhendric
Copy link
Member

rhendric commented Dec 27, 2020

How exactly would the compiler decide whether or not a data type is uninhabited, then? Would we build knowledge of Data.Void.Void into the compiler?

I don't think there's a need for that. There should be a relatively simple heuristic that gets it right for most cases that are likely to be considered in a pattern match:

  • A type (of kind Type) is either inhabited or uninhabited; inhabited really means ‘presumed inhabited’ and uninhabited means ‘proven uninhabited’.
  • A constructor instantiated at a type can be considered inhabited or uninhabited, with the same semantics. A constructor is inhabited iff all of its parameter types are inhabited. (By ‘instantiated at a type’, I mean that the Identity :: Int -> Identity Int constructor should be distinct from Identity :: a -> Identity a or Identity :: Void -> Identity Void.)
  • Primitive types are inhabited.
  • Function types are inhabited (not entirely true in the Curry-Howard interpretation of types, but true in PureScript—we can always write a function, since we aren't guaranteeing that the inhabitants must terminate).
  • A type variable that hasn't been unified out in the context of the type being considered is inhabited.
  • A foreign import type is inhabited.
  • A data type or newtype is inhabited iff:
    • its constructors are unexported and we are not in its declaring module OR
    • either of the above conditions is false, and at least one of the constructors is judged inhabited without recursively revisiting a type under consideration (meaning, Cons :: a -> List a -> List a doesn't prove that List a is inhabited because it has a List a parameter, but that's okay because Nil :: List a does).
  • A record type is inhabited iff all of its field types are inhabited.

And then the exhaustiveness checker can ignore any constructors deemed uninhabited by this heuristic.

This errs slightly on the side of the compiler thinking that more types are inhabited than they really are, but that's the safe direction in which to err, and I think it correctly handles the vast majority of practical cases.

@hdgarrood
Copy link
Contributor

Ok, thanks. Those rules seem fine to me, although extending the exhaustivity checker in this way does feel a bit against the spirit of https://github.com/purescript/governance#project-values; to me this feels like a bit of a special-purpose feature, since I think it's quite rare to be writing code that would benefit from this extension. Or to put it a different way, I'm hesitant because I think this would give us a somewhat low bang for our buck (where "buck" is "lines of code in the compiler to maintain"). Do you have a more concrete example of where this would be useful? Do any other core team members want to weigh in on whether they think this is worth including?

@rhendric
Copy link
Member

This very well might not be worth the maintenance burden. Speaking as a user and not a collaborator, I just wanted to say that the exhaustivity checker improvement was the part of this I'd be interested in, and I think it's worth considering even if the downsides specific to empty case expressions scuttle that part of the proposal. If there aren't enough industrial users who are excited about that with me to justify it, so be it; it's not essential to anything I'm working on.

But to get slightly more concrete, let's say I'm working on my own language, with its own desugaring pipeline. I have an Expr data type:

data Expr
  = LitStr String
  | LitNum Number
  | Var String
  | App Expr Expr
  | Lam String Expr
  | Let String Expr Expr

And I want to desugar Lets into Lams and Apps. Afterward, I don't expect to have any Lets in my Expr trees, and I don't want to have to write pattern match cases for them. Here's how I indicate that:

data Expr haslet
  = LitStr String
  | LitNum Number
  | Var String
  | App (Expr haslet) (Expr haslet)
  | Lam String (Expr haslet)
  | Let haslet String (Expr haslet) (Expr haslet)

desugarLets :: Expr Unit -> Expr Void

Mission accomplished, except I do still need to write cases for Let in subsequent desugaring/codegen passes. Those cases will all be variations on absurd, and thus not terribly difficult to write, but if you imagine a much more complex Expr type, those variations can get pretty intricate, even though they all essentially amount to mechanical proofs of the non-habitability of each of the constructors that are no longer available at this stage in the pipeline. But those proofs correspond to real code that gets generated and never executed, and all because PureScript is very opinionated about keeping me safe when pattern matching. Fair enough, but if PureScript were a little smarter about what is and isn't safe, I wouldn't have to write those proofs, and (provably, but not practically provable once the type information is gone) unreachable code wouldn't have to be included in my final result.

I don't know how often this sort of thing is useful outside of language development. I have the not-necessarily-universal problem that every project I work on for long enough seems to turn into an exercise in language development.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants