Type matching #20

RealyUniqueName · 2017-04-15T12:17:29Z

It's a common case to check if a variable is of some type and cast it to that type.
This proposal is an attempt to improve Haxe in such situations.

Rendered document

nadako · 2017-04-15T12:50:49Z

somewhat related haxe issue

nadako · 2017-04-15T12:55:06Z

I personally think smart casting is a nice feature (and AFAIK it's already implemented in some languages, like Kotlin and TypeScript).

What I'm not sure about this proposal is do we really need an explicit name for downcasted value, or even any new syntax? What we could have instead is something like:

if (developer is Indie) {
    $type(developer); // Indie
}

One concern here is that if Indie here is an interface rather than subclass, we might want to still access original (non-casted) developer value, so maybe having an optional name specification would be useful here.

RealyUniqueName · 2017-04-15T13:00:13Z

@nadako In that case you have to either make developer expression read-only or find a way to guarantee that user will not assign there something that will not unify with original type.

nadako · 2017-04-15T13:18:22Z

@nadako In that case you have to either make developer expression read-only or find a way to guarantee that user will not assign there something that will not unify with original type.

I don't understand, if a var is typed as a subtype inside a if block, shouldn't it be safe to assign a value of that subtype within that block?

RealyUniqueName · 2017-04-15T13:27:54Z

It's not safe if interfaces are involved.

nadako · 2017-04-16T12:39:04Z

Ah, right. Well this makes sense then. It's also similar to what C# 7 is introducing, right?

back2dos · 2017-04-16T12:58:48Z

I like the general idea of the proposal, but the premise that the existing methods are without alternative is just wrong. You can do this for single checks:

switch Std.instance(developer, Indie) {
  case null:
  case indie: indie.developGameEngine();
}

With just a little extra code, you can make something work for multiple types:

switch developer {
  case _.as(Indie) => Named(indie):
    indie.createGameEngine();
  case _.as(Student) => Named(student):
    student.learnToCode();
  default:
}

IIRC there's an open issue about optimizing the enum allocation away in cases like that, which is all that is needed to eliminate any performance overhead.

I also would like to contest this statement:

Trying to implement such behavior with macros looks like too complex task.

Really? What have you tried? What about this? Or why not without build macros:

if (is(developer, (indie:Indie))) 
  indie.createGameEngine();
else if (is(developer, (student:Student)))
  student.learnToCode();

Sure, all of the solutions I am presenting have some shortcomings and having actual syntax for this would be better, but I see no need to base the argument on such questionable assumptions.

RealyUniqueName · 2017-04-16T13:40:55Z

@nadako Yes. Looks like it's similar to C#

@back2dos Std.instance() is not the same as Std.is or is. And i didn't say it's impossible with macros. Perhaps, i had to add "too complex for such a small task". And we don't have enum optimization yet, are we? Also what about checking different branches of condition in if or while as said in proposal?
And again: i'm not saying it's impossible.

Sunjammer · 2017-04-16T22:06:28Z

I'd rather prefer explicit cast blocks then. cast(developer, Indie) { ... } where the casted result is scoped to a block that only runs if the cast was successful. Then you can nest blocks until all the casts you need are done and ready while keeping the "magic" minimal. In the proposal scope becomes muddled imo. It doesn't read very well. IMO if you need this much casting anyway something is wrong with your design.

sledorze · 2017-04-18T20:05:40Z

Like @nadako suggested,
I'm really inviting you to consider the Typescript approach, you can find here in the 'User-Defined Type Guards' section:
https://www.typescriptlang.org/docs/handbook/advanced-types.html

RealyUniqueName · 2017-04-18T20:53:01Z

Consider this code:

if(someMethod().developer is Indie) {
    someMethod().developer.createGameEngine();
}

Is it safe to assume that second call to someMethod() will return Indie-compatible instance again?
It also has performance impact because of second call.

sledorze · 2017-04-18T20:55:30Z

No it should not be safe to assume that in an effectful world..

…

On Tue, Apr 18, 2017 at 10:53 PM, Alexander Kuzmenko < ***@***.***> wrote: Consider this code: if(someMethod().developer is Indie) { someMethod().developer.createGameEngine(); } Is it safe to assume that second call to someMethod() will return Indie-compatible instance again? It also has performance impact because of second call. — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#20 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAZb-CCkx18ojFQ15ShFoIU0CIGx2Xkjks5rxSKugaJpZM4M-VVb> .

-- Stéphane Le Dorze http://lambdabrella.blogspot.com/ http://www.linkedin.com/in/stephaneledorze http://twitter.com/stephaneledorze <https://twitter.com/#!/stephaneledorze/status/74020060125077504> Tel: +33 (0) 6 08 76 70 15

nadako · 2017-04-18T21:59:07Z

Yeah, now I think explicit var binding like proposed makes sense. What's not quite clear is what that expression should expand to?

E.g.

if(developer is indie:Indie && indie.hasMotivation()) {
	crowd.throwMoneyAt(indie);
	indie.createGameEngine();
}

It should become something like this, right?

if (developer is Indie) {
	var indie:Indie = cast developer;
	if (indie.hasMotivation()) {
		crowd.throwMoneyAt(indie);
		indie.createGameEngine();
	}
}

RealyUniqueName · 2017-04-18T21:59:49Z

@sledorze Oh, sorry. Did you mean function isFish(pet: Fish | Bird): pet is Fish?

sledorze · 2017-04-18T22:01:09Z

Yes, that's it!

…

On Tue, Apr 18, 2017 at 11:59 PM, Alexander Kuzmenko < ***@***.***> wrote: @sledorze <https://github.com/sledorze> Oh, sorry. Did you mean function isFish(pet: Fish | Bird): pet is Fish? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#20 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAZb-OjBSENzTm4d1swMgHODmfLZCUGzks5rxTJWgaJpZM4M-VVb> .

-- Stéphane Le Dorze http://lambdabrella.blogspot.com/ http://www.linkedin.com/in/stephaneledorze http://twitter.com/stephaneledorze <https://twitter.com/#!/stephaneledorze/status/74020060125077504> Tel: +33 (0) 6 08 76 70 15

nadako · 2017-04-18T22:02:22Z

function isFish(pet: Fish | Bird): pet is Fish?

That's what I was also proposing in HaxeFoundation/haxe#5167, but it's a separate discussion, I think.

RealyUniqueName · 2017-04-18T22:02:43Z

I'm not sure that is in the scope of this proposal

RealyUniqueName · 2017-04-18T22:06:42Z

What's not quite clear is what that expression should expand to?

@nadako your example looks fine to me. It will probably require some code copying, when there is an else

sledorze · 2017-04-18T22:15:51Z

The custom nature of the Typescript solution has broader applications but requiers separate declaration of the function and rely on the ability to change a variable type in a specific scope (achieved by control flow analysis in TS). So it is more powerful (there's other possible usages from that) but has a 'higher' implementation 'cost'. That would not be silly to consider it. Has it is a broader subject to discuss in a separate thread, I'll leave it to you :) ..

…

On Wed, Apr 19, 2017 at 12:02 AM, Alexander Kuzmenko < ***@***.***> wrote: I'm not sure that is in the scope of this proposal — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#20 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAZb-G7Sa8gackcFPAtMQUUlh1EvhfH7ks5rxTMDgaJpZM4M-VVb> .

-- Stéphane Le Dorze http://lambdabrella.blogspot.com/ http://www.linkedin.com/in/stephaneledorze http://twitter.com/stephaneledorze <https://twitter.com/#!/stephaneledorze/status/74020060125077504> Tel: +33 (0) 6 08 76 70 15

RealyUniqueName · 2017-04-27T09:47:40Z

Updated proposal to clarify that declared variable is only available in branches of code which get executed only if is evaluated to true.

RealyUniqueName · 2017-04-28T09:16:56Z

ping @ncannasse @andyli @waneck @hughsando

hughsando · 2017-04-28T13:36:07Z

What about the more general: "var in statement evaluates to 'not null' and scopes var in statement"

if (var indie:Indie = cast developer && indie.knowsHaxe())
   indie.developGame();

Also gives as a generalizations:

(var company = indie.company) && company.makeMoney();
var c = indie.company && var director = c.director && director.fire();

Compare

if (var indie:Indie = cast developer && indie.knowsHaxe())  indie.developGame();
if ( developer is indie:Indie && indie.knowsHaxe())  indie.developGame();

Extra var = cast less is but I always prefer general to specific.

RealyUniqueName · 2017-05-26T13:19:12Z

@hughsando idk, var declaration syntax does not hint any type checking for code reader. It's read more like "declare indie and assign developer no matter what".
Maybe (developer is var indie:Indie)?

RealyUniqueName · 2017-05-26T13:19:56Z

ping @ncannasse @andyli @waneck

back2dos · 2017-05-26T15:34:56Z

I wish to state one more time that there are far terser idioms available in Haxe already than the examples constructed to motivate the proposal.

The way I see it, this proposal effectively competes with #11 since the syntax is going to be very similar. Having both would quite certainly be a source of confusion and given the choice I would argue that the other proposal is statically more sound and explicit, since you formulate something like var developer:Indie | Student and then the compiler knows all the cases and can perform exhaustiveness checks and what not. Contrast that to this feature which proposes syntax sugar for performing arbitrary casts along lines that elude the type system.

RealyUniqueName · 2017-05-26T16:25:33Z

Sorry, I don't see, how this proposal competes with #11.
If you declare var developer:Indie | Student you still have to check if developer is Indie to be able to call methods of Indie class and avoid runtime exception.

benmerckx · 2017-05-26T16:36:54Z

That check would be (as per the other proposal):

switch typeof developer {
  case Indie: trace(developer.developGame());
  case Student:
}

RealyUniqueName · 2017-05-26T16:39:42Z

Oh, forgot about that section in that proposal, sorry

RealyUniqueName · 2017-05-26T16:44:11Z

And yet in this proposal developer is already strictly typed at compile and runtime. While in #11 developer becomes dynamic at runtime and you actually can't safely do anything with it before checking a type.

back2dos · 2017-05-26T17:30:23Z

If you declare var developer:Indie | Student you still have to check if developer is Indie to be able to call methods of Indie class.

Yes, you have to check developer is Indie. Discerning various cases is the whole point of having a union type. As opposed to that, the point of an interface (or base class) is to use polymorphic dispatch over different implementations, rather than casting to narrower or even totally unrelated types. If you say var developer:Developer then you should be calling methods of Developer on developer and nothing else.

Another difference is this:

var developer:Indie|Student = ...;
if (developer is Indie) {
  $type(developer);//Indie
  developer = new Indie();//this is - unlike in your proposal - safe to do, because we're operating within the type system
}

Both proposals are about runtime type checks, which is why I am saying they compete. If both are added, they will cause confusion between newcomers at the very least.

Having established their similarity, let's examine their differences together:

And yet in this proposal developer is already strictly typed at compile and runtime.

Yes, this statement is factually correct, but it is completely besides the point. While the type of developer is known, it is not the one you want, which is why you're going to type check and cast in the first place. From that perspective, the type of developer would be just as useful if it were Any.

While in #11 developer becomes dynamic at runtime and you actually can't safely do anything with it before checking a type.

No, this statement is factually wrong. You can do everything on a union type, that is possible to do on the intersection of its constituents. Say you have:

interface Developer {
  function develop():Void;
}
interface Indie extends Developer {
  function createEngine():Void;
}
interface Student extends Developer {
  function study():Void;
}

var developer:Indie|Student = ...;
developer.develop();//<-- this is valid, because we know that in all cases the `develop` method is available

I think this one of the crucial advantages of true union types over haxe.extern.EitherType that escaped you.

So, let's take a step back: say I write a library with a function that consumes Developer and then internally does some runtime type checks to call the right functions on the actual implementations. You want to use that library. How do you know which implementation of Developer I treat? Maybe there's a couple of cases I expect, but you made your own implementation which none of my cases cover so you get an exception. In AS3 this perversion was not uncommon, as with the prominent IBitmapDrawable which really is just DisplayObject | BitmapData but there was no way to express that. But really: is that a way we want to establish contracts in a statically typed language? Which of these two would you say communicates more:

function makeAGame(developer:Indie | Student):Game { ... }
function makeAGame(developer:Developer):Game { ... }

With union types, you can even make this distinction:

function makeAGame(developer:Indie | Student):Game { ... }
function makeAGame(developer:Indie | Student | Developer):Game { ... }

The first function will only accept students and indies - which with your downcasting approach would work by checking if the argument is one of both or throwing an exception otherwise. The second function will accept all kinds of developers, but it's telling me that it has special treatment for indies and students.

It all boils down to this: if you wish to do a runtime type check, it really can't hurt to tell the compiler about it beforehand - which is what union types do.

RealyUniqueName · 2017-05-26T20:53:12Z

Another difference is this:

var developer:Indie|Student = ...;
if (developer is Indie) {
  $type(developer);//Indie
  developer = new Indie();//this is - unlike in your proposal - safe to do, because we're operating within the type system
}

This scenario was already discussed here #20 (comment)

Yes, this statement is factually correct, but it is completely besides the point. While the type of developer is known, it is not the one you want,

override function handle(developer:Developer) {
     developer.performDevelopersStuff(); //I really want Developer here
     super.handle(developer); //and here. Even more: i'm not an author of parent class.
     if(someSpecialCase() && developer is alien:IHumanoid) {
          alien.doHumanoidStuff();
     }
}

So, let's take a step back: say I write a library with a function that consumes Developer and then internally does some runtime type checks to call the right functions on the actual implementations. You want to use that library. How do you know which implementation of Developer I treat?

This is the only real advantage of #11 i see: typehinting for API. And this is what interfaces are supposed to be used for. And if you can't achieve what you want with interfaces, then you need method overloading, which unfortunately we don't have in Haxe.
Interfaces and method overloading is the way i want to establish contracts in a statically typed language for such cases.
Obviously my proposal is not aimed to solve this issue.

tell the compiler about it beforehand - which is what union types do.

What for? Why does it need to maintain useless information?
Extending your example with interfaces:

interface Developer {
  function develop():Void;
}
interface Indie extends Developer {
  function createEngine():Void;
}
interface Student extends Developer {
  function study():Void;
}

var developer:Indie|Student = ...;
developer.develop();//So compiler has to deduct common type and type developer as Developer. Has nothing to do with Indie or Student here.

if(Std.is(developer, Indie)) {  //And it's only this block, where compiler needs to know developer is actually Indie
    developer.doDeveloperStuff(); //but don't forget it's also a Developer!
}

developer.develop(); //Just Developer again

Alright, alright. I will counter this argument myself: you can pass developer to any place, where Student or Indie is expected without additional code. But this is one of the things abstracts are used for.
And do you really want to complicate compiler typing system implementation even more? Let's ask Simn )

back2dos · 2017-05-26T22:10:59Z

Ok, there seems to be quite a bit of confusion here. I really suggest you read the link Stephane shared and that you stop interpreting so freely what I say.

Here is a piece of TypeScript code to clarify what #11 proposes:

function get(): Indie | Student {
  return new Indie();
}
class Indie { 
  code() { }
  doIndieThing() { }
}
class Student {
  code() { } 
  doStudentThing() { }
}
get().code();//compiles, because both types have a `code` method
let dev = get();
if (dev instanceof Indie)
  dev.doIndieThing();//compiles, because dev is Indie in this branch
if (get() instanceof Indie) 
  get().doIndieThing();//does not compile for the reasons you gave - btw.: nobody ever said it should

Let's be very clear that the type is Indie | Student and nothing else. The compiler most certainly does not "deduct a common type that is Developer" as seen in the example above, where there is no common Developer type to start with. What the compiler does is to allow to treat the type as Student & Indie outside case analysis, because that intersection is a subtype of all cases.

Now, if all types have a common base type, then the compiler could (and should) use that as the runtime type. The good news is, the compiler can already determine that. But again, even if that common base type exists, it is not the same as Indie | Student, meaning that an arbitrary Developer is not a valid value for it - only and Indie or a Student is.

What for? Why does it need to maintain useless information?

So you are saying that compile time type information is useless? In that case I suggest you just use Dynamic everywhere and you will not need the feature you're proposing \o/

And do you really want to complicate compiler typing system implementation even more? Let's ask Simn

In short: yes. I'm certainly curious to know what Simn thinks about #11. If you wish to make the case that it's too complicated or otherwise flawed, please do. So far you have only demonstrated a rather pronounced disinterest in understanding what's actually being proposed.

Simn · 2017-06-01T10:13:59Z

I like the idea in general, have to think about what consequences it would have for the typer. There's a good chance this is not so easy to get right with out current architecture.

waneck · 2017-06-01T16:11:00Z

I like the syntax, and I think it's pretty useful. This could probably be a subset of allowing variable declarations as expressions (e.g. in C++ you could write if (Indie *indie = dynamic_cast<Indie*>(someVar))). But I can see how the syntax looks better the way it's proposed.
As for #11 vs #20, I don't see them as being in the opposing side - rather they would complement each other.

DavisDevelopment · 2018-09-13T22:05:09Z

What about the as keyword, perhaps in conjunction with one of them sweet '=' characters? Add a few well-placed expressions in there and you might get something like:

class FewpDewp {
  static var creature: Animal;
  static function doAThing(){
    if (pet = creature as Household mammal) {
      // with 'pet' now defined within this scope, typed as HouseholdMammal
}
  }
}

Admittedly, I didn't read the whole thread so sorry if I'm being redundant, but that syntax was the very first thing that came to mindg

RealyUniqueName · 2018-09-21T15:21:58Z

It doesn't look like a lot of people want this, so we decided to reject it.

markknol · 2018-09-21T15:26:54Z

Too bad, but is there still hope for this to work:

if (dev is Indie)  dev.doIndieThing(); // compiles, because dev is Indie in this branch

RealyUniqueName · 2018-09-21T16:42:37Z

Since it's the core of this proposal, I doubt it.

RealyUniqueName added 16 commits April 15, 2017 13:33

motivation

021de73

update motivation

5208477

update motivation

6d300dd

update motivation

d35944e

update motivation

26e667a

detailed design

8573f5a

detailed design

47b5205

detailed design

ac73d2c

detailed design

cf91824

update

96b3071

update

76d440e

update

4185c33

update

09431ca

update

e316d1a

update

d281ba7

update

38a07de

RealyUniqueName added 2 commits April 15, 2017 17:57

update

0512396

update

922da8c

skial mentioned this pull request Apr 18, 2017

Haxe Roundup 380 skial/haxe.io#382

Closed

clarify scope of declared variable

eefe787

RealyUniqueName closed this Sep 21, 2018

back2dos mentioned this pull request May 24, 2020

add postfix ! operator for Null<T> HaxeFoundation/haxe#9476

Closed

Type matching #20

Type matching #20

Conversation

RealyUniqueName commented Apr 15, 2017

nadako commented Apr 15, 2017

nadako commented Apr 15, 2017 • edited

RealyUniqueName commented Apr 15, 2017

nadako commented Apr 15, 2017

RealyUniqueName commented Apr 15, 2017

nadako commented Apr 16, 2017

back2dos commented Apr 16, 2017

RealyUniqueName commented Apr 16, 2017

Sunjammer commented Apr 16, 2017

sledorze commented Apr 18, 2017

RealyUniqueName commented Apr 18, 2017

sledorze commented Apr 18, 2017 via email

nadako commented Apr 18, 2017

RealyUniqueName commented Apr 18, 2017

sledorze commented Apr 18, 2017 via email

nadako commented Apr 18, 2017 • edited

RealyUniqueName commented Apr 18, 2017

RealyUniqueName commented Apr 18, 2017

sledorze commented Apr 18, 2017 via email

RealyUniqueName commented Apr 27, 2017

RealyUniqueName commented Apr 28, 2017

hughsando commented Apr 28, 2017

RealyUniqueName commented May 26, 2017

RealyUniqueName commented May 26, 2017

back2dos commented May 26, 2017

RealyUniqueName commented May 26, 2017 • edited

benmerckx commented May 26, 2017

RealyUniqueName commented May 26, 2017

RealyUniqueName commented May 26, 2017

back2dos commented May 26, 2017

RealyUniqueName commented May 26, 2017 • edited

back2dos commented May 26, 2017 • edited

Simn commented Jun 1, 2017

waneck commented Jun 1, 2017 • edited

DavisDevelopment commented Sep 13, 2018

RealyUniqueName commented Sep 21, 2018

markknol commented Sep 21, 2018

RealyUniqueName commented Sep 21, 2018

nadako commented Apr 15, 2017 •

edited

nadako commented Apr 18, 2017 •

edited

RealyUniqueName commented May 26, 2017 •

edited

RealyUniqueName commented May 26, 2017 •

edited

back2dos commented May 26, 2017 •

edited

waneck commented Jun 1, 2017 •

edited