Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type matching #20

Closed

Conversation

RealyUniqueName
Copy link
Member

It's a common case to check if a variable is of some type and cast it to that type.
This proposal is an attempt to improve Haxe in such situations.

Rendered document

@nadako
Copy link
Member

nadako commented Apr 15, 2017

@nadako
Copy link
Member

nadako commented Apr 15, 2017

I personally think smart casting is a nice feature (and AFAIK it's already implemented in some languages, like Kotlin and TypeScript).

What I'm not sure about this proposal is do we really need an explicit name for downcasted value, or even any new syntax? What we could have instead is something like:

if (developer is Indie) {
    $type(developer); // Indie
}

One concern here is that if Indie here is an interface rather than subclass, we might want to still access original (non-casted) developer value, so maybe having an optional name specification would be useful here.

@RealyUniqueName
Copy link
Member Author

@nadako In that case you have to either make developer expression read-only or find a way to guarantee that user will not assign there something that will not unify with original type.

@nadako
Copy link
Member

nadako commented Apr 15, 2017

@nadako In that case you have to either make developer expression read-only or find a way to guarantee that user will not assign there something that will not unify with original type.

I don't understand, if a var is typed as a subtype inside a if block, shouldn't it be safe to assign a value of that subtype within that block?

@RealyUniqueName
Copy link
Member Author

It's not safe if interfaces are involved.

@nadako
Copy link
Member

nadako commented Apr 16, 2017

Ah, right. Well this makes sense then. It's also similar to what C# 7 is introducing, right?

@back2dos
Copy link
Member

I like the general idea of the proposal, but the premise that the existing methods are without alternative is just wrong. You can do this for single checks:

switch Std.instance(developer, Indie) {
  case null:
  case indie: indie.developGameEngine();
}

With just a little extra code, you can make something work for multiple types:

switch developer {
  case _.as(Indie) => Named(indie):
    indie.createGameEngine();
  case _.as(Student) => Named(student):
    student.learnToCode();
  default:
}

IIRC there's an open issue about optimizing the enum allocation away in cases like that, which is all that is needed to eliminate any performance overhead.

I also would like to contest this statement:

Trying to implement such behavior with macros looks like too complex task.

Really? What have you tried? What about this? Or why not without build macros:

if (is(developer, (indie:Indie))) 
  indie.createGameEngine();
else if (is(developer, (student:Student)))
  student.learnToCode();

Sure, all of the solutions I am presenting have some shortcomings and having actual syntax for this would be better, but I see no need to base the argument on such questionable assumptions.

@RealyUniqueName
Copy link
Member Author

@nadako Yes. Looks like it's similar to C#

@back2dos Std.instance() is not the same as Std.is or is. And i didn't say it's impossible with macros. Perhaps, i had to add "too complex for such a small task". And we don't have enum optimization yet, are we? Also what about checking different branches of condition in if or while as said in proposal?
And again: i'm not saying it's impossible.

@Sunjammer
Copy link

I'd rather prefer explicit cast blocks then. cast(developer, Indie) { ... } where the casted result is scoped to a block that only runs if the cast was successful. Then you can nest blocks until all the casts you need are done and ready while keeping the "magic" minimal. In the proposal scope becomes muddled imo. It doesn't read very well. IMO if you need this much casting anyway something is wrong with your design.

@sledorze
Copy link

Like @nadako suggested,
I'm really inviting you to consider the Typescript approach, you can find here in the 'User-Defined Type Guards' section:
https://www.typescriptlang.org/docs/handbook/advanced-types.html

@RealyUniqueName
Copy link
Member Author

Consider this code:

if(someMethod().developer is Indie) {
    someMethod().developer.createGameEngine();
}

Is it safe to assume that second call to someMethod() will return Indie-compatible instance again?
It also has performance impact because of second call.

@sledorze
Copy link

sledorze commented Apr 18, 2017 via email

@nadako
Copy link
Member

nadako commented Apr 18, 2017

Yeah, now I think explicit var binding like proposed makes sense. What's not quite clear is what that expression should expand to?

E.g.

if(developer is indie:Indie && indie.hasMotivation()) {
	crowd.throwMoneyAt(indie);
	indie.createGameEngine();
}

It should become something like this, right?

if (developer is Indie) {
	var indie:Indie = cast developer;
	if (indie.hasMotivation()) {
		crowd.throwMoneyAt(indie);
		indie.createGameEngine();
	}
}

@RealyUniqueName
Copy link
Member Author

@sledorze Oh, sorry. Did you mean function isFish(pet: Fish | Bird): pet is Fish?

@sledorze
Copy link

sledorze commented Apr 18, 2017 via email

@nadako
Copy link
Member

nadako commented Apr 18, 2017

function isFish(pet: Fish | Bird): pet is Fish?

That's what I was also proposing in HaxeFoundation/haxe#5167, but it's a separate discussion, I think.

@RealyUniqueName
Copy link
Member Author

I'm not sure that is in the scope of this proposal

@RealyUniqueName
Copy link
Member Author

What's not quite clear is what that expression should expand to?

@nadako your example looks fine to me. It will probably require some code copying, when there is an else

@sledorze
Copy link

sledorze commented Apr 18, 2017 via email

@RealyUniqueName
Copy link
Member Author

Updated proposal to clarify that declared variable is only available in branches of code which get executed only if is evaluated to true.

@RealyUniqueName
Copy link
Member Author

ping @ncannasse @andyli @waneck @hughsando

@hughsando
Copy link
Member

What about the more general: "var in statement evaluates to 'not null' and scopes var in statement"

if (var indie:Indie = cast developer && indie.knowsHaxe())
   indie.developGame();

Also gives as a generalizations:

(var company = indie.company) && company.makeMoney();
var c = indie.company && var director = c.director && director.fire();

Compare

if (var indie:Indie = cast developer && indie.knowsHaxe())  indie.developGame();
if ( developer is indie:Indie && indie.knowsHaxe())  indie.developGame();

Extra var = cast less is but I always prefer general to specific.

@RealyUniqueName
Copy link
Member Author

@hughsando idk, var declaration syntax does not hint any type checking for code reader. It's read more like "declare indie and assign developer no matter what".
Maybe (developer is var indie:Indie)?

@RealyUniqueName
Copy link
Member Author

ping @ncannasse @andyli @waneck

@back2dos
Copy link
Member

I wish to state one more time that there are far terser idioms available in Haxe already than the examples constructed to motivate the proposal.

The way I see it, this proposal effectively competes with #11 since the syntax is going to be very similar. Having both would quite certainly be a source of confusion and given the choice I would argue that the other proposal is statically more sound and explicit, since you formulate something like var developer:Indie | Student and then the compiler knows all the cases and can perform exhaustiveness checks and what not. Contrast that to this feature which proposes syntax sugar for performing arbitrary casts along lines that elude the type system.

@RealyUniqueName
Copy link
Member Author

RealyUniqueName commented May 26, 2017

Sorry, I don't see, how this proposal competes with #11.
If you declare var developer:Indie | Student you still have to check if developer is Indie to be able to call methods of Indie class and avoid runtime exception.

@benmerckx
Copy link
Contributor

That check would be (as per the other proposal):

switch typeof developer {
  case Indie: trace(developer.developGame());
  case Student:
}

@RealyUniqueName
Copy link
Member Author

Oh, forgot about that section in that proposal, sorry

@RealyUniqueName
Copy link
Member Author

And yet in this proposal developer is already strictly typed at compile and runtime. While in #11 developer becomes dynamic at runtime and you actually can't safely do anything with it before checking a type.

@back2dos
Copy link
Member

If you declare var developer:Indie | Student you still have to check if developer is Indie to be able to call methods of Indie class.

Yes, you have to check developer is Indie. Discerning various cases is the whole point of having a union type. As opposed to that, the point of an interface (or base class) is to use polymorphic dispatch over different implementations, rather than casting to narrower or even totally unrelated types. If you say var developer:Developer then you should be calling methods of Developer on developer and nothing else.

Another difference is this:

var developer:Indie|Student = ...;
if (developer is Indie) {
  $type(developer);//Indie
  developer = new Indie();//this is - unlike in your proposal - safe to do, because we're operating within the type system
}

Both proposals are about runtime type checks, which is why I am saying they compete. If both are added, they will cause confusion between newcomers at the very least.

Having established their similarity, let's examine their differences together:

And yet in this proposal developer is already strictly typed at compile and runtime.

Yes, this statement is factually correct, but it is completely besides the point. While the type of developer is known, it is not the one you want, which is why you're going to type check and cast in the first place. From that perspective, the type of developer would be just as useful if it were Any.

While in #11 developer becomes dynamic at runtime and you actually can't safely do anything with it before checking a type.

No, this statement is factually wrong. You can do everything on a union type, that is possible to do on the intersection of its constituents. Say you have:

interface Developer {
  function develop():Void;
}
interface Indie extends Developer {
  function createEngine():Void;
}
interface Student extends Developer {
  function study():Void;
}

var developer:Indie|Student = ...;
developer.develop();//<-- this is valid, because we know that in all cases the `develop` method is available

I think this one of the crucial advantages of true union types over haxe.extern.EitherType that escaped you.

So, let's take a step back: say I write a library with a function that consumes Developer and then internally does some runtime type checks to call the right functions on the actual implementations. You want to use that library. How do you know which implementation of Developer I treat? Maybe there's a couple of cases I expect, but you made your own implementation which none of my cases cover so you get an exception. In AS3 this perversion was not uncommon, as with the prominent IBitmapDrawable which really is just DisplayObject | BitmapData but there was no way to express that. But really: is that a way we want to establish contracts in a statically typed language? Which of these two would you say communicates more:

function makeAGame(developer:Indie | Student):Game { ... }
function makeAGame(developer:Developer):Game { ... }

With union types, you can even make this distinction:

function makeAGame(developer:Indie | Student):Game { ... }
function makeAGame(developer:Indie | Student | Developer):Game { ... }

The first function will only accept students and indies - which with your downcasting approach would work by checking if the argument is one of both or throwing an exception otherwise. The second function will accept all kinds of developers, but it's telling me that it has special treatment for indies and students.

It all boils down to this: if you wish to do a runtime type check, it really can't hurt to tell the compiler about it beforehand - which is what union types do.

@RealyUniqueName
Copy link
Member Author

RealyUniqueName commented May 26, 2017

Another difference is this:

var developer:Indie|Student = ...;
if (developer is Indie) {
  $type(developer);//Indie
  developer = new Indie();//this is - unlike in your proposal - safe to do, because we're operating within the type system
}

This scenario was already discussed here #20 (comment)

Yes, this statement is factually correct, but it is completely besides the point. While the type of developer is known, it is not the one you want,

override function handle(developer:Developer) {
     developer.performDevelopersStuff(); //I really want Developer here
     super.handle(developer); //and here. Even more: i'm not an author of parent class.
     if(someSpecialCase() && developer is alien:IHumanoid) {
          alien.doHumanoidStuff();
     }
}

So, let's take a step back: say I write a library with a function that consumes Developer and then internally does some runtime type checks to call the right functions on the actual implementations. You want to use that library. How do you know which implementation of Developer I treat?

This is the only real advantage of #11 i see: typehinting for API. And this is what interfaces are supposed to be used for. And if you can't achieve what you want with interfaces, then you need method overloading, which unfortunately we don't have in Haxe.
Interfaces and method overloading is the way i want to establish contracts in a statically typed language for such cases.
Obviously my proposal is not aimed to solve this issue.

tell the compiler about it beforehand - which is what union types do.

What for? Why does it need to maintain useless information?
Extending your example with interfaces:

interface Developer {
  function develop():Void;
}
interface Indie extends Developer {
  function createEngine():Void;
}
interface Student extends Developer {
  function study():Void;
}

var developer:Indie|Student = ...;
developer.develop();//So compiler has to deduct common type and type developer as Developer. Has nothing to do with Indie or Student here.

if(Std.is(developer, Indie)) {  //And it's only this block, where compiler needs to know developer is actually Indie
    developer.doDeveloperStuff(); //but don't forget it's also a Developer!
}

developer.develop(); //Just Developer again

Alright, alright. I will counter this argument myself: you can pass developer to any place, where Student or Indie is expected without additional code. But this is one of the things abstracts are used for.
And do you really want to complicate compiler typing system implementation even more? Let's ask Simn )

@back2dos
Copy link
Member

back2dos commented May 26, 2017

Ok, there seems to be quite a bit of confusion here. I really suggest you read the link Stephane shared and that you stop interpreting so freely what I say.

Here is a piece of TypeScript code to clarify what #11 proposes:

function get(): Indie | Student {
  return new Indie();
}
class Indie { 
  code() { }
  doIndieThing() { }
}
class Student {
  code() { } 
  doStudentThing() { }
}
get().code();//compiles, because both types have a `code` method
let dev = get();
if (dev instanceof Indie)
  dev.doIndieThing();//compiles, because dev is Indie in this branch
if (get() instanceof Indie) 
  get().doIndieThing();//does not compile for the reasons you gave - btw.: nobody ever said it should

Let's be very clear that the type is Indie | Student and nothing else. The compiler most certainly does not "deduct a common type that is Developer" as seen in the example above, where there is no common Developer type to start with. What the compiler does is to allow to treat the type as Student & Indie outside case analysis, because that intersection is a subtype of all cases.

Now, if all types have a common base type, then the compiler could (and should) use that as the runtime type. The good news is, the compiler can already determine that. But again, even if that common base type exists, it is not the same as Indie | Student, meaning that an arbitrary Developer is not a valid value for it - only and Indie or a Student is.

What for? Why does it need to maintain useless information?

So you are saying that compile time type information is useless? In that case I suggest you just use Dynamic everywhere and you will not need the feature you're proposing \o/

And do you really want to complicate compiler typing system implementation even more? Let's ask Simn

In short: yes. I'm certainly curious to know what Simn thinks about #11. If you wish to make the case that it's too complicated or otherwise flawed, please do. So far you have only demonstrated a rather pronounced disinterest in understanding what's actually being proposed.

@Simn
Copy link
Member

Simn commented Jun 1, 2017

I like the idea in general, have to think about what consequences it would have for the typer. There's a good chance this is not so easy to get right with out current architecture.

@waneck
Copy link
Member

waneck commented Jun 1, 2017

I like the syntax, and I think it's pretty useful. This could probably be a subset of allowing variable declarations as expressions (e.g. in C++ you could write if (Indie *indie = dynamic_cast<Indie*>(someVar))). But I can see how the syntax looks better the way it's proposed.
As for #11 vs #20, I don't see them as being in the opposing side - rather they would complement each other.

@DavisDevelopment
Copy link

What about the as keyword, perhaps in conjunction with one of them sweet '=' characters? Add a few well-placed expressions in there and you might get something like:

class FewpDewp {
  static var creature: Animal;
  static function doAThing(){
    if (pet = creature as Household mammal) {
      // with 'pet' now defined within this scope, typed as HouseholdMammal
}
  }
}

Admittedly, I didn't read the whole thread so sorry if I'm being redundant, but that syntax was the very first thing that came to mindg

@RealyUniqueName
Copy link
Member Author

It doesn't look like a lot of people want this, so we decided to reject it.

@markknol
Copy link
Member

Too bad, but is there still hope for this to work:

if (dev is Indie)  dev.doIndieThing(); // compiles, because dev is Indie in this branch

@RealyUniqueName
Copy link
Member Author

Since it's the core of this proposal, I doubt it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet